Javier Garcia de Leaniz
Verified Expert in Engineering
Natural Language Processing Developer
Javier is an engineer with over nine years of experience in AI and data science. Beyond his expertise in machine learning, natural language processing, and software engineering, Javier's unique strength lies in harmonizing business with technology. His consulting tenures at EY and Accenture gave him invaluable experience implementing data and AI technology across diverse industries and geographies.
Preferred Environment
Azure, Visual Studio Code (VS Code), MacOS, Linux, Amazon Web Services (AWS), Windows
The most amazing...
...product I've developed is a generative AI tool that structures business data, documents, emails, and recordings, and has processed millions of files.
Work Experience
AI and Full-stack Engineer
Freelance
- Developed a web app that allows users to search for restaurants in natural language based on their characteristics.
- Designed and deployed a simple architecture on the AWS stack, including Amazon EC2, Amazon RDS, Amazon S3, and Elastic Load Balancing (ELB), among others.
- Designed a RAG pipeline, prompt-engineering a query parsing module and keyword matching functionalities using full-text search.
Chief Technology Officer
智能检索
- Led the company's technical strategy and product development.
- Deployed the platform to Azure using Azure DevOps CI/CD pipelines.
- Developed a retrieval-augmented generation (RAG) pipeline to allow natural language search over business documents such as financial statements, invoices, contracts, and more, leveraging OpenAI's GPT services.
Lead Data Scientist
EY
- Led a multidisciplinary product team of 50+ members building AI-driven products with a focus on NLP and generative AI.
- Developed and trained multiple models for different functionalities, including layout detection, document classification, named-entity recognition, and question-answering and passage-ranking models.
- Developed and trained a deep learning model (CNN) that cleans lines, stains, scribbles, and other imperfections on invoice images to improve the downstream accuracy of an optical character recognition (OCR) engine.
- Trained a gradient-boosting classifier to evaluate the severity of changes in baseline FATCA and CRS regulatory texts compared to local implementations. Achieved a cross-validation F1 score of 92.5.
- Implemented a topic modeling LDA model on US FDA reports to obtain insights for a wealth and asset management firm seeking investments in pharmaceutical companies.
Data Engineer
Accenture
- Designed and developed risk assessment processes for multichannel applications (smartphone app, web, ATMs, and branches) of a Spanish international bank.
- Developed data pipelines for the risk assessment process of credit cards and online personal loans.
- Developed SQL queries to analyze customer risk data metrics.
Experience
Insurance Claims Payment Automation
I developed a pipeline consisting of optical character recognition (OCR), layout detection, and named-entity recognition (NER) models to extract the relevant information from the invoices accurately. I also built the validation module to identify and validate medical diagnoses against the policyholder coverage.
Finally, I developed the extraction confidence methodology to help determine whether claims reimbursements should be processed automatically or reviewed by a human, depending on the confidence of the different models and on business rules.
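The routing idea behind a confidence methodology like this can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the project's actual implementation: the `ExtractedField` type, the `route_claim` function, the 0.9 confidence threshold, and the amount cap standing in for a business rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExtractedField:
    """One value extracted from an invoice (hypothetical structure)."""
    name: str
    value: str
    confidence: float  # model confidence in the range [0, 1]


def route_claim(
    fields: List[ExtractedField],
    amount: float,
    min_confidence: float = 0.9,      # assumed threshold, for illustration only
    max_auto_amount: float = 1000.0,  # assumed business rule, for illustration only
) -> str:
    """Return "auto" or "manual_review" for a single claim."""
    # Business rule: large reimbursements always go to a human.
    if amount > max_auto_amount:
        return "manual_review"
    # Nothing extracted means there is nothing to trust.
    if not fields:
        return "manual_review"
    # The claim is only as reliable as its least confident field.
    lowest = min(field.confidence for field in fields)
    return "auto" if lowest >= min_confidence else "manual_review"
```

Using the minimum confidence across fields is a conservative choice: a single uncertain extraction is enough to send the claim to a reviewer.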
Mortgage Contract Review Automation
I trained a model to classify between main contracts and their annexes, extensions, and amendments, using TF-IDF features to train the classifier. I also developed the validation module to disambiguate and match contracts with database rows and compare them to highlight differences.
Tax Relief Application Eligibility
The goal was to increase the efficiency of the application process for a tax relief program offered by the government due to COVID-19 that received millions of requests.
I developed a pipeline consisting of handwritten text detection, layout detection, classification (to detect the document type), and named-entity recognition (NER) models to extract the relevant information from the documents accurately. I also developed the confidence module that prioritized manual review of applications based on business rules and models' confidence.
Invoice Validation Automation
The project's pipeline started with optical character recognition (OCR) technology to extract text from scanned documents accurately. I employed named-entity recognition (NER) models to identify and categorize key data points within these texts, such as vendor names, dates, and amounts. An important part of the project was the development of classification models to accurately detect and categorize different document types, automatically detecting their relevant data points.
Additionally, I implemented fuzzy matching algorithms to link items listed in invoices with corresponding entries in purchase orders. This approach was key in identifying mismatches and inconsistencies.
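As a rough illustration of fuzzy matching between invoice lines and purchase order entries, the sketch below uses `difflib.SequenceMatcher` from the Python standard library. The profile does not say which algorithm or library the project used, so the similarity measure, the `match_items` helper, and the 0.8 threshold are assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between two strings, in [0, 1]."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_items(
    invoice_items: List[str],
    po_items: List[str],
    threshold: float = 0.8,  # assumed cutoff, for illustration only
) -> Dict[str, Optional[str]]:
    """Map each invoice item to its closest purchase order entry.

    Items whose best score falls below the threshold map to None,
    which flags them as potential mismatches.
    """
    matches: Dict[str, Optional[str]] = {}
    for item in invoice_items:
        best, best_score = None, 0.0
        for candidate in po_items:
            score = similarity(item, candidate)
            if score > best_score:
                best, best_score = candidate, score
        matches[item] = best if best_score >= threshold else None
    return matches
```

For example, `match_items(["USB-C cable 2m", "Office chair"], ["USB C Cable 2 m", "Laptop stand"])` links the cable lines despite the spelling differences and leaves the chair unmatched, so it can be surfaced as an inconsistency.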
Skills
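Languages
Python, SQL
Other
Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), Artificial Intelligence (AI), Machine Learning, Deep Learning, OpenAI GPT-4 API, OpenAI GPT-3 API, OpenAI, Data Scraping, Computer Vision, Optical Character Recognition (OCR), Handwriting Recognition, Text Classification, TF-IDF, Information Retrieval, Prompt Engineering, Cognitive Computing, Custom Models, Web Scraping, Architecture, API Integration, Integration, Consulting, Generative Pre-trained Transformer 3 (GPT-3)
Libraries/APIs
SpaCy, Scikit-learn, Pandas, Azure Cognitive Services, Keras, React, SQLAlchemy
Tools
Named-entity Recognition (NER)
Paradigms
Automation, Scrum, Agile Software Development, B2B
Platforms
Docker, Azure, Visual Studio Code (VS Code), MacOS, Linux, Amazon Web Services (AWS), Windows
Frameworks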
Flask
Storage
PostgreSQL
Certifications
Natural Language Processing Nanodegree
Udacity
Machine Learning Engineer Nanodegree
Udacity