cv
Basics
| Name | Antonio Lopardo |
| Label | ML Engineer |
| Email | antonio.lopardo@outlook.com |
| Url | https://github.com/AntonioLopardo |
| Summary | Machine Learning Engineer with expertise in LLMs, vector search, and data management systems |
Work
- 2023.06 - Present, Zurich
Machine Learning Engineer
Veezoo
Working on LLM-based approaches for natural language to SQL translation
- Helped transition the company's product from a logic-based query parser to an LLM-first approach
- Fine-tuned language models for natural language to SQL translation via an intermediate representation and a proprietary query language
- Built human-in-the-loop evaluation and monitoring pipelines for fine-tuned and non-fine-tuned models
- Developed LLM prompting strategies for translating natural language to the proprietary query language without fine-tuning
Education
Projects
- 2022.10 - 2023.04
Clustering for Approximate Nearest Neighbor Search
For my master's thesis, I worked on methods to speed up approximate nearest neighbor search. More specifically, I focused on improving clustering to reduce the cost of probing IVF indexes.
- Empirical analysis of the trade-offs between different clustering approaches for billion-scale vector search
- End-to-end approach to IVF index optimization
- Trained indexes with different algorithms, including Lloyd's algorithm and variations of gradient descent
- 2022.04 - 2022.07
Word Vectors as KG Embeddings
As part of the ETH AI Center project course, I investigated whether knowledge graph (KG) embeddings can be improved by initializing them with contextual and non-contextual word vectors.
- 2022.03 - 2022.06
Program Synthesis for Math Word Problems
Transformer-based LLMs have shown strong generative capabilities, producing text and even conducting dialogue significantly better than any previous architecture. One area in which they still struggle is math. We tested whether recent open-source LLMs trained to generate code might be useful for math word problems by bypassing arithmetic operations altogether and instead producing Python programs that perform the operations when run.
- 2021.05 - 2021.07
Medical QA Project with Retrieval-Augmented LM
The project entailed adapting and fine-tuning an open-domain QA system, RAG, for the medical domain. Unlike open-domain QA systems that encode all their knowledge in their parameters, RAG relies on a textual knowledge base, such as Wikipedia, to retrieve the information relevant to a question. We fine-tuned RAG's components to work better in the medical domain and developed a textual knowledge base of our own.
- 2021.05 - 2021.07
Recommender Systems with Bayesian SVD and Graph NNs
As part of the collaborative filtering competition of the Computational Intelligence Lab, we experimented with multiple state-of-the-art techniques for building recommender systems with limited or obscured data. A Bayesian version of SVD proved the most effective, but we also implemented Graph Convolutional Networks and other algorithms for matrix factorization and reconstruction.
- 2021.03 - 2021.06
Multi-Human Optical Flow
The task of optical flow estimation consists of reliably identifying the pixel-by-pixel motion of objects between two consecutive images or video frames. To detect the motion of multiple people in a scene, we added a form of teacher forcing and more channel dropout to the state-of-the-art neural architecture for the task (RAFT) and then trained it on synthetic data we generated.
- 2019.03 - 2019.06
Tabletop FPS in Java
For the final project before graduation, we were tasked with porting a tabletop FPS, Adrenaline, to Java using only the standard libraries and JavaFX. We made a point of applying as many software design patterns and OOP principles as possible.
- 2018.07 - 2018.11
The Midterms on Twitter - US Elections Prediction
The project focused on an NLP pipeline for predicting the 2018 US midterm elections. The system approaches the prediction problem by gathering local tweets and classifying them as Democratic- or Republican-aligned using an RNN trained on pre-labeled data. The research paper on the project was accepted to the 2018 IEEE International Conference on Big Data.
Interests
| Triathlon |
| American Football |