Isaac Chung

Hi!👋 I'm Isaac.

My focus is on making AI systems scalable and maintainable. Currently I'm a Staff Machine Learning Scientist at Zendesk.

Previously at Clarifai, I led custom enterprise solution development for visual search and text moderation, built multi-modal retrieval systems, and led applied research on improving question-answering systems. I have spoken at various Python and ML conferences and meetups in Europe. My stack includes Python, Docker, Kubernetes, PostgreSQL, and Go.

My background is in Aerospace Engineering and Machine Learning and I hold undergraduate (B.A.Sc in EngSci) and graduate (M.A.Sc) degrees from the University of Toronto.

I work remotely in Europe, currently based in Tallinn, Estonia. In my spare time, I try to contribute to open source projects (e.g. MTEB), see the world, and stay active. I used to race triathlons actively (a sneak peek here and here). These days I'm more into cycling, running, and hiking.

Papers

Isaac Chung, Imene Kerboua, Márton Kardos, Roman Solomatin, and Kenneth Enevoldsen. Maintaining MTEB: Towards Long Term Usability and Reproducibility of Embedding Benchmarks. Championing Open-source DEvelopment in ML Workshop @ ICML 2025.
Chenghao Xiao, Isaac Chung, Imene Kerboua, Jamie Stirling et al. MIEB: Massive Image Embedding Benchmark. arXiv:2504.10471, 2025.
Kenneth Enevoldsen, Isaac Chung, Imene Kerboua, Márton Kardos et al. MMTEB: Massive Multilingual Text Embedding Benchmark. The Thirteenth International Conference on Learning Representations, 2025.
Isaac Chung, Phat Vo, Arman C. Kizilkale, and Aaron Reite. Efficient In-Domain Question Answering for Resource-Constrained Environments. arXiv:2409.17648, 2024.
Elizaveta Korotkova and Isaac Chung. Beyond Toxic: Toxicity Detection Datasets are Not Enough for Brand Safety. arXiv:2303.15110, 2023.

Talks and Conferences

2025 Berlin Buzzwords (Berlin, Germany): Reproducibility in Embedding Benchmarks | Video | Slides
2025 PyData London (London, UK): Reproducibility in Embedding Benchmarks | Slides
2024 Swiss Python Summit (Zurich, Switzerland): Prototype to Production for RAG applications | Video | Slides
2024 Data Makers Fest (Porto, Portugal): Prototype to Production for RAG applications | Slides
2024 PyCon PL (Gliwice, Poland): Transcend the Knowledge Barriers in RAG | Slides
2024 PyCon LT (Vilnius, Lithuania): Speed up open source LLM-serving with llama-cpp-python | Video | Slides | GitHub
2024 PyCon LT (Vilnius, Lithuania): Transcend the Knowledge Barriers in RAG | Video | Slides
2023 TD Lab Live AI Talk (Remote): Beyond Llama2: Future Trends and Challenges with LLMs | Video | Slides
2023 EstoniAI Meetup Vol. 5 (Tallinn, Estonia): Panel Discussion on Recap AI developments and future trends
2022 ECIR Industry Day (Stavanger, Norway): Scaling Cross-Domain Content-Based Image Retrieval for E-commerce Snap and Search Application. Talk not recorded.
2021 Clarifai Perceive Conference (Remote): Automating Data Labeling for Deep Learning - AI-Automated Data Labeling | Video | Slides

Projects

Organizer @ PyData Tallinn
Maintainer @ Massive Text Embedding Benchmark (MTEB)
Strava Kudos Bot: https://github.com/isaac-chung/strava-kudos
Open source contributions, such as

Blogs

I log my learnings on Generative AI/ML in a blog and try to keep each post to a 3-5 minute read.
Neptune LLMOps Blog on building RAG systems
I have also written a few blog posts for Clarifai. Here are a few recent examples:
Supercharge your LLM via Retrieval Augmented Fine-tuning
The Landscape of Multimodal Evaluation Benchmarks
Do LLMs Reign Supreme In Few-Shot NER? Part III
Do LLMs Reign Supreme In Few-Shot NER? Part II
Do LLMs Reign Supreme In Few-Shot NER?
Multi-modal Moderation
A Comprehensive Guide To Vector Search

Consulting

I'm open to providing consulting services in ML/AI. Send me an email or reach out over LinkedIn.
