Prof. Annie Lee

Annie En-Shiun Lee, PhD

Ontario Tech University (Assistant Professor)
University of Toronto (Status-only Assistant Professor)

Biography

Annie En-Shiun Lee is an Assistant Professor at Ontario Tech University (OTU) and a Status-only Assistant Professor at the University of Toronto. Her goal is to make language technology inclusive and accessible to as many people as possible. She directs the Lee Language Lab (L³), which focuses on language diversity, multilinguality, and multiculturalism, in line with OTU’s vision of “Tech with a Conscience”. Her research has been published in Nature Digital Medicine, ACM Computing Surveys, ACL, SIGCSE, IEEE TKDE, and Bioinformatics.

Dr. Lee served as demo co-chair for NAACL 2024 and has received numerous recognitions, including the Outstanding Paper Award and the Best Theme Paper Award at NAACL 2025, the Audience Award at Teaching NLP 2024, and the ARIA Spotlight Award for MScAC 2024, as well as nominations for the Tim McTiernan Student Mentorship Award 2025 and the Women in AI Researcher of the Year Award 2025.

She was previously an Assistant Professor (teaching stream) at the University of Toronto for the Elite professional Master’s program. She earned her PhD from the University of Waterloo, was a visiting researcher at the Fields Institute and the Chinese University of Hong Kong, and worked as a research scientist at VerticalScope (as research lead) and Stradigi AI.

Research Interests

Multilinguality, Language Diversity, and Low-Resource Languages

Multicultural Bias and Multimodal Applications

Pedagogy for Natural Language Processing and Machine Learning

Projects

ProxyLM (Findings NAACL 2025)

A lightweight performance proxy that predicts LM accuracy using ~30× less compute. Enables faster model selection, fine-tuning, and prompt iteration while maintaining high predictive reliability.

AlignFreeze (NAACL 2025)

Freezes early transformer layers to preserve syntactic knowledge during fine-tuning. Boosts zero-shot and cross-domain performance with minimal additional training, improving stability and efficiency.

WorldCuisines (NAACL 2025 – Best Theme Paper)

1.2M image–question pairs across 30 languages, capturing global culinary knowledge. A benchmark for cross-cultural multimodal reasoning with applications in cultural AI research.

Teaching NLP

Award-winning paper at the Teaching NLP workshop on empowering multilinguality in NLP education. Showcases teaching strategies, open resources, and collaborative projects funded by NSERC USRA and the Fields Institute.

URIEL+ World Language Database (COLING 2025)

Expanded typological and geographic language database with improved NLP integration. Includes a Python package for multilingual benchmarking, cross-lingual transfer, and dataset alignment.

AiTaigi Hokkien Learning App

Multimodal app for Taiwanese Hokkien featuring speech, text, and audio examples. Developed as a student-led project and awarded the Student Engagement Award by U of T Computer Science.

TranslationCorrect (ACL 2025)

Interactive translation quality assistant that detects and corrects machine translation errors. Enhances translator efficiency while maintaining linguistic fluency and semantic accuracy.

Multilingual Understanding and Reasoning of LLMs (Findings of EMNLP 2024)

This project aims to strengthen the multilingual understanding and reasoning capabilities of large language models (LLMs), with a focus on low-resource languages.

Full list on Google Scholar

Browse the complete, up-to-date publication list, citations, and co-authors.

Join Us

Choose the path that fits you. Please carefully follow the instructions below:

Collaborations (Academic & Industry)

Propose joint research, co-supervision, visiting positions, R&D projects, evaluations, or joint award submissions with L³.

Undergraduate Students (UofT, OTU & other institutions)

Pathways for undergrads to work with L³ via URA, Honours Thesis, University Works, or UofT CSC494/495.

MSc / PhD Applicants (OTU, UofT, etc.)

For applicants seeking graduate study with L³ at OTU/UofT. Start with the call, complete forms, and prepare your materials.

Letters of Recommendation (Former Students)

For former L³ students requesting an academic or industry reference letter.

Teaching Experience

Ontario Tech University (OTU)

University of Toronto

York University

Students

David Anugraha

  • Lead author: URIEL+ Typological Knowledge Base.
  • Co-author: WorldCuisines & ProxyLM (Multilingual VQA, LM performance prediction).
  • Co-author: MT performance on low-resource languages.
  • Starting PhD, Stanford University (Fall 2025).
Enrique David Guzman Ramírez

  • Data Engineer, J.D. Power.
  • MScAC student, University of Toronto.
  • Vector Scholarship in AI (2022–23).
Kosei Uemura

  • Focus: Multilingual NLP & reasoning in LLMs.
  • Lead author: AfriInstruct (instruction tuning for African languages).
  • Co-author: Empowering the Future with Multilinguality & Language Diversity.
Mason Shipton

  • Co-author: URIEL+ (8,000+ language vectors).
  • Co-author: Empowering the Future (NLP course framework).
  • Programmer Analyst, Ontario Teachers’ Pension Plan (cloud solutions; Innovation Newsletter curator).
Labib Rahman

  • ExploRIEL — UI with chatbot for URIEL+ language distances & vectors.
  • SoulsBot+ — LLM-powered tutorial chatbot for Dark Souls: The Board Game.
  • LinguaQuest — RPG-style educational game for linguists.
  • Master’s student, Ontario Tech; researcher at Lee Lab & UXRLab.
Quang Phuoc Nguyen

  • Data Selection for Multilingual Alignment — selects optimal languages for LM fine-tuning.
  • Merlin: Curriculum Alignment — encoder–decoder stacking to improve multilingual alignment.
  • Game Dialogue Translation — survey of LLM performance in game localization.
Malikeh Ehghaghi

Amane Takeuchi

  • Business Analyst, Amazon (Tokyo, Japan).
  • Research Project Lead & RA (ML model interpretation in clinical apps, NLP, CS education & EDI; PyTorch).
  • BSc Applied Math; Specialist Data Science; Major CS; Minor Math — University of Toronto (Dean’s List 2023).
  • Vice-Chair & Career Event Director, UofT Japan Network; TA for MAT135/136/235.
Tong Su

  • Software Engineer Intern, Vortexa (LLMs for maritime data parsing).
  • MSc Advanced Computer Science, University of Oxford (2024–2025).
  • Former Full-Stack Developer, Northbridge Financial (Angular, Django, .NET; 10,000+ users).
  • TA & Course Supporter, University of Toronto (Python, Unix/Git, Research Software).
  • Research Assistant (Lee Lab & AI for Justice): PEFT for low-resource NMT; first author — NAACL 2024.
  • Passed CFA Program Level I (Oct 2024).
Syed Mekhael Wasti

  • Lead author, ACL 2025 demo: TranslationCorrect.
  • Co-author, TeachNLP 2024: Multilinguality paper.
  • MSc, Queen’s University (Vector Scholar), Fall 2025.
Hasti Toossi

  • Software Engineer, PolyAI.
  • Research: NLP; Programming Languages (Type Theory).
  • Recent graduate, University of Toronto.
Aditya Khan

Eric Khiu

Vincent Shuai

  • Student Engagement Award (2024), University of Toronto CS.
  • Software Engineer, TikTok.
Shou-Yi Hung (Ray)

Awards

Awards & Recognitions

Research Grants
