I am an assistant professor in Computer Science (Courant Institute) and Data Science at New York University. Previously, I was an assistant professor in the Computer Science department at the University of Texas at Austin, starting in 2020. Before UT, I was a researcher at Google AI in NYC and a Ph.D. student at UW, advised by Luke Zettlemoyer and Yejin Choi.
I enjoy studying real-world language use with simple and generalizable models. I also build benchmarks that allow us to evaluate NLP models, conduct model analysis, and bring the progress in English NLP to a wider range of languages. Here are the research topics that I am currently interested in:
- Continual Learning and Knowledge Editing: While LMs retain vast amounts of world knowledge seen during pretraining, such knowledge can get outdated. I am interested in retrieval augmentation and updating parametric knowledge in LMs.
- Long-form Question Answering: Enabling systems to produce paragraph-level answers opens up possibilities to handle more complicated questions and provide more comprehensive answers. LFQA merges two challenging research areas -- information retrieval and text generation. Furthermore, we have to synthesize information from multiple documents.
- Human-LM Interaction: NLP systems are being deployed quickly and widely. I am interested in improving human interactions with LMs. For example, how should we present information so that users are not misled by plausible yet imperfect model predictions? The deployment of models also creates opportunities to learn from interaction with users.
- Spoken Language Processing: Spoken language exhibits rich prosodic features that are absent in written text. Can we build textless NLP systems that work directly on speech signals, opening doors to handling languages without written scripts?
News
Research Group
I am fortunate to work with many talented students and collaborators. I don't have a group name, but my students like to play with naming my lab, such as (E)xplaining and (U)nderstanding (N)ature and (S)tructure/Synthesis (O)f (L)anguage. If you would like to join my lab, please apply to the NYU CS or DS Ph.D. program and mention my name in your application. I will be recruiting 1-2 students this cycle.
The research at my lab has been supported by Google, Open Philanthropy, Cisco, Sony, Home Depot, Apple, and NSF. Thank you!
- Current
- Michael J.Q. Zhang
- Anuj Diwan (w/ David Harwath)
- Fangyuan Xu
- Hung-Ting Chen
- Leo Zeyu Liu (w/ Greg Durrett)
- Thom Lake (w/ Greg Durrett)
- Yuhan Liu
- Varun Yerram (w/ He He)
- Xiang Liu
- Nishant Balepur
- Ayush Jhaveri
- Deniz Qian
Publications
* refers to equal contribution.
- Preprint
- Beyond Single Embeddings: Capturing Diverse Targets with Multi-Query Retrieval, ArXiv 2025
- DiffAdapt: Difficulty-Adaptive Reasoning for Token-Efficient LLM Inference, ArXiv 2025
- PropMEND: Hypernetworks for Knowledge Propagation in LLMs, ArXiv 2025
- Learning to Reason Across Parallel Samples for LLM Reasoning, ArXiv 2025
- On Language Models' Sensitivity to Suspicious Coincidences, ArXiv 2025
- CodeUpdateArena: Benchmarking Knowledge Editing on API Updates, ArXiv 2024
- Peer-reviewed
- User Feedback in Human-LLM Dialogues: A Lens to Understand Users But Noisy as a Learning Signal, EMNLP 2025
- Scaling Rich Style-Prompted Text-to-Speech Datasets, EMNLP 2025
  [project website]
- Improving LLM-as-a-Judge Inference with the Judgment Distribution, EMNLP 2025 Findings
- RARe: Retrieval Augmented Retrieval with In-Context Examples, COLM 2025
- Rhapsody: A Dataset for Highlight Detection in Podcasts, COLM 2025
- RefreshKV: Updating Small KV Cache During Long-form Generation, ACL 2025
- CaLMQA: Exploring culturally specific long-form question answering across 23 languages, ACL 2025
- Diverging Preferences: When do Annotators Disagree and do Models Know?, ICML 2025
- Open-World Evaluation for Retrieving Diverse Perspectives, NAACL 2025
  [talk slides]
- From Distributional to Overton Pluralism: Investigating Large Language Model Alignment, NAACL 2025
  [poster]
- Clarify When Necessary: Resolving Ambiguity Through Interaction with LMs, NAACL 2025 Findings
- Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions, ICLR 2025
  [poster]
- SVFT: Parameter-Efficient Fine-Tuning with Singular Vectors, NeurIPS 2024
- Contrastive Learning to Improve Retrieval for Real-world Fact Checking, EMNLP 2024 FEVER Workshop (Oral)
  [poster]
- Exploring Design Choices for Building Language-Specific LLMs, EMNLP 2024 Findings
  [poster]
- Unit-based Speech-to-Speech Translation Without Parallel Data, EMNLP 2024 Findings
  [poster]
- Understanding Retrieval Augmentation for Long form Question Answering, COLM 2024
  [poster]
- AmbigDocs: Reasoning across Documents on Different Entities under the Same Name, COLM 2024
  [project website], [poster]
- Long-form Answers to Visual Questions from Blind and Low Vision People, COLM 2024 (Spotlight)
  [project website], [slides], [poster]
- KIWI: A Dataset of Knowledge-Intensive Writing Instructions for Answering Research Questions, ACL 2024 Findings
  [poster]
- BAT: Learning to Reason about Spatial Sounds with Large Language Models, ICML 2024
  [project website]
- Complex Claim Verification with Evidence Retrieved in the Wild, NAACL 2024
- Crafting In-context Examples according to LMs' Parametric Knowledge, NAACL 2024 Findings
  [poster]
- RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation, ICLR 2024
  [poster]
- Learning to Reject with a Fixed Predictor: Application to Decontextualization, ICLR 2024
- Aligning Data with the Goals of an Organization and Its Workers: Designing Data Labeling for Social Service Case Notes, CHI 2024
- Mitigating Temporal Misalignment by Discarding Outdated Facts, EMNLP 2023
  [poster]
- Continually Improving Extractive QA via Human Feedback, EMNLP 2023
  [poster]
- Propagating Knowledge Updates to LMs Through Distillation, NeurIPS 2023
- Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge, ACL 2023
  [poster]
- When to Use Efficient Self Attention? Profiling Text, Speech and Image Transformer Variants, ACL 2023, short
  [poster]
- A Critical Evaluation of Evaluations for Long-form Question Answering, ACL 2023
  [data/code], [talk]
- Concise Answers to Complex Questions: Summarization of Long-form Answers, ACL 2023
  [poster]
- Quantifying Train-Evaluation Overlap with Nearest Neighbors, ACL 2023 Findings
- DIFFQG: Generating Questions to Summarize Factual Changes, EACL 2023
- Continual Learning for On-Device Speech Recognition using Disentangled Conformers, ICASSP 2023
  [poster]
- Understanding Postpartum Parents' Experiences via Two Digital Platforms, CSCW 2023 April Issue
- Generating Literal and Implied Subquestions to Fact-check Complex Claims, EMNLP 2022
  [project website], [poster]
- Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating Models to Reflect Conflicting Evidence, EMNLP 2022
- Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality, EMNLP 2022
  [code], [slides]
- Beyond Counting Datasets: A Survey of Multilingual Dataset Construction and Necessary Resources, Findings of EMNLP 2022
  Xinyan Velocity Yu*, Akari Asai*, Trina Chatterjee, Junjie Hu, Eunsol Choi
  [website]
- TyDiP: A Dataset for Politeness Classification in Nine Typologically Diverse Languages, Findings of EMNLP 2022
  Anirudh Srinivasan, Eunsol Choi
  [code/data], [poster]
- Entity Cloze By Date: What LMs Know About Unseen Entities, Findings of NAACL 2022 short
  Yasumasa Onoe, Michael J.Q. Zhang, Eunsol Choi, Greg Durrett
  [poster]
- Modeling Exemplification in Long-form Question Answering via Retrieval, NAACL 2022
  Shufan Wang, Fangyuan Xu, Laure Thompson, Eunsol Choi, Mohit Iyyer
  [slides]
- How do we answer complex questions: Discourse structure of long form answers, ACL 2022
  Fangyuan Xu, Junyi Jessy Li, Eunsol Choi
  [code], [poster], [slides]
- Simulating Bandit Learning from User Feedback for Extractive Question Answering, ACL 2022
  Ge Gao, Eunsol Choi, Yoav Artzi
  [code], [poster], [slides]
- Misinfo Reaction Frames: Reasoning about Readers' Reactions to News Headlines, ACL 2022
  Saadia Gabriel, Skyler Hallinan, Maarten Sap, Pemi Nguyen, Franziska Roesner, Eunsol Choi, Yejin Choi
  [website], [poster]
- CREAK: A Dataset for Commonsense Reasoning over Entity Knowledge, NeurIPS 2021 Benchmark and Datasets
  Yasumasa Onoe, Michael J.Q. Zhang, Eunsol Choi, Greg Durrett
  [website]
- SituatedQA: Incorporating Extra-Linguistic Contexts into QA, EMNLP 2021 long, Outstanding Paper Award
  Michael J.Q. Zhang, Eunsol Choi
  [website], [slides]
- Learning with Different Amount of Annotation: From Zero to Many Labels, EMNLP 2021
  Shujian Zhang, Chenyue Gong, Eunsol Choi
  [code]
- Can NLI Models Verify QA Systems' Predictions?, Findings of EMNLP 2021
  Jifan Chen, Eunsol Choi, Greg Durrett
  [slides]
- Challenges in Information-Seeking QA: Unanswerable Questions and Paragraph Retrieval, ACL 2021
  Akari Asai, Eunsol Choi
  [code]
- Knowing More About Questions Can Help: Improving Calibration in Question Answering, Findings of ACL 2021
  Shujian Zhang, Chenyue Gong, Eunsol Choi
  [code]
- NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons Learned, PMLR 2021
  Sewon Min, Jordan Boyd-Graber, Chris Alberti, Danqi Chen, Eunsol Choi, Michael Collins, Kelvin Guu, Hannaneh Hajishirzi, Kenton Lee, Jennimaria Palomaki, Colin Raffel, Adam Roberts, Tom Kwiatkowski, and more (EfficientQA participants)
  [website]
- QED: A Framework and Dataset for Explanations in Question Answering, TACL 2021
  Matthew Lamm, Jennimaria Palomaki, Chris Alberti, Daniel Andor, Eunsol Choi, Livio Baldini Soares, Michael Collins
  [code]
- XOR QA: Cross-lingual Open-Retrieval Question Answering, NAACL 2021
  Akari Asai, Jungo Kasai, Jonathan H. Clark, Kenton Lee, Eunsol Choi, Hannaneh Hajishirzi
  Project Page [Code / Data / Leaderboard / Poster]
- Decontextualization: Making Sentences Stand-Alone, TACL 2021
  Eunsol Choi, Jennimaria Palomaki, Matthew Lamm, Tom Kwiatkowski, Dipanjan Das, Michael Collins
  [code/data], [poster]
- Entities as Experts: Sparse Memory Access with Entity Supervision, EMNLP 2020
  Thibault Févry, Livio Baldini Soares, Nicholas FitzGerald, Eunsol Choi, Tom Kwiatkowski
- TYDI QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages, TACL 2020
  Jonathan H. Clark, Eunsol Choi, Michael Collins, Dan Garrette, Tom Kwiatkowski, Vitaly Nikolaev, Jennimaria Palomaki
  [Leaderboard / Website] [Code / Data Repo]
- MRQA 2019 Shared Task: Evaluating Generalization in Reading Comprehension, EMNLP 2019 workshop proceedings
  Adam Fisch, Alon Talmor, Robin Jia, Minjoon Seo, Eunsol Choi, and Danqi Chen
- Pair2Vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference, NAACL 2019
  Mandar Joshi, Eunsol Choi, Omer Levy, Dan Weld and Luke Zettlemoyer
- No Permanent Friends or Enemies: Tracking Dynamic Relationships Between Nations from News, NAACL 2019 Long
  Xiaochuang Han, Eunsol Choi, and Chenhao Tan
- FlowQA: Grasping Flow in History For Conversational Machine Comprehension, ICLR 2019, poster
  Hsin-Yuan Huang, Eunsol Choi and Wen-tau Yih
- Neural Metaphor Detection in Context, EMNLP 18 short
  Ge Gao, Eunsol Choi, Yejin Choi and Luke Zettlemoyer
  [Slides] [Code]
- QuAC: Question Answering in Context, EMNLP 18 long
  Eunsol Choi*, He He*, Mohit Iyyer*, Mark Yatskar*, Wen-tau Yih, Yejin Choi, Percy Liang and Luke Zettlemoyer
  Project Page [Code / Data / Leaderboard / Poster]
- Ultra-Fine Entity Typing, ACL 18 long
  Eunsol Choi, Omer Levy, Yejin Choi and Luke Zettlemoyer
  Project Page [Code / Data / Slides]
- Truth of Varying Shades: Analyzing Language in Fake News and Political Fact-Checking, EMNLP 17 short
  Hannah Rashkin, Eunsol Choi, Jin Yea Jang, Svitlana Volkova and Yejin Choi
  Project Page [Code / Data / Slides]
- Zero-Shot Relation Extraction Via Reading Comprehension, CoNLL 17 long
  Omer Levy, Minjoon Seo, Eunsol Choi and Luke Zettlemoyer
  Project Page [Code / Data / Leaderboard]
- Coarse-to-Fine Question Answering for Long Documents, ACL 17 long
  Eunsol Choi, Daniel Hewlett, Jakob Uszkoreit, Illia Polosukhin, Alexandre Lacoste and Jonathan Berant
  [Slides]
- TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension, ACL 17 long
  Mandar Joshi, Eunsol Choi, Daniel Weld and Luke Zettlemoyer
  Project Page [Code / Data / Leaderboard]
- Document-level Sentiment Inference with Social, Faction, and Discourse Context, ACL 16 long
  Eunsol Choi, Hannah Rashkin, Yejin Choi and Luke Zettlemoyer
  [Data]
  Errata: In Section 5, the metric should be micro-averaged instead of macro-averaged. The second equation in Section 2.2 has also been corrected. The current version here is fixed.
- Extracting Structured Scholarly Information from Machine Translation Literature, LREC 16 long
  Eunsol Choi*, Matic Horvat*, Jonathan May, Kevin Knight, Daniel Marcu
  [Poster] [Code]
- Scalable Semantic Parsing With Partial Ontologies, ACL 15 long
  Eunsol Choi, Tom Kwiatkowski and Luke Zettlemoyer
  [Slides] [Data (414MB)]
- Scaling Semantic Parsers with On-the-Fly Ontology Matching, EMNLP 13 long
  Tom Kwiatkowski, Eunsol Choi, Yoav Artzi and Luke Zettlemoyer
  [Slides]
- Hedge detection as a lens on framing in the GMO debates: A position paper, ACL 12 workshop
  Eunsol Choi, Chenhao Tan, Lillian Lee, Cristian Danescu-Niculescu-Mizil, Jennifer Spindel
  Project Page [Data / Slides]
Teaching
- Spring 2026 CS 2590: Natural Language Processing (grad)
- Fall 2025 DS-GA 1011: Fundamentals of Natural Language Processing (grad)
- Spring 2025: CSCI-GA.3033-116 Special Topics: Emerging Topics in Natural Language Processing (grad)
- UT Fall 2023: CS 395T Topics in Natural Language Processing (grad)
- UT Spring 2021-23: CS 378 Natural Language Processing (undergrad)
- UT Fall 2021: CS 388 Natural Language Processing (grad)
- UT Fall 2020: CS 395T Topics in Natural Language Processing (grad)
Service
- Workshop Co-chair, ACL 2024, AKBC 2021
- Program Co-chair, COLM 2025, AKBC 2022
- Organizing Committee, Rising Stars in EECS 2022
- Publicity Co-chair, EMNLP 2022
- Best Paper Committee, ICLR 2024, EMNLP 2025
- Senior Area Chair, EMNLP 2021-22, ACL 2023
- Area Chair, EMNLP 2019, ACL 2020, EMNLP 2020, NeurIPS 2021 (benchmark track), ICML 2023-24, COLM 2024
- Action Editor, ARR
- Standing Reviewer, TACL, CL, TMLR
- Reviewer, ACL, EMNLP, NeurIPS, ICLR, ICML, AKBC, and occasionally some related workshops (please see my CV).
- Workshop co-organizer, Workshop on Machine Reading for Question Answering, ACL 2018, EMNLP 2019, EMNLP 2021
- Efficient QA Workshop, NeurIPS 2020
- Faculty mentor, UT Graduate Women and Gender Minorities in Computing, 23-24
- Co-organizer, Rising stars in EECS 2022
Alumni
- Jifan Chen (w/ Greg Durrett)
- Yasumasa Onoe (w/ Greg Durrett)
- Anirudh Srinivasan
- Hung-ting Chen
- Pranav Atreya
- Abhilash Potluri
- Rishi Salem (w/ David Harwath)
- Shankar Padmanabhan
- Yoonsang Lee
- Younghan Park
- Doeun Lee
- Atula Tejaswi Neerkaje
Personal
My name (은솔) means soft, persistent love in ancient Korean (or at least my father claims so). I use she/her pronouns. Here are voice clips to help you pronounce my name: Korean version: [] Easier version: []
I was fortunate to begin my NLP journey at Cornell as an undergraduate, working with Lillian Lee.