Kai-Wei Chang's Lab

UCLA NLP Seminar Series

Welcome to our weekly seminar series.

Date | Speaker | Title
April 4 | Yulia Tsvetkov | Optimizing for Long-Term Vision in a Fast-Paced Research World
April 11 | Zhe Gan | How to Build Your Multimodal LLMs: From Pre-training to Post-training and Agents
April 18 | David Bamman | Measuring Representation and Linguistic Variation in Hollywood
May 9 | Aditya Kusupati | Matryoshka Principles for Adaptive Intelligence
May 13 | Alice Oh | Beyond Accuracy: Rethinking LLM Evaluation for Real-World, Interactive, and Culturally Inclusive Scenarios
May 16 | Emma Pierson | Using New Data to Answer Old Questions
May 23 | Shou-De Lin, Mi-Yen Yeh | Beyond Prompts: Neuron-Level Guidance for LLM, Model Merging for LLMs in Spectral Domain
May 30 | Julie Kallini | MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
June 6 | Philippe Laban |

🚀 Upcoming Talks

MAY
30

MrT5: Dynamic Token Merging for Efficient Byte-level Language Models

Julie Kallini

May 30, 2025, 2:00 PM

289, Engineering VI

Speaker Bio: Julie Kallini is a second-year Ph.D. student in Computer Science at Stanford University, advised by Christopher Potts and Dan Jurafsky. Her research focuses on natural language processing (NLP), with an emphasis on computational linguistics/cognitive science, tokenization, and model architecture. Her paper, "Mission: Impossible Language Models," won a Best Paper Award at ACL 2024. Her work is supported by the NSF Graduate Research Fellowship, the Stanford School of Engineering Graduate Fellowship, and the Stanford EDGE Fellowship. Before starting her Ph.D., Julie was a software engineer at Meta, where she worked on machine learning for advertisements. Julie graduated summa cum laude from Princeton University with a B.S.E. in Computer Science and a minor in Linguistics.

Abstract: Models that rely on subword tokenization have significant drawbacks, such as sensitivity to character-level noise like spelling errors and inconsistent compression rates across different languages and scripts. While character- or byte-level models like ByT5 attempt to address these concerns, they have not gained widespread adoption—processing raw byte streams without tokenization results in significantly longer sequence lengths, making training and inference inefficient. This work introduces MrT5 (MergeT5), a more efficient variant of ByT5 that integrates a token deletion mechanism in its encoder to dynamically shorten the input sequence length. MrT5 achieves up to 75% sequence length reduction with minimal performance loss, offering faster inference and competitive accuracy on multilingual and character-level tasks. Our approach presents a solution to the practical limitations of existing byte-level models.

🚨 Past Talks

MAY
13

Beyond Accuracy: Rethinking LLM Evaluation for Real-World, Interactive, and Culturally Inclusive Scenarios

Alice Oh

May 13, 2025, 4:15 PM

3400 Boelter Hall (Co-located with CS 201)

Speaker Bio: Alice Oh is a Professor in the School of Computing at KAIST. Her major research area is at the intersection of natural language processing (NLP) and computational social science, with a recent focus on multilingual and multicultural aspects of LLMs. She collaborates with scholars in humanities and social sciences such as political science, education, and history. She has served as Program Chair for ICLR 2021 and NeurIPS 2022, General Chair for ACM FAccT 2022 and NeurIPS 2023, and DEI Chair for COLM 2024. She is the current President of SIGDAT which oversees EMNLP.

Abstract: Traditional evaluation methods for large language models (LLMs), often centered on accuracy in static multiple-choice or short-answer questions, fail to capture the complexities of real-world use. As we envision LLMs serving users in dynamic, multicultural, and interactive scenarios, we must rethink what meaningful evaluation looks like. This talk presents our recent research to advance LLM evaluation through culturally aware, socially grounded, and interaction-driven benchmarks. We assess factual consistency across languages and regions, explore everyday knowledge in underrepresented cultures, and examine cultural inclusivity. We highlight that while LLMs may not appear to be socially biased in simple question-answering, they reveal their biases in generation tasks, which are more aligned with actual LLM usage. We further introduce dynamic and interactive evaluation paradigms: LLM-as-an-Interviewer, which mimics real-time user interaction, and Flex-TravelPlanner, which evaluates planning adaptability under evolving and prioritized constraints. Together, these papers reveal that accuracy alone is insufficient; LLM evaluation must consider culture, context, interactivity, and adaptation. This talk calls for a broader evaluation agenda and presents these ten papers as starting points for more robust, inclusive, and realistic assessments.

MAY
16

Using New Data to Answer Old Questions

Emma Pierson

May 16, 2025, 2:00 PM

289, Engineering VI (Virtual Speaker)

Speaker Bio: Emma Pierson is an assistant professor of computer science at UC Berkeley and core faculty in the Computational Precision Health program. She develops data science and machine learning methods to study inequality and healthcare. Her work has been recognized by best paper, poster, and talk awards, an NSF CAREER award, a Rhodes Scholarship, Hertz Fellowship, Rising Star in EECS, MIT Technology Review 35 Innovators Under 35, Forbes 30 Under 30 in Science, AI2050 Early Career Fellowship, and Samsung AI Researcher of the Year. Her research has been published in venues including Nature, JAMA, The New England Journal of Medicine, PNAS, Nature Medicine, ICML and ICLR, and she has also written for The New York Times, FiveThirtyEight, Wired, and various other publications.

Abstract: The explosion of new data sources has created new opportunities, and necessitated new machine learning methods, to answer old questions in the health and social sciences. This talk discusses three stories under this theme: first, using image data to quantify inequality in policing; second, using text data to interpretably predict target variables and characterize disparities; and third, using address data to infer fine-grained migration patterns.

MAY
9

Matryoshka Principles for Adaptive Intelligence

Aditya Kusupati

May 9, 2025, 2:00 PM

289, Engineering VI

Speaker Bio: Aditya Kusupati is a Staff Research Scientist at Google DeepMind. He received his PhD from the University of Washington and his B.Tech from IIT Bombay. Between his B.Tech and PhD, he was a Research Fellow at Microsoft Research. His research focuses broadly on next-generation machine learning models geared towards adaptive intelligence.

Abstract: The increasing scale of deep learning models presents significant challenges for deployment across diverse computational environments, each with unique constraints on latency, memory, and energy. Traditional approaches often necessitate training and maintaining separate models for each desired operating point, leading to substantial overhead. This talk explores the "Matryoshka" principle, a promising paradigm for achieving computational adaptivity within a single trained artifact. Inspired by Russian nesting dolls, Matryoshka methods embed coarser, computationally cheaper structures within finer, more powerful ones, enabling dynamic adjustment of resource usage at inference time. This technique generalizes across various fundamental components of machine learning, such as embeddings, Transformers, and even the integer data type for quantization. The community has extended it beyond these components, and it has seen a wide array of deployments across both industry and open source, serving over a billion users daily. Collectively, these works demonstrate how the Matryoshka principle facilitates unified training of highly flexible models that can seamlessly adapt their computational footprint post-training, significantly simplifying deployment and enhancing efficiency across heterogeneous hardware.

APR
18

Measuring Representation and Linguistic Variation in Hollywood

Prof. David Bamman, University of California, Berkeley

Apr 18, 2025, 2:00 PM

289, Engineering VI

Speaker Bio: David Bamman is an associate professor in the School of Information at UC Berkeley, where he works in the areas of natural language processing and cultural analytics, applying NLP and machine learning to empirical questions in the humanities and social sciences. His research focuses on improving the performance of NLP for underserved domains like literature (including LitBank and BookNLP) and exploring the affordances of empirical methods for the study of literature and culture. Before Berkeley, he received his PhD in the School of Computer Science at Carnegie Mellon University and was a senior researcher at the Perseus Project of Tufts University. Bamman's work is supported by the National Endowment for the Humanities, National Science Foundation, an Amazon Research Award, and an NSF CAREER award.

Abstract: Movies are a massively popular and influential form of media, but their computational study at scale has largely been off-limits to researchers in the United States due to the Digital Millennium Copyright Act. In this talk, I'll discuss recent regulatory changes at the U.S. Copyright Office that allow for large-scale text and data mining of film, and describe our efforts to build a collection of 2,307 films representing the top 50 movies by U.S. box office over the period 1980 to 2022, along with award nominees. Building this collection allows us to carry out several large-scale computational studies of film; I'll discuss our work measuring changing patterns in the representation of gender and race/ethnicity over the past 43 years (where we see an increase in diversity over the past decade) and in leveraging it to model variation in emotional performances and choice of adverbial intensifiers over both narrative and historical time. This work illustrates a new frontier of the data-driven analysis of film at a large scale.

APR
11

How to Build Your Multimodal LLMs: From Pre-training to Post-training and Agents

Zhe Gan, Apple

April 11, 2025, 2:00 PM

289, Engineering VI (Virtual Speaker)

Speaker Bio: Dr. Zhe Gan is a Research Scientist and Manager at Apple AI/ML, primarily working on building large-scale vision and multimodal foundation models. Before joining Apple, he was a Principal Researcher at Microsoft. He received his Ph.D. degree from Duke University in 2018. He has served as an Area Chair for top-tier AI conferences and is a recipient of Best Student Paper Honorable Mention Awards at CVPR 2021 and WACV 2021.

Abstract: Multimodal Large Language Models (LLMs) have become an increasingly hot research topic. In this talk, I will present our recent work on how to build performant multimodal LLMs along several fronts: (1) Pre-training, with a focus on pre-training data choices, multimodal LLM pre-training, and visual encoder pre-training; (2) Post-training, with a focus on text-rich image understanding, visual referring and grounding, UI understanding, and reasoning; and (3) Generalist Agents, with a focus on how to adapt multimodal LLMs into generalist embodied agents.

APR
4

Optimizing for Long-Term Vision in a Fast-Paced Research World

Prof. Yulia Tsvetkov, University of Washington

Apr 4, 2025, 2:00 PM

289, Engineering VI

Speaker Bio: Yulia Tsvetkov is an associate professor at the Paul G. Allen School of Computer Science & Engineering at the University of Washington. Her research group works on fundamental advancements to large language models, multilingual NLP, and AI ethics/safety. This research is motivated by a unified goal: to extend the capabilities of human language technology beyond individual populations and across language boundaries, thereby making NLP tools available to all users. Prior to joining UW, Yulia was an assistant professor at Carnegie Mellon University and before that a postdoc at Stanford. Yulia is a recipient of an NSF CAREER award, a Sloan Fellowship, an Okawa Research award, and multiple paper awards and runner-up recognitions at NLP, ML, and CSS conferences.

Abstract: The fast-paced race for larger language models, and the promise of financial gains for the winners, incentivizes heavier engineering with incremental ideas, often at the expense of long-term vision. While this approach advances industry products used by millions, it is not necessarily the right approach for academic research. In this talk, I will present novel task formulations and evaluation benchmarks that question mainstream assumptions about LLM architectures, training/alignment algorithms, and evaluation approaches. While the proposed ideas contradict common practice, they expose blind spots in LLMs' reasoning abilities, as well as huge performance and fairness gaps in the best commercial LLMs, highlighting directions for future research.

Organizing Committee

Faculty

Prof. Kai-Wei Chang

Prof. Nanyun Peng

Prof. Saadia Gabriel

Prof. Elisa Kreiss

Students

Tanmay Parekh

Yufei Tian

Ashima Suvarna

Yining Hong

Salman Rahman