UCLA NLP Seminar Series

Date	Speaker	Title
April 4	Yulia Tsvetkov	Optimizing for Long-Term Vision in a Fast-Paced Research World
April 11	Zhe Gan	How to Build Your Multimodal LLMs: From Pre-training to Post-training and Agents
April 18	David Bamman	Measuring Representation and Linguistic Variation in Hollywood
May 9	Aditya Kusupati	Matryoshka Principles for Adaptive Intelligence
May 13	Alice Oh	Beyond Accuracy: Rethinking LLM Evaluation for Real-World, Interactive, and Culturally Inclusive Scenarios
May 16	Emma Pierson	Using New Data to Answer Old Questions
May 23	Shou-De Lin and Mi-Yen Yeh	Beyond Prompts: Neuron-Level Guidance for LLM, Model Merging for LLMs in Spectral Domain
May 30	Julie Kallini	MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
June 6	Philippe Laban	On Interacting and Writing with LLMs

🚀 Upcoming Talks

JUN

On Interacting and Writing with LLMs

Philippe Laban

June 6, 2025, 2:00 PM

289, Engineering VI (Virtual Speaker)

Speaker Bio: Philippe Laban is a Research Scientist at Microsoft Research, based in New York. Philippe works at the intersection of NLP (~70%) and HCI (~30%), and is passionate about studying and building the future of reading and writing interfaces.

Abstract: In this two-part talk, we will cover topics at the intersection of NLP and HCI. In the first part, we'll cover recent work on multi-turn evaluation of LLMs, with findings indicating that LLMs tend to get "lost in conversation" when user instructions are underspecified and require interactivity from the LLM. In the second part, we will turn to writing interfaces, first introducing an interface (InkSync) that can facilitate human-AI interaction for writing (and ensure factual correctness), and then turning to a high-level question: what does it mean for LLMs to produce creative writing, and how does AI writing compare to expert-level writing?

🚨 Past Talks

MAY

MrT5: Dynamic Token Merging for Efficient Byte-level Language Models

Julie Kallini

May 30, 2025, 2:00 PM

289, Engineering VI

Speaker Bio: Julie Kallini is a second-year Ph.D. student in Computer Science at Stanford University, advised by Christopher Potts and Dan Jurafsky. Her research focuses on natural language processing (NLP), with an emphasis on computational linguistics/cognitive science, tokenization, and model architecture. Her paper, "Mission: Impossible Language Models," won a Best Paper Award at ACL 2024. Her work is supported by the NSF Graduate Research Fellowship, the Stanford School of Engineering Graduate Fellowship, and the Stanford EDGE Fellowship. Before starting her Ph.D., Julie was a software engineer at Meta, where she worked on machine learning for advertisements. Julie graduated summa cum laude from Princeton University with a B.S.E. in Computer Science and a minor in Linguistics.

Abstract: Models that rely on subword tokenization have significant drawbacks, such as sensitivity to character-level noise like spelling errors and inconsistent compression rates across different languages and scripts. While character- or byte-level models like ByT5 attempt to address these concerns, they have not gained widespread adoption—processing raw byte streams without tokenization results in significantly longer sequence lengths, making training and inference inefficient. This work introduces MrT5 (MergeT5), a more efficient variant of ByT5 that integrates a token deletion mechanism in its encoder to dynamically shorten the input sequence length. MrT5 achieves up to 75% sequence length reduction with minimal performance loss, offering faster inference and competitive accuracy on multilingual and character-level tasks. Our approach presents a solution to the practical limitations of existing byte-level models.

MAY

Beyond Accuracy: Rethinking LLM Evaluation for Real-World, Interactive, and Culturally Inclusive Scenarios

Alice Oh

May 13, 2025, 4:15 PM

3400 Boelter Hall (Co-located with CS 201)

Speaker Bio: Alice Oh is a Professor in the School of Computing at KAIST. Her major research area is at the intersection of natural language processing (NLP) and computational social science, with a recent focus on multilingual and multicultural aspects of LLMs. She collaborates with scholars in humanities and social sciences such as political science, education, and history. She has served as Program Chair for ICLR 2021 and NeurIPS 2022, General Chair for ACM FAccT 2022 and NeurIPS 2023, and DEI Chair for COLM 2024. She is the current President of SIGDAT, which oversees EMNLP.

Abstract: Traditional evaluation methods for large language models (LLMs), often centered on accuracy in static multiple-choice or short-answer questions, fail to capture the complexities of real-world use. As we envision LLMs serving users in dynamic, multicultural, and interactive scenarios, we must rethink what meaningful evaluation looks like. This talk presents our recent research to advance LLM evaluation through culturally aware, socially grounded, and interaction-driven benchmarks. We assess factual consistency across languages and regions, explore everyday knowledge in underrepresented cultures, and examine cultural inclusivity. We highlight that while LLMs may not appear to be socially biased in simple question-answering, they reveal their biases in generation tasks, which are more closely aligned with actual LLM usage. We further introduce dynamic and interactive evaluation paradigms: LLM-as-an-Interviewer, which mimics real-time user interaction, and Flex-TravelPlanner, which evaluates planning adaptability under evolving and prioritized constraints. Together, these papers reveal that accuracy alone is insufficient; LLM evaluation must consider culture, context, interactivity, and adaptation. This talk calls for a broader evaluation agenda and presents these ten papers as starting points for more robust, inclusive, and realistic assessments.

MAY

Using New Data to Answer Old Questions

Emma Pierson

May 16, 2025, 2:00 PM

289, Engineering VI (Virtual Speaker)

Speaker Bio: Emma Pierson is an assistant professor of computer science at UC Berkeley and core faculty in the Computational Precision Health program. She develops data science and machine learning methods to study inequality and healthcare. Her work has been recognized by best paper, poster, and talk awards, an NSF CAREER award, a Rhodes Scholarship, Hertz Fellowship, Rising Star in EECS, MIT Technology Review 35 Innovators Under 35, Forbes 30 Under 30 in Science, AI2050 Early Career Fellowship, and Samsung AI Researcher of the Year. Her research has been published in venues including Nature, JAMA, The New England Journal of Medicine, PNAS, Nature Medicine, ICML and ICLR, and she has also written for The New York Times, FiveThirtyEight, Wired, and various other publications.

Abstract: The explosion of new data sources has created new opportunities, and necessitated new machine learning methods, to answer old questions in the health and social sciences. This talk discusses three stories under this theme: first, using image data to quantify inequality in policing; second, using text data to interpretably predict target variables and characterize disparities; and third, using address data to infer fine-grained migration patterns.

MAY

Matryoshka Principles for Adaptive Intelligence

Aditya Kusupati

May 9, 2025, 2:00 PM

289, Engineering VI

Speaker Bio: Aditya Kusupati is a Staff Research Scientist at Google DeepMind. He received his PhD from the University of Washington and his B.Tech from IIT Bombay. Between his B.Tech and PhD, he was a Research Fellow at Microsoft Research. His research focuses broadly on next-generation machine learning models geared towards adaptive intelligence.

Abstract: The increasing scale of deep learning models presents significant challenges for deployment across diverse computational environments, each with unique constraints on latency, memory, and energy. Traditional approaches often necessitate training and maintaining separate models for each desired operating point, leading to substantial overhead. This talk explores the "Matryoshka" principle, a promising paradigm for achieving computational adaptivity within a single trained artifact. Inspired by Russian nesting dolls, Matryoshka methods embed coarser, computationally cheaper structures within finer, more powerful ones, enabling dynamic adjustment of resource usage at inference time. This technique generalizes across fundamental components of machine learning such as embeddings, Transformers, and even the integer data type used for quantization. The community has extended it beyond these components, and it has seen a wide array of deployments across both industry and open source, serving over a billion users daily. Collectively, these works demonstrate how the Matryoshka principle facilitates unified training of highly flexible models that can seamlessly adapt their computational footprint post-training, significantly simplifying deployment and enhancing efficiency across heterogeneous hardware.

APR

Measuring Representation and Linguistic Variation in Hollywood

Prof. David Bamman, University of California Berkeley

April 18, 2025, 2:00 PM

289, Engineering VI

Speaker Bio: David Bamman is an associate professor in the School of Information at UC Berkeley, where he works in the areas of natural language processing and cultural analytics, applying NLP and machine learning to empirical questions in the humanities and social sciences. His research focuses on improving the performance of NLP for underserved domains like literature (including LitBank and BookNLP) and exploring the affordances of empirical methods for the study of literature and culture. Before Berkeley, he received his PhD in the School of Computer Science at Carnegie Mellon University and was a senior researcher at the Perseus Project of Tufts University. Bamman's work is supported by the National Endowment for the Humanities, National Science Foundation, an Amazon Research Award, and an NSF CAREER award.

Abstract: Movies are a massively popular and influential form of media, but their computational study at scale has largely been off-limits to researchers in the United States due to the Digital Millennium Copyright Act. In this talk, I'll discuss recent regulatory changes at the U.S. Copyright Office that allow for large-scale text and data mining of film, and describe our efforts to build a collection of 2,307 films representing the top 50 movies by U.S. box office over the period 1980 to 2022, along with award nominees. Building this collection allows us to carry out several large-scale computational studies of film; I'll discuss our work measuring changing patterns in the representation of gender and race/ethnicity over the past 43 years (where we see an increase in diversity over the past decade) and our work leveraging the collection to model variation in emotional performances and choice of adverbial intensifiers over both narrative and historical time. This work illustrates a new frontier of the data-driven analysis of film at a large scale.

APR

How to Build Your Multimodal LLMs: From Pre-training to Post-training and Agents

Zhe Gan, Apple

April 11, 2025, 2:00 PM

289, Engineering VI (Virtual Speaker)

Speaker Bio: Dr. Zhe Gan is a Research Scientist and Manager at Apple AI/ML, primarily working on building large-scale vision and multimodal foundation models. Before joining Apple, he was a Principal Researcher at Microsoft. He received his Ph.D. degree from Duke University in 2018. He has served as an Area Chair for top-tier AI conferences and is a recipient of Best Student Paper Honorable Mention Awards at CVPR 2021 and WACV 2021.

Abstract: Multimodal Large Language Models (LLMs) have become an increasingly hot research topic. In this talk, I will present our recent work on how to build performant multimodal LLMs, along several fronts: (1) Pre-training, with a focus on pre-training data choices, multimodal LLM pre-training, and visual encoder pre-training; (2) Post-training, with a focus on text-rich image understanding, visual referring and grounding, UI understanding, and reasoning; and (3) Generalist Agents, with a focus on how to adapt multimodal LLMs into generalist embodied agents.

APR

Optimizing for Long-Term Vision in a Fast-Paced Research World

Prof. Yulia Tsvetkov, University of Washington

April 4, 2025, 2:00 PM

289, Engineering VI

Speaker Bio: Yulia Tsvetkov is an associate professor at the Paul G. Allen School of Computer Science & Engineering at the University of Washington. Her research group works on fundamental advancements to large language models, multilingual NLP, and AI ethics/safety. This research is motivated by a unified goal: to extend the capabilities of human language technology beyond individual populations and across language boundaries, thereby making NLP tools available to all users. Prior to joining UW, Yulia was an assistant professor at Carnegie Mellon University and before that a postdoc at Stanford. Yulia is a recipient of an NSF CAREER award, a Sloan Fellowship, an Okawa Research Award, and multiple paper awards and runners-up at NLP, ML, and CSS conferences.

Abstract: The fast-paced race for larger language models (and the promise of financial gains for the winners) incentivizes heavier engineering with incremental ideas, often at the expense of long-term vision. While this approach advances industry products used by millions, it is not necessarily the right approach for academic research. In this talk, I will present novel task formulations and evaluation benchmarks that question mainstream assumptions about LLM architectures, training/alignment algorithms, and evaluation approaches. While the proposed ideas contradict common practice, they expose blind spots in LLMs' reasoning abilities, and huge performance and fairness gaps in the best commercial LLMs, highlighting directions for future research.

Organizing Committee

Faculty

Prof. Kai-Wei Chang

Prof. Nanyun Peng

Prof. Saadia Gabriel

Prof. Elisa Kreiss

Students

Tanmay Parekh

Yufei Tian

Ashima Suvarna

Salman Rahman