Kai-Wei Chang's Lab

UCLA NLP Seminar Series

Welcome to our weekly seminar series.

Date | Speaker | Title
April 4 | Yulia Tsvetkov | Optimizing for Long-Term Vision in a Fast-Paced Research World
April 11 | Zhe Gan | How to Build Your Multimodal LLMs: From Pre-training to Post-training and Agents
April 18 | David Bamman | Measuring Representation and Linguistic Variation in Hollywood
May 9 | Swabha Swayamdipta | TBA
May 16 | Emma Pierson | TBA
May 23 | Shou-De Lin & Mi-Yen Yeh | TBA

🚀 Upcoming Talks

Measuring Representation and Linguistic Variation in Hollywood

Prof. David Bamman, University of California, Berkeley

Apr 18, 2025, 2:00 PM

Room 289, Engineering VI

Speaker Bio: David Bamman is an associate professor in the School of Information at UC Berkeley, where he works in the areas of natural language processing and cultural analytics, applying NLP and machine learning to empirical questions in the humanities and social sciences. His research focuses on improving the performance of NLP for underserved domains like literature (including LitBank and BookNLP) and exploring the affordances of empirical methods for the study of literature and culture. Before Berkeley, he received his PhD in the School of Computer Science at Carnegie Mellon University and was a senior researcher at the Perseus Project of Tufts University. Bamman's work is supported by the National Endowment for the Humanities, National Science Foundation, an Amazon Research Award, and an NSF CAREER award.

Abstract: Movies are a massively popular and influential form of media, but their computational study at scale has largely been off-limits to researchers in the United States due to the Digital Millennium Copyright Act. In this talk, I'll discuss recent regulatory changes at the U.S. Copyright Office that allow for large-scale text and data mining of film, and describe our efforts to build a collection of 2,307 films representing the top 50 movies by U.S. box office over the period 1980 to 2022, along with award nominees. Building this collection allows us to carry out several large-scale computational studies of film; I'll discuss our work measuring changing patterns in the representation of gender and race/ethnicity over the past 43 years (where we see an increase in diversity over the past decade) and leveraging the collection to model variation in emotional performances and choice of adverbial intensifiers over both narrative and historical time. This work illustrates a new frontier of the data-driven analysis of film at a large scale.

🚨 Past Talks

How to Build Your Multimodal LLMs: From Pre-training to Post-training and Agents

Zhe Gan, Apple

Apr 11, 2025, 2:00 PM

Room 289, Engineering VI (Virtual Speaker)

Speaker Bio: Dr. Zhe Gan is a Research Scientist and Manager at Apple AI/ML, primarily working on building large-scale vision and multimodal foundation models. Before joining Apple, he was a Principal Researcher at Microsoft. He received his Ph.D. from Duke University in 2018. He has served as an Area Chair for top-tier AI conferences and is a recipient of Best Student Paper Honorable Mention Awards at CVPR 2021 and WACV 2021.

Abstract: Multimodal Large Language Models (LLMs) have become an increasingly hot research topic. In this talk, I will present our recent work on how to build performant multimodal LLMs along several fronts: (1) pre-training, with a focus on pre-training data choices, multimodal LLM pre-training, and visual encoder pre-training; (2) post-training, with a focus on text-rich image understanding, visual referring and grounding, UI understanding, and reasoning; and (3) generalist agents, with a focus on how to adapt multimodal LLMs into generalist embodied agents.

Optimizing for Long-Term Vision in a Fast-Paced Research World

Prof. Yulia Tsvetkov, University of Washington

Apr 4, 2025, 2:00 PM

Room 289, Engineering VI

Speaker Bio: Yulia Tsvetkov is an associate professor at the Paul G. Allen School of Computer Science & Engineering at the University of Washington. Her research group works on fundamental advancements to large language models, multilingual NLP, and AI ethics/safety. This research is motivated by a unified goal: to extend the capabilities of human language technology beyond individual populations and across language boundaries, thereby making NLP tools available to all users. Prior to joining UW, Yulia was an assistant professor at Carnegie Mellon University and, before that, a postdoc at Stanford. Yulia is a recipient of an NSF CAREER award, a Sloan Fellowship, an Okawa Research Award, and multiple paper awards and runner-up awards at NLP, ML, and CSS conferences.

Abstract: The fast-paced race for larger language models—and the promise of financial gains for the winners—incentivizes heavier engineering with incremental ideas, often at the expense of long-term vision. While this approach advances industry products used by millions, it is not necessarily the right approach for academic research. In this talk, I will present novel task formulations and evaluation benchmarks that question mainstream assumptions about LLM architectures, training/alignment algorithms, and evaluation approaches. While the proposed ideas contradict common practice, they expose blind spots in LLMs' reasoning abilities, as well as large performance and fairness gaps in the best commercial LLMs, highlighting directions for future research.

Organizing Committee

Faculty

Prof. Kai-Wei Chang

Prof. Nanyun Peng

Prof. Saadia Gabriel

Prof. Elisa Kreiss

Students

Tanmay Parekh

Yufei Tian

Ashima Suvarna

Yining Hong

Salman Rahman