| Date | Speaker | Title |
|---|---|---|
| October 3 | Brihi Joshi | Towards Richer User Signals for Personalization |
| October 10 | Jacob Andreas | Just Asking Questions |
| October 17 | Aviral Kumar | The Importance of Exploration for Test-Time Scaling |
| October 24 | Parisa Kordjamshidi | Reasoning under Uncertainty with Large Multimodal Language Models |
| October 31 | Rose Yu | TBD |
| November 14 | Arman Cohan | TBD |
| November 21 | Sherry Yang | TBD |
🚀 Upcoming Talks
The Importance of Exploration for Test-Time Scaling
October 17, 2025, 2:00 PM
https://ucla.zoom.us/meeting/register/1LfTUChHRWOA1zApfUcAlA
Speaker Bio: Aviral Kumar is an Assistant Professor of Computer Science and Machine Learning at Carnegie Mellon University, where he started in September 2024. He received his PhD from UC Berkeley in 2023. His research focuses on reinforcement learning (RL), spanning fundamental advances in offline RL and scaling up RL, and more recently, the use of RL to train large language models (LLMs) and optimize test-time compute. He is a recipient of the Samsung AI Researcher of the Year Award (2024), the Schmidt Sciences AI2050 Early Career Fellowship (2024), and multiple best paper awards across workshops in RL, LLMs, and robotics at ICLR and ICML.
Abstract: RL has enabled language models to optimize long chains of thought (CoTs), yet the field still lacks clarity on what makes these approaches succeed. Conflicting empirical results across papers often stem from differences in setting rather than principle. In this talk, I will share our perspective: effective test-time scaling hinges on in-context exploration, the ability of a model to internally experiment and infer generalizable algorithmic procedures using additional compute at inference. I will describe two RL-based approaches for training models to perform such exploration. First, I will present e3, a curriculum-based recipe that teaches models to chain together existing skills in the base model, yielding the state-of-the-art <2B language model for math reasoning. Second, I will discuss cases where chaining alone is insufficient. There, we guide exploration by conditioning the model’s CoT on concise, self-generated natural language abstractions: short procedural summaries produced before launching into long reasoning traces. These abstractions help steer test-time search more effectively. Across tasks, conditioning RL on abstractions significantly improves in-context exploration and yields sustained performance gains even when conventional pass@k scaling plateaus. I will also talk briefly about some ongoing work that builds on these ideas to improve exploration for test-time scaling.
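For readers unfamiliar with the pass@k metric mentioned in the abstract, the sketch below shows the standard unbiased estimator commonly used to compute it. This is background context, not code from the talk; the function name and example numbers are illustrative.

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Standard unbiased pass@k estimator: given n sampled solutions to a
    problem, c of which are correct, return the probability that at least
    one of k randomly chosen samples is correct."""
    if n - c < k:
        # Fewer than k incorrect samples exist, so any k-subset contains a correct one.
        return 1.0
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

# Illustrative example: 16 samples with 3 correct gives pass@1 ≈ 0.19, pass@8 = 0.9.
print(pass_at_k(16, 3, 1), pass_at_k(16, 3, 8))
```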
Reasoning under Uncertainty with Large Multimodal Language Models
October 24, 2025, 2:00 PM
289, Engineering VI
Speaker Bio: Parisa Kordjamshidi is an Associate Professor of Computer Science and Engineering at Michigan State University. Her research focuses on Natural Language Processing, multimodal reasoning across vision and language, and neuro-symbolic learning. She received her Ph.D. from KU Leuven and conducted postdoctoral research at the University of Illinois Urbana-Champaign. She is a recipient of the NSF CAREER, Amazon Faculty Research, and Fulbright Scholar Awards, and her research team received the NAACL 2025 Outstanding Research Paper Award. Dr. Kordjamshidi serves as Associate Editor of JAIR, Co-Editor-in-Chief of ARR (2026), and Action Editor for TACL, and has served on the organizing committees of major conferences including ACL, NAACL, EACL, EMNLP, ECML-PKDD, and AAAI. She is currently a visiting Associate Professor at UCLA, spending part of her sabbatical there.
Abstract: Uncertainty in intelligent models has multiple facets. One aspect concerns a model’s own uncertainty or confidence in its generated outputs. Another pertains to factual knowledge about uncertainty within specific concepts. For example, statements such as “10–20% of lifelong smokers will develop lung cancer” express factual uncertainty derived from statistical data analyses and represented in text. A key research question is whether language models can form and convey such factual uncertainties—integrating information, drawing on their internal knowledge, and aligning this with their confidence when expressing opinions. While addressing this question is highly challenging, I will present our research that explores related directions and the following research questions: 1) How do language models understand uncertainty expressions in natural language and perform probabilistic inference over them? 2) How can models be trained to follow the principles of probabilistic reasoning when handling uncertainty in text? 3) How can today’s large models reason over uncertain text, specifically by mapping language into formal probabilistic logic programs? And finally, in the context of grounding natural language in the visual modality, 4) How can uncertainty in perception be explicitly represented in reasoning, specifically through mappings to differentiable probabilistic programs?
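As a toy illustration of what probabilistic inference over uncertainty expressions in text can look like (this is not material from the talk; the lexicon values, intervals, and example statements are invented for illustration), the sketch below maps verbal uncertainty expressions to probability intervals and combines them with the chain rule.

```python
# Toy sketch: map natural-language uncertainty expressions to probability
# intervals, then compose them with the chain rule P(A, B) = P(A) * P(B | A).
# All interval values below are illustrative assumptions, not calibrated data.

UNCERTAINTY_LEXICON = {
    "almost certainly": (0.90, 0.99),
    "likely": (0.60, 0.85),
    "10-20% of": (0.10, 0.20),
    "rarely": (0.02, 0.10),
}

def chain(p: tuple, q_given_p: tuple) -> tuple:
    """Bounds on P(A, B) from bounds on P(A) and P(B | A): endpoints multiply."""
    return (p[0] * q_given_p[0], p[1] * q_given_p[1])

# "10-20% of lifelong smokers will develop lung cancer" combined with a
# (hypothetical) "a lung-cancer patient will likely need follow-up screening":
p_cancer = UNCERTAINTY_LEXICON["10-20% of"]
p_screening_given_cancer = UNCERTAINTY_LEXICON["likely"]
print(chain(p_cancer, p_screening_given_cancer))  # bounds on the joint probability
```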
Past Talks
Just Asking Questions
October 10, 2025, 2:00 PM
https://ucla.zoom.us/meeting/register/1LfTUChHRWOA1zApfUcAlA
Speaker Bio: Jacob Andreas is an associate professor at MIT in the Department of Electrical Engineering and Computer Science as well as the Computer Science and Artificial Intelligence Laboratory. His research aims to understand the computational foundations of language learning, and to build intelligent systems that can learn from human guidance. Jacob earned his Ph.D. from UC Berkeley, his M.Phil. from Cambridge (where he studied as a Churchill scholar) and his B.S. from Columbia. He has received a Sloan fellowship, an NSF CAREER award, MIT's Junior Bose and Kolokotrones teaching awards, and paper awards at ACL, ICML and NAACL.
Abstract: In the age of deep networks, "learning" almost invariably means "learning from examples". We train language models with human-generated text and labeled preference pairs, image classifiers with large datasets of images, and robot policies with rollouts or demonstrations. When human learners acquire new concepts and skills, we often do so with richer supervision, especially in the form of language---we learn new concepts from examples accompanied by descriptions or definitions, and new skills from demonstrations accompanied by instructions. Current language models (LMs) support a limited form of language-based teaching via prompting, but it remains challenging to use natural language supervision to apply global, persistent changes to learned models. This talk will focus on two recent projects aimed at more effectively supervising LMs using language: first, on *eliciting* new information (by asking questions to human users of LMs); second, on *updating* language models to incorporate new information (by using LMs to automatically ask and answer questions about information implied by, but not explicitly stated in, training data). If time permits, I'll also discuss some applications of these techniques to educational settings (where we can optimize questions for human, rather than machine, learning). This is joint work with Belinda Li, Alex Tamkin, Noah Goodman, Feyza Akyürek, Ekin Akyürek, Leshem Choshen, Derry Wijaya, and Alexis Ross.
Towards Richer User Signals for Personalization
October 3, 2025, 2:00 PM
289, Engineering VI
Speaker Bio: Brihi Joshi is a final-year PhD student in Computer Science at the University of Southern California, advised by Xiang Ren and Swabha Swayamdipta. Her research focuses on human-AI interaction, with an emphasis on personalization, where she designs and evaluates interactive systems that adapt to users in meaningful and useful ways. Her work has been supported by fellowships from Apple and Amazon.
Abstract: Personalization is gaining attention across domains, with different works exploring signals ranging from user demographics to interaction history. The talk will begin by showing that common signals such as prompts and instructions are underspecified for truly useful personalization, leading only to surface-level changes; for example, failing to adapt to learners with different educational backgrounds. We will then present how LLMs can be used to synthesize richer signals, such as user explanations, that drive more meaningful personalization. Finally, we will share ongoing work on training systems to actively elicit useful user signals, and touch upon open problems in how to obtain and use these user signals.