| Date | Speaker | Title |
|---|---|---|
| April 17 | Matthew Finlayson | The Search for an Unforgeable Language Model Signature |
| April 24 | Idan Blank | TBD |
| May 22 | Eunice Jun | TBD |
| May 29 | Taylor Berg-Kirkpatrick | TBD |
🚀 Upcoming Talks
More details coming soon!
The Search for an Unforgeable Language Model Signature
April 17, 2026, 2:00 PM PT
Rm 289, Engineering VI
Speaker Bio: Matthew Finlayson is a PhD candidate in computer science at the University of Southern California. He is advised by Swabha Swayamdipta and Xiang Ren. His research focuses on the security and interpretability of large language models, including work on unforgeable signatures for language models and information leakage from model interfaces. He is supported by an NSF Graduate Research Fellowship and was previously a pre-doctoral researcher at the Allen Institute for AI.
Abstract: As language models become ubiquitous, reliably attributing text to specific models is an increasingly important challenge in model forensics. Existing approaches (watermarking, text classifiers, backdoor fingerprints, and input/output matching) each require significant assumptions such as provider cooperation, training data access, or prompt knowledge. We present an alternative approach based on naturally occurring signatures in language model outputs. In particular, language model parameters impose geometric constraints on their outputs, and these structures serve as unique model identifiers. Early work on model signatures based on linear constraints suffered from a major drawback: an adversary could "forge" a signature by reconstructing the constraints from model outputs. We explore elliptical and ranking constraints, which move us closer to provably unforgeable (or forgery-resistant) language model signatures via connections to high-dimensional ellipse fitting and oriented matroid theory. These results point toward truly unforgeable signatures that every language model inherently possesses, requiring no provider implementation and no access to model internals.