Summary
Disclaimer: This summary has been generated by AI. It is experimental, and feedback is welcomed. Please reach out to info@qcon.ai with any comments or concerns.
In the presentation "Rules for Understanding Language Models", Naomi Saphra, a Kempner Research Fellow at Harvard University, explores the fundamental principles of language model behavior and how it diverges from human cognition.
Key Principles Discussed:
- An LM memorizes when it can: Language models often rely on memorization rather than conceptual understanding, taking the easier path of memorizing vast amounts of data instead of truly learning the underlying concepts.
- An LM acts like a population, not a person: A model functions more like a crowd than an individual; because it absorbs diverse inputs, aggregating its outputs can yield more consistently accurate answers than any single source.
- An LM aims to please: Models are designed to generate pleasing and agreeable responses, often leading to sycophantic behavior.
- An LM leans on subtle associations: When processing language, models rely heavily on patterns and subtle correlations rather than deeper linguistic understanding.
- An LM learns only what's written down: Language models learn only from written data, so they can propagate misconceptions recorded in text and have a limited grasp of nuances that are rarely documented.
Examples and Observations:
- Models tend to regenerate familiar text verbatim, such as Bible quotes, because of frequent exposure during training.
- The wisdom of the crowd can apply to models when responses are aggregated under appropriate settings, for example using temperature to moderate variability (see the sketch after this list).
- Limited data for less-represented languages poses challenges that degrade models' performance in those languages.
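As a concrete illustration of the aggregation point above, here is a minimal Python sketch of a self-consistency-style majority vote. It is not from the talk: `sample_model` is a hypothetical placeholder for whatever LM API you call, and the sample count and temperature defaults are assumptions.

```python
from collections import Counter

def sample_model(prompt: str, temperature: float) -> str:
    """Hypothetical stand-in for a real LM sampling call.
    Replace with your provider's API, passing `temperature` through."""
    raise NotImplementedError("wire this up to your LM provider")

def aggregate_answers(prompt: str, n_samples: int = 7,
                      temperature: float = 0.8) -> str:
    """Treat the LM as a population: draw several samples at a
    moderate temperature, then majority-vote on normalized answers."""
    answers = [
        sample_model(prompt, temperature).strip().lower()
        for _ in range(n_samples)
    ]
    winner, count = Counter(answers).most_common(1)[0]
    # The vote margin is a rough agreement signal from the "crowd";
    # a thin margin suggests low confidence in the winning answer.
    print(f"{count}/{n_samples} samples agreed on the answer")
    return winner
```

Temperature matters in this setup: at temperature 0 every sample is identical and there is no crowd to poll, while a very high temperature yields noise rather than a population of plausible answers.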
The presentation emphasizes keeping these principles in mind when integrating language models into tasks, in order to harness their capabilities effectively while mitigating their limitations.
This is the end of the AI-generated content.
After millions of years of evolution, humans understand each other pretty well. But now, confronted with machines that talk, we cannot assume they will act like humans, or act for the same reasons as humans. If we don’t understand how language models (LMs) will behave or the general principles behind that behavior, it’s easy to fall into common pitfalls and create more work than we save by using them for inappropriate tasks or settings. I will draw on my own research and other findings in the modern science of AI to explain five general principles of language model behavior that drive their errors and their differences from human behavior:
- An LM memorizes when it can
- An LM acts like a population, not a person
- An LM aims to please
- An LM leans on subtle associations
- An LM learns only what's written down
Speaker
Naomi Saphra
Kempner Research Fellow @Harvard, Incoming Faculty @Boston University
Naomi Saphra is a current Kempner Research Fellow at Harvard University and incoming Assistant Professor at Boston University’s Faculty of Computing & Data Sciences starting in 2026. Her research seeks to understand how language models learn, as influenced by a combination of data composition, training time, and random factors. Through this lens, she has published work at venues like NeurIPS, ICLR, EMNLP, and ACL. Her work has also received press coverage in The Register and Quanta Magazine. Previously, Dr. Saphra completed a PhD at the University of Edinburgh and attended Johns Hopkins and Carnegie Mellon University. She has worked at Google, Meta, and New York University and has consulted at several startups. Honored as a Rising Star in EECS by MIT and awarded Google Europe’s Scholarship for Students With Disabilities, she has also received recognition for service work, garnering three outstanding reviewer awards. She has organized the highly attended RepL4NLP, BlackboxNLP, and HiLD workshops. Dr. Saphra has delivered invited talks at numerous scientific meetings, including keynotes at PyDataFest Amsterdam, the ICML world models workshop, and MILA's Scaling Laws Workshop. Outside of work, Dr. Saphra plays roller derby under the name Gaussian Retribution, performs comedy, and supports disabled scholars by advocating for open source adaptive technology.