After millions of years of evolution, humans understand each other pretty well. But now, confronted with machines that talk, we cannot assume they will act like humans, or act for the same reasons as humans. If we don’t understand how language models (LMs) behave, or the general principles behind that behavior, it’s easy to fall into common pitfalls and, by using them for inappropriate tasks or settings, create more work than we save. I will draw on my own research and other findings in the modern science of AI to explain five general principles of language model behavior that drive their errors and their differences from human behavior:

An LM memorizes when it can
An LM acts like a population, not a person
An LM aims to please
An LM leans on subtle associations
An LM learns only what's written down

Naomi Saphra is a current Kempner Research Fellow at Harvard University and incoming Assistant Professor at Boston University’s Faculty of Computing and Data Science, starting in 2026. Her research seeks to understand how language models learn, as influenced by a combination of data composition, training time, and random factors. Through this lens, she has published work at venues like NeurIPS, ICLR, EMNLP, and ACL. Her work has also received press coverage in The Register and Quanta Magazine. Previously, Dr. Saphra completed a PhD at the University of Edinburgh and attended Johns Hopkins and Carnegie Mellon University. She has worked at Google, Meta, and New York University and has consulted at several startups. Honored as a Rising Star in EECS by MIT and awarded Google Europe’s Scholarship for Students With Disabilities, she has also received recognition for service work, garnering three outstanding reviewer awards. She has organized the highly attended RepL4NLP, BlackboxNLP, and HiLD workshops. Dr. Saphra has delivered invited talks at numerous scientific meetings, including keynotes at PyDataFest Amsterdam, the ICML world models workshop, and MILA's Scaling Laws Workshop. Outside of work, Dr. Saphra plays roller derby under the name Gaussian Retribution, performs comedy, and supports disabled scholars by advocating for open source adaptive technology.

Rules for Understanding Language Models

Speaker

Naomi Saphra
