
Language & Cognition (LangCog)
Hosted by the Psychology Language Development Labs (Snedeker Lab & Bergelson Lab) and the Meaning and Modality Lab (Davidson Lab)
Tuesdays, 5:30–7:00 pm ET in William James Hall 1550 (Harvard University)
Organizers: Briony Waite (bwaite@g.harvard.edu) and Yiqian Wang (yiqian_wang@g.harvard.edu)
Supported by the Mind Brain Behavior Interfaculty Initiative
Fall 2025 Schedule
| Date | Speaker and Talk Title | Location | Abstract |
|---|---|---|---|
| Sep 9 | Picnic, no speaker | WJH courtyard | |
| Sep 16 | Thomas Clark: A Model of Incremental and Approximate Noisy-Channel Language Processing | WJH 1550 | See below ↓ |
| Sep 23 | Wednesday Bushong: How do listeners represent linguistic information during real-time processing? | WJH 1550 | See below ↓ |
| Sep 30 | Ethan Wilcox: Informational Overlap between Linguistic Channels: Prosody as a Case Study | WJH 601 (different from usual) | See below ↓ |
| Oct 7 | Ryan O’Leary: Flexibility and limits of spoken language comprehension under perceptual challenge | WJH 1550 | See below ↓ |
| Oct 14 | Meredith Rowe: The Importance of Decontextualized Conversations for Preschool Children’s Language and Cognitive Development | WJH 1550 | See below ↓ |
| Oct 21 | Sophie Hao: Towards a Linguistics of Large Language Models | WJH 1550 | See below ↓ |
| Oct 28 | Dorothy Ahn: Reference Makers in Discourse | WJH 1550 | See below ↓ |
| Nov 4 | Joseph Coffey: Mapping the global geography of childhood conversation | WJH 1550 | See below ↓ |
| Nov 11 | Ankana Saha | WJH 1550 | |
| Nov 18 | Zhenghan Qi | WJH 1550 | |
| Nov 25 | Thanksgiving, no speaker | | |
| Dec 2 | Viola Schmitt | WJH 1550 | |
Talks in Fall 2025
Nov 4
Mapping the global geography of childhood conversation
Speaker: Joseph R. Coffey, École Normale Supérieure, Fyssen Postdoctoral Fellow at the Laboratory of Cognitive Sciences and Psycholinguistics
Abstract:
Language is acquired within interactional frames, and yet the dynamics and social contexts of conversation differ across cultures. Our study uses recent advances in automatic speech processing to identify conversations children have with different participants from child-centered, daylong audio recordings. We analyze 23,000 hours of audio from 840 children (1–85 months; mean = 28 months) growing up in 13 countries, representing urban, agricultural, and foraging societies. Contrary to common assumptions, most of children’s conversations were not in dyadic contexts, but in multiparty contexts involving at least two other speakers. Dyadic conversation with an adult partner was relatively more common in communities from North America, Europe, and Israel. Dyadic conversation with another child was more common elsewhere, including all societies with reported engagement in subsistence agriculture or foraging. In addition, children with more siblings engaged in relatively more peer conversation and less adult conversation, while higher caregiver education predicted less peer conversation. Taken together, our findings represent the first foray into mapping the global geography of conversations in childhood.
Oct 28
Reference Makers in Discourse
Speaker: Dorothy Ahn, Rutgers University, Assistant Professor in the Department of Linguistics
Abstract:
The formal semantics literature often assumes different underlying mechanisms for pronouns, definites, and demonstratives. In this talk, I compare the semantic contributions of the different building blocks of these expressions and highlight similarities that surface upon closer investigation across different contexts of use and across different languages. I propose a uniform analysis in which all three types of expressions function minimally as labels for given entities of the discourse, which has implications for how the space of reference is partitioned. The apparent differences among the expressions along the scale of uniqueness, saliency, or at-issueness are argued to arise from broader discourse factors, such as focus and Question Under Discussion (QUD) sensitivity, that cut across the minimal lexical distinction.
Oct 21
Towards a Linguistics of Large Language Models
Speaker: Sophie Hao, Boston University, Moorman–Simon Assistant Professor of Linguistics and Data Science
Abstract:
Large language models (LLMs) are the most successful technique ever developed for generating natural-language text, yet they are based on principles that are fundamentally at odds with generativist assumptions. In this talk, I argue that LLMs have a “linguistic competence” that is similar to, but not the same as, human language. I make the case that investigating LLM language on its own terms, without measuring LLMs against a human standard, can lead to new perspectives on traditional theoretical debates. I conclude the talk by showing how the ongoing development and proliferation of LLM technologies provide opportunities for linguists of all subfields to make a broad social impact.
Oct 14
The Importance of Decontextualized Conversations for Preschool Children’s Language and Cognitive Development
Speaker: Meredith L. Rowe, Harvard University, Saul Zaentz Professor of Early Learning and Development
Abstract:
Young children’s use of decontextualized language, or language that is abstract and displaced from the here-and-now, is associated with the development of oral language skills (e.g., vocabulary, syntax) and cognitive skills (e.g., theory of mind, autobiographical memory). In this talk, I present a series of studies from our lab highlighting these relations and suggest some potential mechanisms for why decontextualized language may be so helpful for learning.
Oct 7
Flexibility and limits of spoken language comprehension under perceptual challenge
Speaker: Ryan M. O’Leary, Northeastern University, Postdoctoral Fellow at the Institute for Cognitive and Brain Health
Abstract:
In spoken communication, successful comprehension depends not only on the linguistic information itself but also on the quality of the perceptual input that carries it. When the acoustic speech signal is degraded through alterations in rate, spectral content, or listener-specific factors such as hearing loss, the cognitive systems supporting comprehension must adapt. This talk presents a series of experiments examining how listeners manage these challenges, with a particular focus on how changes to the input shape both online processing and downstream memory. Using a combination of pupillometry (moment-to-moment changes in pupil diameter thought to track cognitive effort), drift–diffusion modeling (to capture speed–accuracy tradeoffs), and behavioral measures of accuracy and recall, these studies examine how different forms of signal perturbation (e.g., time compression, noise vocoding) interact with linguistic complexity and task demands. Taken together, these studies emphasize the flexibility of language processing under perceptual constraints and hint at the perceptual limits within which comprehension can succeed.
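Background note (added for readers outside the field): the drift–diffusion model mentioned above treats a two-choice decision as noisy accumulation of evidence between two response boundaries. A standard textbook formulation is sketched below; the exact parameterization used in these studies may differ.

```latex
% Drift-diffusion model, standard parameterization (textbook form, not
% necessarily the exact variant used in the studies above):
%   x(t): accumulated evidence   v: drift rate (input quality)
%   a: boundary separation       z: starting point   W(t): Wiener process
dx(t) = v\,dt + s\,dW(t), \qquad x(0) = z,
\qquad \text{respond at the first } t \text{ where } x(t) \ge a \text{ or } x(t) \le 0
```

In this framework, lower-quality input (e.g., degraded speech) is typically captured as a reduced drift rate $v$, while strategic speed–accuracy tradeoffs correspond to changes in the boundary separation $a$.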
Sep 30
Informational Overlap between Linguistic Channels: Prosody as a Case Study
Speaker: Ethan Wilcox, Georgetown University, Assistant Professor of Computational Linguistics
Abstract:
Linguistic communication takes place across multiple channels, including segmental information, prosody, co-speech gestures, and facial expressions. Often, each channel conveys its own unique information; however, channels are sometimes redundant with each other. In this talk, I present a series of collaborative studies seeking to quantify and characterize this redundancy and to explore its implications for linguistic and psycholinguistic theories. I focus on the redundancy between segmental information (operationalized as text) and various prosodic features, including pitch, duration, pause, energy, and a composite measure of prosodic prominence. The talk is divided into three parts. In the first part, we propose that redundancy between linguistic channels can be operationalized as their mutual information (MI), a measure from information theory that quantifies the amount of information one variable contains about another. I present a computational pipeline for estimating the MI between text and prosody, using large language models to capture the information carried in text. Applying the pipeline to an English audiobook corpus, we find significant redundancy between all prosodic features and both past and future textual information. In the second part, I extend this approach to linguistic typology, using it to provide a theory-neutral, principled way of estimating the degree to which a language is tonal. We predict that the mutual information between word identity and pitch should be higher in tonal languages, where lexical distinctions are carried by pitch. We show that this is indeed the case across ten languages from six different language families. In part three, I investigate the time scale of the prosody–text redundancy. In a corpus of English audiobooks, we show an asymmetric relationship, in which prosody and past textual context display longer-scale informational overlap than prosody and future textual context. This suggests that prosody is more useful for recall than for prediction. I close by highlighting several limitations and future directions for this work. All research reported in this talk is collaborative, conducted jointly with (but not limited to!) Alex Warstadt (UCSD), Tiago Pimentel (ETH Zürich), and Tamar Regev (MIT).
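Background note (added for readers outside the field): the mutual information used in this line of work is the standard information-theoretic quantity. Written for the text–prosody case (in discrete form; continuous prosodic features replace the sum with an integral), it is

```latex
% Mutual information between text T and a prosodic feature P
% (standard definition; per the abstract, the pipeline uses large
% language models to estimate the text-conditional term)
\mathrm{MI}(T;P) = H(P) - H(P \mid T)
                 = \sum_{t,\,p} p(t,p)\,\log\frac{p(t,p)}{p(t)\,p(p)}
```

MI is zero when two channels are independent and grows with their redundancy, which is what makes it usable as a theory-neutral measure of, for example, how tonal a language is.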
Sep 23
How do listeners represent linguistic information during real-time processing?
Speaker: Wednesday Bushong, Wellesley College, Assistant Professor of Cognitive and Linguistic Sciences and Psychology
Abstract:
Spoken language understanding requires listeners to infer abstract linguistic units (phonemes, words, etc.) from a temporally fleeting, high-dimensional signal. Traditional psycholinguistic theories contend that listeners must process speech in a radically incremental manner, making categorization judgments as quickly as possible while discarding lower-level gradient information about the speech signal. However, there is now substantial evidence that listeners can maintain at least some gradient information about previous speech input. My lab investigates what kinds of representations listeners maintain about past input and how they integrate those representations with current input to form a coherent percept. In this talk, I will discuss three recent studies from my lab. Studies 1–2 investigate how listeners integrate acoustic and semantic information across time by comparing human behavior to normative models of multi-cue integration. Study 3, using methods from the perceptual decision-making literature, aims to pinpoint what level of detail listeners maintain about speech input over time.
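Background note (added for readers outside the field): "normative models of multi-cue integration" typically refers to Bayesian ideal observers that weight each cue by its reliability. For two independent Gaussian cues, the standard result (a common point of comparison, not necessarily the exact model used in Studies 1–2) is

```latex
% Reliability-weighted combination of two independent Gaussian cues
% x_1, x_2 with variances sigma_1^2, sigma_2^2 (standard ideal-observer result)
\hat{x} = w_1 x_1 + w_2 x_2, \qquad
w_i = \frac{1/\sigma_i^2}{1/\sigma_1^2 + 1/\sigma_2^2}, \qquad
\sigma_{\hat{x}}^2 = \left(\frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2}\right)^{-1}
```

The combined estimate is more reliable than either cue alone, and deviations from these weights are one way to diagnose how listeners actually integrate acoustic and semantic evidence.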
Sep 16
A Model of Incremental and Approximate Noisy-Channel Language Processing
Speaker: Thomas Clark, MIT, PhD student in the Department of Brain and Cognitive Sciences
Abstract:
How are comprehenders able to extract meaning from utterances in the presence of production errors? The Noisy-Channel theory provides an account grounded in Bayesian inference: comprehenders may interpret utterances non-literally when there is an alternative with higher prior probability and a plausible error likelihood. Yet the question of how such alternatives are generated and evaluated, given the intractability of the exact probabilistic inference problem, is open to debate. Here, we model noisy-channel processing as approximate probabilistic inference over intended sentences and production errors. We define a generative model of noisy language using a large language model prior and a symbolic error model. We then approximate the distribution over intended messages, conditional on some noisy utterance, using Sequential Monte Carlo inference with rejuvenation — an algorithm that is incremental but allows reanalysis of previous material. We demonstrate that the model reproduces known patterns in human behavior for implausible or erroneous sentences, without fitting to human data, in three case studies: a) inferential interpretations of implausible sentences with near neighbors in syntactic alternations, with an insertion-deletion asymmetry (Gibson et al., 2013), b) item-level edit preferences in sentences with ambiguous agreement errors (Qian & Levy, 2023), and c) a dissociation in surprisal between recoverable and non-recoverable errors (Ryskin et al., 2021). The model also motivated a novel prediction about regressive eye movements in the reading of temporarily plausible sentences, which was borne out experimentally in a pilot study using mouse-tracking for reading. Our results offer a step towards a flexible, algorithmic account of inference during real-world language comprehension. We make our code publicly available at: https://github.com/thomashikaru/noisy_channel_model
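Background note (added for readers outside the field): the core inference problem here is Bayes' rule over intended sentences $s$ given a noisy utterance $u$, i.e. $P(s \mid u) \propto P(s)\,P(u \mid s)$, with the prior $P(s)$ supplied by a language model and the likelihood $P(u \mid s)$ by an error model. The toy Python sketch below illustrates the general Sequential Monte Carlo recipe with rejuvenation in a deliberately miniature setting. It is our illustration, not the authors' code (see the repository linked above for the real implementation), and every modeling choice in it (a character bigram prior, uniform substitution errors, a crude rejuvenation move) is a simplified placeholder.

```python
import random
from collections import Counter

VOCAB = list("abc")   # toy alphabet standing in for a vocabulary
ERROR_RATE = 0.3      # probability the producer emits the wrong symbol
N_PARTICLES = 500

def prior_probs(prev):
    """Toy bigram prior under which symbols tend to repeat (stand-in for an LM)."""
    if prev is None:
        return {s: 1.0 / len(VOCAB) for s in VOCAB}
    return {s: (0.6 if s == prev else 0.2) for s in VOCAB}

def error_lik(intended, observed):
    """Toy symbolic error model: uniform substitution noise."""
    if intended == observed:
        return 1.0 - ERROR_RATE
    return ERROR_RATE / (len(VOCAB) - 1)

def smc_decode(noisy):
    """Infer the intended string from a noisy one, processing it incrementally."""
    particles = [[] for _ in range(N_PARTICLES)]
    for obs in noisy:
        # Bootstrap SMC step: extend each particle by sampling from the prior,
        # then weight it by the error-model likelihood of the observed symbol.
        weights = []
        for p in particles:
            probs = prior_probs(p[-1] if p else None)
            p.append(random.choices(list(probs), list(probs.values()))[0])
            weights.append(error_lik(p[-1], obs))
        # Resample particles in proportion to their weights.
        particles = [list(p) for p in
                     random.choices(particles, weights, k=N_PARTICLES)]
        # Rejuvenation: resample one earlier position per particle from its
        # local posterior (a crude Gibbs-style move that, for brevity, ignores
        # the effect on the following symbol). This is what allows reanalysis
        # of material a particle committed to earlier.
        for p in particles:
            i = random.randrange(len(p))
            post = {s: prior_probs(p[i - 1] if i else None)[s]
                       * error_lik(s, noisy[i]) for s in VOCAB}
            p[i] = random.choices(list(post), list(post.values()))[0]
    # Point estimate: the most common complete hypothesis among the particles.
    best, _ = Counter(tuple(p) for p in particles).most_common(1)[0]
    return "".join(best)

# The isolated 'b' and 'c' are usually "repaired" as production errors:
print(smc_decode("aabacaa"))  # most often prints 'aaaaaaa'
```

The rejuvenation move is what makes the algorithm more than a purely incremental filter: it lets a particle revise an early commitment when later input makes a reanalysis more plausible, which is the behavior connected above to regressive eye movements in reading.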