publications
2025
- What does it mean to understand language?
  Colton Casto, Anna Ivanova, Evelina Fedorenko, and Nancy Kanwisher. arXiv, 2025
Language understanding entails not just extracting the surface-level meaning of the linguistic input, but constructing rich mental models of the situation it describes. Here we propose that because processing within the brain’s core language system is fundamentally limited, deeply understanding language requires exporting information from the language system to other brain regions that compute perceptual and motor representations, construct mental models, and store our world knowledge and autobiographical memories. We review the existing evidence for this hypothesis, and argue that recent progress in cognitive neuroscience provides both the conceptual foundation and the methods to directly test it, thus opening up a new strategy to reveal what it means, cognitively and neurally, to understand language.
@article{Casto2025,
  title = {What does it mean to understand language?},
  author = {Casto, Colton and Ivanova, Anna and Fedorenko, Evelina and Kanwisher, Nancy},
  journal = {arXiv},
  pages = {1-118},
  year = {2025},
  url = {https://arxiv.org/abs/2511.19757},
  doi = {10.48550/arXiv.2511.19757},
}

- The cerebellar components of the human language network
  Colton Casto, Moshe Poliak, Greta Tuckute, Hannah Small, Patrick Sherlock, Agata Wolna, Benjamin Lipkin, Anila M. D’Mello, and Evelina Fedorenko. bioRxiv, 2025
The cerebellum’s capacity for neural computation is arguably unmatched. Yet despite now ample evidence of cerebellar contributions to cognition, including language, its precise role in language processing remains debated. Here, we systematically characterize cerebellar language-responsive regions using precision fMRI. We identify four cerebellar regions that respond to language across modalities (Experiments 1a-b, n=754). One region—spanning Crus I/II/lobule VIIb—is selective for language relative to diverse non-linguistic perceptual, cognitive, and motor tasks (Experiments 2a-f, n=732), and the rest exhibit mixed-selective profiles, responding strongly to language but also to one or more of the non-linguistic conditions. Similar to the neocortical language system, the language-selective region is engaged by sentence-level meanings during comprehension and production (Experiments 3a-b, n=100) and shows fine-grained sensitivity to linguistic processing difficulty (Experiment 3c, n=5). Further, this region’s response to language is not due to the frequent presence of social content in language, as it is strongly engaged by both social and nonsocial sentences (Experiment 3d, n=10). Finally, all four regions, but especially Crus I/II/VIIb, are functionally connected to the neocortical language system (Experiment 4, n=85). We propose that these cerebellar regions constitute components of the extended language network, with one region supporting linguistic semantic processing and closely mirroring the selectivity of the neocortical language network, and the other three plausibly integrating information from diverse neocortical regions.
@article{Casto2026,
  title = {The cerebellar components of the human language network},
  author = {Casto, Colton and Poliak, Moshe and Tuckute, Greta and Small, Hannah and Sherlock, Patrick and Wolna, Agata and Lipkin, Benjamin and D'Mello, Anila M. and Fedorenko, Evelina},
  journal = {bioRxiv},
  pages = {1-118},
  year = {2025},
  url = {https://www.biorxiv.org/content/10.1101/2025.04.14.645351v2.abstract},
  doi = {10.1101/2025.04.14.645351},
}

- The extended language network: Language selective brain areas whose contributions to language remain to be discovered
  Agata Wolna, Aaron Wright, Colton Casto, Benjamin Lipkin, and Evelina Fedorenko. bioRxiv, 2025
Although language neuroscience has largely focused on ‘core’ left frontal and temporal brain areas and their right-hemisphere homotopes, numerous other areas—cortical, subcortical, and cerebellar—have been implicated in linguistic processing. However, these areas’ contributions to language remain unclear given that the evidence for their recruitment comes from diverse paradigms, many of which conflate language processing with perceptual, motor, or task-related cognitive processes. Using fMRI data from 772 participants performing an extensively-validated language ‘localizer’ paradigm that isolates language processing from other processes, we a) delineate a comprehensive set of areas that respond reliably to language across written and auditory modalities, and b) evaluate these areas’ selectivity for language relative to a demanding non-linguistic task. In line with prior claims, many areas outside the core fronto-temporal network respond during language processing, and most of them show selectivity for language relative to general task demands. These language-selective areas of the extended language network include areas around the temporal poles, in the medial frontal cortex, in the hippocampus, and in the cerebellum, among others. Although distributed across many parts of the brain, the extended language-selective network still only comprises ∼1.2% of the brain’s volume and is about the size of a strawberry, challenging the view that language processing is broadly distributed across the cortical surface. These newly identified language-selective areas can now be systematically characterized to decipher their contributions to language processing, including testing whether these contributions differ from those of the core language areas.
@article{Wolna2025,
  title = {The extended language network: Language selective brain areas whose contributions to language remain to be discovered},
  author = {Wolna, Agata and Wright, Aaron and Casto, Colton and Lipkin, Benjamin and Fedorenko, Evelina},
  journal = {bioRxiv},
  pages = {1-33},
  year = {2025},
  url = {https://www.biorxiv.org/content/10.1101/2025.04.02.646835v2.abstract},
  doi = {10.1101/2025.04.02.646835},
}
2024
- Universality of representation in biological and artificial neural networks
  Eghbal A. Hosseini, Colton Casto, Noga Zaslavsky, Colin Conwell, Mark Richardson, and Evelina Fedorenko. bioRxiv, 2024
Many artificial neural networks (ANNs) trained with ecologically plausible objectives on naturalistic data align with behavior and neural representations in biological systems. Here, we show that this alignment is a consequence of convergence onto the same representations by high-performing ANNs and by brains. We developed a method to identify stimuli that systematically vary the degree of inter-model representation agreement. Across language and vision, we then showed that stimuli from high- and low-agreement sets predictably modulated model-to-brain alignment. We also examined which stimulus features distinguish high- from low-agreement sentences and images. Our results establish representation universality as a core component in the model-to-brain alignment and provide a new approach for using ANNs to uncover the structure of biological representations and computations.
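The abstract does not spell out how per-stimulus inter-model agreement is quantified, so the sketch below shows one generic way such a score could be computed, under the assumption that agreement for a stimulus can be read off how similarly its representational-dissimilarity profile looks across models. The function name, distance metric, and set sizes are illustrative, not the paper's actual pipeline.

    import numpy as np
    from scipy.spatial.distance import cdist
    from scipy.stats import spearmanr

    def agreement_scores(embeddings_per_model):
        """Score inter-model agreement for each stimulus (illustrative only).

        embeddings_per_model: one (n_stimuli, n_features) array per model;
        feature dimensionality may differ across models. Returns an
        (n_stimuli,) array where higher values mean the models 'agree' more
        about how that stimulus relates to the rest of the stimulus set.
        """
        # A model's view of a stimulus = its dissimilarity profile to all
        # other stimuli (one row of that model's representational matrix).
        rdms = [cdist(E, E, metric="correlation") for E in embeddings_per_model]
        n_models, n_stimuli = len(rdms), rdms[0].shape[0]
        scores = np.zeros(n_stimuli)
        for i in range(n_stimuli):
            rows = [np.delete(rdm[i], i) for rdm in rdms]   # drop self-distance
            corrs = [spearmanr(rows[a], rows[b]).correlation
                     for a in range(n_models) for b in range(a + 1, n_models)]
            scores[i] = np.mean(corrs)
        return scores

    # Stimuli at the top / bottom of this ranking would play the role of the
    # high- / low-agreement sets used to probe model-to-brain alignment.
    # high_set = np.argsort(agreement_scores(embeddings))[-100:]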
@article{Hosseini2024,
  title = {Universality of representation in biological and artificial neural networks},
  author = {Hosseini, Eghbal A. and Casto, Colton and Zaslavsky, Noga and Conwell, Colin and Richardson, Mark and Fedorenko, Evelina},
  journal = {bioRxiv},
  pages = {1-70},
  year = {2024},
  url = {https://www.biorxiv.org/content/10.1101/2024.12.26.629294v1.abstract},
  doi = {10.1101/2024.12.26.629294},
}

- Neural populations in the language network differ in the size of their temporal receptive windows
  Tamar Regev*, Colton Casto*, Eghbal A. Hosseini, Markus Adamek, Anthony L. Ritaccio, Jon T. Willie, Peter Brunner, and Evelina Fedorenko. Nature Human Behaviour, 2024
Despite long knowing what brain areas support language comprehension, our knowledge of the neural computations that these frontal and temporal regions implement remains limited. One important unresolved question concerns functional differences among the neural populations that comprise the language network. Here we leveraged the high spatiotemporal resolution of human intracranial recordings (n = 22) to examine responses to sentences and linguistically degraded conditions. We discovered three response profiles that differ in their temporal dynamics. These profiles appear to reflect different temporal receptive windows, with average windows of about 1, 4 and 6 words, respectively. Neural populations exhibiting these profiles are interleaved across the language network, which suggests that all language regions have direct access to distinct, multiscale representations of linguistic input—a property that may be critical for the efficiency and robustness of language processing.
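As a rough intuition for what a temporal receptive window (TRW) of k words means, the toy simulation below (an assumption-laden illustration, not the paper's intracranial analyses or fitted models) shows how a unit that integrates over only its last k words saturates immediately when k = 1 but builds up gradually across a sentence when k = 6.

    import numpy as np

    def toy_response(trw, n_words=8, coherent=True):
        """Toy response of a unit with a temporal receptive window of `trw` words.

        For a coherent sentence, the amount of usable context grows with word
        position (capped at `trw`); for a scrambled word list, only the current
        word is interpretable, so the response stays flat.
        """
        positions = np.arange(1, n_words + 1)
        usable = np.minimum(positions, trw) if coherent else np.ones(n_words)
        return usable / trw                     # normalized response per word

    for trw in (1, 4, 6):                       # the three reported window sizes
        print(f"TRW={trw}:", np.round(toy_response(trw), 2))
    # TRW=1 saturates at the first word; TRW=6 keeps climbing for six words,
    # mirroring the qualitative buildup differences among the three profiles.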
@article{RegevCasto2024,
  title = {Neural populations in the language network differ in the size of their temporal receptive windows},
  author = {Regev, Tamar and Casto, Colton and Hosseini, Eghbal A. and Adamek, Markus and Ritaccio, Anthony L. and Willie, Jon T. and Brunner, Peter and Fedorenko, Evelina},
  journal = {Nature Human Behaviour},
  volume = {8},
  pages = {1924-1942},
  year = {2024},
  url = {https://www.nature.com/articles/s41562-024-01944-2},
  doi = {10.1038/s41562-024-01944-2},
}

- Information-making processes in the speaker’s brain drive human conversations forward
  Ariel Goldstein, Haocheng Wang, Tom Sheffer, Mariano Schain, Zaid Zada, Leonard Niekerken, Bobbi Aubrey, Samuel A. Nastase, Harshvardhan Gazula, Colton Casto, Werner K. Doyle, Daniel Friedman, Sasha Devore, Patricia Dugan, Avinatan Hassidim, Michael Brenner, Yossi Matias, Orrin Devinsky, Adeen Flinker, and Uri Hasson. bioRxiv, 2024
A conversation following an overly predictable pattern is likely boring and uninformative; conversely, if it lacks structure, it is likely nonsensical. The delicate balance between predictability and surprise has been well studied using information theory during speech perception, focusing on how listeners predict upcoming words based on context and respond to unexpected information. However, less is known about how speakers’ brains generate structured yet surprisingly informative speech. This study uses continuous electrocorticography (ECoG) recordings during free, 24/7 conversations to investigate the neural basis of speech production and comprehension. We employed large language models (Llama-2 and GPT-2) to calculate word probabilities based on context and categorized words into probable (top 30%) and improbable (bottom 30%) groups. We then extracted word embeddings from the LLMs and used encoding models to estimate the neural activity while producing or listening to probable and improbable words. Our findings indicate that before word-onset, the human brain functions in opposing, perhaps complementary, ways while listening and speaking. Results show that listeners exhibit increased neural encoding for predictable words before word onset, while speakers show increased encoding for surprising, improbable words. Speakers also show a lower speech production rate before articulating unexpected words, suggesting additional cognitive processes are involved in producing novel information. This indicates that human speech production includes information-making processes for generating informative words that are absent in language models, which primarily rely on statistical probabilities to generate contextually appropriate speech.
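A minimal sketch of the word-probability step described above, using GPT-2 through the Hugging Face transformers library; the 30% cutoffs follow the abstract, while the example sentence, the first-sub-token scoring shortcut, and all variable names are assumptions for illustration (the study's Llama-2 runs, alignment to word onsets, and ECoG encoding models are not shown).

    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

    def word_probabilities(words):
        """Probability of each word given its left context (illustrative only:
        a word is scored by the probability of its first sub-token)."""
        probs, context_ids = [], [tokenizer.bos_token_id]
        for word in words:
            word_ids = tokenizer.encode(" " + word)
            with torch.no_grad():
                logits = model(torch.tensor([context_ids])).logits[0, -1]
            probs.append(torch.softmax(logits, dim=-1)[word_ids[0]].item())
            context_ids += word_ids              # the word joins the context
        return probs

    words = "the weather in Boston is unusually pleasant today".split()
    ranked = sorted(zip(words, word_probabilities(words)), key=lambda wp: wp[1])
    k = int(0.3 * len(words))
    improbable, probable = ranked[:k], ranked[-k:]   # bottom / top 30%
    print("improbable:", improbable)
    print("probable:", probable)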
@article{Goldstein2024,
  title = {Information-making processes in the speaker's brain drive human conversations forward},
  author = {Goldstein, Ariel and Wang, Haocheng and Sheffer, Tom and Schain, Mariano and Zada, Zaid and Niekerken, Leonard and Aubrey, Bobbi and Nastase, Samuel A. and Gazula, Harshvardhan and Casto, Colton and Doyle, Werner K. and Friedman, Daniel and Devore, Sasha and Dugan, Patricia and Hassidim, Avinatan and Brenner, Michael and Matias, Yossi and Devinsky, Orrin and Flinker, Adeen and Hasson, Uri},
  journal = {bioRxiv},
  pages = {1-21},
  year = {2024},
  url = {https://www.biorxiv.org/content/10.1101/2024.08.27.609946v1.abstract},
  doi = {10.1101/2024.08.27.609946},
}

- Distributed sensitivity to syntax and semantics throughout the language network
  Cory Shain*, Hope Kean*, Colton Casto, Benjamin Lipkin, Josef Affourtit, Matthew Siegelman, Francis Mollica, and Evelina Fedorenko. Journal of Cognitive Neuroscience, 2024
Human language is expressive because it is compositional: The meaning of a sentence (semantics) can be inferred from its structure (syntax). It is commonly believed that language syntax and semantics are processed by distinct brain regions. Here, we revisit this claim using precision fMRI methods to capture separation or overlap of function in the brains of individual participants. Contrary to prior claims, we find distributed sensitivity to both syntax and semantics throughout a broad frontotemporal brain network. Our results join a growing body of evidence for an integrated network for language in the human brain within which internal specialization is primarily a matter of degree rather than kind, in contrast with influential proposals that advocate distinct specialization of different brain areas for different types of linguistic functions.
@article{ShainKean2024,
  title = {Distributed sensitivity to syntax and semantics throughout the language network},
  author = {Shain, Cory and Kean, Hope and Casto, Colton and Lipkin, Benjamin and Affourtit, Josef and Siegelman, Matthew and Mollica, Francis and Fedorenko, Evelina},
  journal = {Journal of Cognitive Neuroscience},
  volume = {36},
  issue = {7},
  pages = {1427-1471},
  year = {2024},
  url = {https://direct.mit.edu/jocn/article/36/7/1427/120796},
  doi = {10.1162/jocn_a_02164},
}
2022
- Shared computational principles for language processing in humans and deep language models
  Ariel Goldstein, Zaid Zada, Eliav Buchnik, Mariano Schain, Amy Price, Samuel A. Nastase, Amir Feder, Dotan Emanuel, Alon Cohen, Aren Jansen, Harshvardhan Gazula, Gina Choe, Aditi Rao, Catherine Kim, Colton Casto, Lora Fanda, Werner Doyle, Daniel Friedman, Patricia Dugan, Lucia Melloni, Roi Reichart, Sasha Devore, Adeen Flinker, Liat Hasenfratz, Omer Levy, Avinatan Hassidim, Michael Brenner, Yossi Matias, Kenneth A. Norman, Orrin Devinsky, and Uri Hasson. Nature Neuroscience, 2022
Departing from traditional linguistic models, advances in deep learning have resulted in a new type of predictive (autoregressive) deep language models (DLMs). Using a self-supervised next-word prediction task, these models generate appropriate linguistic responses in a given context. In the current study, nine participants listened to a 30-min podcast while their brain responses were recorded using electrocorticography (ECoG). We provide empirical evidence that the human brain and autoregressive DLMs share three fundamental computational principles as they process the same natural narrative: (1) both are engaged in continuous next-word prediction before word onset; (2) both match their pre-onset predictions to the incoming word to calculate post-onset surprise; (3) both rely on contextual embeddings to represent words in natural contexts. Together, our findings suggest that autoregressive DLMs provide a new and biologically feasible computational framework for studying the neural basis of language.
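The three shared principles correspond to quantities that can be read directly off an autoregressive model; the fragment below (an illustrative sketch with GPT-2, not the study's ECoG encoding analyses, and with a made-up example sentence) extracts, for each token of a passage, the pre-onset next-word prediction, the post-onset surprise as the negative log-probability of the actual token, and the final-layer contextual embedding.

    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tok = GPT2TokenizerFast.from_pretrained("gpt2")
    lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

    ids = tok("the monkey climbed the tree to reach the ripe fruit",
              return_tensors="pt").input_ids                  # shape (1, T)
    with torch.no_grad():
        out = lm(ids, output_hidden_states=True)
    logp = torch.log_softmax(out.logits, dim=-1)               # (1, T, vocab)

    # (1) pre-onset prediction: the model's best guess for token t+1 given 1..t
    predicted_next = logp[0, :-1].argmax(dim=-1)
    # (2) post-onset surprise: -log p(actual token t+1 | tokens 1..t)
    surprise = -logp[0, :-1].gather(1, ids[0, 1:, None]).squeeze(1)
    # (3) contextual embeddings: final-layer hidden state for every token
    embeddings = out.hidden_states[-1][0]                      # (T, 768)

    for token, s in zip(tok.convert_ids_to_tokens(ids[0, 1:].tolist()), surprise):
        print(f"{token:>12}  surprisal = {s.item():.2f} nats")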
@article{Goldstein2022,
  title = {Shared computational principles for language processing in humans and deep language models},
  author = {Goldstein, Ariel and Zada, Zaid and Buchnik, Eliav and Schain, Mariano and Price, Amy and Nastase, Samuel A. and Feder, Amir and Emanuel, Dotan and Cohen, Alon and Jansen, Aren and Gazula, Harshvardhan and Choe, Gina and Rao, Aditi and Kim, Catherine and Casto, Colton and Fanda, Lora and Doyle, Werner and Friedman, Daniel and Dugan, Patricia and Melloni, Lucia and Reichart, Roi and Devore, Sasha and Flinker, Adeen and Hasenfratz, Liat and Levy, Omer and Hassidim, Avinatan and Brenner, Michael and Matias, Yossi and Norman, Kenneth A. and Devinsky, Orrin and Hasson, Uri},
  journal = {Nature Neuroscience},
  volume = {25},
  pages = {369-380},
  year = {2022},
  url = {https://www.nature.com/articles/s41593-022-01026-4},
  doi = {10.1038/s41593-022-01026-4},
}