Kazuki Irie

Postdoctoral Fellow
Department of Psychology
Gershman Computational Cognitive Neuroscience Lab
Harvard University

Northwest Building
52 Oxford St, Cambridge, MA 02138

Email: kirie@g.harvard.edu

Google Scholar / GitHub / dblp / OpenReview / arXiv

Bio

Kazuki Irie received his Ph.D. in Computer Science from RWTH Aachen University, Germany, in 2020, after completing his undergraduate and Master’s studies in Applied Mathematics at École Centrale Paris and ENS Cachan, France. From 2020 to 2023, he was a postdoctoral researcher and lecturer at the Swiss AI Lab IDSIA, University of Lugano, Switzerland. He is currently a postdoctoral fellow in the Department of Psychology at Harvard University, USA.

His current research investigates the computational principles of learning, memory, and decision making, with the dual goals of advancing general-purpose artificial intelligence and developing tools to better understand our own intelligence. The scope of his research has expanded from language modeling during his Ph.D., to general sequence and program learning during his first postdoc, and now to computational neuroscience and cognitive science in his current position at Harvard.

Publications

Please click the Publications tab to find the list of my publications.

My recent work has been published in machine learning conferences and journals (NeurIPS, ICML, ICLR, TMLR, EMNLP, …). Before 2020, my Ph.D. research on language modeling was published in speech and language processing conferences (Interspeech, ICASSP, …), back before language models became the powerhouse of modern chatbots.

I have several ongoing projects at the intersection of cognitive neuroscience and AI. Please check back later for updates!

For the most up-to-date list of my publications, please refer to my Google Scholar profile.

Origin of my current research program

In my prior life as a Master’s and Ph.D. student (2013-2020), I worked intensively on language modeling, before the current era of large language models (LLMs). My Ph.D. work addressed several topics that are now popular in the field, including: developing large/deep (>100-layer) transformer language models in academia (Interspeech 2019), concurrently with OpenAI’s GPT-2 in 2019; using mixture-of-experts models to scale up language models at Google/YouTube (ICASSP 2018); identifying and addressing the “KV-cache” memory problem of transformers (ICASSP 2020); distilling large transformer language models into compact recurrent neural network language models (another ICASSP 2020 paper, building on my earlier work on distilling LSTM language models into n-gram language models; ICASSP 2018); and byte-level processing for modeling low-resource languages with special alphabets (ICASSP 2017).

While LLMs certainly show impressive performance and attract the interest of many researchers, it remains debatable whether recent developments represent foundational advances in language modeling. My research program is based on my conviction, shaped by a decade of experience with language modeling, that fundamental research on cognition is necessary to further advance artificial intelligence (AI). This has motivated my current collaborations with cognitive neuroscientists and psychologists, which, in turn, have also allowed me to identify and operationalize opportunities to leverage advances in AI to catalyze the study of human intelligence and brain computation.

Recent talks

In reverse chronological order. Please click a title to find the presentation slides.

Little-known(?) Facts about Fast Weight Programmers
MIT CSAIL, Cambridge, MA, USA
Hosted by Prof. Yoon Kim
September 2024

Sequence Processing with Fast Weight Programmers
Columbia University, Department of Psychology, New York, NY, USA
Hosted by Prof. Nikolaus Kriegeskorte, Dr. Hossein Adeli, and Eivinas Butkus
September 2024

Continual Learning, Machine Self-Reference, and the Problem of Problem-Awareness
Columbia University, Department of Computer Science, New York, NY, USA
Hosted by Prof. Rich Zemel
July 2024

Rethinking synaptic modulations through the lens of key-value memory, and vice versa
Columbia University, Zuckerman Institute, New York, NY, USA
Hosted by Prof. Ashok Litwin-Kumar
July 2024

The Dual Form of the Perceptron (1964) meets the Age of Transformers
Northeastern University, Department of Computer Science, Boston, MA, USA
Hosted by Prof. David Bau
June 2024

Recent teaching material samples

Course Materials for “Deep Learning Lab” (Fall 2020 to 2022), a 14-week introductory course on deep learning
All Materials (including Lecture Slides)
University of Lugano, Switzerland

Workshop Materials for “Introduction to Language Modeling for Neuroscientists” (June 2024)
Main PDF & Slides & Helper Notebooks
Columbia University, USA