Goldfish swimming in an aquarium of binary code (image: Abha Eli Phoboo/Craiyon)

ChatGPT knows how to use the word “tickle” in a sentence, but it cannot feel the sensation. Can it then be said to understand the meaning of the word the same way we humans do?

In an ongoing debate, AI researchers are teasing apart whether large language models (LLMs) like ChatGPT and Google’s PaLM understand language in any humanlike sense. One question is the relationship between embodiment and understanding; another is the nature of intelligence and understanding themselves. Should concepts of meaning, understanding, and intelligence be revisited to create a distinction between how humans and machines understand the world?

SFI researchers Melanie Mitchell and David C. Krakauer survey “The debate over understanding in AI’s large language models” in their paper published in the Proceedings of the National Academy of Sciences on March 21 (available on arXiv). The authors examine the characteristics that make LLMs impressive but also susceptible to unhumanlike errors and note the “fascinating divergence” emerging in how we humans think about understanding in intelligent systems.

“Humans do all kinds of experiments to learn about the world. Our embodiment is fundamental to our intelligence,” says Mitchell. “Large language models have the appearance of understanding but do not have experiences.”

LLMs are pre-trained on large datasets. Human understanding, by contrast, is based on a set of mental concepts that we map from our experiences as we interact with the world. This underlines the stark difference between mental models that rely on statistical correlations, such as those LLMs use, and those that rely on causal mechanisms.

“Large language models are fact-rich like a big library and more autonomous than an abacus. And like an abacus, they are tools that can be used to augment our intelligence — a kind of steampunk mechanical library. But we cannot confuse having this tool with having an understanding,” says Krakauer. 

The paper also takes into account the many threads of debate in the AI research community, including the familiar human tendency to “attribute understanding and agency to machines with even the faintest hint of humanlike language and behavior” and the mystery behind how LLMs are able to give the appearance of humanlike reasoning.

“We really wanted to report on what people are talking about, to summarize the different modes of discussions. It is apparent that we need a new vocabulary to talk about it,” says Mitchell. 

Read the paper “The debate over understanding in AI’s large language models” in PNAS (March 21, 2023).


Templeton World Charity Foundation Grant Award No. 2021-20650 "Building Diverse Intelligences Through Compositionality and Mechanism Design"