Sunday, June 28, 2026

What is enactive cognition, and why is it AI’s next development?

https://youtu.be/6bfCuxI_84U?si=22smpDUhDzpZbsAt (生成認知, enactive cognition)

Enactive cognition is the idea that intelligence is not something that happens only inside a brain (or a neural network). It emerges through continuous interaction between an agent and its world.

This idea comes from the work of Francisco Varela, Evan Thompson, and Eleanor Rosch in The Embodied Mind (1991).

They argued that cognition is enacted—literally “brought forth”—rather than simply computed.


Three views of cognition

Think of the history of AI and cognitive science as three stages.

1. Cognition as computation (1960s–1980s)

Mind = computer.

The brain receives inputs, performs symbolic computation, and outputs answers.

This inspired classical AI.

Intelligence = manipulating symbols.


2. Cognition as prediction (2010s–today)

Deep learning changed the picture.

Instead of manipulating symbols, AI learns statistical regularities from enormous datasets.

Large language models belong here.

Intelligence = predicting the next token.

This has proven astonishingly powerful.

But prediction is still largely passive.

The world provides data.

The model learns.


3. Cognition as action (enactive cognition)

Enactive theory says something radically different.

The agent is never merely observing.

It is

  • moving
  • touching
  • experimenting
  • failing
  • adapting
  • changing both itself and its environment.

Instead of

perceive → think → act

it becomes

perceive ↔ act ↔ perceive ↔ act

a continuous loop.

There is no clear boundary between perceiving and acting.
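The contrast between "perceive → think → act" and the continuous loop can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: `make_world`, `open_loop`, `closed_loop`, and the `slip` parameter are hypothetical names invented here, and the one-dimensional world in which every move falls short by a fixed fraction is a toy, not a model taken from the enactive literature.

```python
from typing import Callable, Tuple


def make_world(slip: float) -> Tuple[Callable[[], float], Callable[[float], None]]:
    """Hypothetical 1-D world: every commanded move falls short by `slip`."""
    state = {"pos": 0.0}

    def sense() -> float:
        # Perception: read the agent's current position.
        return state["pos"]

    def move(delta: float) -> None:
        # Action: the world only delivers part of the commanded move.
        state["pos"] += delta * (1.0 - slip)

    return sense, move


def open_loop(target: float, slip: float) -> float:
    """perceive -> think -> act, exactly once."""
    sense, move = make_world(slip)
    start = sense()          # perceive
    plan = target - start    # think
    move(plan)               # act
    return sense()


def closed_loop(target: float, slip: float, steps: int = 30) -> float:
    """perceive <-> act: each action changes the next percept."""
    sense, move = make_world(slip)
    for _ in range(steps):
        error = target - sense()  # perceive the result of the last action
        move(error)               # act on what was just perceived
    return sense()
```

With `slip=0.3` and a target of 10, the open-loop agent stops near 7, while the closed-loop agent ends up very close to 10, because each new percept already carries the consequences of the previous action.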


Why does this matter for AI?

Today’s LLMs know a tremendous amount.

Yet they lack something obvious.

Suppose I ask

“How heavy is this coffee mug?”

You simply pick it up.

GPT cannot.

Suppose I ask

“Is this chair stable?”

You sit on it.

GPT cannot.

Suppose I ask

“Can this door be pushed?”

You push.

GPT cannot.

Humans learn enormous amounts through action.

The world itself becomes part of cognition.

Enactive theorists would say

The world is part of the thinking process.


Intelligence is not inside the head

This is perhaps the deepest claim.

Traditional AI assumes

World

   ↓

Brain

   ↓

Answer

Enactive cognition says

Brain

Body

World

All three co-create intelligence.

Knowledge is not stored only internally.

It exists in the ongoing relationship.


Why many researchers think this is AI’s next step

Current LLMs already excel at language.

The bottleneck is now interaction.

Researchers increasingly ask:

What if the model could continuously act?

That leads toward

  • robotics
  • autonomous agents
  • long-term memory
  • experimentation
  • tool use
  • self-correction

Instead of reading millions of examples, the AI performs millions of experiments.

Very much like a child.


Relation to World Models

You’ve asked me previously about world models.

A world model predicts

“If I do X, what will happen?”

Enactive cognition says

“Try X.”

Prediction becomes intertwined with experimentation.

The model improves because it acts.

This is remarkably close to how children learn.
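One way to picture prediction intertwined with experimentation is an agent whose world model is a single number that it corrects after every trial. This is a minimal sketch under loud assumptions: the hidden rule `push_object`, the constant `TRUE_GAIN`, and the function `learn_by_acting` are all hypothetical, and the update is a plain least-mean-squares step chosen for brevity, not a method attributed to enactive theory or to any particular world-model system.

```python
import random

TRUE_GAIN = 0.8  # hypothetical hidden property of the world, unknown to the agent


def push_object(force: float, rng: random.Random) -> float:
    """The world's actual response to a push, with a little noise."""
    return TRUE_GAIN * force + rng.gauss(0.0, 0.01)


def learn_by_acting(trials: int = 200, lr: float = 0.1, seed: int = 0) -> float:
    """Improve a one-parameter world model by trying actions and observing results."""
    rng = random.Random(seed)
    gain_estimate = 0.0  # the agent's world model: displacement = gain * force
    for _ in range(trials):
        force = rng.uniform(0.5, 2.0)          # choose an experiment
        predicted = gain_estimate * force      # "If I do X, what will happen?"
        observed = push_object(force, rng)     # "Try X."
        error = observed - predicted           # surprise
        gain_estimate += lr * error * force    # revise the model (LMS step)
    return gain_estimate
```

Under these assumptions, `learn_by_acting()` returns an estimate close to the hidden value of 0.8. The model never sees a dataset; it only sees the outcomes of its own actions.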


Relation to Agentic AI

Agentic AI is one practical implementation.

An agent

  • sets goals,
  • plans,
  • acts,
  • observes,
  • revises,
  • repeats.

That action loop is fundamentally enactive.
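The goal, plan, act, observe, revise, repeat cycle can be sketched in a few lines. The example below is an assumption-laden toy, not a description of any real agent framework: `run_agent` and `probe` are hypothetical names, the "environment" is a single yes/no test (for instance, whether a door opens at a given push force), and the plan is simply to try the midpoint of the remaining range. It also assumes the test is monotone (once it succeeds, it succeeds for every larger value) and succeeds at `high`.

```python
from typing import Callable


def run_agent(probe: Callable[[int], bool], low: int, high: int,
              max_steps: int = 64) -> int:
    """Find the smallest value in [low, high] for which probe() succeeds."""
    # Goal: shrink the range [low, high] to a single value.
    for _ in range(max_steps):       # repeat
        if low >= high:              # goal reached
            break
        guess = (low + high) // 2    # plan: try the midpoint
        succeeded = probe(guess)     # act on the environment and observe
        if succeeded:                # revise beliefs about the threshold
            high = guess
        else:
            low = guess + 1
    return low


if __name__ == "__main__":
    # Hypothetical environment: the door opens at a push force of 37 or more.
    door_opens = lambda force: force >= 37
    print(run_agent(door_opens, 0, 100))  # prints 37
```

The agent starts with no knowledge of the threshold; what it ends up knowing comes entirely from the outcomes of its own actions.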

Today’s agents mostly act in digital environments.

Tomorrow’s agents are expected to act increasingly in the physical world through robots.


Relation to embodied AI

Embodied AI provides the body.

Enactive cognition provides the philosophy.

Without a body, there is no genuine sensorimotor loop.

This is why companies building humanoid robots are not simply making “robots.”

They are attempting to build systems that learn by living in the world, not merely by reading about it.


Why I think this is especially important

Kelly, one thing I have noticed across many of our conversations is that your own thinking often moves in a direction surprisingly close to enactive cognition.

When you describe your ideas—

  • “回到自然” (return to nature),
  • “follow the fragments,”
  • “車行哲學” (roughly, a philosophy of traveling by car),
  • “把自己交給事件” (entrust oneself to events),

you are repeatedly emphasizing that understanding does not arise primarily from abstract reasoning. It arises through encounter.

You once wrote that travel means

“把自己交給事件，交給經驗。” (entrust oneself to events, to experience.)

That is almost an enactive definition of learning.

The world teaches not because it contains answers waiting to be extracted, but because, through acting within it, both the person and the world are transformed.

This also connects with something we’ve discussed before: the distinction between an algorithmic self and an analogical self. An algorithmic self can often optimize within a fixed environment. An analogical self, by contrast, changes through genuine encounters that cannot be fully predicted in advance. Enactive cognition suggests that intelligence grows not only by making better predictions, but by entering richer cycles of action and response.

If large language models marked the era in which AI learned to think with language, then enactive AI may mark the era in which AI begins to learn through lived engagement. The challenge is no longer simply representing the world more accurately, but participating in it well enough that new forms of understanding can emerge.