
[Image: A digital illustration linking a glowing human brain to an artificial intelligence network, symbolizing the connection between biological and synthetic intelligence.]

The Brain Is Not a Transformer, But the Comparison Still Teaches Us Something

April 06, 2026 · 6 min read
By Tyler Small, AI Transformation Advisor

People love to compare the brain to artificial intelligence. Usually they do it badly. They point to one brain structure, point to one AI component, and act like they found a one-to-one match. That is usually wrong. The brain is a living, chemical, massively recurrent control system shaped by evolution, not a software stack built by engineers. Still, the comparison can be useful when we stay disciplined. The real value is not in claiming that a brain region “is” an AI layer. The value is in asking whether certain biological systems and certain machine learning systems solve similar kinds of problems, such as routing information, selecting actions, recalling relevant memory, or predicting what happens next. (PMC)

The thalamus is a good place to start because it is often misunderstood. It is not just a dumb relay station. Work by S. Murray Sherman and others argues that much of the thalamus, especially higher-order relays, helps carry information from one cortical area to another through cortico-thalamo-cortical pathways. That matters because it shifts the thalamus from “passive switchboard” to “active participant in communication and control.” The thalamic reticular nucleus adds another layer by helping gate what gets through, which is one reason attention researchers have been so interested in it. The AI parallel here is not “the thalamus equals attention.” That is too sloppy. A better parallel is learned routing and gating in modular systems, especially Mixture-of-Experts models, where a router decides which expert pathways should handle a given input. That is not the same mechanism, but it is the same general problem: selective allocation of limited processing bandwidth. (PMC)
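
To make the routing idea concrete without overclaiming, here is a toy sketch of top-k expert routing in the spirit of Mixture-of-Experts layers. Everything in it is illustrative: the expert count, the random weights, and the gating rule are stand-ins for components that real systems learn at enormous scale, not the mechanism of any particular model.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax.
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_route(token, router_weights, experts, top_k=2):
    """Route one token vector to its top-k experts and mix their outputs.

    token:          (d,) input vector
    router_weights: (d, n_experts) routing matrix (learned in real systems, random here)
    experts:        list of callables, one per expert
    """
    scores = softmax(token @ router_weights)         # routing probabilities over experts
    chosen = np.argsort(scores)[-top_k:]             # indices of the top-k experts
    weights = scores[chosen] / scores[chosen].sum()  # renormalize over the chosen experts
    # Only the selected experts do any work; the rest of the capacity stays idle.
    return sum(w * experts[i](token) for w, i in zip(weights, chosen))

# Toy usage: four "experts" that are just different linear maps of the input.
rng = np.random.default_rng(0)
d, n_experts = 8, 4
experts = [lambda x, W=rng.normal(size=(d, d)): W @ x for _ in range(n_experts)]
router_weights = rng.normal(size=(d, n_experts))
token = rng.normal(size=d)
print(moe_route(token, router_weights, experts).shape)  # -> (8,)
```

The point of the sketch is the shape of the problem, not the math: a cheap gate decides where the expensive computation goes, which is the same bandwidth-allocation pressure the thalamic circuitry is described as managing.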

The basal ganglia offer a different kind of parallel. Their job is often framed as action selection: helping enable desired actions while suppressing competing ones. That is a strong conceptual fit with reinforcement learning and policy selection in AI. In both cases, the system faces competition among possible next moves and must commit to one course while inhibiting others. But again, the analogy breaks if you push it too far. Biological action selection depends on inhibitory pathways, neuromodulation, and deeply embodied feedback loops. Reinforcement learning systems usually optimize mathematical objectives with gradient methods and simplified reward signals. So the parallel is real at the level of function, but weak at the level of implementation. That distinction matters, because most bad neuro-AI writing confuses functional similarity with mechanistic identity. (PMC)
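
Here is what policy-style action selection looks like at its most stripped-down: a softmax over learned action values commits to one move while suppressing the rest, and a simple value update nudges the estimates toward observed reward. The payoffs, temperature, and learning rate below are made-up toy numbers, not a model of the basal ganglia or of any production reinforcement learning system.

```python
import numpy as np

def select_action(q_values, rng, temperature=0.5):
    """Commit to one action while suppressing competitors.

    A softmax over learned action values: lowering the temperature sharpens
    the competition so one option wins more decisively.
    """
    prefs = np.asarray(q_values) / temperature
    probs = np.exp(prefs - prefs.max())
    probs /= probs.sum()
    return rng.choice(len(q_values), p=probs)

def update_q(q_values, action, reward, lr=0.1):
    # Shift the chosen action's value estimate toward the observed reward.
    q_values[action] += lr * (reward - q_values[action])
    return q_values

# Toy usage: three competing actions with different hidden average payoffs.
rng = np.random.default_rng(1)
true_rewards = [0.2, 0.8, 0.5]
q = np.zeros(3)
for _ in range(500):
    a = select_action(q, rng)
    r = true_rewards[a] + rng.normal(scale=0.1)   # noisy reward signal
    q = update_q(q, a, r)
print(np.round(q, 2))   # estimates drift toward [0.2, 0.8, 0.5], favoring the best action
```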

The hippocampus is one of the strongest comparisons because machine learning now has explicit memory architectures that make the analogy more than poetic. The hippocampus is heavily associated with episodic memory, and classic work ties the dentate gyrus and CA3 regions to pattern separation and pattern completion. In plain English, that means splitting similar experiences into distinct memory traces and later reconstructing a larger memory from a partial cue. That maps reasonably well to retrieval-augmented generation and related memory-augmented systems, where the model uses a cue to fetch relevant external information rather than relying only on what is baked into its parameters. Patrick Lewis and colleagues described RAG as combining parametric memory with non-parametric memory, which is exactly why the analogy is useful. But the hippocampus is still doing much more than a database lookup. It is a dynamic biological memory system with replay, consolidation, and rich interaction with the rest of the brain. So this is a helpful parallel, not a claim of sameness. (PMC)
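
To see why the cue-based retrieval framing fits, here is a minimal sketch of the retrieval half of a RAG-style pipeline: embed a query, score a small external memory by similarity, and hand back the best matches for a generator to condition on. The bag-of-words "embedding" is a deliberate stand-in; a real system would use a learned encoder and a vector index.

```python
import numpy as np

def tokenize(text):
    # Lowercase and strip basic punctuation so "memory?" matches "memory."
    return [w.strip(".,?!").lower() for w in text.split()]

def embed(text, vocab):
    """Stand-in embedding: a bag-of-words count vector over a shared vocabulary.

    A real RAG system would use a learned text encoder; this keeps the
    retrieval step runnable and deterministic.
    """
    vec = np.zeros(len(vocab))
    for word in tokenize(text):
        if word in vocab:
            vec[vocab[word]] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def retrieve(query, memory, k=2):
    """Non-parametric memory lookup: return the k passages most similar to the cue."""
    vocab = {w: i for i, w in enumerate(sorted({w for p in memory for w in tokenize(p)}))}
    q = embed(query, vocab)
    return sorted(memory, key=lambda p: -float(embed(p, vocab) @ q))[:k]

# Toy external memory standing in for the non-parametric store.
memory = [
    "The thalamus relays and gates cortical signals.",
    "The hippocampus supports episodic memory and pattern completion.",
    "The cerebellum predicts the sensory consequences of movement.",
]
print(retrieve("Which structure supports episodic memory?", memory, k=1))
# -> ['The hippocampus supports episodic memory and pattern completion.']
```

Even in this toy form, the division of labor is visible: the parametric part of the system does not have to memorize everything, because a partial cue can pull the relevant material out of an external store.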

The cerebellum points to yet another category of similarity: prediction. A large body of neuroscience frames the cerebellum as a system involved in forward models and sensory prediction error. In simple terms, it helps predict the consequences of actions and update behavior when reality does not match expectation. That makes it one of the cleaner conceptual cousins of model-based AI systems and world models. In machine learning, world models try to learn an internal representation of an environment so an agent can simulate futures, imagine outcomes, and plan. DreamerV3 and Genie are examples of this direction. Still, the difference is enormous. The cerebellum operates under hard real-time biological constraints in a body, while most AI world models operate in simulated environments or highly controlled computational settings. So the parallel is informative, but it should make us more humble, not more confident. (PMC)
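
A toy forward model makes the prediction-error idea tangible: predict the next state, observe what actually happens, and let the mismatch drive learning. The linear dynamics and learning rate here are illustrative assumptions, nowhere near a cerebellum or a DreamerV3-class world model, but the loop of predict, compare, and correct has the same shape.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hidden "true" environment: simple linear dynamics the model has to discover.
A_true = np.array([[0.9, 0.1],
                   [0.0, 0.95]])
B_true = np.array([[0.0],
                   [0.3]])

def env_step(state, action):
    return A_true @ state + B_true @ action + rng.normal(scale=0.01, size=2)

# Forward model: a learned linear predictor of the next state.
A_hat = np.zeros((2, 2))
B_hat = np.zeros((2, 1))
lr = 0.05

state = np.array([1.0, 0.0])
for _ in range(2000):
    action = rng.normal(size=1)
    predicted = A_hat @ state + B_hat @ action    # imagine the outcome first
    actual = env_step(state, action)              # then observe what really happened
    error = actual - predicted                    # the prediction error does the teaching
    A_hat += lr * np.outer(error, state)          # nudge the model toward reality
    B_hat += lr * np.outer(error, action)
    state = actual

print(np.round(A_hat, 2))   # should end up close to A_true
print(np.round(B_hat, 2))   # should end up close to B_true
```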

There is a broader lesson here for anyone thinking seriously about intelligence. The best neuro-AI comparisons are not built on visual resemblance or catchy metaphors. They are built on recurring computational pressures. Systems need to route information without flooding everything. They need to choose among competing actions. They need to retrieve the right memory at the right time. They need to predict outcomes before committing resources. Brains and AI systems both face these pressures, so it is no surprise that loose parallels keep showing up. But that does not mean today’s frontier models are miniature brains, and it definitely does not mean neuroscience has already been translated into software. Right now, the more honest claim is that AI occasionally rediscovers abstract problems biology solved long ago, while solving them in radically different ways. (PMC)

That is why the comparison remains worth making. Not because it proves that human cognition is just computation in the narrow engineering sense. And not because it proves that transformers, routers, retrieval systems, or world models are brain-like. The value is that the comparison forces sharper questions. What is routing for? What is memory for? What is prediction for? What kind of architecture helps a system stay selective without becoming rigid, and adaptive without becoming chaotic? Those are better questions than “which AI layer is the thalamus,” and they are more likely to produce insight. If the next generation of AI gets closer to biological intelligence, it will probably happen not because we copied anatomy literally, but because we learned to recognize the deeper architectural problems both brains and machines must solve. (PMC)

References for further reading:

Sherman, S. M. “The thalamus is more than just a relay.” Current Opinion in Neurobiology, 2007.

McAlonan, Brown, and Bowman. “Thalamic Reticular Nucleus Activation Reflects Attentional Gating during Classical Conditioning.” Journal of Neuroscience, 2000.

Mink, J. W. “Basal Ganglia Mechanisms in Action Selection, Plasticity, and Dystonia.” European Journal of Paediatric Neurology, 2018.

Yassa and Stark. “Pattern separation in the hippocampus.” Trends in Neurosciences, 2011.

Knierim and Neunuebel. “Tracking the Flow of Hippocampal Computation: Pattern Separation, Pattern Completion, and Attractor Dynamics.” Neurobiology of Learning and Memory, 2016.

Popa, Hewitt, and Ebner. “Cerebellum, Predictions and Errors.” Frontiers in Cellular Neuroscience, 2019.

Fedus, Zoph, and Shazeer. “Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity.” Journal of Machine Learning Research, 2022.

Lewis et al. “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.” Advances in Neural Information Processing Systems, 2020.

Hafner et al. “Mastering Diverse Domains through World Models.” arXiv preprint, 2023.

Bruce et al. “Genie: Generative Interactive Environments.” arXiv preprint, 2024.
