Cognitive robotics is the subfield of robotics and artificial intelligence concerned with endowing robots with cognitive capabilities. The capabilities typically targeted include perception, attention, memory, reasoning, learning, planning, action selection, and social interaction. Cognitive robotics overlaps substantially with embodied AI, developmental robotics, social robotics, and bio-inspired robotics, but it is distinguished by its focus on the high-level mental processes that turn a moving machine into something that can be said to know, decide, and adapt.
The term was coined by the research group of Yves Lesperance, Hector Levesque, Fangzhen Lin, Daniel Marcu, Ray Reiter, and Richard Scherl at the University of Toronto in 1994 and was put forward more programmatically in the 1998 "Cognitive Robotics Manifesto" by Levesque and Reiter. In Levesque and Lakemeyer's 2008 chapter in the Handbook of Knowledge Representation, the field is defined as "the study of the knowledge representation and reasoning problems faced by an autonomous robot (or an agent) in a dynamic and incompletely known world."
Since that early Toronto work, cognitive robotics has expanded well beyond logical knowledge representation. It now spans developmental robotics in the tradition of Asada, Cangelosi, Pfeifer and Sandini; social robotics in the lineage of Brooks and Breazeal at MIT; symbolic cognitive architectures like Soar, ACT-R, and ICARUS being grounded on physical platforms; and the new wave of foundation-model robotics where vision-language-action systems such as RT-2, OpenVLA, π0, Helix, and Gemini Robotics drive humanoids using large pre-trained models.
Cognitive robotics sits at the intersection of robotics, artificial intelligence, cognitive science, and neuroscience. It can be contrasted with each of these neighbours along the following axes.
| Comparison | Cognitive robotics emphasises | The other field emphasises |
|---|---|---|
| Vs. classical robotics | High-level cognition: reasoning, knowledge, language, social behaviour | Low-level control, mechanics, kinematics, motion planning |
| Vs. AI | Embodied agents acting in the physical world; perception-action loops | Abstract symbol manipulation, disembodied algorithms, software agents |
| Vs. cognitive psychology / neuroscience | Engineering perspective; build artefacts that can act | Empirical study of biological cognition |
| Vs. cognitive science | Synthetic methodology: "understanding by building" | Theory and behavioural experiment |
| Vs. behaviour-based robotics | Internal representation, deliberation, language understanding | Reactive layered behaviours without explicit world models |
| Vs. developmental robotics | Often takes adult-like cognition as the design target | Models the developmental trajectory from infancy |
The boundary with developmental robotics is the most fluid. Asada and colleagues set out the programme of "cognitive developmental robotics" (CDR) in their 2009 survey specifically to bridge the two: CDR uses physical embodiment and interaction to build up cognitive functions from body representation through to social behaviour, with the goal of understanding the development of human higher cognition through synthesis.
Classical AI included robotics from the start. The most influential early system was Shakey, built at the Stanford Research Institute (SRI) between 1966 and 1972 under Charles Rosen, Nils Nilsson, Bertram Raphael, and Peter Hart. Shakey was the first mobile robot to reason about its actions: it integrated logical reasoning, autonomous plan creation, plan execution with error recovery, computer vision, navigation, and natural-language communication in a single physical system. The project produced the A* search algorithm, the Hough transform, and the visibility graph method as direct by-products. Shakey defined what "a robot that thinks" looked like for a generation.
In the 1980s, dedicated cognitive robotics groups began to form. The Toronto cognitive robotics group around Hector Levesque and Ray Reiter started developing logical foundations for action and change, using Reiter's reformulation of the situation calculus. The ATR Cognitive Robotics group in Japan worked on perception and learning for autonomous robots.
In 1991, Rodney Brooks at MIT published "Intelligence Without Representation" in Artificial Intelligence (volume 47, pages 139 to 159). The paper argued that classical AI had foundered on representation, and that intelligence approached incrementally through perception and action need not require explicit symbolic models. Brooks's subsumption architecture, demonstrated on robots like Genghis and later Cog, organised behaviour into layers of simple competences (wander, avoid obstacles, follow walls) without a central world model. The paper became one of the most cited critiques of symbolic AI.
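The layered idea can be sketched in a few lines. This is a toy illustration only, not Brooks's actual implementation: the layer names, the sensor dictionary, and the first-non-None priority scheme (which approximates subsumption's suppression of lower layers by higher ones) are all invented for this example.

```python
def wander(sensors):
    # Lowest competence: always proposes moving forward.
    return "forward"

def avoid_obstacles(sensors):
    # Higher competence: overrides wandering when something is too close.
    if sensors["range_cm"] < 30:
        return "turn_left"
    return None  # no opinion; defer to lower layers

def follow_wall(sensors):
    # Highest competence here: hugs a wall when one is sensed on the right.
    if sensors.get("wall_right") and sensors["range_cm"] >= 30:
        return "forward_along_wall"
    return None

# Layers ordered from highest to lowest priority; the first non-None wins,
# approximating subsumption-style suppression without a central world model.
LAYERS = [follow_wall, avoid_obstacles, wander]

def step(sensors):
    for layer in LAYERS:
        command = layer(sensors)
        if command is not None:
            return command

print(step({"range_cm": 20}))                       # avoid layer fires
print(step({"range_cm": 100, "wall_right": True}))  # wall-following fires
print(step({"range_cm": 100}))                      # default wandering
```

Note what is absent: no map, no plan, no symbolic model of obstacles or walls; each layer couples sensing to action directly, which is precisely the design stance the paper argued for.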
At the same time, the Toronto group went the other way. In the mid-1990s Levesque, Reiter, Lesperance, Lin, and Scherl introduced GOLOG, a high-level programming language built on the situation calculus and designed specifically for cognitive robots that need to reason about the effects of their actions. GOLOG was later extended to ConGolog (concurrent) and IndiGolog (incremental, supporting interleaved planning, sensing, and action) in collaboration with Giuseppe De Giacomo.
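Reiter's treatment of action and change rests on successor state axioms, which specify exactly when each fluent holds after an action. A standard blocks-world-style example (illustrative, not quoted from the GOLOG papers) is:

```latex
% "Holding x after doing a in s" holds iff a picked x up,
% or x was already held and a did not drop it:
\mathit{Holding}(x,\,do(a,s)) \;\equiv\;
  a = \mathit{pickup}(x) \;\lor\;
  \bigl(\mathit{Holding}(x,s) \,\land\, a \neq \mathit{drop}(x)\bigr)
```

Axioms of this form give GOLOG programs a principled way to project the consequences of candidate action sequences before committing to them.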
The MIT humanoid robotics group under Brooks, with Cynthia Breazeal as a graduate student, built Cog (an upper-torso humanoid with 21 degrees of freedom and visual, auditory, vestibular, kinesthetic, and tactile senses) and Kismet (an expressive head designed for face-to-face social interaction). Kismet, completed in the late 1990s, is widely cited as the first social robot and as the founding artefact of social robotics.
The 2000s saw the consolidation of developmental robotics as a named field, driven by Max Lungarella, Giorgio Metta, Rolf Pfeifer, Giulio Sandini, Minoru Asada, Yasuo Kuniyoshi, and others. The signature artefact was the iCub: a one-metre humanoid the size of a 3.5-year-old child, designed by the RobotCub consortium and built at the Istituto Italiano di Tecnologia (IIT) in Genoa. The RobotCub project ran for 65 months from 1 September 2004 to 31 January 2010 with EUR 8.5 million from Unit E5 of the European Commission's Sixth Framework Programme. The "cub" in iCub stands for Cognitive Universal Body, and the platform was explicitly motivated by the embodied cognition hypothesis: that human-like manipulation is essential for human-like cognition. About thirty iCubs are in research labs, mostly in the European Union with one in the United States.
In 2007, Pfeifer and Bongard published How the Body Shapes the Way We Think: A New View of Intelligence (MIT Press), arguing that the structure of cognition is constrained and enabled by the morphology and material properties of the body. The book popularised a research methodology of "understanding by building" and a concrete agenda around morphological computation.
In 2009, Asada, Hosoda, Kuniyoshi, Ishiguro, Inui, Yoshikawa, Ogino, and Yoshida published "Cognitive Developmental Robotics: A Survey" in IEEE Transactions on Autonomous Mental Development, volume 1, issue 1, pages 12 to 34. The survey defined CDR's research agenda: physical embodiment as the foundation, then body representation, then motor and perceptual development, then social behaviour.
In the 2010s, classic cognitive architectures from cognitive science were applied to physical robots more systematically. Soar (Laird), ACT-R (Anderson), ICARUS (Langley), CLARION (Sun), LIDA (Franklin), Sigma (Rosenbloom), GLAIR, and Verschure's biologically-inspired Distributed Adaptive Control (DAC) all saw robotic implementations. KnowRob, introduced by Moritz Tenorth and Michael Beetz in 2009 and described in the International Journal of Robotics Research in 2013, became the most widely used knowledge-processing framework for cognition-enabled robots; it uses ontologies and "virtual knowledge bases" computed on demand from the robot's perception and planning components.
David Vernon's 2014 textbook Artificial Cognitive Systems: A Primer (MIT Press) consolidated the field into the cognitivist, emergent, and hybrid paradigms, with chapters on autonomy, embodiment, learning, memory, knowledge, and social cognition.
The biggest single change to cognitive robotics since the 1990s arrived with large pre-trained models. Google's PaLM-SayCan paper (Ahn et al., 2022) paired the PaLM language model with a learned affordance function and a library of low-level skills: the LLM proposed candidate actions, and the affordance function pruned them to those physically feasible in the current state. PaLM-SayCan reported 84% planning success and 74% execution success on a real mobile manipulator.

RT-1 (Brohan et al., December 2022) introduced the Robotics Transformer, trained on 130,000 episodes covering 700+ tasks. RT-2 (Brohan et al., July 2023) extended the idea by treating actions as language tokens, co-fine-tuned with a vision-language model on Internet-scale data. The Open X-Embodiment / RT-X effort (Padalkar et al., 2024) pooled 60 datasets from 34 labs into one corpus of 1M+ trajectories across 22 embodiments and trained cross-embodiment policies.

OpenVLA (Kim et al., 2024) released a 7B-parameter open-source VLA built on Llama 2 plus DINOv2 and SigLIP visual encoders, beating RT-2-X by 16.5% with seven times fewer parameters. Octo (Octo Model Team, 2024) added a fully open transformer diffusion policy trained on 800K episodes.

Gemini Robotics from Google DeepMind, released in 2025 and updated as Gemini Robotics 1.5, brought the Gemini family directly into robot control. Physical Intelligence's pi0 (also written π0, Levine et al., October 2024) introduced a flow-matching action expert on top of a vision-language model and demonstrated long-horizon tasks like folding laundry; the company has raised more than USD 400 million and open-sourced the model.
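The SayCan-style selection step can be sketched in a few lines. Everything concrete here is invented for illustration: the skill names, the hand-set scores, and the state dictionary. The real system uses PaLM log-likelihoods for the language side and learned value functions as affordances; only the multiplicative combination is the technique itself.

```python
import math

SKILLS = ["pick up the sponge", "go to the table", "open the drawer"]

def llm_score(instruction, skill):
    # Stand-in for the LLM's log-probability that `skill` is a useful
    # next step toward `instruction` (hand-set here for illustration).
    table = {"pick up the sponge": -0.5,
             "go to the table": -2.0,
             "open the drawer": -3.0}
    return table[skill]

def affordance(skill, state):
    # Stand-in for a learned value function: estimated probability that
    # the skill can succeed from the current physical state.
    return state.get(skill, 0.0)

def select_skill(instruction, state):
    # SayCan combines the two scores multiplicatively:
    # usefulness (language model) times feasibility (affordance).
    scored = [(math.exp(llm_score(instruction, s)) * affordance(s, state), s)
              for s in SKILLS]
    return max(scored)[1]

state = {"pick up the sponge": 0.1,  # sponge currently out of reach
         "go to the table": 0.9,     # navigation is feasible
         "open the drawer": 0.8}
print(select_skill("clean the table", state))  # → "go to the table"
```

The example shows the key behaviour: the language model alone prefers "pick up the sponge", but the low affordance in the current state vetoes it, so the feasible navigation skill wins.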
NVIDIA's Project GR00T, announced 18 March 2024 at GTC, is a foundation model targeting humanoid robots, with a dual-system architecture (System 1 reflexive, System 2 deliberative) and a Jetson Thor on-board computer; partners include 1X Technologies, Agility Robotics, Apptronik, Boston Dynamics, Figure AI, Fourier Intelligence, Sanctuary AI, Unitree Robotics, and XPENG. Figure AI's Helix, released in February 2025 and updated as Helix 02 in early 2026, is a VLA controlling first the full humanoid upper body and now the whole body.
A cognitive architecture is a specification of the fixed structure of a mind: its memories, processes, and how they interact. Several have been applied to robotic platforms.
| Architecture | Originator | Year | Style | Robotic uses |
|---|---|---|---|---|
| Soar | John Laird, Allen Newell, Paul Rosenbloom | 1983 | Symbolic with reinforcement learning, episodic and semantic memory | Robo-Soar (1991) on a Puma arm; mobile robots; REEM service robots; unmanned underwater vehicles |
| ACT-R | John Anderson | 1993 | Declarative and procedural memory, modular cognitive science model | Mobile robots, human-robot teams |
| ICARUS | Pat Langley | 1991 | Concepts and skills hierarchies | Indoor mobile robots, manipulation |
| CLARION | Ron Sun | 1997 | Dual-process: explicit symbolic plus implicit subsymbolic | Cognitive simulations and robot agents |
| LIDA | Stan Franklin | 2006 | Global Workspace consciousness model | Cognitive software and robots |
| Sigma | Paul Rosenbloom | 2011 | Graphical-model unification | Limited robotic deployments |
| GLAIR | Stuart Shapiro | 1990s | Grounded layered architecture with integrated reasoning | Cassie / FEVAHR, manipulation |
| DAC | Paul Verschure | 1992 onwards | Distributed Adaptive Control, biologically inspired layers | Mobile robots, the Ada robot, neuroprosthetics |
| Subsumption | Rodney Brooks | 1986 | Reactive layers without world model | Genghis, Allen, Herbert, Cog |
Soar and ACT-R are the oldest and most widely used. Similarities among Soar, ACT-R, and Sigma prompted the Common Model of Cognition initiative to articulate a shared abstract specification.
| System | Origin | Year | Role |
|---|---|---|---|
| GOLOG, ConGolog, IndiGolog | Levesque, Reiter, Lesperance, De Giacomo (Toronto, York, Sapienza) | 1994 onwards | Situation-calculus-based programming languages for cognitive robots |
| KnowRob | Moritz Tenorth and Michael Beetz, Munich and Bremen | 2009 | Ontology-based knowledge processing framework for everyday manipulation |
| iCub | RobotCub consortium, IIT Genoa | 2004 | Open-source humanoid testbed for embodied cognition |
| Cog | Rodney Brooks, MIT | 1993 to 2003 | Upper-torso humanoid for developmental and social cognition |
| Kismet | Cynthia Breazeal, MIT | Late 1990s | Pioneer expressive social robot |
| Nico | Brian Scassellati, Yale | 2005 onwards | Child-like humanoid for cognitive science |
| HUMANOID series | Atsuo Takanishi, Waseda | 1980s onwards | Bipedal and emotional humanoids |
| Pepper | SoftBank Robotics, Aldebaran | 2014 | Mass-produced sociable humanoid; over 27,000 units at peak; deployed in retail, healthcare, hospitality, banking, education |
| REEM-C | PAL Robotics | 2013 | Research humanoid used with Soar and ROS |
| Robonaut | NASA and General Motors | 2000s onwards | Humanoid for orbital and ground tasks |
| Atlas | Boston Dynamics | 2013 onwards | Bipedal platform with limited explicit cognition, increasingly paired with foundation models |
| iRobot Roomba | iRobot, founded by Brooks, Greiner, Angle | 2002 | Minimal cognition: simple mapping and behaviour-based control |
Many of these platforms run on the Robot Operating System (ROS) for middleware, which has become the default communication layer for cognitive-robot research stacks.
Perception. Object recognition, scene understanding, multimodal integration of vision, audio, touch, and proprioception. Modern cognitive robotics increasingly uses pretrained vision encoders (DINOv2, SigLIP, CLIP) as front-ends for higher-level reasoning.
Attention and saliency. Top-down and bottom-up attention models that direct sensors and computation to task-relevant regions. Joint attention, where two agents attend to the same object, is a particular research focus in social cognitive robotics.
Knowledge representation. Ontologies (KnowRob), semantic maps, scene graphs, and situation calculus. These provide the structured background a robot needs to reason about objects, places, capabilities, and norms.
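A toy scene-graph fragment makes the idea concrete. The data layout and query function here are invented for this example; real frameworks such as KnowRob use full ontologies and Prolog-style queries rather than plain dictionaries.

```python
# Objects carry type information; relations are (subject, predicate, object)
# triples, the common backbone of scene graphs and semantic maps.
scene = {
    "objects": {"cup1": {"type": "Cup"}, "table1": {"type": "Table"}},
    "relations": [("cup1", "on", "table1")],
}

def objects_on(support, scene):
    # Query: which objects currently stand on `support`?
    return [subj for (subj, rel, obj) in scene["relations"]
            if rel == "on" and obj == support]

print(objects_on("table1", scene))  # → ['cup1']
```

Even this minimal structure supports the kind of question a manipulation planner needs answered before acting, which is what the heavier ontology machinery scales up.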
Reasoning. Situation-calculus-based reasoning in the GOLOG family; classical and probabilistic planning; commonsense reasoning over everyday objects and situations; causal reasoning about why an action will or will not work.
Memory. Episodic memory of specific past experiences, semantic memory of general facts, and procedural memory of motor skills. Soar, ACT-R, and LIDA all distinguish these subsystems explicitly.
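The three-way split can be sketched as a data structure. The class and field layout are invented for illustration; architectures like Soar and ACT-R implement these as distinct architectural memories with their own retrieval mechanisms.

```python
from dataclasses import dataclass, field

@dataclass
class RobotMemory:
    episodic: list = field(default_factory=list)    # timestamped experiences
    semantic: dict = field(default_factory=dict)    # general facts about kinds
    procedural: dict = field(default_factory=dict)  # skill name -> executable policy

mem = RobotMemory()
mem.episodic.append({"t": 12.4, "event": "grasped cup1"})   # a specific past event
mem.semantic["Cup"] = {"graspable": True}                   # a general fact
mem.procedural["grasp"] = lambda obj: f"closing gripper on {obj}"  # a motor skill

print(mem.procedural["grasp"]("cup1"))  # → "closing gripper on cup1"
```

The point of the separation is retrieval: episodic entries are queried by time and context, semantic entries by kind, and procedural entries are not inspected at all but executed.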
Learning. Developmental learning that ramps up complexity over time, imitation learning and learning from demonstration (including teleoperated trajectories), reinforcement learning, and the self-supervised pretraining that drives modern VLA models.
Social cognition. Theory of mind, gaze following, joint attention, empathy, and turn-taking. Kismet was the first artefact built explicitly to engage in face-to-face social interaction, and the line continues through Nao, Pepper, and modern humanoids.
Language and dialogue. From early work on instruction following with parsers and grammars to today's LLM-grounded dialogue systems that interpret "please load the dishwasher" and decompose it into a feasible plan.
Embodiment and morphology. Pfeifer and Bongard 2007 is the canonical reference. The argument is that cognitive abilities are shaped by what the body can sense and do; "morphological computation" exploits passive dynamics and material properties to offload work that would otherwise have to be computed.
Tool use, manipulation, and metacognition. Tool use is a long-standing benchmark for cognitive ability and a current frontier for VLA models. Metacognition (robots that monitor their own state, recognise their limitations, and decide when to ask for help) draws on uncertainty estimation, meta-reasoning, and explicit self-models.
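The ask-for-help pattern can be sketched as a confidence gate. The function, threshold, and message format are invented for this example; real systems derive the confidence value from ensembles or calibrated model uncertainty rather than receiving it as an argument.

```python
def act_or_ask(action, confidence, threshold=0.7):
    # Metacognitive gate: execute only when the robot trusts its own
    # estimate; otherwise defer to a human rather than risk failure.
    if confidence >= threshold:
        return f"execute:{action}"
    return f"ask_human:unsure about {action} (p={confidence:.2f})"

print(act_or_ask("open_drawer", 0.91))  # → "execute:open_drawer"
print(act_or_ask("pour_coffee", 0.42))  # → "ask_human:unsure about pour_coffee (p=0.42)"
```

The design question hiding in the threshold is the hard part: a miscalibrated confidence estimate makes the gate either paralysingly cautious or dangerously overconfident.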
The period since 2022 has seen a rapid succession of vision-language-action (VLA) and robot foundation models that fold parts of cognitive robotics into a single learned system.
| Model | Authors | Year | Contribution |
|---|---|---|---|
| PaLM-SayCan | Ahn et al. (Google, Everyday Robots) | April 2022 | First widely cited LLM-plus-affordance system; PaLM 540B as planner, value-function affordances as filter |
| PaLM-E | Driess et al. (Google) | March 2023 | Embodied multimodal language model |
| RT-1 | Brohan et al. (Google) | December 2022 | Robotics Transformer trained on 130K episodes, 700+ tasks |
| RT-2 | Brohan et al. (Google DeepMind) | July 2023 | Vision-language-action model treating actions as language tokens; chain-of-thought planning |
| Open X-Embodiment / RT-X | Padalkar et al. (34 labs) | 2024 | 1M+ trajectories across 22 embodiments |
| OpenVLA | Kim et al. (Stanford, UC Berkeley, Google DeepMind, TRI) | June 2024 | 7B open-source VLA on Llama 2 plus DINOv2 plus SigLIP |
| Octo | Octo Model Team | May 2024 | Open transformer-diffusion generalist policy on 800K episodes |
| pi0 | Physical Intelligence (Levine et al.) | October 2024 | VLA flow-matching policy with action expert; folds laundry; later open-sourced |
| Gemini Robotics, Gemini Robotics-ER, 1.5 | Google DeepMind | 2025 | VLA built on Gemini 2.0; embodied reasoning variant; cross-embodiment transfer |
| Project GR00T, GR00T N1 | NVIDIA | March 2024 onwards | Humanoid foundation model with dual-system architecture; GR00T N1 first openly released |
| Helix, Helix 02 | Figure AI | 2025 to 2026 | Full upper-body and then full-body humanoid VLA with System 1 / System 2 split |
These systems bring cognitive abilities like instruction following, novel-object generalisation, and long-horizon planning into the robot stack without explicit symbolic engineering. They also import the open problems of large models: hallucinations, distribution shift, dataset bias, and limited interpretability.
Cognitive robotics has always been in conversation with cognitive science.
Embodied cognition. The view associated with George Lakoff, Francisco Varela, Evan Thompson, and Eleanor Rosch holds that cognition is grounded in the body's interactions with the environment. iCub was built around this hypothesis, and Pfeifer and Bongard 2007 is its robotic manifesto.
Predictive processing and active inference. Karl Friston's free energy principle has been picked up by cognitive roboticists, including Paul Verschure with DAC, as a unifying account of perception, action, and learning.
Mirror neurons and imitation. The discovery of mirror neurons in macaque area F5 by Rizzolatti and colleagues shaped a generation of imitation-learning research in robotics.
Joint attention. Developmental psychology of joint attention in infants directly inspired robotic gaze-following and shared-attention systems on Cog, Kismet, Nico, and iCub.
Real-world generalisation. The dominant problem: today's policies still fail on long tails of object shapes, lighting conditions, and clutter that humans handle easily. Bridging the sim-to-real gap and broadening dataset diversity remain active.

Long-horizon autonomy. Maintaining coherent behaviour over hours or days is largely unsolved outside curated demos.

Sample efficiency. Robots lag far behind humans; an infant learns to grasp from far fewer trials than a current VLA.

Safety and verification. Verifying cognitive behaviour, particularly in human-shared spaces, has no general solution.

Interpretability. Foundation-model decisions are poorly interpretable, which complicates debugging and certification.

Neuro-symbolic integration. Combining symbolic and connectionist methods, the perennial neuro-symbolic question, has new urgency now that LLMs supply much of the symbolic-style competence end-to-end.

Common-sense knowledge. Robots' everyday knowledge remains incomplete despite efforts like KnowRob and large LLMs.

Energy efficiency. On humanoids, a battery-powered onboard compute budget meets real-time control, driving architectural choices like NVIDIA's Jetson Thor and Figure's onboard accelerators.
| Venue | Type | Focus |
|---|---|---|
| ICDL (International Conference on Development and Learning) | Conference | Developmental and epigenetic robotics, cognitive development |
| HRI (ACM/IEEE International Conference on Human-Robot Interaction) | Conference | Social cognitive robotics, interaction design |
| ICRA (IEEE International Conference on Robotics and Automation) | Conference | Broad robotics including cognitive themes |
| IROS (IEEE/RSJ International Conference on Intelligent Robots and Systems) | Conference | Intelligent and cognitive robots |
| AAAI Cognitive Robotics Symposium | Symposium | Knowledge representation and reasoning for robots |
| IJCAI Cognitive Robotics Workshop | Workshop | Continuation of the Toronto manifesto tradition |
| RSS (Robotics: Science and Systems) | Conference | Algorithms and learning, increasingly VLA-heavy |
| CoRL (Conference on Robot Learning) | Conference | Robot learning, dominant venue for VLA work since 2017 |
| IEEE Transactions on Cognitive and Developmental Systems (TCDS) | Journal | Cognition and development in natural and artificial systems |
| Cognitive Systems Research | Journal | Multidisciplinary cognitive systems |
| Frontiers in Robotics and AI | Journal | Open-access cognitive and developmental robotics |
IEEE TCDS, formerly IEEE Transactions on Autonomous Mental Development (which published the Asada et al. 2009 survey in its inaugural issue), is the field's flagship journal and is closely tied to ICDL through joint special issues.