# Cognitive robotics

> Source: https://aiwiki.ai/wiki/cognitive_robotics
> Updated: 2026-06-27
> Categories: Artificial Intelligence, Embodied AI, Robotics
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
> From AI Wiki (https://aiwiki.ai), the free encyclopedia of artificial intelligence. Reuse freely with attribution to "AI Wiki (aiwiki.ai)".

**Cognitive robotics** is the subfield of [robotics](/wiki/robotics) and [artificial intelligence](/wiki/artificial_intelligence) concerned with endowing robots with higher-level cognitive capabilities: perception, attention, memory, [reasoning](/wiki/reasoning), knowledge representation, planning, learning, action selection, and social interaction, so that they can decide and act in complex, dynamic, and incompletely known environments [1]. It is distinguished from purely control-theoretic robotics by its focus on the mental processes (knowing, deliberating, adapting) that turn a moving machine into an agent, rather than on the low-level kinematics and feedback control that move the machine itself. Hector Levesque and Gerhard Lakemeyer define the field as "the study of the knowledge representation and reasoning problems faced by an autonomous robot (or an agent) in a dynamic and incompletely known world" [1].

Cognitive robotics overlaps substantially with [embodied AI](/wiki/embodied_ai), developmental robotics, social robotics, and bio-inspired robotics, and it draws on cognitive science and neuroscience for its models of mind. David Vernon, in his 2014 primer *Artificial Cognitive Systems*, frames the design space around a working definition of a cognitive system: "A cognitive system is an autonomous system that can perceive its environment, learn from experience, anticipate the outcome of events, act to pursue goals, and adapt to changing circumstances" [6]. The field is conventionally organised around two contrasting paradigms, the cognitivist (symbolic) approach and the emergent (embodied, self-organising) approach, with hybrid systems attempting to combine the two [6][33].

The term was coined by the research group of Yves Lesperance, Hector Levesque, Fangzhen Lin, Daniel Marcu, Ray Reiter, and Richard Scherl at the University of Toronto in 1994 and was put forward more programmatically in the 1998 "Cognitive Robotics Manifesto" by Levesque and Reiter [32]. Since that early Toronto work, cognitive robotics has expanded well beyond logical knowledge representation. It now spans developmental robotics in the tradition of Asada, Cangelosi, Pfeifer and Sandini [2][4]; social robotics in the lineage of Brooks and Breazeal at MIT [11][12]; symbolic cognitive architectures like Soar, ACT-R, and ICARUS being grounded on physical platforms [14][15][16]; and the new wave of foundation-model robotics where vision-language-action systems such as RT-2, OpenVLA, pi0, Helix, and Gemini Robotics drive humanoids using large pre-trained models [21][23][25].

## What is cognitive robotics, and how does it differ from traditional robotics?

Cognitive robotics sits at the intersection of robotics, artificial intelligence, cognitive science, and neuroscience. The clearest way to pin it down is by contrast with its neighbours: where control-theoretic and classical robotics ask "how do I move the actuators to follow this trajectory?", cognitive robotics asks "what should I do, why, and how do I know?". The table below draws the same contrast for each neighbouring field.

| Comparison | Cognitive robotics emphasises | The other field emphasises |
|---|---|---|
| Vs. classical / control-theoretic robotics | High-level cognition: reasoning, knowledge, language, social behaviour | Low-level control, mechanics, kinematics, motion planning, feedback control |
| Vs. AI | Embodied agents acting in the physical world; perception-action loops | Abstract symbol manipulation, disembodied algorithms, software agents |
| Vs. cognitive psychology / neuroscience | Engineering perspective; build artefacts that can act | Empirical study of biological cognition |
| Vs. cognitive science | Synthetic methodology: "understanding by building" | Theory and behavioural experiment |
| Vs. behaviour-based robotics | Internal representation, deliberation, language understanding | Reactive layered behaviours without explicit world models |
| Vs. developmental robotics | Often takes adult-like cognition as the design target | Models the developmental trajectory from infancy |

The boundary with developmental robotics is the most fluid. Lungarella, Metta, Pfeifer and Sandini, in their foundational 2003 survey, describe developmental robotics as "an emerging field located at the intersection of robotics, cognitive science and developmental sciences" [30]. Asada and colleagues introduced "cognitive developmental robotics" (CDR) in 2009 specifically to bridge cognitive and developmental robotics: CDR uses physical embodiment and interaction to build up cognitive functions from body representation through to social behaviour, with the goal of understanding the development of human higher cognition through synthesis [2].

## What are the two paradigms of cognition in cognitive robotics?

Vernon's *Artificial Cognitive Systems* and the earlier Vernon, Metta and Sandini survey organise the whole field around two broad classes of cognition, plus a hybrid middle ground [6][33].

| Paradigm | Core commitment | Associated methods | Robotic exemplars |
|---|---|---|---|
| Cognitivist (symbolic) | Cognition is rule-based manipulation of symbolic representations; a physical symbol system | Knowledge representation, logic, planning, search, production rules | GOLOG family, KnowRob, Soar, ACT-R, ICARUS on robots |
| Emergent | Cognition is a self-organising process arising from an agent's embodied interaction with its world | Connectionist, dynamical-systems, and enactive approaches; development | Subsumption robots, iCub developmental learning, DAC |
| Hybrid | Combine symbolic deliberation with sub-symbolic, learned components | Neuro-symbolic systems, learned skills under symbolic supervisors | PaLM-SayCan (LLM planner plus learned skills), many modern stacks |

The cognitivist position, descended from the physical-symbol-system hypothesis of Newell and Simon, treats the mind as computation over explicit symbols and was the dominant view through the first decades of AI [6]. The emergent position holds that cognition cannot be cleanly separated from the body and environment that produced it, and is "based to a greater or lesser extent on principles of self-organization" spanning connectionist, dynamical, and enactive systems [33]. Vernon, Metta and Sandini note that autonomy "is not necessarily implied by the cognitivist paradigm" but is central to the emergent paradigm, "since cognition is the process whereby an agent develops" [33]. The modern foundation-model turn (see below) is often read as a new, data-driven form of hybrid system: a large connectionist model that nonetheless absorbs symbol-like competences such as instruction following and planning [35].

## When did cognitive robotics begin? Origins and historical context

### 1950s to 1970s: classical AI and the first reasoning robot

Classical AI included robotics from the start. The most influential early system was Shakey, built at the Stanford Research Institute (SRI) between 1966 and 1972 under Charles Rosen, [Nils Nilsson](https://ai.stanford.edu/~nilsson/), Bertram Raphael, and Peter Hart [13]. Shakey was the first mobile robot to reason about its actions: it integrated logical reasoning, autonomous plan creation, plan execution with error recovery, computer vision, navigation, and natural-language communication in a single physical system [13]. The project produced the A* search algorithm, the Hough transform, and the visibility graph method as direct by-products. Shakey defined what "a robot that thinks" looked like for a generation.

### 1980s: Toronto and ATR

In the 1980s, dedicated cognitive robotics groups began to form. The Toronto cognitive robotics group around Hector Levesque and Ray Reiter started developing logical foundations for action and change, using Reiter's reformulation of the situation calculus [1]. The ATR Cognitive Robotics group in Japan worked on perception and learning for autonomous robots.

### 1990s: Brooks, embodiment, and the situated revolt

In 1991, [Rodney Brooks](https://people.csail.mit.edu/brooks/) at MIT published "Intelligence Without Representation" in *Artificial Intelligence* (volume 47, pages 139 to 159) [3]. The paper argued that classical AI had foundered on representation, and that intelligence approached incrementally through perception and action need not require explicit symbolic models [3]. Brooks's subsumption architecture, demonstrated on robots like Genghis and later Cog, organised behaviour into layers of simple competences (wander, avoid obstacles, follow walls) without a central world model [3]. The paper became one of the most cited critiques of symbolic AI and a founding statement of the emergent paradigm.
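The layering idea can be illustrated with a toy arbiter. This is a minimal sketch, not Brooks's implementation: the original architecture wired augmented finite-state machines together with suppression and inhibition links, whereas the code below reduces that to a fixed-priority list with no world model. The `Sensors` fields, thresholds, and speeds are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sensors:
    front_distance_m: float  # range to the nearest obstacle ahead (hypothetical sensor)
    wall_on_right: bool      # True if a wall is detected on the right (hypothetical sensor)


@dataclass(frozen=True)
class Command:
    forward: float  # forward speed in m/s
    turn: float     # turn rate in rad/s, positive = left


def avoid_obstacles(s: Sensors) -> Optional[Command]:
    # Fires only when something is close ahead: stop and turn away.
    if s.front_distance_m < 0.3:
        return Command(forward=0.0, turn=1.0)
    return None


def follow_wall(s: Sensors) -> Optional[Command]:
    # Fires only when a wall is sensed: drive straight alongside it.
    if s.wall_on_right:
        return Command(forward=0.2, turn=0.0)
    return None


def wander(s: Sensors) -> Optional[Command]:
    # Default behaviour: always produces a gentle curving motion.
    return Command(forward=0.2, turn=0.1)


# Assumed priority order for this sketch: earlier entries override later ones.
LAYERS = (avoid_obstacles, follow_wall, wander)


def arbitrate(s: Sensors) -> Command:
    """Return the command of the first behaviour that fires."""
    for behaviour in LAYERS:
        cmd = behaviour(s)
        if cmd is not None:
            return cmd
    raise RuntimeError("no behaviour produced a command")
```

Each behaviour maps raw sensor readings directly to a motor command, so adding a competence means adding a function to `LAYERS` rather than extending a shared model of the world.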

At the same time, the Toronto group went the other way. In 1994 Levesque, Reiter, Lesperance, Lin, and Scherl introduced GOLOG, a high-level programming language built on the situation calculus and designed specifically for cognitive robots that need to reason about the effects of their actions; the journal description followed in 1997 [7]. GOLOG was later extended to ConGolog (concurrent) and IndiGolog (incremental, supporting interleaved planning, sensing, and action) by Giuseppe De Giacomo, Lesperance, and Levesque [8].
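The core situation-calculus idea, that a situation is a history of actions and that fluents are evaluated against that history, can be sketched in a few lines. This is an illustration in Python, not GOLOG syntax or the GOLOG interpreter; the `pickup`/`putdown` domain, the `holding` fluent, and the precondition rules are hypothetical.

```python
from typing import NamedTuple, Sequence, Tuple


class Action(NamedTuple):
    name: str
    arg: str


# A situation is the sequence of actions performed since the initial situation S0.
Situation = Tuple[Action, ...]
S0: Situation = ()


def do(action: Action, s: Situation) -> Situation:
    """Return the situation that results from performing `action` in `s`."""
    return s + (action,)


def holding(obj: str, s: Situation) -> bool:
    """Successor-state style definition of the fluent holding(obj, s).

    The fluent is true after pickup(obj), false after putdown(obj),
    and otherwise keeps the value it had in the previous situation.
    """
    if not s:
        return False  # assumed initial state: the robot holds nothing
    *earlier, last = s
    if last == Action("pickup", obj):
        return True
    if last == Action("putdown", obj):
        return False
    return holding(obj, tuple(earlier))


def poss(action: Action, s: Situation) -> bool:
    """Toy precondition axioms: pick up only what is not held, put down only what is."""
    if action.name == "pickup":
        return not holding(action.arg, s)
    if action.name == "putdown":
        return holding(action.arg, s)
    return False


def executable(plan: Sequence[Action], s: Situation = S0) -> bool:
    """Check that every action in the plan is possible when its turn comes."""
    for action in plan:
        if not poss(action, s):
            return False
        s = do(action, s)
    return True
```

For example, `executable([Action("pickup", "cup"), Action("putdown", "cup")])` holds, while a plan that starts with `putdown` fails its precondition in `S0`.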

The MIT humanoid robotics group under Brooks, with Cynthia Breazeal as a graduate student, built Cog (an upper-torso humanoid with 21 degrees of freedom and visual, auditory, vestibular, kinesthetic, and tactile senses) and Kismet (an expressive head designed for face-to-face social interaction) [12]. Kismet, completed in the late 1990s, is widely cited as the first social robot and as the founding artefact of social robotics [11].

### 2000s: developmental robotics and the iCub

The 2000s saw the consolidation of developmental robotics as a named field, driven by Max Lungarella, Giorgio Metta, Rolf Pfeifer, Giulio Sandini, Minoru Asada, Yasuo Kuniyoshi, and others [30]. The signature artefact was the iCub: a one-metre humanoid the size of a 3.5-year-old child, designed by the RobotCub consortium and built at the Istituto Italiano di Tecnologia (IIT) in Genoa [10]. The RobotCub project ran for 65 months from 1 September 2004 to 31 January 2010 with EUR 8.5 million from Unit E5 of the European Commission's Sixth Framework Programme. The cub in iCub stands for Cognitive Universal Body, and the platform was explicitly motivated by the embodied cognition hypothesis: that human-like manipulation is essential for human-like cognition [10]. About thirty iCubs are in research labs, mostly in the European Union with one in the United States.

In 2007, Pfeifer and Bongard published *How the Body Shapes the Way We Think: A New View of Intelligence* (MIT Press), arguing that the structure of cognition is constrained and enabled by the morphology and material properties of the body [5]. The book popularised a research methodology of "understanding by building" and a concrete agenda around morphological computation [5].

In 2009, Asada, Hosoda, Kuniyoshi, Ishiguro, Inui, Yoshikawa, Ogino, and Yoshida published "Cognitive Developmental Robotics: A Survey" in IEEE Transactions on Autonomous Mental Development, volume 1, issue 1, pages 12 to 34 [2]. The survey defined CDR's research agenda: physical embodiment as the foundation, then body representation, then motor and perceptual development, then social behaviour [2].

### 2010s: cognitive architectures meet real robots

In the 2010s, classic cognitive architectures from cognitive science were applied to physical robots more systematically. Soar (Laird), ACT-R (Anderson), ICARUS (Langley), CLARION (Sun), LIDA (Franklin), Sigma (Rosenbloom), GLAIR, and Verschure's biologically-inspired Distributed Adaptive Control (DAC) all saw robotic implementations [14][15][16][17][18]. KnowRob, introduced by Moritz Tenorth and Michael Beetz in 2009 and described in the *International Journal of Robotics Research* in 2013, became the most widely used knowledge-processing framework for cognition-enabled robots; it uses ontologies and "virtual knowledge bases" computed on demand from the robot's perception and planning components [9].
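The notion of a knowledge base whose answers are partly computed on demand can be sketched as follows. This is not KnowRob's API (KnowRob itself is built around Prolog and OWL ontologies); the class, the method names, and the fake `detections` table standing in for a perception component are all hypothetical. The sketch only shows stored facts being merged with values computed at query time.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (predicate, subject, object)


class KnowledgeBase:
    def __init__(self) -> None:
        self._facts: Set[Triple] = set()
        self._computables: Dict[str, Callable[[str], Iterable[str]]] = {}

    def assert_fact(self, predicate: str, subject: str, obj: str) -> None:
        """Store a static fact, e.g. an ontology statement."""
        self._facts.add((predicate, subject, obj))

    def register_computable(
        self, predicate: str, fn: Callable[[str], Iterable[str]]
    ) -> None:
        """Attach a function that computes objects for `predicate` at query time."""
        self._computables[predicate] = fn

    def query(self, predicate: str, subject: str) -> Set[str]:
        """Return stored answers plus any answers computed on demand."""
        answers = {o for (p, s, o) in self._facts if p == predicate and s == subject}
        compute = self._computables.get(predicate)
        if compute is not None:
            answers |= set(compute(subject))
        return answers


# Hypothetical output of a perception component: object -> supporting surface.
detections = {"cup_1": "table_2"}

kb = KnowledgeBase()
kb.assert_fact("type", "cup_1", "Cup")
kb.register_computable(
    "on", lambda obj: [detections[obj]] if obj in detections else []
)
```

Because the `on` relation is evaluated when it is queried, updating `detections` changes the answer without any explicit write to the knowledge base.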

David Vernon's 2014 textbook *Artificial Cognitive Systems: A Primer* (MIT Press) consolidated the field into the cognitivist, emergent, and hybrid paradigms, with chapters on autonomy, embodiment, learning, memory, knowledge, and social cognition [6].

### 2020s: the foundation-model wave

The biggest single change to cognitive robotics since the 1990s arrived with large pre-trained models. Google's PaLM-SayCan paper (Ahn et al., 2022) paired the PaLM language model with a learned affordance function and a library of low-level skills: the LLM scored how useful each candidate skill was for the instruction, and the affordance function weighted those scores by how likely the skill was to succeed in the current state [19]. On 101 real test instructions, PaLM-SayCan reported 84% planning success and 74% execution success on a real mobile manipulator from Everyday Robots [19]. RT-1 (Brohan et al., December 2022) introduced the Robotics Transformer, trained on 130,000 episodes covering 700+ tasks [20]. RT-2 (Brohan et al., July 2023) extended the idea by treating actions as language tokens, co-fine-tuned with a vision-language model on Internet-scale data [21].

The Open X-Embodiment / RT-X effort (Padalkar et al., 2024) pooled 60 datasets from 34 labs into one corpus of 1M+ trajectories across 22 embodiments and trained cross-embodiment policies [22]. OpenVLA (Kim et al., 2024) released a 7B-parameter open-source [vision-language-action model](/wiki/vision_language_action_model) built on Llama 2 plus DINOv2 and SigLIP visual encoders, outperforming RT-2-X by 16.5 percentage points in absolute task success rate with seven times fewer parameters [23]. Octo (Octo Model Team, 2024) added a fully open transformer diffusion policy trained on 800K episodes [24].

Gemini Robotics from Google DeepMind, released in 2025 and updated as Gemini Robotics 1.5, brought the Gemini family directly into robot control [29]. Physical Intelligence's [pi0](/wiki/pi0) (also written π₀; Black et al., October 2024) introduced a flow-matching action expert on top of a vision-language model and demonstrated long-horizon tasks like folding laundry; the company has raised more than USD 400 million and open-sourced the model [25].

NVIDIA's Project GR00T, announced on 18 March 2024 at GTC alongside the Jetson Thor on-board computer, is a [foundation model](/wiki/foundation_model) targeting humanoid robots; its GR00T N1 release in 2025 uses a dual-system architecture (System 1 reflexive, System 2 deliberative). Partners include 1X Technologies, Agility Robotics, Apptronik, Boston Dynamics, Figure AI, Fourier Intelligence, Sanctuary AI, Unitree Robotics, and XPENG [26][27]. Figure AI's [Helix](/wiki/figure_helix), released in February 2025 and updated as Helix 02 in early 2026, is a VLA that first controlled the full humanoid upper body and now controls the whole body [28].
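The LLM-plus-affordance selection step behind PaLM-SayCan, described earlier in this section, can be sketched as follows. In the SayCan paper the two signals are combined by multiplying the language model's score for a skill by the value function's estimate that the skill will succeed, and the highest-scoring skill is executed. The skill names and all numbers below are made up for illustration, and `select_skill` is a hypothetical helper, not code from the paper.

```python
from typing import Dict


def select_skill(llm_scores: Dict[str, float], affordances: Dict[str, float]) -> str:
    """Pick the skill maximising (usefulness according to the LLM) x (feasibility).

    llm_scores:  skill -> probability that the skill helps with the instruction
    affordances: skill -> probability that the skill succeeds from the current state
    Skills with no affordance estimate are treated as infeasible (an assumption).
    """
    combined = {
        skill: llm_scores[skill] * affordances.get(skill, 0.0)
        for skill in llm_scores
    }
    return max(combined, key=combined.get)


# Hypothetical scores for the instruction "bring me a sponge"
# when the robot is far from the counter and no sponge is in view.
llm_scores = {"pick up sponge": 0.6, "go to counter": 0.3, "pick up apple": 0.1}
affordances = {"pick up sponge": 0.1, "go to counter": 0.9, "pick up apple": 0.2}

chosen = select_skill(llm_scores, affordances)
```

Here the language model prefers "pick up sponge", but its low feasibility drags the product below that of "go to counter", so the robot navigates first and re-scores the skills after the state changes.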

## What are cognitive architectures, and which are used in robotics?

A cognitive architecture is a specification of the fixed structure of a mind: its memories, processes, and how they interact [14]. Several have been applied to robotic platforms.

| Architecture | Originator | Year | Style | Robotic uses |
|---|---|---|---|---|
| Soar | John Laird, Allen Newell, Paul Rosenbloom | 1983 (1987 AI Journal paper) | Symbolic with reinforcement learning, episodic and semantic memory | Robo-Soar (1991) on a Puma arm; mobile robots; REEM service robots; unmanned underwater vehicles |
| ACT-R | John Anderson | 1993 | Declarative and procedural memory, modular cognitive science model | Mobile robots, human-robot teams |
| ICARUS | Pat Langley | 1991 | Concepts and skills hierarchies | Indoor mobile robots, manipulation |
| CLARION | Ron Sun | 1997 | Dual-process: explicit symbolic plus implicit subsymbolic | Cognitive simulations and robot agents |
| LIDA | Stan Franklin | 2006 | Global Workspace consciousness model | Cognitive software and robots |
| Sigma | Paul Rosenbloom | 2011 | Graphical-model unification | Limited robotic deployments |
| GLAIR | Stuart Shapiro | 1990s | Grounded layered architecture with integrated reasoning | Cassie / FEVAHR, manipulation |
| DAC | Paul Verschure | 1992 onwards | Distributed Adaptive Control, biologically inspired layers | Mobile robots, the Ada robot, neuroprosthetics |
| Subsumption | Rodney Brooks | 1986 | Reactive layers without world model | Genghis, Allen, Herbert, Cog |

Soar and ACT-R are the oldest and most widely used. Soar was originally created by John Laird, Allen Newell, and Paul Rosenbloom, and presented as "Soar: An Architecture for General Intelligence" in the journal *Artificial Intelligence* in 1987 [34]; its stated goal is "to develop the fixed computational building blocks necessary for general intelligent agents" [34]. ACT-R, developed by John Anderson, models cognition with separate declarative and procedural memories grounded in cognitive-science data [15]. The similarities between Soar, ACT-R, and Sigma prompted the Common Model of Cognition initiative to articulate a shared abstract specification.
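The split between a declarative store and procedural rules can be illustrated with a toy recognise-act loop. This is a generic production-system sketch, not the decision cycle of Soar or ACT-R; the rule names, the fact strings, and the first-match conflict-resolution policy are assumptions made for the example.

```python
from typing import FrozenSet, List, NamedTuple, Sequence, Set


class Production(NamedTuple):
    name: str
    condition: FrozenSet[str]  # facts that must all be present for the rule to match
    additions: FrozenSet[str]  # facts the rule adds when it fires


def run(
    working_memory: Set[str], rules: Sequence[Production], max_cycles: int = 10
) -> List[str]:
    """Repeatedly match rules against working memory and fire one per cycle.

    Mutates `working_memory` in place and returns the names of fired rules.
    """
    fired: List[str] = []
    for _ in range(max_cycles):
        applicable = [
            r
            for r in rules
            if r.condition <= working_memory and not r.additions <= working_memory
        ]
        if not applicable:
            break  # quiescence: nothing new can be derived
        rule = applicable[0]  # assumed conflict resolution: first matching rule wins
        working_memory |= rule.additions
        fired.append(rule.name)
    return fired


rules = [
    Production(
        "propose-grasp",
        frozenset({"cup-visible", "goal:fetch-cup"}),
        frozenset({"intend:grasp-cup"}),
    ),
    Production(
        "grasp",
        frozenset({"intend:grasp-cup"}),
        frozenset({"holding-cup"}),
    ),
]

memory = {"cup-visible", "goal:fetch-cup"}
trace = run(memory, rules)
```

Real architectures add much more on top of this loop, such as learned rule utilities, subgoaling, and separate long-term memories, but the match-select-fire cycle is the common skeleton.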

## Cognitive robotics frameworks and platforms

| System | Origin | Year | Role |
|---|---|---|---|
| GOLOG, ConGolog, IndiGolog | Levesque, Reiter, Lesperance, De Giacomo (Toronto, York, Sapienza) | 1994 onwards | Situation-calculus-based programming languages for cognitive robots |
| KnowRob | Moritz Tenorth and Michael Beetz, Munich and Bremen | 2009 | Ontology-based knowledge processing framework for everyday manipulation |
| iCub | RobotCub consortium, IIT Genoa | 2004 | Open-source humanoid testbed for embodied cognition |
| Cog | Rodney Brooks, MIT | 1993 to 2003 | Upper-torso humanoid for developmental and social cognition |
| Kismet | Cynthia Breazeal, MIT | Late 1990s | Pioneer expressive social robot |
| Nico | Brian Scassellati, Yale | 2005 onwards | Child-like humanoid for cognitive science |
| HUMANOID series | Atsuo Takanishi, Waseda | 1980s onwards | Bipedal and emotional humanoids |
| Pepper | SoftBank Robotics, Aldebaran | 2014 | Mass-produced sociable humanoid; over 27,000 units at peak; deployed in retail, healthcare, hospitality, banking, education |
| REEM-C | PAL Robotics | 2013 | Research humanoid used with Soar and ROS |
| Robonaut | NASA and General Motors | 2000s onwards | Humanoid for orbital and ground tasks |
| Atlas | Boston Dynamics | 2013 onwards | Bipedal platform with limited explicit cognition, increasingly paired with foundation models |
| iRobot Roomba | iRobot, founded by Brooks, Greiner, Angle | 2002 | Minimal cognition: simple mapping and behaviour-based control |

Many of these platforms run on the [Robot Operating System (ROS)](/wiki/ros) for middleware, which has become the default communication layer for cognitive-robot research stacks.

## What are the key research themes in cognitive robotics?

**Perception.** Object recognition, scene understanding, multimodal integration of vision, audio, touch, and proprioception. Modern cognitive robotics increasingly uses pretrained vision encoders (DINOv2, SigLIP, CLIP) as front-ends for higher-level reasoning.

**Attention and saliency.** Top-down and bottom-up attention models that direct sensors and computation to task-relevant regions. Joint attention, where two agents attend to the same object, is a particular research focus in social cognitive robotics.

**Knowledge representation.** Ontologies (KnowRob), semantic maps, scene graphs, and situation calculus [9]. These provide the structured background a robot needs to reason about objects, places, capabilities, and norms.

**Reasoning.** Situation-calculus-based reasoning in the GOLOG family [7]; classical and probabilistic planning; [commonsense reasoning](/wiki/commonsense_reasoning) over everyday objects and situations; causal reasoning about why an action will or will not work.

**Memory.** Episodic memory of specific past experiences, semantic memory of general facts, and procedural memory of motor skills. Soar, ACT-R, and LIDA all distinguish these subsystems explicitly [14][15].

**Learning.** Developmental learning that ramps up complexity over time, [imitation learning](/wiki/imitation_learning) from human demonstrations (including teleoperated trajectories), [reinforcement learning](/wiki/reinforcement_learning), and the self-supervised pretraining that drives modern VLA models [23].

**Social cognition.** Theory of mind, gaze following, joint attention, empathy, and turn-taking. Kismet was the first artefact built explicitly to engage in face-to-face social interaction, and the line continues through Nao, Pepper, and modern humanoids [11].

**Language and dialogue.** From early work on instruction following with parsers and grammars to today's LLM-grounded dialogue systems that interpret "please load the dishwasher" and decompose it into a feasible plan [19].

**Embodiment and morphology.** Pfeifer and Bongard 2007 is the canonical reference [5]. The argument is that cognitive abilities are shaped by what the body can sense and do; "morphological computation" exploits passive dynamics and material properties to offload work that would otherwise have to be computed [5].

**Tool use, manipulation, and metacognition.** Tool use is a long-standing benchmark for cognitive ability and a current frontier for VLA models. Metacognition (robots that monitor their own state, recognise their limitations, and decide when to ask for help) draws on uncertainty estimation, meta-reasoning, and explicit self-models.
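One simple way to turn uncertainty estimation into a "decide when to ask for help" rule is to threshold the entropy of the robot's predicted distribution over candidate actions. The sketch below is one possible heuristic under stated assumptions, not a method taken from the cited works; the threshold value and the example distributions are hypothetical.

```python
import math
from typing import Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats of a discrete distribution (zero terms are skipped)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def should_ask_for_help(probs: Sequence[float], threshold_nats: float = 0.5) -> bool:
    """Ask a human when the policy's own distribution is too spread out.

    `threshold_nats` is a hypothetical tuning parameter; in practice it would
    be calibrated against observed failure rates.
    """
    return entropy(probs) > threshold_nats


confident = [0.98, 0.01, 0.01]   # the policy strongly prefers one action
uncertain = [1 / 3, 1 / 3, 1 / 3]  # the policy cannot tell the actions apart
```

With these inputs `should_ask_for_help(confident)` is false and `should_ask_for_help(uncertain)` is true, since a uniform distribution over three actions has entropy ln 3, well above the threshold.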

## How do foundation models change cognitive robotics?

The period since 2022 has seen a rapid succession of vision-language-action (VLA) and robot foundation models that fold parts of cognitive robotics into a single learned system [21][23][25]. Rather than hand-engineering symbolic planners, perception modules, and skill libraries, these systems learn instruction following, perception, and control jointly from large datasets, and so behave as a data-driven hybrid of the cognitivist and emergent paradigms [35].

| Model | Authors | Year | Contribution |
|---|---|---|---|
| PaLM-SayCan | Ahn et al. (Google, Everyday Robots) | April 2022 | First widely cited LLM-plus-affordance system; [PaLM](/wiki/palm) 540B as planner, value-function affordances as feasibility weighting; 84% planning, 74% execution on 101 tasks |
| [PaLM-E](/wiki/palm-e_an_embodied_multimodal_language_model) | Driess et al. (Google) | March 2023 | Embodied multimodal language model |
| RT-1 | Brohan et al. (Google) | December 2022 | Robotics Transformer trained on 130K episodes, 700+ tasks |
| [RT-2](/wiki/rt-2) | Brohan et al. (Google DeepMind) | July 2023 | Vision-language-action model treating actions as language tokens; chain-of-thought planning |
| Open X-Embodiment / RT-X | Padalkar et al. (34 labs) | 2024 | 1M+ trajectories across 22 embodiments |
| [OpenVLA](/wiki/openvla) | Kim et al. (Stanford, UC Berkeley, Google DeepMind, TRI) | June 2024 | 7B open-source VLA on Llama 2 plus DINOv2 plus SigLIP; beats RT-2-X by 16.5% with 7x fewer parameters |
| Octo | Octo Model Team | May 2024 | Open transformer-diffusion generalist policy on 800K episodes |
| pi0 | Physical Intelligence (Black et al.) | October 2024 | VLA flow-matching policy with action expert; folds laundry; later open-sourced |
| [Gemini Robotics](/wiki/gemini_robotics), Gemini Robotics-ER, 1.5 | Google DeepMind | 2025 | VLA built on Gemini 2.0; embodied reasoning variant; cross-embodiment transfer |
| Project GR00T, GR00T N1 | NVIDIA | March 2024 onwards | Humanoid foundation model; GR00T N1 (2025) is the first openly released version, with a dual-system architecture |
| Helix, Helix 02 | Figure AI | 2025 to 2026 | Full upper-body and then full-body humanoid VLA with System 1 / System 2 split |

These systems bring cognitive abilities like instruction following, novel-object generalisation, and long-horizon planning into the robot stack without explicit symbolic engineering [21][23]. They also import the open problems of large models: hallucinations, distribution shift, dataset bias, and limited interpretability. Vernon, in a 2025 review, argues that this foundation-model route raises concerns of cost, availability, trustworthiness, robustness, transparency, security, and inclusion, and proposes an alternative research programme more closely aligned with human cognitive development [35].
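As a concrete reading of "actions as language tokens" in the RT-2 row above: the RT-1 and RT-2 papers describe discretising each continuous action dimension into 256 uniform bins, so that an action becomes a short sequence of integer tokens a language model can emit. The sketch below shows that discretisation and its inverse; the action bounds and the example values are assumptions, and the mapping from bin indices to actual vocabulary entries is omitted.

```python
from typing import List, Sequence

NUM_BINS = 256  # bins per action dimension


def tokenize(
    action: Sequence[float], low: Sequence[float], high: Sequence[float]
) -> List[int]:
    """Map each continuous action dimension to a bin index in [0, NUM_BINS - 1]."""
    tokens = []
    for value, lo, hi in zip(action, low, high):
        clipped = min(max(value, lo), hi)        # keep the value inside its bounds
        fraction = (clipped - lo) / (hi - lo)    # position within the range, 0..1
        tokens.append(min(int(fraction * NUM_BINS), NUM_BINS - 1))
    return tokens


def detokenize(
    tokens: Sequence[int], low: Sequence[float], high: Sequence[float]
) -> List[float]:
    """Map bin indices back to the centre of each bin."""
    return [
        lo + (token + 0.5) / NUM_BINS * (hi - lo)
        for token, lo, hi in zip(tokens, low, high)
    ]


# Hypothetical two-dimensional action (e.g. two end-effector deltas) bounded in [-1, 1].
low, high = [-1.0, -1.0], [1.0, 1.0]
tokens = tokenize([0.3, -0.7], low, high)
recovered = detokenize(tokens, low, high)
```

The round trip loses at most half a bin width per dimension, which is the price paid for letting one sequence model produce both text and motor commands.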

## Connections with cognitive science and neuroscience

Cognitive robotics has always been in conversation with cognitive science.

**Embodied cognition.** The view associated with George Lakoff, Francisco Varela, Evan Thompson, and Eleanor Rosch holds that cognition is grounded in the body's interactions with the environment. iCub was built around this hypothesis, and Pfeifer and Bongard 2007 is its robotic manifesto [5][10].

**Predictive processing and active inference.** Karl Friston's free energy principle has been picked up by cognitive roboticists, including Paul Verschure with DAC, as a unifying account of perception, action, and learning [18].

**Mirror neurons and imitation.** The discovery of mirror neurons in macaque area F5 by Rizzolatti and colleagues shaped a generation of imitation-learning research in robotics.

**Joint attention.** Developmental psychology of joint attention in infants directly inspired robotic gaze-following and shared-attention systems on Cog, Kismet, Nico, and iCub [12].

## Notable real-world systems

* **iCub.** Hundreds of papers on developmental learning of crawling, manipulation, gesture, language acquisition, and social cognition. The de facto reference platform for embodied cognition research [10].
* **Pepper.** Over 27,000 units manufactured at peak; deployed by SoftBank Mobile in Japan, Carrefour in Europe, Pizza Hut in Asia, and hospitals in Belgium, France, Japan, and the United States. Production paused in 2020/2021 and Aldebaran went bankrupt in 2025, with assets acquired by Maxvision Technology Corp. of China in July 2025.
* **Roomba.** The robot vacuum from iRobot (a company founded in 1990 by Brooks, Helen Greiner, and Colin Angle) is the most successful behaviour-based commercial robot; later models added simple SLAM-based mapping.
* **Boston Dynamics Atlas.** Best known for athletic locomotion; recent demos integrate LLM-driven task planning on top of low-level control.
* **[Tesla Optimus](/wiki/tesla_optimus).** Demonstrating perception-language-action behaviours since 2022, increasingly with neural-network-driven policies.
* **Figure 02 with Helix; Apptronik Apollo with Gemini Robotics; Sanctuary AI Phoenix.** Foundation-model-driven humanoids competing in commercial pilots through 2025 and 2026 [28][29].

## What are the open challenges in cognitive robotics?

* **Real-world generalisation.** The dominant problem: today's policies still fail on long tails of object shapes, lighting conditions, and clutter that humans handle easily. Bridging the sim-to-real gap and broadening dataset diversity remain active areas of work.
* **Long-horizon autonomy.** Maintaining coherent behaviour over hours or days is largely unsolved outside curated demos.
* **Sample efficiency.** Robots lag far behind humans; an infant learns to grasp from far fewer trials than a current VLA.
* **Safety and verification.** There is no general solution for verifying cognitive behaviour, particularly in human-shared spaces.
* **Interpretability.** Foundation-model decisions are hard to inspect, which complicates debugging and certification [35].
* **Neuro-symbolic integration.** Combining symbolic and connectionist methods, the perennial neuro-symbolic question, has new urgency now that [large language models](/wiki/large_language_model) supply much of the symbolic-style competence end-to-end.
* **Common-sense knowledge.** Robots' common-sense knowledge remains incomplete despite efforts like KnowRob and large language models [9].
* **Energy efficiency.** On humanoids, a battery-powered onboard compute budget has to meet real-time control, which drives architectural choices like NVIDIA's Jetson Thor and Figure's onboard accelerators.

## Conferences, journals, and venues

| Venue | Type | Focus |
|---|---|---|
| ICDL (International Conference on Development and Learning) | Conference | Developmental and epigenetic robotics, cognitive development |
| HRI (ACM/IEEE International Conference on [Human-Robot Interaction](/wiki/human_robot_interaction)) | Conference | Social cognitive robotics, interaction design |
| ICRA (IEEE International Conference on Robotics and Automation) | Conference | Broad robotics including cognitive themes |
| IROS (IEEE/RSJ International Conference on Intelligent Robots and Systems) | Conference | Intelligent and cognitive robots |
| AAAI Cognitive Robotics Symposium | Symposium | Knowledge representation and reasoning for robots |
| IJCAI Cognitive Robotics Workshop | Workshop | Continuation of the Toronto manifesto tradition |
| RSS (Robotics: Science and Systems) | Conference | Algorithms and learning, increasingly VLA-heavy |
| CoRL (Conference on Robot Learning) | Conference | Robot learning; held since 2017 and now the dominant venue for VLA work |
| IEEE Transactions on Cognitive and Developmental Systems (TCDS) | Journal | Cognition and development in natural and artificial systems |
| Cognitive Systems Research | Journal | Multidisciplinary cognitive systems |
| Frontiers in Robotics and AI | Journal | Open-access cognitive and developmental robotics |

IEEE TCDS, formerly IEEE Transactions on Autonomous Mental Development (which published the Asada et al. 2009 survey in its inaugural issue), is the field's flagship journal and is closely tied to ICDL through joint special issues [2].

## References

1. Levesque, H. and Lakemeyer, G. (2008). Cognitive robotics. In F. van Harmelen, V. Lifschitz, and B. Porter (eds.), *Handbook of Knowledge Representation*, chapter 23, pages 869 to 886. Elsevier, Amsterdam.
2. Asada, M., Hosoda, K., Kuniyoshi, Y., Ishiguro, H., Inui, T., Yoshikawa, Y., Ogino, M., and Yoshida, C. (2009). Cognitive developmental robotics: a survey. *IEEE Transactions on Autonomous Mental Development*, 1(1):12 to 34.
3. Brooks, R. A. (1991). Intelligence without representation. *Artificial Intelligence*, 47(1 to 3):139 to 159.
4. Cangelosi, A. and Schlesinger, M. (2015). *Developmental Robotics: From Babies to Robots*. MIT Press, Cambridge MA.
5. Pfeifer, R. and Bongard, J. (2007). *How the Body Shapes the Way We Think: A New View of Intelligence*. MIT Press, Cambridge MA.
6. Vernon, D. (2014). *Artificial Cognitive Systems: A Primer*. MIT Press, Cambridge MA.
7. Levesque, H. J., Reiter, R., Lesperance, Y., Lin, F., and Scherl, R. B. (1997). GOLOG: a logic programming language for dynamic domains. *Journal of Logic Programming*, 31(1 to 3):59 to 83.
8. De Giacomo, G., Lesperance, Y., and Levesque, H. J. (2000). ConGolog, a concurrent programming language based on the situation calculus. *Artificial Intelligence*, 121(1 to 2):109 to 169.
9. Tenorth, M. and Beetz, M. (2013). KnowRob: a knowledge processing infrastructure for cognition-enabled robots. *International Journal of Robotics Research*, 32(5):566 to 590.
10. Metta, G., Sandini, G., Vernon, D., Natale, L., and Nori, F. (2008). The iCub humanoid robot: an open platform for research in embodied cognition. *Proceedings of the 8th Workshop on Performance Metrics for Intelligent Systems (PerMIS)*.
11. Breazeal, C. (2002). *Designing Sociable Robots*. MIT Press.
12. Brooks, R. A., Breazeal, C., Marjanovic, M., Scassellati, B., and Williamson, M. (1999). The Cog project: building a humanoid robot. In *Computation for Metaphors, Analogy and Agents*, pages 52 to 87. Springer.
13. Nilsson, N. J. (1984). Shakey the Robot. SRI Technical Note 323. SRI International, Menlo Park.
14. Laird, J. E. (2012). *The Soar Cognitive Architecture*. MIT Press.
15. Anderson, J. R. (2007). *How Can the Human Mind Occur in the Physical Universe?* Oxford University Press.
16. Langley, P., Choi, D., and Rogers, S. (2009). Acquisition of hierarchical reactive skills in a unified cognitive architecture. *Cognitive Systems Research*, 10(4):316 to 332.
17. Sun, R. (2006). The CLARION cognitive architecture: extending cognitive modeling to social simulation. In *Cognition and Multi-Agent Interaction*, pages 79 to 99. Cambridge University Press.
18. Verschure, P. F. M. J. (2012). Distributed adaptive control: a theory of the mind, brain, body nexus. *Biologically Inspired Cognitive Architectures*, 1:55 to 72.
19. Ahn, M. et al. (2022). Do as I can, not as I say: grounding language in robotic affordances (PaLM-SayCan). *arXiv preprint arXiv:2204.01691*.
20. Brohan, A. et al. (2022). RT-1: robotics transformer for real-world control at scale. *arXiv preprint arXiv:2212.06817*.
21. Brohan, A. et al. (2023). RT-2: vision-language-action models transfer web knowledge to robotic control. *arXiv preprint arXiv:2307.15818*.
22. Padalkar, A. et al. (Open X-Embodiment Collaboration) (2024). Open X-Embodiment: robotic learning datasets and RT-X models. *IEEE International Conference on Robotics and Automation (ICRA)*.
23. Kim, M. J. et al. (2024). OpenVLA: an open-source vision-language-action model. *arXiv preprint arXiv:2406.09246*.
24. Octo Model Team (2024). Octo: an open-source generalist robot policy. *Robotics: Science and Systems (RSS)*.
25. Black, K. et al. (Physical Intelligence) (2024). pi0: a vision-language-action flow model for general robot control. *arXiv preprint arXiv:2410.24164*.
26. NVIDIA (2024). NVIDIA announces Project GR00T foundation model for humanoid robots and major Isaac robotics platform update. NVIDIA Press Release, 18 March 2024.
27. NVIDIA (2025). NVIDIA Isaac GR00T N1: an open foundation model for generalist humanoid robots. *arXiv preprint arXiv:2503.14734*.
28. Figure AI (2025 to 2026). Helix and Helix 02: vision-language-action models for generalist humanoid control. Figure technical reports.
29. Google DeepMind (2025). Gemini Robotics and Gemini Robotics 1.5: bringing AI into the physical world.
30. Lungarella, M., Metta, G., Pfeifer, R., and Sandini, G. (2003). Developmental robotics: a survey. *Connection Science*, 15(4):151 to 190.
31. Rosenbloom, P. S. (2013). *On Computing: The Fourth Great Scientific Domain*. MIT Press.
32. Levesque, H. J. and Reiter, R. (1998). High-level robotic control: beyond planning. AAAI Spring Symposium on Integrating Robotics Research.
33. Vernon, D., Metta, G., and Sandini, G. (2007). A survey of artificial cognitive systems: implications for the autonomous development of mental capabilities in computational agents. *IEEE Transactions on Evolutionary Computation*, 11(2):151 to 180.
34. Laird, J. E., Newell, A., and Rosenbloom, P. S. (1987). Soar: an architecture for general intelligence. *Artificial Intelligence*, 33(1):1 to 64.
35. Vernon, D. (2025). The future of research in cognitive robotics: foundation models or developmental cognitive models? *Advanced Robotics Research*. doi:10.1002/adrr.202500066.