Artificial intelligence terms
Last reviewed
May 11, 2026
Sources
32 citations
Review status
Source-backed
Revision
v3 · 2,497 words
See also: Terms, Artificial intelligence and Artificial intelligence applications
A glossary of common artificial intelligence terminology, grouped by category. Click any term for the full article.
Artificial general intelligence (AGI): A hypothetical AI matching or surpassing humans across most cognitive tasks.
Artificial Superintelligence (ASI): A hypothetical intellect that vastly exceeds humans in essentially every domain.
Machine learning: A field of AI using statistical algorithms that learn from data without explicit programming.
Deep learning: Machine learning using multilayered neural networks for tasks like classification and representation learning.
Neural network: A computational model inspired by biological neurons that learns by adjusting weighted connections.
Supervised learning: A learning paradigm that maps inputs to outputs using labeled example pairs.
Unsupervised learning: Learning patterns from unlabeled data without explicit targets.
Reinforcement learning: A paradigm where an agent learns by acting in an environment to maximize cumulative reward.
Generative AI: AI that uses generative models to produce text, images, video, audio, or code from prompts.
Natural language processing (NLP): The processing and understanding of human language by computers.
Computer vision: The study of methods for acquiring, processing, and understanding digital images.
Inference: Using a trained model to generate predictions or outputs from new inputs.
Transformers: Neural architectures that use multi-head attention to process sequences in parallel, underlying most modern LLMs.
Attention mechanism: A method weighting how important each component of a sequence is relative to the others.
Multi-head Latent Attention (MLA): A DeepSeek attention variant that compresses key and value caches to reduce inference memory.
Large language models (LLMs): Neural networks trained on large text corpora that generate, summarize, translate, and analyze language.
Foundation models: Large models trained on broad data that can be adapted to many downstream tasks.
Frontier models: The most advanced foundation models, whose capabilities may pose novel risks.
GPT: Generative Pre-trained Transformer, a family of transformer language models pre-trained on unlabeled text.
GPT-1: OpenAI's first GPT, released in 2018.
GPT-2: OpenAI's 1.5-billion-parameter LLM from 2019, initially released in stages over misuse concerns.
GPT-3: OpenAI's 175-billion-parameter LLM from 2020, notable for few-shot abilities.
GPT-4: OpenAI's multimodal LLM released in 2023, accepting text and image inputs.
BERT: A 2018 Google encoder-only transformer that learns bidirectional text representations.
LaMDA: Google's Language Model for Dialogue Applications, announced in 2021.
Diffusion Models: Generative models that produce data by reversing a step-by-step noise-adding process.
Stable Diffusion: A 2022 open-source text-to-image diffusion model.
ControlNet: A neural network add-on that conditions diffusion models on extra inputs like edges, depth, or poses.
Vision Transformer (ViT): A transformer for computer vision that splits images into patches processed with attention.
Vision Language Model (VLM): A multimodal model that jointly processes images and text.
Vision encoder: The component of a multimodal model that turns images into features for a language model.
Vision tokens: Patch or feature representations of an image used as input units by a vision or multimodal model.
Mixture of Experts (MoE): An architecture combining many expert subnetworks, with a router selecting experts per input.
Convolutional neural network (CNN): A feedforward network that uses learned filters to extract spatial features from images.
Recurrent neural network (RNN): A network that processes sequences by feeding the state from each step into the next step.
Generative adversarial network (GAN): A framework where a generator and discriminator compete to produce realistic synthetic data.
Multimodal AI: AI systems that integrate multiple data types such as text, images, audio, and video.
World models: AI systems that build internal models of an environment and predict how actions change it.
Computer-use model: A model trained to operate computer interfaces via screen, mouse, and keyboard.
OCR Models: Optical Character Recognition models that convert images of text into machine-readable strings.
Pre-training: The initial large-scale training phase where a model learns general patterns from broad data.
Post-training: Refinement after pre-training, including supervised fine-tuning and reinforcement-learning alignment.
Fine-tuning: Adapting a pre-trained model to a specific task by continuing training on task data.
Supervised fine-tuning (SFT): Fine-tuning on labeled input-output pairs that demonstrate desired behavior.
RLHF (Reinforcement learning from human feedback): Aligning a model with human preferences by training a reward model from feedback and optimizing against it.
LoRA (low-rank adaptation): A parameter-efficient fine-tuning method that injects small low-rank matrices while freezing base weights.
Knowledge distillation: Transferring knowledge from a large teacher model to a smaller student that approximates it.
Quantization: Reducing the numerical precision of model weights and activations to lower memory use and speed up inference.
Pruning: Removing weights or neurons from a trained model to reduce size with limited accuracy loss.
Muon optimizer: A neural network optimizer using orthogonalized gradient updates that has gained attention for large-model training.
Benchmarks: Standardized tests comparing model performance on tasks such as reasoning, coding, or vision.
Leaderboards: Public rankings comparing AI models on benchmarks or user preferences.
Large language models ranking: Comparative listings of LLMs by capability, cost, or speed.
Emergent abilities: Capabilities that appear at sufficient model scale and are absent in smaller models.
Model collapse: Gradual quality degradation when models are repeatedly trained on AI-generated synthetic data.
The Pile: An 800 GB open English text dataset created by EleutherAI for training LLMs.
Prompt: The text or other input given to a model to specify a task.
System prompt: A high-level instruction that sets a model's persona, rules, or constraints across a conversation.
Prompt engineering: The practice of structuring inputs to generative models to produce desired outputs.
Prompt engineering for image generation: Writing prompts that guide text-to-image models toward desired visuals.
Prompt engineering for text generation: Crafting prompts that elicit accurate, useful answers from language models.
Meta Prompting: Using a model to design or improve prompts for itself or another model.
Chain-of-thought: A prompting technique where the model writes intermediate reasoning steps before its final answer.
Few shot: Providing a few examples in the prompt to demonstrate the desired task.
One shot: Prompting with a single example of the target task.
Zero-shot: Asking a model to perform a task with no examples, relying on the instruction alone.
Completion: The output text generated by an LLM in response to a prompt.
Tokens: Subword units that language models consume and produce; context limits are measured in tokens.
Context window: The maximum number of tokens a model can consider at once when generating output.
Vector embeddings: Numerical vectors representing words or items so that similar meanings are close in vector space.
Vector database: A database that stores embeddings and retrieves them by vector similarity for semantic search.
Retrieval-augmented generation (RAG): Retrieving external documents and feeding them to an LLM to supplement training knowledge.
Agentic Context Engineering: Designing what information, tools, and memories an agent sees in its context.
ReasoningBank: A memory framework that stores reasoning strategies for agents to reuse across tasks.
Burstiness: Variation in sentence length and complexity, used to distinguish human writing from AI text.
AI Agents: Autonomous systems that pursue goals, use tools, and take actions.
Minimum Viable Agent: The smallest functional agent that demonstrates end-to-end value on a real task.
Browser-use agent (BUA): An agent that controls a web browser to perform tasks.
Computer-use agent (CUA): An agent that operates a computer through screen, mouse, and keyboard.
VLA: A vision-language-action model mapping visual and text inputs to actions, often in robotics.
Deep Research: An agentic feature where a model plans, searches, and writes a long sourced report.
Custom GPTs: Configurable ChatGPT variants combining instructions, knowledge files, and tools.
HuggingGPT: A system using an LLM controller to coordinate specialized models on Hugging Face.
LangChain: An open-source framework for building LLM apps with prompts, tools, memory, and data.
Manus AI: A general-purpose autonomous AI agent launched in 2025.
OpenMule: An open agent project focused on autonomous task execution.
OpenRouter: A unified API for accessing many LLMs from different providers.
Model hubs: Platforms that host and distribute ML models, weights, and datasets.
Ollama: A tool for running open-source LLMs locally.
Cursor Rules: Project-level instructions configuring the Cursor AI code editor.
Claude Code: Anthropic's official command-line coding agent.
Claude --dangerously-skip-permissions: A Claude Code flag that skips the permission prompts normally shown before tool use.
Claude Skills: Packaged instructions and resources that extend Claude for specialized tasks.
Vibe coding: AI-assisted development where the developer describes intent and an LLM writes code.
CodeGen: AI code generation, or models trained to write source code.
AI Project Management: Using AI tools to plan, track, and report on projects.
Presentations: AI tools that generate slides and visuals from prompts.
Paper2Video: A system that converts academic papers into narrated video summaries.
SEO for ChatGPT: Optimizing web content so AI assistants surface and cite it.
Art: AI-generated visual art produced by text-to-image or image-to-image models.
Image generator: A model that produces images from text or other prompts.
Text-to-image: Models that generate images from text descriptions.
Text-to-video: Models that generate short video clips from text descriptions.
Text-to-3D: Models that produce 3D objects or scenes from text prompts.
Text-to-audio: Models that generate speech, music, or sound effects from text.
AudioCraft: Meta's open-source library for generative audio, including MusicGen and AudioGen.
DreamStudio: Stability AI's web interface for generating images with Stable Diffusion.
Unstable Diffusion: A community project fine-tuning diffusion models for unrestricted image generation.
LongCat-Video: A long-form video generation model from Meituan's LongCat team.
Wan AI: Alibaba's family of open-source video generation models.
Segment Anything: A Meta model producing segmentation masks for any object indicated in an image.
Deepfake: Synthetic image, video, or audio media created by AI to depict real or fictional people.
OpenAI: An American AI research and deployment company headquartered in San Francisco.
Anthropic: An American AI safety company that develops the Claude models.
Google DeepMind: A British-American AI research lab and Alphabet subsidiary that builds Gemini.
Meta AI: Meta Platforms' AI division, which releases the Llama models.
Meta AI (Company): The corporate AI unit within Meta Platforms producing consumer AI products.
Mistral AI: A French AI company developing efficient open and proprietary LLMs.
xAI: Elon Musk's AI company, creator of the Grok family of LLMs.
Stability AI: The company behind Stable Diffusion and related open-source generative models.
DeepSeek: A Chinese AI lab known for high-performing open-weight LLMs and efficient training.
Moonshot AI: A Chinese startup developing the Kimi family of long-context language models.
MiniMax: A Chinese AI company developing conversational, text, image, and audio models.
Baidu AI: Baidu's AI division, which develops the Ernie family of foundation models.
Alibaba Cloud: Alibaba's cloud arm, which trains and serves the Qwen family of models.
ByteDance Seed: ByteDance's foundation model team behind the Doubao and Seed model families.
inclusionAI: An open-source AI organization that releases models and tools on Hugging Face.
Reflection AI: A New York-based startup focused on autonomous coding agents and frontier research.
Z.ai: The international brand of Zhipu AI, a Chinese company developing the GLM models.
Frontier labs: The leading AI organizations training the most capable foundation models.
Claude: Anthropic's family of LLM assistants designed with a focus on safety and helpfulness.
Bard: Google's conversational AI service launched in 2023, later rebranded as Gemini.
Grok: The conversational AI assistant developed by xAI and integrated with the X platform.
Perplexity: An AI answer engine combining web search with LLM summarization and citations.
Qwen: Alibaba's family of open-weight foundation models for language, vision, and audio.
GLM: The General Language Model family developed by Zhipu AI for chat, coding, and multimodal tasks.
Q* OpenAI: A reported internal OpenAI project, subject of speculation about advanced reasoning.
Backdooring LLMs: Implanting hidden triggers in a language model that cause malicious behavior on certain inputs.
PoisonGPT: A demonstration in which a poisoned LLM that behaves normally on most inputs is uploaded to a model hub.
Prompt injection: An attack where inputs cause an LLM to ignore prior instructions and follow injected ones.
Prompt extraction: An attack that tries to recover a system's hidden prompt or instructions.
Purple Llama: Meta's umbrella project for trust and safety in open generative AI.
Hallucination: AI output that contains plausible-sounding but false or unsupported information.
AI alignment: The research area concerned with steering AI systems toward intended human goals and values.
AI safety: An interdisciplinary field focused on preventing accidents, misuse, and harms from AI systems.
Ethics: The study of moral issues raised by AI, including bias, fairness, accountability, and transparency.
Controversies: Public debates and disputes around AI development, deployment, and impact.
Copyright: Legal questions about the ownership of training data, model outputs, and AI-assisted works.
AI bubble: The concern that AI valuations and investment have outpaced fundamentals.
Dead internet theory: A conspiracy theory that the internet is now dominated by bots and AI-generated content.
LLM Anxiety: Worry or stress caused by interacting with or depending on LLMs.
Manipulation problem: The risk of persuasive AI systems being used to influence beliefs or decisions at scale.
Grok 3 Jailbreak: A documented method for bypassing safety filters on xAI's Grok 3 model.
AI Monarchy: A scenario where a small group, or an AI itself, gains outsized control through advanced AI.
RSI (Recursive self-improvement): The hypothesized process by which an AGI rewrites its own code to become more capable.
NVIDIA: An American chip company whose GPUs are the dominant hardware for training and serving AI models.
GPU: A parallel processor used to accelerate deep learning training and inference.
CUDA: NVIDIA's parallel computing platform and API for general-purpose GPU computation.
cuDNN: NVIDIA's CUDA Deep Neural Network library of optimized primitives for deep learning.
NVIDIA DGX Spark: A compact NVIDIA AI workstation for local model training and inference.
NVIDIA Omniverse: NVIDIA's platform for 3D simulation, digital twins, and synthetic data.
Isaac GR00T: NVIDIA's foundation model and platform for general-purpose humanoid robots.
QPU: A quantum processing unit, the central hardware element of a quantum computer.
PyTorch: An open-source deep learning framework originally developed by Meta and now governed by the PyTorch Foundation under the Linux Foundation.
Edge AI: Running AI models on local devices near data sources rather than in centralized clouds.
Dark Factory: A fully automated manufacturing facility operated with little or no human presence.
XPeng IRON: A humanoid robot developed by Chinese EV maker XPeng.
SmolVLA: A small efficient vision-language-action model for robotics released by Hugging Face.
DeepSeek-OCR: A DeepSeek vision-language model that compresses long text into image tokens for document OCR.
Contexts optical compression: A technique, used by DeepSeek-OCR, that encodes text as compact visual representations.
OpenAI Gym: An open-source toolkit for developing reinforcement-learning algorithms.
OpenAI Gym Retro: An OpenAI extension of Gym that turns classic video games into RL environments.
OpenAI Universe: An older OpenAI platform that lets RL agents interact with software via a virtual screen.
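To make the vector embeddings and vector database entries above more concrete, the following is a minimal sketch of similarity search in plain Python. The three-dimensional vectors in `TOY_INDEX` are hand-picked for illustration and are not real model embeddings; the names `cosine_similarity`, `TOY_INDEX`, and `nearest` are hypothetical helpers, not part of any library. Real systems typically use vectors with hundreds or thousands of dimensions and approximate nearest-neighbor indexes rather than a full sort.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings": similar meanings are given nearby vectors.
TOY_INDEX = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.2],
}


def nearest(query_vector, index, k=1):
    """Return the k item names whose vectors are most similar to the query."""
    ranked = sorted(
        index,
        key=lambda name: cosine_similarity(query_vector, index[name]),
        reverse=True,
    )
    return ranked[:k]


# A query vector pointing along the first axis ranks "cat" and "kitten" above "car".
print(nearest([1.0, 0.0, 0.0], TOY_INDEX, k=2))
```

In a retrieval-augmented generation setup, the items returned by a search like this would be documents or passages that are then placed in the model's prompt.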