Artificial intelligence terms
Last reviewed
Sources
34 citations
Review status
Source-backed
Revision
v6 · 2,628 words
See also: Terms, Artificial intelligence and Artificial intelligence applications
This glossary defines 171 common artificial intelligence terms across 11 categories: foundational concepts, models and architectures, training and adaptation, prompting and interaction, AI agents and applications, generative media, organizations and labs, products and brands, safety and ethics, hardware and infrastructure, and specialized techniques. Each entry is a one-line definition that links to its full AI Wiki article, so the glossary serves as both a quick reference and a map of the field.
The vocabulary below spans the lineage of modern AI: from machine learning and neural networks through the transformer architecture introduced in the June 2017 paper "Attention Is All You Need" by Google researchers,[33] to today's large language models and AI agents. It covers both established terms (such as supervised learning and reinforcement learning) and rapidly emerging ones (such as agentic context engineering and computer-use agents). Click any term for the full article.
Artificial general intelligence (AGI): A hypothetical AI matching or surpassing humans across most cognitive tasks.[12]
Artificial Superintelligence (ASI): A hypothetical intellect that vastly exceeds humans in essentially every domain.[13]
Machine learning: A field of AI using statistical algorithms that learn from data without explicit programming.[17]
Deep learning: Machine learning using multilayered neural networks for tasks like classification and representation learning.[16]
Neural network: A computational model inspired by biological neurons that learns by adjusting weighted connections.[16]
Supervised learning: A learning paradigm that maps inputs to outputs using labeled example pairs.[17]
Unsupervised learning: Learning patterns from unlabeled data without explicit targets.[17]
Reinforcement learning: A paradigm where an agent learns by acting in an environment to maximize cumulative reward.[17]
Generative AI: AI that uses generative models to produce text, images, video, audio, or code from prompts.[1]
Natural language processing (NLP): The processing and understanding of human language by computers.[1]
Computer vision: The study of methods for acquiring, processing, and understanding digital images.[1]
Inference: Using a trained model to generate predictions or outputs from new inputs.[1]
Transformers: A neural architecture using multi-head attention to process sequences in parallel, underlying most modern LLMs.[3]
Attention mechanism: A method that weights how important each element of a sequence is relative to the others.[8]
Multi-head Latent Attention (MLA): A DeepSeek attention variant that compresses key and value caches to reduce inference memory.
Large language models (LLMs): Neural networks trained on large text corpora that generate, summarize, translate, and analyze language.[2]
Foundation models: Large models trained on broad data that can be adapted to many downstream tasks.[6]
Frontier models: The most advanced foundation models, whose capabilities may pose novel risks.[6]
GPT: Generative Pre-trained Transformer, a family of transformer language models pre-trained on unlabeled text.[21]
GPT-1: OpenAI's first GPT, released in 2018.[21]
GPT-2: OpenAI's 2019 1.5-billion-parameter LLM, initially released in stages over misuse concerns.[21]
GPT-3: OpenAI's 175-billion-parameter 2020 LLM, notable for few-shot abilities.[21][34]
GPT-4: OpenAI's multimodal LLM released in 2023, accepting text and image inputs.[21]
BERT: A 2018 Google encoder-only transformer that learns bidirectional text representations.[20]
LaMDA: Google's Language Model for Dialogue Applications, announced in 2021.
Diffusion Models: Generative models that produce data by reversing a step-by-step noise-adding process.[7]
Stable Diffusion: A 2022 open-source text-to-image diffusion model.[7]
ControlNet: A neural network add-on that conditions diffusion models on extra inputs like edges, depth, or poses.[32]
Vision Transformer (ViT): A transformer for computer vision that splits images into patches processed with attention.[22]
Vision Language Model (VLM): A multimodal model that jointly processes images and text.
Vision encoder: The component of a multimodal model that turns images into features for a language model.
Vision tokens: Discrete patch or feature representations of an image used by a vision or multimodal model.
Mixture of Experts (MoE): An architecture combining many expert subnetworks, with a router selecting experts per input.[18]
Convolutional neural network (CNN): A feedforward network that uses learned filters to extract spatial features from images.[16]
Recurrent neural network (RNN): A network that processes sequences by feeding outputs from one step back as inputs.[16]
Generative adversarial network (GAN): A framework where a generator and discriminator compete to produce realistic synthetic data.[26]
Multimodal AI: AI systems that integrate multiple data types such as text, images, audio, and video.[1]
World models: AI systems that build internal models of an environment and predict how actions change it.[31]
Computer-use model: A model trained to operate computer interfaces via screen, mouse, and keyboard.
OCR Models: Optical Character Recognition models that convert images of text into machine-readable strings.
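As a rough illustration of the attention mechanism and Transformers entries in this group, the sketch below computes scaled dot-product attention in plain Python. The function names (`softmax`, `attention`) and the toy vectors in the usage note are assumptions made for this example; production systems use batched tensor libraries and multiple attention heads.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention for small lists of vectors.

    queries: list of d-dimensional query vectors
    keys:    list of d-dimensional key vectors, one per sequence position
    values:  list of value vectors, one per sequence position
    Returns (outputs, weights), with one row per query.
    """
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        # Softmax turns the scores into weights that sum to 1.
        weights = softmax(scores)
        # The output is the weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights
```

Calling `attention([[5.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[10.0, 0.0], [0.0, 10.0]])` puts most of the weight on the first position, because the query points in the same direction as the first key.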
Pre-training: The initial large-scale training phase where a model learns general patterns from broad data.[11]
Post-training: Refinement after pre-training, including supervised fine-tuning and reinforcement-learning alignment.[11]
Fine-tuning: Adapting a pre-trained model to a specific task by continuing training on task data.[11]
Supervised fine-tuning (SFT): Fine-tuning on labeled input-output pairs that demonstrate desired behavior.[11]
RLHF (Reinforcement learning from human feedback): Aligning a model with human preferences by training a reward model from feedback and optimizing against it.[5]
LoRA (low-rank adaptation): A parameter-efficient fine-tuning method that injects small low-rank matrices while freezing base weights.[29]
Knowledge distillation: Transferring knowledge from a large teacher model to a smaller student that approximates it.[23]
Quantization: Reducing the numerical precision of model weights and activations to lower memory use and speed up inference.[30]
Pruning: Removing weights or neurons from a trained model to reduce size with limited accuracy loss.
Muon optimizer: A neural network optimizer using orthogonalized gradient updates that has gained attention for large-model training.
Benchmarks: Standardized tests comparing model performance on tasks such as reasoning, coding, or vision.
Leaderboards: Public rankings comparing AI models on benchmarks or user preferences.
Large language models ranking: Comparative listings of LLMs by capability, cost, or speed.
Emergent abilities: Capabilities that appear at sufficient model scale and are absent in smaller models.[2]
Model collapse: Gradual quality degradation when models are repeatedly trained on AI-generated synthetic data.
The Pile: An 800 GB open English text dataset created by EleutherAI for training LLMs.
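To make the quantization entry in this group concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is one simple scheme among many; the helper names (`quantize_int8`, `dequantize`) are hypothetical, and real toolchains typically use per-channel scales, calibration data, and packed integer storage.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]
```

Each restored weight differs from the original by at most half of `scale`, which is the precision given up in exchange for storing one byte per weight instead of four.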
Prompt: The text or other input given to a model to specify a task.
System prompt: A high-level instruction that sets a model's persona, rules, or constraints across a conversation.
Prompt engineering: The practice of structuring inputs to generative models to produce desired outputs.
Prompt engineering for image generation: Writing prompts that guide text-to-image models toward desired visuals.
Prompt engineering for text generation: Crafting prompts that elicit accurate, useful answers from language models.
Meta Prompting: Using a model to design or improve prompts for itself or another model.
Chain-of-thought: A prompting technique where the model writes intermediate reasoning steps before its final answer.[10]
Few shot: Providing a few examples in the prompt to demonstrate the desired task.
One shot: Prompting with a single example of the target task.
Zero-shot: Asking a model to perform a task with no examples, relying on the instruction alone.
Completion: The output text generated by an LLM in response to a prompt.[2]
Tokens: Subword units that language models consume and produce; context limits are measured in tokens.[25]
Context window: The maximum tokenized input a model can consider at once when generating output.[25]
Vector embeddings: Numerical vectors representing words or items so that similar meanings are close in vector space.
Vector database: A database that stores embeddings and retrieves them by vector similarity for semantic search.[19]
Retrieval-augmented generation (RAG): Retrieving external documents and feeding them to an LLM to supplement training knowledge.[4]
Agentic Context Engineering: Designing what information, tools, and memories an agent sees in its context.
ReasoningBank: A memory framework that stores reasoning strategies for agents to reuse across tasks.
Burstiness: Variation in sentence length and complexity, used to distinguish human writing from AI text.
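The vector embeddings, vector database, and retrieval-augmented generation entries in this group all rest on similarity search. The sketch below ranks a few documents by cosine similarity to a query vector. The three-dimensional "embeddings", the `store` dictionary, and the `top_k` helper are toy assumptions for illustration; real embeddings come from a trained model, and real vector databases use approximate indexes rather than a full sort.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, store, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]


# Hand-made toy vectors, not output from any real embedding model.
store = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.3, 0.0],
    "tax law": [0.0, 0.1, 0.9],
}
query = [1.0, 0.2, 0.0]
nearest = top_k(query, store, k=2)
```

In a retrieval-augmented setup, the documents named in `nearest` would then be placed in the prompt so the model can draw on them when answering.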
AI Agents: Autonomous systems that pursue goals, use tools, and take actions.
Minimum Viable Agent: The smallest functional agent that demonstrates end-to-end value on a real task.
Browser-use agent (BUA): An agent that controls a web browser to perform tasks.
Computer-use agent (CUA): An agent that operates a computer through screen, mouse, and keyboard.
VLA: A vision-language-action model, which maps visual and text inputs to actions, often in robotics.
Deep Research: An agentic feature where a model plans, searches, and writes a long sourced report.
Custom GPTs: Configurable ChatGPT variants combining instructions, knowledge files, and tools.
HuggingGPT: A system using an LLM controller to coordinate specialized models on Hugging Face.
LangChain: An open-source framework for building LLM apps with prompts, tools, memory, and data.
Manus AI: A general-purpose autonomous AI agent launched in 2025.
OpenMule: An open agent project focused on autonomous task execution.
OpenRouter: A unified API for accessing many LLMs from different providers.
Model hubs: Platforms that host and distribute ML models, weights, and datasets.
Ollama: A tool for running open-source LLMs locally.
Cursor Rules: Project-level instructions configuring the Cursor AI code editor.
Claude Code: Anthropic's official command-line coding agent.
Claude --dangerously-skip-permissions: A Claude Code flag that bypasses the permission prompts normally shown before tool use.
Claude Skills: Packaged instructions and resources that extend Claude for specialized tasks.
Vibe coding: AI-assisted development where the developer describes intent and an LLM writes code.
CodeGen: AI code generation, or models trained to write source code.
AI Project Management: Using AI tools to plan, track, and report on projects.
Presentations: AI tools that generate slides and visuals from prompts.
Paper2Video: A system that converts academic papers into narrated video summaries.
SEO for ChatGPT: Optimizing web content so AI assistants surface and cite it.
Art: AI-generated visual art produced by text-to-image or image-to-image models.
Image generator: A model that produces images from text or other prompts.
Text-to-image: Models that generate images from text descriptions.
Text-to-video: Models that generate short video clips from text descriptions.
Text-to-3D: Models that produce 3D objects or scenes from text prompts.
Text-to-audio: Models that generate speech, music, or sound effects from text.
AudioCraft: Meta's open-source library for generative audio, including MusicGen and AudioGen.
DreamStudio: Stability AI's web interface for generating images with Stable Diffusion.
Unstable Diffusion: A community project fine-tuning diffusion models for unrestricted image generation.
LongCat-Video: A long-form video generation model from Meituan's LongCat team.
Wan AI: Alibaba's family of open-source video generation models.
Segment Anything: A Meta model that produces segmentation masks for any object indicated in an image.
Deepfake: Synthetic image, video, or audio media created by AI to depict real or fictional people.
OpenAI: An American AI research and deployment company headquartered in San Francisco.
Anthropic: An American AI safety company that develops the Claude models.
Google DeepMind: A British-American AI research lab and Alphabet subsidiary that builds Gemini.
Meta AI: Meta Platforms' AI division, which releases the Llama models.
Meta AI (Company): The corporate AI unit within Meta Platforms producing consumer AI products.
Mistral AI: A French AI company developing efficient open and proprietary LLMs.
xAI: Elon Musk's AI company, creator of the Grok family of LLMs.
Stability AI: The company behind Stable Diffusion and related open-source generative models.
DeepSeek: A Chinese AI lab known for high-performing open-weight LLMs and efficient training.
Moonshot AI: A Chinese startup developing the Kimi family of long-context language models.
MiniMax: A Chinese AI company developing conversational, text, image, and audio models.
Baidu AI: Baidu's AI division, which develops the Ernie family of foundation models.
Alibaba Cloud: Alibaba's cloud arm, which trains and serves the Qwen family of models.
ByteDance Seed: ByteDance's foundation model team behind the Doubao and Seed model families.
inclusionAI: An open-source AI organization that releases models and tools on Hugging Face.
Reflection AI: A New York-based startup focused on autonomous coding agents and frontier research.
Z.ai: The international brand of Zhipu AI, a Chinese company developing the GLM models.
Frontier labs: The leading AI organizations training the most capable foundation models.
Claude: Anthropic's family of LLM assistants designed with a focus on safety and helpfulness.
Bard: Google's conversational AI service launched in 2023, later rebranded as Gemini.
Grok: The conversational AI assistant developed by xAI and integrated with the X platform.
Perplexity: An AI answer engine combining web search with LLM summarization and citations.
Qwen: Alibaba's family of open-weight foundation models for language, vision, and audio.
GLM: The General Language Model family developed by Zhipu AI for chat, coding, and multimodal tasks.
Q* OpenAI: A reported internal OpenAI project, subject of speculation about advanced reasoning.
Backdooring LLMs: Implanting hidden triggers in a language model that cause malicious behavior on certain inputs.
PoisonGPT: A demonstration of uploading a poisoned LLM to a model hub, where it behaves normally on most inputs but not on targeted ones.
Prompt injection: An attack where inputs cause an LLM to ignore prior instructions and follow injected ones.[24]
Prompt extraction: An attack that tries to recover a system's hidden prompt or instructions.
Purple Llama: Meta's umbrella project for trust and safety in open generative AI.
Hallucination: AI output that contains plausible-sounding but false or unsupported information.[9]
AI alignment: The research area for steering AI systems toward intended human goals and values.[14]
AI safety: An interdisciplinary field focused on preventing accidents, misuse, and harms from AI systems.[15]
Ethics: The study of moral issues raised by AI, including bias, fairness, accountability, and transparency.
Controversies: Public debates and disputes around AI development, deployment, and impact.
Copyright: Legal questions about the ownership of training data, model outputs, and AI-assisted works.
AI bubble: The concern that AI valuations and investment have outpaced fundamentals.
Dead internet theory: A conspiracy theory that the internet is now dominated by bots and AI-generated content.
LLM Anxiety: Worry or stress caused by interacting with or depending on LLMs.
Manipulation problem: The risk of persuasive AI systems being used to influence beliefs or decisions at scale.
Grok 3 Jailbreak: A documented method for bypassing safety filters on xAI's Grok 3 model.
AI Monarchy: A scenario where a small group, or an AI itself, gains outsized control through advanced AI.
RSI (Recursive self-improvement): The hypothesized process by which an AGI rewrites its own code to become more capable.[12]
NVIDIA: An American chip company whose GPUs are the dominant hardware for training and serving AI models.
GPU: A parallel processor used to accelerate deep learning training and inference.
CUDA: NVIDIA's parallel computing platform and API for general-purpose GPU computation.[27]
cuDNN: NVIDIA's CUDA Deep Neural Network library of optimized primitives for deep learning.[27]
NVIDIA DGX Spark: A compact NVIDIA AI workstation for local model training and inference.
NVIDIA Omniverse: NVIDIA's platform for 3D simulation, digital twins, and synthetic data.
Isaac GR00T: NVIDIA's foundation model and platform for general-purpose humanoid robots.
QPU: A quantum processing unit, the central hardware element of a quantum computer.
PyTorch: An open-source deep learning framework developed by Meta and governed by the Linux Foundation.[28]
Edge AI: Running AI models on local devices near data sources rather than in centralized clouds.
Dark Factory: A fully automated manufacturing facility operated with little or no human presence.
XPeng IRON: A humanoid robot developed by Chinese EV maker XPeng.
SmolVLA: A small efficient vision-language-action model for robotics released by Hugging Face.
DeepSeek-OCR: A DeepSeek vision-language model that compresses long text into image tokens for document OCR.
Contexts optical compression: A technique used by DeepSeek-OCR that encodes text as compact visual representations.
OpenAI Gym: An open-source toolkit for developing reinforcement-learning algorithms.
OpenAI Gym Retro: An OpenAI extension of Gym that turns classic video games into RL environments.
OpenAI Universe: An older OpenAI platform that let RL agents interact with software via a virtual screen.