Amazon Bedrock

AI Tools & Products Artificial Intelligence

20 min read

Updated Jun 20, 2026

Suggest edit History Talk

RawGraph

Last edited

Jun 20, 2026

Fact-checked

In review queue

Sources

16 citations

Revision

v6 · 3,991 words

Fact-checks are independent of edits: a reviewer re-verifies the article against its sources and stamps the date. How we verify

Amazon Bedrock is a fully managed service from Amazon Web Services (AWS) that offers a choice of high-performing foundation models from leading AI companies, accessible through a single unified API, so developers and enterprises can build, scale, and deploy generative artificial intelligence applications and agents without managing infrastructure or training models from scratch ^[1]. AWS describes it as "the platform for building generative AI applications and agents at production scale," and reports that Bedrock powers generative AI for more than 100,000 organizations worldwide, from startups to global enterprises across every industry ^[6]^[10]. It reached general availability on September 28, 2023, and by early 2026 offered access to nearly 100 serverless models, making it one of the broadest multi-provider model platforms in cloud computing ^[1]^[14].

What is Amazon Bedrock used for?

Bedrock is used to add generative AI capabilities to applications without operating model infrastructure. Common workloads include text generation and summarization, conversational assistants and chatbots, retrieval-augmented generation over private data, code generation, image generation, document and video analysis, and increasingly the building and operation of autonomous AI agents. Because models are exposed through one API and standard AWS SDKs, teams can prototype with several models, benchmark them on their own tasks, and move the best fit into production without rebuilding application logic. The official AWS definition states that "Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies" ^[1].

History and Development

AWS first previewed Amazon Bedrock in April 2023 at its annual summit, positioning it as the company's answer to growing demand for enterprise-grade generative AI tools. The service became generally available on September 28, 2023, initially launching in two AWS regions: US East (N. Virginia) and US West (Oregon) ^[1]^[2]. At launch, the platform offered models from AI21 Labs, Anthropic, Cohere, Stability AI, and Amazon's own Titan family, with Meta's Llama 2 (in fine-tuned 13B and 70B versions) added shortly after as the first fully managed Llama 2 API ^[1]^[2]. Bedrock achieved HIPAA eligibility and GDPR compliance at general availability ^[2].

Since then, AWS has steadily expanded Bedrock's capabilities and model catalog. In December 2025, AWS announced its largest single expansion, adding 18 fully managed open-weight models to the platform, with new entries from Mistral, Google, NVIDIA, OpenAI, MiniMax, Moonshot, and Qwen; with that launch Bedrock reached nearly 100 serverless models ^[14]. That same month, on December 2, 2025, AWS introduced the Amazon Nova 2 family of foundation models, which offer advanced reasoning with competitive price-performance characteristics ^[3].

In October 2025, AWS launched Amazon Bedrock AgentCore, a platform-level service for building, deploying, and operating AI agents at scale without managing infrastructure. AgentCore became generally available on October 13, 2025 and marked a significant evolution in the platform's capabilities, reflecting the broader industry shift toward agentic AI architectures ^[9]^[15]. By early 2026, Amazon Bedrock powers generative AI for more than 100,000 organizations worldwide, spanning startups to global enterprises across every industry ^[10].

When was Amazon Bedrock released?

Amazon Bedrock was previewed in April 2023 and reached general availability on September 28, 2023. The following timeline summarizes major milestones ^[1]^[3]^[9]^[14]^[15].

Date	Milestone
April 2023	Bedrock previewed at the AWS Summit
September 28, 2023	General availability in US East (N. Virginia) and US West (Oregon)
2025	Intelligent Prompt Routing, Priority and Flex pricing tiers, multi-agent collaboration introduced
July 2025	Bedrock AgentCore released in preview
October 13, 2025	Bedrock AgentCore reaches general availability
December 2, 2025	Amazon Nova 2 family announced; 18 open-weight models added (largest expansion to date)
December 3, 2025	Reinforcement fine-tuning announced
Early 2026	Nearly 100 serverless models; 100,000+ organizations using Bedrock

Available Models

One of Bedrock's primary differentiators is its multi-provider approach. Instead of locking customers into a single model family, it offers a marketplace of foundation models spanning text generation, image generation, and embeddings.

Provider	Notable Models	Capabilities
Anthropic	Claude 3.5 Sonnet, Claude 3.5 Haiku, Claude 4	Text generation, analysis, code, vision
Meta	Llama 3.3 (8B, 70B, 405B)	Text generation, code, multilingual
Amazon	Titan Text, Titan Embeddings, Amazon Nova 2 Lite, Nova 2 Pro	Text generation, embeddings, reasoning
Mistral AI	Mistral Large 3, Magistral Small 1.2, Ministral 3 series, Voxtral	Text generation, code, multilingual, audio
Cohere	Command R, Command R+, Embed	Text generation, RAG-optimized, embeddings
Stability AI	Stable Diffusion XL, SDXL Turbo	Image generation
AI21 Labs	Jamba-Instruct	Text generation, long context
DeepSeek	DeepSeek R1	Reasoning, code
Google	Gemma 3 (4B, 12B, 27B)	Text generation, lightweight
NVIDIA	Nemotron Nano 2 series	Text generation, vision

AWS continues to add new providers and models on a regular basis. The platform also includes Amazon's own Nova 2 family, which was announced on December 2, 2025 and includes Nova 2 Lite, a fast, cost-effective reasoning model for everyday workloads, and Nova 2 Pro (Preview), positioned as Amazon's most intelligent model for complex, multi-step tasks ^[3]. Both Nova 2 models support extended thinking with step-by-step reasoning at three thinking-intensity levels (low, medium, and high), include built-in tools such as a code interpreter and web grounding, support remote Model Context Protocol (MCP) tools, and provide a one-million-token context window ^[3].

Model Pricing by Provider

Bedrock prices models on a per-token basis, with costs varying significantly across providers and model sizes. The following table provides representative on-demand pricing for commonly used models as of early 2026.

Model	Input (per 1M tokens)	Output (per 1M tokens)	Notes
Claude 3.5 Sonnet	$6.00	$30.00	Batch: $3.00 / $15.00
Claude 3.5 Haiku	$1.00	$5.00	Fastest Claude model
Llama 2 Chat 70B	$1.95	$2.56	Open-weight
DeepSeek v3.2	$0.62	$1.85	Reasoning-focused
Gemma 3 4B	$0.04	$0.08	Lightweight, budget
Gemma 3 12B	$0.09	$0.29	Mid-range
Gemma 3 27B	$0.23	$0.38	Higher capability
Ministral 3B	$0.10	$0.10	Budget option
Voxtral Mini	$0.04	$0.04	Audio processing

Output tokens typically cost 3 to 5 times more than input tokens across most models, reflecting the higher computational cost of text generation versus processing ^[7]. Budget-conscious deployments can leverage smaller models like Gemma 3 4B or Ministral 3B at a fraction of the cost of frontier models.

Key Features

Model Access and API

Bedrock provides a unified API that lets developers switch between different foundation models with minimal code changes. This abstraction layer means that teams can experiment with multiple models, benchmark their performance on specific tasks, and select the best fit without rebuilding their application logic. All API calls are made through standard AWS SDKs, making integration straightforward for organizations already using AWS infrastructure.

Knowledge Bases and Retrieval-Augmented Generation

Retrieval-augmented generation (RAG) is a technique that enhances the accuracy of AI-generated responses by grounding them in external data sources. Bedrock Knowledge Bases is a fully managed capability that handles the entire RAG workflow, from data ingestion and indexing to retrieval and prompt augmentation ^[4]. Users connect their data sources (Amazon S3 buckets, web crawlers, or databases), select an embeddings model, and Bedrock automatically chunks, indexes, and stores the data in a vector store. When a query comes in, the system retrieves relevant information and passes it to the generation model as context.

According to AWS, RAG with Knowledge Bases can reduce hallucinated outputs by 50 to 90 percent compared to relying solely on a model's parametric knowledge ^[4]. New documents can be reindexed automatically without retraining the model, and the system supports both structured and unstructured data with metadata filtering.

Knowledge Bases Architecture

The Knowledge Bases pipeline operates in two phases. During the ingestion phase, Bedrock accepts documents from connected data sources, splits them into configurable chunk sizes, generates vector embeddings using a selected embeddings model (such as Amazon Titan Embeddings or Cohere Embed), and stores the resulting vectors in a supported vector store. Supported vector stores include Amazon OpenSearch Serverless, Amazon Aurora PostgreSQL, Pinecone, and Redis Enterprise Cloud ^[4].

During the retrieval phase, when a user submits a query, the system generates a query embedding, performs a semantic similarity search against the vector store, retrieves the most relevant document chunks, augments the prompt with the retrieved context, and passes the augmented prompt to the selected generation model. The system also supports in-built session context management and source attribution, returning citations that trace each part of the response back to its source document ^[11].

Knowledge Bases can be integrated with Bedrock Guardrails for content filtering and with Bedrock Agents for multi-step task execution that requires grounding in enterprise data.

Bedrock Agents

Bedrock Agents allows developers to create AI-powered assistants that can reason through multi-step tasks, call external APIs, and interact with Knowledge Bases. An agent takes a user request, breaks it down into subtasks, decides which tools or data sources to query, executes those steps, and returns a coherent response. This makes it possible to build applications like customer service bots that can look up order status, process refunds, and answer product questions in a single conversation.

Agents are defined through action groups (the APIs and functions they can call) and can be integrated with Knowledge Bases and Guardrails for controlled, grounded responses.

Agent Architecture and Multi-Agent Collaboration

Bedrock Agents use a managed orchestration architecture where a foundation model serves as the reasoning engine. When a user sends a request, the agent invokes the FM to create a reasoning trace, which determines the sequence of actions required to fulfill the request. At each step, the agent decides whether to call an action group, query a Knowledge Base, or return a response to the user.

Multi-agent collaboration, introduced in 2025, allows developers to build systems where multiple specialized agents work together under the coordination of a supervisor agent. The supervisor agent breaks complex processes into manageable steps, assigns tasks to domain-specialist sub-agents, and aggregates their results. Each sub-agent can focus on a specific capability, such as database queries, API calls, or document analysis, enabling a separation of concerns that improves both reliability and maintainability ^[9].

Amazon Bedrock AgentCore

AgentCore, which reached general availability on October 13, 2025, provides the infrastructure layer for deploying and operating agents at scale. AWS describes it as "an agentic platform to build, deploy and operate highly capable agents securely at scale using any framework, model, or protocol" ^[15]. Its components include ^[9]^[15]:

Component	Function
Runtime	Secure, serverless environment for deploying agents and tools; supports any framework, complete session isolation, and execution windows up to 8 hours
Memory	Session persistence and context retention across interactions
Gateway	Transforms existing tools, APIs, and Lambda functions into agent-ready capabilities and connects to MCP servers
Identity	Authentication and authorization for agent-to-service communication, with OAuth integration and secure token storage
Observability	Monitoring, logging, and debugging for agent behavior via Amazon CloudWatch dashboards, OpenTelemetry compatible

AgentCore supports the Agent-to-Agent (A2A) protocol, announced in late 2025, which enables interoperability between agents built on different frameworks, including AWS Strands Agents, OpenAI Agents SDK, LangGraph, Google ADK, and Claude Agents SDK. The A2A protocol allows agents to share context, capabilities, and reasoning in a standardized, verifiable format ^[12].

Bedrock Guardrails

Bedrock Guardrails provides configurable safety filters that sit between the user and the model. Organizations can define policies to block harmful content, enforce topic boundaries, redact personally identifiable information, and check for hallucinations. AWS reports that Guardrails can block up to 88 percent of harmful content, including multimodal image and text content, and that its Automated Reasoning checks identify correct model responses with up to 99 percent accuracy ^[5]^[6]. AWS notes that the Automated Reasoning checks "mathematically verify natural language content against your defined policies," providing provable explanations rather than pattern matching ^[5].

Guardrails also include contextual grounding checks, which verify whether a model's response is actually supported by the source material provided. This is especially useful in RAG applications where responses should be traceable back to retrieved documents.

Guardrails Feature Categories

The Guardrails system provides several distinct categories of protection ^[5]^[11]:

Feature	Description
Content filters	Block harmful or inappropriate text and images across categories (hate, sexual, violence, self-harm) with configurable severity thresholds
Prompt attack detection	Identify and block malicious prompts attempting to bypass moderation or alter model behavior
Denied topics	Define topics the model should refuse to discuss (e.g., illegal advice, competitor analysis)
Word filters	Block specific words or phrases such as profanity, competitor names, or internal terminology
PII redaction	Automatically detect and redact personally identifiable information from inputs and outputs
Contextual grounding	Verify that responses are supported by retrieved source material
Automated Reasoning	Logic-based verification that checks model outputs against defined business rules

Guardrails can be applied to any model available through Bedrock, including custom and fine-tuned models. They are also integrated with Knowledge Bases and Agents, creating layered protection across the full application stack.

Fine-Tuning and Model Customization

Bedrock supports fine-tuning of select models using labeled training data. Organizations can adapt a foundation model to their specific domain, terminology, or response style without building a model from scratch. Bedrock also supports continued pre-training with unlabeled data for deeper customization. All fine-tuning happens within the AWS environment, and training data never leaves the customer's account ^[6].

Customization Methods

Bedrock offers four distinct approaches to model customization ^[13]:

Method	Description	Data Required	Best For
Supervised fine-tuning	Adapt model behavior using labeled prompt-completion pairs	Labeled examples	Domain-specific tasks, tone adjustment
Continued pre-training	Train on unlabeled domain data to deepen knowledge	Unlabeled text corpus	Industry jargon, proprietary knowledge
Distillation	Transfer capabilities from a large "teacher" model to a smaller "student" model	Prompt dataset (automated synthesis)	Cost reduction while maintaining quality
Reinforcement fine-tuning	Optimize using reward functions with rule-based or AI-based graders	Evaluation criteria	Code generation, math, instruction following

Distilled models run up to 500 percent faster and cost up to 75 percent less than the larger models they are derived from, while preserving comparable accuracy ^[6]. Reinforcement fine-tuning, introduced on December 3, 2025, is particularly notable: it enables developers to improve model accuracy without deep machine learning expertise or large volumes of labeled data. It supports two complementary approaches: Reinforcement Learning with Verifiable Rewards (RLVR), which uses rule-based graders for objective tasks such as code generation and mathematical reasoning, and Reinforcement Learning from AI Feedback (RLAIF), which uses AI-based judges for subjective tasks such as instruction following and tone consistency. At launch the feature supported Amazon Nova 2 Lite, with additional models planned, and AWS reports that reinforcement fine-tuning delivers average accuracy gains of 66 percent over base models ^[13].

Model Evaluation

Bedrock provides built-in model evaluation tools that let teams compare different models on their own datasets. Automatic evaluations charge only for the model inference used, with no minimum usage commitments. This is particularly valuable for organizations that want to rigorously benchmark models before committing to a specific one in production.

Intelligent Prompt Routing

Introduced in 2025, Intelligent Prompt Routing allows Bedrock to automatically direct requests to different models within the same model family based on prompt complexity. For example, simple queries might be routed to Claude 3 Haiku (cheaper, faster), while complex queries go to Claude 3.5 Sonnet (more capable). AWS claims this can cut costs by up to 30 percent while maintaining quality ^[6]. The routing system also supports Llama (routing between Llama 3.3 70B and 3.1 8B) and Nova (routing between Nova Pro and Nova Lite) model families.

Bedrock Flows

Bedrock Flows is a visual workflow authoring tool that lets users orchestrate multiple components, including foundation models, prompts, agents, Knowledge Bases, Guardrails, and AWS services, into coherent pipelines. Teams can design, test, and iterate on multi-step AI workflows using a drag-and-drop interface. Flows pricing is $0.035 per 1,000 node transitions ^[7].

How much does Amazon Bedrock cost?

Bedrock uses a pay-per-use pricing structure with several options designed to match different workload patterns.

Pricing Tier	Description	Best For
On-Demand (Standard)	Pay per input/output token at base rates	Variable or experimental workloads
Priority	75% premium over standard rates for guaranteed low latency	Latency-sensitive production workloads
Flex	50% discount vs. on-demand; best-effort processing	Cost-sensitive, latency-tolerant workloads
Batch Inference	50% discount vs. on-demand; results within 24 hours	Large-scale, non-time-sensitive processing
Provisioned Throughput	Reserved model units for guaranteed capacity (1-month and 6-month terms)	High-volume production workloads
Bedrock Flows	$0.035 per 1,000 node transitions	Multi-step orchestrated pipelines
Model Evaluation	Charged only for inference used	Benchmarking and model selection

The introduction of the Priority and Flex tiers in 2025 added flexibility for workloads with different latency and cost requirements. Batch inference is particularly attractive for offline processing tasks, offering the same model quality at half the on-demand price ^[7].

Real-World Customer Deployments

Amazon Bedrock has attracted a diverse range of enterprise customers across industries. Several notable deployments illustrate the platform's production capabilities.

Customer	Industry	Use Case	Results
Robinhood	Financial services	AI-first financial analysis and customer support	Scaled from 500M to 5B tokens/day in 6 months; 80% AI cost reduction; development time cut by half ^[10]^[16]
Toyota Motor North America	Automotive	RAG-driven dealer assistant for vehicle information	Over 7,000 dealer interactions per month ^[10]
Apex Fintech Solutions	Financial services	Financial crime investigation with agent-to-agent communication	Automated complex investigation workflows ^[10]
Epsilon	Marketing	Intelligent agents for campaign workflow automation	Enterprise-grade campaign management with security compliance ^[10]
CloudZero	Cloud FinOps	AI-powered cloud cost advisor platform	50x growth; 75% reduction in developer cognitive load ^[10]
Fujitsu	Supply chain	Agentic supply chain workflows with guardian agent monitoring	Continuous monitoring and correction of agent drift ^[10]

These deployments demonstrate that Bedrock is being used in production at significant scale, with customers processing billions of tokens daily and integrating AI into mission-critical business processes. Robinhood, which built a secure "FinCrimes Agent" on Bedrock to synthesize customer and transactional data into investigative summaries, has made the platform central to its operations. According to Dev Tagare, Robinhood's Head of AI, "Amazon Bedrock's model diversity, security, and compliance features are purpose-built for regulated industries" ^[16].

Security and Data Privacy

Bedrock encrypts all data in transit and at rest. Customer prompts and outputs are not stored by AWS or used to train or improve foundation models. All data processing occurs within the customer's own AWS account, and Bedrock integrates with AWS Identity and Access Management (IAM), AWS PrivateLink, and AWS CloudTrail for access control and auditing. Bedrock achieved HIPAA eligibility and GDPR compliance at general availability, making it suitable for industries with strict compliance requirements, including healthcare, finance, and government ^[2].

How does Amazon Bedrock compare with Azure and Vertex AI?

Bedrock competes primarily with Azure OpenAI Service (Microsoft) and Google Vertex AI (Google Cloud).

Feature	Amazon Bedrock	Azure OpenAI Service	Google Vertex AI
Model Providers	10+ providers	Primarily OpenAI	Primarily Google (Gemini) + Model Garden
Approach	Multi-vendor marketplace	Deep OpenAI integration	Data-first, analytics-driven
RAG Support	Knowledge Bases (managed)	Azure AI Search integration	Vertex AI Search
Agent Framework	Bedrock Agents + AgentCore	Azure AI Agent Service	Vertex AI Agents
Safety Tools	Bedrock Guardrails	Content filtering + Responsible AI	Responsible AI toolkit
Pricing Model	Per-token, batch, flex, provisioned	Per-token, PTUs	Per-token, compute-hour
Ecosystem	AWS services integration	Microsoft/Office 365 integration	BigQuery, Dataflow integration
Multi-Agent	A2A protocol, supervisor agents	Agent orchestration	Agent Engine
Fine-Tuning	Supervised, continued pre-training, distillation, reinforcement	Supervised fine-tuning	Supervised, RLHF

Bedrock's main advantage is breadth of model choice. While Azure centers on OpenAI's models and Vertex AI focuses on Google's Gemini family, Bedrock offers the widest selection of third-party providers in a single managed platform. For enterprises already invested in AWS infrastructure, Bedrock also provides the smoothest integration path ^[8].

For typical enterprise applications processing 10 to 50 million tokens monthly, Bedrock generally offers 15 to 25 percent lower costs than Azure, though Azure becomes more competitive at scale with reserved capacity ^[8]. Bedrock's Flex tier offers a unique advantage for latency-tolerant workloads, providing 50 percent discounts that have no direct equivalent in Azure or Vertex AI.

Current State (2025-2026)

As of early 2026, Amazon Bedrock has grown into one of the most feature-rich AI platforms in the cloud market, powering generative AI for more than 100,000 organizations worldwide ^[10]. The addition of nearly 100 serverless models, including open-weight models from Google, NVIDIA, MiniMax, and Moonshot AI, has broadened its appeal beyond the initial set of providers ^[14]. The Nova 2 family positions Amazon's own models, with their one-million-token context windows and tiered reasoning, as competitive alternatives for agentic and reasoning tasks ^[3].

AWS has also invested heavily in agentic AI capabilities, with Bedrock Agents, AgentCore, and Flows forming the backbone of increasingly sophisticated multi-step AI workflows. The support for the A2A protocol reflects a commitment to interoperability in a market where enterprises often use multiple AI frameworks and providers. The platform's Guardrails feature has matured into a comprehensive responsible AI solution that addresses enterprise concerns about safety, hallucination, and compliance.

Looking ahead, AWS continues to expand regional availability, add new model providers, and deepen integration with the broader AWS ecosystem. Bedrock's position as a model-agnostic platform gives it a structural advantage in a market where model leadership shifts rapidly between providers.

References

Improve this article

Add missing citations, update stale details, or suggest a clearer explanation. Every suggestion is reviewed for sourcing before it goes live.

5 revisions by 1 contributors · full history

Suggest edit

Amazon Bedrock

What is Amazon Bedrock used for?

History and Development

When was Amazon Bedrock released?

Available Models

Model Pricing by Provider

Key Features

Model Access and API

Knowledge Bases and Retrieval-Augmented Generation

Knowledge Bases Architecture

Bedrock Agents

Agent Architecture and Multi-Agent Collaboration

Amazon Bedrock AgentCore

Bedrock Guardrails

Guardrails Feature Categories

Fine-Tuning and Model Customization

Customization Methods

Model Evaluation

Intelligent Prompt Routing

Bedrock Flows

How much does Amazon Bedrock cost?

Real-World Customer Deployments

Security and Data Privacy

How does Amazon Bedrock compare with Azure and Vertex AI?

Current State (2025-2026)

References

Improve this article

What links here (24 of 81)

What links here (24 of 81)

What is Amazon Bedrock used for?

History and Development

When was Amazon Bedrock released?

Available Models

Model Pricing by Provider

Key Features

Model Access and API

Knowledge Bases and Retrieval-Augmented Generation

Knowledge Bases Architecture

Bedrock Agents

Agent Architecture and Multi-Agent Collaboration

Amazon Bedrock AgentCore

Bedrock Guardrails

Guardrails Feature Categories

Fine-Tuning and Model Customization

Customization Methods

Model Evaluation

Intelligent Prompt Routing

Bedrock Flows

How much does Amazon Bedrock cost?

Real-World Customer Deployments

Security and Data Privacy

How does Amazon Bedrock compare with Azure and Vertex AI?

Current State (2025-2026)

References

Improve this article

Related Articles

Claude Sonnet 4.5

Microsoft 365 Copilot

Model Context Protocol

Apple Intelligence

AI in Healthcare

AI Drug Discovery

What links here (24 of 81)

Related Articles

Claude Sonnet 4.5

Microsoft 365 Copilot

Model Context Protocol

Apple Intelligence

AI in Healthcare

AI Drug Discovery

What links here (24 of 81)