Vertex AI is the unified machine learning and generative AI platform offered by Google Cloud. It was announced on May 18, 2021 at Google I/O, and it consolidated the company's previously separate AI Platform and AutoML products into a single managed service that covers the full ML lifecycle: data preparation, training, tuning, evaluation, deployment, monitoring, and generative AI application development. Vertex AI is the primary path through which enterprise customers consume models built by Google DeepMind, including the Gemini family, as well as third-party foundation models from Anthropic, Mistral AI, Meta, and others.
At launch Google claimed Vertex AI required roughly 80% fewer lines of code to train a production model than competing platforms, a figure repeatedly cited by analysts to describe the platform's emphasis on managed MLOps over hand-rolled pipelines.[1][2] By 2026 the platform had grown into one of the three dominant enterprise ML clouds alongside Amazon SageMaker and Azure Machine Learning, and Google had begun rebranding the surface as the Gemini Enterprise Agent Platform while continuing to ship the underlying Vertex AI services.[3][4]
Vertex AI is a managed service: customers do not run the training schedulers, feature stores, model registries, or inference endpoints themselves. Instead they call a single REST/gRPC API, a Python SDK, or a web console, and Google operates the underlying compute on its own GPUs and TPUs. The platform is designed to be modular, so a team can adopt only the pieces it needs (for example, just the Model Registry and online prediction service) and integrate them with its own MLOps tooling, including open-source projects such as Kubeflow and MLflow.
The product is split conceptually into two halves. The first is the classical ML half that grew out of the old AI Platform: notebooks, custom training jobs, AutoML, hyperparameter tuning, the Feature Store, the Model Registry, and online and batch prediction. The second is the generative AI half that grew out of the 2023 launch of Vertex AI Studio and Model Garden, which exposes foundation models such as Gemini 3 Pro, Imagen, Veo, Claude, Llama, Mistral, and Gemma as managed APIs. Both halves share the same identity, billing, networking, and governance plane.
Before Vertex AI, Google Cloud sold ML capabilities under two separate brands. Cloud AI Platform (originally Cloud Machine Learning Engine, launched in 2017) was the developer-facing service for training and serving custom models written in TensorFlow, PyTorch, and other open-source frameworks. Cloud AutoML, introduced in January 2018, was a no-code product that let business users train custom vision, language, and tabular models from labeled datasets without writing model code. The two products had different APIs, different consoles, different billing pages, and different SDKs, which made it difficult for an organization to mix custom and AutoML workflows in one pipeline.
Alongside these, Google sold a portfolio of pretrained APIs (Vision API, Natural Language API, Translation API, Speech-to-Text, Video Intelligence) that were not directly part of either Cloud AI Platform or AutoML. Customers who wanted to combine these services with custom models often ended up writing their own glue code.
Google announced Vertex AI on May 18, 2021 during the Google I/O developer conference. The product was generally available the same day. The launch keynote was led by Andrew Moore, vice president and general manager of Cloud AI and Industry Solutions, who said the team had two guiding lights while building the platform: "get data scientists and engineers out of the orchestration weeds, and create an industry-wide shift that would make everyone get serious about moving AI out of pilot purgatory and into full-scale production."[1]
Craig Wiley, the director of product management for the new platform, told TechCrunch that "machine learning in the enterprise is in crisis," pointing to the very high rate at which enterprise ML projects failed to make it into production. Wiley had previously been general manager of Amazon SageMaker between 2016 and 2018, which gave him direct perspective on the competitive landscape.[2]
At launch the platform consolidated the legacy AI Platform and AutoML services and added a set of new MLOps components: Vertex Vizier (hyperparameter tuning), Vertex Feature Store, Vertex Experiments, Vertex Pipelines, Vertex Model Monitoring, Vertex ML Metadata, and Vertex ML Edge Manager (in experimental preview at launch). Google highlighted ModiFace (a L'Oreal subsidiary), Essence (a WPP-owned media agency), Sabre, Iron Mountain, and Wayfair as early customers in the launch press materials.[1][5]
Vertex AI's footprint expanded dramatically with the rise of foundation models. In June 2023 Google made generative AI features generally available on Vertex AI, including the first Generative AI Studio (later renamed Vertex AI Studio) and Model Garden, which began life with around 60 first-party and partner models.[6] In August 2023 Google added third-party models to Model Garden, starting with Anthropic's Claude 2 and Meta's Llama 2, both delivered as fully managed APIs rather than as containers customers had to host themselves.
In April 2024 Google introduced Vertex AI Agent Builder, a no-code surface for building conversational and tool-using agents grounded in enterprise data, and announced support for GPT-4-style function calling on Gemini models. Through 2024 and 2025 Vertex AI Search and Vertex AI Conversation matured into the search and chat building blocks for Google's enterprise products.
On November 18, 2025 Google launched Gemini 3 Pro, and Vertex AI was one of the four channels (alongside the Gemini app, Google AI Studio, and the Gemini API) where the model was available in preview from day one.[7] In early 2026 Google deprecated the original gemini-3-pro-preview and replaced it with gemini-3.1-pro-preview as the flagship model on Vertex AI Model Garden.
At Google Cloud Next 2026 in Las Vegas, Google rebranded the platform as the Gemini Enterprise Agent Platform. The company described the new name as an evolution rather than a replacement: all Vertex AI APIs, SDKs, and components continued to exist under the new umbrella, but the marketing and product hierarchy were reorganized around agent development. New components such as Agent Studio (a low-code agent builder), Agent Runtime (with sub-second cold starts), and Workspace Studio (for building agents inside Gmail, Docs, Sheets, Drive, Meet, and Chat) joined the existing Vertex AI lineup.[3][4]
Vertex AI is a collection of services that share a common API surface, console, and IAM model. The table below summarizes the core components as of early 2026.
| Component | Purpose |
|---|---|
| Vertex AI Workbench | Managed JupyterLab notebook environments, available as managed instances (Google operates the VM) or user-managed instances (the customer controls the underlying Compute Engine VM). Workbench notebooks are integrated with BigQuery, Cloud Storage, and Dataproc. |
| Vertex AI Pipelines | Serverless orchestrator for ML workflows, built on the open-source Kubeflow Pipelines and TFX SDKs. Pipelines run as DAGs of containerized steps and emit lineage metadata to Vertex ML Metadata. |
| Vertex AI Feature Store | Centralized store for online and offline ML features, with point-in-time correct lookups, automatic ingestion from BigQuery, and a managed online serving layer. The 2023 "Feature Store v2" rewrite moved the offline storage onto BigQuery directly. |
| Vertex AI Model Registry | Single registry for trained models, versions, evaluations, and lineage, regardless of whether the model came from AutoML, custom training, or an external workflow. The registry feeds the Prediction service and Model Monitoring. |
| Vertex AI Prediction | Managed inference service offering both online (real-time, low-latency) endpoints and batch prediction jobs. Endpoints support GPU, TPU, and CPU backends, traffic splitting between model versions, and private service connect for VPC-only deployments. |
| Vertex AI Vizier | Black-box hyperparameter tuning service based on Google's internal Vizier system, originally described in a 2017 KDD paper. Vizier supports Bayesian optimization, grid, and random search, and can tune any external objective, not just Vertex AI training jobs. |
| Vertex AI Vector Search | Vector similarity search service (formerly Matching Engine) built on Google's ScaNN approximate nearest neighbor algorithm. Used for semantic search, recommendation systems, and retrieval-augmented generation. Vector Search 2.0 entered preview in August 2025 with a fully managed, billion-vector-scale design. |
| Vertex AI Explanations | Explainable AI features that attribute model predictions to input features using techniques such as Sampled Shapley, Integrated Gradients, and XRAI. Available for AutoML, custom-trained, and tabular models, and integrated with Model Monitoring. |
| Vertex AI Model Monitoring | Continuous monitoring service that detects training-serving skew and prediction drift on deployed endpoints, with alerts pushed to Cloud Logging and Cloud Monitoring. |
| Vertex AI Experiments | Tracking and comparison service for training runs, hyperparameter sweeps, and pipeline executions, comparable in scope to MLflow Tracking. |
| Vertex AI Studio | Web-based playground for prompt design, prompt evaluation, fine-tuning, and grounding of generative models in Model Garden. Renamed from Generative AI Studio in 2024. |
| Vertex AI Model Garden | Catalog of more than 200 foundation models spanning Google first-party, third-party partner, and open-source families. Each model is exposed as a managed API or a one-click deployment to a Vertex Endpoint. |
| Vertex AI Agent Builder | Low- and no-code framework for building agents that can use tools, retrieve from enterprise data, and run as long-lived processes on Agent Engine. |
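The drift detection described in the Model Monitoring row above is commonly built on distribution-comparison statistics such as the population stability index (PSI). The sketch below is a toy illustration of that idea, not the Model Monitoring internals; the bin counts and the 0.2 alert threshold are conventional examples, not Google's documented defaults.

```python
import math

def psi(expected, actual):
    """Population stability index between two binned distributions.

    `expected` and `actual` are bin counts over the same bins (e.g. the
    training distribution vs. recent serving traffic). A small epsilon
    guards against empty bins. A PSI above roughly 0.2 is commonly read
    as significant drift.
    """
    eps = 1e-6
    total_e, total_a = sum(expected), sum(actual)
    score = 0.0
    for e, a in zip(expected, actual):
        p = max(e / total_e, eps)   # expected bin proportion
        q = max(a / total_a, eps)   # actual bin proportion
        score += (p - q) * math.log(p / q)
    return score

# Identical distributions -> no drift; inverted distributions -> strong drift.
no_drift = psi([50, 50], [50, 50])
strong_drift = psi([90, 10], [10, 90])
```

A monitoring service would compute such a statistic per feature on a schedule and raise an alert when the score crosses a configured threshold.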
Vertex AI Workbench is the IDE most data scientists use as their first entry point. Workbench provides JupyterLab environments preinstalled with the Vertex AI SDK, BigQuery client libraries, and common ML frameworks including TensorFlow, PyTorch, JAX, and scikit-learn. Two flavors are offered: managed notebooks, where Google handles the underlying virtual machine, idle shutdown, and upgrades, and user-managed notebooks, where the customer keeps root access on a Compute Engine VM and is responsible for the OS image. In 2024 Google began consolidating these two paths into Colab Enterprise, a Vertex-integrated version of the popular Colab product, while keeping the existing Workbench surfaces available for backward compatibility.
Vertex AI Pipelines is built directly on the open-source Kubeflow Pipelines SDK (and also accepts TFX pipelines), so customers can run the same pipeline definition on Vertex, on a self-managed Kubeflow cluster, or locally. Each pipeline run records lineage in the Vertex ML Metadata service, which makes it possible to trace any deployed model back to the exact dataset, code, and hyperparameters that produced it. Pipelines are commonly used for nightly retraining, batch scoring, and continuous evaluation.
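The pipelines-as-DAGs model can be illustrated with a minimal pure-Python sketch: steps are functions, edges are declared dependencies, and steps execute in topological order while their outputs are recorded as lineage. This is a conceptual toy, not the Kubeflow Pipelines or Vertex SDK, and the step names are invented.

```python
# Toy DAG runner mirroring the pipeline-of-containerized-steps idea.
from graphlib import TopologicalSorter

def ingest():
    return {"rows": 1000}

def train(data):
    return {"model": "m-v1", "trained_on": data["rows"]}

def evaluate(model):
    return {"model": model["model"], "auc": 0.91}

# Each key lists the steps it depends on: ingest -> train -> evaluate.
deps = {"train": {"ingest"}, "evaluate": {"train"}}
order = list(TopologicalSorter(deps).static_order())

# Execute in dependency order, keeping every output as lineage metadata.
artifacts = {}
for step in order:
    if step == "ingest":
        artifacts[step] = ingest()
    elif step == "train":
        artifacts[step] = train(artifacts["ingest"])
    elif step == "evaluate":
        artifacts[step] = evaluate(artifacts["train"])
```

In the real service each step runs as a container and the `artifacts` mapping corresponds to the lineage records written to Vertex ML Metadata.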
The Vertex AI Feature Store gives ML teams a single source of truth for features, with the same values served at training time and at inference time to avoid skew. The 2023 "v2" architecture stores offline features directly in BigQuery, eliminating an earlier requirement to maintain a parallel store, and serves online features through a low-latency managed key-value layer.
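The "point-in-time correct" lookup the Feature Store performs can be sketched in a few lines: given a time-ordered log of feature values, return the latest value known at or before a label's timestamp, so that training never leaks future information. The timestamps and values below are illustrative.

```python
from bisect import bisect_right

# Feature event log for one entity: (timestamp, value), sorted by timestamp.
events = [(100, 0.2), (200, 0.5), (300, 0.9)]
timestamps = [t for t, _ in events]

def point_in_time(ts):
    """Latest feature value at or before ts, or None if none existed yet."""
    i = bisect_right(timestamps, ts)
    return events[i - 1][1] if i else None
```

Serving the same `point_in_time` semantics at both training and inference time is what eliminates training-serving skew for that feature.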
Vector Search (formerly Matching Engine) is the platform's vector database. It is the production hosting service for ScaNN, the approximate nearest neighbor algorithm Google Research published at ICML 2020. Vector Search powers production semantic search and recommendation systems, and is the default vector store recommended by the Vertex AI RAG Engine. Vector Search 2.0, in preview from August 2025, simplified index management, added native filtering, and scaled to billions of vectors per index in a single managed configuration.
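The retrieval primitive behind Vector Search can be shown with an exhaustive cosine-similarity baseline. ScaNN itself uses learned quantization to avoid comparing against every vector; the brute-force version below computes the same answer at toy scale, with an invented three-document corpus.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Tiny stand-in for an embedding index: document id -> embedding.
corpus = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.7, 0.7, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}

def nearest(query, k=2):
    """Return the k document ids most similar to the query embedding."""
    ranked = sorted(corpus, key=lambda d: cosine(query, corpus[d]), reverse=True)
    return ranked[:k]
```

A RAG pipeline would embed the user's question, call `nearest`, and pass the retrieved documents to the model as grounding context; approximate methods like ScaNN trade a small amount of recall for orders-of-magnitude faster search at billion-vector scale.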
Vertex AI Studio is the web playground for foundation models. It supports prompt design, side-by-side prompt comparison, prompt evaluation against datasets, supervised and reinforcement fine-tuning, and grounding (binding model answers to a customer's data through Vector Search or Vertex AI Search).
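Prompt evaluation against a dataset reduces to running each example through the model and scoring the output against a reference. The sketch below uses a stand-in lambda instead of a real Gemini call, and an invented two-example dataset; exact match is the simplest of the available metrics.

```python
# Hypothetical dataset-based prompt evaluation with an exact-match metric.
dataset = [
    {"prompt": "2+2=", "reference": "4"},
    {"prompt": "Capital of France?", "reference": "Paris"},
]

def fake_model(prompt):
    """Stand-in for a model call; a real harness would call a hosted model."""
    canned = {"2+2=": "4", "Capital of France?": "Paris"}
    return canned[prompt]

def exact_match_score(model, examples):
    """Fraction of examples where the model output equals the reference."""
    hits = sum(model(e["prompt"]).strip() == e["reference"] for e in examples)
    return hits / len(examples)
```

Real evaluation runs typically add fuzzier metrics (ROUGE, model-graded scoring) on top of the same loop.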
Model Garden is the catalog from which models are picked. As of early 2026, Model Garden listed more than 200 models across three buckets:
| Bucket | Examples |
|---|---|
| Google first-party | Gemini 3.1 Pro, Gemini 3 Flash, Gemini 3 Pro Image, Imagen 3, Imagen 4, Veo 3 Generate, Veo 3 Fast, Veo 3.1 Lite, Lyria (music), Chirp (speech), Gemma open models, MedLM, Codey |
| Third-party partner | Anthropic Claude Haiku 4.5, Anthropic Claude Sonnet 4.6, Mistral Large (24.11), Mistral Codestral, AI21 Jamba |
| Open-source / community | Meta Llama 3.x and Llama 4, Falcon, Stable Diffusion variants, BERT, T5, and a long tail of fine-tunable Hugging Face models |
Partner models such as Claude and Mistral are sold under a model-as-a-service (MaaS) arrangement: Google bills the customer, the partner runs the actual inference (or in some cases Google runs it on the partner's behalf), and the customer never has to provision GPUs. Vertex AI was the first non-Anthropic cloud to offer Claude models in this MaaS shape when Claude 2 launched in Model Garden in August 2023.
In early 2026 Google added prompt caching with a one-hour time-to-live (TTL) for Anthropic Claude models on Vertex AI, and listed several older versions (including Claude 3.7 Sonnet, deprecated November 11, 2025 with shutdown scheduled for May 11, 2026) as part of its rolling model lifecycle policy.
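The mechanics of a fixed time-to-live cache like the one described above can be sketched in a few lines. This is a generic illustration, not Anthropic's or Google's implementation; timestamps are passed in explicitly to keep the sketch deterministic.

```python
# Minimal TTL cache: an entry is reusable only while younger than the TTL.
TTL_SECONDS = 3600  # the one-hour window described in the text

class TTLCache:
    def __init__(self, ttl=TTL_SECONDS):
        self.ttl = ttl
        self._store = {}  # key -> (value, created_at)

    def put(self, key, value, now):
        self._store[key] = (value, now)

    def get(self, key, now):
        """Return the cached value, or None if missing or expired."""
        entry = self._store.get(key)
        if entry is None:
            return None
        value, created = entry
        if now - created > self.ttl:
            del self._store[key]  # evict the stale entry
            return None
        return value
```

In prompt caching the key is (roughly) a hash of a long shared prompt prefix, and a hit lets the provider skip re-processing that prefix, which is why cached input tokens are billed at a discount.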
A distinguishing feature of Vertex AI relative to its competitors is its tight coupling with BigQuery, Google's serverless data warehouse. The two services share the same data, the same IAM, and the same billing. The most visible integration point is BigQuery ML's CREATE MODEL statement, which supports linear and logistic regression, k-means clustering, matrix factorization, gradient-boosted trees, deep neural networks, AutoML Tables, and remote inference against Vertex AI endpoints (so a BigQuery query can call out to a deployed Gemini model). Models trained in BigQuery ML can be registered in the Vertex AI Model Registry with a single command and then served on a Vertex endpoint. This integration is the main reason Vertex AI is often described as the strongest of the three hyperscaler ML platforms for organizations that already run their analytics on BigQuery.
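The shape of the BigQuery ML statements involved looks roughly as follows. The dataset, table, and model names (`shop.orders`, `shop.churn_model`) are invented for illustration; in practice these strings would be submitted through the BigQuery client rather than held in a script.

```python
# Illustrative BigQuery ML statements (hypothetical table and model names).
# Training: a logistic regression on a label column, straight from SQL.
create_model = """
CREATE OR REPLACE MODEL `shop.churn_model`
OPTIONS (model_type = 'logistic_reg', input_label_cols = ['churned']) AS
SELECT * FROM `shop.orders`
"""

# Inference: score new rows with ML.PREDICT against the trained model.
predict = """
SELECT * FROM ML.PREDICT(
  MODEL `shop.churn_model`,
  (SELECT * FROM `shop.orders_new`)
)
"""
```

Because training, scoring, and the data itself all live in the warehouse, no data movement or separate training cluster is needed for this class of model.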
Vertex AI is priced on a pay-as-you-go basis. Customers are billed for the underlying compute (training, prediction, notebooks), for storage of artifacts and features, for managed services (Model Monitoring, Pipelines), and, for generative AI, for tokens, characters, images, or seconds of generated content depending on the model. There is no per-seat license fee and no fixed platform charge.
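Per-token generative AI billing reduces to simple arithmetic over metered usage. The rates below are made-up placeholders, not Google's actual prices, which vary by model and change over time; the point is only the shape of the calculation.

```python
# Hypothetical per-token pricing sketch; rates are illustrative only.
RATE_PER_1K_INPUT = 0.000125   # dollars per 1,000 input tokens (assumed)
RATE_PER_1K_OUTPUT = 0.000375  # dollars per 1,000 output tokens (assumed)

def request_cost(input_tokens, output_tokens):
    """Cost in dollars of one request under the assumed per-token rates."""
    return (input_tokens / 1000) * RATE_PER_1K_INPUT \
         + (output_tokens / 1000) * RATE_PER_1K_OUTPUT
```

Output tokens are typically billed at a higher rate than input tokens, so long generations dominate the cost of long prompts at equal token counts.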
Academic studies in 2024 found Vertex AI's tabular AutoML and custom training to be cost-competitive with comparable AWS SageMaker offerings, with BigQuery ML producing notably lower cost for SQL-driven workloads at the expense of less flexible model architectures.
Vertex AI competes most directly with Amazon's SageMaker, Microsoft's Azure Machine Learning, and the data-and-AI platform sold by Databricks. The table below summarizes the most-cited differences in 2024 to 2026 industry analyses.
| Capability | Vertex AI | AWS SageMaker | Azure Machine Learning |
|---|---|---|---|
| Default developer experience | Single unified API and console | Service-by-service APIs (SageMaker Studio is the integrated IDE) | Studio-based IDE with strong drag-and-drop UX |
| Managed notebook surface | Workbench and Colab Enterprise | SageMaker Studio Notebooks, SageMaker Notebook Instances | Azure ML Compute Instances, Synapse notebooks |
| Pipeline orchestrator | Vertex AI Pipelines (Kubeflow / TFX) | SageMaker Pipelines | Azure ML Pipelines |
| Feature store | Vertex AI Feature Store, BigQuery-backed | SageMaker Feature Store (offline in S3, online in DynamoDB) | Azure ML Feature Store |
| Model registry | Vertex AI Model Registry | SageMaker Model Registry | Azure ML Model Registry |
| Online inference | Vertex AI Endpoints (CPU, GPU, TPU) | SageMaker Endpoints, Inference Recommender | Azure ML Online Endpoints |
| Hyperparameter tuning | Vertex AI Vizier | SageMaker Automatic Model Tuning | Azure ML HyperDrive |
| Vector search | Vertex AI Vector Search (ScaNN) | OpenSearch / SageMaker JumpStart embeddings | Azure AI Search vector indexes |
| First-party foundation models | Gemini, Imagen, Veo, Lyria, Chirp, Gemma | Amazon Titan, Nova family | Phi family (and a deep partnership with OpenAI for GPT models) |
| Headline third-party catalogue | Claude, Llama, Mistral via Model Garden MaaS | Claude, Llama, Mistral via Bedrock | OpenAI GPT family via Azure OpenAI Service, plus Llama, Mistral, Cohere via Azure AI Foundry |
| Tightest data warehouse integration | BigQuery (BigQuery ML, Feature Store v2) | Redshift ML, S3-native | Microsoft Fabric, Synapse |
| Strongest fit | Teams already using BigQuery and Google data services; teams that want managed access to Gemini | AWS-native shops; teams that need petabyte-scale custom training | Microsoft-shop enterprises; regulated industries needing strong governance and hybrid deployment |
Industry analysts have generally placed all three platforms in the leader quadrant of their respective Magic Quadrant and Forrester Wave reports. Google was named a Leader in the 2024 Gartner Magic Quadrant for Cloud AI Developer Services for the fifth consecutive year, in Forrester's Q3 2024 Wave on AI/ML Platforms, and in the 2025 IDC MarketScape for Worldwide GenAI Life-Cycle Foundation Model Software.[6][8]
Databricks, although not a hyperscaler, competes with Vertex AI on classical MLOps and on its own DBRX foundation models, and is frequently picked by organizations that want a single platform spanning data engineering, BI, and ML. Many large enterprises run a mix of Vertex AI and Databricks (or SageMaker, or Azure ML) for different teams.
Vertex AI is used across retail, finance, manufacturing, media, healthcare, and the public sector. Three publicly documented enterprise deployments illustrate the platform's range.
Wayfair, the U.S. online furniture retailer, adopted Vertex AI in 2021 to support roughly 80 data science teams and more than 100 data scientists. The company moved its hyperparameter tuning workflow from a two-week cycle to a one-day cycle, and shrank its model deployment timeline from about two months to roughly two weeks, by standardizing on Vertex AI Pipelines and the Vertex AI Feature Store. Use cases include demand forecasting, customer support routing through NLP on chat messages, and real-time feature serving that powers product search and recommendations for the company's roughly 30 million active customers.[5]
Automatic Data Processing (ADP), the global payroll and HR services company, runs ML and generative AI workloads across multiple clouds, including Google Cloud, as part of its "One AI" and "One Data" strategy. ADP uses Vertex AI components alongside other tools to deliver enterprise generative AI capabilities for HR and payroll services, with a focus on data quality, IP protection, and cost control across global operations.
The U.S. home improvement retailer Lowe's is one of several large retailers (alongside Wayfair and Staples) that has used Google Cloud's generative AI services on Vertex AI to scale visual content generation for product listings and to improve search relevance. The retailer uses Vertex AI for visual search, product imagery generation, and conversational shopping assistance.
Other named Vertex AI customers include Ford, Seagate, Cash App, Cruise, ModiFace (a L'Oreal subsidiary that uses the platform for AR-based virtual try-on), Essence (a WPP-owned media agency), Sabre, and Iron Mountain. Public sector and regulated-industry customers consume Vertex AI through the same APIs but with additional controls (Customer-Managed Encryption Keys, VPC Service Controls, HIPAA-aligned configurations, FedRAMP High in selected regions).
Vertex AI runs on Google Cloud's standard global infrastructure, with regional endpoints in most Google Cloud regions and multi-regional endpoints for serving Generative AI models with redundancy. Training jobs can target Compute Engine machine types backed by NVIDIA GPUs (A100, H100, B200) or by Google's own Tensor Processing Units (TPU v5e, v5p, and Trillium / TPU v6, depending on region). Foundation model training and serving for Gemini happens on TPUs by default; Imagen and Veo also rely heavily on TPUs.
Networking and security features include private service connect, VPC Service Controls, customer-managed encryption keys (CMEK), AXT (Access Transparency) logging, and IAM-based access control on every component, from individual notebook instances to model versions and tuned model assets. Audit logs go through Cloud Audit Logs, and metrics flow into Cloud Monitoring.
The initial 2021 launch was generally well received by industry analysts, who interpreted Vertex AI as Google's answer to several years of Amazon SageMaker-driven enterprise momentum. Bradley Shimmin, then chief analyst at Omdia, said at the time that "enterprise data science practitioners hoping to put AI to work across the enterprise aren't looking to wrangle tooling. It takes a supportive infrastructure capable of unifying the user experience."[1]
Developers have praised Vertex AI for its tight integration with BigQuery, its early access to Gemini, and its breadth of partner models in Model Garden, while criticisms have focused on documentation depth, the steep ramp into the more advanced MLOps components, and the speed at which Google renames and reorganizes products (Generative AI Studio became Vertex AI Studio, Matching Engine became Vector Search, and the platform itself became the Gemini Enterprise Agent Platform within five years of launch).