Snowflake AI refers to the suite of artificial intelligence and machine learning capabilities developed by Snowflake Inc., a cloud computing company founded in 2012. Originally known for its cloud-based data warehousing platform, Snowflake has expanded aggressively into AI since 2023, building and acquiring technologies that bring generative AI, large language models, embedding models, and ML tools directly into its data platform. Snowflake's AI strategy centers on the idea that enterprises can get the most value from AI when it runs close to their data, governed by the same security and access controls they already use.
Snowflake Inc. was founded in July 2012 in San Mateo, California, by Benoit Dageville, Thierry Cruanes, and Marcin Zukowski. Dageville and Cruanes were former data architects at Oracle Corporation, while Zukowski co-founded the analytical database company Vectorwise. The company launched its first product, a cloud-native data warehouse, in June 2015. Snowflake's core architectural innovation was the separation of storage and compute, allowing each to scale independently on public cloud infrastructure including Amazon Web Services, Microsoft Azure, and Google Cloud Platform.
On September 16, 2020, Snowflake went public on the New York Stock Exchange in what was then the largest software IPO in history, raising approximately $3.4 billion at a valuation of $33.2 billion. Shares more than doubled on the first day of trading.
In February 2024, Sridhar Ramaswamy became CEO of Snowflake, succeeding Frank Slootman. Ramaswamy, who previously spent 15 years at Google leading its advertising products and later co-founded the AI search startup Neeva, has been credited with shifting Snowflake's strategic narrative from a pure data warehouse provider to an AI data cloud. As of early 2026, Snowflake serves over 12,000 customers and generated approximately $4.4 billion in annual revenue for fiscal year 2026 (ending January 31, 2026).
In March 2022, Snowflake acquired Streamlit, an open-source Python framework for building data applications, in a deal announced at approximately $800 million. Streamlit had launched in 2019 and, at the time of acquisition, had over 8 million downloads and more than 1.5 million applications built with its framework. The deal closed on March 31, 2022; the recorded purchase consideration was approximately $650.8 million, comprising $211.8 million in cash and 1.9 million shares of Class A common stock.
Streamlit is now deeply integrated into Snowflake's platform, enabling data scientists and developers to build interactive data apps and AI-powered dashboards directly within Snowflake. Streamlit in Snowflake allows users to create and deploy apps without managing external infrastructure.
In May 2023, Snowflake acquired Neeva, an AI-powered search engine startup founded in 2019 by Sridhar Ramaswamy and Vivek Raghunathan, both former Google advertising executives. Neeva had originally built a subscription-based, ad-free search engine focused on user privacy but pivoted to enterprise AI and large language model applications shortly before the acquisition. Financial terms were not publicly disclosed.
The Neeva acquisition brought critical AI search expertise and talent into Snowflake. The technology and team from Neeva played a foundational role in building Cortex Search, Cortex Analyst, and other AI services within the Snowflake platform. Ramaswamy, who had joined Snowflake as Senior Vice President of AI following the Neeva deal, was promoted to CEO in February 2024.
Snowflake Arctic is an open-source large language model released on April 24, 2024, designed specifically for enterprise workloads such as SQL generation, coding, and instruction following. It was developed by Snowflake's AI Research team and released under the Apache 2.0 license.
Arctic uses a Dense-MoE Hybrid transformer architecture that combines a 10 billion parameter dense transformer model with a residual mixture of experts (MoE) multilayer perceptron (MLP). The MoE component consists of 128 fine-grained experts, each with 3.66 billion parameters. This results in a total of 480 billion parameters, of which only 17 billion are active during inference, selected through top-2 gating.
Arctic's 128 experts notably exceed the 8 to 16 found in typical MoE architectures. This "many-but-condensed" approach allows for more specialized expert activation while keeping inference costs low: during inference, Arctic activates roughly 50% fewer parameters than DBRX and 75% fewer than Llama 3 70B.
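The arithmetic behind the active-parameter figure can be sketched: with top-2 gating, each token passes through the 10B-parameter dense component plus just 2 of the 128 experts. The toy router below uses random logits as a stand-in for a learned gating network; it is a minimal illustration of top-2 gating, not Arctic's actual implementation.

```python
import math
import random

NUM_EXPERTS = 128           # Arctic's fine-grained expert count
TOP_K = 2                   # top-2 gating
DENSE_PARAMS = 10e9         # dense transformer component
PARAMS_PER_EXPERT = 3.66e9  # each MoE expert

def top2_route(logits):
    """Pick the two highest-scoring experts and renormalize their weights."""
    probs = [math.exp(x) for x in logits]
    total = sum(probs)
    probs = [p / total for p in probs]
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:TOP_K]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

random.seed(0)
logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]  # stand-in for a router
chosen = top2_route(logits)

total_params = DENSE_PARAMS + NUM_EXPERTS * PARAMS_PER_EXPERT
active_params = DENSE_PARAMS + TOP_K * PARAMS_PER_EXPERT
print(f"experts chosen for this token: {[i for i, _ in chosen]}")
print(f"total parameters:  {total_params / 1e9:.0f}B")   # ~478B, reported as 480B
print(f"active parameters: {active_params / 1e9:.1f}B")  # ~17.3B, reported as 17B
```

The counts match the figures above: 10B + 128 × 3.66B ≈ 480B total, while only 10B + 2 × 3.66B ≈ 17B participate in any one forward pass.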
Arctic was trained on 3.5 trillion tokens using a cluster of over 1,000 GPUs. The entire model was built from scratch in approximately three months at a computational cost of roughly $2 million, which Snowflake reported as about one-eighth of the training costs of comparable LLMs. The training infrastructure was built on top of the DeepSpeed library, using ZeRO-2 optimization and expert-parallelism for efficient large-scale MoE model training.
A key training innovation was the use of a "dynamic data curriculum" that adjusted the balance between code and natural language data over time, mimicking human learning patterns. This curriculum approach contributed to improvements in both language understanding and reasoning capabilities.
Arctic was benchmarked against several open-source models at the time of release. The following table summarizes its performance on key enterprise and general benchmarks.
| Benchmark | Arctic | Llama 3 70B | DBRX Instruct | Mixtral 8x22B |
|---|---|---|---|---|
| MMLU (Knowledge) | 67.3% | 79.8% | 73.3% | 77.5% |
| Spider (SQL) | 79.0% | 80.2% | 76.3% | 79.2% |
| HumanEval+ (Coding) | 64.3% | 71.9% | 61.0% | 69.9% |
| GSM8K (Math) | 74.2% | 91.4% | n/a | 84.2% |
| IFEval (Instruction Following) | 52.4% | n/a | 27.6% | n/a |
| Commonsense Reasoning | 73.1% | n/a | n/a | n/a |
While Arctic does not match Llama 3 70B on general-knowledge benchmarks such as MMLU, it delivers competitive performance on enterprise-focused tasks (SQL, coding, instruction following) while using significantly less compute. Snowflake emphasized that Arctic achieved parity with Llama 3 70B on enterprise metrics despite using roughly one-seventeenth the training compute.
Snowflake Arctic Embed is a family of open-source text embedding models optimized for retrieval tasks, first released on April 16, 2024. The models are available on Hugging Face under an Apache 2.0 license and achieved top rankings on the Massive Text Embedding Benchmark (MTEB) Retrieval leaderboard at each size category upon release.
| Model | Parameters | Embedding Dimensions | Max Tokens | MTEB Retrieval (NDCG@10) |
|---|---|---|---|---|
| arctic-embed-xs | 22M | 384 | 512 | n/a |
| arctic-embed-s | 33M | 384 | 512 | n/a |
| arctic-embed-m | 110M | 768 | 512 | n/a |
| arctic-embed-m-long | 137M | 768 | 2048 (8192 with RPE) | 54.83 |
| arctic-embed-l | 335M | 1024 | 512 | 55.98 |
The largest model, arctic-embed-l, was the only model with fewer than 1 billion parameters to surpass an average MTEB retrieval score of 55.9 at the time of release. Models with comparable retrieval quality typically had over 1 billion parameters or were closed-source.
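NDCG@10, the metric behind these rankings, rewards placing relevant documents near the top of the first ten results: each result's relevance is discounted by its rank, and the sum is normalized against the best possible ordering. A minimal implementation (the relevance labels are invented for illustration):

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k ranked results."""
    return sum(rel / math.log2(rank + 2)  # rank 0 discounts by log2(2) = 1
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, k=10):
    """NDCG@k: DCG of the system's ranking divided by DCG of the ideal ranking."""
    ideal_dcg = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal_dcg if ideal_dcg else 0.0

# Hypothetical relevance labels for the first ten results of one query
# (1 = relevant document, 0 = not relevant).
system_ranking = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0]
print(f"NDCG@10 = {ndcg_at_k(system_ranking):.4f}")
```

Benchmark scores such as the 55.98 above are this per-query value averaged over all queries in the MTEB retrieval datasets.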
In July 2024, Snowflake released arctic-embed-m-v1.5, which introduced highly compressible embedding vectors capable of preserving quality even when compressed to as small as 128 bytes per vector.
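Snowflake has not published the exact compression recipe, but the arithmetic works out if a vector is truncated to 256 dimensions and each component is scalar-quantized to 4 bits: 256 × 0.5 bytes = 128 bytes. A minimal sketch of 4-bit scalar quantization, under that assumption (the truncation to 256 dimensions and all names here are illustrative, not Snowflake's method):

```python
import random

def quantize_int4(vector):
    """Scalar-quantize each component to 4 bits (codes 0..15).
    Two codes pack into one byte, so n dims -> n/2 bytes."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 15 or 1.0
    codes = [round((x - lo) / scale) for x in vector]
    packed = bytearray()
    for i in range(0, len(codes), 2):  # assumes an even number of dims
        packed.append((codes[i] << 4) | codes[i + 1])
    return bytes(packed), lo, scale

def dequantize_int4(packed, lo, scale):
    """Recover approximate float components from the packed codes."""
    out = []
    for byte in packed:
        out.append(lo + (byte >> 4) * scale)
        out.append(lo + (byte & 0x0F) * scale)
    return out

random.seed(1)
vec = [random.uniform(-1, 1) for _ in range(256)]  # vector truncated to 256 dims
packed, lo, scale = quantize_int4(vec)
print(f"{len(vec)} floats -> {len(packed)} bytes")  # 256 dims -> 128 bytes
```

The per-component error of this scheme is bounded by half the quantization step, which is why retrieval quality can survive such aggressive compression when the embedding model is trained with compression in mind.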
In December 2024, Snowflake released Arctic Embed 2.0, adding two multilingual variants that support multilingual text retrieval without sacrificing English-language performance.
Snowflake Cortex AI is the company's managed suite of AI services that allows users to run generative AI workloads directly within the Snowflake platform. Cortex AI reached general availability in November 2025 and provides access to large language models from multiple providers, including Anthropic (Claude), OpenAI (GPT), Meta (Llama), Mistral AI, and Snowflake's own Arctic model.
Cortex AI exposes several SQL-callable AI functions that enable users to run unstructured data analytics without leaving Snowflake:
| Function | Description |
|---|---|
| AI_COMPLETE | Generates a text or image completion using a selected LLM |
| AI_CLASSIFY | Classifies text or images into user-defined categories |
| AI_EXTRACT | Extracts structured information from text, documents, and images |
| AI_TRANSLATE | Translates text between languages |
| AI_TRANSCRIBE | Transcribes audio and video files with timestamps and speaker identification |
These functions can be called directly in SQL queries, allowing analysts to process unstructured data at scale alongside their structured data pipelines.
Separate from the LLM-powered functions, Cortex includes built-in ML functions that operate on structured data using traditional machine learning techniques. These are accessible through SQL without requiring Python or external ML tooling.
| Function Category | Capabilities |
|---|---|
| Forecasting | Predicts future metric values from time-series data |
| Anomaly Detection | Flags metric values that deviate from expected patterns |
| Contribution Explorer | Identifies drivers behind changes in time-series metrics |
| Classification | Sorts rows into classes based on predictive features |
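As a conceptual stand-in for what the Anomaly Detection function does, the sketch below flags time-series points that deviate sharply from the rest using a simple z-score rule. The threshold and data are invented, and Snowflake's service trains a model per series rather than applying a fixed statistical rule; this only illustrates the idea of "flagging values that deviate from expected patterns."

```python
import statistics

def detect_anomalies(series, threshold=2.0):
    """Return indexes of points more than `threshold` standard deviations
    from the series mean. A toy stand-in for model-based anomaly detection."""
    mean = statistics.fmean(series)
    stdev = statistics.stdev(series)
    return [i for i, x in enumerate(series) if abs(x - mean) > threshold * stdev]

# Hypothetical daily sales metric with one obvious spike.
daily_sales = [100, 102, 98, 101, 99, 103, 97, 100, 250, 101]
print(detect_anomalies(daily_sales))  # → [8]
```

In Snowflake, the equivalent workflow is expressed in SQL against a table of timestamped metric values, with the model handling seasonality and trend that a fixed z-score cannot.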
Cortex Search is a managed hybrid search service that combines vector search, keyword search, and semantic reranking to power retrieval-augmented generation (RAG) and search-driven applications. It reached general availability in October 2024.
Cortex Search can be deployed with a single SQL statement and automatically handles embedding generation, index creation, and ongoing index refreshes. Snowflake has reported that Cortex Search outperforms enterprise search tools such as Azure AI Search, Elasticsearch, and AWS OpenSearch by up to 15% on NDCG@10 across benchmarks covering product search, email search, technical search, and web search scenarios.
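Snowflake has not detailed how Cortex Search fuses its vector and keyword rankings, but reciprocal rank fusion (RRF) is a common way to combine ranked lists from heterogeneous retrievers and serves as a sketch of the hybrid idea (the document ids and rankings below are invented):

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.
    Each document scores sum(1 / (k + rank)) over the lists it appears in;
    k = 60 is the constant from the original RRF paper."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc_b", "doc_a", "doc_d"]    # from embedding similarity
keyword_hits = ["doc_a", "doc_c", "doc_b"]   # from keyword (BM25-style) matching
print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
# → ['doc_a', 'doc_b', 'doc_c', 'doc_d']
```

Documents ranked well by both retrievers rise to the top, which is the core benefit of hybrid search over either vector or keyword retrieval alone; a semantic reranker, as in Cortex Search, can then reorder the fused candidates.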
Cortex Analyst is a managed text-to-SQL service that enables business users to ask natural language questions about their structured data and receive SQL-generated answers. It is exposed as a REST API and can be integrated into custom applications.
Cortex Analyst uses an agentic AI architecture powered by state-of-the-art LLMs. Unlike generic text-to-SQL tools that rely solely on database schema, Cortex Analyst uses a semantic model (defined in a lightweight YAML file or, more recently, through Semantic Views) to capture business logic, metric definitions, and domain-specific terminology. Snowflake has reported over 90% SQL accuracy on real-world use cases and claims the system is nearly twice as accurate as single-prompt SQL generation from GPT-4o.
Cortex Agents, which reached general availability in November 2025, provide agentic orchestration across both structured and unstructured data sources. An agent can use Cortex Analyst (for structured queries) and Cortex Search (for unstructured retrieval) as tools, coordinating multistep tasks that span different data types.
Key orchestration capabilities include planning (decomposing complex requests into subtasks), tool use (routing to the appropriate data source), and reflection (evaluating intermediate results before generating a final response). Cortex Agents can be integrated into Microsoft Teams, custom applications, and other enterprise workflows.
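The plan, tool-use, reflect loop can be sketched with stub tools. Everything below (the tool stubs, their return values, and the keyword-based router) is a toy stand-in: the real service delegates planning and reflection to an LLM rather than to hand-written rules.

```python
def cortex_analyst_stub(question):
    """Stand-in for a structured-data tool (text-to-SQL)."""
    return {"tool": "analyst", "answer": "$1.2M revenue last quarter"}

def cortex_search_stub(question):
    """Stand-in for an unstructured-retrieval tool."""
    return {"tool": "search", "answer": "3 support tickets mention the outage"}

def plan(question):
    """Toy planner: route metric-style questions to the analyst tool and
    everything else to search. A real agent makes this choice with an LLM."""
    metric_words = {"revenue", "sales", "count", "average", "total"}
    if metric_words & set(question.lower().split()):
        return cortex_analyst_stub
    return cortex_search_stub

def run_agent(question):
    tool = plan(question)    # planning: decompose/route the request
    result = tool(question)  # tool use: call the selected data source
    if not result["answer"]:  # reflection: fall back to the other tool
        other = cortex_search_stub if tool is cortex_analyst_stub else cortex_analyst_stub
        result = other(question)
    return result

print(run_agent("What was total revenue last quarter?"))
```

A production agent would iterate this loop over multiple subtasks, combining analyst-generated SQL results with search-retrieved passages into a single answer.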
Snowflake Intelligence is an enterprise intelligence agent that became generally available in November 2025. Accessible via ai.snowflake.com, it allows any employee to ask complex questions about enterprise data in natural language. It connects to both structured data (tables and records) and unstructured data (documents, transcripts, conversations) and generates insights through a Deep Research Agent for Analytics.
Snowflake Intelligence is powered by AI models from providers like Anthropic and automatically respects all existing role-based access controls, data masking policies, and governance rules.
Cortex Fine-tuning is a fully managed, serverless service that enables users to fine-tune supported LLMs on their own data, all within the Snowflake platform. The service uses parameter-efficient fine-tuning (PEFT) techniques such as LoRA to adapt pre-trained base models to domain-specific tasks.
Fine-tuning is initiated through a SQL function call (FINETUNE) with subcommands for creating, monitoring, describing, and canceling fine-tuning jobs. Training data must reside in a Snowflake table or view with columns named prompt and completion. Supported base models include variants of Meta's Llama 3 (8B and 70B parameter versions) and Mistral AI models (such as Mistral 7B).
The service is designed to allow smaller models, once fine-tuned, to match the accuracy of much larger models on specific tasks at a fraction of the inference cost.
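The economics rest on LoRA's parameter arithmetic: instead of updating a full weight matrix, LoRA trains two small low-rank factors whose product approximates the update. A quick calculation (the layer size and rank are illustrative, not Snowflake's settings):

```python
def lora_trainable_params(d_in, d_out, rank):
    """LoRA replaces a full d_in x d_out weight update with two low-rank
    factors A (d_in x r) and B (r x d_out), so only r * (d_in + d_out)
    parameters are trained instead of d_in * d_out."""
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return full, lora

# Illustrative numbers: one 4096 x 4096 projection (a Llama-class layer size).
full, lora = lora_trainable_params(4096, 4096, rank=16)
print(f"full update: {full:,} params; LoRA (r=16): {lora:,} params "
      f"({100 * lora / full:.2f}% of full)")
```

Training well under 1% of the weights per adapted layer is what makes serverless fine-tuning cheap enough to offer as a SQL-invoked managed service.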
Snowpark ML (also referred to as Snowflake ML) is Snowflake's integrated platform for end-to-end machine learning development, providing tools for feature engineering, model training, deployment, and monitoring, all without moving data out of Snowflake.
| Component | Description |
|---|---|
| Snowpark ML Modeling API | Enables use of popular Python frameworks (scikit-learn, XGBoost, LightGBM) for feature engineering and model training with distributed execution |
| Feature Store | Manages, stores, and discovers ML features with automated incremental refresh from batch and streaming data |
| Model Registry | Centralized registry for versioning, deploying, and managing trained models |
| ML Observability | Monitors model performance metrics, tracks drift, and supports alerting for production models |
| ML Lineage | Traces end-to-end lineage from source data to features, datasets, and models |
| GPU Acceleration | Integrates NVIDIA cuML and cuDF for up to 200x speedups on scikit-learn and pandas workloads |
Snowflake Notebooks on Container Runtime provide a Jupyter-like environment for training and fine-tuning large-scale models within Snowflake, with preinstalled packages such as PyTorch, XGBoost, and scikit-learn.
Snowflake Document AI (now part of the Cortex AI Functions suite) uses optical character recognition (OCR) and large language models to extract structured data from unstructured documents. The core function, AI_EXTRACT, reached general availability in October 2025.
Document AI supports the following file formats: PDF, PNG, JPEG, JPG, DOCX, EML, HTM, HTML, TXT, TIF, and TIFF.
Extraction can produce results in entity format (answering natural language questions), list/array format (using JSON schemas), or table format (specifying column structures). The service handles handwriting recognition, table extraction, and checkbox detection.
The following table summarizes the major AI and ML products and services within the Snowflake platform.
| Product | Category | Description | Availability |
|---|---|---|---|
| Snowflake Arctic | Open-source LLM | 480B parameter MoE model for enterprise tasks | April 2024 |
| Arctic Embed | Open-source embedding models | Family of text embedding models for retrieval | April 2024 |
| Cortex AI Functions | Managed LLM services | SQL-callable AI functions (AI_COMPLETE, AI_CLASSIFY, AI_EXTRACT, AI_TRANSLATE, AI_TRANSCRIBE) | GA November 2025 |
| Cortex ML Functions | Managed ML services | Forecasting, anomaly detection, classification, contribution explorer | GA |
| Cortex Search | Managed search service | Hybrid vector + keyword search with semantic reranking for RAG | GA October 2024 |
| Cortex Analyst | Text-to-SQL service | Natural language to SQL conversion using semantic models | Preview August 2024 |
| Cortex Agents | Agentic orchestration | Multi-step task orchestration across structured and unstructured data | GA November 2025 |
| Cortex Fine-tuning | Model customization | Serverless LLM fine-tuning with LoRA | GA |
| Snowflake Intelligence | Enterprise intelligence agent | Natural language data analytics for all employees | GA November 2025 |
| Snowpark ML | ML development platform | End-to-end ML with feature store, model registry, and observability | GA |
| Document AI | Document extraction | OCR and LLM-powered structured data extraction from documents | GA October 2025 |
| Streamlit in Snowflake | App development | Interactive data app and dashboard builder | GA |
Snowflake's AI strategy places it in direct competition with several major platforms:
Databricks is widely considered Snowflake's closest competitor. Originally focused on data engineering and data science through Apache Spark, Databricks has expanded into data warehousing (with its lakehouse architecture) and generative AI. Databricks released its own open-source LLM, DBRX, in March 2024, and acquired MosaicML in 2023 for $1.3 billion to strengthen its AI training capabilities. Databricks held approximately 8.67% of the cloud data warehousing market in early 2025 and is generally considered stronger in data science and ML workflows.
Google BigQuery has the largest customer base among cloud data platforms, with roughly five times as many customers as either Snowflake or Databricks. Google has integrated its Gemini AI models into BigQuery and reported a 16x year-over-year increase in customer use of AI models within BigQuery as of 2025.
Amazon Redshift, part of AWS, held approximately 15% of the data warehousing market in early 2025. AWS offers its own suite of AI services through Amazon SageMaker and Amazon Bedrock, giving Redshift users access to a broad ecosystem of ML tools.
Microsoft Azure Synapse Analytics competes through tight integration with the Microsoft ecosystem, including Azure OpenAI Service and Microsoft Copilot.
Snowflake's differentiation in this competitive landscape rests on three pillars: running AI directly within the data platform (minimizing data movement), providing access to models from multiple providers (rather than being locked to a single model vendor), and maintaining unified governance and security across all AI workloads.