Scale AI is an American artificial intelligence data infrastructure company that provides data labeling, model evaluation, and AI platform services. Founded in 2016 by Alexandr Wang and Lucy Guo, Scale AI began as a data annotation service for autonomous vehicles and has since evolved into one of the most important behind-the-scenes players in the AI industry, providing the training data that powers large language models from companies including OpenAI, Meta, and Microsoft. Headquartered in San Francisco, California, Scale AI reached a valuation of $29 billion in 2025 following a landmark investment from Meta.
Alexandr Wang was born in January 1997 in Los Alamos, New Mexico, to Chinese immigrant parents who both worked as physicists at Los Alamos National Laboratory. Wang showed exceptional aptitude for mathematics and computer science from a young age, winning national programming competitions in high school. He was admitted to the Massachusetts Institute of Technology (MIT) to study mathematics and computer science, but before starting, he took a gap year and moved to Silicon Valley, where he landed a job as a software engineer at Quora.
During his freshman year at MIT, Wang co-founded Scale AI in 2016 with Lucy Guo. The core idea was to build a platform that could produce high-quality labeled data for machine learning models at scale. At the time, the primary customer base was the autonomous vehicle industry, which needed enormous volumes of labeled images, point clouds, and sensor data to train self-driving car systems. Wang dropped out of MIT to focus on the company full-time.
In 2021, at age 24, Wang became the world's youngest self-made billionaire according to Forbes; as of April 2025, Forbes estimated his net worth at approximately $3.6 billion. He was named to the Forbes 30 Under 30 list in the Enterprise Technology category in both 2018 and 2021.
Scale AI's initial business focused heavily on data labeling for autonomous vehicles, with early customers including companies in the self-driving car space. The company built a workforce of human annotators (organized through its subsidiary Remotasks) who labeled images, drew bounding boxes around objects, and annotated sensor data from LiDAR and camera systems.
As the AI landscape shifted with the rise of large language models from 2020 onward, Scale AI recognized that the same fundamental capability, organizing human intelligence to produce high-quality training data, was needed even more urgently for language models. The company pivoted its business increasingly toward LLM training data, including supervised fine-tuning examples, human preference rankings for RLHF, and model evaluation data.
This pivot proved extremely lucrative. As companies like OpenAI, Anthropic, and Meta raced to train increasingly powerful language models, the demand for high-quality human feedback data surged. Scale AI's existing infrastructure for managing large distributed workforces of annotators positioned it perfectly to meet this demand.
As of 2025, Scale AI organizes its product portfolio into three main categories: Build AI, Apply AI, and Evaluate AI [9].
The Scale Data Engine is the company's core product for AI training data. It encompasses the full lifecycle of data preparation for model development:
| Component | Function |
|---|---|
| Data collection | Gathering raw data from diverse sources |
| Data curation | Selecting and organizing training examples |
| Data annotation | Human labeling across modalities (text, image, video, audio) |
| RLHF services | Human evaluators ranking model outputs for reward model training |
| Model evaluation | Testing model performance against benchmarks and edge cases |
| Safety and alignment | Red teaming and safety testing for deployed models |
The Data Engine supports a wide range of AI modalities, from traditional computer vision tasks (bounding boxes, segmentation, keypoint detection) to complex language tasks (instruction following, code generation, creative writing evaluation).
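To make a computer vision task like bounding-box annotation concrete, an annotation is essentially a set of labeled coordinate boxes, and one common way to check label quality is overlap (intersection-over-union) against a trusted reference box. The sketch below is purely illustrative; the box format and field names are generic conventions, not Scale's actual schema.

```python
# Minimal sketch of bounding-box annotation data and an overlap check.
# The [x_min, y_min, x_max, y_max] format is a generic convention,
# not Scale's actual schema.

def area(box):
    """Area of an axis-aligned box [x_min, y_min, x_max, y_max]."""
    return (box[2] - box[0]) * (box[3] - box[1])

def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    union = area(box_a) + area(box_b) - inter
    return inter / union

annotator_box = [10, 10, 110, 110]   # worker's label for "car"
reference_box = [20, 20, 120, 120]   # trusted gold-standard label
print(round(iou(annotator_box, reference_box), 3))
```

A high IoU against the reference indicates an accurate label; thresholding on this kind of metric is one generic way annotation pipelines distinguish acceptable from unacceptable boxes.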
The Scale GenAI Platform (also called the Generative AI Data Engine) is specifically designed for companies building and deploying generative AI applications. It provides tools for fine-tuning models on custom data, evaluating model performance, and managing the data pipelines that feed into generative AI systems. The platform targets enterprise customers who want to build custom AI applications on top of foundation models.
Scale Donovan is a decision-support platform that applies generative AI to unstructured data and geospatial feeds to surface recommendations at "mission speed." Originally developed for defense and intelligence applications, Donovan integrates LLM-based tools including geospatial chat, text-to-API, and retrieval-augmented generation to enable public sector teams to access verified intelligence quickly [9].
Donovan's capabilities include:
| Capability | Description |
|---|---|
| Geospatial analysis | AI-powered analysis of satellite imagery and geographic data |
| Document summarization | Automated summarization of intelligence reports and lengthy documents |
| Report generation | Templated report creation using AI synthesis |
| Document translation | Multi-language translation of uploaded documents |
| Data integration | Aggregation of disparate data sources into unified intelligence picture |
The platform runs on secure government servers and can quickly assemble relevant operations or intelligence data for decision-making in contexts ranging from battlefield planning to intelligence gathering. Donovan has been deployed with the U.S. Department of Defense and forms a central part of Scale AI's public sector business.
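Retrieval-augmented generation, one of the LLM techniques Donovan integrates, can be sketched generically: retrieve the documents most relevant to a query, then build a prompt that grounds the model's answer in them. This toy example illustrates the pattern only, not Donovan's implementation; the keyword-overlap scoring is a deliberately simple stand-in for real retrieval.

```python
# Toy sketch of retrieval-augmented generation (RAG): retrieve relevant
# documents, then ground the model's answer in them. Illustrates the
# general pattern only, not Donovan's implementation.

def score(query, doc):
    """Naive relevance score: number of shared lowercase words."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query, corpus, k=2):
    """Return the top-k documents by keyword overlap with the query."""
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query, corpus):
    """Prepend retrieved context so the answer stays grounded in sources."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context."

corpus = [
    "Report A: convoy movements observed near the northern border.",
    "Report B: satellite imagery shows new construction at the port.",
    "Report C: weather forecast predicts heavy rain this week.",
]
print(build_prompt("what do satellite images show at the port?", corpus))
```

Production systems replace the keyword scorer with vector similarity search, but the grounding structure, retrieved context prepended to the query, is the same.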
SEAL is Scale AI's research division focused on evaluating and aligning large language models. Launched in 2023, SEAL produces benchmarks, leaderboards, and evaluation frameworks that help the AI community assess model capabilities and safety properties. The SEAL Leaderboards provide independent rankings of LLM performance across various tasks, offering a neutral third-party evaluation that has become an important reference point for the industry.
In March 2026, the company expanded SEAL into Scale Labs, a broader research division that builds on SEAL's evaluation work while expanding into new areas of AI safety and alignment research.
SEAL has produced several notable evaluation frameworks and benchmark suites [10]:
| Benchmark/Product | Focus Area |
|---|---|
| SEAL Leaderboards | Public rankings of LLM performance across diverse tasks |
| SEAL Showdown | Head-to-head model comparisons based on human evaluation |
| SWE-Atlas | Evaluation framework for software engineering capabilities |
| Voice Showdown | Benchmark for voice and speech AI systems |
| PropensityBench | Evaluation of model tendencies and behavioral patterns |
| 2025 Model of the Year Awards | Annual recognition of top-performing AI models |
SEAL's evaluation work serves a dual purpose. Publicly, it provides the AI community with independent model assessments. Commercially, it positions Scale AI as the authoritative evaluator of model quality, which reinforces its value to customers who need to select and validate models for production deployment.
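Leaderboards built from head-to-head human preference votes are commonly aggregated with Elo-style ratings; the source does not state which scheme SEAL Showdown uses, but the general mechanism can be sketched as follows.

```python
# Sketch of Elo-style rating updates from pairwise human preference votes,
# a common way to turn head-to-head comparisons into a leaderboard.
# (Illustrative only; the source does not specify SEAL Showdown's method.)

def expected_score(r_a, r_b):
    """Probability that A beats B under the Elo model."""
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))

def update(ratings, winner, loser, k=32):
    """Shift both ratings toward the observed outcome of one comparison."""
    surprise = 1 - expected_score(ratings[winner], ratings[loser])
    ratings[winner] += k * surprise
    ratings[loser] -= k * surprise

ratings = {"model_a": 1000.0, "model_b": 1000.0}
# Suppose human evaluators prefer model_a in 3 of 4 comparisons.
for winner, loser in [("model_a", "model_b")] * 3 + [("model_b", "model_a")]:
    update(ratings, winner, loser)
print(ratings)  # model_a ends with the higher rating
```

The key property for a leaderboard is that upsets against a higher-rated model move ratings more than expected wins, so rankings converge as votes accumulate.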
One of Scale AI's most strategically important offerings is its reinforcement learning from human feedback (RLHF) data services. RLHF has become the standard technique for aligning large language models with human preferences and instructions. The process requires large volumes of human-generated comparison data, where evaluators read model outputs and rank them by quality, helpfulness, accuracy, and safety.
Scale AI provides this data at scale, employing trained evaluators who specialize in different domains (coding, mathematics, creative writing, factual knowledge) to produce the nuanced judgments that RLHF requires. The company's RLHF data has been used by several of the leading AI labs to train their flagship models.
The quality of RLHF data directly impacts the quality of the resulting model, making Scale AI's role in the AI supply chain critical. Poor or inconsistent human feedback can lead to models that are unreliable, biased, or misaligned, which is why Scale AI invests heavily in annotator training, quality assurance, and evaluation methodology.
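The ranking data this process produces is typically consumed as pairwise preferences for reward model training. A minimal sketch, assuming a simple list-of-responses-ranked-best-first input (an illustrative format, not any specific lab's schema):

```python
from itertools import combinations

# Sketch: expand a ranked list of model responses (best first) into the
# (chosen, rejected) pairs typically used to train a reward model.
# The data format here is illustrative, not any specific lab's schema.

def ranked_to_pairs(prompt, responses_best_first):
    """Every higher-ranked response is 'chosen' over every lower-ranked one."""
    return [
        {"prompt": prompt, "chosen": chosen, "rejected": rejected}
        for chosen, rejected in combinations(responses_best_first, 2)
    ]

pairs = ranked_to_pairs(
    "Explain recursion.",
    ["clear answer", "partially correct answer", "off-topic answer"],
)
print(len(pairs))  # 3 ranked responses yield 3 pairwise comparisons
```

Because `combinations` preserves input order, each emitted pair keeps the higher-ranked response in the `chosen` slot; a ranking of n responses yields n·(n−1)/2 such pairs, which is why a single evaluator session can produce many training examples.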
Scale AI's approach to RLHF data production follows a structured methodology that prioritizes quality at volume [11]:
| Phase | Description |
|---|---|
| Task design | Define evaluation criteria, rubrics, and scoring guidelines for specific model behaviors |
| Annotator selection | Match tasks to evaluators with relevant domain expertise (coding, math, science, creative writing) |
| AI pre-labeling | Use AI models to generate initial labels, reducing human effort on straightforward cases |
| Human annotation | Trained evaluators rank, compare, or score model outputs based on defined criteria |
| Quality control | Multi-layer QC including inter-annotator agreement checks, gold standard tasks, and statistical auditing |
| Grading and leveling | Workers are scored on accuracy; high performers advance to more complex tasks, low performers are removed |
| Delivery | Processed data is delivered in formats ready for reward model training |
This methodology, which Scale describes as "crowd + AI + rigorous QC," combines the scale of crowdsourced labor with the quality controls of a managed service. Workers on the Remotasks and Outlier platforms are graded continuously, with top performers receiving access to higher-paying, more complex tasks and underperformers being phased out [11].
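One quality-control check named above, inter-annotator agreement, is commonly measured with chance-corrected statistics such as Cohen's kappa for two annotators. The source does not specify which metric Scale uses; this is a generic illustration.

```python
from collections import Counter

# Cohen's kappa: agreement between two annotators, corrected for the
# agreement expected by chance alone. A standard inter-annotator agreement
# statistic (illustrative; the source does not name Scale's exact metric).

def cohens_kappa(labels_a, labels_b):
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        (counts_a[c] / n) * (counts_b[c] / n)
        for c in set(labels_a) | set(labels_b)
    )
    return (observed - expected) / (1 - expected)

a = ["good", "good", "bad", "good", "bad", "good"]
b = ["good", "bad", "bad", "good", "bad", "good"]
print(round(cohens_kappa(a, b), 3))
```

A kappa of 1.0 means perfect agreement and 0 means no better than chance; pipelines like the one described above can flag tasks or annotators whose agreement falls below a chosen threshold for review.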
Scale AI has invested in building specialized annotator pools for domains that require technical expertise:
| Domain | Annotator Requirements | Typical Tasks |
|---|---|---|
| Software engineering | Professional developers, CS degree holders | Code review, debugging evaluation, algorithm comparison |
| Mathematics | Advanced math education or research background | Proof verification, solution ranking, error detection |
| Scientific reasoning | Graduate-level science education | Factual accuracy assessment, methodology evaluation |
| Creative writing | Professional writers, editors | Style evaluation, coherence scoring, originality assessment |
| Multilingual | Native speakers with professional fluency | Translation quality, cultural appropriateness, idiomatic accuracy |
| Legal and compliance | Legal professionals, compliance officers | Regulatory accuracy, policy adherence evaluation |
Scale AI manages its annotation workforce through two primary subsidiaries:
| Subsidiary | Focus | Workforce Profile |
|---|---|---|
| Remotasks | Computer vision and autonomous vehicle data annotation | Large global workforce; task-based compensation |
| Outlier | Data annotation for LLMs and generative AI | Professionals with advanced degrees, industry expertise, native fluency in target languages |
These subsidiaries coordinate a global network of over 200,000 trained annotators who perform the detailed labeling work that underpins Scale AI's products [11]. The workforce model has drawn both praise for creating economic opportunities in developing countries and criticism regarding working conditions and pay rates.
Outlier, which focuses on the higher-complexity LLM training data, recruits contributors with more specialized qualifications. Typical Outlier tasks involve content evaluation, RLHF ranking, and domain-specific assessment work that requires subject matter expertise rather than just general labeling skills. The distinction between Remotasks (volume-oriented, task-based) and Outlier (quality-oriented, expertise-based) reflects the different requirements of computer vision versus language model training data.
Scale AI has built a significant government business, particularly with the U.S. Department of Defense:
| Date | Contract/Partnership |
|---|---|
| 2020 | Initial contract with the U.S. Department of Defense |
| May 2021 | Michael Kratsios (former U.S. CTO under Trump) joins as Managing Director and Head of Strategy |
| February 2025 | Five-year partnership with the Qatari government for AI-powered government services |
| February 2025 | Selected as third-party evaluator of AI models for the U.S. AI Safety Institute |
| March 2025 | Deal with the Department of Defense for the Thunderforge project |
| 2025 | Defense Information Systems Agency (DISA) selects Scale for Project ASCEND |
| 2025 | Contract with Defense Logistics Agency (DLA) for supply chain optimization |
The Thunderforge project aims to use AI to plan and help execute movements of ships, planes, and other military assets, representing a significant expansion of AI applications in defense logistics and operations. The hiring of Michael Kratsios, who had served as Chief Technology Officer of the United States during the Trump administration, signaled Scale AI's serious commitment to government work.
Project ASCEND, awarded by the Defense Information Systems Agency, focuses on deploying generative AI solutions for Defensive Cyber Operations (DCO). The DLA contract uses Scale Donovan to streamline operations and create real-time data insights for the U.S. military's supply chain [10].
Scale AI's defense business has positioned the company at the intersection of AI and national security, a space that carries both lucrative contract opportunities and complex ethical considerations.
Scale AI has raised substantial capital across multiple funding rounds:
| Round | Date | Amount | Valuation | Lead Investors |
|---|---|---|---|---|
| Seed | 2016 | $4.5M | - | Y Combinator, Founders Fund |
| Series A | 2017 | $7.4M | - | Accel Partners |
| Series B | 2019 | $22.5M | - | Index Ventures |
| Series C | 2020 | $100M | ~$3.5B | Founders Fund, Tiger Global |
| Series D | 2021 | $325M | $7.3B | Tiger Global |
| Series E | 2023 | $250M | $7.3B | Various |
| Series F | 2024 | $1B+ | $14B | Various |
| Meta Investment | June 2025 | $14.3B | $29B | Meta Platforms |
The Meta investment in June 2025 was transformative. Meta acquired a 49% stake in Scale AI for $14.3 billion, more than doubling the company's valuation from $14 billion to $29 billion. The deal reflected Meta's strategy of securing access to high-quality AI training data infrastructure.
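As a quick arithmetic check, a 49% stake purchased for $14.3 billion implies a post-money valuation of

$$\frac{\$14.3\ \text{billion}}{0.49} \approx \$29.2\ \text{billion},$$

consistent with the $29 billion figure reported for the deal.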
The investment established a "privileged access" framework for Meta, including priority scheduling for Scale's global workforce and exclusive rights to Scale's SEAL evaluation frameworks for high-stakes reasoning models [12].
Scale AI generated approximately $870 million in revenue in 2024 and projected over $2 billion in revenue for 2025. The company's revenue growth has been driven by the explosive demand for LLM training data, with customers willing to pay premium prices for high-quality, domain-specific human feedback data.
In June 2025, alongside Meta's $14.3 billion investment, Alexandr Wang departed Scale AI to join Meta as Chief AI Officer, leading Meta Superintelligence Labs. Jason Droege, Scale AI's Chief Strategy Officer, assumed the role of Interim CEO. The transition marked a significant moment for both companies: Scale AI's founder moved to lead superintelligence research at one of the world's largest technology companies, while Scale AI continued to operate independently, with Meta as a major shareholder.
Scale AI competes with several companies in the data labeling and AI services space. The global data labeling market is projected to expand from approximately $4.87 billion in 2025 to more than $29 billion by 2032, creating intense competition among providers [13].
| Competitor | Primary Focus | Key Differentiator vs. Scale AI |
|---|---|---|
| Labelbox | Data labeling platform | Platform-first approach; customers bring their own labelers; usage-based billing tied to Labelbox Units (LBUs) |
| Appen | Data annotation services | Pivoting toward "Sovereign AI" solutions for national governments; ~$231M trailing revenue (late 2025) |
| Surge AI | AI training data | Smaller scale, focused on high-quality RLHF data |
| Amazon SageMaker Ground Truth | AWS data labeling | Integrated with AWS ecosystem; programmatic labeling |
| Snorkel AI | Programmatic data labeling | Software-defined labeling; reduces need for human annotators |
| Sama | Data annotation services | Ethical sourcing focus; impact-driven workforce model |
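The market projection cited above implies a steep compound annual growth rate, which helps explain the intensity of the competition. A quick calculation under the stated figures:

```python
# Implied compound annual growth rate (CAGR) for the data-labeling market
# figures cited above: ~$4.87B in 2025 to ~$29B by 2032 (7 years).

def cagr(start, end, years):
    """Constant annual growth rate that takes `start` to `end` in `years`."""
    return (end / start) ** (1 / years) - 1

rate = cagr(4.87, 29.0, 2032 - 2025)
print(f"{rate:.1%}")  # roughly 29% per year
```

Sustained growth near 29% per year would mean the market roughly doubles every two and a half years, leaving room for multiple providers to grow simultaneously.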
Scale AI differentiates through its combination of scale, quality, and breadth of services. While competitors may excel in specific areas (such as programmatic labeling or specific data types), Scale AI offers the broadest end-to-end platform, from initial data collection through RLHF and model evaluation. Its relationships with the leading AI labs and the U.S. government provide significant competitive moats.
The Meta investment reshaped the competitive landscape: several frontier labs reportedly reduced their reliance on Scale over concerns about sharing sensitive training data with a Meta-affiliated provider, creating openings for rival data vendors [13].
Despite these dynamics, Scale AI's scale of operations, quality infrastructure, and government relationships continue to provide significant advantages that competitors have not replicated.
As of early 2026, Scale AI is one of the most valuable private AI companies in the world at a $29 billion valuation. Under interim CEO Jason Droege, the company continues to expand its product offerings, with the launch of Scale Labs broadening its research capabilities. The government business continues to grow, with the Thunderforge defense project, Project ASCEND, and AI Safety Institute evaluation role representing high-profile contracts. With Meta as a major shareholder and customer, and continued demand for high-quality training data from the broader AI industry, Scale AI remains a critical piece of the AI infrastructure ecosystem.