AI21 Labs is an Israeli artificial intelligence company that develops and deploys large language models (LLMs) for enterprise applications. Founded in 2017 by Yoav Shoham, Ori Goshen, and Amnon Shashua, the company is headquartered in Tel Aviv, Israel. AI21 Labs has built a series of foundation models, beginning with the Jurassic family and continuing with the Jamba series, which brought a hybrid Mamba-Transformer architecture to production-scale language modeling. The company also develops Wordtune, a consumer-facing AI writing assistant. As of late 2025, AI21 Labs has raised a total of $336 million in funding and is valued at $1.4 billion.
AI21 Labs was founded in November 2017 in Tel Aviv, Israel, by three co-founders with deep expertise in artificial intelligence and technology entrepreneurship.
Yoav Shoham is a professor emeritus of computer science at Stanford University and a former Principal Scientist at Google. He has made notable contributions to AI, game theory, and multi-agent systems over a career spanning several decades.
Ori Goshen is a technology entrepreneur with over 15 years of experience in product leadership roles. Before co-founding AI21 Labs, Goshen co-founded Crowdx, a network analytics company, and led the development of VoIP products.
Amnon Shashua is a professor of computer science at the Hebrew University of Jerusalem and a prominent Israeli entrepreneur. He co-founded Mobileye, the autonomous driving technology company that Intel acquired in 2017 for approximately $15.3 billion. Shashua also co-founded OrCam, a company focused on assistive technology for the visually impaired, and the digital bank One Zero.
The company's name, AI21, refers to "artificial intelligence for the 21st century." The founders started with a broad vision of building AI systems that could genuinely understand and generate natural language, rather than targeting a specific product from the outset. In January 2019, AI21 Labs raised $9.5 million in a seed funding round. Through its early years, the company operated in stealth mode while developing its core natural language processing technology.
On October 27, 2020, AI21 Labs emerged from stealth with the launch of Wordtune, its first consumer product. Wordtune is an AI-powered writing assistant that can understand the context and meaning of text and suggest paraphrases, rewrites, and alternative phrasings. The product launched as a Chrome browser extension and quickly gained traction among individual users and professionals.
In 2021, AI21 Labs began to expand rapidly. The company completed a $25 million Series A round led by Pitango First in November 2021. That same year, the company launched AI21 Studio and its Jurassic-1 family of language models, entering the competitive landscape alongside OpenAI and other foundation model developers.
In July 2022, AI21 Labs raised $64 million in a Series B round led by Ahren Innovation Capital, bringing its total funding to $118 million at a valuation of $664 million. The funds were directed toward research and development, as well as expanding the company's sales and marketing teams.
In March 2023, the company released Jurassic-2, a significant upgrade to its model family. By August 2023, AI21 Labs had raised $155 million in the first tranche of its Series C round, reaching a valuation of $1.4 billion and achieving unicorn status. Investors in this round included Walden Catalyst, Pitango, SCB10X, b2venture, Samsung Next, and co-founder Amnon Shashua, with Google and NVIDIA also participating.
In November 2023, AI21 Labs extended its Series C with an additional $53 million, bringing the total Series C to $208 million. New investors Intel Capital and Comcast Ventures joined the round, and the company's total funding reached $336 million.
In March 2024, AI21 Labs released Jamba, a model that marked a significant departure from the pure Transformer architecture that had dominated the LLM landscape. Jamba introduced a hybrid architecture combining Mamba state space model (SSM) layers with Transformer attention layers. This made Jamba the first production-grade model to scale the Mamba architecture beyond small experimental sizes.
In August 2024, the company released the Jamba 1.5 model family, scaling the hybrid architecture further with Jamba 1.5 Large (398B total parameters) and Jamba 1.5 Mini (52B total parameters).
In January 2026, AI21 Labs released Jamba 2 under the Apache 2.0 license: a pair of compact models (3B and Mini) focused on enterprise reliability, grounding, and instruction-following.
In late December 2025, multiple reports indicated that NVIDIA was in advanced talks to acquire AI21 Labs for $2 billion to $3 billion. The reported deal was described as primarily an acqui-hire, with NVIDIA seeking to secure AI21's team of roughly 200 employees, most of whom hold advanced degrees and possess specialized expertise in AI research. If completed, the acquisition would represent NVIDIA's fourth significant purchase in Israel and its second-largest after the $7 billion Mellanox acquisition in 2020. As of early 2026, neither company has officially confirmed the transaction.
Jurassic-1 (J1) was AI21 Labs' first family of large language models, announced alongside the AI21 Studio developer platform on August 4, 2021. The family consisted of two autoregressive models:

- Jurassic-1 Jumbo, with 178 billion parameters, the largest model in the family
- Jurassic-1 Large, with 7 billion parameters, a smaller model for faster, lower-cost inference
A distinguishing feature of Jurassic-1 was its vocabulary of approximately 250,000 tokens, roughly five times larger than most existing vocabularies at the time. This vocabulary included multi-word tokens such as common expressions, phrases, and named entities, which improved computational efficiency, reduced latency, and allowed more text to fit within a fixed context window. The larger vocabulary also meant that Jurassic-1 could pack more examples into a prompt for few-shot learning compared to similarly sized models like GPT-3.
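To see why multi-word tokens help, consider a toy greedy longest-match tokenizer. This is a simplification for illustration only; the vocabularies below are invented, and AI21's actual tokenizer is not reproduced here:

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization: at each position, consume the
    longest vocabulary entry that matches the remaining text."""
    tokens, i = [], 0
    while i < len(text):
        match = next(
            (text[i:j] for j in range(len(text), i, -1) if text[i:j] in vocab),
            text[i],  # fall back to a single character
        )
        tokens.append(match)
        i += len(match)
    return tokens

small_vocab = {"New", " York", " Stock", " Exchange"}
large_vocab = small_vocab | {"New York Stock Exchange"}  # multi-word token

text = "New York Stock Exchange"
print(tokenize(text, small_vocab))  # ['New', ' York', ' Stock', ' Exchange'] -> 4 tokens
print(tokenize(text, large_vocab))  # ['New York Stock Exchange'] -> 1 token
```

Fewer tokens per input means fewer decoding steps during generation and more few-shot examples fitting into a fixed-size context window.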
Jurassic-1 was trained on a broad corpus spanning web text, academic publications, legal documents, and source code. The architecture diverged from the standard Transformer design used by GPT-3; AI21 Labs optimized the depth-to-width ratio based on theoretical work on expressivity tradeoffs in self-attention networks.
Jurassic-2 (J2) was released on March 9, 2023, as the successor to Jurassic-1. The model family offered several improvements over its predecessor, including better quality, lower latency (up to 30% faster inference), multilingual support, and zero-shot instruction-following capabilities.
The Jurassic-2 lineup included three tiers:

- Jurassic-2 Ultra, the largest and most capable model
- Jurassic-2 Mid, a mid-sized model balancing quality and cost
- Jurassic-2 Light, the smallest and fastest model, suited to simple, high-volume tasks
Both J2-Ultra and J2-Mid were available in instruction-tuned variants that could follow natural language instructions without requiring task-specific fine-tuning. Jurassic-2 also added support for multiple European languages, including Spanish, French, German, Portuguese, Italian, and Dutch.
Jurassic-2 models were made available through AI21 Studio, Amazon Bedrock, and Amazon SageMaker, expanding AI21 Labs' reach to enterprise customers already using AWS infrastructure.
Jamba, released on March 28, 2024, represented a fundamental architectural shift for AI21 Labs. Rather than building on the standard Transformer architecture, Jamba introduced a hybrid design combining Transformer attention layers with Mamba structured state space model (SSM) layers, augmented by a Mixture of Experts (MoE) mechanism.
The Mamba architecture, introduced by Albert Gu and Tri Dao in December 2023, uses selective state spaces to process sequences in linear time rather than the quadratic time complexity of standard Transformer attention. This makes it particularly efficient for long sequences but, prior to Jamba, Mamba-based models had not been scaled beyond approximately 3 billion parameters in production settings.
Jamba's architecture interleaves blocks of Transformer and Mamba layers, with each block containing either an attention or a Mamba layer followed by a multi-layer perceptron (MLP). The overall design uses a ratio of one Transformer attention layer for every eight total layers. MoE is applied to selected layers to increase total model capacity while keeping the number of active parameters manageable during inference.
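A minimal sketch of this layer schedule, assuming the attention layer sits at a fixed offset within each 8-layer block and MoE replaces the MLP on every other layer (the offsets chosen here are illustrative, not AI21's published configuration):

```python
def jamba_layer_schedule(n_blocks=4, block_size=8, attn_index=3, moe_every=2):
    """Build a per-layer plan: one attention layer per 8-layer block,
    Mamba everywhere else, MoE on every other layer's MLP."""
    schedule = []
    for layer in range(n_blocks * block_size):
        mixer = "attention" if layer % block_size == attn_index else "mamba"
        mlp = "moe" if layer % moe_every == 1 else "dense"
        schedule.append((layer, mixer, mlp))
    return schedule

for layer, mixer, mlp in jamba_layer_schedule(n_blocks=1):
    print(f"layer {layer}: {mixer:9s} + {mlp} MLP")
# Only 1 of every 8 layers uses attention; the rest use linear-time Mamba.
```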
Key specifications of the original Jamba model:
| Specification | Value |
|---|---|
| Total Parameters | 52 billion |
| Active Parameters | 12 billion |
| Context Window | 256K tokens |
| Architecture | Hybrid Mamba-Transformer + MoE |
| License | Apache 2.0 |
| Max Context on Single 80GB GPU | ~140K tokens |
Jamba demonstrated approximately 3x throughput on long contexts compared to Mixtral 8x7B, a leading open MoE model at the time. It also matched or outperformed comparable models on standard language benchmarks and long-context evaluations. The model was released as open weights on Hugging Face under the Apache 2.0 license, making it freely available for commercial use.
Jamba's release was significant because it demonstrated that the hybrid SSM-Transformer approach could be scaled to production quality, challenging the assumption that pure Transformer architectures were the only viable path for building competitive large language models.
On August 22, 2024, AI21 Labs released the Jamba 1.5 model family, scaling the hybrid Mamba-Transformer architecture to larger sizes. The family included two models:

- Jamba 1.5 Large, with 398 billion total parameters (94 billion active)
- Jamba 1.5 Mini, with 52 billion total parameters (12 billion active)
Both models retained the 256K context window and hybrid architecture of the original Jamba. They added support for function calling, retrieval-augmented generation (RAG) optimizations, structured JSON output, and multilingual capabilities.
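A minimal sketch of calling a Jamba 1.5 model through the AI21 Python SDK. The client and method names below follow the SDK's v2 interface, but exact parameter names, particularly for structured output and function calling, should be checked against current AI21 documentation:

```python
# pip install ai21
from ai21 import AI21Client
from ai21.models.chat import ChatMessage

client = AI21Client(api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="jamba-1.5-mini",
    messages=[
        ChatMessage(
            role="user",
            content="Summarize the attached contract clause in two sentences.",
        )
    ],
    max_tokens=200,
)
print(response.choices[0].message.content)
```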
Jamba 1.5 Large was positioned as AI21's most advanced model, capable of handling complex reasoning tasks such as financial analysis and legal document review. Jamba 1.5 Mini was designed for speed and efficiency in tasks like customer support, document summarization, and general text generation.
The Jamba 1.5 models were released as open weights under the Jamba Open Model License and made available through AI21 Studio, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Azure AI.
Jamba 2 was released on January 8, 2026, as a pair of compact models focused on enterprise reliability and steerability. The family includes:

- Jamba 2 3B, a dense 3-billion-parameter model
- Jamba 2 Mini, a 52-billion-total-parameter MoE model with 12 billion active parameters
Both models feature a 256K context window and were released under the Apache 2.0 license. Jamba 2 was mid-trained on 500 billion carefully curated tokens with increased representation of math, code, high-quality web data, and long documents. The training process included a state passing phase for the Mamba layers, cold-start supervised fine-tuning, direct preference optimization (DPO), and multi-phase on-policy reinforcement learning.
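Of the techniques listed, direct preference optimization is the most standardized. A generic PyTorch sketch of its loss is shown below; this is the textbook DPO objective, not AI21's specific training code:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO: push the policy to prefer chosen over rejected responses,
    measured relative to a frozen reference model."""
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # -log sigmoid(beta * margin) shrinks as the policy's preference
    # margin over the reference model grows.
    return -F.logsigmoid(beta * (chosen_logratio - rejected_logratio)).mean()
```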
The Jamba 2 models were designed for enterprise workflows that demand accuracy and grounding, producing answers that stay closely tied to source material across technical manuals, research papers, company policies, and internal knowledge bases. In benchmarks, Jamba 2 led on instruction-following tests (IFBench, IFEval, Collie) and grounding evaluations (FACTS).
| Model | Release Date | Total Parameters | Active Parameters | Context Window | Architecture | License |
|---|---|---|---|---|---|---|
| Jurassic-1 Jumbo | August 2021 | 178B | 178B | 2,048 tokens | Transformer | Proprietary |
| Jurassic-1 Large | August 2021 | 7B | 7B | 2,048 tokens | Transformer | Proprietary |
| Jurassic-2 Ultra | March 2023 | Undisclosed | Undisclosed | 8,192 tokens | Transformer | Proprietary |
| Jurassic-2 Mid | March 2023 | Undisclosed | Undisclosed | 8,192 tokens | Transformer | Proprietary |
| Jurassic-2 Light | March 2023 | Undisclosed | Undisclosed | 8,192 tokens | Transformer | Proprietary |
| Jamba | March 2024 | 52B | 12B | 256K tokens | Hybrid Mamba-Transformer + MoE | Apache 2.0 |
| Jamba 1.5 Mini | August 2024 | 52B | 12B | 256K tokens | Hybrid Mamba-Transformer + MoE | Jamba Open Model License |
| Jamba 1.5 Large | August 2024 | 398B | 94B | 256K tokens | Hybrid Mamba-Transformer + MoE | Jamba Open Model License |
| Jamba 2 3B | January 2026 | 3B | 3B | 256K tokens | Hybrid Mamba-Transformer (dense) | Apache 2.0 |
| Jamba 2 Mini | January 2026 | 52B | 12B | 256K tokens | Hybrid Mamba-Transformer + MoE | Apache 2.0 |
AI21 Studio is the company's developer platform, providing API access to AI21's language models and task-specific capabilities. Launched alongside Jurassic-1 in August 2021, the platform allows developers to build text-based applications including virtual assistants, chatbots, content generation tools, text classifiers, and more.
The platform offers several access tiers and includes features such as custom model training (requiring as few as 50 to 100 training examples for fine-tuning), a web-based interactive playground for testing prompts, and comprehensive API documentation. AI21 Studio supports both the Jurassic and Jamba model families.
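Custom model training on the platform consumes prompt-completion pairs. A sketch of preparing such a dataset follows; the JSONL field names are assumptions and should be verified against AI21 Studio's current fine-tuning documentation:

```python
import json

# A handful of examples can suffice; AI21 cites 50-100 as a working minimum.
examples = [
    {"prompt": "Classify the ticket: 'My invoice is wrong.'", "completion": "billing"},
    {"prompt": "Classify the ticket: 'The app crashes on login.'", "completion": "bug"},
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```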
Alongside Jurassic-2, AI21 Labs introduced a set of Task-Specific Models (TSMs): pre-built, highly optimized APIs designed for specific natural language processing tasks. Unlike general-purpose language model APIs, these endpoints require no prompt engineering or fine-tuning and can be integrated with a single API call.
The Task-Specific APIs include:
| API | Function |
|---|---|
| Paraphrase | Generates alternative phrasings of input text while preserving meaning, with adjustable tone and length |
| Summarize | Condenses long documents into concise summaries, optimized for financial reports, legal documents, and technical papers |
| Contextual Answers | Answers user questions based on a provided knowledge base or document context |
| Grammatical Error Correction | Identifies and corrects grammatical errors in text |
| Text Improvements | Suggests improvements to text for clarity, fluency, and engagement |
AI21 Labs has reported that its Summarize API achieved a faithfulness rate 19% higher than OpenAI's Davinci-003 model and an acceptance rate 18% higher on the same benchmark.
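A sketch of the single-call integration for the Summarize endpoint. The URL path and field names below follow AI21's historical Studio REST documentation and should be treated as assumptions:

```python
import requests

API_KEY = "YOUR_API_KEY"  # issued in AI21 Studio

resp = requests.post(
    "https://api.ai21.com/studio/v1/summarize",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "source": "Long financial report text goes here...",
        "sourceType": "TEXT",  # the API also accepted a URL source
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["summary"])
```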
Wordtune is AI21 Labs' consumer-facing AI writing assistant, launched on October 27, 2020. The product was the company's first public offering and served as the vehicle through which AI21 emerged from stealth.
Wordtune is available as a Chrome browser extension, a web application, an iOS app, and an integration within Google Docs. It offers several core features:

- Rewrite: suggests alternative phrasings of a selected sentence while preserving its meaning
- Tone: shifts text between casual and formal registers
- Length: shortens or expands a sentence
- Spelling and grammar correction
Wordtune has grown to tens of millions of users since its launch. Google named it one of its favorite Chrome extensions of 2021. Enterprise customers include Monday.com, eBay, UiPath, and Transmit Security. The product offers a free tier with basic rewriting capabilities and a premium subscription at $9.99 per month that unlocks advanced features.
AI21 Labs models are available through multiple cloud platforms, broadening access for enterprise customers:
| Platform | Available Models |
|---|---|
| Amazon Bedrock | Jurassic-2, Jamba 1.5 |
| Amazon SageMaker | Jurassic-2 |
| Google Cloud Vertex AI | Jamba 1.5 |
| Microsoft Azure AI | Jamba 1.5 |
| NVIDIA API Catalog | Jamba |
These integrations allow enterprises to deploy AI21 models within their existing cloud infrastructure without managing separate model hosting.
The most distinctive technical contribution from AI21 Labs is the hybrid Mamba-Transformer architecture used in the Jamba model family. Understanding this approach requires some background on the two architectures it combines.
The standard Transformer architecture, introduced in 2017, uses self-attention mechanisms to process input sequences. Self-attention allows every token in a sequence to attend to every other token, capturing long-range dependencies effectively. However, the computational cost of self-attention scales quadratically with sequence length (O(n^2)), which makes processing very long sequences expensive in terms of both time and memory.
Structured state space models (SSMs) offer an alternative approach to sequence modeling. SSMs process sequences by maintaining a hidden state that is updated as each new token arrives, similar to recurrent neural networks but with a mathematically structured state transition. This allows SSMs to process sequences in linear time (O(n)), making them far more efficient for long inputs.
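The core recurrence is simple enough to state directly. Below is a minimal NumPy sketch of a plain (non-selective) linear SSM over a scalar input sequence, illustrating why the cost is one fixed-size state update per token:

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Linear state space recurrence: h_t = A h_{t-1} + B x_t, y_t = C h_t.
    One fixed-cost state update per token -> O(n) in sequence length,
    versus the O(n^2) pairwise interactions of full self-attention."""
    h = np.zeros(A.shape[0])
    ys = []
    for x_t in x:
        h = A @ h + B * x_t   # fold the new token into a fixed-size state
        ys.append(C @ h)      # read the output from the state
    return np.array(ys)

d = 4                                  # state dimension
rng = np.random.default_rng(0)
A = 0.9 * np.eye(d)                    # stable, decaying state transition
B, C = rng.normal(size=d), rng.normal(size=d)
print(ssm_scan(rng.normal(size=10), A, B, C))
```

Mamba's refinement, described next, makes these transition parameters input-dependent.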
The Mamba architecture, introduced by Albert Gu and Tri Dao in their December 2023 paper "Mamba: Linear-Time Sequence Modeling with Selective State Spaces," improved upon earlier SSMs by adding a selective mechanism that allows the model to focus on relevant parts of the input. This selectivity addressed a key weakness of previous SSMs, which struggled with content-based reasoning tasks.
Jamba's architecture interleaves Mamba and Transformer layers in a structured pattern. Each block contains either a Mamba SSM layer or a Transformer attention layer, followed by a standard multi-layer perceptron. The ratio of Mamba layers to Transformer layers is approximately 7:1, meaning that only one out of every eight layers uses the computationally expensive attention mechanism.
This hybrid approach offers several benefits:

- Most layers run in linear time, reducing compute and memory costs on long sequences
- The interleaved attention layers preserve the content-based reasoning ability that pure SSMs lack
- The key-value (KV) cache is much smaller, since only one in eight layers stores attention keys and values
- MoE increases total model capacity while keeping active parameters, and thus inference cost, manageable
The result is a model that can process 256K token contexts efficiently on standard hardware, fitting up to 140K tokens on a single 80GB GPU, something that would require significantly more resources with a pure Transformer model of comparable quality.
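The memory savings come largely from the KV cache, which only attention layers require. A back-of-the-envelope comparison, using hypothetical model dimensions rather than Jamba's published configuration:

```python
def kv_cache_gib(attn_layers, kv_heads, head_dim, seq_len, bytes_per_value=2):
    # K and V tensors per attention layer; fp16 -> 2 bytes per value
    return 2 * attn_layers * kv_heads * head_dim * seq_len * bytes_per_value / 2**30

seq_len, kv_heads, head_dim = 256_000, 8, 128
print(kv_cache_gib(32, kv_heads, head_dim, seq_len))  # all 32 layers attend: ~31.3 GiB
print(kv_cache_gib(4, kv_heads, head_dim, seq_len))   # 1-in-8 attention layers: ~3.9 GiB
```

With only one in eight layers caching keys and values, the hybrid design cuts this memory term by roughly 8x at the same sequence length.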
AI21 Labs has raised $336 million in total funding through multiple rounds.
| Round | Date | Amount | Valuation | Lead Investors |
|---|---|---|---|---|
| Seed | January 2019 | $9.5M | Undisclosed | Undisclosed |
| Series A | November 2021 | $25M | Undisclosed | Pitango First |
| Series B | July 2022 | $64M | $664M | Ahren Innovation Capital |
| Series C (Tranche 1) | August 2023 | $155M | $1.4B | Walden Catalyst, Pitango, SCB10X, b2venture, Samsung Next |
| Series C (Tranche 2) | November 2023 | $53M | $1.4B | Intel Capital, Comcast Ventures |
| Total | | $336M | $1.4B | |
Notable investors across all rounds include Google, NVIDIA, Intel Capital, Comcast Ventures, Pitango, Walden Catalyst, Ahren Innovation Capital, TPY Capital, Samsung Next, SCB10X, b2venture, and co-founder Amnon Shashua.
AI21 Labs operates in a highly competitive market for large language models and enterprise AI services. Its primary competitors include several well-funded companies:
| Company | Headquarters | Key Models | Focus |
|---|---|---|---|
| OpenAI | San Francisco, USA | GPT-4, GPT-4o | General-purpose AI, consumer and enterprise |
| Anthropic | San Francisco, USA | Claude | AI safety, enterprise |
| Cohere | Toronto, Canada | Command R | Enterprise NLP |
| Mistral AI | Paris, France | Mistral, Mixtral | Open-weight models, enterprise |
| Google DeepMind | London, UK | Gemini | General-purpose AI, research |
| Meta AI | Menlo Park, USA | LLaMA | Open-source models |
AI21 Labs differentiates itself in several ways. First, its hybrid Mamba-Transformer architecture gives the Jamba models an efficiency advantage for long-context processing compared to pure Transformer competitors. Second, the company has maintained a strong enterprise focus, offering deployment flexibility through SaaS, cloud partnerships (AWS, Google Cloud, Azure), virtual private cloud (VPC), and on-premises options. Third, its Task-Specific APIs provide specialized, ready-to-use solutions that require no prompt engineering, which appeals to enterprises seeking reliability and ease of integration.
AI21 Labs co-founder Ori Goshen has publicly stated that the company "usually wins" when competing directly with OpenAI for enterprise business, citing advantages in accuracy, deployment flexibility, and customer support.
Compared to open-source competitors like Meta's LLaMA and Mistral AI's models, AI21 Labs occupies a middle ground: the Jamba base models are released with open weights under permissive licenses, while the company also offers proprietary, optimized versions through its commercial API platform.
| Detail | Information |
|---|---|
| Founded | November 2017 |
| Headquarters | 124 Shlomo Ibn Gabirol Street, Tel Aviv, Israel |
| Co-Founders | Yoav Shoham (Co-CEO), Ori Goshen (Co-CEO), Amnon Shashua (Chairman) |
| Employees | Approximately 200-250 |
| Total Funding | $336 million |
| Valuation | $1.4 billion (as of November 2023) |
| Key Products | AI21 Studio, Jamba models, Wordtune, Task-Specific APIs |