AI21 Labs is an Israeli artificial intelligence company that develops and deploys large language models (LLMs) for enterprise applications. Founded in 2017 by Yoav Shoham, Ori Goshen, and Amnon Shashua, the company is headquartered in Tel Aviv, Israel. AI21 Labs has built a series of foundation models, beginning with the Jurassic family and continuing with the Jamba series, which brought a hybrid Mamba-Transformer architecture to production-scale language modeling. The company also develops Wordtune, a consumer-facing AI writing assistant. As of late 2025, AI21 Labs has raised a total of $336 million in funding and is valued at $1.4 billion.
AI21 Labs was founded in November 2017 in Tel Aviv, Israel, by three co-founders with deep expertise in artificial intelligence and technology entrepreneurship.
Yoav Shoham is a professor emeritus of computer science at Stanford University and a former Principal Scientist at Google. He has made notable contributions to AI, game theory, and multi-agent systems over a career spanning several decades.
Ori Goshen is a technology entrepreneur with over 15 years of experience in product leadership roles. Before co-founding AI21 Labs, Goshen co-founded Crowdx, a network analytics company, and led the development of VoIP products.
Amnon Shashua is a professor of computer science at the Hebrew University of Jerusalem and a prominent Israeli entrepreneur. He co-founded Mobileye, the autonomous driving technology company that Intel acquired in 2017 for approximately $15.3 billion. Shashua also co-founded OrCam, a company focused on assistive technology for the visually impaired, and the digital bank One Zero.
The company's name, AI21, refers to "artificial intelligence for the 21st century." The founders started with a broad vision of building AI systems that could genuinely understand and generate natural language, rather than targeting a specific product from the outset. In January 2019, AI21 Labs raised $9.5 million in a seed funding round. Through its early years, the company operated in stealth mode while developing its core natural language processing technology.
On October 27, 2020, AI21 Labs emerged from stealth with the launch of Wordtune, its first consumer product. Wordtune is an AI-powered writing assistant that can understand the context and meaning of text and suggest paraphrases, rewrites, and alternative phrasings. The product launched as a Chrome browser extension and quickly gained traction among individual users and professionals.
In 2021, AI21 Labs began to expand rapidly. The company completed a $25 million Series A round led by Pitango First in November 2021. That same year, the company launched AI21 Studio and its Jurassic-1 family of language models, entering the competitive landscape alongside OpenAI and other foundation model developers.
In July 2022, AI21 Labs raised $64 million in a Series B round led by Ahren Innovation Capital, bringing its total funding to $118 million at a valuation of $664 million. The funds were directed toward research and development, as well as expanding the company's sales and marketing teams.
In March 2023, the company released Jurassic-2, a significant upgrade to its model family. By August 2023, AI21 Labs had raised $155 million in the first tranche of its Series C round, reaching a valuation of $1.4 billion and achieving unicorn status. Investors in this round included Walden Catalyst, Pitango, SCB10X, b2venture, Samsung Next, and co-founder Amnon Shashua, with Google and NVIDIA also participating.
In November 2023, AI21 Labs extended its Series C with an additional $53 million, bringing the total Series C to $208 million. New investors Intel Capital and Comcast Ventures joined the round, and the company's total funding reached $336 million.
In March 2024, AI21 Labs released Jamba, a model that marked a significant departure from the pure Transformer architecture that had dominated the LLM landscape. Jamba introduced a hybrid architecture combining Mamba state space model (SSM) layers with Transformer attention layers. This made Jamba the first production-grade model to scale the Mamba architecture beyond small experimental sizes.
In August 2024, the company released the Jamba 1.5 model family, scaling the hybrid architecture further with Jamba 1.5 Large (398B total parameters) and Jamba 1.5 Mini (52B total parameters).
In January 2026, AI21 Labs released Jamba 2 under the Apache 2.0 license: a pair of compact models (3B and Mini) focused on enterprise reliability, grounding, and instruction-following.
In late December 2025, multiple reports indicated that NVIDIA was in advanced talks to acquire AI21 Labs for $2 billion to $3 billion. The reported deal was described as primarily an acqui-hire, with NVIDIA seeking to secure AI21's team of roughly 200 employees, most of whom hold advanced degrees and possess specialized expertise in AI research. If completed, the acquisition would represent NVIDIA's fourth significant purchase in Israel and its second-largest after the $7 billion Mellanox acquisition in 2020. As of early 2026, neither company has officially confirmed the transaction.
Jurassic-1 (J1) was AI21 Labs' first family of large language models, announced alongside the AI21 Studio developer platform on August 4, 2021. The family consisted of two autoregressive models:

- Jurassic-1 Jumbo, with 178 billion parameters, the largest model in the family
- Jurassic-1 Large, with 7 billion parameters, a smaller model for faster, lower-cost inference
A distinguishing feature of Jurassic-1 was its vocabulary of approximately 250,000 tokens, roughly five times larger than most existing vocabularies at the time. This vocabulary included multi-word tokens such as common expressions, phrases, and named entities, which improved computational efficiency, reduced latency, and allowed more text to fit within a fixed context window. The larger vocabulary also meant that Jurassic-1 could pack more examples into a prompt for few-shot learning compared to similarly sized models like GPT-3.
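To see why multi-word tokens help, consider a toy greedy longest-match tokenizer. This is a simplification for illustration only; the vocabularies below are invented, and AI21's actual tokenizer is not reproduced here:

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization: at each position, consume the
    longest vocabulary entry that matches the remaining text."""
    tokens, i = [], 0
    while i < len(text):
        match = next(
            (text[i:j] for j in range(len(text), i, -1) if text[i:j] in vocab),
            text[i],  # fall back to a single character
        )
        tokens.append(match)
        i += len(match)
    return tokens

small_vocab = {"New", " York", " Stock", " Exchange"}
large_vocab = small_vocab | {"New York Stock Exchange"}  # multi-word token

text = "New York Stock Exchange"
print(tokenize(text, small_vocab))  # ['New', ' York', ' Stock', ' Exchange'] -> 4 tokens
print(tokenize(text, large_vocab))  # ['New York Stock Exchange'] -> 1 token
```

Fewer tokens per input means fewer decoding steps during generation and more few-shot examples fitting into a fixed-size context window.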
Jurassic-1 was trained on a broad corpus spanning web text, academic publications, legal documents, and source code. The architecture diverged from the standard Transformer design used by GPT-3; AI21 Labs optimized the depth-to-width ratio based on theoretical work on expressivity tradeoffs in self-attention networks.
Jurassic-2 (J2) was released on March 9, 2023, as the successor to Jurassic-1. The model family offered several improvements over its predecessor, including better quality, lower latency (up to 30% faster inference), multilingual support, and zero-shot instruction-following capabilities.
The Jurassic-2 lineup included three tiers:

- Jurassic-2 Ultra, the largest and most capable model
- Jurassic-2 Mid, a mid-sized model balancing quality and cost
- Jurassic-2 Light, the smallest and fastest model, suited to simple, high-volume tasks
Both J2-Ultra and J2-Mid were available in instruction-tuned variants that could follow natural language instructions without requiring task-specific fine-tuning. Jurassic-2 also added support for multiple European languages, including Spanish, French, German, Portuguese, Italian, and Dutch.
Jurassic-2 models were made available through AI21 Studio, Amazon Bedrock, and Amazon SageMaker, expanding AI21 Labs' reach to enterprise customers already using AWS infrastructure.
Jamba, released on March 28, 2024, represented a fundamental architectural shift for AI21 Labs. Rather than building on the standard Transformer architecture, Jamba introduced a hybrid design combining Transformer attention layers with Mamba structured state space model (SSM) layers, augmented by a Mixture of Experts (MoE) mechanism.
The Mamba architecture, introduced by Albert Gu and Tri Dao in December 2023, uses selective state spaces to process sequences in linear time rather than the quadratic time complexity of standard Transformer attention. This makes it particularly efficient for long sequences but, prior to Jamba, Mamba-based models had not been scaled beyond approximately 3 billion parameters in production settings.
Jamba's architecture interleaves blocks of Transformer and Mamba layers, with each block containing either an attention or a Mamba layer followed by a multi-layer perceptron (MLP). The overall design uses a ratio of one Transformer attention layer for every eight total layers. MoE is applied to selected layers to increase total model capacity while keeping the number of active parameters manageable during inference.
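A minimal sketch of this layer schedule, assuming the attention layer sits at a fixed offset within each 8-layer block and MoE replaces the MLP on every other layer (the offsets chosen here are illustrative, not AI21's published configuration):

```python
def jamba_layer_schedule(n_blocks=4, block_size=8, attn_index=3, moe_every=2):
    """Build a per-layer plan: one attention layer per 8-layer block,
    Mamba everywhere else, MoE on every other layer's MLP."""
    schedule = []
    for layer in range(n_blocks * block_size):
        mixer = "attention" if layer % block_size == attn_index else "mamba"
        mlp = "moe" if layer % moe_every == 1 else "dense"
        schedule.append((layer, mixer, mlp))
    return schedule

for layer, mixer, mlp in jamba_layer_schedule(n_blocks=1):
    print(f"layer {layer}: {mixer:9s} + {mlp} MLP")
# Only 1 of every 8 layers uses attention; the rest use linear-time Mamba.
```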
Key specifications of the original Jamba model:
| Specification | Value |
|---|---|
| Total Parameters | 52 billion |
| Active Parameters | 12 billion |
| Context Window | 256K tokens |
| Architecture | Hybrid Mamba-Transformer + MoE |
| License | Apache 2.0 |
| Max Context on Single 80GB GPU | ~140K tokens |
Jamba demonstrated approximately 3x throughput on long contexts compared to Mixtral 8x7B, a leading open MoE model at the time. It also matched or outperformed comparable models on standard language benchmarks and long-context evaluations. The model was released as open weights on Hugging Face under the Apache 2.0 license, making it freely available for commercial use.
Jamba's release was significant because it demonstrated that the hybrid SSM-Transformer approach could be scaled to production quality, challenging the assumption that pure Transformer architectures were the only viable path for building competitive large language models.
On August 22, 2024, AI21 Labs released the Jamba 1.5 model family, scaling the hybrid Mamba-Transformer architecture to larger sizes. The family included two models:

- Jamba 1.5 Large, with 398 billion total parameters (94 billion active)
- Jamba 1.5 Mini, with 52 billion total parameters (12 billion active)
Both models retained the 256K context window and hybrid architecture of the original Jamba. They added support for function calling, retrieval-augmented generation (RAG) optimizations, structured JSON output, and multilingual capabilities.
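A minimal sketch of calling a Jamba 1.5 model through the AI21 Python SDK. The client and method names below follow the SDK's v2 interface, but exact parameter names, particularly for structured output and function calling, should be checked against current AI21 documentation:

```python
# pip install ai21
from ai21 import AI21Client
from ai21.models.chat import ChatMessage

client = AI21Client(api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="jamba-1.5-mini",
    messages=[
        ChatMessage(
            role="user",
            content="Summarize the attached contract clause in two sentences.",
        )
    ],
    max_tokens=200,
)
print(response.choices[0].message.content)
```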
Jamba 1.5 Large was positioned as AI21's most advanced model, capable of handling complex reasoning tasks such as financial analysis and legal document review. Jamba 1.5 Mini was designed for speed and efficiency in tasks like customer support, document summarization, and general text generation.
The Jamba 1.5 models were released as open weights under the Jamba Open Model License and made available through AI21 Studio, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Azure AI.
Jamba 2 was released on January 8, 2026, as a pair of compact models focused on enterprise reliability and steerability. The family includes:

- Jamba 2 3B, a dense 3-billion-parameter model
- Jamba 2 Mini, a 52-billion-total-parameter MoE model with 12 billion active parameters
Both models feature a 256K context window and were released under the Apache 2.0 license. Jamba 2 was mid-trained on 500 billion carefully curated tokens with increased representation of math, code, high-quality web data, and long documents. The training process included a state passing phase for the Mamba layers, cold-start supervised fine-tuning, direct preference optimization (DPO), and multi-phase on-policy reinforcement learning.
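Of the techniques listed, direct preference optimization is the most standardized. A generic PyTorch sketch of its loss is shown below; this is the textbook DPO objective, not AI21's specific training code:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO: push the policy to prefer chosen over rejected responses,
    measured relative to a frozen reference model."""
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # -log sigmoid(beta * margin) shrinks as the policy's preference
    # margin over the reference model grows.
    return -F.logsigmoid(beta * (chosen_logratio - rejected_logratio)).mean()
```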
The Jamba 2 models were designed for enterprise workflows that demand accuracy and grounding, producing answers that stay closely tied to source material across technical manuals, research papers, company policies, and internal knowledge bases. In benchmarks, Jamba 2 led on instruction-following tests (IFBench, IFEval, Collie) and grounding evaluations (FACTS).
| Model | Release Date | Total Parameters | Active Parameters | Context Window | Architecture | License |
|---|---|---|---|---|---|---|
| Jurassic-1 Jumbo | August 2021 | 178B | 178B | 2,048 tokens | Transformer | Proprietary |
| Jurassic-1 Large | August 2021 | 7B | 7B | 2,048 tokens | Transformer | Proprietary |
| Jurassic-2 Ultra | March 2023 | Undisclosed | Undisclosed | 8,192 tokens | Transformer | Proprietary |
| Jurassic-2 Mid | March 2023 | Undisclosed | Undisclosed | 8,192 tokens | Transformer | Proprietary |
| Jurassic-2 Light | March 2023 | Undisclosed | Undisclosed | 8,192 tokens | Transformer | Proprietary |
| Jamba | March 2024 | 52B | 12B | 256K tokens | Hybrid Mamba-Transformer + MoE | Apache 2.0 |
| Jamba 1.5 Mini | August 2024 | 52B | 12B | 256K tokens | Hybrid Mamba-Transformer + MoE | Jamba Open Model License |
| Jamba 1.5 Large | August 2024 | 398B | 94B | 256K tokens | Hybrid Mamba-Transformer + MoE | Jamba Open Model License |
| Jamba 2 3B | January 2026 | 3B | 3B | 256K tokens | Hybrid Mamba-Transformer (dense) | Apache 2.0 |
| Jamba 2 Mini | January 2026 | 52B | 12B | 256K tokens | Hybrid Mamba-Transformer + MoE | Apache 2.0 |
AI21 Studio is the company's developer platform, providing API access to AI21's language models and task-specific capabilities. Launched alongside Jurassic-1 in August 2021, the platform allows developers to build text-based applications including virtual assistants, chatbots, content generation tools, text classifiers, and more.
The platform offers several access tiers and includes features such as custom model training (requiring as few as 50 to 100 training examples for fine-tuning), a web-based interactive playground for testing prompts, and comprehensive API documentation. AI21 Studio supports both the Jurassic and Jamba model families.
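Custom model training on the platform consumes prompt-completion pairs. A sketch of preparing such a dataset follows; the JSONL field names are assumptions and should be verified against AI21 Studio's current fine-tuning documentation:

```python
import json

# A handful of examples can suffice; AI21 cites 50-100 as a working minimum.
examples = [
    {"prompt": "Classify the ticket: 'My invoice is wrong.'", "completion": "billing"},
    {"prompt": "Classify the ticket: 'The app crashes on login.'", "completion": "bug"},
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```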
Alongside Jurassic-2, AI21 Labs introduced a set of Task-Specific Models (TSMs): pre-built, highly optimized APIs designed for specific natural language processing tasks. Unlike general-purpose language model APIs, these endpoints require no prompt engineering or fine-tuning and can be integrated with a single API call.
The Task-Specific APIs include:
| API | Function |
|---|---|
| Paraphrase | Generates alternative phrasings of input text while preserving meaning, with adjustable tone and length |
| Summarize | Condenses long documents into concise summaries, optimized for financial reports, legal documents, and technical papers |
| Contextual Answers | Answers user questions based on a provided knowledge base or document context |
| Grammatical Error Correction | Identifies and corrects grammatical errors in text |
| Text Improvements | Suggests improvements to text for clarity, fluency, and engagement |
AI21 Labs has reported that its Summarize API achieved a faithfulness rate 19% higher than OpenAI's Davinci-003 model and an acceptance rate 18% higher on the same benchmark.
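A sketch of the single-call integration for the Summarize endpoint. The URL path and field names below follow AI21's historical Studio REST documentation and should be treated as assumptions:

```python
import requests

API_KEY = "YOUR_API_KEY"  # issued in AI21 Studio

resp = requests.post(
    "https://api.ai21.com/studio/v1/summarize",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "source": "Long financial report text goes here...",
        "sourceType": "TEXT",  # the API also accepted a URL source
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["summary"])
```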
Wordtune is AI21 Labs' consumer-facing AI writing assistant, launched on October 27, 2020. The product was the company's first public offering and served as the vehicle through which AI21 emerged from stealth.
Wordtune is available as a Chrome browser extension, a web application, an iOS app, and an integration within Google Docs. It offers several core features:

- Rewrite: suggests alternative phrasings of a selected sentence while preserving its meaning
- Tone: shifts text between casual and formal registers
- Length: shortens or expands a sentence
- Spelling and grammar correction
Wordtune has grown to tens of millions of users since its launch. Google named it one of its favorite Chrome extensions of 2021. Enterprise customers include Monday.com, eBay, UiPath, and Transmit Security. The product offers a free tier with basic rewriting capabilities and a premium subscription at $9.99 per month that unlocks advanced features.
AI21 Labs models are available through multiple cloud platforms, broadening access for enterprise customers:
| Platform | Available Models |
|---|---|
| Amazon Bedrock | Jurassic-2, Jamba 1.5 |
| Amazon SageMaker | Jurassic-2 |
| Google Cloud Vertex AI | Jamba 1.5 |
| Microsoft Azure AI | Jamba 1.5 |
| NVIDIA API Catalog | Jamba |
These integrations allow enterprises to deploy AI21 models within their existing cloud infrastructure without managing separate model hosting.
The most distinctive technical contribution from AI21 Labs is the hybrid Mamba-Transformer architecture used in the Jamba model family. Understanding this approach requires some background on the two architectures it combines.
The standard Transformer architecture, introduced in 2017, uses self-attention mechanisms to process input sequences. Self-attention allows every token in a sequence to attend to every other token, capturing long-range dependencies effectively. However, the computational cost of self-attention scales quadratically with sequence length (O(n^2)), which makes processing very long sequences expensive in terms of both time and memory.
Structured state space models (SSMs) offer an alternative approach to sequence modeling. SSMs process sequences by maintaining a hidden state that is updated as each new token arrives, similar to recurrent neural networks but with a mathematically structured state transition. This allows SSMs to process sequences in linear time (O(n)), making them far more efficient for long inputs.
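The core recurrence is simple enough to state directly. Below is a minimal NumPy sketch of a plain (non-selective) linear SSM over a scalar input sequence, illustrating why the cost is one fixed-size state update per token:

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Linear state space recurrence: h_t = A h_{t-1} + B x_t, y_t = C h_t.
    One fixed-cost state update per token -> O(n) in sequence length,
    versus the O(n^2) pairwise interactions of full self-attention."""
    h = np.zeros(A.shape[0])
    ys = []
    for x_t in x:
        h = A @ h + B * x_t   # fold the new token into a fixed-size state
        ys.append(C @ h)      # read the output from the state
    return np.array(ys)

d = 4                                  # state dimension
rng = np.random.default_rng(0)
A = 0.9 * np.eye(d)                    # stable, decaying state transition
B, C = rng.normal(size=d), rng.normal(size=d)
print(ssm_scan(rng.normal(size=10), A, B, C))
```

Mamba's refinement, described next, makes these transition parameters input-dependent.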
The Mamba architecture, introduced by Albert Gu and Tri Dao in their December 2023 paper "Mamba: Linear-Time Sequence Modeling with Selective State Spaces," improved upon earlier SSMs by adding a selective mechanism that allows the model to focus on relevant parts of the input. This selectivity addressed a key weakness of previous SSMs, which struggled with content-based reasoning tasks.
Jamba's architecture interleaves Mamba and Transformer layers in a structured pattern. Each block contains either a Mamba SSM layer or a Transformer attention layer, followed by a standard multi-layer perceptron. The ratio of Mamba layers to Transformer layers is approximately 7:1, meaning that only one out of every eight layers uses the computationally expensive attention mechanism.
This hybrid approach offers several benefits:

- Most layers run in linear time, reducing compute and memory costs on long sequences
- The interleaved attention layers preserve the content-based reasoning ability that pure SSMs lack
- The key-value (KV) cache is much smaller, since only one in eight layers stores attention keys and values
- MoE increases total model capacity while keeping active parameters, and thus inference cost, manageable
The result is a model that can process 256K token contexts efficiently on standard hardware, fitting up to 140K tokens on a single 80GB GPU, something that would require significantly more resources with a pure Transformer model of comparable quality.
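The memory savings come largely from the KV cache, which only attention layers require. A back-of-the-envelope comparison, using hypothetical model dimensions rather than Jamba's published configuration:

```python
def kv_cache_gib(attn_layers, kv_heads, head_dim, seq_len, bytes_per_value=2):
    # K and V tensors per attention layer; fp16 -> 2 bytes per value
    return 2 * attn_layers * kv_heads * head_dim * seq_len * bytes_per_value / 2**30

seq_len, kv_heads, head_dim = 256_000, 8, 128
print(kv_cache_gib(32, kv_heads, head_dim, seq_len))  # all 32 layers attend: ~31.3 GiB
print(kv_cache_gib(4, kv_heads, head_dim, seq_len))   # 1-in-8 attention layers: ~3.9 GiB
```

With only one in eight layers caching keys and values, the hybrid design cuts this memory term by roughly 8x at the same sequence length.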
AI21 Labs has raised $336 million in total funding through multiple rounds.
| Round | Date | Amount | Valuation | Lead Investors |
|---|---|---|---|---|
| Seed | January 2019 | $9.5M | Undisclosed | Undisclosed |
| Series A | November 2021 | $25M | Undisclosed | Pitango First |
| Series B | July 2022 | $64M | $664M | Ahren Innovation Capital |
| Series C (Tranche 1) | August 2023 | $155M | $1.4B | Walden Catalyst, Pitango, SCB10X, b2venture, Samsung Next |
| Series C (Tranche 2) | November 2023 | $53M | $1.4B | Intel Capital, Comcast Ventures |
| Total | | $336M | $1.4B | |
Notable investors across all rounds include Google, NVIDIA, Intel Capital, Comcast Ventures, Pitango, Walden Catalyst, Ahren Innovation Capital, TPY Capital, Samsung Next, SCB10X, b2venture, and co-founder Amnon Shashua.
AI21 Labs operates in a highly competitive market for large language models and enterprise AI services. Its primary competitors include several well-funded companies:
| Company | Headquarters | Key Models | Focus |
|---|---|---|---|
| OpenAI | San Francisco, USA | GPT-4, GPT-4o | General-purpose AI, consumer and enterprise |
| Anthropic | San Francisco, USA | Claude | AI safety, enterprise |
| Cohere | Toronto, Canada | Command R | Enterprise NLP |
| Mistral AI | Paris, France | Mistral, Mixtral | Open-weight models, enterprise |
| Google DeepMind | London, UK | Gemini | General-purpose AI, research |
| Meta AI | Menlo Park, USA | LLaMA | Open-source models |
AI21 Labs differentiates itself in several ways. First, its hybrid Mamba-Transformer architecture gives the Jamba models an efficiency advantage for long-context processing compared to pure Transformer competitors. Second, the company has maintained a strong enterprise focus, offering deployment flexibility through SaaS, cloud partnerships (AWS, Google Cloud, Azure), virtual private cloud (VPC), and on-premises options. Third, its Task-Specific APIs provide specialized, ready-to-use solutions that require no prompt engineering, which appeals to enterprises seeking reliability and ease of integration.
AI21 Labs co-founder Ori Goshen has publicly stated that the company "usually wins" when competing directly with OpenAI for enterprise business, citing advantages in accuracy, deployment flexibility, and customer support.
Compared to open-source competitors like Meta's LLaMA and Mistral AI's models, AI21 Labs occupies a middle ground: the Jamba base models are released with open weights under permissive licenses, while the company also offers proprietary, optimized versions through its commercial API platform.
| Detail | Information |
|---|---|
| Founded | November 2017 |
| Headquarters | 124 Shlomo Ibn Gabirol Street, Tel Aviv, Israel |
| Co-Founders | Yoav Shoham (Co-CEO), Ori Goshen (Co-CEO), Amnon Shashua (Chairman) |
| Employees | Approximately 200-250 |
| Total Funding | $336 million |
| Valuation | $1.4 billion (as of November 2023) |
| Key Products | AI21 Studio, Jamba models, Wordtune, Task-Specific APIs |