Reka AI
Last reviewed
May 8, 2026
Sources
28 citations
Review status
Source-backed
Revision
v5 ยท 5,887 words
Improve this article
Add missing citations, update stale details, or suggest a clearer explanation.
Last reviewed
May 8, 2026
Sources
28 citations
Review status
Source-backed
Revision
v5 ยท 5,887 words
Add missing citations, update stale details, or suggest a clearer explanation.
Reka AI (commonly referred to as Reka) is an artificial intelligence research and product company that builds multimodal large language models capable of processing text, images, video, and audio inputs. Founded in 2022 by a team of researchers from Google DeepMind, Meta FAIR, and Google, the company has positioned itself as one of a small number of organizations building frontier-class multimodal AI models from scratch. Reka is headquartered in Sunnyvale, California, and maintains operations in Singapore, with team members distributed across London, Zurich, Seattle, and Hong Kong.
The company emerged from stealth in June 2023 with $58 million in funding and has since released a family of models spanning different size classes: Reka Core, Reka Flash, and Reka Edge. In July 2025, Reka raised $110 million in a funding round backed by Nvidia and Snowflake, reaching a valuation of over $1 billion and achieving unicorn status. The company has expanded beyond foundation models into product offerings such as Reka Nexus (an AI workforce platform), Reka Vision (a video and image understanding system), and Reka Research (an agentic web research tool).
Reka was founded in 2022 by five AI researchers who had previously worked at some of the world's leading AI laboratories. The founders, Dani Yogatama, Yi Tay, Cyprien de Masson d'Autume, Qi Liu, and Mikel Artetxe, shared a conviction that it was impractical to expect a single large language model to serve all possible enterprise use cases. While working on projects like AlphaCode and Google's Bard at DeepMind and Google, the founders observed that general-purpose models often fell short when organizations needed AI tailored to specific requirements, such as generating marketing copy in a particular brand voice or processing domain-specific documents.
The company operated in stealth mode through the first half of 2023, assembling a small research team and beginning work on its first models.
On June 28, 2023, Reka publicly emerged from stealth, announcing it had raised $58 million in a funding round led by DST Global Partners and Radical Ventures, with participation from Snowflake Ventures and several angel investors including former GitHub CEO Nat Friedman. At the time of this announcement, the company was valued at approximately $300 million.
Reka's pitch to investors centered on its ability to build efficient, customizable AI models for enterprise deployment. Unlike larger competitors that focused primarily on consumer-facing chatbots, Reka emphasized on-premise and virtual private cloud (VPC) deployment options that would allow enterprises to keep sensitive data within their own infrastructure.
In October 2023, Reka launched Yasa-1, its first publicly available multimodal AI assistant. Yasa-1 was built on a unified model trained from scratch and could process text, images, short videos, and audio snippets. The assistant supported 20 languages and offered a context window of up to 24,000 tokens by default, with the ability to handle documents as long as 100,000 tokens.
Yasa-1 was made available through APIs and as Docker containers for on-premise or VPC deployment. The system could also be customized on private datasets of any modality, allowing enterprises to fine-tune the model for specific use cases.
Following the Yasa-1 launch, Reka released Reka Flash, a 21 billion parameter multimodal language model that served as the company's mid-tier "turbo-class" offering. Flash was designed to deliver strong performance at a fraction of the computational cost required by larger models. It could process text, images, and video inputs and was positioned as a practical choice for everyday enterprise applications.
On April 15, 2024, Reka published a comprehensive technical report detailing its full model lineup: Reka Core, Reka Flash, and Reka Edge. The report, titled "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models," was released as a preprint on arXiv and demonstrated that a small team could build frontier-competitive models.
Reka Core, the company's most capable model, approached the performance of GPT-4V on image question-answering benchmarks and ranked as the second most preferred model (behind only GPT-4V) in blind third-party human evaluations of multimodal chat, outperforming Claude 3 Opus. On video question answering, Core surpassed Gemini Ultra on the Perception-Test benchmark.
The smaller models also performed well above their weight class. Reka Flash (21B parameters) outperformed much larger models including Gemini Pro 1.0 and Llama 2 70B on multiple benchmarks, while Reka Edge (7B parameters) beat other models in its size category, including Mistral 7B and Gemma 7B.
Reka Core was released with launch API pricing of $10 per million input tokens and $25 per million output tokens, putting it in the same general price tier as competing frontier models at the time. The model was pretrained on textual data covering 32 languages, with fluency demonstrated in English and several Asian and European languages.
On April 18, 2024, only days after the Reka Core release, Reka announced a partnership with Oracle Corporation. Under the agreement, Reka selected Oracle Cloud Infrastructure (OCI) as its preferred cloud provider for training and serving its multimodal models, running on OCI AI Infrastructure with Nvidia GPUs. Reka Core and Reka Flash were also added to the Oracle Cloud Marketplace, giving Oracle's enterprise customers a way to deploy Reka models alongside Oracle databases and applications.
In May 2024, Bloomberg reported that Snowflake was in talks to acquire Reka for more than $1 billion. The reporting framed the discussion as part of Snowflake's push to compete with Databricks and other data platforms that were rapidly absorbing or building generative AI capabilities. By May 22, 2024, those talks had broken down without a deal, with both companies deciding it made more sense to move forward independently. Bloomberg reported that the parties could not agree on price and terms, and that Reka's existing investors and founders preferred to keep the company independent.
Despite the failed acquisition, Reka and Snowflake deepened their partnership rather than walking away. Snowflake integrated Reka Flash and Reka Core into its Cortex AI service, making them available to over 400 enterprises using Snowflake's data cloud. This integration enabled customers to build generative AI applications that could work with text, images, and video inputs directly within Snowflake's platform. Snowflake Ventures, which had been an investor since the 2023 round, kept its stake and later returned in the 2025 growth round.
In May 2024, Reka released Vibe-Eval, an open evaluation benchmark for multimodal language models. The benchmark consists of 269 visual understanding prompts, including 100 prompts rated as hard difficulty, with gold-standard responses authored by human experts. Over 50% of the questions in the hard set were answered incorrectly by all frontier models at the time of release, making it a useful tool for measuring real progress in multimodal AI. The benchmark and its dataset were released on GitHub under an open license, and the work was published as an arXiv preprint titled "Vibe-Eval: A hard evaluation suite for measuring progress of multimodal language models."
On November 25, 2024, co-founder and Chief Scientist Yi Tay announced his departure from Reka to return to Google DeepMind as a Senior Staff Research Scientist. Tay had spent approximately 1.5 years at Reka and cited his identity as a researcher and scientist as the primary reason for his return. At DeepMind, he went on to lead a new GenAI research lab in Singapore, contributing to work on the Gemini Deep Think model.
On March 10, 2025, Reka launched Reka Nexus, an AI platform aimed at enterprise customers that lets organizations create and manage AI workers to automate tasks like deep topic research, invoice processing, and sales lead generation. Nexus is built on top of Reka Flash and presents itself as a workforce layer rather than a raw model API. Each AI worker can be customized for a specific role, deployed on-premise or on-device with quantization support, and produces a human-readable execution trace that customers can use for auditing.
On March 11, 2025, Reka open-sourced Reka Flash 3, a 21 billion parameter general-purpose reasoning model trained from scratch. The model was released under the Apache 2.0 license, making it freely available for commercial and research use. Flash 3 was trained on synthetic and public datasets using supervised fine-tuning, followed by RLOO (REINFORCE Leave One-Out) with model-based and rule-based rewards. The model uses Reasoning tags to mark the boundaries of its thinking process, similar to the approach in OpenAI's o1 series, and the weights were uploaded to Hugging Face under the RekaAI organization.
Reka Flash 3 performed competitively with proprietary models such as OpenAI o1-mini, which was notable given its relatively small parameter count. An updated version, Reka Flash 3.1, followed shortly after, improving by 10 points on LiveCodeBench v5 and showing particular strength on coding tasks and as a base model for agentic fine-tuning.
On July 8, 2025, Reka introduced Reka Vision, a multimodal video and image understanding platform aimed at content creators, security operators, and media companies. Reka Vision is structured around three modules: Watch, Search, and Chat, which can be combined by a model planner to handle workflows like clipping highlights from long videos, finding specific moments inside large media libraries, and triggering alerts when a security camera sees something unusual.
Early customers included Shutterstock, which used Reka Vision to enrich its image and video catalog with natural-language search, and Turing Video, which built an agentic surveillance product called Guardian AI on top of the platform. The system targets long-form video understanding, where each frame can be searched, summarized, or used as the trigger for an alert without sending data to a third party.
On July 10, 2025, Reka released Reka Quant, an open-source model quantization library based on its internal toolchain, alongside a 3.5-bit quantized version of Reka Flash 3.1. Reka Quant uses calibrated error reduction and online self-distillation to keep quality close to the original 16-bit model. According to the company, quantizing Flash 3.1 to the Q3_K_S format in llama.cpp using Reka Quant produced an average performance drop of only 1.6 points across benchmarks, compared to 6.8 points using standard Q3_K_S quantization. The library supports NF4 and GGML quantization primitives, distributed Hessian computation for fast LDLQ, and exporting to common llama.cpp formats including Q3_K and Q4_K.
In July 2025, Reka raised $110 million in a funding round backed by Nvidia and Snowflake. This round more than tripled the company's valuation from approximately $300 million to over $1 billion, making Reka a unicorn. The investment reflected confidence in Reka's ability to develop market-leading models at a fraction of the cost incurred by larger competitors. Bloomberg's coverage of the round noted that both Snowflake and Nvidia were doubling down on a company they had previously considered acquiring or had supported through earlier rounds, and that the new capital would be used to expand the team and accelerate the rollout of Nexus, Vision, and the next generation of foundation models.
On March 11, 2026, Reka announced a new generation of Reka Edge, a 7 billion parameter vision language model aimed at physical AI use cases such as robotics, edge cameras, and embedded inspection systems. The 2026 Reka Edge variant uses 3x fewer input tokens than competing 8B models on the same image-and-text inputs and achieves up to 65% higher throughput. With 4-bit quantization, memory consumption drops from 13 GB to about 5 GB, a 62% reduction, while retaining over 98% of the original benchmark performance and delivering up to 2.3x faster inference. The model targets image understanding, video analysis, object detection, and agentic tool-use, and is published on Hugging Face as RekaAI/reka-edge-2603.
In May 2026, Reka showcased its multimodal physical security platform at Smart City Asia 2026 in Ho Chi Minh City. The demonstration centered on Reka Vision running on top of Reka Edge for live video analysis at scale, and on the company's pitch that a single multimodal model could replace stacks of narrow vision systems used in traffic, retail, and public safety deployments.
Reka was founded by five researchers with complementary backgrounds in natural language processing, machine learning, and large-scale systems engineering.
| Founder | Role at Reka | Previous Affiliation | Background |
|---|---|---|---|
| Dani Yogatama | CEO and Co-founder | DeepMind (2016-2022), Baidu Silicon Valley AI Lab (2015-2016) | PhD in Computer Science from Carnegie Mellon University. Senior Staff Research Scientist at DeepMind. Associate Professor at the University of Southern California. |
| Yi Tay | Co-founder, Chief Scientist (until Nov 2024) | Google Brain, DeepMind | Co-lead of PaLM 2 at Google. Inventor of UL2 and Differentiable Search Indexes. Key contributor to Flan-T5 and other instruction-tuning work. Returned to Google DeepMind in November 2024. |
| Cyprien de Masson d'Autume | CTO and Co-founder | DeepMind (2016-2022) | Staff Research Engineer at DeepMind. Worked on Gopher and AlphaCode. |
| Qi Liu | Co-founder | DeepMind, Meta FAIR, Microsoft Research | PhD from the University of Oxford. Assistant Professor at the University of Hong Kong. |
| Mikel Artetxe | Co-founder | Meta FAIR | PhD from the University of the Basque Country. Research focus on multilingual NLP, unsupervised machine translation, and cross-lingual representation learning. Honorary Researcher at the University of the Basque Country. |
Reka has developed models across multiple size tiers, all trained from scratch with native multimodal capabilities. The full lineup includes general-purpose multimodal models, reasoning-tuned models, quantized variants, and specialized vision language models for physical AI.
| Model | Release | Parameters | Modalities | Notes |
|---|---|---|---|---|
| Yasa-1 | October 2023 | Undisclosed | Text, image, short video, audio | First public Reka product. 20 languages. 24K default context, up to 100K. |
| Reka Flash 1.0 | November 2023 | 21B | Text, image, video | Original turbo-class multimodal model. |
| Reka Edge 1.0 | April 2024 | 7B | Text, image, video | Compact frontier model for resource-constrained deployment. |
| Reka Core | April 15, 2024 | ~67B (reported) | Text, image, video, audio | Frontier-class multimodal model. 128K context. Trained on 32 languages. |
| Reka Flash 3 | March 11, 2025 | 21B | Text, image | Open weights under Apache 2.0. Reasoning-tuned with RLOO. |
| Reka Flash 3.1 | 2025 | 21B | Text, image | +10 points on LiveCodeBench v5. Stronger coding and agentic baseline. |
| Reka Spark | 2025 | ~2B | Text, image | Ultra-compact tier for lightweight tasks. |
| Reka Flash 3.1 RekaQuant Q3_K_S | July 10, 2025 | 21B (3.5-bit) | Text, image | Quantized version. 1.6-point average drop vs full precision. |
| Reka Edge 2603 | March 11, 2026 | 7B | Image, video, text | New vision language model for physical AI. 4-bit quantized variant uses about 5 GB memory. |
Reka Core is the company's largest and most capable model. While Reka has not officially disclosed the exact parameter count, reports suggest it has approximately 67 billion parameters. Core features a 128K context window and can process text, images, video, and audio inputs simultaneously.
Core was designed as a frontier-class model and was benchmarked against GPT-4V, Claude 3 Opus, and Gemini Ultra at the time of its release. On language tasks, it achieved an MMLU score of 83.2, a GSM8K score of 92.2, and a HumanEval score of 76.8. On multimodal tasks, it scored 56.3 on MMMU, 78.1 on VQAv2, and 59.3 on the Perception-Test video benchmark.
Reka Core launched at $10 per million input tokens and $25 per million output tokens, available through Reka's own API, through Snowflake Cortex, through the Oracle Cloud Marketplace, and via on-premise deployment using Docker containers.
Reka Flash is a 21 billion parameter dense model trained on approximately 5 trillion text tokens with an 8K default context window that can extend to 128K. Flash was positioned as a "turbo-class" model, offering fast inference and strong performance at a lower computational cost than Core.
Flash outperformed several larger models on key benchmarks: it scored 75.9 on MMLU, 85.8 on GSM8K, 72.0 on HumanEval, and 53.3 on MMMU. These scores surpassed Gemini Pro 1.0 and Llama 2 70B on multiple measures.
Reka Edge is a 7 billion parameter dense model trained on 4.5 trillion text tokens. It was designed for resource-constrained environments such as on-device deployment and local inference scenarios. Edge features an 8K default context window that can extend to 64K.
Despite its compact size, Edge outperformed other models in the 7B class, including Mistral 7B, Gemma 7B, and Llama 2 7B. It scored 65.7 on MMLU, 66.2 on GSM8K, and 54.3 on HumanEval.
In March 2026, Reka released a new generation of Reka Edge specifically engineered for physical AI applications. The 2026 release uses 3x fewer input tokens and reaches 65% higher throughput than leading 8B models in the same class. With 4-bit quantization, memory drops from 13 GB to about 5 GB while retaining 98% of original performance, and the model can process about 5.46 images per second on standard inference hardware. Reka pitched this version as the model that lets robots, drones, and security cameras run a real frontier-style vision language model on the device itself rather than streaming everything to the cloud.
Released in March 2025 under the Apache 2.0 license, Reka Flash 3 is a 21 billion parameter general-purpose reasoning model trained from scratch. Unlike the earlier Flash model, Flash 3 was specifically optimized for reasoning tasks through reinforcement learning (RLOO). It uses a Llama-compatible architecture and the cl100k_base tokenizer. The reasoning trace is wrapped in dedicated tags so that downstream applications can choose to display, hide, or audit the model's intermediate steps.
Flash 3.1, an updated version released shortly after, incorporated advances in Reka's reinforcement learning stack and improved by 10 points on LiveCodeBench v5. Flash 3.1 showed particular strength on coding benchmarks and as a base model for fine-tuning on agentic tasks.
In July 2025, Reka released a 3.5-bit quantized variant of Flash 3.1 produced with the open-source Reka Quant library. The quantized variant runs in roughly the same memory budget as a 7B model in full precision, while keeping benchmark loss to about 1.6 points on average.
Reka Spark is an ultra-compact model with approximately 2 billion parameters, designed for lightweight tasks and edge deployment on smaller devices. Spark serves as the most affordable entry point in Reka's model lineup.
The following table summarizes Reka's model performance on key benchmarks alongside leading competitors, as reported in the April 2024 technical report.
| Benchmark | Reka Edge (7B) | Reka Flash (21B) | Reka Core | GPT-4 (0613) | Claude 3 Opus | Gemini Ultra |
|---|---|---|---|---|---|---|
| MMLU | 65.7 | 75.9 | 83.2 | 86.4 | 86.8 | 83.7 |
| GSM8K | 66.2 | 85.8 | 92.2 | 92.0 | 95.0 | 94.4 |
| HumanEval | 54.3 | 72.0 | 76.8 | 76.5 | 84.9 | 74.4 |
| GPQA | - | 34.0 | 38.2 | 38.1 | 50.2 | 35.7 |
| Benchmark | Reka Flash (21B) | Reka Core | GPT-4V | Claude 3 Opus | Gemini Ultra |
|---|---|---|---|---|---|
| MMMU | 53.3 | 56.3 | 56.8 | 59.1 | 59.4 |
| VQAv2 | 78.4 | 78.1 | 77.2 | - | 77.8 |
| Perception-Test (Video) | 56.4 | 59.3 | - | - | 54.7 |
| Benchmark | Reka Edge | Reka Flash | Reka Core | GPT-4 |
|---|---|---|---|---|
| MedMCQA | 52.6 | 71.3 | 80.6 | 72.4 |
| PubMedQA | 71.6 | 69.0 | 74.6 | 80.4 |
| MMLU Medical | 65.7 | 79.5 | 88.3 | 90.3 |
| Model | ELO Score | Win Rate |
|---|---|---|
| GPT-4V | 1201 | 79.4% |
| Reka Core | 1130 | 72.2% |
| Reka Flash | 1082 | 66.8% |
| Claude 3 Opus | 1073 | 66.2% |
These results were notable because Reka built these models with a team of roughly 20 core researchers, while competitors employed teams that were orders of magnitude larger.
| Benchmark | Reka Flash 3 | Reka Flash 3.1 | OpenAI o1-mini |
|---|---|---|---|
| LiveCodeBench v5 | baseline | +10 over Flash 3 | competitive |
| AIME (math) | competitive with o1-mini | improved | reference |
| MMLU-Pro reasoning | competitive | competitive | reference |
The reasoning-tuned Flash 3 family closed much of the gap to closed-source reasoning models in math, code, and tool-use evaluations, despite weighing in at only 21B parameters and being released with open weights under Apache 2.0.
Reka Chat is the company's consumer-facing chat interface, available at chat.reka.ai. It allows users to interact with Reka's models for text conversations, document analysis, and multimodal queries involving images and files.
Reka Research is an agentic AI product that can browse the web and search through private documents to answer complex questions. The tool performs multiple web searches, visits dozens of websites in one to three minutes, and synthesizes information from multiple sources in a multi-hop manner, reasoning in natural language before taking each step. Every answer includes a traceable path of the steps taken.
The Reka Research API is OpenAI-compatible and supports customizable execution, including domain restrictions and structured JSON output. Enterprises can deploy Reka Research on-premise, in a private cloud, or through the API.
Reka Nexus is the company's AI workforce platform, launched in March 2025. It allows enterprise customers to spin up specialized AI workers that handle defined business tasks, such as conducting deep research on a topic, processing invoices, or generating sales leads. Each Nexus worker is built on top of Reka Flash and can be customized with the customer's own data, deployed on-premise or on-device, and quantized for cost-sensitive workloads. The platform produces a human-readable trace of every action a worker takes, which is useful in regulated industries that need an audit log for any AI-driven decision.
Reka Vision, launched in July 2025, is a multimodal video and image understanding platform. The product is built around three primitives: Watch (continuous monitoring of video streams or libraries), Search (natural-language retrieval over images and clips), and Chat (multimodal Q&A grounded in the user's media). A model planner orchestrates these primitives to handle higher-level workflows, such as turning a long YouTube video into a set of social-media-ready short clips, or scanning thousands of hours of camera footage for a specific incident.
Early Reka Vision deployments include Shutterstock, which uses the platform to power semantic search across its image and video catalog, and Turing Video, which built Guardian AI, an agentic surveillance product, on top of the platform. Reka Vision is also integrated with Nvidia's AI Blueprint for video search and summarization, allowing customers to combine Reka's reasoning with Nvidia's reference video pipeline.
Reka offers flexible deployment options for enterprises, including API access, on-premise deployment, and private cloud (VPC) deployment via Docker containers. This flexibility is a key differentiator, as many competing AI providers only offer cloud-based API access. Enterprises can also fine-tune Reka's models on their own proprietary datasets.
| Tier | Use case | Notes |
|---|---|---|
| Reka Core API | Frontier multimodal use | $10 per million input tokens, $25 per million output tokens at launch. |
| Reka Flash API | Mid-tier multimodal use | Lower per-token price than Core. Available via Reka API, Snowflake Cortex, and Oracle Cloud Marketplace. |
| Reka Edge / Spark | Edge and on-device | Open weights for the reasoning Flash 3 family. Edge can be deployed on a single GPU or quantized for CPU. |
| On-premise / VPC | Regulated and large enterprise | Docker container licensing. Pricing negotiated per deployment. |
| Reka Nexus | AI workforce | Per-seat or per-task licensing for AI workers. |
| Reka Vision | Video and image understanding | Usage-based pricing for video minutes processed and image queries. |
Snowflake has been one of Reka's most significant partners. Snowflake Ventures participated in Reka's initial $58 million funding round in 2023 and later backed the $110 million round in 2025. In May 2024, Bloomberg reported that Snowflake and Reka were in advanced talks for a full acquisition valued at over $1 billion, but those talks ended on May 22, 2024 without a deal. Rather than walk away, the companies turned the discussion into a deeper commercial partnership. Reka Flash and Reka Core were integrated into Snowflake's Cortex AI service, making Reka's multimodal capabilities available to over 400 enterprises using Snowflake's data cloud, and Snowflake Ventures kept its equity stake.
The integration allows Snowflake customers to build generative AI applications that can process text, images, and video inputs without moving data outside of Snowflake's platform. Reka's models brought the total number of LLMs available in Snowflake Cortex to approximately a dozen.
Nvidia has been both an investor and a technology partner for Reka. Beyond participating in the $110 million funding round in 2025, Nvidia's GPU infrastructure serves as the primary compute platform for training and serving Reka's models. The partnership reflects the close relationship between frontier AI model developers and Nvidia's hardware ecosystem. Reka Vision is also integrated with Nvidia's AI Blueprint for video search and summarization, which packages a reference architecture for ingesting video, extracting captions, and answering natural-language queries.
In April 2024, Reka announced a partnership with Oracle. Reka selected Oracle Cloud Infrastructure (OCI) as its preferred cloud platform for training and serving multimodal models, running on OCI's Nvidia GPU clusters. Reka Core and Reka Flash were also added to the Oracle Cloud Marketplace, where Oracle's enterprise customers can deploy them alongside Oracle databases, ERP systems, and analytics tooling. The partnership gave Reka a third major distribution channel beyond its own API and the Snowflake Cortex integration.
Following the launch of Reka Vision in July 2025, Reka announced anchor customers in two adjacent markets. Shutterstock adopted Reka Vision for natural-language search across its image and video catalog, allowing users to find clips by describing the content rather than guessing tags. Turing Video built Guardian AI, an agentic surveillance product, on top of Reka Vision, with capabilities including searching for specific events across recorded footage, configuring smart alerts, and generating incident summaries written in natural language.
Unlike many competitors that built multimodal capabilities as extensions to existing text-only models, Reka trained its models as natively multimodal from the beginning. This approach means the models can process and reason across text, images, video, and audio within a single unified architecture, rather than relying on separate encoders bolted onto a text model after the fact.
This design choice has direct consequences for what the models can do. Reka Core can take an image and a long audio clip in the same prompt and reason about both together, instead of running speech-to-text first and feeding only the transcript to a language model. The same logic applies to video, where Reka's models work directly on frame-level features rather than receiving a separate caption pipeline as input.
One of Reka's defining characteristics is its emphasis on training efficiency. The company has consistently achieved competitive results with significantly fewer resources than larger competitors. The original Reka Core model was developed by a team of roughly 20 researchers, compared to the hundreds or thousands of engineers at organizations like OpenAI, Google, and Anthropic. Yi Tay noted during his time at Reka that the team managed to build a GPT-4-class multimodal model from scratch in under a year with this small team.
For its 2025 reasoning models, Reka adopted REINFORCE Leave One-Out (RLOO) as its main reinforcement learning algorithm rather than PPO or DPO. RLOO uses multiple sampled responses for each prompt and treats one held-out sample as the baseline for the others, which sidesteps the cost and instability of training a separate value network. Reka combines RLOO with both rule-based rewards (for tasks like math and code where correctness can be checked automatically) and model-based rewards (for tasks like instruction following). The reasoning trace is part of the training signal, so the model learns when a longer chain of thought helps and when it does not.
Reka Quant, released as open source in July 2025, is the company's contribution to the model compression literature. The library combines two ideas: calibrated error reduction, which uses a small calibration set to learn weight-update directions that minimize quantization loss, and online self-distillation, which uses the original model as a teacher during quantization-aware fine-tuning. Together, these techniques let Reka quantize Flash 3.1 to 3.5 bits per weight while losing only about 1.6 average benchmark points, compared to 6.8 points using the standard Q3_K_S baseline in llama.cpp.
Reka has contributed to the open-source AI ecosystem through several releases. Reka Flash 3 and Flash 3.1 were released under the Apache 2.0 license, making them freely available for commercial and research use. The company also released the Vibe-Eval benchmark as an open resource for the research community, providing a standardized way to evaluate multimodal language models with expert-annotated gold-standard responses. The Reka Quant library was released on GitHub under an open license in July 2025, and the 4-bit quantized 2026 Reka Edge model was published on Hugging Face for local deployment.
| Date | Round | Amount | Lead Investors | Valuation |
|---|---|---|---|---|
| June 2023 | Series A | $58 million | DST Global Partners, Radical Ventures | ~$300 million |
| July 2025 | Growth Round | $110 million | Nvidia, Snowflake | ~$1 billion |
Total disclosed funding: approximately $168 million.
Notable investors across rounds include DST Global Partners, Radical Ventures, Snowflake Ventures, Nvidia, and angel investor Nat Friedman (former GitHub CEO). The 2024 Snowflake acquisition discussion, which would have valued Reka at more than $1 billion in cash, was the closest the company came to leaving the independent startup track. After those talks ended without a deal in May 2024, the partners chose to keep working together on the commercial side and let Reka raise its own growth round in 2025 instead.
Reka is headquartered at 530 Lawrence Expy, PMB 9004, Sunnyvale, California. The company also operates Reka AI Pte. Ltd., a Singapore entity at 1 Robinson Road, AIA Tower, which handles operations across Asia. Reka AI Ltd., the UK entity, gives the company a presence in London. Beyond these formal offices, Reka describes itself as a remote-first company with talent based in California, Seattle, London, Zurich, and Hong Kong. As of mid-2025 the company had grown to roughly 50 people, still small by frontier-lab standards but several times the size of the original 20-person research team that built Reka Core.
| Feature | Reka AI | OpenAI | Google DeepMind | Anthropic |
|---|---|---|---|---|
| Founded | 2022 | 2015 | 2010 (DeepMind) | 2021 |
| Headquarters | Sunnyvale, CA | San Francisco, CA | London, UK | San Francisco, CA |
| Team Size (approx.) | ~50 | ~3,000+ | ~2,500+ | ~1,500+ |
| Flagship Model (2024) | Reka Core | GPT-4 | Gemini Ultra | Claude 3 Opus |
| Native Multimodal | Yes (text, image, video, audio) | Yes (GPT-4V/o) | Yes (Gemini) | Partial (text, image) |
| Video Understanding | Yes | Limited | Yes | No |
| Audio Input | Yes | Yes (Whisper integration) | Yes | No |
| On-Premise Deployment | Yes | No (API only) | Limited | No (API only) |
| Open-Source Models | Yes (Flash 3, Flash 3.1, Edge 2603, Reka Quant) | Limited (Whisper, CLIP) | Yes (Gemma) | No |
| Estimated Valuation | ~$1B (2025) | ~$157B (2025) | Part of Alphabet | ~$60B (2025) |