Apriel (ServiceNow)

Updated Jun 8, 2026

Apriel is a family of open-weight small language models developed by ServiceNow, the enterprise-software company, through its in-house AI research organization. The family pairs compact size (roughly 5 billion to 15 billion parameters) with an emphasis on strong reasoning and efficiency, so that frontier-adjacent reasoning quality can be delivered on a single GPU and embedded cost-effectively in ServiceNow's enterprise AI agents and workflows. The best-known members are Apriel-5B (a general-purpose base and instruct model), Apriel-Nemotron-15B-Thinker (a reasoning model built with NVIDIA Nemotron data and infrastructure), and Apriel-1.5-15B-Thinker (a multimodal reasoning model). All are released under the permissive MIT license on Hugging Face.^[1]^[2]^[3]

The name "Apriel" derives from the Latin word aperire, meaning "to open," chosen to signal the family's open-weight release, its clear reasoning-based answers, and its accessibility for developers.^[4]

Overview

Apriel models are positioned as an efficiency-first alternative to much larger frontier systems. ServiceNow's stated thesis is that careful data curation and a staged training recipe, rather than a larger parameter count, can extract competitive reasoning from a model small enough to run on a single accelerator. ServiceNow reports that the 15B "Thinker" variants occupy roughly half the memory of comparable 32B-parameter reasoning models, allowing them to fit within a single 80 GB GPU (such as an NVIDIA H100) or a pair of consumer GPUs.^[2]^[5]
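The "roughly half the memory" comparison can be checked with back-of-the-envelope arithmetic. The sketch below is illustrative only: it assumes 16-bit weights (2 bytes per parameter) and counts weights alone, ignoring the KV cache, activations, and framework overhead. Neither assumption comes from ServiceNow's materials, and the function name is hypothetical.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes).

    bytes_per_param=2 assumes 16-bit weights (an assumption, not a sourced figure).
    """
    return num_params * bytes_per_param / 1e9


GPU_MEMORY_GB = 80  # single 80 GB accelerator, as mentioned in the text

for name, params in [("15B model", 15e9), ("32B model", 32e9)]:
    gb = weight_memory_gb(params)
    headroom = GPU_MEMORY_GB - gb
    print(f"{name}: ~{gb:.0f} GB of weights, ~{headroom:.0f} GB left on an 80 GB GPU")

# Ratio of the two weight footprints under the same precision.
ratio = weight_memory_gb(15e9) / weight_memory_gb(32e9)
print(f"15B / 32B weight footprint: {ratio:.2f}")
```

Under these assumptions the 15B weights take about 30 GB against about 64 GB for a 32B model, which leaves far more room for long reasoning traces on a single 80 GB device.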

The "Thinker" suffix denotes the reasoning variants, which produce explicit chain-of-thought traces before answering, in the same vein as other small reasoning models such as Microsoft's Phi series, Alibaba's Qwen reasoning models, NVIDIA's Nemotron family, and the distilled variants of DeepSeek-R1. The family's intended use is to power ServiceNow's agentic products and enterprise workflows, where predictable cost and on-premise deployability matter as much as raw capability.^[1]^[5]

Developer (ServiceNow AI Research)

Apriel is built by ServiceNow's AI research group, sometimes referred to as the ServiceNow Language Models (SLAM) lab, a collaboration between ServiceNow Research and ServiceNow AI. ServiceNow states that the models were "built and trained entirely in-house, from data to architecture to infrastructure."^[1]^[6] Torsten Scholak is cited as a research lead at ServiceNow's Foundation Models Lab overseeing Apriel, and Srinivas Sunkara, ServiceNow's VP of Machine Learning Engineering, has spoken publicly about the naming and design goals.^[4]^[6]

The two 15B "Thinker" reasoning models were developed in collaboration with NVIDIA. For Apriel-Nemotron-15B-Thinker, ServiceNow used NVIDIA DGX Cloud infrastructure and curated datasets from the NVIDIA Nemotron collection; the company reports that roughly a quarter of the data used in the model's depth up-scaling stage came from Nemotron.^[2]^[5] In October 2025 ServiceNow and NVIDIA announced a deepened partnership, including the next-generation "Apriel 2.0" Nemotron open-model family aimed at multimodal enterprise reasoning.^[7]

The Apriel models

The family has grown from a general-purpose 5B base model into a line of compact reasoning systems.

Apriel-5B (released April 2025) is a decoder-only model with about 4.8 billion parameters, shipped as a Base and an Instruct variant. ServiceNow reports it was trained on more than 4.5 trillion tokens and emphasizes efficiency, citing roughly 2.3 times fewer GPU-hours and about 31 percent less compute than the comparable OLMo-2 7B model.^[3]^[8]
Apriel-Nemotron-15B-Thinker (paper August 2025) is a 15-billion-parameter text reasoning model, up-scaled from Mistral-Nemo-Base-2407 and post-trained with Nemotron data using continual pretraining, supervised fine-tuning, and reinforcement learning. ServiceNow reports it matches medium-sized state-of-the-art reasoning models such as o1-mini, QwQ-32B, and EXAONE-Deep-32B at roughly half their memory footprint.^[2]^[5]
Apriel-1.5-15B-Thinker (released October 2025) is a 15-billion-parameter multimodal (image-and-text) reasoning model built from Mistral's Pixtral-12B base. Its accompanying paper carries the subtitle "Mid-training is all you need," reflecting that it reaches its quality through continual pretraining and supervised fine-tuning alone, without a reinforcement-learning or preference-optimization stage.^[1]^[9]

ServiceNow has continued to iterate, with later updates such as Apriel-1.6-15B-Thinker extending the same single-GPU, multimodal reasoning approach.^[10]

Architecture and training

Apriel-5B uses a transformer decoder with grouped-query attention and YaRN rotary position embeddings, trained on ServiceNow's in-house "Fast-LLM" training stack using on the order of 91,000 H100 GPU-hours, with a stated knowledge cutoff around April 2024.^[3]
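The reported training budget can be sanity-checked with a common rule of thumb. The sketch below assumes training compute of about 6 × parameters × tokens and an H100 dense BF16 peak of about 989 TFLOP/s. Both are assumptions introduced here for illustration, not figures from ServiceNow, and because the token count is reported as "more than 4.5 trillion" the result is a lower bound.

```python
# Figures taken from the article text.
PARAMS = 4.8e9        # ~4.8 billion parameters
TOKENS = 4.5e12       # "more than 4.5 trillion" tokens, so a lower bound
GPU_HOURS = 91_000    # "on the order of" 91,000 H100 GPU-hours

# Assumption: approximate dense BF16 peak throughput of one H100 (SXM).
H100_PEAK_FLOPS = 989e12

# Assumption: the 6 * N * D rule of thumb for dense transformer training FLOPs.
total_flops = 6 * PARAMS * TOKENS

# Average sustained throughput per GPU implied by the reported GPU-hours.
flops_per_gpu_second = total_flops / (GPU_HOURS * 3600)

# Implied fraction of peak hardware throughput.
utilization = flops_per_gpu_second / H100_PEAK_FLOPS

print(f"Estimated training compute: {total_flops:.3e} FLOPs")
print(f"Implied throughput per GPU: {flops_per_gpu_second / 1e12:.0f} TFLOP/s")
print(f"Implied utilization:        {utilization:.0%}")
```

Under these assumptions the implied utilization comes out near 40 percent. The estimate is only as good as the rule of thumb and the rounded inputs, so it should be read as a plausibility check rather than a measured value.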

The 15B "Thinker" models are not trained from scratch. They are produced by depth up-scaling, in which transformer layers from a smaller open base model are duplicated to enlarge the network before further training. ServiceNow reports that this approach requires less than 20 percent of the compute needed to train a model from scratch.^[2]
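As a minimal sketch of the idea, depth up-scaling can be pictured as copying a block of layers and splicing the copies back into the stack. The example below uses plain Python dictionaries as stand-ins for transformer layers. The function name and the choice of which block to duplicate (layers 16 to 23) are hypothetical and not taken from ServiceNow's papers, which should be consulted for the actual layer selection.

```python
import copy


def depth_upscale(layers, start, end):
    """Return a deeper stack in which layers[start:end] appear twice in a row.

    The inserted block is deep-copied so the duplicates can diverge from the
    originals during later training.
    """
    block = [copy.deepcopy(layer) for layer in layers[start:end]]
    return layers[:end] + block + layers[end:]


# Stand-in "layers": each is a dict with a name and a toy weight list.
base = [{"name": f"layer_{i}", "weights": [float(i)]} for i in range(40)]

# Hypothetical choice: duplicate an 8-layer block from the middle of the stack.
deeper = depth_upscale(base, start=16, end=24)

print(len(base), "->", len(deeper))  # 40 -> 48
```

The duplicated layers start from already-trained weights, which is why the enlarged model needs only further training rather than pretraining from random initialization.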

For Apriel-Nemotron-15B-Thinker, ServiceNow describes a four-stage pipeline: (1) depth up-scaling from Mistral-Nemo-Base-2407 (a 12B model), (2) continual pretraining on roughly 68 billion tokens weighted toward reasoning and chain-of-thought data, (3) supervised fine-tuning (including specialized math-focused checkpoints later merged by weight averaging), and (4) reinforcement learning using Group Relative Policy Optimization (GRPO) with rule-based rewards.^[2]
For Apriel-1.5-15B-Thinker, ServiceNow builds on Pixtral-12B-Base-2409, expanding the decoder from 40 to 48 layers and realigning the projection network so the enlarged decoder matches the existing vision encoder. Training then proceeds through staged continual pretraining (developing text and vision understanding, then visual reasoning via synthetic data) followed by text-only supervised fine-tuning on curated reasoning traces. The paper reports the final model was trained on 640 NVIDIA H100 GPUs over roughly seven days, and that no image-specific SFT or RL stage was used.^[1]^[9]
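The core of the GRPO stage mentioned for Apriel-Nemotron-15B-Thinker is that each sampled answer is scored against the other answers drawn for the same prompt, so no learned value model is needed. The sketch below shows only that group-relative advantage step, with a toy exact-match rule as the reward. The reward rule, function names, and sample answers are hypothetical. The sketch also omits the clipped policy-gradient objective and any KL penalty, and implementations differ on details such as population versus sample standard deviation.

```python
from statistics import mean, pstdev


def rule_based_reward(answer: str, reference: str) -> float:
    """Hypothetical rule: 1.0 for an exact match after trimming whitespace, else 0.0."""
    return 1.0 if answer.strip() == reference.strip() else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward by the mean and standard deviation of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; eps guards against a zero-variance group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one prompt whose reference answer is "42".
samples = ["42", "41", " 42 ", "7"]
rewards = [rule_based_reward(s, "42") for s in samples]
advantages = group_relative_advantages(rewards)

for s, r, a in zip(samples, rewards, advantages):
    print(f"{s!r:8} reward={r:.1f} advantage={a:+.3f}")
```

Answers that beat the group average receive positive advantages and the rest receive negative ones. When every sample in a group earns the same reward, all advantages are zero and that prompt contributes no learning signal.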

Benchmarks

ServiceNow's central claim is "frontier-adjacent" reasoning at around 15 billion parameters. The headline result is for Apriel-1.5-15B-Thinker, which the independent evaluation service Artificial Analysis scores at 52 on its Artificial Analysis Intelligence Index, an aggregate of ten third-party evaluations. ServiceNow and Artificial Analysis note this is competitive with much larger systems such as DeepSeek-R1-0528 while the model is roughly 8 to 10 times smaller, framing it as comparable intelligence at a fraction of the cost.^[1]^[11] These scores are vendor-reported or drawn from a third-party aggregator and should be read as such; users should consult the original sources for exact, current figures.

Reported component scores for Apriel-1.5-15B-Thinker include approximately 88 on AIME 2025, about 71 on GPQA Diamond, about 73 on LiveCodeBench, 62 on the IFBench instruction-following benchmark, and 68 on the Tau-squared Bench (Telecom) agentic benchmark.^[1]^[11]

For the earlier Apriel-Nemotron-15B-Thinker, ServiceNow's paper reports figures such as AIME 2024 around 73, MATH-500 around 92, MMLU-Pro around 73, and GPQA Diamond around 57, evaluated alongside enterprise-focused tasks including MBPP, BFCL, Enterprise RAG, MT-Bench, MixEval, IF-Eval, and MultiChallenge.^[2]

The table below summarizes the principal members of the family.

Model	Released	Parameters	Modality	Base model	Key training stages	License	Notable reported result
Apriel-5B (Base / Instruct)	April 2025	~4.8B	Text	Trained from scratch	Pretraining (4.5T+ tokens) on Fast-LLM	MIT	GSM8K ~64 (Base); IFEval ~81 (Instruct)
Apriel-Nemotron-15B-Thinker	August 2025 (paper)	~15B	Text	Mistral-Nemo-Base-2407 (12B)	Depth up-scaling, CPT (~68B tokens), SFT, GRPO RL	MIT	AIME 2024 ~73; MATH-500 ~92
Apriel-1.5-15B-Thinker	October 2025	~15B	Text + image	Pixtral-12B-Base-2409	Depth up-scaling (40 to 48 layers), staged CPT, SFT	MIT	Artificial Analysis Index 52; AIME 2025 ~88

Figures are approximate and as reported by ServiceNow or by Artificial Analysis; benchmark methodologies differ and results change with evaluation settings.

Availability and significance

All Apriel models are released as open-weight checkpoints under the MIT license and are distributed through Hugging Face, with the reasoning variants also offered via inference partners.^[1]^[3]^[10] The MIT license permits broad commercial use, which fits ServiceNow's strategy of embedding the models in its own platform while also encouraging external adoption.

Apriel illustrates a broader 2025 to 2026 trend toward small, open reasoning models that aim to close much of the gap to frontier systems through data-centric training rather than sheer scale, alongside families such as Phi, Qwen, Nemotron, and DeepSeek distillations. For ServiceNow specifically, the family provides a controllable, deployable foundation for enterprise AI agents, where the ability to run capable reasoning on a single GPU translates directly into lower serving cost and easier on-premise or regulated deployment.^[1]^[5]^[7]

References

1. ServiceNow-AI. "Apriel-1.5-15b-Thinker." Hugging Face. https://huggingface.co/ServiceNow-AI/Apriel-1.5-15b-Thinker
2. ServiceNow Language Models Lab. "Apriel-Nemotron-15B-Thinker." arXiv:2508.10948. https://arxiv.org/abs/2508.10948
3. ServiceNow-AI. "Apriel-5B-Base." Hugging Face. https://huggingface.co/ServiceNow-AI/Apriel-5B-Base
4. The Letter Two. "The Meaning Behind ServiceNow's Apriel AI Model Name." May 18, 2025. https://thelettertwo.com/2025/05/18/why-servicenow-named-its-lightweight-ai-models-apriel/
5. NVIDIA Blog. "Your Service Teams Just Got a New Coworker, Built by ServiceNow and NVIDIA." https://blogs.nvidia.com/blog/servicenow-apriel-nemotron/
6. ServiceNow. "Apriel Model Family: Frontier Reasoning." 2025. https://www.servicenow.com/blogs/2025/apriel-model-family-frontier-reasoning
7. Business Wire. "ServiceNow Unites Intelligent Workflows and Open Models with NVIDIA Technologies to Scale Trusted AI Across Industries." October 28, 2025. https://www.businesswire.com/news/home/20251028739624/en/
8. ServiceNow. "Apriel 5B: Small but mighty enterprise language model." 2025. https://www.servicenow.com/blogs/2025/apriel-5b-small-enterprise-language-model
9. ServiceNow AI Research. "Apriel-1.5-15b-Thinker: Mid-training is all you need." arXiv:2510.01141. https://arxiv.org/abs/2510.01141
10. ServiceNow-AI. "Apriel-1.6-15b-Thinker: Cost-efficient Frontier Multimodal Performance." Hugging Face Blog. https://huggingface.co/blog/ServiceNow-AI/apriel-1p6-15b-thinker
11. Artificial Analysis. "Apriel-v1.5-15B-Thinker: Intelligence, Performance & Price Analysis." https://artificialanalysis.ai/models/apriel-v1-5-15b-thinker
