Aquila (language model)
Last reviewed
Jun 3, 2026
Sources
13 citations
Review status
Source-backed
Revision
v1 · 1,326 words
Aquila (Chinese: 悟道·天鹰, Wudao Tianying) is a series of open bilingual (Chinese and English) large language models developed by the Beijing Academy of Artificial Intelligence (BAAI). The first generation was released in June 2023 as part of BAAI's WuDao 3.0 announcement, and the second generation, Aquila2, followed in October 2023. The models are distributed through BAAI's FlagOpen open-source ecosystem and its FlagAI toolkit, with weights hosted on Hugging Face and BAAI's ModelHub. Aquila is named for the Latin word for "eagle," matching the Chinese "Tianying" (heavenly eagle). [1][2]
BAAI introduced its WuDao line in 2021 with WuDao 1.0 and the much-publicized WuDao 2.0, a 1.75 trillion-parameter sparse model. WuDao 3.0, presented at the 5th BAAI Conference in June 2023, marked a shift away from a single very large model toward a modular system of openly released components. Within that system the language side was branded "WuDao Aquila," and the vision and multimodal side "WuDao Vision." [2][3]
Aquila was positioned as a from-scratch bilingual model rather than a fine-tune of an existing Western base model. BAAI stated that the architecture borrows design ideas from GPT-3 and LLaMA, swaps in more efficient low-level operators, and uses a redesigned tokenizer built for mixed Chinese and English text. Training used BAAI's BMTrain library, which BAAI reported reached close to eight times the training efficiency of a Megatron plus DeepSpeed ZeRO-2 setup on the same task. [1][4]
The first-generation Aquila centered on a 7-billion-parameter base model, Aquila-7B, released alongside the AquilaChat-7B dialogue model and the AquilaCode code-generation models. BAAI announced 33-billion-parameter versions (Aquila-33B and AquilaChat-33B) as forthcoming and discussed AquilaChat-33B in its WuDao 3.0 materials, but the 33B base weights were not part of the initial public release. The 7B family went through several point releases in mid-2023, including v0.9 in July and v0.10 on 15 August 2023; BAAI said the latter improved scores on benchmarks such as TruthfulQA. [1][4][5]
Aquila2 is the second-generation series, released on 12 October 2023 on BAAI ModelHub and Hugging Face. It added a much larger 34B model and an experimental 70B model, and replaced BMTrain with FlagScale, a training framework based on the Megatron-LM project. A version 1.2 update to the 34B models followed on 25 October 2023, and the experimental 70B models were released on 30 November 2023. The accompanying Aquila2 technical report (arXiv, August 2024) describes the architecture: grouped-query attention (GQA), rotary position embeddings (RoPE), a 100,000-token byte-pair-encoding vocabulary, and bfloat16 mixed-precision training. According to that report, Aquila2-34B was trained on roughly 1.8 trillion bilingual tokens. [6][7]
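As a rough illustration of the grouped-query attention named in the report, the following minimal Python sketch shows how several query heads can share one key/value head. It is not BAAI's implementation: the function names, head counts, and toy values are hypothetical, chosen only to show the grouping, and it uses plain lists rather than a tensor library.

```python
import math


def kv_head_for(query_head, n_query_heads, n_kv_heads):
    """Return the index of the shared key/value head used by a query head."""
    if n_query_heads % n_kv_heads != 0:
        raise ValueError("n_query_heads must be a multiple of n_kv_heads")
    group_size = n_query_heads // n_kv_heads
    return query_head // group_size


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def grouped_query_attention(q, k, v):
    """Causal attention where groups of query heads share key/value heads.

    q has shape [n_query_heads][seq_len][head_dim];
    k and v have shape [n_kv_heads][seq_len][head_dim].
    """
    n_query_heads, n_kv_heads = len(q), len(k)
    head_dim = len(q[0][0])
    output = []
    for head in range(n_query_heads):
        shared = kv_head_for(head, n_query_heads, n_kv_heads)
        rows = []
        for t, query in enumerate(q[head]):
            # Causal mask: position t attends only to positions 0..t.
            scores = [
                sum(a * b for a, b in zip(query, k[shared][s])) / math.sqrt(head_dim)
                for s in range(t + 1)
            ]
            weights = softmax(scores)
            rows.append([
                sum(weights[s] * v[shared][s][j] for s in range(t + 1))
                for j in range(head_dim)
            ])
        output.append(rows)
    return output


# Hypothetical toy setup: 4 query heads, 2 key/value heads, 2 positions, head_dim 2.
q = [[[1.0, 0.0], [0.0, 1.0]] for _ in range(4)]
k = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [1.0, 1.0]]]
v = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
out = grouped_query_attention(q, k, v)
print([kv_head_for(h, 4, 2) for h in range(4)])  # [0, 0, 1, 1]
```

In this toy setup, query heads 0 and 1 read the same keys and values, as do heads 2 and 3, so only two key/value heads need to be cached instead of four. That reduction in cached keys and values is the usual motivation for the technique.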
The table below lists the main released models.
| Model | Generation | Parameters | Type | Context | Notes |
|---|---|---|---|---|---|
| Aquila-7B | Aquila | 7B | Base | 2,048 | First-generation base model |
| AquilaChat-7B | Aquila | 7B | Chat (SFT + RL) | 2,048 | Dialogue model |
| AquilaCode-multi | Aquila | 7B | Code | 2,048 | Multilingual code generation |
| AquilaCode-py | Aquila | 7B | Code | 2,048 | Python-focused code generation |
| Aquila2-7B | Aquila2 | 7B | Base | 2,048 | Second-generation 7B base model |
| AquilaChat2-7B | Aquila2 | 7B | Chat | 2,048 | Also a 16K long-context variant |
| Aquila2-34B | Aquila2 | 34B | Base | 4,096 | v1.2 update Oct 2023 |
| AquilaChat2-34B | Aquila2 | 34B | Chat | 4,096 | 16K and Int4-GPTQ variants |
| Aquila2-70B-Expr | Aquila2 | 70B | Base (experimental) | 4,096 | Released Nov 2023 |
| AquilaChat2-70B-Expr | Aquila2 | 70B | Chat (experimental) | 4,096 | Experimental |
AquilaChat is the conversational variant. AquilaChat-7B is a supervised fine-tuning (SFT) model built on Aquila-7B and further refined with reinforcement learning. BAAI described it as supporting fluent Chinese and English dialogue and extensible tool use through a special instruction format, including hooks to call image models such as AltDiffusion. AquilaChat2 carried the dialogue line into the second generation, adding AquilaChat2-7B and AquilaChat2-34B, long-context 16K variants (AquilaChat2-7B-16k and AquilaChat2-34B-16k), and a 4-bit quantized AquilaChat2-34B-Int4-GPTQ build. [4][6][8]
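To give a sense of what a 4-bit build such as AquilaChat2-34B-Int4-GPTQ trades off, here is a minimal Python sketch of symmetric round-to-nearest quantization of a small group of weights, plus a weights-only storage estimate. This is deliberately not GPTQ itself, which additionally uses approximate second-order information to compensate for rounding error. The function names and sample values are hypothetical, and the 34-billion figure is the nominal parameter count rather than an exact one.

```python
def quantize_int4(weights):
    """Symmetric round-to-nearest quantization of one group to integers in [-8, 7]."""
    max_abs = max(abs(w) for w in weights)
    # One floating-point scale per group maps the largest magnitude to code 7.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int4(codes, scale):
    """Recover approximate floating-point weights from 4-bit codes."""
    return [c * scale for c in codes]


def weight_gigabytes(n_params, bits_per_weight):
    """Storage for the weights alone in GB (10**9 bytes), ignoring scales and metadata."""
    return n_params * bits_per_weight / 8 / 1e9


# Hypothetical group of four weights.
original = [0.7, -0.33, 0.12, 0.0]
codes, scale = quantize_int4(original)
restored = dequantize_int4(codes, scale)
print(codes, restored)

# Nominal 34B parameters at 16 bits versus 4 bits per weight.
print(weight_gigabytes(34e9, 16), weight_gigabytes(34e9, 4))
```

Under these assumptions the weights alone shrink from roughly 68 GB at 16 bits to roughly 17 GB at 4 bits, while each restored weight differs from the original by at most half a quantization step.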
AquilaCode is the code-generation variant, produced by continuing training from the Aquila-7B base on filtered open-source code. The first generation shipped as AquilaCode-multi for multilingual code and AquilaCode-py for Python, and BAAI also released hardware-specific builds: AquilaCode-7B-NV, trained on NVIDIA GPUs, and AquilaCode-7B-TS, trained on domestic accelerators. BAAI emphasized that the code training set was deliberately small relative to those of other open code models while still aiming for competitive generation quality. [9][10]
A recurring theme in BAAI's presentation of Aquila is data compliance. The pretraining corpus is roughly 40 percent Chinese and 60 percent English. The Chinese portion was drawn from BAAI's accumulated datasets spanning more than 10,000 sources, the large majority of them domestic, supplemented with curated Chinese books and literature. BAAI promoted Aquila as the first open-source large language model to combine Chinese and English knowledge, a commercial license, and conformance with Chinese data regulations, a framing aimed at enterprises operating under those rules. The Aquila2 data pipeline relied on BAAI's FlagData tooling for cleaning and quality filtering. [4][6][11]
Aquila is one piece of FlagOpen, BAAI's umbrella for open-source AI infrastructure. Related projects include FlagAI (the model toolkit that originally hosted Aquila), FlagScale (large-model training, used for Aquila2), FlagData (data construction and cleaning), FlagEval (model evaluation), FlagPerf (AI-chip benchmarking), and FlagEmbedding, the family that includes BAAI's widely used BGE text embeddings. The Aquila2 technical report also introduces HeuriMentor, a training-management system with three parts: an Adaptive Training Engine for changing cluster size mid-run and supporting heterogeneous devices, a Training State Monitor for tracking loss and downstream metrics in real time, and a Data Management Unit for staging multi-phase data mixtures. BAAI later announced FlagOpen 2.0 at the 6th BAAI Conference in June 2024. [6][7][12]
BAAI reported strong objective-benchmark results for Aquila2-34B. The technical report gives a mean score of 72.20 across 21 bilingual datasets (68.63 on the English subset and 76.56 on the Chinese subset), and states that the 34B model outperforms LLaMA-2-70B while using fewer training tokens. Cited per-task figures include 81.18 on BUSTM (versus 71.20 for LLaMA-2-70B) and 39.02 on HumanEval (versus 29.90). The report also notes only minimal degradation under 4-bit quantization. The v1.2 update to Aquila2-34B was reported to lift scores on datasets including MMLU, TruthfulQA, CSL, TNEWS, OCNLI, and BUSTM. [6][7]
For the chat models, BAAI evaluated AquilaChat2-34B with its FlagEval framework and reported that the model approaches or exceeds GPT-3.5 on a subjective evaluation across several ability dimensions. These comparisons were prepared by the developer, as is typical of vendor-published results. Separately, the Aquila2-34B base model was independently listed on the Hugging Face Open LLM Leaderboard. [6][13]
Aquila source code is released under the Apache License 2.0, while the model weights are governed by the BAAI Aquila Model License Agreement, a custom license that permits commercial use subject to its restrictions. The experimental 70B models use a separate BAAI Aquila 70B Model License Agreement. This dual arrangement, permissive code plus a bespoke weights license allowing commercial deployment, was a selling point BAAI used to distinguish Aquila from contemporaries with more restrictive non-commercial terms. [1][6][9]