HiDream

Updated Jul 16, 2026

HiDream most commonly refers to HiDream-I1, an open-source text-to-image generative foundation model released in April 2025 by the Chinese company HiDream.ai (Chinese: 智象未来). The model has about 17 billion parameters and uses a sparse diffusion transformer with a mixture-of-experts design. It drew attention shortly after release for topping the Artificial Analysis text-to-image arena, where its distilled "Dev" variant briefly outranked closed models including GPT-4o image generation. ^[1]^[2] The same name also covers a small family of related models from HiDream.ai (the E1 editing models and the later O1-Image model) and the company's consumer products, vivago.ai and PixMaker. ^[3]

HiDream.ai (the company)

HiDream.ai is a generative AI company founded in 2023 and headquartered in Beijing, China. ^[4] It builds foundation models for image, video, 3D, and text generation, and it operates consumer and marketing-oriented creative tools on top of those models. ^[3]^[4]

The founder and CEO is Tao Mei (梅涛), a computer-vision and multimedia researcher. Before starting HiDream.ai he was a Vice President at the e-commerce company JD.com and, earlier, a Senior Research Manager at Microsoft Research. ^[5] He holds B.E. and Ph.D. degrees from the University of Science and Technology of China, and he is a Fellow of the ACM, IEEE, IAPR, and CAAI, as well as an International Fellow of the Canadian Academy of Engineering. ^[5] (Some secondary descriptions order the name as "Mei Tao"; the surname is Mei.)

The company has raised venture funding across multiple rounds, including a seed round in late 2023 and, in April 2026, a round exceeding CNY 500 million led by investors including Oriental Fortune Capital and Anhui Provincial Investment Group, earmarked for a next-generation multimodal model and enterprise products. ^[6]

HiDream-I1 model

HiDream-I1 is the company's first open-source image foundation model. The transformer weights were published on GitHub and Hugging Face in early April 2025, and a technical report, "HiDream-I1: A High-Efficient Image Generative Foundation Model with Sparse Diffusion Transformer," followed on 28 May 2025 (arXiv:2505.22705). ^[7]^[8]

The model generates 1024 x 1024 images from text prompts. Its authors describe it as achieving state-of-the-art prompt following among open models while keeping inference fast, which is the basis for the "High-Efficient" wording in the report title. ^[8]

Architecture

HiDream-I1 has roughly 17 billion parameters and is built as a sparse diffusion transformer. The network combines dual-stream blocks, which process image and text tokens in separate paths, with single-stream blocks, where the two modalities interact. Both block types use a dynamic mixture-of-experts layer, so only part of the network is active for a given token, which keeps compute lower than a dense model of the same nominal size. ^[8]
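
The sparse-activation idea can be illustrated with a minimal top-k routing sketch. This is not HiDream-I1's implementation: the text above does not state the number of experts, the routing rule, or how many experts fire per token, so softmax top-k gating, four toy experts, and k=2 are all assumptions chosen for illustration, and the names `route_top_k` and `moe_layer` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so the selected experts sum to 1.
    """
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_layer(token, gate_weights, experts, k=2):
    """Run one token through only its k selected experts (assumed routing)."""
    # One gate score per expert: dot product of the token with that expert's gate row.
    gate_scores = [sum(w * x for w, x in zip(row, token)) for row in gate_weights]
    output = [0.0] * len(token)
    for index, weight in route_top_k(gate_scores, k):
        expert_out = experts[index](token)  # unselected experts are never evaluated
        output = [o + weight * e for o, e in zip(output, expert_out)]
    return output


def make_scaling_expert(factor):
    """Toy stand-in for an expert feed-forward network."""
    return lambda vector: [factor * x for x in vector]


# Hypothetical toy setup: four experts, two-dimensional tokens.
experts = [make_scaling_expert(f) for f in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
token = [2.0, 1.0]

routes = route_top_k([2.0, 1.0, -2.0, -1.0], k=2)
mixed = moe_layer(token, gate_weights, experts, k=2)
```

With k=2 of four experts, half of the expert parameters are skipped for this token, which is the sense in which a sparse model costs less per token than a dense model of the same nominal size.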

For text conditioning the model uses four sources of text features rather than a single encoder: a long-context CLIP variant (Long-CLIP), the T5-XXL encoder, the Llama 3.1 8B Instruct language model, and pooled text embeddings. ^[1]^[8] The image autoencoder (VAE) is reused from Black Forest Labs' FLUX.1 [schnell] release. ^[1]

Variants

HiDream-I1 ships in three variants that trade image fidelity against speed by varying the number of sampling steps. The "Full" model runs the standard sampler, while "Dev" and "Fast" are distilled for fewer steps and run with classifier-free guidance effectively disabled (guidance scale 1.0). ^[1]^[9]

Variant	Inference steps	Guidance scale (cfg)	Notes
HiDream-I1-Full	50	5.0	Highest-fidelity base model
HiDream-I1-Dev	28	1.0	Guidance-distilled; the variant that topped the arena
HiDream-I1-Fast	16	1.0	Fastest; fewest steps
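
Why a guidance scale of 1.0 amounts to "disabled" can be seen from the standard classifier-free guidance formula. The sketch below uses the textbook formulation; whether HiDream-I1's pipeline combines predictions in exactly this way is an assumption, and the function name and sample vectors are hypothetical.

```python
def apply_guidance(cond, uncond, scale):
    """Textbook classifier-free guidance: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


# Hypothetical model predictions for one denoising step.
cond = [0.5, -0.25, 1.0]    # prediction given the text prompt
uncond = [0.25, 0.0, 0.5]   # prediction given an empty prompt

full_style = apply_guidance(cond, uncond, scale=5.0)       # pushes away from uncond
distilled_style = apply_guidance(cond, uncond, scale=1.0)  # equals cond exactly
```

At scale 1.0 the unconditional term cancels, so a sampler can skip the unconditional forward pass entirely. That saves compute per step on top of the reduced step count.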

The technical report cites 14 steps for the Fast configuration, slightly fewer than the 16 steps recommended in the GitHub README and common ComfyUI workflows; both figures refer to the same distilled model. ^[8]^[9]
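
The settings in the table can be captured as a small lookup, with an optional step override for cases such as the report's 14-step Fast configuration. This is a sketch only: `SamplingPreset`, `PRESETS`, and `get_preset` are hypothetical names and not part of any HiDream release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplingPreset:
    steps: int
    guidance_scale: float


# Values copied from the variants table (README / ComfyUI recommendations).
PRESETS = {
    "full": SamplingPreset(steps=50, guidance_scale=5.0),
    "dev": SamplingPreset(steps=28, guidance_scale=1.0),
    "fast": SamplingPreset(steps=16, guidance_scale=1.0),
}


def get_preset(variant, steps=None):
    """Look up a variant's preset, optionally overriding the step count."""
    preset = PRESETS[variant.lower()]
    if steps is None:
        return preset
    if steps < 1:
        raise ValueError("steps must be a positive integer")
    return SamplingPreset(steps=steps, guidance_scale=preset.guidance_scale)


readme_fast = get_preset("fast")            # 16 steps, as in the README
report_fast = get_preset("fast", steps=14)  # 14 steps, as in the technical report
```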

Open-source release and licensing

The HiDream-I1 transformer weights are released under the MIT License, which permits commercial use, and this permissive licensing was a large part of why the release was widely picked up. ^[1]^[2] Because the model is assembled from third-party components, those pieces keep their own licenses: the reused FLUX.1 [schnell] VAE is under Apache 2.0, the T5-XXL encoder is under Apache 2.0, and the bundled Llama 3.1 8B Instruct text encoder is governed by Meta's Llama 3.1 Community License. ^[1]

Benchmarks and arena standing

On standard prompt-following benchmarks reported in the technical report, HiDream-I1 scores 0.83 overall on GenEval and 85.89 on DPG-Bench, and it averages 33.82 on the HPSv2.1 human-preference benchmark. The GenEval result is above the 0.80 reported there for Janus-Pro-7B, and the HPSv2.1 average is above the 32.47 reported for FLUX.1-dev. ^[1]^[8]

Shortly after release in April 2025, HiDream-I1 reached the top of the Artificial Analysis text-to-image arena, an Elo-style leaderboard built from blind human pairwise preferences. Chinese state outlet China Daily reported that the model "topped the Artificial Analysis global leaderboard within 24 hours of its release, beating mainstream models from companies such as Midjourney, OpenAI and Google." ^[2] The MIT-licensed Dev variant was the entry that reached the number-one position, briefly above GPT-4o image generation. ^[2]^[10] That standing was temporary, since arena rankings shift whenever newer models arrive; by 2026 HiDream's own O1-Image model and various other systems sat above the original I1 entries. ^[3]
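
How pairwise votes become a ranking can be shown with a textbook Elo update. The exact rating method, starting rating, and K-factor used by Artificial Analysis are not described here, so everything below (including the model names and votes) is an illustrative assumption.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, a_won, k=32.0):
    """Return new (rating_a, rating_b) after one pairwise comparison."""
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Hypothetical arena: two models start level, then receive three blind votes.
ratings = {"model_a": 1000.0, "model_b": 1000.0}
votes = [("model_a", "model_b"), ("model_a", "model_b"), ("model_b", "model_a")]

for winner, loser in votes:
    ratings[winner], ratings[loser] = update(ratings[winner], ratings[loser], a_won=True)
```

Because each vote moves the two ratings by equal and opposite amounts, a new entrant that wins comparisons pulls points directly from the models it beats, which is why a top position rarely lasts once stronger models join.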

Related models

HiDream-E1 is an instruction-based image-editing model built by fine-tuning HiDream-I1 on a dataset of about 5 million (source image, editing instruction, target image) triplets, letting users edit an image with natural-language commands and no masks. ^[8] An updated HiDream-E1.1, open-sourced on 16 July 2025, adds dynamic resolution up to roughly 1 megapixel, lifting the original E1's 768 x 768 limit. ^[11]
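
A pixel budget of this kind is commonly enforced by scaling the input while preserving its aspect ratio. The sketch below is not E1.1's actual preprocessing: the 1024 x 1024 budget, the rounding of each side down to a multiple of 16, and the function name are assumptions made for illustration.

```python
import math


def fit_to_pixel_budget(width, height, max_pixels=1024 * 1024, multiple=16):
    """Shrink (width, height) to fit max_pixels, keeping the aspect ratio.

    Images already within the budget are not enlarged. Each side is rounded
    down to a multiple of `multiple` (an assumed constraint, not a documented one).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = min(1.0, math.sqrt(max_pixels / (width * height)))
    new_width = max(multiple, int(width * scale) // multiple * multiple)
    new_height = max(multiple, int(height * scale) // multiple * multiple)
    return new_width, new_height


small = fit_to_pixel_budget(768, 768)    # within budget, left unchanged
large = fit_to_pixel_budget(1920, 1080)  # scaled down to fit the budget
```

Rounding down rather than to the nearest multiple guarantees the result never exceeds the budget, at the cost of a slightly smaller image.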

In 2026 HiDream.ai released HiDream-O1-Image, an 8-billion-parameter model open-sourced under the MIT License on 8 May 2026 (arXiv:2605.11061). ^[12] It uses a "Pixel-level Unified Transformer" that encodes raw pixels, text, and task conditions in one shared token space, dropping the external VAE and separate text encoders used by latent diffusion models, and it handles text-to-image generation, instruction editing, and subject-driven personalization at up to 2048 x 2048. ^[12] At launch it sat around eighth on the Artificial Analysis text-to-image arena, the highest-placed open-weight model there at the time, despite being far smaller than competitors such as FLUX.2 [dev]. ^[12]

Products

Beyond the open models, HiDream.ai runs vivago.ai, a consumer all-in-one creative assistant available on web and mobile that generates and edits images and other media from text prompts; a relaunched "vivago 2.0" bundles several creation tools into one workflow. ^[3] The company also offers PixMaker (also referred to in some materials by the name Pixeling), aimed at marketing and commercial content creation. ^[3] These products are powered by HiDream.ai's own foundation models, including the HiDream-I1 line.

References

1. HiDream-ai, "HiDream-ai/HiDream-I1-Full" (model card), Hugging Face. https://huggingface.co/HiDream-ai/HiDream-I1-Full
2. China Daily, "AI model of GUi-backed startup tops global ranking," 29 April 2025. https://regional.chinadaily.com.cn/GUI/2025-04/29/c_1089736.htm
3. Artificial Analysis, "HiDream: Quality, Generation Time & Price Analysis" (model family page). https://artificialanalysis.ai/image/model-families/hidream
4. Tracxn, "HiDream.ai, 2026 Company Profile, Team, Funding & Competitors." https://tracxn.com/d/companies/hidream.ai/__J7YzOctipcIcRt9HOMc7YAKTDBVK-kLMgjbci3BgPBY
5. Tao Mei, "Tao Mei, Ph.D., Founder & CEO of HiDream.ai" (personal site). https://taomei.me/
6. Gasgoo Auto News, "HiDream.ai completes new financing round exceeding 500 million yuan," April 2026. https://autonews.gasgoo.com/articles/news/seeds-hidreamai-completes-new-financing-round-exceeding-500-million-yuan-2046514735661445121
7. HiDream-ai, "HiDream-ai/HiDream-I1" (GitHub repository). https://github.com/HiDream-ai/HiDream-I1
8. Tao Mei et al., "HiDream-I1: A High-Efficient Image Generative Foundation Model with Sparse Diffusion Transformer," arXiv:2505.22705, 28 May 2025. https://arxiv.org/abs/2505.22705
9. RunComfy, "HiDream-I1: SOTA Image Generation with 17B Model" (workflow and recommended settings). https://www.runcomfy.com/comfyui-workflows/hidream-i1-sota-image-generation-with-17b-model
10. Cloudbooklet (via X/@cloudbooklet), announcement that HiDream-I1-Dev was open-sourced under the MIT License and ranked #1 on the Artificial Analysis leaderboard, April 2025. https://x.com/cloudbooklet/status/1909831808499187947
11. HiDream-ai, "HiDream-ai/HiDream-E1-1" (model card), Hugging Face. https://huggingface.co/HiDream-ai/HiDream-E1-1
12. HiDream-ai, "HiDream-O1-Image: A Natively Unified Image Generative Foundation Model with Pixel-level Unified Transformer," arXiv:2605.11061, May 2026. https://arxiv.org/abs/2605.11061
