Stable Diffusion 3.5

Updated Jun 23, 2026

Stable Diffusion 3.5 (SD 3.5) is a family of open-weights text-to-image diffusion models from Stability AI, comprising three variants: Stable Diffusion 3.5 Large (8.1 billion parameters), Stable Diffusion 3.5 Large Turbo (a distilled 4-step version of Large), and Stable Diffusion 3.5 Medium (approximately 2.5 billion parameters). The Large and Large Turbo variants were released on October 22, 2024, and the Medium variant followed on October 29, 2024.^[1]^[2] All three are built on the Multimodal Diffusion Transformer (MMDiT) architecture introduced earlier the same year with Stable Diffusion 3 (SD3), and all ship under the Stability AI Community License, which is free for non-commercial use and for commercial users earning less than US$1 million in annual revenue.^[1]^[13]

Stability AI described the SD 3.5 release as "our most powerful models yet" and positioned it as "one of the most customizable and accessible image models on the market."^[1] The release was widely interpreted as a course correction following the contentious launch of SD3 Medium in June 2024, which had been criticized both for image-quality regressions, particularly in human anatomy, and for an unusually restrictive commercial license. With SD 3.5, Stability AI re-released open weights under the more permissive Stability AI Community License.^[1]^[3]^[4] The company explicitly acknowledged that its prior SD3 Medium release "didn't fully meet our standards or our communities' expectations" and framed SD 3.5 as a response to community feedback.^[1]^[5]

The release introduced architectural refinements, most notably Query-Key Normalization (QK-norm) and, in the Medium variant, an "MMDiT-X" design with additional self-attention modules and dual attention blocks. It also broadened the model's distribution channels: Hugging Face, the Stability AI API, ComfyUI, Replicate, Fireworks AI, DeepInfra, NVIDIA NIM microservices, and (from December 2024) Amazon Bedrock.^[1]^[6]^[7]^[8] In subjective and benchmark testing, SD 3.5 Large was generally described as competitive with FLUX.1 [dev] and other contemporary frontier image models on prompt adherence, while sometimes trailing FLUX.1 [pro] on photorealism.^[9]^[10]^[11]

Key facts

Field	Detail
Developer	Stability AI
Release	October 22, 2024 (Large, Large Turbo); October 29, 2024 (Medium)^[1]^[2]
Models	Stable Diffusion 3.5 Large (8.1B parameters), Large Turbo (8B distilled), Medium (~2.5B parameters)^[1]^[7]^[12]
Architecture	Multimodal Diffusion Transformer (MMDiT/MMDiT-X) with QK-normalization and (in Medium) dual attention blocks^[1]^[13]^[14]
Text encoders	OpenCLIP-ViT/G, CLIP-ViT/L, T5-XXL^[13]^[14]
Training objective	Rectified-flow formulation from Stable Diffusion 3^[15]
License	Stability AI Community License (free below US$1M annual revenue); Enterprise License required at or above that threshold^[1]^[4]
Predecessor	Stable Diffusion 3 (SD3 Medium, June 12, 2024)^[3]^[5]
Successor	No "Stable Diffusion 4" had been released as of this article's writing; SD 3.5 remained Stability AI's flagship open image model^[16]

When was Stable Diffusion 3.5 released?

Stability AI announced the Stable Diffusion 3.5 family on October 22, 2024, making SD 3.5 Large and SD 3.5 Large Turbo available immediately, with SD 3.5 Medium following on October 29, 2024.^[1]^[2] The announcement framed the release around three stated principles: that the models be customizable, efficient (able to run on consumer hardware), and produce diverse output representative of the world.^[1] All three variants were distributed as open weights through gated Hugging Face repositories under the Stability AI Community License.^[1]^[13]

Variant	Release date	Parameters	Notes
SD 3.5 Large	October 22, 2024	8.1B^[1]	Flagship; ~1 MP output^[1]^[13]
SD 3.5 Large Turbo	October 22, 2024	8B (distilled)	4-step inference via ADD^[6]^[18]
SD 3.5 Medium	October 29, 2024	~2.5B^[1]^[14]	MMDiT-X; runs on consumer GPUs^[1]^[14]

Background: SD3 and the license controversy

Stable Diffusion 3 (SD3) was first announced as a research preview in February 2024 and described in detail in the paper Scaling Rectified Flow Transformers for High-Resolution Image Synthesis by Patrick Esser and colleagues, posted to arXiv on March 5, 2024.^[15] The paper introduced the Multimodal Diffusion Transformer (MMDiT), a transformer-based replacement for the U-Net backbone used by earlier Stable Diffusion releases, along with a diffusion model training objective based on rectified flow, in which data and noise are connected by a linear trajectory and the noise schedule is reweighted toward perceptually relevant scales.^[15] The paper studied models ranging from 450 million to 8 billion parameters and reported smooth scaling improvements in validation loss and human preference.^[15]

On June 12, 2024, Stability AI publicly released SD3 Medium, a roughly 2-billion-parameter model, as open weights on Hugging Face. The release was unusually controversial for two independent reasons. First, users widely reported severe image-quality regressions, especially on human anatomy, with the model frequently producing distorted hands, feet, and limbs. Second, the accompanying license terms introduced unfamiliar restrictions on commercial use, with reviewers and community sites flagging concerns over the definition of derivative works and the termination clauses.^[4]^[5]^[17] CivitAI, a major community model-sharing platform, temporarily banned all SD3-related uploads pending clarification.^[17]

The original SD3 license also drew attention for an expansive definition of "derivative works" that some interpreted as covering any model trained on outputs from SD3, raising fears that LoRAs and fine-tunes could fall under Stability AI's continuing control. The same agreement was widely flagged for clauses that appeared to make end users liable for downstream misuse by their own customers, and for permitting Stability AI to terminate the agreement at its discretion.^[3]^[4]^[17] Stability AI revised the license terms in July 2024 to clarify that the model could be used without charge by individuals and organizations earning under US$1 million in annual revenue, but the goodwill damage was substantial.^[4] The CEO at the time apologized publicly and promised an improved model.^[5] In that context, the October 2024 SD 3.5 release was both a technical revision aimed at addressing the anatomical and prompt-adherence shortcomings of SD3 Medium and a re-affirmation of Stability AI's stated commitment to open, broadly licensed weights.^[1]

What models are in the SD 3.5 family?

The SD 3.5 release comprises three open-weights models, all distributed under the Stability AI Community License through gated Hugging Face repositories.

Stable Diffusion 3.5 Large

Stable Diffusion 3.5 Large is the flagship of the family and was released on October 22, 2024. It has approximately 8.1 billion parameters in the transformer backbone and is designed for professional use cases, generating images at resolutions up to roughly 1 megapixel (e.g., 1024x1024).^[1]^[7]^[12] Stability AI describes the SD 3.5 models as "our most powerful models yet" and emphasizes image quality, prompt adherence, and typography for the Large variant.^[1]^[13] The model is positioned to be customizable for downstream professional use, with the QK-normalization changes intended to make fine-tuning more tractable than the prior SD3 release.^[1]^[13] The reference Diffusers pipeline recommends 28-40 denoising steps and a classifier-free guidance scale of approximately 4.5 for image generation, with bf16 precision as the standard inference dtype.^[18] The model was later made available for enterprise users through Amazon Bedrock (US West / Oregon region) on December 19, 2024, where Stability AI noted the model had been trained on Amazon SageMaker HyperPod.^[7]

Stable Diffusion 3.5 Large Turbo

Stable Diffusion 3.5 Large Turbo is a timestep-distilled variant of SD 3.5 Large, also released on October 22, 2024.^[1]^[6] It is produced via Stability AI's Adversarial Diffusion Distillation (ADD) technique, originally developed for the SDXL Turbo and Stable Video Diffusion lines, and is optimized for few-step inference: the example in Stability AI's reference model card generates images in just 4 sampling steps with classifier-free guidance effectively disabled (guidance scale 0).^[6]^[18] In Adversarial Diffusion Distillation, the student model is trained against a discriminator that pushes its few-step outputs to match those of a multi-step teacher, allowing the student to reproduce high-fidelity samples with a small number of denoising calls. The model trades some peak image quality and prompt fidelity for an order-of-magnitude reduction in inference cost relative to the full Large model, and was positioned as a competitor to FLUX.1 [schnell] in the few-step open-weights category.^[9]^[10] Because guidance is disabled in the distilled pipeline, the classifier-free guidance scale, a common knob for steering image quality and prompt strength in standard diffusion sampling, does not need tuning at inference time, which simplifies deployment for high-throughput hosted services.^[6]^[18]
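The cost difference can be made concrete with a rough count of transformer evaluations per image. The sketch below is illustrative arithmetic rather than a benchmark. It assumes that classifier-free guidance costs two forward passes per step (one conditional, one unconditional) and is skipped when the guidance scale is at most 1; this is a common convention, not something stated in the sources above. The function name `transformer_evals` is hypothetical.

```python
def transformer_evals(steps: int, guidance_scale: float) -> int:
    """Rough count of transformer forward passes for one image.

    Assumption: classifier-free guidance doubles the work per step
    (one conditional and one unconditional pass) and is only active
    when guidance_scale > 1.
    """
    passes_per_step = 2 if guidance_scale > 1.0 else 1
    return steps * passes_per_step


# Settings quoted in this article: Large at 28-40 steps with guidance ~4.5,
# Large Turbo at 4 steps with guidance 0.
large_low = transformer_evals(28, 4.5)
large_high = transformer_evals(40, 4.5)
turbo = transformer_evals(4, 0.0)

print(f"Large: {large_low}-{large_high} passes, Turbo: {turbo} passes")
print(f"Ratio: {large_low // turbo}x to {large_high // turbo}x")
```

Under these assumptions the count falls from 56-80 passes to 4, which is consistent with the order-of-magnitude figure. Actual wall-clock savings also depend on hardware, text encoding, and VAE decoding.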

Stable Diffusion 3.5 Medium

Stable Diffusion 3.5 Medium was released on October 29, 2024, one week after the Large variants.^[1]^[2]^[14] It has approximately 2.5 billion parameters (some early coverage cited 2.6 billion) and is engineered to run "out of the box" on consumer hardware, generating images at resolutions from roughly 0.25 to 2 megapixels.^[1]^[2]^[14] It uses a refined MMDiT-X architecture, a Stability AI-specific variant of MMDiT, featuring self-attention modules in the first 13 transformer layers and dual attention blocks in the first 12 transformer layers, both intended to improve multi-resolution generation, structural coherence, and anatomy.^[14] Stability AI's reference inference requires roughly 9.9 GB of VRAM excluding text encoders, making it usable on mid-range consumer GPUs.^[1] The Medium model is trained on a mixed-resolution pipeline progressing through image resolutions of 256, 512, 768, 1024, and 1440 pixels, with extended positional embedding spaces to better handle non-square and multi-resolution outputs.^[14]

How does the SD 3.5 architecture work?

MMDiT and MMDiT-X

All three SD 3.5 models share a diffusion model backbone based on the Multimodal Diffusion Transformer (MMDiT) architecture introduced in the SD3 paper.^[15] MMDiT differs from prior diffusion transformer (DiT) designs by maintaining separate weights for image and text token streams within each transformer block, while permitting bidirectional information flow between them via a joint attention operation. The SD3 paper showed that this dual-stream design outperformed both U-ViT and standard DiT in visual fidelity and text alignment over the course of training.^[15]
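To make the dual-stream idea concrete, the following is a minimal single-head sketch in plain Python. It is not Stability AI's implementation. It omits multi-head splitting, output projections, timestep modulation, residual connections, and MLPs, and the helper names and weight layout are hypothetical. It shows only the two properties described above: each modality has its own projection weights, and attention runs over the concatenated token sequence.

```python
import math
import random


def project(tokens, weight):
    """Multiply a (n_tokens x dim) list-of-lists by a (dim x dim) weight."""
    return [[sum(x * w for x, w in zip(tok, col)) for col in zip(*weight)]
            for tok in tokens]


def softmax(row):
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def make_stream_weights(rng, dim):
    """Private Q/K/V projections for one modality (hypothetical layout)."""
    return {name: [[rng.gauss(0.0, 0.5) for _ in range(dim)] for _ in range(dim)]
            for name in ("q", "k", "v")}


def joint_attention(txt_tokens, img_tokens, txt_weights, img_weights):
    """Single-head joint attention over text and image tokens."""
    dim = len(txt_tokens[0])
    # Each stream is projected with its own weights...
    q = project(txt_tokens, txt_weights["q"]) + project(img_tokens, img_weights["q"])
    k = project(txt_tokens, txt_weights["k"]) + project(img_tokens, img_weights["k"])
    v = project(txt_tokens, txt_weights["v"]) + project(img_tokens, img_weights["v"])
    # ...but attention runs over the concatenated sequence, so every token
    # can attend to tokens of both modalities.
    mixed = []
    for q_row in q:
        scores = [sum(a * b for a, b in zip(q_row, k_row)) / math.sqrt(dim)
                  for k_row in k]
        attn = softmax(scores)
        mixed.append([sum(a * v_row[j] for a, v_row in zip(attn, v))
                      for j in range(dim)])
    n_txt = len(txt_tokens)
    return mixed[:n_txt], mixed[n_txt:]  # split back into the two streams


rng = random.Random(0)
dim = 4
txt = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(3)]
img = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(5)]
txt_w, img_w = make_stream_weights(rng, dim), make_stream_weights(rng, dim)
txt_out, img_out = joint_attention(txt, img, txt_w, img_w)
print(len(txt_out), len(img_out))  # 3 5
```

Because the attention weights span both modalities, changing any text token changes the image-stream output and vice versa, which is the bidirectional information flow the paragraph above refers to.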

SD 3.5 retains this core MMDiT design but introduces two refinements that the prior SD3 Medium release did not have:^[13]^[14]^[18]

Query-Key Normalization (QK-norm). A normalization layer is applied to query and key projections within attention blocks. QK-norm has become standard practice for training large transformer models because it stabilizes training and reduces the risk of attention-logit blow-up at scale. Stability AI also says it makes the resulting weights easier to fine-tune.^[13]^[18]
Dual attention layers. In SD 3.5 Large, each MMDiT block uses double attention layers rather than the single attention used in SD3 Medium; as the Hugging Face Diffusers team put it, "instead of using single attention layers for each stream of modality in the MMDiT blocks, SD3.5 uses double attention layers."^[18] In SD 3.5 Medium, the MMDiT-X variant adds additional self-attention modules in the first 13 transformer layers and dual attention blocks in the first 12 layers, intended to improve multi-resolution behavior.^[14]

The text encoder stack, the VAE (16 latent channels), and the noise scheduler are unchanged from SD3 Medium.^[18]
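As a rough illustration of why QK-norm keeps attention logits bounded, the sketch below normalizes a query and a key vector before taking their scaled dot product. It assumes an RMS-style normalization without a learned gain. The sources above say only that a normalization layer is applied, so the exact variant, the `eps` value, and the function names are assumptions.

```python
import math


def rms_norm(vec, eps=1e-6):
    """Scale a vector to (approximately) unit root-mean-square."""
    rms = math.sqrt(sum(v * v for v in vec) / len(vec) + eps)
    return [v / rms for v in vec]


def attention_logit(query, key, qk_norm=True):
    """Scaled dot-product logit for one query/key pair."""
    if qk_norm:
        query, key = rms_norm(query), rms_norm(key)
    dot = sum(q * k for q, k in zip(query, key))
    return dot / math.sqrt(len(query))


q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.5, -0.75, 1.0]
big_q = [1000.0 * v for v in q]  # simulate activations growing during training

print(attention_logit(big_q, k, qk_norm=False))  # grows with the scale of q
print(attention_logit(big_q, k, qk_norm=True))   # stays within +/- sqrt(dim)
```

With normalization, the logit depends only on the direction of the query and key, so a growing activation scale cannot push the softmax into saturation. In practice a learned per-channel gain is usually applied after the normalization, which this sketch leaves out.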

Text encoders

SD 3.5 uses three fixed, pretrained text encoders, whose outputs are combined into a single conditioning sequence:^[13]^[14]^[19]

OpenCLIP-ViT/G (context length 77 tokens)
CLIP-ViT/L (context length 77 tokens)
T5-XXL (context length 77 or 256 tokens depending on training stage)

The model can be run with any one or two encoders disabled to reduce memory, at some cost to prompt fidelity.^[19] The reference inference repository specifies OpenAI CLIP-L/14, OpenCLIP bigG, and Google T5-XXL.^[19]
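The shape bookkeeping for the combined text conditioning can be sketched as follows. This follows the scheme described in the SD3 paper, in which the two CLIP outputs are joined channel-wise, zero-padded to the T5 width, and then joined with the T5 tokens along the sequence axis. The hidden sizes (768, 1280, 4096) are the published widths of these encoders rather than figures from this article, and the helper name is hypothetical.

```python
CLIP_L_WIDTH = 768    # CLIP-ViT/L hidden size
CLIP_G_WIDTH = 1280   # OpenCLIP-ViT/G hidden size
T5_XXL_WIDTH = 4096   # T5-XXL hidden size
CLIP_TOKENS = 77      # context length of both CLIP encoders


def conditioning_shape(t5_tokens: int) -> tuple[int, int]:
    """Return (sequence_length, channels) of the joint text conditioning."""
    clip_channels = CLIP_L_WIDTH + CLIP_G_WIDTH      # channel-wise concat: 2048
    assert clip_channels <= T5_XXL_WIDTH             # zero-padded up to 4096
    sequence_length = CLIP_TOKENS + t5_tokens        # sequence-wise concat
    return sequence_length, T5_XXL_WIDTH


print(conditioning_shape(77))   # (154, 4096)
print(conditioning_shape(256))  # (333, 4096)
```

Disabling an encoder would typically mean substituting zeros for its slice of this tensor, which keeps the shapes the transformer expects while removing that encoder's memory cost.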

Training and data

Stability AI describes the SD 3.5 training corpus as a combination of "synthetic data and filtered publicly available data."^[13]^[14] As with previous Stable Diffusion releases, much of the publicly available data is scraped from the web, and Stability AI relies on a fair-use interpretation against ongoing copyright challenges.^[11] By March 2023, artists had already removed approximately 80 million images from public training datasets used by Stability AI through opt-out tools, a process that continued into the SD 3.5 training corpus preparation.^[11] The company says it used multi-prompt captioning during training, with shorter captions prioritized, to improve the diversity of concepts and demographic representation across generated outputs.^[11]

The SD 3.5 training objective inherits the rectified-flow framework introduced in the SD3 paper, in which the model is trained to predict the velocity field of a straight-line interpolation between data and noise rather than the added noise, as in classical denoising diffusion probabilistic models. The SD3 paper additionally introduced logit-normal sampling of training timesteps, which biases training toward the intermediate, perceptually relevant noise scales, a choice that the SD 3.5 family retains.^[15] The SD 3.5 Medium model further refines this recipe with the mixed-resolution training schedule described above, using random crop augmentation on positional embeddings to teach the model to handle non-square aspect ratios and a wider range of output resolutions.^[14]
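The pieces of this objective can be sketched in a few lines of plain Python. This is a toy illustration of the rectified-flow setup as described in the SD3 paper, not training code. The logit-normal mean and standard deviation of 0 and 1, the three-element "latent", and all function names are assumptions chosen for readability.

```python
import math
import random


def sample_timestep(rng, mean=0.0, std=1.0):
    """Logit-normal timestep: squash a Gaussian sample into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-rng.gauss(mean, std)))


def noisy_sample(data, noise, t):
    """Point on the straight line from data (t=0) to noise (t=1)."""
    return [(1.0 - t) * x + t * n for x, n in zip(data, noise)]


def velocity_target(data, noise):
    """Constant velocity along that line: d/dt of noisy_sample."""
    return [n - x for x, n in zip(data, noise)]


def flow_loss(predicted, target):
    """Mean squared error between predicted and target velocity."""
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(target)


rng = random.Random(0)
data = [0.2, -0.4, 0.9]                       # stand-in for image latents
noise = [rng.gauss(0.0, 1.0) for _ in data]
t = sample_timestep(rng)
z_t = noisy_sample(data, noise, t)            # model input
target = velocity_target(data, noise)         # what the model should predict
print(flow_loss([0.0] * len(data), target))   # loss of a trivial zero prediction
```

Because the Gaussian sample is passed through a sigmoid, timesteps cluster around the middle of the interval, so the model sees intermediate noise levels more often than the nearly clean or nearly pure-noise extremes.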

How does SD 3.5 compare to FLUX and other models?

In Stability AI's own benchmark charts published alongside the announcement, SD 3.5 Large led peer open-weights and proprietary models, including FLUX.1 [dev], Midjourney v6.1, and Ideogram 2.0, on prompt adherence. On image quality it remained competitive, placing close to FLUX.1 [pro] and ahead of other open models.^[1]^[20] Stability AI's framing emphasized that SD 3.5 Large was both broadly capable and small enough to run on a single consumer-grade GPU after quantization.^[1]

Independent reviews broadly corroborated this picture, with some caveats. Side-by-side comparisons in technical and consumer outlets observed the following:

Dimension	SD 3.5 Large vs. competitors
Prompt adherence	On par with or ahead of FLUX.1 [dev] on complex multi-object prompts; clearly improved over SD3 Medium and SDXL^[9]^[10]
Photorealism	FLUX.1 [pro] retained a perceptible lead on skin texture, lighting, and material rendering^[9]^[10]
Typography (text in images)	Ideogram remained ahead on legibility, though SD 3.5 was a substantial improvement over SDXL^[10]
Semantic accuracy on nuanced prompts	DALL-E 3, coupled with GPT-4 prompt rewriting in ChatGPT, was often judged better than open models including SD 3.5^[10]
Few-step generation	SD 3.5 Large Turbo at 4 steps was broadly competitive with FLUX.1 [schnell]^[9]

Stability AI also noted in its announcement that SD 3.5 deliberately exhibits greater variation across seeds for the same prompt than some competitors, stating that "greater variation in outputs from the same prompt with different seeds may occur, which is intentional," a design choice meant to preserve stylistic diversity and broader knowledge at the cost of less deterministic outputs.^[1] Practitioners testing the model on photography, 3D-rendered scenes, painterly styles, and line-art benchmarks reported strong cross-style generalization but recommended pairing the Medium model with Skip Layer Guidance during sampling for better structural and anatomical coherency on portraits and figure-heavy compositions.^[14]

Is Stable Diffusion 3.5 free and open source?

All three SD 3.5 models are released under the Stability AI Community License, in the same form that had been retroactively applied to SD3 Medium in July 2024.^[1]^[4] Stability AI summarizes the license as "free for research, non-commercial, and commercial use for organizations or individuals with less than $1M in total annual revenue."^[13] The license has three principal tiers:

Tier	Who it covers	Cost
Non-commercial	Research, hobbyist, educational	Free^[1]^[13]
Commercial (small)	Individuals or organizations earning less than US$1 million in annual revenue	Free^[1]^[13]
Commercial (large)	Organizations earning US$1 million or more in annual revenue	Requires an Enterprise License from Stability AI^[13]^[14]

The license also affirms that users retain ownership of the media they generate and may distribute and commercialize that media independently of any restrictions on the model weights themselves.^[1]^[11] Unlike SD3 Medium's original June 2024 terms, which had restricted derivatives, set monthly active user caps, and triggered the CivitAI ban, the SD 3.5 license is functionally equivalent to the revised July 2024 SD3 terms and uses the same revenue-threshold model that emerged from the SD3 controversy.^[3]^[4]^[17] Independent analysts continued to note that Stability AI retains the right to terminate the agreement, which some users view as a limitation relative to traditional permissive licenses.^[3]

The SD 3.5 reference inference repository on GitHub is published under the MIT License, with portions of helper code subject to the Hugging Face Transformers Apache 2.0 License.^[19]

Reception

The reception of SD 3.5 was broadly positive in both press coverage and community forums, particularly in contrast to SD3 Medium. Tom's Guide described it as "a step up in realism," Decrypt headlined its coverage as Stability AI "redeems itself," and How-To Geek noted the release came "with the right number of limbs," a pointed reference to SD3 Medium's anatomical failures.^[9]^[21]^[22] Hacker News and the r/StableDiffusion community on Reddit discussed the release at length on October 22-29, 2024, with many practitioners flagging SD 3.5 Large as competitive with FLUX.1 [dev] in their own testing while welcoming the return to broadly usable open weights.^[23]^[24]

CivitAI, which had banned SD3 content under the prior license, accepted SD 3.5 uploads under the revised Community License, allowing community LoRAs, fine-tunes, and other derivative artifacts to be redistributed alongside the official weights.^[17] By the time of the Medium release, Stability AI emphasized that the model had been trained "to generate more diverse images of people," with the company stating the goal was to create images "representative of the world, not just one type of person," achievable without specialized prompting, which it highlighted as an explicit response to user feedback about the homogeneity of earlier model outputs.^[11]^[1]^[9]

Criticism focused on three points. First, SD 3.5 Large at full precision required substantial VRAM (over 24 GB) for native inference, although Hugging Face Diffusers' integration with bitsandbytes 4-bit (NF4) quantization brought inference within reach of single 24 GB consumer GPUs.^[18] Second, although improved over SD3 Medium, some anatomical artifacts persisted, which Stability AI characterized as engineering trade-offs.^[11] Third, the community fine-tune and LoRA ecosystem for SD 3.5 was slower to mature than for SDXL, which remained the most widely used Stability AI base model in many production pipelines through 2025-2026.^[25]
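The effect of quantization on the first point can be estimated with simple arithmetic. The sketch below counts only the transformer weights, using the 8.1-billion parameter figure quoted in this article. It ignores the text encoders, the VAE, activations, and the small metadata overhead of quantized formats, so real usage is higher. The function name is hypothetical.

```python
def weight_memory_gb(parameters: float, bits_per_weight: int) -> float:
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return parameters * bits_per_weight / 8 / 1e9


LARGE_PARAMS = 8.1e9  # SD 3.5 Large transformer, as quoted in this article

print(f"bf16: {weight_memory_gb(LARGE_PARAMS, 16):.1f} GB")
print(f"NF4:  {weight_memory_gb(LARGE_PARAMS, 4):.1f} GB")
```

At 16 bits per weight the transformer alone needs roughly 16 GB, and at 4 bits roughly 4 GB. That gap is why 4-bit quantization leaves room for the text encoders and activations on a 24 GB card.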

Tooling and integration

Within days of the October 22, 2024 announcement, SD 3.5 Large was integrated into Hugging Face Diffusers via the StableDiffusion3Pipeline class, the same pipeline class previously introduced for SD3 Medium, since the text encoders, VAE, and scheduler were unchanged.^[18] Hugging Face's accompanying blog post documented inference at bf16 precision, recommended 28-40 sampling steps and a guidance scale of approximately 4.5 for the Large model, and used 4 steps with guidance disabled (guidance scale 0) for Large Turbo. It also documented training and fine-tuning recipes (e.g., DreamBooth LoRA) compatible with the existing SD3 training scripts.^[18]
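A minimal inference sketch using these settings might look like the following. It assumes a recent `diffusers` and `torch` installation, a CUDA GPU with enough memory, and that the gated Hugging Face repositories have already been unlocked for the logged-in account. The `SD35_SETTINGS` table and the `generate` helper are hypothetical names introduced for illustration; the step counts and guidance scales are the ones quoted in this article.

```python
# Settings quoted in this article; the repository IDs are the Hugging Face
# model cards listed in the references.
SD35_SETTINGS = {
    "large": {
        "repo": "stabilityai/stable-diffusion-3.5-large",
        "num_inference_steps": 28,   # the article cites a 28-40 range
        "guidance_scale": 4.5,
    },
    "large-turbo": {
        "repo": "stabilityai/stable-diffusion-3.5-large-turbo",
        "num_inference_steps": 4,
        "guidance_scale": 0.0,       # guidance disabled for the distilled model
    },
}


def generate(variant: str, prompt: str, device: str = "cuda"):
    """Generate one image with the given SD 3.5 variant (hypothetical helper)."""
    # Imported lazily so the settings above can be inspected without a GPU stack.
    import torch
    from diffusers import StableDiffusion3Pipeline

    cfg = SD35_SETTINGS[variant]
    pipe = StableDiffusion3Pipeline.from_pretrained(
        cfg["repo"], torch_dtype=torch.bfloat16
    ).to(device)
    result = pipe(
        prompt,
        num_inference_steps=cfg["num_inference_steps"],
        guidance_scale=cfg["guidance_scale"],
    )
    return result.images[0]  # a PIL image
```

A call such as `generate("large-turbo", "a lighthouse at dusk").save("lighthouse.png")` would then write a single image to disk. Reloading the pipeline on every call is wasteful, so a real service would construct it once and reuse it.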

A reference inference implementation is published by Stability AI at the GitHub repository Stability-AI/sd3.5 under the MIT License, supporting SD3.5 Large, Large Turbo, Medium, and SD3 Medium, plus the ControlNets released in late November 2024.^[19] ComfyUI gained native SD 3.5 support shortly after launch, including dedicated nodes for the Large Turbo's 4-step pipeline.^[1]^[19]

In addition to self-hosted use, the SD 3.5 family is available through Stability AI's hosted API and was distributed via partner platforms including Replicate, Fireworks AI, and DeepInfra at launch.^[1] In December 2024, Stability AI announced SD 3.5 Large availability in Amazon Bedrock for enterprise customers, with US West (Oregon) as the launch region.^[7] In a separate collaboration, Stability AI and NVIDIA shipped a Stable Diffusion 3.5 NIM microservice with TensorRT-optimized weights and bundled ControlNet variants for streamlined enterprise deployment.^[8]

On November 26, 2024, Stability AI released three ControlNets for SD 3.5 Large (Blur, Canny, and Depth), extending the family with conditioning models for upscaling to 8K/16K (Blur), structural control via edge maps (Canny), and depth-guided generation (Depth, using DepthFM-derived depth maps).^[26] Additional ControlNets, including ones targeting SD 3.5 Medium, were announced as in development.^[26]

Subsequent releases

As of this article's writing in 2026, Stability AI had not released a "Stable Diffusion 4" model.^[16] SD 3.5 Large remained the company's flagship open-weights image-generation model, and the SD 3.5 family (Large, Large Turbo, and Medium) collectively defined the company's open offering throughout 2025. Stability AI's later 2024-2026 work expanded primarily into adjacent modalities, including upgrades to Stable Video Diffusion (notably Stable Video 4D and SV4D 2.0 for 4D / multi-view video generation), and into enterprise distribution channels (Amazon Bedrock, NVIDIA NIM, and direct API) rather than into a numbered SD4 successor.^[16]^[27]^[7]^[8]

The ControlNet ecosystem for SD 3.5 continued to expand into 2025, with Stability AI signaling that additional control models, both for SD 3.5 Medium and for further kinds of structural and stylistic conditioning, were in development.^[26] Community-contributed LoRAs, IP-Adapter-style conditioning, and fine-tuned variants gradually accumulated on Hugging Face and CivitAI under the Community License, although coverage lagged the much larger SDXL ecosystem.^[25] During the same period, the broader image-generation field shifted toward newer architectures and competitors such as the FLUX.1 family from Black Forest Labs, Ideogram 3, and Google Imagen 4, which together defined much of the 2025-2026 frontier alongside SD 3.5.^[10]

References

1. Stability AI, "Introducing Stable Diffusion 3.5," October 22, 2024. https://stability.ai/news/introducing-stable-diffusion-3-5
2. "Three models of the image generation AI 'Stable Diffusion 3.5' series are openly released," Gigazine, October 23, 2024. https://gigazine.net/gsc_news/en/20241023-stable-diffusion-3-5-released
3. J. Roberts, "Stable Diffusion 3 License Revamped Amid Blowback, Promising Better Model," Decrypt, July 2024. https://decrypt.co/238871/stable-diffusion-3-license-revamped-amid-blowback
4. "Stability AI Unveils a More Permissive Copyright License for Stable Diffusion 3," AiBase News, July 2024. https://news.aibase.com/news/10131
5. "Stability AI apologizes for disappointing Stable Diffusion 3, promises 'much improved' model soon," The Decoder, 2024. https://the-decoder.com/stability-ai-apologizes-for-disappointing-stable-diffusion-3-promises-much-improved-model-soon/
6. Hugging Face, "stabilityai/stable-diffusion-3.5-large-turbo" model card. https://huggingface.co/stabilityai/stable-diffusion-3.5-large-turbo
7. AWS, "Stable Diffusion 3.5 Large is now available in Amazon Bedrock," December 19, 2024. https://aws.amazon.com/blogs/aws/stable-diffusion-3-5-large-is-now-available-in-amazon-bedrock/
8. Stability AI, "Stability AI and NVIDIA Bring Faster Performance and Simplified Enterprise Deployment with the Stable Diffusion 3.5 NIM." https://stability.ai/news/stability-ai-and-nvidia-bring-faster-performance-and-simplified-enterprise-deployment-with-the-stable-diffusion-35-nim
9. "Stable Diffusion 3.5: Stability AI Redeems Itself With New Models and Expanded Features," Decrypt. https://decrypt.co/287807/stable-diffusion-3-5-stability-ai-redeems-itself-with-new-models-and-expanded-features
10. "Comparing Prompt Accuracy Across Various Image Generation AIs (Stable Diffusion 3.5, FLUX1.1, Imagen 3, DALL-E 3, Adobe Firefly)," DEV Community. https://dev.to/nabata/comparing-prompt-accuracy-across-various-image-generation-ais-stable-diffusion-35-flux11-imagen-3-dalle-3-adobe-firefly-1jlp
11. K. Wiggers, "Stability claims its newest Stable Diffusion models generate more 'diverse' images," TechCrunch, October 22, 2024. https://techcrunch.com/2024/10/22/stability-claims-its-newest-stable-diffusion-models-generate-more-diverse-images/
12. M. Wheatley, "Stability AI releases next-gen open-source Stable Diffusion 3.5 text-to-image AI model family," SiliconANGLE, October 22, 2024. https://siliconangle.com/2024/10/22/stable-ai-releases-next-gen-open-source-stable-diffusion-3-5-text-image-ai-model-family/
13. Hugging Face, "stabilityai/stable-diffusion-3.5-large" model card. https://huggingface.co/stabilityai/stable-diffusion-3.5-large
14. Hugging Face, "stabilityai/stable-diffusion-3.5-medium" model card. https://huggingface.co/stabilityai/stable-diffusion-3.5-medium
15. P. Esser, S. Kulal, A. Blattmann et al., "Scaling Rectified Flow Transformers for High-Resolution Image Synthesis," arXiv:2403.03206, March 5, 2024. https://arxiv.org/abs/2403.03206
16. "Stable Diffusion," Wikipedia. https://en.wikipedia.org/wiki/Stable_Diffusion
17. E. Yip, "Stable Diffusion's End: Ambiguous Licensing Prompts Temporary Ban on SD3 Models," Medium / StableDiffusion. https://medium.com/stablediffusion/stable-diffusions-end-ambiguous-licensing-prompts-temporary-ban-on-sd3-models-a261d30a9785
18. Hugging Face Blog, "Diffusers welcomes Stable Diffusion 3.5 Large," October 22, 2024. https://huggingface.co/blog/sd3-5
19. Stability AI, `Stability-AI/sd3.5` GitHub repository. https://github.com/Stability-AI/sd3.5
20. "Stability AI Releases Stable Diffusion 3.5: Stable Diffusion 3.5 Large and Stable Diffusion 3.5 Large Turbo," MarkTechPost, October 22, 2024. https://www.marktechpost.com/2024/10/22/stability-ai-releases-stable-diffusion-3-5-stable-diffusion-3-5-large-and-stable-diffusion-3-5-large-turbo/
21. R. Morrison, "StabilityAI releases Stable Diffusion 3.5, a step up in realism," Tom's Guide. https://www.tomsguide.com/ai/stabilityai-releases-stable-diffusion-3-5-a-step-up-in-realism
22. "Stable Diffusion 3.5 Is Out with Better Performance & the Right Number of Limbs," How-To Geek. https://www.howtogeek.com/stable-diffusion-3-5-release/
23. "StabilityAI releases Stable Diffusion 3.5," Hacker News discussion. https://news.ycombinator.com/item?id=41918087
24. "Stable Diffusion 3.5 medium has been released and there's something special," r/StableDiffusion via daslikes.wordpress.com. https://daslikes.wordpress.com/2024/10/29/stable-diffusion-3-5-medium-has-been-released-and-theres-something-special-via-r-stablediffusion/
25. "Stable Diffusion Reddit: What 500K Members Recommend in 2026," AI Tool Discovery. https://www.aitooldiscovery.com/guides/stable-diffusion-reddit
26. "Stability AI Releases Stable Diffusion 3.5 Large ControlNet Models," ComfyUI Wiki, November 26, 2024. https://comfyui-wiki.com/en/news/2024-11-26-sd3-5-large-controlnets
27. Stability AI News & Updates. https://stability.ai/news
