Versa AI hub

Diffusers welcomes Stable Diffusion 3.5 Large

December 30, 2024 (updated February 13, 2025)

Stable Diffusion 3.5 is an improved version of its predecessor, Stable Diffusion 3. The models are available on the Hugging Face Hub and can be used with 🧨 Diffusers.

This release includes two checkpoints:

  • A large (8B) model
  • A large (8B) timestep-distilled model that enables few-step inference

This post focuses on how to use Stable Diffusion 3.5 (SD3.5) with Diffusers, covering both inference and training.

Architecture changes

The transformer architecture of SD3.5 (Large) is very similar to that of SD3 (Medium), with the following differences:

  • QK normalization: QK normalization has become standard practice for training large transformer models, and SD3.5 Large is no exception.
  • Dual attention layers: Instead of using a single attention layer for each stream of modality in the MMDiT blocks, SD3.5 uses double attention layers.

The remaining details of the text encoders, VAE, and noise scheduler are exactly the same as in SD3 Medium. For more information about SD3, we recommend checking the original paper.
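To make the QK normalization point above more concrete, here is a minimal, dependency-free sketch of the idea: the query and key vectors are normalized before their dot product, which keeps the attention logits in a bounded range. It assumes an RMS-style normalization without a learned gain and a single attention head; the function names are hypothetical, and the norm layers inside the real SD3.5 transformer may differ.

```python
import math


def rms_normalize(vec, eps=1e-6):
    """Scale a vector so its root-mean-square is about 1 (no learned gain)."""
    rms = math.sqrt(sum(x * x for x in vec) / len(vec) + eps)
    return [x / rms for x in vec]


def attention_weights(query, keys, qk_norm=True):
    """Softmax attention weights of one query over a list of keys."""
    if qk_norm:
        # QK normalization: normalize queries and keys before the dot product.
        query = rms_normalize(query)
        keys = [rms_normalize(key) for key in keys]
    scale = 1.0 / math.sqrt(len(query))
    logits = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative values only: large activations saturate the softmax
# unless queries and keys are normalized first.
query = [40.0, -30.0, 20.0, 10.0]
keys = [[35.0, -25.0, 15.0, 5.0], [-10.0, 20.0, 30.0, -40.0]]

plain = attention_weights(query, keys, qk_norm=False)
normed = attention_weights(query, keys, qk_norm=True)
print("without QK norm:", plain)
print("with QK norm:   ", normed)
```

Without normalization the first key takes essentially all of the attention weight. With normalization the logits are bounded by the square root of the head dimension, so the second key keeps a small but non-zero weight.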

Using SD3.5 with Diffusers

Be sure to install the latest version of diffusers:

pip install -U diffusers

The model is gated, so before you can use it with diffusers, you must first go to the Stable Diffusion 3.5 Large Hugging Face page, fill out the form, and accept the gate. Once you have access, you need to log in so that your system knows you have accepted the gate. Log in using the command below:

huggingface-cli login

The following snippet downloads the 8B parameter version of SD3.5 with torch.bfloat16 precision. This is the format used in the original checkpoint published by Stability AI and is the recommended method for performing inference.

import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-large", torch_dtype=torch.bfloat16
).to("cuda")

image = pipe(
    prompt="Photo of a cat holding a sign that says Hello World",
    negative_prompt="",
    num_inference_steps=40,
    height=1024,
    width=1024,
    guidance_scale=4.5,
).images[0]

image.save("sd3_hello_world.png")

This release also comes with a "timestep-distilled" model that eliminates classifier-free guidance and can generate images in fewer steps (typically 4-8 steps).

import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-large-turbo", torch_dtype=torch.bfloat16
).to("cuda")

image = pipe(
    prompt="Photo of a cat holding a sign that says Hello World",
    num_inference_steps=4,
    height=1024,
    width=1024,
    guidance_scale=1.0,
).images[0]

image.save("sd3_hello_world.png")

[Image: hello_world_cat_2]
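The turbo snippet above sets guidance_scale to 1.0, while the base model uses 4.5. The sketch below shows the usual classifier-free guidance combination step on toy numbers, to illustrate why a scale of 1.0 amounts to using the conditional prediction alone. The helper name and the values are hypothetical; real pipelines apply the same formula to noise-prediction tensors.

```python
def apply_cfg(uncond_pred, cond_pred, guidance_scale):
    """Classifier-free guidance: move from the unconditional prediction
    towards (and past) the conditional one by `guidance_scale`."""
    return [u + guidance_scale * (c - u) for u, c in zip(uncond_pred, cond_pred)]


# Toy stand-ins for the model's unconditional and text-conditioned predictions.
uncond = [0.10, -0.20, 0.30]
cond = [0.30, 0.00, 0.10]

guided = apply_cfg(uncond, cond, guidance_scale=4.5)    # base-model setting
unguided = apply_cfg(uncond, cond, guidance_scale=1.0)  # turbo setting

print("guidance_scale=4.5:", guided)
print("guidance_scale=1.0:", unguided)
```

At a scale of 1.0 the result equals the conditional prediction (up to floating-point rounding), so the unconditional forward pass adds nothing. This is one reason the distilled model is cheaper per step as well as needing fewer steps.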

All examples shown in the SD3 blog post and the official Diffusers documentation should already work with SD3.5. In particular, both of these resources go into detail on optimizing the memory requirements for inference. SD3.5 Large is significantly larger than SD3 Medium, so memory optimization is crucial for running inference on consumer hardware.
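As a rough illustration of why memory matters here, the following back-of-the-envelope sketch estimates how much memory the transformer weights alone occupy at different precisions. It assumes exactly 8 billion parameters (the post only says "8B") and ignores the text encoders, the VAE, activations, and the small overhead that quantization formats add, so real usage will be higher.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Memory needed to hold the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


# "8B" from the post; the exact parameter count is an assumption.
TRANSFORMER_PARAMS = 8e9

for label, bits in [("float32", 32), ("bfloat16", 16), ("4-bit", 4)]:
    gib = weight_memory_gib(TRANSFORMER_PARAMS, bits)
    print(f"{label:>8}: {gib:5.1f} GiB")
```

Under these assumptions, the bfloat16 weights take roughly 15 GiB, while a 4-bit representation takes under 4 GiB.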

Performing inference using quantization

Diffusers natively supports bitsandbytes quantization, which reduces memory usage even further.

First, make sure to install all required libraries.

pip install -Uq git+https://github.com/huggingface/transformers@main
pip install -Uq bitsandbytes

Next, load the transformer in "NF4" precision:

from diffusers import BitsAndBytesConfig, SD3Transformer2DModel
import torch

model_id = "stabilityai/stable-diffusion-3.5-large"

# 4-bit NF4 weights, with computation carried out in bfloat16
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model_nf4 = SD3Transformer2DModel.from_pretrained(
    model_id,
    subfolder="transformer",
    quantization_config=nf4_config,
    torch_dtype=torch.bfloat16,
)

Now you are ready to perform inference.

from diffusers import StableDiffusion3Pipeline

pipeline = StableDiffusion3Pipeline.from_pretrained(
    model_id,
    transformer=model_nf4,
    torch_dtype=torch.bfloat16,
)
# Keep components on the CPU and move each to the GPU only while it is needed
pipeline.enable_model_cpu_offload()

prompt = (
    "A whimsical and creative image depicting a hybrid waffle-hippo creature "
    "basking in a river of melted butter in a breakfast-themed landscape. "
    "It features the distinctive, bulky body shape of a hippo. However, instead "
    "of the usual gray skin, the creature's body resembles a golden-brown, "
    "crispy, freshly baked waffle, and its skin is textured with the familiar "
    "grid pattern of a waffle. The environment combines the natural habitat of "
    "a hippo with a breakfast table setting: a river of warm melted butter, "
    "with oversized dishes and plates peeking out from the lush, pancake-like "
    "foliage in the background. As the sun rises in this fantastical world, "
    "the creature, content in its river of butter, lets out a yawn, and a "
    "flock of birds takes flight nearby."
)

image = pipeline(
    prompt=prompt,
    negative_prompt="",
    num_inference_steps=28,
    guidance_scale=4.5,
    max_sequence_length=512,
).images[0]

image.save("whimsical.png")

[Image: happy hippo]

You can control other knobs in BitsAndBytesConfig. See the documentation for more information.

It is also possible to directly load quantized models with the same nf4_config as above. This is especially useful for machines with low RAM. See this Colab notebook for an end-to-end example.

Training LoRA with SD3.5 Large with Quantization

Thanks to libraries like bitsandbytes and peft, it is possible to fine-tune large models like SD3.5 Large on consumer GPU cards with 24 GB of VRAM. The existing SD3 training script can already be used for LoRA training. The training command below already works:

accelerate launch train_dreambooth_lora_sd3.py \
  --pretrained_model_name_or_path="stabilityai/stable-diffusion-3.5-large" \
  --dataset_name="Norod78/Yarn-art-style" \
  --output_dir="yart_art_sd3-5_lora" \
  --mixed_precision="bf16" \
  --instance_prompt="Frog, yarn art style" \
  --caption_column="text" \
  --resolution=768 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=4e-4 \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=700 \
  --rank=16 \
  --seed="0" \
  --push_to_hub

However, using quantization requires adjusting a few knobs. Here are some pointers:

  • Initialize the transformer with the quantization config, or load the quantized checkpoint directly.
  • Then prepare it using peft's prepare_model_for_kbit_training().
  • The rest of the process remains the same, thanks to peft's strong support for bitsandbytes.
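The pointers above can be sketched roughly as follows. This is a minimal outline rather than the training script itself: the function name is hypothetical, the LoRA target modules are an assumed subset of the attention projections, and gradient checkpointing is switched off in the peft helper on the assumption that it would otherwise call methods a Diffusers model may not expose. Running it requires diffusers, peft, bitsandbytes, a CUDA GPU, and access to the gated checkpoint.

```python
def build_quantized_lora_transformer(
    model_id="stabilityai/stable-diffusion-3.5-large", rank=16
):
    """Load the SD3.5 transformer in NF4 and attach trainable LoRA adapters."""
    # Imports live inside the function so the sketch can be imported
    # without pulling in the heavy dependencies.
    import torch
    from diffusers import BitsAndBytesConfig, SD3Transformer2DModel
    from peft import LoraConfig, prepare_model_for_kbit_training

    # Step 1: initialize the transformer with the quantization config.
    nf4_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    transformer = SD3Transformer2DModel.from_pretrained(
        model_id,
        subfolder="transformer",
        quantization_config=nf4_config,
        torch_dtype=torch.bfloat16,
    )

    # Step 2: freeze the base weights and prepare the model for k-bit training.
    transformer = prepare_model_for_kbit_training(
        transformer, use_gradient_checkpointing=False
    )

    # Step 3: add LoRA adapters; only these small matrices are trained.
    lora_config = LoraConfig(
        r=rank,
        lora_alpha=rank,
        init_lora_weights="gaussian",
        # Assumed target modules (attention projections); adjust as needed.
        target_modules=["to_q", "to_k", "to_v", "to_out.0"],
    )
    transformer.add_adapter(lora_config)
    return transformer
```

From there, the usual LoRA training loop applies: only the adapter parameters require gradients, while the 4-bit base weights stay frozen.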

For a more complete example, see this sample script.

Using Single File Loading with Stable Diffusion 3.5 Transformer

You can load the Stable Diffusion 3.5 Transformer model from the original checkpoint files published by Stability AI with the from_single_file method:

import torch
from diffusers import SD3Transformer2DModel, StableDiffusion3Pipeline

transformer = SD3Transformer2DModel.from_single_file(
    "https://huggingface.co/stabilityai/stable-diffusion-3.5-large-turbo/blob/main/sd3.5_large.safetensors",
    torch_dtype=torch.bfloat16,
)
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-large",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()

image = pipe("Cat holding a sign that says hello world").images[0]
image.save("sd35.png")

Acknowledgment: The background photo used in the thumbnail in this blog post was provided by Daniel Frank. Thanks to Pedro Cuenca and Tom Aarsen for post-draft reviews.
