Illustrious, a text-to-image model based on Stable Diffusion XL, has become so dominant in the AI art community that Civitai, the largest hub for AI art models, had to create a separate category just to handle its large ecosystem of resources.
And it all happened within 3 months. What is the secret of its success? A return to basics with a twist.
While newer models like SD 3.5 and Flux rely on long natural language descriptions, Illustrious developer Onoma AI took a different approach. Instead of reinventing the wheel with complex captioning systems, it leveraged Danbooru tags to make sure the model understood concepts.
Training the model on Danbooru’s vast library of tagged anime images gives it an edge in understanding visual concepts.
Each tag in the Danbooru system represents a specific element of the image, such as character features, clothing, pose, or background, giving you precise control over the generated image without wasting valuable tokens on long descriptions.
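As a rough illustration of what a tag-based prompt looks like in practice, the sketch below joins tags grouped by category into one comma-separated string. The function name `build_tag_prompt`, the category names, and the example tags are assumptions made for this example, not part of any Illustrious tooling.

```python
def build_tag_prompt(tags_by_category):
    """Join Danbooru-style tags into one comma-separated prompt string."""
    prompt_tags = []
    for category_tags in tags_by_category.values():
        for tag in category_tags:
            # Trim whitespace and skip empty or duplicate tags, keeping order.
            cleaned = tag.strip()
            if cleaned and cleaned not in prompt_tags:
                prompt_tags.append(cleaned)
    return ", ".join(prompt_tags)


# Hypothetical tags, grouped the way the article describes them.
example_tags = {
    "character": ["1girl", "long hair", "blue eyes"],
    "clothing": ["school uniform"],
    "pose": ["sitting"],
    "background": ["classroom", "window"],
}

prompt = build_tag_prompt(example_tags)
print(prompt)
```

Each tag costs only a few tokens, which is the efficiency the tag approach is meant to provide compared with a full descriptive sentence.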
These tags have been around for years and have become a kind of standard for classifying images among art and anime enthusiasts.
As a result, the model is very accurate and efficient at understanding image features.
“It’s like having an artist who understands exactly what you want without having to explain it in paragraphs,” Vishnu, a member of a Discord server focused on NSFW AI content, told Decrypt. “You just need to know the correct tag.”
At its core, Illustrious uses the good old SDXL architecture with a sophisticated dual encoder system combining CLIP ViT-L and OpenCLIP ViT-bigG to understand words and associate them with their visual equivalents.
This model is capable of processing and producing images at an impressive resolution of 1536 x 1536, which can be scaled up to 2048 x 2048 and even 3744 x 3744 without significant loss of quality.
For comparison, the original SDXL has a native resolution of 1024×1024.
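To put those resolutions in perspective, the pixel counts can be compared directly. This is a small arithmetic sketch using only the figures quoted above; the helper name `pixel_ratio` is made up for the example.

```python
BASE_SIDE = 1024  # SDXL's 1024 x 1024 square resolution


def pixel_ratio(side, base_side=BASE_SIDE):
    """Return how many times more pixels a square image has than the baseline."""
    return (side * side) / (base_side * base_side)


for side in (1536, 2048, 3744):
    print(f"{side} x {side}: {side * side:,} pixels, "
          f"{pixel_ratio(side):.2f}x the 1024 x 1024 baseline")
```

A 1536 x 1536 image therefore carries 2.25 times as many pixels as a 1024 x 1024 one, and 2048 x 2048 carries four times as many.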
Detailed Description
The journey to creating Illustrious was methodical and deliberate. The first training phase, which produced version 0.1, processed 7.5 million images with a batch size of 192 at a resolution of 1024×1024.
The team carefully balanced the learning rate and ran training over 20 epochs (an epoch is one full pass over the dataset) to establish a solid foundation. Once the results were satisfactory, the team increased both the size of the dataset and the resolution for the next iteration.
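The figures for version 0.1 imply a rough number of optimizer steps. The sketch below works it out under two assumptions that the article does not confirm: every image is seen exactly once per epoch, and no gradient accumulation is used. The function name `optimizer_steps` is hypothetical.

```python
import math


def optimizer_steps(num_images, batch_size, epochs):
    """Estimate steps per epoch and total steps for a training run."""
    # The last, partially filled batch still counts as one step.
    steps_per_epoch = math.ceil(num_images / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


# Figures reported for version 0.1: 7.5 million images, batch size 192, 20 epochs.
steps_per_epoch, total_steps = optimizer_steps(7_500_000, 192, 20)
print(f"{steps_per_epoch:,} steps per epoch, {total_steps:,} steps in total")
```

Under those assumptions, the run works out to 39,063 steps per epoch and 781,260 steps overall.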
At the advanced training stage, Illustrious truly began to shine. Version 1.0 expanded the dataset to 10 million images and increased the resolution to 1536×1536.
While the batch size was reduced to 128, advanced tag manipulation strategies and register tokens were introduced, fundamental changes that define the model’s superior performance.
Further work was done in the final refinement phase of version 2.0. Working with 20 million images at the same high resolution and a large batch size of 512, the team incorporated a multi-captioning technique that dramatically improved text-to-image correspondence.
The result is the best waifu generator known to man, with excellent fine-tuning capabilities, strong prompt adherence, neat aesthetics, and high-quality output.
For the more tech-savvy, Illustrious developers have also introduced a number of interesting techniques: the “No Dropout Tokens” approach, which ensures that certain tokens are never left out during training; quasi-register tokens, which allow the model to handle unknown or strange concepts; a cosine annealing scheduler for the learning rate; and multi-level dropout combined with input perturbation noise augmentation. Together, these turn a simple AI model into a powerful one.
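Two of these ideas are easy to sketch in a few lines. The code below shows the standard cosine annealing formula and a simple tag dropout routine in which a protected set of tags is never removed. It is an illustration of the general techniques only, not Onoma AI's implementation; the function names, the example caption, and the values used are all assumptions.

```python
import math
import random


def cosine_annealing_lr(step, total_steps, lr_max, lr_min=0.0):
    """Standard cosine annealing: decay smoothly from lr_max to lr_min."""
    progress = min(step, total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


def drop_tags(tags, protected, drop_prob, rng):
    """Randomly drop tags from a caption, but never drop protected ones."""
    kept = []
    for tag in tags:
        # Protected tags bypass the random draw entirely.
        if tag in protected or rng.random() >= drop_prob:
            kept.append(tag)
    return kept


# Hypothetical caption; "example character" stands in for a tag that must survive.
caption = ["1girl", "example character", "twintails", "smile", "outdoors"]
protected = {"example character"}
rng = random.Random(0)

augmented = drop_tags(caption, protected, drop_prob=0.5, rng=rng)
print(augmented)
print(cosine_annealing_lr(step=500, total_steps=1000, lr_max=1e-4))
```

The learning rate starts at `lr_max`, reaches the midpoint halfway through training, and ends at `lr_min`, while the protected tag appears in every augmented caption regardless of the random draws.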
How to use Illustrious
No additional steps are required to run Illustrious.
The installation process is the same as other SDXL models. Download the checkpoint and place it in the corresponding folder depending on the UI you use.
Windows and Linux
For ComfyUI, the folder is models/checkpoints inside the install directory. For A1111/Forge, it is models/Stable-diffusion. For Fooocus, it is also models/checkpoints.
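The folder conventions above can be captured in a small lookup. This is a sketch only: the install root `/opt/ComfyUI` and the file name `illustrious.safetensors` are placeholders, and `checkpoint_destination` is a hypothetical helper, not part of any of these UIs.

```python
from pathlib import PurePosixPath

# Checkpoint folder relative to each UI's install directory, as listed above.
CHECKPOINT_DIRS = {
    "comfyui": "models/checkpoints",
    "a1111": "models/Stable-diffusion",
    "forge": "models/Stable-diffusion",
    "fooocus": "models/checkpoints",
}


def checkpoint_destination(ui_name, install_root, filename):
    """Return the path where a downloaded checkpoint should be placed."""
    relative_dir = CHECKPOINT_DIRS[ui_name.lower()]
    # PurePosixPath keeps the printed result identical on every OS;
    # pathlib.Path would be the usual choice when actually moving files.
    return PurePosixPath(install_root) / relative_dir / filename


# Placeholder install root and file name, for illustration only.
destination = checkpoint_destination("ComfyUI", "/opt/ComfyUI", "illustrious.safetensors")
print(destination)
```

After copying the file to that location, the UI usually needs a refresh or restart before the model appears in its checkpoint list.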
MacOS
Mac users follow a similar path. However, some common macOS-oriented UIs require additional steps.
Draw Things users must click on “Models”, go to “Customize”, and click “Import Model”. From there, you can enter the URL to download Illustrious directly, or, if you have already saved the model to your local drive, click “Import Custom Model” and select the file. Diffusion Bee users should click the hamburger icon in the top right corner, click “Settings”, then “Add new model”, and select the Illustrious checkpoint they downloaded locally.
Once the model is loaded, there are three things to keep in mind.
Don’t use natural language. For better results, rely on Danbooru tags and stick to the old SDXL prompt style.
Don’t use Pony LoRAs. Because the two models take different approaches, we recommend using Illustrious LoRAs for best results.
Avoid using the original Illustrious models, and choose one of the popular fine-tunes instead. The original Illustrious is a base model, which makes it ideal as a starting point for fine-tuning toward a specific result. As with SDXL, Pony, and Flux, fine-tunes tend to give better results.
Best Illustrious Models to Choose
You can choose from many models focusing on different styles, aesthetics, and characteristics.
There are also popular models like NoobAI, which uses Illustrious as a base and is itself used by fine-tuners to build their own models.
That said, below are the top picks for different needs. These stand out for prompt adherence, output quality, and ease of use. All samples are from the Civitai community and are not copyrighted.
Great for versatility: Mistoon_Anime
Link: Mistoon_Anime – v1.0 Illustrious | Illustrious Checkpoint | Civitai