Ever since I realized that AI is shaping the future, I have been fascinated by its endless possibilities.
I like testing large language models (LLMs) on my own devices, and I have always preferred an open source approach.
Why? Because open source projects give you the control, privacy, and customization that are essential in today’s data-driven world.
When we decided to explore AI image generation, it felt like a natural extension of this idea. Why rely on a proprietary model when open source alternatives offer powerful functionality and flexibility?
I’ll be honest: I don’t have the ideal hardware to run these models locally at blazing speeds. But where there is a will, there is a way. Sure, CPU inference is very slow, but it gets the job done eventually (patience builds character, right?).
During my research, I came across several fascinating projects. Some are fully ripe and ready to use, while others are still sprouting and need more time to mature.
This article is a compiled list of some great open source AI image generators that you can run locally. If there are any gems that we’ve missed, feel free to let us know in the comments.
1. Stable Diffusion 1.5 (combined with Stable Diffusion WebUI)
Stable Diffusion v1.5 is a powerful latent text-to-image diffusion model designed to generate photorealistic images from text prompts.
It was developed as an evolution of the earlier versions, fine-tuned on the large ‘LAION-Aesthetics v2 5+’ dataset.
The model is particularly suitable for artistic, creative, and research purposes, and it delivers excellent results with relatively modest computational requirements.
Main features
Generates high-quality images from text using a latent diffusion process, which reduces computational overhead while still delivering impressive results.
Fine-tuned on a large dataset to improve its ability to produce visually appealing images.
Supports multiple platforms and tools, including the Diffusers library, ComfyUI, Automatic1111, SD.Next, and InvokeAI, and integrates into local Python workflows.
Offers efficient weight options: EMA-only weights for inference and EMA + non-EMA weights for fine-tuning.
Uses a pre-trained text encoder, an approach inspired by Google’s Imagen model, to interpret your text prompts.
Suits creative applications such as artwork, design prototypes, and educational visuals, making it a good fit for art and research.
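To make the "latent diffusion reduces overhead" point concrete, here is a minimal sketch in Python. It assumes the usual Stable Diffusion v1.x layout (4 latent channels, 8x downsampling per side) and uses the Diffusers library for the actual generation. The helper names, the model ID, and the default settings are my own assumptions for illustration, not something the project prescribes.

```python
def latent_shape(width: int, height: int, channels: int = 4, factor: int = 8):
    """Return (channels, height, width) of the latent tensor for an image.

    Assumes the Stable Diffusion v1.x layout: 4 latent channels and an
    8x spatial downsampling factor on each side.
    """
    if width % factor or height % factor:
        raise ValueError(f"width and height must be multiples of {factor}")
    return (channels, height // factor, width // factor)


def compression_ratio(width: int, height: int) -> float:
    """How many times fewer values the latent holds than the RGB image."""
    c, h, w = latent_shape(width, height)
    return (3 * width * height) / (c * h * w)


def generate_image(prompt: str,
                   out_path: str = "output.png",
                   # Assumed Hugging Face model ID; swap in whichever copy you use.
                   model_id: str = "stable-diffusion-v1-5/stable-diffusion-v1-5",
                   steps: int = 25,
                   device: str = "cpu") -> str:
    """Generate one 512x512 image with Diffusers and save it to out_path."""
    # Imported lazily so the helpers above work without the heavy dependencies.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float32)
    pipe = pipe.to(device)  # "cpu" works, just slowly; use "cuda" if available
    result = pipe(prompt, num_inference_steps=steps, guidance_scale=7.5,
                  width=512, height=512)
    result.images[0].save(out_path)
    return out_path


if __name__ == "__main__":
    print(latent_shape(512, 512))       # (4, 64, 64)
    print(compression_ratio(512, 512))  # 48.0
```

For a 512x512 image the model denoises a 4x64x64 latent instead of a 3x512x512 pixel grid, which is 48 times fewer values per step. Calling generate_image("a lighthouse at dusk") downloads the weights on first use and, on a CPU, can take several minutes per image.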
2. InvokeAI
InvokeAI is a robust open source image generation project built around Stable Diffusion that provides users with a highly customizable experience for creating unique visuals.
Whether you’re looking to generate artwork, photorealistic images, or something more abstract, InvokeAI offers a powerful toolkit with an easy-to-use interface.
Its flexibility is perfect for people who want more control over their creative process, especially those who work with specific intellectual property or require customized workflows.
Main features
Create highly detailed prompts with positive and negative guidance options to steer the generation process.
Generate images from text descriptions, with numerous customization options for more control.
Use existing images as a reference to help the AI
3. OpenJourney
OpenJourney is a powerful open-source text-to-image AI art generator that allows users to create stunning visuals from text prompts.
It was launched by PromptHero in November 2022 and quickly gained popularity as a free alternative to Midjourney.
Built on Stable Diffusion, OpenJourney was trained using thousands of Midjourney images from the v4 update, as well as other AI models such as DALL-E 2.
OpenJourney excels at producing photorealistic and artistic images, and its open source nature allows it to remain accessible to a wide range of users.
Main features
Creates stunning visuals from your text prompts with powerful text-to-image generation capabilities.
Produces both photorealistic and artistic images, which suits artists, designers, and anyone who wants to generate high-quality content.
Gives you access to PromptHero’s curated library of prompt ideas to spark your creativity.
Lets you customize the style and content of the generated images by writing specific prompts that match your vision.
Builds on a Stable Diffusion-based architecture, with additional training on Midjourney images to enhance its output.
Is freely downloadable on Hugging Face as part of a broad ecosystem of open source AI models.
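As a small illustration of steering style through the prompt, the sketch below assembles a prompt from a subject and a few modifiers. It assumes the original OpenJourney checkpoint expects the trigger phrase "mdjrny-v4 style" in the prompt, as I recall from its model card; check the card for the version you download. The function name is hypothetical.

```python
# Assumed trigger phrase for the original OpenJourney checkpoint.
OPENJOURNEY_STYLE_TOKEN = "mdjrny-v4 style"


def build_openjourney_prompt(subject: str, modifiers=()) -> str:
    """Join a subject and optional style modifiers into one prompt string.

    The style token is appended once, unless the text already contains it.
    """
    parts = [subject.strip()] + [m.strip() for m in modifiers]
    parts = [p for p in parts if p]  # drop empty fragments
    if not parts:
        raise ValueError("the prompt needs at least a subject")
    prompt = ", ".join(parts)
    if OPENJOURNEY_STYLE_TOKEN not in prompt:
        prompt = f"{prompt}, {OPENJOURNEY_STYLE_TOKEN}"
    return prompt


if __name__ == "__main__":
    print(build_openjourney_prompt("a lighthouse at dusk",
                                   ["oil painting", "dramatic sky"]))
```

The resulting string can be passed as the prompt to whichever front end or pipeline you load the model into.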
4. LocalAI (all-rounder)
LocalAI is an open-source, free alternative to OpenAI that enables local AI inference on consumer-grade hardware.
It serves as a drop-in replacement for OpenAI’s API specification, allowing you to run large language models (LLMs) and generate images, audio, and more without the need for a GPU.
Created and maintained by Ettore Di Giacinto, LocalAI provides a flexible and cost-effective solution for running AI models on-premises.
Main features
Compatible with the OpenAI API specification, making integration easy for developers.
Runs on consumer-grade hardware, with no GPU required.
Supports a wide range of model families and sources, including Llama, Hugging Face, and Ollama, for diverse applications.
Provides advanced text generation through backends such as llama.cpp and transformers.
Generates images from text prompts for creative projects.
Includes text-to-audio conversion, plus audio-to-text transcription with whisper.cpp.
Facilitates embedding generation for vector database tasks such as semantic search.
Provides peer-to-peer inference for distributed AI processing across multiple devices.
Integrates voice activity detection using Silero-VAD to improve accuracy for audio tasks.
Provides an easy-to-use WebUI for managing your models without extra technical steps.
Features a model gallery for browsing and downloading models directly from platforms such as Hugging Face.
5. Fooocus (Editor’s Choice)
Fooocus caught my attention as one of the most user-friendly and innovative open source image generators.
What particularly appealed to me was its ability to run on modest hardware (a meager laptop like mine), be compatible with a variety of models, and handle a wide variety of styles.
It’s like having a Swiss Army knife for image generation.
Main features
Fooocus has its own inpainting algorithm that provides excellent results for editing and perfecting images.
It can use multiple prompts simultaneously, which enriches the creative possibilities and the variety of output.
It supports a huge number of SDXL models, ranging from artistic to photorealistic styles, giving users plenty of room to experiment.
You can specify a custom aspect ratio for generated images, so the output matches your requirements.
Advanced style controls such as contrast, sharpness, and color adjustments allow users to fine-tune the resulting images precisely.
Fooocus leverages A1111’s reweighting algorithm to strengthen the influence of specific elements within a prompt for more targeted results.
It incorporates InsightFace technology for accurate face swapping, making it ideal for creating personalized images.
Fooocus is performance-optimized across a wide range of hardware configurations, keeping it accessible and fast regardless of your setup.
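The reweighting feature relies on the A1111-style prompt syntax, where writing (text:1.4) gives that part of the prompt more influence. The sketch below is a deliberately simplified parser that shows how such a prompt splits into weighted pieces. It is not Fooocus's or A1111's actual implementation: it handles only the explicit (text:weight) form and ignores nesting, square brackets, and escapes.

```python
import re

# Matches the explicit "(text:weight)" form only, e.g. "(red:1.4)".
_WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")


def parse_weighted_prompt(prompt: str) -> list:
    """Split a prompt into (text, weight) pairs.

    Text outside any "(text:weight)" group gets the neutral weight 1.0.
    """
    pieces = []
    cursor = 0
    for match in _WEIGHTED.finditer(prompt):
        if match.start() > cursor:
            # Unweighted text between the previous group and this one.
            pieces.append((prompt[cursor:match.start()], 1.0))
        pieces.append((match.group(1), float(match.group(2))))
        cursor = match.end()
    if cursor < len(prompt):
        pieces.append((prompt[cursor:], 1.0))
    return pieces


if __name__ == "__main__":
    for text, weight in parse_weighted_prompt("a (red:1.4) fox in the (snow:0.8)"):
        print(repr(text), weight)
```

Weights above 1.0 emphasize a phrase and weights below 1.0 tone it down. The image generator then scales how strongly each piece conditions the result.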
Conclusion
And that’s a wrap! From Stable Diffusion to Fooocus, these are some open source projects that you can host or deploy locally to create stunning images on your own hardware.
While I won’t get into the gray area of
I like exploring local AI tools. Take a look at our list of open source AI tools for working with documents.
5 local AI tools for working with PDFs and documents
Use these local AI tools to privately interact with your documents.
Now, before you get lost in a sea of
What do you think? Are there any hidden gems I missed? Do you agree with my secret love for LocalAI and Fooocus?
Pop into the comments section and let us know what you think. Who knows? Your suggestion might be the next project I test (if my CPU allows, of course).
Until next time, keep creating and keep dreaming.