SegMoE is an exciting framework for creating Mixture of Experts models from scratch! SegMoE is comprehensively integrated into the Hugging Face ecosystem and comes supported with diffusers.
Among the features and integrations released today are models on the Hub and the segmoe package for creating your own MoE models.
What is SegMoE?
SegMoE models follow the same architecture as Stable Diffusion. Like Mixtral 8x7B, a SegMoE model combines multiple models in one. This works by replacing some feed-forward layers with sparse MoE layers. A MoE layer contains a router network that selects the experts that process each token most efficiently. You can create your own MoE model using the segmoe package! The process takes just a few minutes. For more information, please see the GitHub repository. SegMoE was inspired by the popular library mergekit, and we thank the mergekit contributors for such a useful library.
For more information about MoEs, see the Hugging Face post: hf.co/blog/moe.
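To make the routing idea concrete, here is a minimal PyTorch sketch of a sparse MoE feed-forward layer. This illustrates the general technique, not SegMoE's actual implementation; the class, dimensions, and defaults are made up for the example.

import torch
import torch.nn as nn

class SparseMoEFeedForward(nn.Module):
    # Illustrative sparse MoE layer: a router scores each token, the top-k
    # expert feed-forward blocks are selected, and their outputs are mixed
    # using the normalized gate weights.
    def __init__(self, dim, hidden_dim, num_experts=4, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts, bias=False)  # gate network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden_dim), nn.GELU(), nn.Linear(hidden_dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):
        # x: (batch, tokens, dim)
        weights, idx = self.router(x).topk(self.top_k, dim=-1)  # per-token expert choice
        weights = weights.softmax(dim=-1)                       # normalize gate weights
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., k] == e          # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out

moe = SparseMoEFeedForward(dim=64, hidden_dim=256)
y = moe(torch.randn(2, 16, 64))  # output keeps the input shape: (2, 16, 64)

Each token only activates top_k of the expert blocks, which is why the number of active parameters per step stays well below the total parameter count.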
SegMoE release TL;DR
Release of the SegMoE-4x2, SegMoE-2x1, and SegMoE-SD-4x2 versions, along with code to make custom MoEs.
About the name
SegMoE MoEs are named SegMoE-AxB, where A refers to the number of expert models MoE-d together, and the second number, B, refers to the number of experts involved in the generation of each image; for example, SegMoE-4x2 combines four expert models and uses two of them for each image. Depending on your configuration settings, only some layers of the model (the feed-forward blocks, the attention blocks, or all of them) are replicated; the remaining parameters are the same as in a Stable Diffusion model. For more details about how MoEs work, see the Mixture of Experts Explained post.
Inference
We release three merges on the Hub:
SegMoE 2x1 has two expert models. SegMoE 4x2 has four expert models. SegMoE SD 4x2 has four Stable Diffusion 1.5 expert models.
Samples
Images generated using SegMoE 4x2:
Images generated using SegMoE 2x1:
Images generated using SegMoE SD 4x2:
Using 🤗 Diffusers
Run the following command to install the segmoe package. Make sure you have the latest versions of diffusers and transformers installed.
pip install -U segmoe diffusers transformers
Below we load the second model from the list above ("SegMoE 4x2") and run generation on it.
from segmoe import SegMoEPipeline

pipeline = SegMoEPipeline("segmind/SegMoE-4x2-v0", device="cuda")

prompt = "cosmic canvas, orange city background, painting of a chubby cat"
negative_prompt = "nsfw, bad quality, worse quality"
img = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=1024,
    width=1024,
    num_inference_steps=25,
    guidance_scale=7.5,
).images[0]
img.save("image.png")
Using a local model
Alternatively, a local model can be loaded as well. Here, segmoe_v0 is the path to the directory containing the local SegMoE model. See "Create your own SegMoE" below to learn how to build one!
from segmoe import SegMoEPipeline

pipeline = SegMoEPipeline("segmoe_v0", device="cuda")

prompt = "cosmic canvas, orange city background, painting of a chubby cat"
negative_prompt = "nsfw, bad quality, worse quality"
img = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=1024,
    width=1024,
    num_inference_steps=25,
    guidance_scale=7.5,
).images[0]
img.save("image.png")
Comparison
As shown in the images below, prompt understanding seems to improve. Each image shows the following models, from left to right: SegMoE-2x1-v0, SegMoE-4x2-v0, and the base model (RealVisXL_V3.0).
3 green glass bottles
Panda with aviator glasses on its head
The Statue of Liberty next to the Washington Monument
Taj Mahal with its reflection, detailed charcoal sketch
Create your own SegMoE
Simply prepare a config.yaml file with the following structure:
base_model: Base model path, model card, or CivitAI download link
num_experts: Number of experts to use
moe_layers: Type of layers to mix (can be "ff", "attn", or "all"). Defaults to "attn"
num_experts_per_tok: Number of experts to use per token
experts:
  - source_model: Expert 1 path, model card, or CivitAI download link
    positive_prompt: Positive prompt for computing gate weights
    negative_prompt: Negative prompt for computing gate weights
  - source_model: Expert 2 path, model card, or CivitAI download link
    positive_prompt: Positive prompt for computing gate weights
    negative_prompt: Negative prompt for computing gate weights
  - source_model: Expert 3 path, model card, or CivitAI download link
    positive_prompt: Positive prompt for computing gate weights
    negative_prompt: Negative prompt for computing gate weights
  - source_model: Expert 4 path, model card, or CivitAI download link
    positive_prompt: Positive prompt for computing gate weights
    negative_prompt: Negative prompt for computing gate weights
Any number of models can be combined. For more information on how to create a config file, please refer to the GitHub repository.
Note that both Hugging Face and CivitAI models are supported. For CivitAI models, paste the model's download link, for example: "https://civitai.com/api/download/models/239306".
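For illustration, here is what a small, filled-in config might look like. This is a hypothetical example: the expert models, prompts, and settings are placeholders, not the configuration behind the released merges.

base_model: SG161222/RealVisXL_V3.0  # placeholder base model from the Hub
num_experts: 2
moe_layers: attn
num_experts_per_tok: 1
experts:
  - source_model: SG161222/RealVisXL_V3.0  # placeholder Hub expert
    positive_prompt: "photorealistic, ultra detailed portrait"
    negative_prompt: "cartoon, illustration, low quality"
  - source_model: https://civitai.com/api/download/models/239306  # placeholder CivitAI expert
    positive_prompt: "anime style, vibrant colors, clean lineart"
    negative_prompt: "photo, realistic, blurry"

The positive and negative prompts describe what each expert is good (and bad) at, and are used to compute the gate weights that route generation between experts.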
Next, run the following command:
segmoe config.yaml segmoe_v0
This creates a folder called segmoe_v0 with the following structure:
├── model_index.json
├── scheduler
│   └── scheduler_config.json
├── text_encoder
│   ├── config.json
│   └── model.safetensors
├── text_encoder_2
│   ├── config.json
│   └── model.safetensors
├── tokenizer
│   ├── merges.txt
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   └── vocab.json
├── tokenizer_2
│   ├── merges.txt
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   └── vocab.json
├── unet
│   ├── config.json
│   └── diffusion_pytorch_model.safetensors
└── vae
    ├── config.json
    └── diffusion_pytorch_model.safetensors
Alternatively, you can use the Python API to create a mixture of experts model.
from segmoe import SegMoEPipeline

pipeline = SegMoEPipeline("config.yaml", device="cuda")
pipeline.save_pretrained("segmoe_v0")
Push to the Hub
The model can be pushed to the Hub via huggingface-cli:
huggingface-cli upload segmind/segmoe_v0 ./segmoe_v0
The model can also be pushed to the Hub directly from Python:
from huggingface_hub import create_repo, upload_folder

model_id = "segmind/segmoe-v0"
repo_id = create_repo(repo_id=model_id, exist_ok=True).repo_id
upload_folder(
    repo_id=repo_id,
    folder_path="segmoe_v0",
    commit_message="Initial commit",
    ignore_patterns=["step_*", "epoch_*"],
)
Detailed usage can be found here
Disclaimer and ongoing work
Slower speed: if the number of experts per token is larger than 1, the MoE performs computation across several expert models. This makes it slower than a single SD 1.5 or SDXL model.
High VRAM usage: MoEs run inference very quickly, but still need a large amount of VRAM (and hence an expensive GPU). This makes them challenging to use in local setups, but they are great for deployments with multiple GPUs. As a reference point, SegMoE-4x2 requires about 24 GB of VRAM in half precision.
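If you want to measure the speed and VRAM trade-off on your own hardware, a quick sketch like the following can help. It assumes a CUDA GPU and reuses the pipeline call shown earlier; the timing approach is just one simple option.

import time
import torch
from segmoe import SegMoEPipeline

pipeline = SegMoEPipeline("segmind/SegMoE-4x2-v0", device="cuda")

# Reset the peak-memory counter, then time a single 1024x1024 generation.
torch.cuda.reset_peak_memory_stats()
start = time.perf_counter()
img = pipeline(
    prompt="cosmic canvas, orange city background, painting of a chubby cat",
    height=1024,
    width=1024,
    num_inference_steps=25,
    guidance_scale=7.5,
).images[0]
torch.cuda.synchronize()  # make sure all GPU work is finished before timing

print(f"generation time: {time.perf_counter() - start:.1f} s")
print(f"peak VRAM: {torch.cuda.max_memory_allocated() / 1024**3:.1f} GiB")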
Conclusion
With SegMoE, we provide the community with a new tool to easily create SOTA diffusion models by combining pretrained models, while keeping inference times low. We are excited to see what you can build with it!
Additional resources