title | emoji | colorFrom | colorTo | sdk | sdk_version | app_file | pinned | license |
---|---|---|---|---|---|---|---|---|
Stable Diffusion XL 1.0 | 🔥 | yellow | gray | gradio | 3.11.0 | app.py | true | mit |
This is a Gradio demo with a web UI supporting Stable Diffusion XL 1.0. The demo loads both the base and the refiner models.
This is forked from StableDiffusion v2.1 Demo WebUI. Refer to the git commits to see the changes.
Update 🔥🔥🔥: Latent consistency models (LCM) LoRA is supported and enabled by default (controlled by ENABLE_LCM)! Turn on USE_SSD to use SSD-1B for an even faster generation (4.9 sec/image on a free Colab T4 without additional optimizations)! Colab has been updated to use this by default.
Update 🔥🔥🔥: Check out our work LLM-grounded Diffusion (LMD), which introduces LLMs into the diffusion world and achieves much better prompt understanding than standard Stable Diffusion without any fine-tuning! LMD with SDXL is supported on our GitHub repo, and a demo with SD is available.
Update: SDXL 1.0 is released and our Web UI demo supports it! No application is needed to get the weights! Launch the colab to get started. You can run this demo on Colab for free even on T4.
Update: Multiple GPUs are supported. You can easily spread the workload across GPUs by setting MULTI_GPU=True. This uses data parallelism to split the workload across the GPUs.
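As a rough illustration of what data parallelism means here, the sketch below splits a batch of prompts into one contiguous chunk per GPU. The helper name `split_across_devices` is hypothetical and is not taken from app.py; the actual demo may divide the work differently.

```python
from typing import List, Sequence


def split_across_devices(items: Sequence[str], num_devices: int) -> List[List[str]]:
    """Split items into num_devices contiguous chunks of near-equal size.

    Hypothetical helper for illustration; not the demo's actual code.
    """
    if num_devices < 1:
        raise ValueError("num_devices must be at least 1")
    base, extra = divmod(len(items), num_devices)
    chunks, start = [], 0
    for i in range(num_devices):
        # The first `extra` devices take one additional item each.
        size = base + (1 if i < extra else 0)
        chunks.append(list(items[start:start + size]))
        start += size
    return chunks


if __name__ == "__main__":
    prompts = ["a cat", "a dog", "a bird", "a fish", "a horse"]
    for device_id, chunk in enumerate(split_across_devices(prompts, 2)):
        print(f"cuda:{device_id} <- {chunk}")
```

Each chunk would then be passed to a pipeline copy living on its own device, and the resulting images concatenated in order.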
Update: See a more comprehensive comparison with 1200+ images here. Both SD XL and SD v2.1 are benchmarked on prompts from StableStudio.
Left: SDXL. Right: SD v2.1.
Without any tuning, SDXL generates much better images compared to SD v2.1!
With torch 2.0.1 installed, we also need to install:
pip install accelerate transformers invisible-watermark "numpy>=1.17" "PyWavelets>=1.1.1" "opencv-python>=4.1.0.25" safetensors "gradio==3.11.0"
pip install git+https://github.com/huggingface/diffusers.git
It's free and no form is needed now. Leaked weights seem to be available on reddit, but I have not used/tested them.
There are two ways to load the weights. Option 1 works out of the box (no manual download needed). If you prefer loading from a local repo, use Option 2.
Option 1: Run the command to automatically set up the weights:
PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512 python app.py
Option 2: If you have cloned both repos (base, refiner) locally (please change /path_to_sdxl):
PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512 SDXL_MODEL_DIR=/path_to_sdxl python app.py
Note that stable-diffusion-xl-base-1.0 and stable-diffusion-xl-refiner-1.0 should be placed in the same directory. The path of that directory should replace /path_to_sdxl.
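One way the two loading options could be resolved in code is sketched below. It assumes that, when SDXL_MODEL_DIR is unset, the models are fetched from the Hugging Face Hub under the `stabilityai` organization. The function name `resolve_model_paths` is hypothetical, and the logic in app.py may differ.

```python
import os
from typing import Mapping, Tuple

BASE_NAME = "stable-diffusion-xl-base-1.0"
REFINER_NAME = "stable-diffusion-xl-refiner-1.0"


def resolve_model_paths(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (base, refiner) locations for the two SDXL models.

    Hypothetical helper for illustration; not the demo's actual code.
    """
    model_dir = env.get("SDXL_MODEL_DIR", "")
    if model_dir:
        # Local clones: both repos sit side by side in one directory.
        return (
            os.path.join(model_dir, BASE_NAME),
            os.path.join(model_dir, REFINER_NAME),
        )
    # No local directory given: fall back to Hub repo IDs (assumed).
    return (f"stabilityai/{BASE_NAME}", f"stabilityai/{REFINER_NAME}")


if __name__ == "__main__":
    print(resolve_model_paths())
```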
Turning on torch.compile will make overall inference faster. However, this adds some overhead to the first run (i.e., you have to wait for compilation during the first run).
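A minimal sketch of how such a toggle might be applied, assuming a diffusers pipeline whose denoising network is exposed as `pipe.unet`. The function name `maybe_compile` and its `enabled` argument are hypothetical; torch is imported lazily so the function can be defined without it.

```python
def maybe_compile(pipe, enabled: bool):
    """Optionally wrap pipe.unet with torch.compile (needs torch >= 2.0).

    Hypothetical helper for illustration; not the demo's actual code.
    """
    if enabled:
        import torch  # imported lazily; only needed when compilation is on

        # torch.compile returns immediately; the actual compilation happens
        # on the first forward pass, which is why the first run is slower.
        pipe.unet = torch.compile(pipe.unet)
    return pipe
```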
- Turn on pipe.enable_model_cpu_offload() and turn off pipe.to("cuda") in app.py.
- Turn off the refiner by setting enable_refiner to False.
- More ways to save memory and make things faster.
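The first item could look roughly like the sketch below. It assumes the base model is loaded with diffusers' `DiffusionPipeline` from the `stabilityai/stable-diffusion-xl-base-1.0` Hub repo in fp16. The function name `load_base` and its `offload` argument are hypothetical, and app.py may structure this differently.

```python
def load_base(offload: bool = True):
    """Load the SDXL base pipeline, with or without model CPU offloading.

    Hypothetical helper for illustration; not the demo's actual code.
    """
    # Imported lazily so this sketch can be defined without a GPU setup.
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True,
    )
    if offload:
        # Keeps submodels on the CPU and moves each to the GPU only while
        # it runs. Requires accelerate; do not also call pipe.to("cuda").
        pipe.enable_model_cpu_offload()
    else:
        # Keeps the whole pipeline resident on the GPU (faster, more memory).
        pipe.to("cuda")
    return pipe
```

The two branches are alternatives: offloading trades generation speed for a lower peak memory footprint.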
- USE_SSD: use segmind/SSD-1B. This is a distilled SDXL model that is faster. This is disabled by default.
- ENABLE_LCM: use LCM LoRA. This is enabled by default.
- SDXL_MODEL_DIR: load SDXL locally.
- ENABLE_REFINER=true/false: turn the refiner on or off (the refiner refines the generation). The refiner is disabled by default if LCM LoRA or the SSD model is enabled.
- OFFLOAD_BASE and OFFLOAD_REFINER: can be set to true/false to enable/disable model offloading (model offloading saves memory at the cost of slowing down generation).
- OUTPUT_IMAGES_BEFORE_REFINER=true/false: useful if the refiner is enabled. Outputs images before and after the refiner stage.
- SHARE=true/false: creates a public link (useful for sharing and on Colab).
- MULTI_GPU=true/false: enables data parallelism on multiple GPUs.
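These settings are environment variables. The sketch below shows one way three of them could be parsed, including the rule that the refiner defaults to off when LCM LoRA or SSD-1B is enabled. The helper names `env_flag` and `read_config` are hypothetical, the set of accepted truthy spellings is an assumption, and the refiner defaulting to on when neither LCM nor SSD is active is inferred from the description rather than stated.

```python
import os
from typing import Dict, Mapping

# Assumed spellings that count as "true"; the demo may accept others.
_TRUE_VALUES = {"1", "true", "yes", "on"}


def env_flag(env: Mapping[str, str], name: str, default: bool) -> bool:
    """Read a boolean flag from env, falling back to default when unset."""
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in _TRUE_VALUES


def read_config(env: Mapping[str, str] = os.environ) -> Dict[str, bool]:
    """Hypothetical config reader; not the demo's actual code."""
    use_ssd = env_flag(env, "USE_SSD", False)       # disabled by default
    enable_lcm = env_flag(env, "ENABLE_LCM", True)  # enabled by default
    # The refiner default depends on the two flags above, but an explicit
    # ENABLE_REFINER value always wins.
    enable_refiner = env_flag(env, "ENABLE_REFINER", not (use_ssd or enable_lcm))
    return {
        "USE_SSD": use_ssd,
        "ENABLE_LCM": enable_lcm,
        "ENABLE_REFINER": enable_refiner,
    }


if __name__ == "__main__":
    print(read_config())
```

For example, launching with `ENABLE_LCM=false python app.py` would, under these assumptions, turn the refiner back on unless ENABLE_REFINER is also set.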