How do I run a Stable Diffusion model?
Regardless of your choice of Stable Diffusion inference server, models, or extensions, the basic process is as follows:
- Get a Docker image that runs your inference server.
- Copy any models and extensions you want into the Docker image.
- Ensure the container is listening on an IPv6 address.
- Push the new image up to a container registry.
- Deploy the image as a Salad container group.
Steps
- Find a Docker image of your preferred Stable Diffusion inference server. Here are some popular ones that we’ve verified work on Salad. We’ll be using ComfyUI for this example, but the principles covered apply to all of these options.
  - ComfyUI
    - Git Repo: https://github.com/ai-dock/comfyui
    - Docker Image: ghcr.io/ai-dock/comfyui:latest-cuda
    - Model Directory: /opt/ComfyUI/models
    - Custom Node Directory: /opt/ComfyUI/custom_nodes/
  - Automatic1111
    - Git Repo: https://github.com/SaladTechnologies/stable-diffusion-webui-docker
    - Docker Image: saladtechnologies/a1111:ipv6-latest
    - Data Directory: /data
    - Model Directory: /data/models
    - Extension Directory: /data/config/auto/extensions
    - ControlNet Model Directory (if the ControlNet extension is installed): /data/config/auto/extensions/sd-webui-controlnet/models
  - SD.Next
    - Git Repo: https://github.com/SaladTechnologies/sdnext
    - Docker Image: saladtechnologies/sdnext:base
    - Data Directory: /webui/data
    - Model Directory: /webui/data/models
    - Extension Directory: /webui/data/extensions
    - ControlNet Model Directory: /webui/extensions-builtin/sd-webui-controlnet/models
Note that you will be interacting with these as an API, and not through their browser user interface.
- Download any model files you plan to use. For our example, we’re going to use Dreamshaper 8, available on Civitai.com (https://blog.salad.com/civitai-salad/).
- Create a new file called Dockerfile and open it in your preferred text editor. At this point, your directory should look like this:
    .
    ├── Dockerfile
    └── dreamshaper_8.safetensors
- Copy the following into your Dockerfile:
    # We're going to use this verified ComfyUI image as a base
    FROM ghcr.io/ai-dock/comfyui:latest-cuda

    # Now we copy our model into the image
    ENV MODEL_DIR=/opt/ComfyUI/models
    COPY dreamshaper_8.safetensors ${MODEL_DIR}/checkpoints/dreamshaper_8.safetensors

    # We also need to copy the comfyui-wrapper binary into the image, since ComfyUI
    # is fully asynchronous by default, and has no convenient way to retrieve
    # generated images.
    ADD https://github.com/SaladTechnologies/comfyui-wrapper/releases/download/1.0.0/comfyui-wrapper .
    RUN chmod +x comfyui-wrapper

    CMD ["./comfyui-wrapper"]
Note that we are adding a simple wrapper binary to the image to make it easier to retrieve generated images. ComfyUI accepts prompts into a queue, and then eventually saves images to the local filesystem. This makes it difficult to use in a stateless environment like Salad. The wrapper extends the ComfyUI /prompt API so that you can either receive the generated images in the response body, or have completed images submitted to a webhook URL that you provide. Automatic1111 and SD.Next can already return generated images in the response out of the box, so no additional binary is required for those options.
- Build the Docker image. You should change the specified tag to suit your purpose.
docker build -t saladtechnologies/comfyui:wrapped-1.0.0 .
- (Recommended) Run the Docker image locally to confirm it works as expected:
    docker run -it --rm --gpus all -p 3000:3000 -p 8188:8188 --name comfyui \
      saladtechnologies/comfyui:wrapped-1.0.0
When running locally, we expose two ports: port 3000, which is required for the wrapper, and port 8188, which gives us access to the web UI so that it is easier to get the prompt object we need for the API. In production, we will only expose port 3000.
a. Go to http://localhost:8188/ in your browser. You should see something like this:
b. Click “Queue Prompt” to generate an image. Mine came out like this.
c. Enable Dev Mode Options via the settings menu. You should see a new option in the menu, “Save (API Format)”:
d. Click the “Save (API Format)” button, and save it. You’ll get a file called “workflow_api.json” that contains everything ComfyUI needs to run that prompt again.
    {
      "3": {
        "inputs": {
          "seed": 712610403220747,
          "steps": 20,
          "cfg": 8,
          "sampler_name": "euler",
          "scheduler": "normal",
          "denoise": 1,
          "model": ["4", 0],
          "positive": ["6", 0],
          "negative": ["7", 0],
          "latent_image": ["5", 0]
        },
        "class_type": "KSampler",
        "_meta": { "title": "KSampler" }
      },
      "4": {
        "inputs": { "ckpt_name": "dreamshaper_8.safetensors" },
        "class_type": "CheckpointLoaderSimple",
        "_meta": { "title": "Load Checkpoint" }
      },
      "5": {
        "inputs": { "width": 512, "height": 512, "batch_size": 1 },
        "class_type": "EmptyLatentImage",
        "_meta": { "title": "Empty Latent Image" }
      },
      "6": {
        "inputs": {
          "text": "beautiful scenery nature glass bottle landscape, , purple galaxy bottle,",
          "clip": ["4", 1]
        },
        "class_type": "CLIPTextEncode",
        "_meta": { "title": "CLIP Text Encode (Prompt)" }
      },
      "7": {
        "inputs": { "text": "text, watermark", "clip": ["4", 1] },
        "class_type": "CLIPTextEncode",
        "_meta": { "title": "CLIP Text Encode (Prompt)" }
      },
      "8": {
        "inputs": { "samples": ["3", 0], "vae": ["4", 2] },
        "class_type": "VAEDecode",
        "_meta": { "title": "VAE Decode" }
      },
      "9": {
        "inputs": { "filename_prefix": "ComfyUI", "images": ["8", 0] },
        "class_type": "SaveImage",
        "_meta": { "title": "Save Image" }
      }
    }
You might notice this is a fairly unintuitive prompting format, but it does capture the nodes and connections used by ComfyUI: each top-level key is a node ID, and an input such as [ "4", 0 ] refers to output 0 of node "4". In my experience, the ComfyUI web UI is the best way to design your prompts, rather than trying to create a workflow JSON file like this from scratch.
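Once a workflow has been exported, small changes such as a new prompt or seed can be made in code instead of re-exporting from the web UI. The following is a minimal Python sketch, assuming the node IDs of the example export above ("6" for the positive prompt, "3" for the KSampler). Your own export may number its nodes differently, and `customize_workflow` is a hypothetical helper name, not part of ComfyUI or the wrapper.

```python
import copy
import json


def customize_workflow(workflow, positive_text, seed):
    """Return a copy of an exported workflow with a new prompt and seed.

    Assumes the node IDs of the example export: "6" is the positive
    CLIPTextEncode node and "3" is the KSampler. Check your own file.
    """
    updated = copy.deepcopy(workflow)  # leave the caller's dict untouched
    updated["6"]["inputs"]["text"] = positive_text
    updated["3"]["inputs"]["seed"] = seed
    return updated


if __name__ == "__main__":
    # A trimmed stand-in for the contents of workflow_api.json.
    example = {
        "3": {"inputs": {"seed": 712610403220747}, "class_type": "KSampler"},
        "6": {"inputs": {"text": "old prompt"}, "class_type": "CLIPTextEncode"},
    }
    print(json.dumps(customize_workflow(example, "a snowy village", 42), indent=2))
```

In practice you would load the real file with `json.load` and pass the resulting dictionary in place of `example`.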
e. Submit the prompt to the wrapper API on port 3000, using Postman or any HTTP request tool of your choice.
You should submit a POST request to http://localhost:3000/prompt with a JSON request body like this, where the value of “prompt” is the workflow JSON we saved previously.
{ "prompt": { ... } }
f. Within a few seconds you should receive a response like this:
{ "id": "randomuuid", "prompt": { ... }, "images": ["base64encodedimage"] }
g. Decode the base64 encoded string into your image. You can do this in a free browser tool such as https://codebeautify.org/base64-to-image-converter or using CLI tools like jq and base64.
For the CLI method, first save your response to a file called response.json. Then, run the following command:
    jq -r '.images[0]' response.json | base64 -d > image.png
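If you would rather script sub-steps e through g than use Postman and jq, the whole round trip can be sketched in Python with only the standard library. This is an illustrative sketch, not the wrapper's official client: the function names, the 300 second timeout, and the `image_N.png` output names are all assumptions, while the URL, the `{"prompt": ...}` body, and the `images` response field come from the steps above.

```python
import base64
import json
import urllib.request


def build_request(workflow, url="http://localhost:3000/prompt"):
    """Wrap the exported workflow in {"prompt": ...} and prepare a POST."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decode_images(response):
    """Decode each base64 string in the response's "images" list to bytes."""
    return [base64.b64decode(image) for image in response["images"]]


def generate(workflow_path="workflow_api.json", output_prefix="image"):
    """Submit the workflow and write image_0.png, image_1.png, ... to disk."""
    with open(workflow_path, encoding="utf-8") as f:
        workflow = json.load(f)
    # Generation can take a while; the 300 s timeout is an arbitrary choice.
    with urllib.request.urlopen(build_request(workflow), timeout=300) as resp:
        response = json.load(resp)
    paths = []
    for index, image_bytes in enumerate(decode_images(response)):
        path = f"{output_prefix}_{index}.png"
        with open(path, "wb") as out:
            out.write(image_bytes)
        paths.append(path)
    return paths
```

With the container running locally, calling `generate()` from the directory that holds workflow_api.json should leave the decoded image files next to it.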
- Push your Docker image up to Docker Hub (or the container registry of your choice).
docker push saladtechnologies/comfyui:wrapped-1.0.0
- Deploy your image on Salad, using either the Portal or the Salad Public API.
We’re going to name our container group something obvious, and fill in the configuration form. Since this is a Stable Diffusion 1.5 based model, we’re going to give ourselves fairly modest hardware: 4 vCPUs, 12 GB RAM, an RTX 3060 Ti GPU, and a reserved 1 GB of local storage for temporary storage of images as they are being generated. We’re going to use 3 replicas, to ensure coverage during node interruptions and reallocations.
Additionally, we will want to configure our startup and readiness probes (endpoints provided by the wrapper), and enable the container gateway on port 3000. We’ve disabled authentication for this example, but you may want to enable it. If you enable authentication, requests must be submitted with your Salad API Key in the Salad-Api-Key header.
Click Deploy, and wait for the deployment to come up.
- Wait for the deployment to be ready.
a. First, Salad pulls your container image into our own internal high-performance cache.
b. Next, Salad locates eligible nodes for your workload, based on the configuration you provided.
c. Next, Salad begins downloading the cached container image to the nodes that have been assigned to your workload.
This step can take tens of minutes in some cases, depending on the size of the image, and the internet speed of the individual nodes. Note that our progress bars are only estimates, and do not necessarily reflect real-time download status. These slow cold starts, and the possibility of nodes being interrupted by their host without warning, are why we always want to provision multiple replicas.
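Because an individual node can be interrupted mid-request, a client may also want to retry failed calls so that another replica can pick them up. Below is a small Python sketch of retrying with exponential backoff. It is our own illustration rather than anything Salad or the wrapper provides: `call_with_retries`, its default values, and the choice to retry on any `OSError` (which covers `urllib.error.URLError` and connection resets) are assumptions.

```python
import time


def call_with_retries(send, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds, doubling the wait after each failure.

    send is any zero-argument callable that performs the request.
    sleep is injectable so the helper can be exercised without waiting.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return send()
        except OSError as error:  # URLError and ConnectionError are OSErrors
            last_error = error
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

The waits between tries are `base_delay`, then twice that, then four times that, and so on, so the defaults wait 1, 2, and 4 seconds before giving up after the fourth failure.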
d. Eventually, you will see instances listed as “running”, with a green check in the “ready” column.
- Submit your prompt to the provided Access Domain Name. You will get back a JSON response within a few seconds. See the local testing step above (sub-steps e through g) for how to submit the request and process the response.
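For the deployed container group, the request looks the same as the local one except for the host and, if you enabled authentication, the Salad-Api-Key header. Here is a hedged Python sketch of preparing that request. `build_salad_request` is a hypothetical helper, and it assumes the Access Domain Name is reached over HTTPS with the wrapper's /prompt path; substitute the exact domain shown in the Portal.

```python
import json
import urllib.request


def build_salad_request(access_domain, workflow, api_key=None):
    """Prepare a POST to https://<access_domain>/prompt.

    api_key is only needed when authentication is enabled on the
    container gateway; it is sent in the Salad-Api-Key header.
    """
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Salad-Api-Key"] = api_key
    return urllib.request.Request(
        f"https://{access_domain}/prompt",  # HTTPS scheme is an assumption
        data=json.dumps({"prompt": workflow}).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request, and the JSON response carries the same `images` list of base64 strings as in local testing.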