Serve a StableDiffusion text-to-image model on Kubernetes#
Note: The Python files for the Ray Serve application and its client are in the ray-project/serve_config_examples repo and the Ray documentation.
Step 1: Create a Kubernetes cluster with GPUs#
Follow aws-eks-gpu-cluster.md or gcp-gke-gpu-cluster.md to create a Kubernetes cluster with 1 CPU node and 1 GPU node.
Step 2: Install KubeRay operator#
Follow this document to install the latest stable KubeRay operator from the Helm repository.
Note that the YAML file in this example uses `serveConfigV2`, which is supported starting from KubeRay v0.6.0.
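As a sketch of what `serveConfigV2` looks like, it embeds a multi-line Ray Serve config string inside the RayService spec. The field names below are illustrative assumptions; the sample manifest downloaded in Step 3 is authoritative:

```yaml
# Illustrative fragment only; application name, import path, and pip pin
# are assumptions -- see ray-service.stable-diffusion.yaml for the real values.
apiVersion: ray.io/v1
kind: RayService
metadata:
  name: stable-diffusion
spec:
  serveConfigV2: |
    applications:
      - name: stable_diffusion
        import_path: stable_diffusion:entrypoint
        runtime_env:
          pip: ["diffusers"]
```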
Step 3: Install a RayService#
# Step 3.1: Download `ray-service.stable-diffusion.yaml`
curl -LO https://raw.githubusercontent.com/ray-project/kuberay/v1.2.2/ray-operator/config/samples/ray-service.stable-diffusion.yaml
# Step 3.2: Create a RayService
kubectl apply -f ray-service.stable-diffusion.yaml
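After applying the manifest, you can watch the resources come up. These commands require the live cluster from Step 1, and the RayService name `stable-diffusion` is an assumption based on the sample manifest:

```
# Check the RayService custom resource (name assumed from the sample manifest).
kubectl get rayservice stable-diffusion
# Watch the head and worker Pods until they reach the Running state.
kubectl get pods
```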
This RayService configuration contains some important settings:
- The `tolerations` for workers allow them to be scheduled on nodes without any taints or on nodes with specific taints. However, workers will only be scheduled on GPU nodes because we set `nvidia.com/gpu: 1` in the Pod's resource configurations.

  # Please add the following taints to the GPU node.
  tolerations:
    - key: "ray.io/node-type"
      operator: "Equal"
      value: "worker"
      effect: "NoSchedule"
- It includes `diffusers` in `runtime_env` since this package is not included by default in the `ray-ml` image.
Step 4: Forward the port of Serve#
First, get the service name with this command.
kubectl get services
Then, port-forward to the Serve application.
kubectl port-forward svc/stable-diffusion-serve-svc 8000
Note that the RayService’s Kubernetes service will be created after the Serve applications are ready and running. This process may take approximately 1 minute after all Pods in the RayCluster are running.
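Once the port-forward is running, you can check that the Serve HTTP proxy is reachable before sending a full request. Ray Serve's proxy exposes a `/-/routes` endpoint listing the deployed route prefixes (this check requires the port-forward above to be active):

```
curl http://localhost:8000/-/routes
```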
Step 5: Send a request to the text-to-image model#
# Step 5.1: Download `stable_diffusion_req.py`
curl -LO https://raw.githubusercontent.com/ray-project/serve_config_examples/master/stable_diffusion/stable_diffusion_req.py
# Step 5.2: Set your `prompt` in `stable_diffusion_req.py`.
# Step 5.3: Send a request to the Stable Diffusion model.
python stable_diffusion_req.py
# Check output.png
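As a rough sketch of what a client like `stable_diffusion_req.py` does: send an HTTP request carrying the prompt to the forwarded Serve port and write the returned image bytes to `output.png`. The `/imagine` endpoint path and the `prompt` query parameter are assumptions; the downloaded script is authoritative.

```python
# Hedged sketch of a minimal text-to-image client; endpoint path and
# query parameter names are assumptions, not the verified API.
import requests

SERVE_URL = "http://127.0.0.1:8000/imagine"  # assumes the port-forward from Step 4


def build_request(prompt: str):
    """Return the endpoint URL and query parameters for the model."""
    return SERVE_URL, {"prompt": prompt}


def save_image(content: bytes, path: str = "output.png") -> None:
    """Write the returned PNG bytes to disk."""
    with open(path, "wb") as f:
        f.write(content)


if __name__ == "__main__":
    url, params = build_request("a cute cat is dancing on the grass.")
    resp = requests.get(url, params=params)  # needs the RayService to be up
    resp.raise_for_status()
    save_image(resp.content)
```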
- You can refer to the document “Serving a Stable Diffusion Model” for an example output image.