Programmatic Cluster Scaling

ray.autoscaler.sdk.request_resources
Within a Ray program, you can command the autoscaler to scale the cluster up to a desired size with the request_resources() call. The cluster will immediately attempt to scale to accommodate the requested resources, bypassing normal upscaling speed constraints.
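For instance, a driver that is about to submit a large batch of CPU-bound tasks can request capacity up front so the autoscaler begins adding nodes before the tasks queue up. The sketch below assumes a running autoscaling cluster reachable via `ray.init(address="auto")`; the `process_shard` task is purely illustrative:

```python
import ray
from ray.autoscaler.sdk import request_resources

ray.init(address="auto")  # connect to the running autoscaling cluster

# Ask the autoscaler for 100 CPUs before submitting the workload,
# so nodes start launching while the tasks are still being created.
request_resources(num_cpus=100)

@ray.remote(num_cpus=1)
def process_shard(i):
    # Hypothetical CPU-bound task used for illustration.
    return i * i

results = ray.get([process_shard.remote(i) for i in range(100)])
```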
- ray.autoscaler.sdk.request_resources(num_cpus: int | None = None, bundles: List[dict] | None = None) -> None
  Command the autoscaler to scale to accommodate the specified requests.

  The cluster will immediately attempt to scale to accommodate the requested resources, bypassing normal upscaling speed constraints. This takes into account existing resource usage.

  For example, suppose you call request_resources(num_cpus=100) and there are 45 currently running tasks, each requiring 1 CPU. Then enough nodes will be added so that up to 100 tasks can run concurrently. It does not add enough nodes so that 145 tasks can run.

  This call is only a hint to the autoscaler. The actual resulting cluster size may be slightly larger or smaller than expected depending on the internal bin packing algorithm and max worker count restrictions.

  Parameters:
- num_cpus – Scale the cluster to ensure this number of CPUs are available. This request is persistent until another call to request_resources() is made to override. 
- bundles (List[ResourceDict]) – Scale the cluster to ensure this set of resource shapes can fit. This request is persistent until another call to request_resources() is made to override. 
 
  Examples

  ```python
  >>> from ray.autoscaler.sdk import request_resources
  >>> # Request 1000 CPUs.
  >>> request_resources(num_cpus=1000)
  >>> # Request 64 CPUs and also fit a 1-GPU/4-CPU task.
  >>> request_resources(
  ...     num_cpus=64, bundles=[{"GPU": 1, "CPU": 4}])
  >>> # Same as requesting num_cpus=3.
  >>> request_resources(
  ...     bundles=[{"CPU": 1}, {"CPU": 1}, {"CPU": 1}])
  ```

  DeveloperAPI: This API may change across minor Ray releases.
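Because a request stays in effect until it is overridden, one possible pattern for letting the cluster scale back down after a burst is to issue a follow-up call with a smaller request. This is only a sketch based on the persistence semantics described above; the zero-CPU override used here is an assumption, not an API guarantee:

```python
from ray.autoscaler.sdk import request_resources

# Hold capacity for a 1000-CPU burst; this request persists until overridden.
request_resources(num_cpus=1000)

# ... submit and wait on the burst workload here ...

# Override the earlier request with a zero-CPU request (assumed pattern),
# leaving the autoscaler free to remove nodes once they become idle.
request_resources(num_cpus=0)
```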