diff --git a/POC-TESTING.md b/POC-TESTING.md
new file mode 100644
index 00000000000..6dec9577baa
--- /dev/null
+++ b/POC-TESTING.md
@@ -0,0 +1,169 @@
+# POC manual testing — RayService upgrade quota gate
+
+Branch: `poc/rayservice-upgrade-quota-gate`
+Issue: kubernetes-sigs/kueue#11102
+
+## Prerequisites
+
+- kind cluster running (`kubectl config current-context` → `kind-kind`).
+- KubeRay operator deployed with PR ray-project/kuberay#4841 applied. The POC was verified against the image `kuberay/operator:test-suspend`.
+- Local kuberay checkout at `/Users/kaihsunchen/Desktop/kuberay` (the vendored `rayservice_types.go` mirrors it).
+
+## Build and deploy Kueue with this branch
+
+```bash
+# 1. install CRDs
+make install
+
+# 2. build and load image into kind
+make kind-image-build
+kind load docker-image us-central1-docker.pkg.dev/k8s-staging-images/kueue/kueue:main
+
+# 3. deploy
+make deploy
+
+# 4. pin imagePullPolicy and image tag so kind serves the freshly built copy
+kubectl patch deployment -n kueue-system kueue-controller-manager \
+  -p '{"spec":{"template":{"spec":{"containers":[{"name":"manager","imagePullPolicy":"IfNotPresent"}]}}}}'
+kubectl set image -n kueue-system deployment/kueue-controller-manager \
+  manager=us-central1-docker.pkg.dev/k8s-staging-images/kueue/kueue:main
+
+# 5. enable the elastic-jobs feature gate
+kubectl patch deployment -n kueue-system kueue-controller-manager --type=json \
+  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--feature-gates=ElasticJobsViaWorkloadSlices=true"}]'
+kubectl wait -n kueue-system --for=condition=Available --timeout=120s deployment/kueue-controller-manager
+```
+
+When you rebuild Kueue mid-test, the image tag (`main`) is reused, so the kind nodes keep the previously loaded image under that tag. Always re-run `kind load docker-image`, then either `kubectl rollout restart` or `kubectl set image`, and confirm the new digest was picked up by inspecting `imageID` on the running pod.
+
+## Happy path — quota gates the pending RayCluster
+
+The ClusterQueue has room for exactly one head pod (1 CPU / 2 GiB). The upgrade should keep the active RayCluster running and leave the pending RayCluster suspended.
+
+```bash
+# queues
+kubectl apply -f poc-queues.yaml
+
+# RayService — already labelled with queue-name and annotated kueue.x-k8s.io/elastic-job: "true"
+kubectl apply -f ray-service-sample.yaml
+
+# wait for KubeRay to promote the first cluster to Active
+until [ -n "$(kubectl get rayservice test-rayservice -o jsonpath='{.status.activeServiceStatus.rayClusterName}')" ]; do sleep 6; done
+```
+
+Pre-upgrade expectation:
+
+```
+NAME                    SUSPEND   VERSION   STATE
+test-rayservice-XXXXX   false     2.48.0    ready
+```
+
+The RayService should also have `spec.rayClusterConfig.suspend: true` (persistent template gate) and `spec.suspend` empty/false (top-level released after admission).
+
+Trigger a zero-downtime upgrade by changing `rayVersion` (the sample starts at `2.48.0`):
+
+```bash
+kubectl patch rayservice test-rayservice --type=merge \
+  -p '{"spec":{"rayClusterConfig":{"rayVersion":"2.46.0"}}}'
+```
+
+After ~10 s, inspect state:
+
+```bash
+kubectl get rayclusters -o custom-columns=NAME:.metadata.name,SUSPEND:.spec.suspend,VERSION:.spec.rayVersion,STATE:.status.state
+kubectl get pods -l ray.io/cluster
+kubectl get workloads --sort-by=.metadata.creationTimestamp
+kubectl get clusterqueue cluster-queue -o jsonpath='{.status.flavorsUsage}{"\n"}'
+```
+
+Pass criteria:
+
+| Field | Expected |
+|---|---|
+| Active RayCluster | `Suspend=false`, `STATE=ready`, head pod Running |
+| Pending RayCluster | `Suspend=true`, `STATE=suspended`, no head pod |
+| Workloads | Two workloads; older admitted (head count 1), newer Pending (head count 2) |
+| ClusterQueue usage | `cpu: 1`, `memory: 2Gi` — only the active is reserved |
+
+Release the gate by adding capacity:
+
+```bash
+kubectl patch clusterqueue cluster-queue --type=merge -p '{"spec":{"resourceGroups":[{"coveredResources":["cpu","memory"],"flavors":[{"name":"default","resources":[{"name":"cpu","nominalQuota":"2"},{"name":"memory","nominalQuota":"4Gi"}]}]}]}}'
+```
+
+The newer workload should then be admitted. `unsuspendAdmittedChildren` patches the pending RayCluster to `Spec.Suspend=false`, KubeRay promotes it once the Serve apps are ready, and the older slice is finished.
+
+`ClusterQueue.status.admittedWorkloads` should settle to `1` after the transition. `EnsureWorkloadSlices`' case 2 path (`pkg/workloadslicing/workloadslicing.go:229-241`) finishes the old slice as `WorkloadFinishedReasonOutOfSync` as soon as the new slice is admitted as its replacement, and finished workloads are not counted. The count may briefly read `2` while both slices are alive.
+
+Tear down:
+
+```bash
+kubectl delete rayservice test-rayservice
+kubectl delete -f poc-queues.yaml
+```
+
+## Limitation 1 — heterogeneous PodSpec on the same group name
+
+`PodSets` merges per-group counts across children and keeps the first child's template, so resource-changing upgrades on the same group name are mis-accounted. To reproduce:
+
+```bash
+# 1. enlarge quota so active can run with the bigger resource ask we'll set
+kubectl apply -f poc-queues.yaml
+kubectl patch clusterqueue cluster-queue --type=merge -p '{"spec":{"resourceGroups":[{"coveredResources":["cpu","memory"],"flavors":[{"name":"default","resources":[{"name":"cpu","nominalQuota":"4"},{"name":"memory","nominalQuota":"8Gi"}]}]}]}}'
+
+# 2. apply the sample (head = 1 CPU / 2 GiB)
+kubectl apply -f ray-service-sample.yaml
+until [ -n "$(kubectl get rayservice test-rayservice -o jsonpath='{.status.activeServiceStatus.rayClusterName}')" ]; do sleep 6; done
+
+# 3. trigger an upgrade that DOUBLES the head's resource request (2 CPU / 4 GiB)
+kubectl patch rayservice test-rayservice --type=merge -p '{
+  "spec":{"rayClusterConfig":{
+    "rayVersion":"2.46.0",
+    "headGroupSpec":{"rayStartParams":{},"template":{"spec":{"containers":[{"name":"ray-head","image":"rayproject/ray:2.46.0","resources":{"requests":{"cpu":"2","memory":"4Gi"},"limits":{"cpu":"2","memory":"4Gi"}}}]}}}
+  }}
+}'
+```
+
+Observation:
+
+```bash
+kubectl get rayclusters -o custom-columns=NAME:.metadata.name,VERSION:.spec.rayVersion
+kubectl get workloads --sort-by=.metadata.creationTimestamp -o jsonpath='{range .items[*]}{.metadata.name}{" head-count="}{.spec.podSets[0].count}{" cpu="}{.spec.podSets[0].template.spec.containers[0].resources.requests.cpu}{"\n"}{end}'
+kubectl get clusterqueue cluster-queue -o jsonpath='{.status.flavorsUsage}{"\n"}'
+```
+
+The new workload slice will show `head-count=2` with `cpu=1`, taken from the old active cluster's template, rather than the actual mix of 1 CPU (old) + 2 CPU (new). Real demand is 3 CPU, but the slice reserves only 2 CPU. Once admitted, both head pods run on under-reserved quota. Conversely, an upgrade that lowers resource requests over-reserves.
+
+Workaround (POC-only): make resource-changing upgrades go through a manual two-step (scale active down, change spec while paused, re-admit) until the heterogeneous case is handled properly.
+
+## Limitation 2 — user tampering with the persistent template gate
+
+There's no webhook validation guarding `Spec.RayClusterSpec.Suspend`. A user can manually disable the gate, and then KubeRay creates the upgrade's pending cluster un-suspended, bypassing the quota gate.
+
+```bash
+# happy-path setup
+kubectl apply -f poc-queues.yaml
+kubectl apply -f ray-service-sample.yaml
+until [ -n "$(kubectl get rayservice test-rayservice -o jsonpath='{.status.activeServiceStatus.rayClusterName}')" ]; do sleep 6; done
+
+# disable the template gate manually
+kubectl patch rayservice test-rayservice --type=merge \
+  -p '{"spec":{"rayClusterConfig":{"suspend":false}}}'
+
+# trigger an upgrade
+kubectl patch rayservice test-rayservice --type=merge \
+  -p '{"spec":{"rayClusterConfig":{"rayVersion":"2.46.0"}}}'
+
+# inspect: the new pending RayCluster has Suspend=false at creation,
+# its head pod starts running before Kueue admits the upgrade workload slice
+kubectl get rayclusters -o custom-columns=NAME:.metadata.name,SUSPEND:.spec.suspend,STATE:.status.state
+kubectl get pods -l ray.io/cluster
+```
+
+Expectation: the pending head pod is `Running` even though the upgrade workload slice is `Pending`. ClusterQueue usage stays at 1 CPU on Kueue's books while two head pods actually consume the node — quota over-commitment.
+
+A proper webhook would reject this update (or re-patch the gate back to true). Until then, treat the template-level `suspend` as Kueue-owned.
+
+## Limitation 3 — MultiKueue (not yet wired)
+
+`pkg/controller/jobs/rayservice/rayservice_multikueue_adapter.go` hasn't been updated for the new suspend semantics, so management cluster → worker cluster sync won't propagate `Spec.Suspend` correctly and won't keep `RayClusterSpec.Suspend=true` on the worker side. Skip the MultiKueue happy path in manual testing for now.
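The under-accounting described in Limitation 1 of POC-TESTING.md can be reproduced with a small standalone sketch. It is not the controller code: `podSet`, `mergeByName`, `reservedMilliCPU`, and `actualMilliCPU` are hypothetical names, and `podSet` is a simplified stand-in for `kueue.PodSet` that carries only a name, a count, and a per-pod CPU request. The merge is assumed to follow the same rule as the POC's `PodSets`: sum counts per group name and keep the first child's template.

```go
package main

import "fmt"

// podSet is a simplified, hypothetical stand-in for kueue.PodSet: a name, a
// pod count, and the per-pod CPU request in millicores.
type podSet struct {
	Name     string
	Count    int32
	MilliCPU int64
}

// mergeByName unions PodSets by name across children. Counts are summed and
// the first child's template (here, its CPU request) is kept, which is the
// merge rule this sketch assumes the POC applies.
func mergeByName(children [][]podSet) []podSet {
	index := make(map[string]int)
	var merged []podSet
	for _, child := range children {
		for _, ps := range child {
			if i, ok := index[ps.Name]; ok {
				merged[i].Count += ps.Count
				continue
			}
			index[ps.Name] = len(merged)
			merged = append(merged, ps)
		}
	}
	return merged
}

// reservedMilliCPU is the CPU that a list of PodSets would reserve.
func reservedMilliCPU(sets []podSet) int64 {
	var total int64
	for _, ps := range sets {
		total += int64(ps.Count) * ps.MilliCPU
	}
	return total
}

// actualMilliCPU is the CPU the children request when counted separately.
func actualMilliCPU(children [][]podSet) int64 {
	var total int64
	for _, child := range children {
		total += reservedMilliCPU(child)
	}
	return total
}

func main() {
	children := [][]podSet{
		{{Name: "head", Count: 1, MilliCPU: 1000}}, // active cluster: 1 CPU head
		{{Name: "head", Count: 1, MilliCPU: 2000}}, // pending cluster: 2 CPU head
	}
	merged := mergeByName(children)
	fmt.Printf("reserved=%dm actual=%dm\n", reservedMilliCPU(merged), actualMilliCPU(children))
}
```

Running it prints `reserved=2000m actual=3000m`, matching the 2 CPU reserved versus 3 CPU demanded in the Limitation 1 walkthrough. Swapping the two children shows the opposite case, where the merged PodSet over-reserves.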
diff --git a/pkg/controller/jobs/rayservice/rayservice_controller.go b/pkg/controller/jobs/rayservice/rayservice_controller.go
index c2e4c8b9b76..a9c2f362e44 100644
--- a/pkg/controller/jobs/rayservice/rayservice_controller.go
+++ b/pkg/controller/jobs/rayservice/rayservice_controller.go
@@ -28,6 +28,7 @@ import (
 	"k8s.io/apimachinery/pkg/types"
 	utilruntime "k8s.io/apimachinery/pkg/util/runtime"
 	"k8s.io/client-go/tools/events"
+	"k8s.io/utils/ptr"
 	ctrl "sigs.k8s.io/controller-runtime"
 	"sigs.k8s.io/controller-runtime/pkg/builder"
 	"sigs.k8s.io/controller-runtime/pkg/client"
@@ -41,6 +42,8 @@ import (
 	"sigs.k8s.io/kueue/pkg/features"
 	"sigs.k8s.io/kueue/pkg/podset"
 	"sigs.k8s.io/kueue/pkg/util/roletracker"
+	"sigs.k8s.io/kueue/pkg/workload"
+	"sigs.k8s.io/kueue/pkg/workloadslicing"
 )
 
 var (
@@ -73,7 +76,7 @@ func init() {
 // +kubebuilder:rbac:groups=kueue.x-k8s.io,resources=resourceflavors,verbs=get;list;watch
 // +kubebuilder:rbac:groups=kueue.x-k8s.io,resources=workloadpriorityclasses,verbs=get;list;watch
 // +kubebuilder:rbac:groups=ray.io,resources=rayservices/finalizers,verbs=get;update
-// +kubebuilder:rbac:groups=ray.io,resources=rayclusters,verbs=get;list;watch
+// +kubebuilder:rbac:groups=ray.io,resources=rayclusters,verbs=get;list;watch;update;patch
 
 type rayServiceReconciler struct {
 	jr     *jobframework.JobReconciler
@@ -105,7 +108,111 @@ func NewReconciler(
 }
 
 func (r *rayServiceReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
-	return r.jr.ReconcileGenericJob(ctx, req, newJob())
+	result, err := r.jr.ReconcileGenericJob(ctx, req, newJob())
+	if err != nil {
+		return result, err
+	}
+	if err := r.unsuspendAdmittedChildren(ctx, req); err != nil {
+		return result, err
+	}
+	return result, nil
+}
+
+// unsuspendAdmittedChildren patches Spec.Suspend=false on any child RayCluster
+// that KubeRay created with Suspend=true (via the persistent
+// RayService.Spec.RayClusterSpec.Suspend=true template gate) once the latest
+// workload slice has been admitted by Kueue. This is the "Kueue owns the suspend
+// state after creation" half of KubeRay PR #4841: KubeRay creates child
+// RayClusters suspended, and Kueue is responsible for releasing them after
+// quota is reserved.
+func (r *rayServiceReconciler) unsuspendAdmittedChildren(ctx context.Context, req ctrl.Request) error {
+	var rs rayv1.RayService
+	if err := r.client.Get(ctx, req.NamespacedName, &rs); err != nil {
+		return client.IgnoreNotFound(err)
+	}
+	// If the RayService itself is suspended, KubeRay will delete its children.
+	// Nothing to unsuspend.
+	if rs.Spec.Suspend {
+		return nil
+	}
+
+	wls, err := workloadslicing.FindNotFinishedWorkloads(ctx, r.client, &rs, gvk)
+	if err != nil {
+		return err
+	}
+	if len(wls) == 0 {
+		return nil
+	}
+	// FindNotFinishedWorkloads returns slices sorted oldest-first. The latest one
+	// covers the current set of children. Only unsuspend when it's admitted —
+	// during an upgrade transition the new slice may be unadmitted while the old
+	// slice is still alive; pending children must wait.
+	latest := &wls[len(wls)-1]
+	if !workload.IsAdmitted(latest) {
+		return nil
+	}
+
+	var children rayv1.RayClusterList
+	if err := r.client.List(ctx, &children,
+		client.InNamespace(rs.Namespace),
+		client.MatchingLabels{
+			rayutils.RayOriginatedFromCRNameLabelKey: rs.Name,
+			rayutils.RayOriginatedFromCRDLabelKey:    rayutils.RayOriginatedFromCRDLabelValue(rayutils.RayServiceCRD),
+		},
+	); err != nil {
+		return err
+	}
+
+	// Race guard: when KubeRay creates the upgrade's pending child, this reconcile
+	// can fire before the framework has materialised a new workload slice that
+	// covers both children. The old slice is still the latest and is admitted, but
+	// its PodSets only account for the active child. Unsuspending now would let
+	// the pending child run on the old slice's quota.
+	//
+	// Compare the latest slice's PodSet counts against what the current children
+	// require. If the slice doesn't yet cover the union, wait for the next reconcile.
+	required := computeRequiredPodSetCounts(&children)
+	covered := workload.ExtractPodSetCountsFromWorkload(latest)
+	for name, need := range required {
+		if covered[name] < need {
+			return nil
+		}
+	}
+	for i := range children.Items {
+		child := &children.Items[i]
+		if child.Spec.Suspend == nil || !*child.Spec.Suspend {
+			continue
+		}
+		patch := client.MergeFrom(child.DeepCopy())
+		child.Spec.Suspend = ptr.To(false)
+		if err := r.client.Patch(ctx, child, patch); err != nil {
+			return err
+		}
+	}
+	return nil
+}
+
+// computeRequiredPodSetCounts returns the union of PodSet counts across all
+// child RayClusters, keyed by PodSet name (head + each worker group). Mirrors
+// the logic in (*RayService).PodSets so the race-guard compares like for like.
+func computeRequiredPodSetCounts(children *rayv1.RayClusterList) map[kueue.PodSetReference]int32 {
+	required := make(map[kueue.PodSetReference]int32)
+	for i := range children.Items {
+		child := &children.Items[i]
+		required[headGroupPodSetName] += 1
+		for j := range child.Spec.WorkerGroupSpecs {
+			wgs := &child.Spec.WorkerGroupSpecs[j]
+			count := int32(1)
+			if wgs.Replicas != nil {
+				count = *wgs.Replicas
+			}
+			if wgs.NumOfHosts > 1 {
+				count *= wgs.NumOfHosts
+			}
+			required[kueue.NewPodSetReference(wgs.GroupName)] += count
+		}
+	}
+	return required
 }
 
 func (r *rayServiceReconciler) SetupWithManager(mgr ctrl.Manager) error {
@@ -132,7 +239,7 @@ func (j *RayService) Object() client.Object {
 }
 
 func (j *RayService) IsSuspended() bool {
-	return j.Spec.RayClusterSpec.Suspend != nil && *j.Spec.RayClusterSpec.Suspend
+	return j.Spec.Suspend
 }
 
 func (j *RayService) IsActive() bool {
@@ -140,7 +247,15 @@ func (j *RayService) IsActive() bool {
 }
 
 func (j *RayService) Suspend() {
-	j.Spec.RayClusterSpec.Suspend = new(true)
+	// Top-level Spec.Suspend=true tells KubeRay to delete all owned resources
+	// (RayClusters, Services, Gateway, HTTPRoute). This is the "stop everything" switch
+	// used both at admission default-suspend and at preemption.
+	j.Spec.Suspend = true
+	// Nested template Suspend=true is the persistent gate: any RayCluster KubeRay
+	// creates from this template (initial admission and zero-downtime upgrade pending
+	// cluster) starts suspended. Kueue then unsuspends each child after the
+	// corresponding workload slice is admitted. See PR ray-project/kuberay#4841.
+	j.Spec.RayClusterSpec.Suspend = ptr.To(true)
 }
 
 func (j *RayService) GVK() schema.GroupVersionKind {
@@ -155,18 +270,62 @@ func (j *RayService) PodLabelSelector() string {
 }
 
 func (j *RayService) PodSets(ctx context.Context) ([]kueue.PodSet, error) {
-	// Always build PodSets from RayService spec first
-	podSets, err := raycluster.BuildPodSets(&j.Spec.RayClusterSpec)
+	// List the actual child RayClusters owned by this RayService. Labels mirror
+	// KubeRay's common.RayServiceRayClustersAssociationOptions.
+	var children rayv1.RayClusterList
+	err := reconciler.client.List(ctx, &children,
+		client.InNamespace(j.GetNamespace()),
+		client.MatchingLabels{
+			rayutils.RayOriginatedFromCRNameLabelKey: j.GetName(),
+			rayutils.RayOriginatedFromCRDLabelKey:    rayutils.RayOriginatedFromCRDLabelValue(rayutils.RayServiceCRD),
+		},
+	)
 	if err != nil {
 		return nil, err
 	}
 
-	rayClusterName := j.Status.ActiveServiceStatus.RayClusterName
-	podSets, err = raycluster.UpdatePodSets(ctx, podSets, reconciler.client, j.Object(), j.Spec.RayClusterSpec.EnableInTreeAutoscaling, rayClusterName)
-	if err != nil {
-		return nil, err
+	// Bootstrap: before KubeRay creates the first child, build from the template so
+	// the Workload exists with the right shape (head + worker groups).
+	if len(children.Items) == 0 {
+		return raycluster.BuildPodSets(&j.Spec.RayClusterSpec)
 	}
 
+	// Steady state and zero-downtime upgrade: build PodSets from each child's real
+	// spec, then union by PodSet name (head + each worker group). Counts are summed
+	// across children, so during upgrade the workload reserves quota for both the
+	// active and pending RayClusters. Keeping PodSet keys stable across the 1↔2
+	// child transition lets EnsureWorkloadSlices handle the upgrade as a scale-up
+	// and the post-upgrade tear-down as a scale-down, without falling back to the
+	// non-slice path.
+	//
+	// POC limitation: when two children share a group name with different PodSpecs
+	// (e.g., upgrade changes container image, env vars, or resource requests on the
+	// same worker group), the merged PodSet uses the first child's template. Quota
+	// is then computed against that template's resources, which under- or over-
+	// accounts the other child. Same-shape, same-resource upgrades are exact.
+	podSetMap := make(map[kueue.PodSetReference]*kueue.PodSet)
+	var order []kueue.PodSetReference
+	for i := range children.Items {
+		child := &children.Items[i]
+		childPodSets, err := raycluster.BuildPodSets(&child.Spec)
+		if err != nil {
+			return nil, err
+		}
+		for k := range childPodSets {
+			name := childPodSets[k].Name
+			if existing, ok := podSetMap[name]; ok {
+				existing.Count += childPodSets[k].Count
+				continue
+			}
+			ps := childPodSets[k]
+			podSetMap[name] = &ps
+			order = append(order, name)
+		}
+	}
+	podSets := make([]kueue.PodSet, 0, len(order))
+	for _, n := range order {
+		podSets = append(podSets, *podSetMap[n])
+	}
 	return podSets, nil
 }
 
@@ -176,7 +335,12 @@ func (j *RayService) RunWithPodSetsInfo(ctx context.Context, podSetsInfo []podse
 		return podset.BadPodSetsInfoLenError(expectedLen, len(podSetsInfo))
 	}
 
-	j.Spec.RayClusterSpec.Suspend = new(false)
+	// Unsuspend the RayService so KubeRay can manage child RayClusters again.
+	// Intentionally do NOT touch j.Spec.RayClusterSpec.Suspend: it stays true so
+	// any new child RayCluster (initial creation, or pending cluster during a
+	// zero-downtime upgrade) is born suspended. The controller unsuspends each
+	// child individually after the matching workload slice is admitted.
+	j.Spec.Suspend = false
 
 	rayClusterSpec := &j.Spec.RayClusterSpec
 	err := raycluster.UpdateRayClusterSpecToRunWithPodSetsInfo(rayClusterSpec, podSetsInfo)
diff --git a/poc-queues.yaml b/poc-queues.yaml
new file mode 100644
index 00000000000..8c42616f213
--- /dev/null
+++ b/poc-queues.yaml
@@ -0,0 +1,28 @@
+apiVersion: kueue.x-k8s.io/v1beta2
+kind: ResourceFlavor
+metadata:
+  name: default
+---
+apiVersion: kueue.x-k8s.io/v1beta2
+kind: ClusterQueue
+metadata:
+  name: cluster-queue
+spec:
+  namespaceSelector: {}
+  resourceGroups:
+  - coveredResources: ["cpu", "memory"]
+    flavors:
+    - name: default
+      resources:
+      - name: cpu
+        nominalQuota: "1"
+      - name: memory
+        nominalQuota: "2Gi"
+---
+apiVersion: kueue.x-k8s.io/v1beta2
+kind: LocalQueue
+metadata:
+  namespace: default
+  name: user-queue
+spec:
+  clusterQueue: cluster-queue
diff --git a/ray-service-sample.yaml b/ray-service-sample.yaml
new file mode 100644
index 00000000000..10876942639
--- /dev/null
+++ b/ray-service-sample.yaml
@@ -0,0 +1,63 @@
+apiVersion: ray.io/v1
+kind: RayService
+metadata:
+  name: test-rayservice
+  namespace: default
+  labels:
+    kueue.x-k8s.io/queue-name: user-queue
+  annotations:
+    kueue.x-k8s.io/elastic-job: "true"
+spec:
+  # serveConfigV2 takes a yaml multi-line scalar, which should be a Ray Serve multi-application config. See https://docs.ray.io/en/latest/serve/multi-app.html.
+  serveConfigV2: |
+    applications:
+      - name: fruit_app
+        import_path: fruit.deployment_graph
+        route_prefix: /fruit
+        runtime_env:
+          working_dir: "https://github.com/ray-project/test_dag/archive/78b4a5da38796123d9f9ffff59bab2792a043e95.zip"
+        deployments:
+          - name: MangoStand
+            num_replicas: 2
+            user_config:
+              price: 3
+            ray_actor_options:
+              num_cpus: 0.1
+          - name: OrangeStand
+            num_replicas: 1
+            user_config:
+              price: 2
+            ray_actor_options:
+              num_cpus: 0.1
+          - name: PearStand
+            num_replicas: 1
+            user_config:
+              price: 1
+            ray_actor_options:
+              num_cpus: 0.1
+          - name: FruitMarket
+            num_replicas: 1
+            ray_actor_options:
+              num_cpus: 0.1
+  rayClusterConfig:
+    rayVersion: '2.48.0' # should match the Ray version in the image of the containers
+    ######################headGroupSpecs#################################
+    # Ray head pod template.
+    headGroupSpec:
+      # The `rayStartParams` are used to configure the `ray start` command.
+      # See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
+      # See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
+      rayStartParams: {}
+      #pod template
+      template:
+        spec:
+          containers:
+          - name: ray-head
+            image: rayproject/ray:2.46.0
+            resources:
+              limits:
+                cpu: 1
+                memory: 2Gi
+              requests:
+                cpu: 1
+                memory: 2Gi
diff --git a/vendor/github.com/ray-project/kuberay/ray-operator/apis/ray/v1/rayservice_types.go b/vendor/github.com/ray-project/kuberay/ray-operator/apis/ray/v1/rayservice_types.go
index 5b6ca9e0ff8..db68aec6e76 100644
--- a/vendor/github.com/ray-project/kuberay/ray-operator/apis/ray/v1/rayservice_types.go
+++ b/vendor/github.com/ray-project/kuberay/ray-operator/apis/ray/v1/rayservice_types.go
@@ -67,6 +67,7 @@ type ClusterUpgradeOptions struct {
 	// +kubebuilder:default:=100
 	MaxSurgePercent *int32 `json:"maxSurgePercent,omitempty"`
 	// The percentage of traffic to switch to the upgraded RayCluster at a set interval after scaling by MaxSurgePercent.
+	// StepSizePercent must be less than or equal to MaxSurgePercent.
 	StepSizePercent *int32 `json:"stepSizePercent"`
 	// The interval in seconds between transferring StepSize traffic from the old to new RayCluster.
 	IntervalSeconds *int32 `json:"intervalSeconds"`
@@ -121,6 +122,12 @@ type RayServiceSpec struct {
 	// Therefore, the head Pod's endpoint will not be added to the Kubernetes Serve service.
 	// +optional
 	ExcludeHeadPodFromServeSvc bool `json:"excludeHeadPodFromServeSvc,omitempty"`
+	// Suspend indicates whether the RayService should suspend its execution. When set to true,
+	// all Kubernetes resources owned by the RayService controller (RayClusters, Kubernetes
+	// Services, Gateway, HTTPRoute) will be deleted. Setting it back to false will allow the
+	// RayService controller to recreate the resources.
+	// +optional
+	Suspend bool `json:"suspend,omitempty"`
 }
 
 // RayServiceStatuses defines the observed state of RayService
@@ -208,6 +215,11 @@ const (
 	UpgradeInProgress RayServiceConditionType = "UpgradeInProgress"
 	// RollbackInProgress means the RayService is currently rolling back an in-progress upgrade to the original cluster state.
 	RollbackInProgress RayServiceConditionType = "RollbackInProgress"
+	// RayServiceSuspending means the RayService is in the middle of deleting its owned resources in response to Spec.Suspend.
+	// Once entered, the suspend operation completes atomically regardless of later changes to Spec.Suspend.
+	RayServiceSuspending RayServiceConditionType = "Suspending"
+	// RayServiceSuspended means all resources owned by the RayService controller have been deleted and the RayService is suspended.
+	RayServiceSuspended RayServiceConditionType = "Suspended"
 )
 
 const (
@@ -220,6 +232,9 @@ const (
 	NoActiveCluster                RayServiceConditionReason = "NoActiveCluster"
 	RayServiceValidationFailed     RayServiceConditionReason = "ValidationFailed"
 	TargetClusterChanged           RayServiceConditionReason = "TargetClusterChanged"
+	SuspendRequested               RayServiceConditionReason = "SuspendRequested"
+	SuspendInProgress              RayServiceConditionReason = "SuspendInProgress"
+	SuspendComplete                RayServiceConditionReason = "SuspendComplete"
 )
 
 // +kubebuilder:object:root=true