The Secondary Scheduler Operator deploys a custom scheduler image, built with the scheduler plugin framework and paired with a custom configuration, as a secondary scheduler in OpenShift.
| osso version | ocp version | k8s version | golang |
|---|---|---|---|
| 1.1.0 | 4.11-4.13 | 1.21 | 1.18 |
| 1.1.2 | 4.12, 4.13 | 1.26 | 1.19 |
| 1.1.3 | 4.12, 4.13 | 1.26 | 1.19 |
| 1.1.4 | 4.12, 4.13 | 1.26 | 1.19 |
| 1.1.5 | 4.12, 4.13 | 1.26 | 1.19 |
| 1.2.0 | 4.14, 4.15 | 1.27 | 1.20 |
| 1.2.1 | 4.14, 4.15 | 1.28 | 1.20 |
| 1.2.2 | 4.14, 4.15 | 1.28 | 1.20 |
| 1.2.3 | 4.14, 4.15 | 1.28 | 1.20 |
| 1.3.0 | 4.16, 4.17 | 1.29 | 1.21 |
| 1.3.1 | 4.16, 4.17 | 1.30 | 1.22 |
| 1.3.2 | 4.16, 4.17 | 1.30 | 1.22 |
| 1.4.0 | 4.18, 4.19 | 1.31 | 1.22 |
| 1.4.1 | 4.18, 4.19 | 1.32 | 1.23 |
| 1.4.2 | 4.18, 4.19 | 1.32 | 1.23 |
| 1.5.0 | 4.20, 4.21 | 1.33 | 1.24 |
| 1.5.1 | 4.20, 4.21 | 1.34 | 1.24 |
| 1.6.0 | 4.22, 4.23 | 1.35 | 1.25 |
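
When scripting an install, the compatibility table above can be encoded as a small lookup. The following is only a sketch: the function name `osso_stream_for_ocp` is hypothetical (it is not part of this repository), and it returns the operator release stream (for example `1.3`) whose rows list the given OCP minor version.

```shell
# Hypothetical helper: map an OCP minor version to the operator release
# stream listed for it in the compatibility table.
# Note: 4.11 is listed only for operator 1.1.0, not for 1.1.2 and later.
osso_stream_for_ocp() {
  case "$1" in
    4.11|4.12|4.13) echo "1.1" ;;
    4.14|4.15)      echo "1.2" ;;
    4.16|4.17)      echo "1.3" ;;
    4.18|4.19)      echo "1.4" ;;
    4.20|4.21)      echo "1.5" ;;
    4.22|4.23)      echo "1.6" ;;
    *)
      # Versions outside the table are reported as an error.
      echo "no operator stream listed for OCP $1" >&2
      return 1
      ;;
  esac
}
```

For example, `osso_stream_for_ocp 4.17` prints `1.3`; pick the concrete patch release from the table rows of that stream.
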

    $ make build
    $ make install-local
    $ make run-local

- Build and push the operator image to a registry:

      export QUAY_USER=${your_quay_user_id}
      export IMAGE_TAG=${your_image_tag}
      podman build -t quay.io/${QUAY_USER}/secondary-scheduler-operator:${IMAGE_TAG} .
      podman login quay.io -u ${QUAY_USER}
      podman push quay.io/${QUAY_USER}/secondary-scheduler-operator:${IMAGE_TAG}

- Update the image spec under `.spec.template.spec.containers[0].image` field in the `deploy/05_deployment.yaml` Deployment to point to the newly built image
- Update the `.spec.schedulerImage` field under `deploy/07_secondary-scheduler-operator.cr.yaml` CR to point to a secondary scheduler image
- Update the `KubeSchedulerConfiguration` under `deploy/06_configmap.yaml` to configure available plugins
- Apply the manifests from `deploy` directory:

      oc apply -f deploy/

This process requires access to the Brew building system.

- List available bundle images (as IMAGE):

      $ brew list-builds --package=secondary-scheduler-operator-bundle-container

- Get the pull spec for the selected bundle image (as IMAGE_PULL):

      $ brew --noauth call --json getBuild IMAGE | jq -r '.extra.image.index.pull[0]'

- Build the index image (with IMAGE_TAG):

      $ opm index add --bundles IMAGE_PULL --tag quay.io/${QUAY_USER}/secondary-scheduler-operator-index:IMAGE_TAG

This process builds the operator so that it can be installed locally via the OperatorHub with a custom index image.

- Build and push the operator image to a registry:

      export QUAY_USER=${your_quay_user_id}
      export IMAGE_TAG=${your_image_tag}
      podman build -t quay.io/${QUAY_USER}/secondary-scheduler-operator:${IMAGE_TAG} .
      podman login quay.io -u ${QUAY_USER}
      podman push quay.io/${QUAY_USER}/secondary-scheduler-operator:${IMAGE_TAG}

- Update the `.spec.install.spec.deployments[0].spec.template.spec.containers[0].image` field in the SSO CSV under `manifests/cluster-secondary-scheduler-operator.clusterserviceversion.yaml` to point to the newly built image.
- Build and push the metadata image to a registry (e.g. https://quay.io):

      podman build -t quay.io/${QUAY_USER}/secondary-scheduler-operator-metadata:${IMAGE_TAG} -f Dockerfile.metadata .
      podman push quay.io/${QUAY_USER}/secondary-scheduler-operator-metadata:${IMAGE_TAG}

- Build and push the image index for operator-registry (pull and build https://github.com/operator-framework/operator-registry/ to get the `opm` binary):

      opm index add --bundles quay.io/${QUAY_USER}/secondary-scheduler-operator-metadata:${IMAGE_TAG} --tag quay.io/${QUAY_USER}/secondary-scheduler-operator-index:${IMAGE_TAG}
      podman push quay.io/${QUAY_USER}/secondary-scheduler-operator-index:${IMAGE_TAG}

  Don't forget to increase the number of open files, e.g. `ulimit -n 100000`, in case the current limit is insufficient.
- Create and apply the CatalogSource manifest (remember to change `<<QUAY_USER>>` and `<<IMAGE_TAG>>` to your own values):

      apiVersion: operators.coreos.com/v1alpha1
      kind: CatalogSource
      metadata:
        name: secondary-scheduler-operator
        namespace: openshift-marketplace
      spec:
        sourceType: grpc
        image: quay.io/<<QUAY_USER>>/secondary-scheduler-operator-index:<<IMAGE_TAG>>

- Create the `openshift-secondary-scheduler-operator` namespace:

      $ oc create ns openshift-secondary-scheduler-operator

- Open the console Operators -> OperatorHub, search for `secondary scheduler operator` and install the operator.
- Create a ConfigMap for the KubeSchedulerConfiguration (the config file has to be stored under `config.yaml`). E.g.:

      cat config.yaml
      apiVersion: kubescheduler.config.k8s.io/v1beta1
      kind: KubeSchedulerConfiguration
      leaderElection:
        leaderElect: false
      profiles:
        - schedulerName: secondary-scheduler
          plugins:
            score:
              disabled:
                - name: NodeResourcesBalancedAllocation
                - name: NodeResourcesLeastAllocated
              enabled:
                - name: TargetLoadPacking
          pluginConfig:
            - name: TargetLoadPacking
              args:
                defaultRequests:
                  cpu: "2000m"
                defaultRequestsMultiplier: "1"
                targetUtilization: 70
                metricProvider:
                  type: Prometheus
                  address: ${PROM_URL}
                  token: ${PROM_TOKEN}

      oc create -n openshift-secondary-scheduler-operator configmap secondary-scheduler-config --from-file=config.yaml

  You can run the following commands to get the `PROM_URL` and `PROM_TOKEN` envs from your OpenShift cluster:

      PROM_HOST=`oc get routes prometheus-k8s -n openshift-monitoring -ojson |jq ".status.ingress"|jq ".[0].host"|sed 's/"//g'`
      PROM_URL="https://${PROM_HOST}"
      TOKEN_NAME=`oc get secret -n openshift-monitoring|awk '{print $1}'|grep prometheus-k8s-token -m 1`
      PROM_TOKEN=`oc describe secret $TOKEN_NAME -n openshift-monitoring|grep "token:"|cut -d: -f2|sed 's/^ *//g'`

- Create a CR for the secondary scheduler operator in the console (`schedulerImage` is set to a scheduler built from the upstream https://github.com/kubernetes-sigs/scheduler-plugins repository):

      apiVersion: operator.openshift.io/v1
      kind: SecondaryScheduler
      metadata:
        name: cluster
        namespace: openshift-secondary-scheduler-operator
      spec:
        managementState: Managed
        schedulerConfig: secondary-scheduler-config
        schedulerImage: k8s.gcr.io/scheduler-plugins/kube-scheduler:v0.22.6

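
The sample `config.yaml` in the ConfigMap step contains the placeholders `${PROM_URL}` and `${PROM_TOKEN}`. `oc create configmap --from-file` stores the file verbatim, so the placeholders need to be replaced with real values first. The sketch below shows one way to do that with `sed`; the function name `render_config` and the template file name used in the usage note are hypothetical, not part of this repository.

```shell
# Hypothetical helper: read a config template on stdin and replace the
# literal ${PROM_URL} and ${PROM_TOKEN} placeholders with the values of
# the shell variables of the same name.
# Assumption: the values contain no '|', '&' or backslash characters,
# which would be interpreted by sed.
render_config() {
  sed -e 's|\${PROM_URL}|'"${PROM_URL}"'|g' \
      -e 's|\${PROM_TOKEN}|'"${PROM_TOKEN}"'|g'
}
```

With `PROM_URL` and `PROM_TOKEN` set as described above, `render_config < config.yaml.tmpl > config.yaml` would produce the file to pass to `oc create configmap`.
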
To deploy a custom scheduler, you must build and host a container image for
your scheduler using the Kubernetes Scheduler Framework. You can then set the
image with the operator's `spec.schedulerImage` field, like so:

    $ oc edit secondaryschedulers/cluster -n openshift-secondary-scheduler-operator
    ...
    spec:
      schedulerImage: quay.io/myuser/myscheduler:latest
    ...

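
For non-interactive use, the same field can be set with a JSON merge patch instead of an editor session. This is a sketch under assumptions: the helper name `scheduler_image_patch` is hypothetical, and the `oc patch` invocation in the comment assumes the CR is named `cluster` in the `openshift-secondary-scheduler-operator` namespace and that the resource is addressable as `secondaryscheduler`.

```shell
# Hypothetical helper: print a JSON merge patch that sets
# spec.schedulerImage to the image given as the first argument.
scheduler_image_patch() {
  printf '{"spec":{"schedulerImage":"%s"}}\n' "$1"
}

# Example invocation (requires cluster access, so it is left commented out):
#   oc patch secondaryscheduler cluster \
#     -n openshift-secondary-scheduler-operator \
#     --type merge \
#     -p "$(scheduler_image_patch quay.io/myuser/myscheduler:latest)"
```
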
A sample CR definition looks like below (the operator expects a `cluster` CR under the `openshift-secondary-scheduler-operator` namespace):

    apiVersion: operator.openshift.io/v1
    kind: SecondaryScheduler
    metadata:
      name: cluster
      namespace: openshift-secondary-scheduler-operator
    spec:
      schedulerConfig: secondary-scheduler-config
      schedulerImage: k8s.gcr.io/scheduler-plugins/kube-scheduler:v0.24.9
      topology:
        mode: SingleReplica

The operator spec provides the following fields:

- `schedulerConfig`: Name of the ConfigMap containing the custom KubeSchedulerConfiguration
- `schedulerImage`: Container image for the custom scheduler
- `topology`: Configuration for scheduler deployment topology (single replica or high availability)

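
Once the secondary scheduler is running, a workload opts in through the standard `spec.schedulerName` pod field. The sketch below assumes the profile name `secondary-scheduler` from the sample KubeSchedulerConfiguration; the helper name `pod_manifest`, the default pod name, and the `pause` image are placeholders chosen for illustration.

```shell
# Hypothetical helper: print a minimal Pod manifest that asks to be
# scheduled by the secondary scheduler. The first argument is the pod name.
pod_manifest() {
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${1:-secondary-scheduler-demo}
spec:
  # Must match the schedulerName of a profile in the scheduler config.
  schedulerName: secondary-scheduler
  containers:
    - name: pause
      image: registry.k8s.io/pause:3.9
EOF
}
```

With cluster access, `pod_manifest my-pod | oc apply -f -` would create the pod; if no scheduler with that name is running, the pod stays in `Pending`.
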
The Secondary Scheduler Operator supports running the scheduler in High Availability (HA) mode with multiple replicas distributed across nodes using pod anti-affinity.
The operator supports two topology modes:
- SingleReplica (default): Runs a single instance of the secondary scheduler
- HighlyAvailable: Runs multiple instances with pod anti-affinity to distribute them across nodes
To enable HA mode, set the `topology.mode` field to `HighlyAvailable`:

    apiVersion: operator.openshift.io/v1
    kind: SecondaryScheduler
    metadata:
      name: cluster
      namespace: openshift-secondary-scheduler-operator
    spec:
      schedulerConfig: secondary-scheduler-config
      schedulerImage: k8s.gcr.io/scheduler-plugins/kube-scheduler:v0.24.9
      topology:
        mode: HighlyAvailable
        highlyAvailableTopology:
          maxReplicas: 3

The `highlyAvailableTopology` field supports the following options:

- `maxReplicas` (default: 3): Maximum number of scheduler replicas. The actual number of replicas is determined by the number of nodes matching the `nodeSelector`, capped at `maxReplicas`.
- `nodeSelector` (optional): Target specific nodes for scheduler placement. If unspecified, all nodes are considered.

      topology:
        mode: HighlyAvailable
        highlyAvailableTopology:
          maxReplicas: 3
          nodeSelector:
            node-role.kubernetes.io/worker: ""

- `tolerations` (optional): Allow scheduling on nodes with specific taints.

      topology:
        mode: HighlyAvailable
        highlyAvailableTopology:
          maxReplicas: 3
          tolerations:
            - key: "dedicated"
              operator: "Equal"
              value: "scheduler"
              effect: "NoSchedule"

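
The replica rule described for `maxReplicas` (matching nodes, capped at `maxReplicas`) can be written out as a small calculation. This is an illustration of the rule as stated here, not the operator's actual code; the function name `replica_count` is hypothetical.

```shell
# Hypothetical helper illustrating the documented rule:
#   replicas = min(number of nodes matching nodeSelector, maxReplicas)
# Usage: replica_count MATCHING_NODES MAX_REPLICAS
replica_count() {
  if [ "$1" -lt "$2" ]; then
    echo "$1"
  else
    echo "$2"
  fi
}

# Example (requires cluster access, so it is left commented out):
#   nodes=$(oc get nodes -l node-role.kubernetes.io/worker= --no-headers | wc -l)
#   replica_count "$nodes" 3
```

Under this rule, two matching worker nodes with `maxReplicas: 3` would give two replicas, and five matching nodes would give three.
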
When HA mode is enabled, the operator automatically configures soft pod anti-affinity (preferredDuringSchedulingIgnoredDuringExecution) to distribute scheduler instances across different nodes:
- Weight: 100
- TopologyKey: `kubernetes.io/hostname`
- Label Selector: `app=secondary-scheduler`

This ensures that scheduler instances prefer to run on different nodes for improved availability, while still allowing multiple instances on the same node if necessary (e.g., during updates).
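
Because the anti-affinity is only a preference, it can be useful to check how the replicas were actually placed. The sketch below counts the distinct nodes in a list of pod/node pairs; the helper name `distinct_nodes` is hypothetical, and the `oc` command in the comment assumes the scheduler pods carry the `app=secondary-scheduler` label mentioned above and run in the operator namespace.

```shell
# Hypothetical helper: read "POD NODE" pairs on stdin (one per line) and
# print how many distinct nodes they are spread across.
distinct_nodes() {
  awk '{print $2}' | sort -u | wc -l | tr -d ' '
}

# Example (requires cluster access, so it is left commented out):
#   oc get pods -n openshift-secondary-scheduler-operator \
#     -l app=secondary-scheduler --no-headers \
#     -o custom-columns=POD:.metadata.name,NODE:.spec.nodeName | distinct_nodes
```

A result lower than the number of replicas means that at least two scheduler instances share a node.
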
This repository is compatible with the OpenShift Tests Extension (OTE) framework.

    make build

    # Run a specific test suite or test
    ./secondary-scheduler-operator-tests-ext run-suite openshift/secondary-scheduler-operator/all
    ./secondary-scheduler-operator-tests-ext run-test "test-name"

    # Run with JUnit output
    ./secondary-scheduler-operator-tests-ext run-suite openshift/secondary-scheduler-operator/all --junit-path /tmp/junit.xml

    # List all test suites
    ./secondary-scheduler-operator-tests-ext list suites

    # List tests in a suite
    ./secondary-scheduler-operator-tests-ext list tests --suite=openshift/secondary-scheduler-operator/all

For more information about the OTE framework, see the openshift-tests-extension documentation.