Extend PodLifeTime with condition, exit code, owner kind, and transition time filters#1844
Conversation
There was a problem hiding this comment.
Pull request overview
This PR expands the PodLifeTime descheduler plugin to support transition-based eviction via new filter fields (conditions, exit codes, owner kinds) and then refactors RemoveFailedPods to delegate its eviction behavior to PodLifeTime (via the new NewAs constructor) to reduce duplicated logic.
Changes:
- Extend
PodLifeTimeArgswithconditions,exitCodes, andownerKinds, plus enhancedstatesmatching (now includes terminated reasons). - Add delegation support (
NewAs) so other plugins (notablyRemoveFailedPods) can reusePodLifeTimelogic with additional injected filters. - Add/extend unit tests, e2e tests, and documentation/examples to cover transition-based eviction scenarios.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
test/e2e/e2e_podlifetime_test.go |
Adds e2e coverage for PodLifeTime eviction behavior on failed/succeeded pods with new filters. |
pkg/framework/plugins/removefailedpods/failedpods.go |
Refactors RemoveFailedPods to delegate to PodLifeTime via NewAs and maps args. |
pkg/framework/plugins/podlifetime/pod_lifetime.go |
Implements new filters, delegation constructor, and new eviction ordering logic. |
pkg/framework/plugins/podlifetime/types.go |
Adds new config types/fields: OwnerKinds, PodConditionFilter, ExitCodes. |
pkg/framework/plugins/podlifetime/validation.go |
Updates validation rules to account for new filter criteria and ownerKinds. |
pkg/framework/plugins/podlifetime/validation_test.go |
Updates validation test expectations for new validation semantics. |
pkg/framework/plugins/podlifetime/pod_lifetime_test.go |
Adds unit tests for new filtering and ordering behavior. |
pkg/framework/plugins/podlifetime/zz_generated.deepcopy.go |
Updates generated deep-copies for new args/types. |
pkg/framework/plugins/podlifetime/README.md |
Documents the new filters, AND/OR semantics, ordering, and migration notes. |
examples/pod-life-time-transition.yml |
Adds example policy for transition-based eviction. |
README.md |
Updates top-level plugin documentation and examples for PodLifeTime. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
2a26912 to
e7a230d
Compare
|
/retest-required |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
/retest-required |
e7a230d to
dff7cb8
Compare
dff7cb8 to
d613ffd
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
d613ffd to
77d06d2
Compare
77d06d2 to
9f3c55a
Compare
9f3c55a to
c4f0ac0
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
/cc @jklaw90 @ingvagabund 🙏🏼 |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if podLifeTimeArgs.MaxPodLifeTimeSeconds != nil { | ||
| podFilter = podutil.WrapFilterFuncs(podFilter, func(pod *v1.Pod) bool { | ||
| podAge := metav1.Now().Sub(pod.GetCreationTimestamp().Local()) | ||
| if podAge < 0 { |
There was a problem hiding this comment.
this is odd, but copilot requested a change for code that I did not touch
7ef5b0d to
93fc0c5
Compare
| if filter.Reason != "" && cond.Reason != filter.Reason { | ||
| return false | ||
| } | ||
| return true |
There was a problem hiding this comment.
The case where all fields are unset is validated. So in case someone invokes this without the validation a condition with all fields unset matches every pod? Maybe revert the logic?
There was a problem hiding this comment.
Technically only one need to be set so this condition check is still needed
There was a problem hiding this comment.
validation:
args.MaxPodLifeTimeSeconds != nil ||
len(args.States) > 0 ||
len(args.Conditions) > 0 ||
len(args.ExitCodes) > 0
| p.Status.Phase = v1.PodFailed | ||
| }), | ||
| }, | ||
| expectedEvictedPodCount: 1, |
There was a problem hiding this comment.
This should not evict a pod if a job is excluded.
There was a problem hiding this comment.
isn't p2 a non-Job pod? hence why it's expecting p2 pod to be evicted
There was a problem hiding this comment.
I added a separate test to focus on a Job not getting evicted, but this test-case is valid and the assertion is correct
93fc0c5 to
7114d0e
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Move container waiting/terminated state checking from PodLifeTime and RemovePodsHavingTooManyRestarts into podutil as separate exported helpers: HasMatchingContainerWaitingState and HasMatchingContainerTerminatedState. Each plugin composes only the helpers it needs.
7114d0e to
a4391ea
Compare
|
/hold cancel |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ingvagabund The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…#1838-#1842-#1847-#1848-#1844-upstream-release-1.35 Automated cherry pick of #1836: Synchronize helm clusterrole RBAC with base yaml #1826: Add init containers support to Helm chart #1838: Change icon URL in Chart.yaml #1842: fix: resolve detected data races #1847: fix(ci): upgrade codeql-action to v4 and clean up security #1848: update go dependencies #1844: Extend PodLifeTime with condition, exit code, owner kind,
Description
Extend PodLifeTime with new filtering fields
Adds new fields to
PodLifeTimeArgsso it can handle fine-grained, transition-based eviction:conditions: Filter pods bystatus.conditionsentries (type,status,reason) with an optionalminTimeSinceLastTransitionSecondsper condition filterexitCodes: Filter by container terminated exit codesownerKinds: Include or exclude pods by owner reference kind (include/exclude)statesenhanced: Now also checks container terminated reasons (previously only checked waiting reasons)All non-empty filter categories are ANDed together (a pod must satisfy every specified filter). Within each category, entries are ORed. Pods are processed from oldest to newest based on their creation time. Condition filters, including
minTimeSinceLastTransitionSeconds, determine which pods are eligible for eviction but do not affect this ordering.Extract shared container state matching helpers into podutil
Moves the container waiting/terminated state checking logic from
PodLifeTimeandRemovePodsHavingTooManyRestartsintopodutilas separate exported helpers:HasMatchingContainerWaitingStateandHasMatchingContainerTerminatedState. Each plugin composes only the helpers it needs —TooManyRestartsuses only the waiting helper, whilePodLifeTimeuses both.Checklist