[processor/k8sattributes] feat: add reverse mapping between podUID and podIdentifier to delete stale entries#48398
Conversation
|
Welcome, contributor! Thank you for your contribution to opentelemetry-collector-contrib. Important reminders:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8887d3b3bb
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| podTableSize := len(c.Pods) | ||
| c.m.Unlock() | ||
|
|
||
| c.appendDeleteRequests(staleIdentifiers, true) |
There was a problem hiding this comment.
Cancel stale pod identifier deletes when restored
When an update temporarily drops an identifier, this queues a delayed delete, but a later update that restores the same identifier for the same pod UID never removes that pending request. deleteLoopProcessing only checks that the cached pod still has the same UID before deleting, so after the grace period it will remove an active association (for example a container.id or pod IP that disappeared in one informer update and reappeared before cleanup), breaking lookups until another update repopulates it.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
This should be addressed by the latest commits. I’m happy to adjust the implementation if you see a better approach.
I chose to track pending stale deletes in a map and store pointers to the queued delete requests, so a restored identifier can cancel its pending deletion with an average-case O(1) map lookup, instead of scanning the delete queue on every pod update.
The trade-off is a small amount of additional memory for stale identifiers while their delete requests are pending. That is bounded by the number of stale identifiers observed during the delete grace period. Once deleteLoopProcessing reaches those entries, cancelled requests are skipped and removed from the pending map.
|
@wehzzz for the PR. However I'm not sure what this PR is trying to fix and why Also could you provide what version you are hitting that issue on? BTW #47357 recently tried to fix sth similar, so could you ensure you run a version that includes this fix? |
|
Thanks for the question @ChrsMark . The issue this PR is trying to fix is slightly different from #47357.
So the problem is not that the 30s loop/grace period is too slow, it's the unbounded growth explained in the PR description. The missing part is detecting identifiers that disappeared from the pod state and explicitly enqueueing them for deletion. This PR adds a reverse index from
Regarding #47357: I believe that fix addresses map bucket retention after entries have been deleted from the cache. This PR addresses a case where stale We observed this behavior on both 0.150.0 and 0.151.0. |
ChrsMark
left a comment
There was a problem hiding this comment.
Thank's for additional details. Could you update the PR's description accordingly to reflect the intention of this change clearly?
I have left a question to clarify. Other code-owners of the component should also verify this.
Also I suggest for this change to add a changelog entry as a perf enhancement even if it's not strictly speaking "user facing".
|
|
||
| identifiers := c.getIdentifiersFromAssoc(newPod) | ||
| if newPod.PodUID != "" { | ||
| staleIdentifiers = c.getStalePodIdentifiersForDeletionLocked(newPod.PodUID, identifiers) |
There was a problem hiding this comment.
Could we just remove the stale identifiers directly instead of maintaining a second delete queue for this? From my perspective for any given Pod at any given time only the active identifiers should be maintained. Not sure if I miss anything here. @dmitryax please chime in.
Description
We observed unbounded growth in
podTableSizewhen running thek8sattributesprocessor in an OTel eBPF profiler-based collector configured to associate pods bycontainer.id. This was observed on bothv0.150.0andv0.151.0.The issue happens on clusters with container restarts. A pod can stay alive while one of its containers restarts and receives a new
container.id. The cache then adds the newcontainer.idassociation, but the previouscontainer.idassociation is not queued for deletion because there is no pod delete event. SincedeleteLooponly processes entries that have already been added to the delete queue, the stale association remains in the cache.This PR fixes the missing cleanup path by keeping a reverse index from
podUIDto knownPodIdentifiers. With that index:container.idassociations that are no longer present in the final pod object.Before fix
otelcol_otelsvc_k8s_pod_addedotelcol_otelsvc_k8s_pod_deletedotelcol_otelsvc_k8s_pod_updatedotelcol_otelsvc_k8s_pod_table_sizeAfter fix
restartCountacross currently running containersotelcol_otelsvc_k8s_pod_addedotelcol_otelsvc_k8s_pod_deletedotelcol_otelsvc_k8s_pod_updatedotelcol_otelsvc_k8s_pod_table_sizeTesting