[receiver/receivercreator] Option to skip the collector's own pod during annotation-based discovery #48409

@cyrille-leclerc

Component(s)

receiver/receivercreator

Is your feature request related to a problem? Please describe.

When the OpenTelemetry Collector is deployed on Kubernetes (for example via the opentelemetry-kube-stack Helm chart) and is configured to both:

  1. Expose its own internal telemetry via a Prometheus pull exporter
    (service.telemetry.metrics.readers[].pull.exporter.prometheus) for troubleshooting purposes, and
  2. Use the receiver_creator to scrape pods that carry the standard
    prometheus.io/scrape: "true" annotation,

…the collector's own pod ends up being discovered and scraped by itself,
because the Helm chart (and many user setups) annotate the collector pod
with prometheus.io/scrape: "true" so that other Prometheus-compatible
scrapers can pick it up. In most deployments this self-scrape is not
desired: collector internal metrics are typically forwarded through
service.telemetry.metrics.readers[].periodic.exporter.otlp, and collecting
them a second time via receiver_creator creates duplicate series with
different resource attributes.

The same problem exists for logs discovery (io.opentelemetry.discovery.logs/enabled
applied via default_annotations), where the README already calls out the
footgun:

(to avoid collecting Collector's own logs make sure that Collector Pods
are properly annotated with io.opentelemetry.discovery.logs/enabled: "false")

Today the only workaround is to manually annotate the collector pod(s) to opt them out. This is fragile (easy to forget, easy to break when the chart's defaults change) and requires every chart/distribution to ship the right annotation by default.

Example reproducer

OTel Collector config:

receivers:
  receiver_creator/prometheus_annotations:
    watch_observers: [k8s_observer]
    receivers:
      prometheus_simple:
        rule: type == "pod" && annotations["prometheus.io/scrape"] == "true"
        config:
          metrics_path: '`"prometheus.io/path" in annotations ? annotations["prometheus.io/path"] : "/metrics"`'
          endpoint: '`endpoint`:`"prometheus.io/port" in annotations ? annotations["prometheus.io/port"] : 9090`'

service:
  telemetry:
    metrics:
      readers:
        - pull:
            exporter:
              # Internal telemetry exposed for troubleshooting
              prometheus:
                host: '0.0.0.0'
                port: 8888

If the collector pod itself carries prometheus.io/scrape: "true" (as it does by default in the kube-stack chart), receiver_creator will discover its own :8888/metrics endpoint and scrape it.

Describe the solution you'd like

An option on receiver_creator (or, alternatively, on the k8s_observer) that causes the receiver to skip endpoints belonging to the collector's own pod during discovery. The collector pod identity can be sourced from the downward API (e.g. POD_NAME / POD_NAMESPACE env vars, which the k8sattributes processor and others already rely on).
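For reference, the pod identity mentioned above is usually injected with the Kubernetes downward API. The fragment below is a minimal sketch of the container `env` section only. The variable names POD_NAME and POD_NAMESPACE are a convention taken from this issue, not something the collector requires, and the surrounding pod spec is omitted:

```yaml
# Fragment of the collector container spec (assumed layout; adapt to your chart).
env:
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name        # name of the pod running this collector
  - name: POD_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace   # namespace of that pod
```

With these two values available, a skip-self option would have what it needs to compare each discovered pod endpoint against the collector's own identity.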

Priority is metrics discovery (annotations such as prometheus.io/scrape and io.opentelemetry.discovery.metrics/*), but the same option should ideally apply to logs discovery (io.opentelemetry.discovery.logs/*) for consistency.

I am happy to defer to maintainers on the exact mechanism (opt-in flag vs. opt-out default, location on receiver_creator vs. k8s_observer, etc.).

Describe alternatives you've considered

  • Annotating every collector pod with prometheus.io/scrape: "false" and io.opentelemetry.discovery.{metrics,logs}/enabled: "false". Works, but is brittle: each chart/distribution has to remember to ship the right defaults, and end users who customize annotations can easily break it.
  • Crafting the receiver_creator rule to exclude the collector pod by name/namespace via an extra pod.name != … clause. Possible, but requires every user to hand-roll this in their rule expression and to thread the collector's own identity into the config.
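To illustrate the second alternative, here is a sketch of what the hand-rolled exclusion might look like. It assumes that POD_NAME and POD_NAMESPACE are set on the collector container (for example via the downward API) and that the pod endpoint exposes `name` and `namespace` to the rule expression. Treat it as an illustration of the workaround, not as a recommended configuration:

```yaml
receivers:
  receiver_creator/prometheus_annotations:
    watch_observers: [k8s_observer]
    receivers:
      prometheus_simple:
        # Same rule as in the reproducer, plus a clause that rejects the
        # pod whose name and namespace match the collector's own identity.
        # ${env:...} is expanded when the collector loads its configuration.
        rule: 'type == "pod" && annotations["prometheus.io/scrape"] == "true" && !(name == "${env:POD_NAME}" && namespace == "${env:POD_NAMESPACE}")'
        config:
          metrics_path: '`"prometheus.io/path" in annotations ? annotations["prometheus.io/path"] : "/metrics"`'
          endpoint: '`endpoint`:`"prometheus.io/port" in annotations ? annotations["prometheus.io/port"] : 9090`'
```

Every rule that relies on annotation-based discovery would need the same extra clause, which is the maintenance burden this request aims to remove.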

Additional context

The kube-stack Helm chart annotates the collector pod with prometheus.io/scrape: "true" by default, which makes this footgun the default experience for users who follow the recommended Kubernetes setup and turn on annotation-based Prometheus discovery via receiver_creator.
