feat(scaler): add observability (metrics + tracing) to the external scaler #1634

Open
Fedosin wants to merge 1 commit into kedacore:main from Fedosin:scaler-observability

Contributor

@Fedosin Fedosin commented May 13, 2026

Add OpenTelemetry-based metrics and distributed tracing to the external
scaler component, which previously had no observability instrumentation.

Metrics:

  • scaler.pinger.fetch.duration (histogram) — duration of each queue pinger fetch cycle
  • scaler.pinger.fetch.errors (counter) — total failed pinger fetch cycles
  • scaler.pinger.endpoints (gauge) — number of interceptor endpoints being polled
  • Prometheus /metrics endpoint on port 2224 (configurable via OTEL_PROM_EXPORTER_PORT)
  • Optional OTLP HTTP metrics export (via OTEL_EXPORTER_OTLP_METRICS_ENABLED)
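As a rough illustration of how the three pinger metrics above could be recorded around a single fetch cycle, here is a minimal sketch. It does not reproduce the PR's code: `Recorder`, `memRecorder`, `fetchCycle`, and `runDemo` are hypothetical names, and the in-memory recorder stands in for the real OTEL instruments so the example runs with the standard library only. It also assumes the duration is recorded for failed cycles as well as successful ones.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Recorder is a hypothetical stand-in for the three instruments listed above.
type Recorder interface {
	RecordFetchDuration(ctx context.Context, d time.Duration) // scaler.pinger.fetch.duration
	IncFetchErrors(ctx context.Context)                       // scaler.pinger.fetch.errors
	SetEndpoints(ctx context.Context, n int)                  // scaler.pinger.endpoints
}

// memRecorder keeps values in memory so the sketch needs no OTEL SDK.
type memRecorder struct {
	durations  []time.Duration
	errorCount int
	endpoints  int
}

func (m *memRecorder) RecordFetchDuration(_ context.Context, d time.Duration) {
	m.durations = append(m.durations, d)
}

func (m *memRecorder) IncFetchErrors(_ context.Context) { m.errorCount++ }

func (m *memRecorder) SetEndpoints(_ context.Context, n int) { m.endpoints = n }

// fetchCycle polls every endpoint once and records the cycle's metrics.
// It returns the first fetch error, wrapped with the endpoint address.
func fetchCycle(ctx context.Context, rec Recorder, endpoints []string,
	fetch func(context.Context, string) error) error {
	start := time.Now()
	rec.SetEndpoints(ctx, len(endpoints))

	var firstErr error
	for _, ep := range endpoints {
		if err := fetch(ctx, ep); err != nil && firstErr == nil {
			firstErr = fmt.Errorf("fetch %s: %w", ep, err)
		}
	}

	// Assumption: duration is recorded whether or not the cycle failed.
	rec.RecordFetchDuration(ctx, time.Since(start))
	if firstErr != nil {
		rec.IncFetchErrors(ctx) // one increment per failed cycle, not per endpoint
	}
	return firstErr
}

// runDemo drives one cycle in which the second (made-up) endpoint fails.
func runDemo() (*memRecorder, error) {
	rec := &memRecorder{}
	err := fetchCycle(context.Background(), rec,
		[]string{"10.0.0.1:9090", "10.0.0.2:9090"},
		func(_ context.Context, ep string) error {
			if ep == "10.0.0.2:9090" {
				return errors.New("connection refused")
			}
			return nil
		})
	return rec, err
}

func main() {
	rec, err := runDemo()
	fmt.Println("endpoints:", rec.endpoints, "failed cycles:", rec.errorCount, "err:", err)
}
```

Keeping the recording behind a small interface also makes the pinger easy to test without a metrics backend, which is one plausible reason for passing instruments into the pinger's constructor.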

Tracing:

  • OTEL tracing SDK with console, HTTP/protobuf, and gRPC exporters
  • otelgrpc stats handler for automatic gRPC server span instrumentation
  • W3C TraceContext + Baggage propagation
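To make the W3C TraceContext item above more concrete, the sketch below parses the `traceparent` header that this propagation format carries between services. It is illustrative only: in an OTEL-instrumented Go service the `propagation.TraceContext` propagator handles this, and `TraceParent`, `isLowerHex`, and `parseTraceParent` are hypothetical names. The sketch accepts only version `00` of the header.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// TraceParent holds the fields of a version-00 traceparent header.
type TraceParent struct {
	TraceID  string // 32 lowercase hex characters
	ParentID string // 16 lowercase hex characters
	Sampled  bool   // lowest bit of the trace-flags field
}

// isLowerHex reports whether s is exactly n lowercase hex characters.
func isLowerHex(s string, n int) bool {
	if len(s) != n {
		return false
	}
	for _, c := range s {
		if !((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f')) {
			return false
		}
	}
	return true
}

// parseTraceParent parses "version-traceid-parentid-flags" for version 00.
func parseTraceParent(header string) (TraceParent, error) {
	parts := strings.Split(header, "-")
	if len(parts) != 4 {
		return TraceParent{}, errors.New("traceparent: expected 4 dash-separated fields")
	}
	version, traceID, parentID, flags := parts[0], parts[1], parts[2], parts[3]
	if version != "00" {
		return TraceParent{}, fmt.Errorf("traceparent: unsupported version %q", version)
	}
	// An all-zero trace ID or parent ID is invalid.
	if !isLowerHex(traceID, 32) || strings.Trim(traceID, "0") == "" {
		return TraceParent{}, errors.New("traceparent: invalid trace-id")
	}
	if !isLowerHex(parentID, 16) || strings.Trim(parentID, "0") == "" {
		return TraceParent{}, errors.New("traceparent: invalid parent-id")
	}
	if !isLowerHex(flags, 2) {
		return TraceParent{}, errors.New("traceparent: invalid trace-flags")
	}
	f, err := strconv.ParseUint(flags, 16, 8)
	if err != nil {
		return TraceParent{}, fmt.Errorf("traceparent: %w", err)
	}
	return TraceParent{TraceID: traceID, ParentID: parentID, Sampled: f&0x01 == 1}, nil
}

func main() {
	tp, err := parseTraceParent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	fmt.Printf("%+v err=%v\n", tp, err)
}
```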

Configuration env vars (same convention as the interceptor):

  • OTEL_PROM_EXPORTER_ENABLED (default: true)
  • OTEL_PROM_EXPORTER_PORT (default: 2224)
  • OTEL_EXPORTER_OTLP_METRICS_ENABLED (default: false)
  • OTEL_EXPORTER_OTLP_TRACES_ENABLED (default: false)
  • OTEL_EXPORTER_OTLP_TRACES_PROTOCOL (default: console)
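The defaults listed above could be loaded along the following lines. This is a sketch under assumptions, not the PR's `scaler/config.go`: the `Config` struct, its field names, and the `loadConfig` and `envBool` helpers are hypothetical, and it uses only the standard library where the real code may well use an env-parsing package. Only the variable names and default values come from the list above.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Config mirrors the env vars above; the field names are hypothetical.
type Config struct {
	PromExporterEnabled bool   // OTEL_PROM_EXPORTER_ENABLED
	PromExporterPort    int    // OTEL_PROM_EXPORTER_PORT
	OTLPMetricsEnabled  bool   // OTEL_EXPORTER_OTLP_METRICS_ENABLED
	OTLPTracesEnabled   bool   // OTEL_EXPORTER_OTLP_TRACES_ENABLED
	OTLPTracesProtocol  string // OTEL_EXPORTER_OTLP_TRACES_PROTOCOL
}

// envBool reads a boolean env var, falling back to def when it is unset or empty.
func envBool(getenv func(string) string, key string, def bool) (bool, error) {
	v := getenv(key)
	if v == "" {
		return def, nil
	}
	b, err := strconv.ParseBool(v)
	if err != nil {
		return def, fmt.Errorf("%s: %w", key, err)
	}
	return b, nil
}

// loadConfig applies the documented defaults and then any overrides.
// getenv is injected (os.Getenv in production) so tests can fake the environment.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{PromExporterPort: 2224, OTLPTracesProtocol: "console"}
	var err error

	if cfg.PromExporterEnabled, err = envBool(getenv, "OTEL_PROM_EXPORTER_ENABLED", true); err != nil {
		return cfg, err
	}
	if cfg.OTLPMetricsEnabled, err = envBool(getenv, "OTEL_EXPORTER_OTLP_METRICS_ENABLED", false); err != nil {
		return cfg, err
	}
	if cfg.OTLPTracesEnabled, err = envBool(getenv, "OTEL_EXPORTER_OTLP_TRACES_ENABLED", false); err != nil {
		return cfg, err
	}
	if v := getenv("OTEL_PROM_EXPORTER_PORT"); v != "" {
		port, perr := strconv.Atoi(v)
		if perr != nil || port < 1 || port > 65535 {
			return cfg, fmt.Errorf("OTEL_PROM_EXPORTER_PORT: invalid port %q", v)
		}
		cfg.PromExporterPort = port
	}
	if v := getenv("OTEL_EXPORTER_OTLP_TRACES_PROTOCOL"); v != "" {
		cfg.OTLPTracesProtocol = v
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Injecting the lookup function rather than calling `os.Getenv` directly keeps the defaults testable without mutating the process environment.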

Checklist

  • Commits are signed with Developer Certificate of Origin (DCO)
  • Changelog has been updated and is aligned with our changelog requirements
  • Any necessary documentation is added, such as:

Fixes #965 (partial — scaler component)

Copilot AI review requested due to automatic review settings May 13, 2026 15:14
@Fedosin Fedosin requested a review from a team as a code owner May 13, 2026 15:14
@keda-automation keda-automation requested a review from a team May 13, 2026 15:14
@snyk-io

snyk-io Bot commented May 13, 2026

Snyk checks have passed. No issues have been found so far.

| Status | Scan Engine | Critical | High | Medium | Low | Total (0) |
| --- | --- | --- | --- | --- | --- | --- |
| | Open Source Security | 0 | 0 | 0 | 0 | 0 issues |


Contributor

Copilot AI left a comment

Pull request overview

This PR adds OpenTelemetry-based observability to the external scaler by introducing metric instruments/exporters and distributed tracing, plus wiring them into the scaler’s startup and queue polling logic.

Changes:

  • Added an OTEL metrics provider + instruments for queue pinger fetch duration/errors and endpoint count, with Prometheus and optional OTLP/HTTP export.
  • Added OTEL tracing SDK setup and enabled automatic gRPC server span instrumentation via otelgrpc when tracing is enabled.
  • Wired metrics/tracing configuration into scaler config and main startup, including a /metrics HTTP endpoint.

Reviewed changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| scaler/tracing/tracing.go | Adds OTEL tracing SDK setup and exporter selection for the scaler. |
| scaler/metrics/provider.go | Introduces an OTEL MeterProvider with Prometheus and optional OTLP metric export. |
| scaler/metrics/instruments.go | Defines metric instruments and recording helpers for queue pinger metrics. |
| scaler/queue_pinger.go | Records pinger fetch metrics on each polling cycle. |
| scaler/queue_pinger_test.go | Updates pinger construction in tests for the new instruments parameter. |
| scaler/main.go | Initializes metrics/tracing, adds Prometheus /metrics server, and instruments gRPC when enabled. |
| scaler/config.go | Adds env-configurable metrics and tracing settings for the scaler. |
| go.mod | Adds otelgrpc dependency for gRPC tracing instrumentation. |
| go.sum | Updates sums for added/updated dependencies. |


Comment thread scaler/queue_pinger.go Outdated
Comment thread scaler/metrics/provider.go
@Fedosin Fedosin force-pushed the scaler-observability branch from 6f6d5f4 to 1b8f62f Compare May 13, 2026 15:45
…caler

Add OpenTelemetry-based metrics and distributed tracing to the external
scaler component, which previously had no observability instrumentation.

Metrics:
- scaler.pinger.fetch.duration (histogram)
- scaler.pinger.fetch.errors (counter)
- scaler.pinger.endpoints (gauge)
- Prometheus /metrics endpoint on port 2224 (configurable)
- Optional OTLP HTTP metrics export

Tracing:
- OTEL tracing SDK with console, HTTP/protobuf, and gRPC exporters
- otelgrpc stats handler for automatic gRPC server span instrumentation
- W3C TraceContext + Baggage propagation

Relates to: kedacore#965

Signed-off-by: Mikhail Fedosin <mfedosin@redhat.com>
@Fedosin Fedosin force-pushed the scaler-observability branch from 1b8f62f to 744e536 Compare May 13, 2026 15:46
Development

Successfully merging this pull request may close these issues.

Observability for all the components

2 participants