Component(s)
connector/spanmetrics
Describe the issue you're reporting
Our deployment computes spanmetrics on collectors scaled by a Horizontal Pod Autoscaler (HPA), behind a loadbalancing exporter. It seems that when upstream queues are flushed, some combination of pod scaling and bursty traffic leads to spikes in the reported metrics.
We are using CUMULATIVE aggregation temporality. As a troubleshooting step, we removed the isFirst check so that the connector always reports the count seen instead of emitting 0 when isFirst is true. The spikes stopped, but we are wondering whether this was their real cause and whether anything else is needed to compute accurate metric values. It also seems that the connector computes the metric from the time it receives the spans rather than from the span timestamps. Is that correct?
In the graph, the yellow line is the unmodified spanmetrics connector, the blue line is the connector with isFirst ignored so that it always emits the count, and the green line is the reference value.
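To make the question concrete, here is a minimal sketch of how the two behaviours could look to a backend that derives increases from consecutive cumulative samples. It is not the connector's code: `exported_samples`, `increases`, and the flush counts are hypothetical, and it assumes a new series receives a burst of queued spans in its first flush.

```python
from typing import List


def exported_samples(counts_per_flush: List[int], zero_on_first: bool) -> List[int]:
    """Cumulative values one new series would export, one per flush.

    Hypothetical model: with zero_on_first=True the first export reports 0
    while the spans seen so far are still counted internally.
    """
    samples = []
    total = 0
    for i, n in enumerate(counts_per_flush):
        total += n
        if i == 0 and zero_on_first:
            samples.append(0)
        else:
            samples.append(total)
    return samples


def increases(samples: List[int]) -> List[int]:
    """Per-interval increase derived from consecutive cumulative samples."""
    return [b - a for a, b in zip(samples, samples[1:])]


# Assumed traffic: a burst of 500 queued spans, then 10 spans per flush.
burst = [500, 10, 10]
print(increases(exported_samples(burst, zero_on_first=True)))
print(increases(exported_samples(burst, zero_on_first=False)))
```

Under these assumptions, emitting 0 first makes the whole burst appear as a single-interval increase, while emitting the count first leaves the burst in the series' initial value, where a delta between samples never sees it. If that is what happens, removing isFirst may be hiding the burst rather than correcting it, which is why we are asking about the intended behaviour.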