perf: switch 100kLRPS benchmarks to realistic entropy data source by cijothomas · Pull Request #2985 · open-telemetry/otel-arrow

cijothomas · 2026-05-15T22:26:12Z

Summary

Switches the rate-limited (100kLRPS) continuous benchmark tests from semantic_conventions + pre_generated to static + fresh + use_trace_context: true.

Problem

The semantic_conventions data source with pre_generated strategy replays identical payloads every batch, resulting in unrealistically high compression ratios. OTAP egress was as low as 6-9 bytes/log, and OTLP ~27 bytes/log — making network utilization benchmarks misleading.
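The effect described above can be reproduced in miniature. The following is a minimal sketch, not the benchmark's actual code: the record layout is hypothetical, JSON stands in for the real wire encoding, and `zlib` stands in for whatever compressor the pipeline uses. It compares the compressed size per record of a batch that replays one identical record against a batch where each record carries a random ID.

```python
import json
import os
import zlib


def make_record(unique: bool) -> bytes:
    """Build one hypothetical log record serialized as JSON bytes."""
    record = {
        "body": "User login succeeded for account in region us-east",
        "severity": "INFO",
        # A fixed ID models replayed payloads; a random one models fresh data.
        "trace_id": os.urandom(16).hex() if unique else "0" * 32,
    }
    return json.dumps(record).encode()


def bytes_per_record(unique: bool, count: int = 1000) -> float:
    """Compress a batch of `count` records and return compressed bytes per record."""
    batch = b"\n".join(make_record(unique) for _ in range(count))
    return len(zlib.compress(batch)) / count


replayed = bytes_per_record(unique=False)
fresh = bytes_per_record(unique=True)
print(f"replayed: {replayed:.1f} bytes/record, fresh: {fresh:.1f} bytes/record")
```

The replayed batch shrinks to a small fraction of the fresh one because the compressor encodes every repeat as a back-reference, which is the same reason the old benchmark numbers looked unrealistically good.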

Solution

The static data source with fresh generation provides:

50 diverse log body templates (~150 bytes each) with realistic content
Unique trace_id/span_id per record via use_trace_context: true
Fresh timestamps and varied attribute values per batch

The uncompressed protobuf size per log record is approximately 250 bytes. See #2987 for a planned metric to expose this directly.
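As an illustration of what fresh generation with trace context means, here is a rough sketch under stated assumptions. The templates, field names, and the `fresh_record` and `fresh_batch` helpers are hypothetical and are not taken from the load generator; only the ID widths (16-byte trace ID, 8-byte span ID) follow the OpenTelemetry data model.

```python
import os
import random
import time

# Hypothetical templates; the PR describes 50 templates of roughly 150 bytes each.
BODY_TEMPLATES = [
    "Request {req} to /api/orders completed with status {status} in {ms} ms",
    "Cache miss for key user:{req}; falling back to database read ({ms} ms)",
    "Connection pool resized after {status} responses, request {req}, waited {ms} ms",
]


def fresh_record(rng: random.Random) -> dict:
    """Build one log record with a unique trace context and a fresh timestamp."""
    body = rng.choice(BODY_TEMPLATES).format(
        req=rng.randrange(10**6),
        status=rng.choice([200, 404, 500]),
        ms=rng.randrange(1, 500),
    )
    return {
        "time_unix_nano": time.time_ns(),
        "trace_id": os.urandom(16).hex(),  # 16 random bytes, hex-encoded
        "span_id": os.urandom(8).hex(),    # 8 random bytes, hex-encoded
        "body": body,
    }


def fresh_batch(size: int, seed: int = 0) -> list[dict]:
    """Generate a new batch on every call instead of replaying a pre-built one."""
    rng = random.Random(seed)
    return [fresh_record(rng) for _ in range(size)]


batch = fresh_batch(100)
```

The random IDs are what a compressor cannot remove, so they set a floor on bytes per record regardless of how repetitive the bodies are.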

Before/After comparison (CI dedicated machine)

Test	Run	CPU%	Egress bytes/log	TX MB/s	Drop%
OTLP-ATTR-OTLP	Before	65.3	27.4	2.61	0%
OTLP-ATTR-OTLP	After	65.1	49.9	4.52	0%
OTLP-ATTR-OTAP	Before	64.5	8.9	0.81	5%
OTLP-ATTR-OTAP	After	64.4	37.3	3.38	0%
OTAP-ATTR-OTAP	Before	49.4	9.2	0.79	5.3%
OTAP-ATTR-OTAP	After	51.3	37.4	3.39	0%
OTAP-ATTR-OTLP	Before	64.9	28.9	2.48	5.3%
OTAP-ATTR-OTLP	After	64.9	57.9	5.24	0%
OTLP-BATCH-OTLP	Before	64.6	27.7	2.51	5%
OTLP-BATCH-OTLP	After	64.9	49.9	4.52	0%
OTAP-BATCH-OTAP	Before	37.9	6.4	0.58	0.05%
OTAP-BATCH-OTAP	After	38.2	34.9	3.16	0.05%

Key observations:

CPU is essentially unchanged (within 2 percentage points in every test): the engine handles the higher-entropy data at the same cost
Compression ratios are now realistic: ~5:1 for OTLP (250→50), ~7:1 for OTAP (250→37)
OTAP columnar compression advantage over OTLP becomes measurable: 35-37 vs 50-58 bytes/log = ~30% less wire usage, which was invisible with the old low-entropy data
Saturation tests are not changed (they need pre_generated to avoid loadgen bottleneck)
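The ratios quoted in the observations can be checked from the "After" rows of the table. The sketch below groups the tests by egress protocol (the last segment of each test name, which is an assumption about the naming scheme) and uses the approximate 250-byte uncompressed size stated in the PR.

```python
UNCOMPRESSED_BYTES_PER_LOG = 250  # approximate figure stated in the PR

# "After" egress bytes/log from the comparison table, grouped by egress protocol.
otlp_egress = {"OTLP-ATTR-OTLP": 49.9, "OTAP-ATTR-OTLP": 57.9, "OTLP-BATCH-OTLP": 49.9}
otap_egress = {"OTLP-ATTR-OTAP": 37.3, "OTAP-ATTR-OTAP": 37.4, "OTAP-BATCH-OTAP": 34.9}


def mean(values) -> float:
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


otlp_mean = mean(otlp_egress.values())
otap_mean = mean(otap_egress.values())

# Compression ratio: uncompressed size divided by bytes on the wire.
otlp_ratio = UNCOMPRESSED_BYTES_PER_LOG / otlp_mean
otap_ratio = UNCOMPRESSED_BYTES_PER_LOG / otap_mean

# Fraction of wire bytes saved by OTAP egress relative to OTLP egress.
otap_savings = 1 - otap_mean / otlp_mean

print(f"OTLP ratio ~{otlp_ratio:.1f}:1, OTAP ratio ~{otap_ratio:.1f}:1")
print(f"OTAP uses ~{otap_savings:.0%} fewer bytes per log than OTLP")
```

Averaging across the three tests per protocol gives roughly 4.8:1 for OTLP, 6.8:1 for OTAP, and about 30% fewer bytes for OTAP, consistent with the rounded figures above.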

Relates to #2540

codecov · 2026-05-15T22:40:59Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.91%. Comparing base (672d665) to head (5c545b1).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2985      +/-   ##
==========================================
- Coverage   85.92%   85.91%   -0.01%     
==========================================
  Files         725      725              
  Lines      275605   275605              
==========================================
- Hits       236811   236797      -14     
- Misses      38270    38284      +14     
  Partials      524      524

Components	Coverage Δ
otap-dataflow	`87.04% <ø> (-0.01%)`	⬇️
query_abstraction	`80.61% <ø> (ø)`
query_engine	`89.57% <ø> (ø)`
otel-arrow-go	`52.45% <ø> (ø)`
quiver	`92.25% <ø> (ø)`

JakeDern

Just curious - How does the static data compare to semantic registry data? For the comparison dashboard we're using fresh + semantic convention as the data generation strategy, but if this has better characteristics then maybe we'll consider swapping out there too.

cijothomas · 2026-05-16T02:26:57Z

Yes. We'd need to switch there too. The semantic convention registry does not have much variation. I'll dive deeper into it next week. (We also need an easy way to tell whether the input we are getting is unrealistically easy to compress to begin with.)
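One possible shape for such a check, offered only as a sketch: compress a sample batch and flag it when the ratio exceeds a threshold. The function names, the use of `zlib`, and the threshold of 10 are all assumptions for illustration; a real check would use the pipeline's own compressor and a threshold calibrated against production-like data.

```python
import os
import zlib


def compression_ratio(payloads: list[bytes]) -> float:
    """Return uncompressed size divided by compressed size for a sample batch."""
    raw = b"".join(payloads)
    if not raw:
        raise ValueError("sample batch is empty")
    return len(raw) / len(zlib.compress(raw))


def looks_too_compressible(payloads: list[bytes], max_ratio: float = 10.0) -> bool:
    """Flag a sample whose ratio exceeds `max_ratio` (hypothetical threshold)."""
    return compression_ratio(payloads) > max_ratio


# Replayed identical payloads are flagged; random payloads are not.
replayed_sample = [b"same log line " * 10] * 500
random_sample = [os.urandom(200) for _ in range(500)]
print(looks_too_compressible(replayed_sample), looks_too_compressible(random_sample))
```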

JakeDern · 2026-05-16T05:12:01Z

Sounds good, I can make that change and compare rates! We have baseline measurements for those suites already with fresh generation + the semantic registry source, so we'll be able to see how the static source compares.

cijothomas requested a review from a team as a code owner May 15, 2026 22:26

github-project-automation Bot added this to OTel-Arrow May 15, 2026

cijothomas added the pipelineperf Trigger for the pipeline performance job on PRs label May 15, 2026

cijothomas closed this May 15, 2026

github-project-automation Bot moved this to Done in OTel-Arrow May 15, 2026

cijothomas reopened this May 15, 2026

perf: switch 100kLRPS tests to static data source with realistic entropy

5c545b1

cijothomas force-pushed the perf/realistic-loadgen-entropy branch from b170962 to 5c545b1 Compare May 15, 2026 23:07

JakeDern approved these changes May 16, 2026

JakeDern mentioned this pull request May 17, 2026

[Comparison Dashboard] Test impact of static vs semantic convention data source on compression #2991

Open

1 task

cijothomas wants to merge 1 commit into open-telemetry:main from cijothomas:perf/realistic-loadgen-entropy

2 participants
