Add weaver-live-check-{start,stop} composite actions #1448

Open
cijothomas wants to merge 5 commits into open-telemetry:main from cijothomas:cijothomas/weaver-live-check-actions

Member

@cijothomas cijothomas commented May 17, 2026

Adds two reusable composite actions that wrap the weaver registry live-check
lifecycle for CI integration, mirroring the setup-weaver pattern:

  • weaver-live-check-start: start the OTLP listener in the background,
    wait for /health, expose endpoints as outputs.
  • weaver-live-check-stop: POST /stop to retrieve the report,
    render a step summary, gate the job on fail-on (violation | improvement |
    information | none).
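
The "wait for /health" step of the start action could be sketched in Python roughly as follows. This is an illustration only, not the script shipped in this PR: `wait_for_health`, `http_probe`, the timeout, and the polling interval are all hypothetical names and values, and the health URL depends on whichever admin port the listener was started with.

```python
import time
import urllib.request
from typing import Callable


def wait_for_health(
    probe: Callable[[], bool],
    timeout_s: float = 30.0,
    interval_s: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` until it returns True or `timeout_s` elapses.

    Returns True if the probe succeeded in time, False otherwise.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            if probe():
                return True
        except OSError:
            # Connection refused (or similar) while the listener is still
            # starting up; urllib's URLError is a subclass of OSError.
            pass
        if clock() >= deadline:
            return False
        sleep(interval_s)


def http_probe(url: str) -> Callable[[], bool]:
    """Build a probe that treats any HTTP 2xx response from `url` as healthy."""

    def probe() -> bool:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return 200 <= resp.status < 300

    return probe
```

A caller would pass something like `http_probe("http://127.0.0.1:<admin-port>/health")`. Injecting the probe, clock, and sleep functions keeps the loop testable without a running listener.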

Each action has a README and is exercised by a new
test-weaver-live-check-actions.yml workflow (parity with
test-setup-weaver-action.yml). Linux runners only in v1.

Closes the live-check half of #879 (which produced setup-weaver).

End-to-end prototypes

I wired the actions up in two SDK repos as throw-away prototypes, purely
to validate the flow end-to-end. These are not PRs to those repos —
just personal forks running the actions against real instrumentation
telemetry:

| Repo (fork) | Subject | Run |
| --- | --- | --- |
| opentelemetry-rust-contrib | tower middleware | green |
| opentelemetry-dotnet-contrib | AspNetCore | green |

The prototypes confirmed the actions work uniformly across two languages
and SDK shapes; each consumer workflow ended up ~130 lines, of which only
~50 are project-specific (build, run, drive traffic).

Examples (follow-up)

Per discussion with @jsuereth, @lmolkova, and @jerbly, examples will be added
post-merge to the opentelemetry-weaver-examples
repo (consistent with how setup-weaver examples are hosted there today).
A follow-up PR will add a live_check_ci/ example pinning these actions to
a tagged weaver ref once this PR lands.

@codecov

codecov Bot commented May 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.2%. Comparing base (85ab198) to head (ebc28bc).

Additional details and impacted files
@@           Coverage Diff           @@
##            main   #1448     +/-   ##
=======================================
- Coverage   82.2%   82.2%   -0.1%     
=======================================
  Files        125     125             
  Lines      10241   10241             
=======================================
- Hits        8424    8421      -3     
- Misses      1817    1820      +3     


@cijothomas
Member Author

otel-arrow just started using weaver live-check in CI: open-telemetry/otel-arrow#2986. I think that is among the first
uses of weaver live-check in CI.

Once the actions proposed in this PR land, those workflows can be updated to use them, which reduces boilerplate and improves consistency.

@cijothomas cijothomas force-pushed the cijothomas/weaver-live-check-actions branch from c421b74 to 9be6a0f Compare May 20, 2026 18:03
@cijothomas cijothomas force-pushed the cijothomas/weaver-live-check-actions branch from 9be6a0f to 20a1dfb Compare May 20, 2026 18:06
Pair more symmetrically with weaver-live-check-start. README of the
stop action now leads with a note clarifying that 'stop' also
collects the report, renders the step summary, and gates the job.
@cijothomas cijothomas changed the title Add weaver-live-check-{start,finalize} composite actions Add weaver-live-check-{start,stop} composite actions May 20, 2026
@cijothomas cijothomas marked this pull request as ready for review May 20, 2026 21:00
@cijothomas cijothomas requested a review from a team as a code owner May 20, 2026 21:00
@cijothomas
Member Author

Marking ready for review. cc @jsuereth @lmolkova @jerbly — per our offline discussion, examples will follow in a separate PR to opentelemetry-weaver-examples once this lands. Happy to address feedback in either action.

Contributor

@jerbly jerbly left a comment

This is really nice, thank you!

I have added a few comments, mostly for future improvements. I do think we need a way to see more detail when the run fails though. Let's improve that part a bit and then this will be a great first version.

Format passed to `weaver registry live-check --diagnostic-format`.
`gh_workflow_command` surfaces findings as inline GitHub Actions
annotations.
required: false
Contributor

It might be worth clarifying in this description that diagnostics are not the live-check findings, but other problems found during the process, e.g. warnings during registry loading.

icon: 'check-square'
color: 'blue'

inputs:
Contributor

In the stats in the report there is a coverage output. It would be good to offer a failure condition for that too. This allows you to check that your registry is being fully exercised by your tests.
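
A coverage failure condition of this kind might look like the sketch below. It is a sketch under stated assumptions, not part of this PR: the `statistics` and `registry_coverage` keys are guesses at the report layout, `min_coverage` is a hypothetical action input, and coverage is assumed to be a fraction between 0 and 1.

```python
from typing import Tuple


def coverage_gate(report: dict, min_coverage: float) -> Tuple[bool, str]:
    """Return (ok, message) for a minimum registry-coverage requirement.

    ASSUMPTION: the report exposes coverage as a 0..1 fraction under
    report["statistics"]["registry_coverage"]; adjust to the real schema.
    """
    coverage = report.get("statistics", {}).get("registry_coverage")
    if coverage is None:
        # Treat a missing statistic as a failure so that a schema change or
        # misconfiguration is not silently reported as a pass.
        return False, "report has no coverage statistic"
    if coverage < min_coverage:
        return False, f"coverage {coverage:.1%} is below required {min_coverage:.1%}"
    return True, f"coverage {coverage:.1%} meets required {min_coverage:.1%}"
```

Failing on a missing statistic is a deliberate choice in this sketch; a report-only mode could instead return `True` with a warning message.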

icon: 'play'
color: 'blue'

inputs:
Contributor

Could we also have --config here? This lets you give weaver a TOML file that follows the weaver-config schema. Within that file you can configure all the CLI parameters and add extra items like:

[[live-check.finding_filters]]
exclude_samples = [
     "service.name",
     "telemetry.sdk.language",
     "telemetry.sdk.name",
     "telemetry.sdk.version",
]

"",
]
if msg_counts:
summary_lines += [
Contributor

The summary is nice but it needs more context info to be useful. e.g. the summary produced here: https://github.com/open-telemetry/weaver/actions/runs/26181926973

Is there a way to get the full report? As a user my next step is going to be to dig into exactly what failed.

(Maybe weaver needs to have a mode where it produces the text output for the log AND json on http??)
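
One way the step summary could carry more context is to group individual findings into a table, as in the rough sketch below. The finding fields used here (`level`, `id`, `message`) are assumptions about the JSON report and `render_finding_details` is a hypothetical helper, so treat this as an illustration of the idea rather than the action's code.

```python
from collections import Counter
from typing import Iterable, List

# Lower number sorts first; unknown levels sort last.
SEVERITY = {"violation": 0, "improvement": 1, "information": 2}


def render_finding_details(findings: Iterable[dict], limit: int = 20) -> List[str]:
    """Render distinct findings as Markdown table lines for a step summary.

    ASSUMPTION: each finding is a dict with "level", "id" and "message" keys.
    """
    counts = Counter(
        (f.get("level", "unknown"), f.get("id", "unknown"), f.get("message", ""))
        for f in findings
    )
    # Most severe first, then most frequent, then by finding id for stability.
    rows = sorted(
        counts.items(),
        key=lambda kv: (SEVERITY.get(kv[0][0], 99), -kv[1], kv[0][1]),
    )
    lines = [
        "| Level | Finding | Message | Count |",
        "| --- | --- | --- | ---: |",
    ]
    for (level, finding_id, message), count in rows[:limit]:
        # Escape pipes and flatten newlines so the table row stays intact.
        safe = message.replace("|", "\\|").replace("\n", " ")
        lines.append(f"| {level} | `{finding_id}` | {safe} | {count} |")
    if len(rows) > limit:
        lines += ["", f"... and {len(rows) - limit} more distinct findings"]
    return lines
```

The returned lines could be appended to the existing `summary_lines` before the summary is written out. Capping the table with `limit` keeps the summary readable when a run produces many distinct findings.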

(never fail; report-only). Start with `none` when first adopting,
then tighten once existing findings are addressed.
required: false
default: 'violation'
Contributor

Perhaps this should be lifted up into weaver. live-check could return exit code 1 for the failure level selected, not just violation which is the default.
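
Wherever the gate ends up living, the fail-on decision can be sketched as a threshold over the level names. This sketch assumes that `fail-on` means "fail at this level or anything more severe" and that severity increases from information to improvement to violation; `should_fail` and the shape of `counts` are hypothetical.

```python
from typing import Dict

# ASSUMPTION: ordered from least to most severe.
LEVELS = ["information", "improvement", "violation"]


def should_fail(counts: Dict[str, int], fail_on: str) -> bool:
    """Decide whether the job should fail given per-level finding counts.

    `fail_on` is one of "violation", "improvement", "information" or "none".
    """
    if fail_on == "none":
        # Report-only mode: never fail the job.
        return False
    if fail_on not in LEVELS:
        raise ValueError(f"unknown fail-on level: {fail_on!r}")
    threshold = LEVELS.index(fail_on)
    # Fail if any finding exists at the threshold level or above.
    return any(counts.get(level, 0) > 0 for level in LEVELS[threshold:])
```

With the default of `violation`, a run that produces only improvement or information findings passes, which matches the adoption path described in the input's help text.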
