fix: use (name, subtype) tuple as component key in DeprecationDetector #498
Open
Rama542 wants to merge 9 commits into
…ntries Adds a new ecosystem-automation/v1-registry-sync Python package that reads the latest V2 registry snapshot and generates a report showing which stability, display_name, and description values would be written into matching V1 entries under opentelemetry.io/data/registry/. The tool runs in dry-run mode only for now and outputs a JSON or YAML report. It selects the most stable signal level across all signals for each component (stable > beta > alpha > development > deprecated > unmaintained) and omits null fields from the output. 18 unit tests cover the reader and reporter modules. Closes open-telemetry#465
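The "most stable signal level" selection described in this commit can be sketched as follows. This is a minimal illustration, not the package's actual code: the function names most_stable and drop_nulls and the shape of the signal_levels mapping are assumptions made for the example; only the ranking order comes from the commit message.

```python
from typing import Optional

# Ranking as stated in the commit message: earlier entries count as "more stable".
STABILITY_ORDER = ["stable", "beta", "alpha", "development", "deprecated", "unmaintained"]
_RANK = {level: position for position, level in enumerate(STABILITY_ORDER)}


def most_stable(signal_levels: dict[str, str]) -> Optional[str]:
    """Return the most stable level across all signals (hypothetical helper).

    signal_levels maps a signal name (e.g. "traces") to its stability level.
    Unknown levels are ignored; None is returned when nothing is recognised.
    """
    known = [level for level in signal_levels.values() if level in _RANK]
    if not known:
        return None
    return min(known, key=lambda level: _RANK[level])


def drop_nulls(entry: dict) -> dict:
    """Omit fields whose value is None, mirroring the 'omits null fields' behaviour."""
    return {key: value for key, value in entry.items() if value is not None}
```

For example, a component that is stable for metrics but beta for traces would be reported as stable under this rule.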
- Remove stability from proposed_v1_changes: the V1 schema declares additionalProperties false and has no stability field, so the validator would reject any entry containing it
- Remove title/display_name from proposed_v1_changes: a handful of V1 titles carry more information than the V2 display_name (e.g. otelarrowexporter), so limiting the initial sync to description avoids losing fidelity
- Add target_v1_file and v1_entry_exists to each report entry so the dry-run output is directly actionable
- Replace local _find_latest_version and _parse_component_file with InventoryManager from collector-watcher, which is the same pattern used by explorer-db-builder and configuration-watcher; this also fixes a latent issue where the old helper could pick up SNAPSHOT directories since it sorted all version dirs including pre-releases
- Add --v1-registry-dir CLI argument to enable v1_entry_exists checks against a local clone of opentelemetry.io/data/registry
- Add README.md to match sibling watcher packages
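The SNAPSHOT issue mentioned in the list above is the kind of bug that appears when every version directory is sorted together. The sketch below shows one way to pick the latest release while skipping pre-releases. It does not show InventoryManager's API; latest_release is a hypothetical helper, and the directory naming (vMAJOR.MINOR.PATCH, with a -SNAPSHOT suffix for pre-releases) is an assumption.

```python
import re
from typing import Iterable, Optional

# Assumed naming: release dirs look like "v0.151.0"; anything with a suffix
# such as "-SNAPSHOT" is treated as a pre-release and skipped.
_RELEASE = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def latest_release(dir_names: Iterable[str]) -> Optional[str]:
    """Return the highest release directory name, ignoring pre-releases."""
    releases = []
    for name in dir_names:
        match = _RELEASE.match(name)
        if match:
            # Compare numerically so that v0.10.0 sorts above v0.9.0.
            version = tuple(int(part) for part in match.groups())
            releases.append((version, name))
    if not releases:
        return None
    return max(releases)[1]
```

Comparing parsed integer tuples rather than raw strings also avoids the lexicographic trap where "v0.9.0" would sort after "v0.10.0".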
The previous target_v1_file used f"collector-{name}.yml" but actual V1
files follow collector-{component_type}-{slug}.yml, so v1_entry_exists
was returning false for nearly every component.
The fix builds a dict[go_module_path -> v1_filename] at startup by
reading the package.name field from each V1 file. Each V2 component's
expected module path is constructed as:
github.com/open-telemetry/opentelemetry-collector-contrib/{type}/{name}
Matching on the module path is consistent across both registries and
avoids naming-convention guesswork. Across 249 contrib components in
v0.151.0, 244 match this way; the 5 that do not (azurefunctionsreceiver,
googlesecopsexporter, drainprocessor, spanpruningprocessor,
datadogconnector) are genuinely missing from V1, not matcher bugs.
expected_go_module_path is also included on every report row so misses
are easy to triage. The test fixture now uses realistic V1 file names
(collector-receiver-fooreceiver.yml) instead of the old wrong convention.
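The module-path matching described above could look roughly like this. It is a sketch under assumptions: the function names are invented for illustration, and the V1 files are represented as already-parsed dictionaries (filename to YAML content) so the example stays free of a YAML dependency. Only the package.name field, the module path template, and the report field names come from the commit message.

```python
from typing import Optional

CONTRIB_PREFIX = "github.com/open-telemetry/opentelemetry-collector-contrib"


def expected_go_module_path(component_type: str, name: str) -> str:
    """Build the module path a V2 component is expected to have."""
    return f"{CONTRIB_PREFIX}/{component_type}/{name}"


def build_module_index(v1_entries: dict[str, dict]) -> dict[str, str]:
    """Map go_module_path -> v1_filename using each entry's package.name.

    v1_entries is assumed to be {filename: parsed YAML document}.
    Entries without a package.name are skipped.
    """
    index: dict[str, str] = {}
    for filename, entry in v1_entries.items():
        package = entry.get("package") or {}
        module_path = package.get("name")
        if module_path:
            index[module_path] = filename
    return index


def match_component(component_type: str, name: str, index: dict[str, str]) -> dict:
    """Produce the matching-related fields of one report row."""
    path = expected_go_module_path(component_type, name)
    target: Optional[str] = index.get(path)
    return {
        "expected_go_module_path": path,
        "target_v1_file": target,
        "v1_entry_exists": target is not None,
    }
```

Because expected_go_module_path is included in every row, a miss can be triaged by searching the V1 registry for that exact string.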
The _build_component_set method was identifying components by name alone. This caused the detector to miss removals when two components shared the same name but had different subtypes, for example two extensions named zipkin under encoding and storage subtype groups respectively. Changed the set to hold (name, subtype) tuples so each component is uniquely identified. Updated the removal check in detect_deprecated to match on the full tuple. Added two new test cases that cover subtype removal detection and confirm no false positives are produced when the same name survives under an unchanged subtype. Closes open-telemetry#493
Fixes #493
What was wrong
The _build_component_set method in DeprecationDetector was building a plain set of component names to figure out which components had been removed between versions. That works fine when every component in a list has a unique name, but collector extension components can have a subtype field (encoding, observer, storage) because they live in nested directories. Two extension components with the same name but different subtypes are completely separate things, yet the old code treated them as the same.
A concrete example: if you had a zipkin encoding extension and a zipkin storage extension in the previous version, and then the encoding one got removed in the next version, the detector would see zipkin in both the old and new sets and conclude nothing was removed. The deprecation would go untracked.
What I changed
Changed the set to hold (name, subtype) tuples instead of plain names. Updated the check inside detect_deprecated to match against those tuples. The variable was also renamed from removed_names to removed_keys to reflect that it now holds tuples rather than plain strings.
Tests
Added two new test cases in test_deprecation_detector.py:
- test_detect_deprecated_subtype_removal: verifies that removing one subtype entry while keeping another with the same name is correctly detected as a deprecation.
- test_detect_deprecated_same_name_different_subtype_no_false_positive: verifies that when a component with a specific subtype survives unchanged, no false deprecation is reported.
All 8 tests pass.
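The zipkin scenario from the description can be reproduced with a small standalone sketch. It is not the code from DeprecationDetector: build_component_keys and find_removed_keys are hypothetical stand-ins, and components are assumed to be plain dictionaries with a name and an optional subtype.

```python
def build_component_keys(components: list[dict]) -> set[tuple]:
    """Identify each component by (name, subtype); subtype is None when absent."""
    return {(c["name"], c.get("subtype")) for c in components}


def find_removed_keys(previous: list[dict], current: list[dict]) -> set[tuple]:
    """Keys present in the previous version but missing from the current one."""
    return build_component_keys(previous) - build_component_keys(current)


previous = [
    {"name": "zipkin", "subtype": "encoding"},
    {"name": "zipkin", "subtype": "storage"},
]
current = [
    {"name": "zipkin", "subtype": "storage"},
]

# Name-only comparison (the old behaviour): "zipkin" appears on both sides,
# so the removal of the encoding extension is missed.
removed_by_name = {c["name"] for c in previous} - {c["name"] for c in current}

# Tuple comparison: the encoding extension is reported as removed,
# while the surviving storage extension is not.
removed_by_key = find_removed_keys(previous, current)
```

Using c.get("subtype") keeps components without a subtype working unchanged, since they all share None as the second tuple element and are still distinguished by name.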