
implement struct column assignment and validation logic#2964

Open
guancioul wants to merge 4 commits into open-telemetry:main from guancioul:feat/struct-col-assignment

@guancioul guancioul commented May 13, 2026

Change Summary

This PR implements execution and validation logic for assigning values to nested struct fields in the OTAP columnar query engine, enabling queries such as:

logs | set resource.schema_url = "http://my-schema.com"
logs | set instrumentation_scope.name = "some_scope_name"

Validation (validate_assign in assign.rs)

  • Add StructCol arm to the validate_assign dispatch
  • Add is_valid_struct_field to check that the target field name actually exists in the struct schema
  • Add type validation to reject assignments where the source value type does not match the destination field type
  • Add validate_struct_col_assign_cardinality to enforce the OTAP 1:many hierarchy constraint (RESOURCE ==1:many==> SCOPE ==1:many==> LOG/SPAN/METRIC); e.g. set resource.schema_url = attributes["x"] is rejected because log-level data cannot be used to assign a resource-level field
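The field-existence and type checks listed above can be sketched roughly as follows. This is a minimal illustration, not the code in assign.rs: `FieldType`, `ValidationError`, `STRUCT_FIELDS`, `lookup_field_type`, and `validate_struct_field_assign` are hypothetical names, and the field table only covers the fields named in this PR description.

```rust
/// Hypothetical, simplified stand-in for the types a struct field can hold.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FieldType {
    Utf8,
    UInt32,
}

/// Illustrative schema table: (struct column, field name, field type).
/// Only the fields mentioned in the PR description are listed here.
const STRUCT_FIELDS: &[(&str, &str, FieldType)] = &[
    ("resource", "schema_url", FieldType::Utf8),
    ("resource", "dropped_attributes_count", FieldType::UInt32),
    ("instrumentation_scope", "name", FieldType::Utf8),
    ("instrumentation_scope", "version", FieldType::Utf8),
    ("instrumentation_scope", "dropped_attributes_count", FieldType::UInt32),
];

/// Hypothetical error type for the two checks sketched here.
#[derive(Debug, PartialEq, Eq)]
pub enum ValidationError {
    UnknownField { column: String, field: String },
    TypeMismatch { expected: FieldType, found: FieldType },
}

/// Returns the declared type of `column.field`, if the field exists.
pub fn lookup_field_type(column: &str, field: &str) -> Option<FieldType> {
    STRUCT_FIELDS
        .iter()
        .find(|(c, f, _)| *c == column && *f == field)
        .map(|(_, _, t)| *t)
}

/// Rejects assignments to unknown fields and assignments whose source type
/// differs from the destination field type.
pub fn validate_struct_field_assign(
    column: &str,
    field: &str,
    source: FieldType,
) -> Result<(), ValidationError> {
    let expected = lookup_field_type(column, field).ok_or_else(|| {
        ValidationError::UnknownField {
            column: column.to_string(),
            field: field.to_string(),
        }
    })?;
    if expected != source {
        return Err(ValidationError::TypeMismatch { expected, found: source });
    }
    Ok(())
}
```

Under these assumptions, `validate_struct_field_assign("resource", "dropped_attributes_count", FieldType::Utf8)` returns a `TypeMismatch` error, which mirrors the rejected `resource.dropped_attributes_count = "hello"` case from the test list.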

Execution (execute in assign.rs)

  • Add StructCol arm to the execution dispatch to handle struct field assignment
  • Add assign_to_struct_column — handles non-null assignment to a struct field with row-level alignment between the source array and the target struct column
  • Add assign_null_struct_field — handles null assignment by dropping the target field from the struct in the OTAP batch
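The two execution paths above (non-null assignment with row alignment, and null assignment that drops the field) can be modelled with a toy struct column. The sketch below replaces Arrow arrays with plain `Vec`s and string-only values; `Field`, `StructColumn`, `Source`, and `assign_struct_field` are hypothetical names chosen for illustration and do not reflect the actual signatures in assign.rs.

```rust
/// One field of a toy struct column: a name plus one optional value per row.
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub values: Vec<Option<String>>,
}

/// Toy stand-in for a struct column (the real engine uses Arrow arrays).
#[derive(Debug, Clone, PartialEq, Default)]
pub struct StructColumn {
    pub fields: Vec<Field>,
}

/// Hypothetical source of an assignment.
pub enum Source {
    /// `set col.field = null`
    Null,
    /// A literal, broadcast to every row.
    Scalar(String),
    /// A per-row array that must line up with the target rows.
    Values(Vec<Option<String>>),
}

/// Assigns `source` to `field` in `col`, which is assumed to have `num_rows` rows.
pub fn assign_struct_field(
    col: &mut StructColumn,
    num_rows: usize,
    field: &str,
    source: Source,
) -> Result<(), String> {
    let values = match source {
        // Null assignment: drop the field from the struct entirely.
        Source::Null => {
            col.fields.retain(|f| f.name != field);
            return Ok(());
        }
        Source::Scalar(s) => vec![Some(s); num_rows],
        Source::Values(v) => v,
    };
    // Simplified alignment rule: the source must supply one value per row.
    if values.len() != num_rows {
        return Err(format!(
            "source has {} rows but target has {}",
            values.len(),
            num_rows
        ));
    }
    // Overwrite the field if it exists, otherwise append it.
    if let Some(pos) = col.fields.iter().position(|f| f.name == field) {
        col.fields[pos].values = values;
    } else {
        col.fields.push(Field { name: field.to_string(), values });
    }
    Ok(())
}
```

Starting from `StructColumn::default()` also loosely covers the case where the target field does not yet exist: the first assignment creates the field, and a later `Source::Null` removes it again.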

DataScope Enhancement

  • Extend DataScope with a new StructField variant to represent the cardinality level of a value read from a struct column field (e.g. reading resource.schema_url produces StructField(ResourceAttrs))
  • Previously, reading a struct field like resource.schema_url as a source expression was treated as Root (log-level) scope — the same as any plain column — so the cardinality validator could not distinguish resource-level struct field reads from log-level reads
  • With StructField, the validator can now correctly compare the cardinality level of the source expression (e.g. resource.schema_url → StructField(ResourceAttrs)) against the destination struct column (e.g. assigning to instrumentation_scope) and reject invalid assignments accordingly
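To make the cardinality comparison concrete, here is a small sketch of how a scope-aware check along these lines might look. It is an assumption-laden model rather than the engine's real `DataScope`: `Level`, `SourceScope` (including its `Literal` variant), `source_level`, and `validate_cardinality` are hypothetical names, and the real `StructField` variant carries a payload such as `ResourceAttrs` rather than the simplified `Level` used here.

```rust
/// Hypothetical hierarchy levels, ordered from coarsest to finest:
/// RESOURCE ==1:many==> SCOPE ==1:many==> LOG/SPAN/METRIC.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Level {
    Resource,
    Scope,
    Signal,
}

/// Simplified stand-in for the scope of a source expression.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SourceScope {
    /// A static literal; assumed here to have no row-level cardinality.
    Literal,
    /// A plain column or attribute read at log/span/metric level.
    Root,
    /// A value read from a struct column field at the given level.
    StructField(Level),
}

/// Maps a source scope to its hierarchy level, if it has one.
pub fn source_level(source: SourceScope) -> Option<Level> {
    match source {
        SourceScope::Literal => None,
        SourceScope::Root => Some(Level::Signal),
        SourceScope::StructField(level) => Some(level),
    }
}

/// Accepts the assignment only if the source is at the same level as the
/// destination struct column or at a coarser one.
pub fn validate_cardinality(source: SourceScope, dest: Level) -> Result<(), String> {
    match source_level(source) {
        // A finer-grained source has many rows per destination row, so there
        // is no single value to assign.
        Some(level) if level > dest => Err(format!(
            "cannot assign {:?}-level data to a {:?}-level field",
            level, dest
        )),
        _ => Ok(()),
    }
}
```

In this model, `validate_cardinality(SourceScope::Root, Level::Resource)` fails, matching the rejected `set resource.schema_url = attributes["x"]` example, while `validate_cardinality(SourceScope::StructField(Level::Resource), Level::Scope)` succeeds, matching `instrumentation_scope.name = resource.schema_url`.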

What issue does this PR close?

How are these changes tested?

  • Assign a struct field from a static literal (e.g. resource.schema_url = "something")
  • Assign a non-string (UInt32) struct field from a scalar literal (e.g. instrumentation_scope.dropped_attributes_count = 42)
  • Reject assignment where the source sits lower in the hierarchy (finer-grained) than the destination (e.g. resource.schema_url = attributes["x"])
  • Allow assignment where the source sits at the same level as the destination or higher in the hierarchy (e.g. instrumentation_scope.name = resource.schema_url)
  • Assign a struct field from its own sibling attributes (e.g. instrumentation_scope.name = instrumentation_scope.attributes["name"])
  • Reject assignment of the wrong type (e.g. resource.dropped_attributes_count = "hello")
  • Assign null to a struct field and verify the field is dropped from the batch
  • Assign the result of a function expression (e.g. instrumentation_scope.name = substring(instrumentation_scope.name, 0, 6))
  • Assign when the struct column does not yet exist in the batch
  • Assign when the target field does not yet exist within an existing struct column

Are there any user-facing changes?

Yes. Users can now write OPL/KQL queries that assign values to nested struct fields such as resource.schema_url, instrumentation_scope.name, instrumentation_scope.version, and instrumentation_scope.dropped_attributes_count.

@github-actions github-actions Bot added rust Pull requests that update Rust code query-engine Query Engine / Transform related tasks query-engine-columnar Columnar query engine which uses DataFusion to process OTAP Batches labels May 13, 2026

codecov Bot commented May 13, 2026

Codecov Report

❌ Patch coverage is 90.74316% with 71 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.93%. Comparing base (672d665) to head (b78f21d).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2964      +/-   ##
==========================================
+ Coverage   85.92%   85.93%   +0.01%     
==========================================
  Files         725      725              
  Lines      275605   276347     +742     
==========================================
+ Hits       236811   237487     +676     
- Misses      38270    38336      +66     
  Partials      524      524              
| Components | Coverage Δ |
|---|---|
| otap-dataflow | 87.06% <90.74%> (+0.01%) ⬆️ |
| query_abstraction | 80.61% <ø> (ø) |
| query_engine | 89.57% <ø> (ø) |
| otel-arrow-go | 52.45% <ø> (ø) |
| quiver | 92.25% <ø> (ø) |

@github-actions github-actions Bot added the ci-repo Repository maintenance, build, GH workflows, repo cleanup, or other chores label May 16, 2026

linux-foundation-easycla Bot commented May 16, 2026

CLA Signed
The committers listed above are authorized under a signed CLA.

@guancioul guancioul force-pushed the feat/struct-col-assignment branch from b0f2244 to a0b91e3 Compare May 16, 2026 18:02
guancioul added 2 commits May 17, 2026 02:06
…ation

Signed-off-by: guancioul <guancioul@gmail.com>
Signed-off-by: guancioul <guancioul@gmail.com>
@guancioul guancioul force-pushed the feat/struct-col-assignment branch from a0b91e3 to 7f63bb6 Compare May 16, 2026 18:08
guancioul added 2 commits May 17, 2026 03:45
…gnment

Signed-off-by: guancioul <guancioul@gmail.com>
…gnment

Signed-off-by: guancioul <guancioul@gmail.com>
@guancioul guancioul marked this pull request as ready for review May 16, 2026 20:16
@guancioul guancioul requested a review from a team as a code owner May 16, 2026 20:16

Development

Successfully merging this pull request may close these issues.

[OTAP query-engine]: Support assignment of "struct" columns