
[SPARK-56907][SQL] Reduce per-value allocation in DELTA_LENGTH_BYTE_ARRAY Parquet vectorized reader #55932

Open
iemejia wants to merge 1 commit into apache:master from iemejia:SPARK-delta-length-byte-array


iemejia commented May 17, 2026

What changes were proposed in this pull request?

This PR reduces object allocation in the DELTA_LENGTH_BYTE_ARRAY vectorized Parquet reader (VectorizedDeltaLengthByteArrayReader) by applying three targeted changes:

readBinary: Replace the per-value in.slice(length) call (one ByteBuffer allocation per value) with a single bulk in.slice(totalDataLen) that covers the entire batch. Individual values are then written to the column vector via putByteArray from the shared backing array, eliminating N-1 ByteBuffer allocations for a batch of N values.

skipBinary: Replace the per-value skip loop (N separate in.skip() calls) with a single bulk skip by summing all value lengths upfront.

readGeoData: Remove the ByteBuffer.wrap() + ByteBufferOutputWriter indirection per value and call putByteArray directly from the converter output array.
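
The readBinary change can be illustrated with a minimal, self-contained sketch. It is not the code in this PR: `ValueSink` and `ListSink` are hypothetical stand-ins for the column vector's `putByteArray`, and a plain `java.nio.ByteBuffer` stands in for the Parquet input stream. The fallback copy for buffers without an accessible backing array is also an assumption made for the sketch.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class BulkSliceSketch {
    /** Hypothetical stand-in for a column vector's putByteArray. */
    interface ValueSink {
        void putByteArray(int rowId, byte[] src, int offset, int length);
    }

    /** Collects values as strings so both strategies can be compared. */
    static final class ListSink implements ValueSink {
        final List<String> values = new ArrayList<>();

        @Override
        public void putByteArray(int rowId, byte[] src, int offset, int length) {
            values.add(new String(src, offset, length, StandardCharsets.UTF_8));
        }
    }

    /** Per-value strategy: allocates one ByteBuffer view for every value. */
    static void readPerValue(ByteBuffer in, int[] lengths, ValueSink sink) {
        for (int i = 0; i < lengths.length; i++) {
            ByteBuffer slice = in.slice();           // one view object per value
            slice.limit(lengths[i]);
            in.position(in.position() + lengths[i]); // advance the source
            byte[] copy = new byte[lengths[i]];
            slice.get(copy);
            sink.putByteArray(i, copy, 0, copy.length);
        }
    }

    /** Bulk strategy: one view for the whole batch, then offset arithmetic. */
    static void readBulk(ByteBuffer in, int[] lengths, ValueSink sink) {
        int total = 0;
        for (int len : lengths) {
            total += len;
        }
        ByteBuffer batch = in.slice();               // the only view allocated
        batch.limit(total);
        in.position(in.position() + total);

        byte[] data;
        int offset;
        if (batch.hasArray()) {
            // Heap buffer: read values straight from the shared backing array.
            data = batch.array();
            offset = batch.arrayOffset() + batch.position();
        } else {
            // Assumed fallback for direct buffers: copy the batch once.
            data = new byte[total];
            batch.get(data);
            offset = 0;
        }
        for (int i = 0; i < lengths.length; i++) {
            sink.putByteArray(i, data, offset, lengths[i]);
            offset += lengths[i];
        }
    }
}
```

Both methods deliver the same values to the sink; the difference is that `readBulk` creates one buffer view per batch rather than one per value.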

Why are the changes needed?

The DELTA_LENGTH_BYTE_ARRAY encoding is used for binary/string columns in Parquet v2 pages. In the current vectorized reader, readBinary allocates one ByteBuffer per value via in.slice(length), and skipBinary performs a separate stream skip per value. For large batches (e.g. 1M values per page), this creates significant allocation pressure and per-call overhead.
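
The skip path described above can be sketched in the same spirit. This is an illustration under stated assumptions, not the PR's implementation: `CountingInput` is a hypothetical in-memory stand-in for the reader's input stream that counts `skip` calls, and the `start`/`count` parameters are illustrative names for the range of values being skipped.

```java
final class BulkSkipSketch {
    /** Hypothetical stand-in for the reader's input stream; counts skip calls. */
    static final class CountingInput {
        private final byte[] data;
        private int pos = 0;
        int skipCalls = 0;

        CountingInput(byte[] data) {
            this.data = data;
        }

        /** Advances the position by n bytes, rejecting out-of-range requests. */
        void skip(long n) {
            if (n < 0 || n > data.length - pos) {
                throw new IllegalArgumentException("cannot skip " + n + " bytes");
            }
            skipCalls++;
            pos += (int) n;
        }

        /** Returns the next byte as an unsigned value, or -1 at end of input. */
        int read() {
            return pos < data.length ? (data[pos++] & 0xFF) : -1;
        }
    }

    /** Per-value strategy: one skip call for every value in the range. */
    static void skipPerValue(CountingInput in, int[] lengths, int start, int count) {
        for (int i = start; i < start + count; i++) {
            in.skip(lengths[i]);
        }
    }

    /** Bulk strategy: sum the lengths first, then issue a single skip call. */
    static void skipBulk(CountingInput in, int[] lengths, int start, int count) {
        long total = 0;
        for (int i = start; i < start + count; i++) {
            total += lengths[i];
        }
        in.skip(total);
    }
}
```

Both strategies leave the input at the same position; the bulk variant trades N skip calls for N integer additions and one call.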

Micro-benchmarks on VectorizedDeltaReaderBenchmark Group D show:

| Benchmark | Before (ms) | After (ms) | Speedup |
|---|---|---|---|
| readBinary, payloadLen=8 | 12 | 10 | 1.2x |
| readBinary, payloadLen=32 | 16 | 14 | 1.1x |
| readBinary, payloadLen=128 | 13 | 12 | 1.1x |
| readBinary, payloadLen=512 | 32 | 32 | ~1.0x |
| skipBinary (all sizes) | 7 | 5 | 1.4x |

The readBinary speedup is larger for small payloads, where allocation cost dominates. skipBinary shows a consistent 1.4x improvement across all payload sizes.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing tests: ParquetDeltaLengthByteArrayEncodingSuite (14 tests including serialization, random strings, empty strings, skip interleaving, and geo types) and ParquetEncodingSuite all pass.

Benchmarks: VectorizedDeltaReaderBenchmark Group D (DELTA_LENGTH_BYTE_ARRAY) run locally on JDK 17.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: OpenCode with Claude claude-opus-4.6
