Skip to content

Optimize RawBsonDocument encode and decode #1913

Open
vbabanin wants to merge 8 commits into
mongodb:mainfrom
vbabanin:JAVA-6133
Open

Optimize RawBsonDocument encode and decode #1913
vbabanin wants to merge 8 commits into
mongodb:mainfrom
vbabanin:JAVA-6133

Conversation

@vbabanin
Copy link
Copy Markdown
Member

@vbabanin vbabanin commented Mar 17, 2026

  • Add BsonWriter.pipe(byte[], int, int) with BsonBinaryWriter override to write raw BSON bytes directly to the output, avoiding intermediate object allocation on the encode path
  • Add BsonInput.pipe(BsonOutput, int) to remove the temporary byte[] copy in BsonBinaryWriter.pipeDocument() on both encode and decode paths
  • Add public getBackingArray(), getByteOffset(), getByteLength() on RawBsonDocument to expose the backing byte array

Performance analyzer: link

JAVA-6133

…e allocations

- Add BsonWriter.pipe(byte[], int, int) with BsonBinaryWriter override to write raw BSON bytes directly to the output, avoiding intermediate object allocation on the encode path
- Add BsonInput.pipe(BsonOutput, int) to remove the temporary byte[] copy in BsonBinaryWriter.pipeDocument() on both encode and decode paths
- Add public getByteBacking(), getByteOffset(), getByteLength() on RawBsonDocument to expose the backing byte array

JAVA-6133
@vbabanin vbabanin self-assigned this Mar 17, 2026
@vbabanin vbabanin changed the title Optimize RawBsonDocument encode and decode by eliminating intermediat… Optimize RawBsonDocument encode and decode Mar 17, 2026
vbabanin added 3 commits May 19, 2026 09:19
- Removes pipe(byte[], int, int) from BsonWriter interface to avoid coupling it to concrete IO classes; dispatches via instanceof BsonBinaryWriter in the codec instead
- Renames getByteBacking() to getBackingArray() for clarity
- Validates minimum BSON document size before writing raw bytes, consistent with the reader-based pipe path
- Adds tests for the raw-byte pipe happy path and invalid-size rejection

JAVA-6133
Comment on lines -358 to -360
byte[] bytes = new byte[size - 4];
bsonInput.readBytes(bytes);
bsonOutput.writeBytes(bytes);
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This avoids an extra temporary byte[] allocation by reading directly into the target buffer, reducing both allocation pressure and byte-copy/processing overhead.

Comment on lines +337 to +338
@Override
public void pipe(final byte[] bytes, final int offset, final int length) {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of routing through a BsonReader (which wraps a BsonInput but doesn’t expose the underlying array unless copied), we write the bytes directly to the output.

@vbabanin vbabanin requested review from Copilot and strogiyotec May 21, 2026 18:36
@vbabanin vbabanin marked this pull request as ready for review May 21, 2026 18:36
@vbabanin vbabanin requested a review from a team as a code owner May 21, 2026 18:36
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes RawBsonDocument encode/decode paths by enabling direct piping of raw BSON bytes between inputs/writers/outputs, reducing intermediate allocations and copies.

Changes:

  • Added BsonBinaryWriter.pipe(byte[], int, int) to write raw BSON document bytes directly to the output.
  • Added BsonInput.pipe(BsonOutput, int) and implemented it in ByteBufferBsonInput to avoid temporary byte array copies during piping.
  • Exposed RawBsonDocument backing byte array, offset, and length via new public accessors, with accompanying tests.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
bson/src/test/unit/org/bson/RawBsonDocumentSpecification.groovy Adds coverage for new RawBsonDocument backing-array/offset/length accessors.
bson/src/test/unit/org/bson/BsonBinaryWriterTest.java Adds tests for piping raw BSON bytes and invalid-size handling.
bson/src/main/org/bson/RawBsonDocument.java Adds public accessors to expose the backing byte array, offset, and length.
bson/src/main/org/bson/io/ByteBufferBsonInput.java Implements BsonInput.pipe with an array-backed fast path.
bson/src/main/org/bson/io/BsonInput.java Introduces the new pipe(BsonOutput, int) API on BsonInput.
bson/src/main/org/bson/codecs/RawBsonDocumentCodec.java Uses a fast-path to pipe raw bytes when the writer is a BsonBinaryWriter.
bson/src/main/org/bson/BsonWriter.java Minor formatting cleanup.
bson/src/main/org/bson/BsonBinaryWriter.java Implements raw-byte piping and refactors pipe-document completion logic.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread bson/src/main/org/bson/io/BsonInput.java Outdated
Comment thread bson/src/main/org/bson/io/ByteBufferBsonInput.java
Comment thread bson/src/main/org/bson/io/ByteBufferBsonInput.java
Comment thread bson/src/main/org/bson/BsonBinaryWriter.java
Comment thread bson/src/main/org/bson/BsonBinaryWriter.java
Comment thread bson/src/main/org/bson/RawBsonDocument.java
Comment thread bson/src/main/org/bson/io/BsonInput.java Outdated
@vbabanin vbabanin requested a review from rozza May 22, 2026 05:12
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

Comment thread bson/src/test/unit/org/bson/io/BsonInputTest.java Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

Comment on lines 42 to +50
public void encode(final BsonWriter writer, final RawBsonDocument value, final EncoderContext encoderContext) {
try (BsonBinaryReader reader = new BsonBinaryReader(new ByteBufferBsonInput(value.getByteBuffer()))) {
writer.pipe(reader);
if (writer instanceof BsonBinaryWriter) {
// Fast path. The pipe method should ideally exist on BsonWriter, but adding it as
// abstract would be a breaking change, and adding it as a default method would force
// BsonWriter to depend on BsonBinaryReader/ByteBufferBsonInput, violating the
// interface's abstraction.
// TODO JAVA-6211 move pipe(byte[], int, int) to BsonWriter to remove this instanceof.
((BsonBinaryWriter) writer).pipe(value.getBackingArray(), value.getByteOffset(), value.getByteLength());
} else {
@Test
void defaultPipeShouldCopyBytesFromInputToOutput() {
// given
byte[] inputBytes = {0x4a, 0x61, 0x76, 0x61, 0x21};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we do

"Java!".getBytes(StandardCharsets.UTF_8);

instead ? Same thing but for me personally it's really hard to reason about raw bytes , what do you think ?


@Test
public void testPipeOfRawBytesWithInvalidSize() {
byte[] bytes = {4, 0, 0, 0}; // minimum document size is 5
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw the validation for size of 5, just curious why is it 5 ?

}


def 'getBackingArray, getByteOffset and getByteLength should expose the document range'() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be mistaken but I saw a lot of PRs that also remove groovy spec class, can we move this test case to java test instead ?

@Test
void defaultPipeShouldCopyPartialBytesFromInputToOutput() {
// given
byte[] inputBytes = {0x4a, 0x61, 0x76, 0x61, 0x21};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants