Support any number of Document Sequence Sections in CommandMessage#getCommandDocument #1456

jyemin · 2024-07-20T00:09:30Z

driver-core/src/main/com/mongodb/internal/connection/CommandMessage.java

jyemin · 2024-07-20T00:33:00Z

driver-core/src/main/com/mongodb/internal/connection/CommandMessage.java

+
+                while (outputByteBuf.hasRemaining()) {
+                    outputByteBuf.position(outputByteBuf.position() + 1 /* payload type */ + 4 /* payload size */);
+                    String payloadName = getPayloadName(outputByteBuf);


Now we get the payload name directly from the OP_MSG instead of from the SplittablePayload, making it easier to support any number of document sequences independently of any future changes to SplittablePayload.

stIncMale · Jul 24, 2024 (edited)

If we now get the "payload name" (which is actually the sequence identifier) from what has already been written to the bsonOutput, why do we still use SplittablePayload.getPayloadName, including in CommandMessage?

I don't yet have a good understanding of what's going on, but the above seems quite confusing.

jyemin · Jul 24, 2024

When writing the OP_MSG Kind 1 Section, we need to get the payload name from somewhere, and it comes from SplittablePayload. This code just reads it back out of the OP_MSG after it has been written, so the method no longer depends on SplittablePayload, which is now free to evolve independently of this method (for example, to support multiple payloads).
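For readers following along, here is a minimal, self-contained sketch of reading sequence identifiers back out of already-encoded Kind 1 sections. It is not the driver's code: the class and method names are hypothetical, it uses a plain java.nio.ByteBuffer rather than the driver's ByteBuf, and it assumes the buffer starts right after the Kind 0 body document and carries no trailing checksum. It relies only on the OP_MSG layout of a Kind 1 section: one kind byte, a little-endian int32 size that counts itself, a NUL-terminated identifier, then the BSON documents.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class SequenceIdentifiers {
    // Reads a NUL-terminated UTF-8 string starting at the buffer's current position.
    static String readCString(ByteBuffer in) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (byte b = in.get(); b != 0; b = in.get()) {
            bytes.write(b);
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    // Assumes buf is positioned at the first section after the Kind 0 body
    // and that every remaining section is a Kind 1 document sequence.
    static List<String> read(ByteBuffer buf) {
        // Work on a duplicate so the caller's position and byte order are untouched.
        ByteBuffer in = buf.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        List<String> identifiers = new ArrayList<>();
        while (in.hasRemaining()) {
            byte kind = in.get();
            if (kind != 1) {
                throw new IllegalStateException("Expected a Kind 1 section but found kind " + kind);
            }
            int sizeStart = in.position();
            int size = in.getInt();            // counts these 4 bytes, the identifier, and the documents
            identifiers.add(readCString(in));
            in.position(sizeStart + size);     // skip the documents of this sequence
        }
        return identifiers;
    }

    public static void main(String[] args) {
        ByteBuffer msg = ByteBuffer.allocate(32).order(ByteOrder.LITTLE_ENDIAN);
        msg.put((byte) 1);                                      // payload type
        msg.putInt(4 + 10 + 5);                                 // size + "documents\0" + one empty BSON document
        msg.put("documents".getBytes(StandardCharsets.UTF_8)).put((byte) 0);
        msg.putInt(5).put((byte) 0);                            // empty BSON document
        msg.flip();
        System.out.println(read(msg));
    }
}
```

Because each section announces its own size, the loop handles any number of document sequences without knowing how many payloads were encoded.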

jyemin · Jul 24, 2024

If you are suggesting that the getPayloadName method should be renamed to something like getSequenceIdentifier, I agree.

stIncMale · Jul 25, 2024

I think I now understand what is so confusing here:

SplittablePayload.getPayloadName is only supposed to be used when encoding (so it's not surprising that the method is still used), and obviously should not be used when decoding. Its previous use for decoding appears to have been an attempt to simplify a small piece of the decoding code at the cost of abusing SplittablePayload.getPayloadName and making the whole picture less clear.

It is also unclear if CommandMessage.getCommandDocument should be an instance method of CommandMessage (more about this below). If it were not an instance method of CommandMessage, then it would not have been so easy to access SplittablePayload.getPayloadName from CommandMessage.getCommandDocument, and SplittablePayload.getPayloadName likely would not have been used by CommandMessage.getCommandDocument to begin with.

CommandMessage.getCommandDocument requires its argument to be not just any ByteBufferBsonOutput with a command encoded in the right format, but the one that was filled by the most recent call to CommandMessage.encode on the same CommandMessage instance. I am not sure which code design principle this violates, but it definitely violates something, and it makes the code fragile and difficult to reason about.

The only hard reason for CommandMessage.getCommandDocument to be an instance method of CommandMessage is that it needs to know where in bsonOutput (its argument) to start decoding from.

CommandMessage remembers the document's position in RequestMessage.encodingMetadata when CommandMessage.encode is called (each encode overrides the previously remembered position), and CommandMessage.getCommandDocument uses that remembered position.

I wonder why it was designed this way. An immediate "fix" that comes to mind is to combine EncodingMetadata and the ByteBufferBsonOutput passed to CommandMessage.encode into something like DecodableByteBufferBsonOutput, and to remove RequestMessage.encodingMetadata. Depending on the reasons behind the current design, there may be a reason why the proposed approach couldn't work.
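To make the suggestion concrete, here is a rough sketch of the idea: the output itself remembers where the first document starts, so decoding needs nothing from the message instance that did the encoding. Everything in it is hypothetical (the class name DecodableOutput, its methods, and the use of a plain java.nio.ByteBuffer in place of ByteBufferBsonOutput); it is not an existing driver API.

```java
import java.nio.ByteBuffer;

// Hypothetical type for illustration: an output that carries its own decoding metadata.
final class DecodableOutput {
    private final ByteBuffer buffer;
    private int firstDocumentPosition = -1;    // -1 means "not recorded yet"

    DecodableOutput(int capacity) {
        this.buffer = ByteBuffer.allocate(capacity);
    }

    void writeBytes(byte[] bytes) {
        buffer.put(bytes);
    }

    // Called by the encoder just before it writes the command document.
    void markFirstDocument() {
        firstDocumentPosition = buffer.position();
    }

    // Returns a read-only view from the first document to the end of the written bytes.
    ByteBuffer firstDocumentOnwards() {
        if (firstDocumentPosition < 0) {
            throw new IllegalStateException("No document position has been recorded");
        }
        ByteBuffer view = buffer.duplicate();
        view.flip();                            // limit = bytes written so far, position = 0
        view.position(firstDocumentPosition);
        return view.slice().asReadOnlyBuffer();
    }
}
```

With a shape like this, a getCommandDocument equivalent could be a static function of the output alone, and the "must be the same instance that encoded last" requirement would disappear.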

jyemin · Jul 26, 2024

This all goes a long way back, to when there were many more subclasses of RequestMessage (we used to have subclasses for OP_INSERT, OP_UPDATE, OP_DELETE, etc.). But now there is only CommandMessage, and except for the annoyance of having to use OP_QUERY for the very first message we send on a connection, CommandMessage only encodes OP_MSG. That makes EncodingMetadata#getFirstDocumentPosition effectively a constant.

When we removed support for all the other op codes, we simplified a lot, but there is more that can be done. Just not sure it should be done in this PR.
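As a rough illustration of why the position is effectively constant for OP_MSG: the command document of the Kind 0 section always follows the 16-byte message header, the 4-byte flag bits, and the 1-byte section kind. The class and method names below are hypothetical, and the sketch assumes the offset is measured from the first byte of the message; how the driver actually measures it may differ.

```java
// Hypothetical constants for illustration, derived from the OP_MSG wire layout.
final class OpMsgLayout {
    // messageLength, requestID, responseTo, opCode: four int32 fields
    static final int MESSAGE_HEADER_LENGTH = 4 * Integer.BYTES;
    static final int FLAG_BITS_LENGTH = Integer.BYTES;
    static final int SECTION_KIND_LENGTH = Byte.BYTES;

    // Offset of the Kind 0 body document, given where the message starts in the buffer.
    static int firstDocumentOffset(int messageStart) {
        return messageStart + MESSAGE_HEADER_LENGTH + FLAG_BITS_LENGTH + SECTION_KIND_LENGTH;
    }
}
```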

stIncMale · Jul 26, 2024

Thank you. I now understand why it varied previously, but not the reasons behind the design that I described above. I created https://jira.mongodb.org/browse/JAVA-5554 about improving this.

driver-core/src/main/com/mongodb/internal/connection/ByteBufBsonDocument.java

…tCommandDocument JAVA-5536

stIncMale · 2024-07-25T13:41:42Z

jyemin self-assigned this Jul 20, 2024

jyemin commented Jul 20, 2024

jyemin marked this pull request as ready for review July 20, 2024 00:38

jyemin requested a review from stIncMale July 20, 2024 00:38

jyemin force-pushed the JAVA-5536 branch from 3ebf49a to 40e3744 Compare July 20, 2024 00:49

jyemin added 2 commits July 20, 2024 13:06

Add inline comments

0921645

Style changes

7870aed

stIncMale requested changes Jul 25, 2024

jyemin added 3 commits July 25, 2024 20:09

Refactor createOne

c207fdf

Refactor sequence identifier creation.

49bf6f8

Fix horrible bug in CommandMessage#getCommandDocument

ec9b6bb

jyemin requested a review from stIncMale July 26, 2024 01:10

NathanQingyangXu reviewed Jul 26, 2024

driver-core/src/main/com/mongodb/internal/connection/CommandMessage.java

checkstyle

74f7351

stIncMale approved these changes Jul 26, 2024

jyemin merged commit 14fc2fa into mongodb:master Jul 26, 2024
57 of 59 checks passed

jyemin deleted the JAVA-5536 branch July 26, 2024 20:13

stIncMale mentioned this pull request Dec 6, 2024

Improved Bulk Write API #1509

Merged
