REP-6088 Tolerate high numbers of mismatches #117

FGasper · 2025-06-02T16:53:26Z

Previously all document mismatches were recorded directly in the verification task. This meant, though, that if a task encompassed a large number of mismatched or missing documents, the verifier could fail to persist all of the mismatches, which caused a crash.

(The usual cause of excess mismatched/missing documents is starting migration-verifier before initial sync finishes, but it can also reasonably happen without REP-6129’s fix for queries against pre-v5 servers. See HELP-75910.)

This changeset makes the verifier save mismatches to a dedicated collection instead, one document per mismatch.
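To illustrate the one-document-per-mismatch idea, here is a minimal sketch in plain Go. It is not the verifier's actual code: the `Mismatch` and `Finding` types, their field names, and the `toRecords` and `batches` helpers are all hypothetical. It only shows why no single stored document has to grow with the number of mismatches in a task.

```go
package main

import "fmt"

// Finding is a hypothetical in-memory result of comparing one document.
type Finding struct {
	DocID  string // _id of the mismatched or missing document (assumed to be a string here)
	Detail string // e.g. "missing" or "content differs" (assumed values)
}

// Mismatch is a hypothetical shape for one stored mismatch record.
type Mismatch struct {
	TaskID string // ID of the verification task that found the mismatch
	DocID  string
	Detail string
}

// toRecords flattens one task's findings into one record per mismatch,
// so the task document itself never has to hold the whole list.
func toRecords(taskID string, findings []Finding) []Mismatch {
	records := make([]Mismatch, 0, len(findings))
	for _, f := range findings {
		records = append(records, Mismatch{TaskID: taskID, DocID: f.DocID, Detail: f.Detail})
	}
	return records
}

// batches splits records into groups of at most size elements,
// as one might do before a bulk insert into the dedicated collection.
func batches(records []Mismatch, size int) [][]Mismatch {
	var out [][]Mismatch
	for size > 0 && len(records) > 0 {
		n := size
		if len(records) < n {
			n = len(records)
		}
		out = append(out, records[:n])
		records = records[n:]
	}
	return out
}

func main() {
	recs := toRecords("task-1", []Finding{
		{"a", "missing"},
		{"b", "content differs"},
		{"c", "missing"},
	})
	for i, b := range batches(recs, 2) {
		fmt.Println("batch", i, "holds", len(b), "records")
	}
}
```

Each record carries its task's ID, so the task document stays small however many mismatches the task finds.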

This change upends some familiar workflows for investigating mismatches: it’s no longer sufficient just to query the verification_tasks collection for mismatch information, since the actual mismatches are recorded in a separate collection. To address this, the documentation now gives an aggregation pipeline that yields a similarly useful result.

This entails a metadata version change. Because that’s happening, this also changes the task type verify to verifyDocuments. (That required some sorting workarounds in tests, which were tightly coupled to the task type strings.)

tdq45gj

Looks good in general. I've left some comments on small things.

internal/testutil/testutil.go

tdq45gj · 2025-06-03T13:24:03Z

internal/verifier/compare.go

+			return errors.Wrapf(err, "starting session")
+		}
+
+		sctx := mongo.NewSessionContext(ctx, sess)


Are we reading in a session just to get the cluster time?

Yes. That, per the driver team, is the approved way to do this.

Could we add a comment to explain the purpose of a session here?

internal/verifier/mismatches.go

internal/verifier/sharding.go

tdq45gj

LGTM. Thanks!

khodakovski

LGTM!

- Mismatches were previously shown in an indeterminate order. Now they’re consistently sorted by the mismatched documents’ `_id`.
- Documents with missing fields were being logged as entirely missing. That logic is corrected here.
- The logic to create the table of missing/changed documents previously iterated through the _ids persisted in the task rather than the actual missing/changed documents. This was appropriate when that list stored mismatches but is no longer correct since the list now always stores the list of documents to check in the task. Thus, if there were only a handful of missing documents in a recheck task that contained thousands of document IDs, all of that task’s document IDs would be logged as missing. This was an oversight from PR #117, which should have updated the logic to build that table as it migrated that for the mismatched-documents table. This changeset does the necessary update.
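The first and third points can be sketched as follows. This is an illustration only, not the verifier's code: the `Mismatch` type and `tableRows` helper are hypothetical, and document IDs are simplified to strings, whereas a real `_id` can be any BSON type with its own ordering rules. The point is that rows come from the recorded mismatches, in a stable order, and not from the task's list of IDs to check.

```go
package main

import (
	"fmt"
	"sort"
)

// Mismatch is a hypothetical record of one missing or changed document.
type Mismatch struct {
	DocID  string // simplified: real _id values need not be strings
	Detail string
}

// tableRows returns one row per recorded mismatch, ordered by document ID.
// It deliberately takes no list of task document IDs: those say what to
// check, not what was found to be wrong.
func tableRows(mismatches []Mismatch) []string {
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]Mismatch(nil), mismatches...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].DocID < sorted[j].DocID
	})

	rows := make([]string, 0, len(sorted))
	for _, m := range sorted {
		rows = append(rows, fmt.Sprintf("%s\t%s", m.DocID, m.Detail))
	}
	return rows
}

func main() {
	// A recheck task may list many IDs to check...
	taskIDs := []string{"a", "b", "c", "d"}
	// ...while only a few of them turn out to be mismatched.
	found := []Mismatch{{"c", "missing"}, {"a", "content differs"}}

	fmt.Println("IDs in task:", len(taskIDs))
	for _, row := range tableRows(found) {
		fmt.Println(row)
	}
}
```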

FGasper added 25 commits May 28, 2025 10:28

stronger typing, and remove deprecated field

f84fd03

fix IsRecheck

c9d35fb

define task ID as ObjID

f7b68fc

try this

7f06fc1

fix refresh

a2ba683

fix client session

be7c889

fixes

eac05c9

add chanutil

d9ad028

restore order

3c90f08

diag

72f41fe

fix types

bd88e0a

fix some queries

25644fd

more IDs

3981457

fix indexees

8540eb1

progress

8023021

Merge branch 'main' into REP-6088-tolerate-excess-mismatches

87b21e1

fix test

cce2c41

remove print statements

280a4b9

update metadata name & remove unused

3da6764

tweaks

98fb6c6

update test

9331a6c

fix sort stability

1753bf4

sort as we expect

d7a2d20

remove unneeded

4f63f28

commit

bbd970a

FGasper requested review from khodakovski and tdq45gj June 3, 2025 09:55

FGasper marked this pull request as ready for review June 3, 2025 09:56

tdq45gj requested changes Jun 3, 2025

add literal & remove TODO

616d5c3

FGasper requested a review from tdq45gj June 3, 2025 14:35

FGasper added 3 commits June 3, 2025 11:25

no panic on unset mismatch/result ID

2f71a41

Merge branch 'main' into REP-6088-tolerate-excess-mismatches

8dc02f9

explain session

7eb72fa

tdq45gj approved these changes Jun 3, 2025

khodakovski approved these changes Jun 4, 2025

FGasper merged commit d7b456e into mongodb-labs:main Jun 4, 2025
50 checks passed

FGasper deleted the REP-6088-tolerate-excess-mismatches branch June 4, 2025 20:10

FGasper mentioned this pull request Jun 14, 2025

REP-6088 Fix display of verification summary #119

Merged
