BucketOperation doesn't expose _id field, causing IllegalArgumentException in subsequent operations #5046

@this-amine

Problem Description

Spring Data MongoDB's BucketOperation does not expose the _id field that MongoDB's $bucket stage always generates. Any later aggregation stage that references _id therefore fails with an IllegalArgumentException when the pipeline is rendered, even though the pipeline itself is valid.

Example That Demonstrates the Issue

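// Fails with "Invalid reference '_id'" when rendered against a strict context.
// strictContext stands for such an AggregationOperationContext; its setup is omitted.
newAggregation(
    bucket("price").withBoundaries(10, 50, 100),
    addFields().addField("bucketLabel").withValueOf("_id").build()
).toDocument("collection", strictContext);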

Error: IllegalArgumentException: Invalid reference '_id'

Why This Should Work

MongoDB's $bucket operation always generates an _id field in its output documents. When you run a bucket aggregation, the result looks like this:

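[
  { "_id": 10, "count": 3 },
  { "_id": 50, "count": 7 }
]

The _id field contains the inclusive lower boundary of each bucket. The boundaries 10, 50, 100 define two buckets, [10, 50) and [50, 100), so the last boundary never appears as an _id. This makes _id a natural field to reference in subsequent pipeline stages.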
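To make the meaning of _id concrete, here is a minimal plain-Java sketch of how a value maps to a bucket _id. It does not use MongoDB or Spring Data; the class and method names (`BucketIdSketch`, `bucketId`, `countByBucket`) are hypothetical and exist only for illustration. As a simplification, values outside all boundaries are skipped here, whereas a real $bucket stage raises an error for them unless a default bucket is configured.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class BucketIdSketch {

    /**
     * Returns the _id a bucket stage would assign to the value: the inclusive
     * lower boundary of the bucket [lower, upper) containing it, or null if
     * the value lies outside all boundaries.
     */
    static Integer bucketId(List<Integer> boundaries, int value) {
        for (int i = 0; i < boundaries.size() - 1; i++) {
            if (value >= boundaries.get(i) && value < boundaries.get(i + 1)) {
                return boundaries.get(i);
            }
        }
        return null; // simplification: real $bucket errors here without a default
    }

    /** Counts values per bucket _id, mirroring the default "count" output. */
    static Map<Integer, Integer> countByBucket(List<Integer> boundaries, List<Integer> values) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int value : values) {
            Integer id = bucketId(boundaries, value);
            if (id != null) {
                counts.merge(id, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Integer> boundaries = List.of(10, 50, 100);
        List<Integer> prices = List.of(10, 20, 49, 50, 75, 99);
        System.out.println(countByBucket(boundaries, prices)); // prints {10=3, 50=3}
    }
}
```

Three boundaries yield two buckets, and each result is keyed by its lower boundary, which is the value a later stage would read from _id.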

Root Cause Analysis

The issue lies in BucketOperation's field exposure logic. Currently, the asExposedFields() method doesn't include the _id field:

// Current (incorrect) implementation in BucketOperationSupport
protected ExposedFields asExposedFields() {
    // Missing _id field exposure!
    if (isEmpty()) {
        return ExposedFields.from(new ExposedField("count", true));
    }
    ExposedFields fields = ExposedFields.from();
    // ... only exposes user-defined outputs
}

This means Spring Data MongoDB doesn't know that the _id field exists, even though MongoDB will generate it.

Proposed Solution

Make BucketOperation expose the _id field that MongoDB actually generates:

// Proposed fix in BucketOperationSupport.asExposedFields():
protected ExposedFields asExposedFields() {
    // MongoDB's $bucket and $bucketAuto always generate _id field
    ExposedFields fields = ExposedFields.from(new ExposedField(Fields.UNDERSCORE_ID, true));

    // rest of the code
}
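The difference between the current and the proposed behaviour can be modeled without Spring Data at all. The sketch below is a toy stand-in, not the library's implementation: `ExposedFieldsSketch`, `resolve`, `exposedCurrent`, and `exposedProposed` are hypothetical names, and a plain `Set<String>` takes the place of ExposedFields. It only assumes what the issue describes, namely that a strict context rejects references to fields the previous stage did not expose.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class ExposedFieldsSketch {

    /** Toy strict context: resolves a name only if the previous stage exposed it. */
    static String resolve(Set<String> exposed, String name) {
        if (!exposed.contains(name)) {
            throw new IllegalArgumentException("Invalid reference '" + name + "'");
        }
        return "$" + name;
    }

    /** Models the current behaviour: "count" by default, otherwise only user outputs. */
    static Set<String> exposedCurrent(Set<String> outputs) {
        Set<String> fields = new LinkedHashSet<>();
        if (outputs.isEmpty()) {
            fields.add("count");
        } else {
            fields.addAll(outputs);
        }
        return fields;
    }

    /** Models the proposed behaviour: "_id" is always exposed first. */
    static Set<String> exposedProposed(Set<String> outputs) {
        Set<String> fields = new LinkedHashSet<>();
        fields.add("_id");
        fields.addAll(exposedCurrent(outputs));
        return fields;
    }

    public static void main(String[] args) {
        Set<String> noOutputs = Set.of();
        try {
            resolve(exposedCurrent(noOutputs), "_id");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints Invalid reference '_id'
        }
        System.out.println(resolve(exposedProposed(noOutputs), "_id")); // prints $_id
    }
}
```

In this model the fix is a single extra entry in the exposed set, which is why it would benefit every downstream stage rather than only projections.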

Why This Bug Went Unnoticed

A previous fix (#3497) added a defensive fallback in ProjectionOperation that made project("_id") work after a bucket operation. That fix addressed the symptom for projections only and left the root cause in place, so other operations such as addFields() still fail when they reference the _id field that MongoDB generates.

Metadata

Labels

has: ai-slop (A bloated issue that contains low-value AI-generated content.)
type: bug (A general bug)
