Skip to content

Implement filter, map, reduce #1031

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 9, 2022

Conversation

katcharov
Copy link
Collaborator

JAVA-4781

(Javadocs are tentative, as per JAVA-4799)

* @param in the associative accumulation function
* @return the reduced value
*/
T reduce(T initialValue, BiFunction<T, T, T> in);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type of the reduced value can be different than the type of the array elements, e.g.

{
    $reduce: {
       input: ["1", "2", "3"],
       initialValue: 0,
       in: { $sum : [{$toInt: "$$this"}, "$$value.sum"] }
 }

This can likely be dealt with by introducing a type parameter to the function definition to represent the reduced type.

Copy link
Collaborator Author

@katcharov katcharov Nov 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that we should force this to be expressed as:

ofStringArray("1", "2", "3", "4")
        .map(v -> v.toInt())
        .reduce(of(0), (a, b) -> a.add(b));

Or perhaps as:

ofStringArray("1", "2", "3", "4").addOn(v -> v.toInt());

This last variant is appealing because it hides the default value and presents reduction in a simplified way. If it can be made to work, I would argue for not exposing reduce at all. However, I'm not sure how to get the following to work:

ArrayExpression<IntegerExpression> numbers = ofIntegerArray(1, 2, 3, 4);
numbers.add(); // what interface is "add" on, if we want type safety?

* @param in the associative accumulation function
* @return the reduced value
*/
T reduce(T initialValue, BiFunction<T, T, T> in);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you think of a way to express multiple reductions, so we can support something like:

{
    $reduce: {
       input: ["1", "2", "3"],
       initialValue: {sum: 0, product: 0, concat: ""},
       in: { 
          sum :    { $sum :      [{$toInt: "$$this"}, "$$value.sum"] },
          product: { $multiply : [{$toInt: "$$this"}, "$$value.product"] }
          concat:  { $concat   : ["$$this",           "$$value.concat"] }
        }   
     }
 }

?

Copy link
Collaborator Author

@katcharov katcharov Nov 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be accomplished with the existing method:

assertExpression(BsonDocument.parse("{'sum': 10, 'product': 24, 'concat': '1234'}"),

ofStringArray("1", "2", "3", "4")
        // map:
        .<DocumentExpression>map(v -> ofEmptyDocument()
                .setField("sum", v.toInt())
                .setField("product", v.toInt())
                .setField("concat", v))
        // identity:
        .reduce(ofEmptyDocument()
                        .setField("sum", of(0))
                        .setField("product", of(1))
                        .setField("concat", of("")),
        // reduction:
        (a, b) -> ofEmptyDocument()
                .setField("sum", a.getFieldInteger("sum").add(b.getFieldInteger("sum")))
                .setField("product", a.getFieldInteger("product").multiply(b.getFieldInteger("product")))
                .setField("concat", a.getFieldString("concat").concat(b.getFieldString("concat")))));

Above test passes. However, I am unsure if the server knows how to move a map into a reduction, if such a move is needed to help performance.

Compact with a schema object:

ofStringArray("1", "2", "3", "4")
        .map(v -> new Result(v.toInt(), v.toInt(), v))
        .reduce(new Result(0, 1, ""), (a, b) -> new Result(
                a.getSum().add(b.getSum()),
                a.getProduct().multiply(b.getProduct()),
                a.getConcat().concat(b.getConcat())) ))

Could try to push it to something approximating:

ofStringArray("1", "2", "3", "4").reduce(v -> result()
        .set("sum", v -> v.toInt().add())
        .set("product", v -> v.toInt().multiply())
        .set("concat", v -> v.concat()))

I think I'd need to have a closer look at $group. I wonder if this last variant (which is basically syntax sugar) obscures what is actually going on (a map followed by a reduce). On the other hand, I like that this last variant hides the identity value.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What MQL does that first bit of code ("with the existing method") generate?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{
  "$reduce":{
    "input":{
      "$map":{
        "input":[
          "1",
          "2",
          "3",
          "4"
        ],
        "in":{
          "$setField":{
            "field":"concat",
            "input":{
              "$setField":{
                "field":"product",
                "input":{
                  "$setField":{
                    "field":"sum",
                    "input":{
                      "$literal":{
                        
                      }
                    },
                    "value":{
                      "$toInt":"$$this"
                    }
                  }
                },
                "value":{
                  "$toInt":"$$this"
                }
              }
            },
            "value":"$$this"
          }
        }
      }
    },
    "initialValue":{
      "$setField":{
        "field":"concat",
        "input":{
          "$setField":{
            "field":"product",
            "input":{
              "$setField":{
                "field":"sum",
                "input":{
                  "$literal":{
                    
                  }
                },
                "value":0
              }
            },
            "value":1
          }
        },
        "value":""
      }
    },
    "in":{
      "$setField":{
        "field":"concat",
        "input":{
          "$setField":{
            "field":"product",
            "input":{
              "$setField":{
                "field":"sum",
                "input":{
                  "$literal":{
                    
                  }
                },
                "value":{
                  "$add":[
                    {
                      "$getField":{
                        "input":"$$value",
                        "field":"sum"
                      }
                    },
                    {
                      "$getField":{
                        "input":"$$this",
                        "field":"sum"
                      }
                    }
                  ]
                }
              }
            },
            "value":{
              "$multiply":[
                {
                  "$getField":{
                    "input":"$$value",
                    "field":"product"
                  }
                },
                {
                  "$getField":{
                    "input":"$$this",
                    "field":"product"
                  }
                }
              ]
            }
          }
        },
        "value":{
          "$concat":[
            {
              "$getField":{
                "input":"$$value",
                "field":"concat"
              }
            },
            {
              "$getField":{
                "input":"$$this",
                "field":"concat"
              }
            }
          ]
        }
      }
    }
  }
}

}

@Override
public ArrayExpression<T> filter(final Function<T, BooleanExpression> cond) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not seem possible to specify the limit via this API. Is this this piece of MQL $filter functionality deliberately omitted?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should implement this parameter in something like the following way instead:

array.filter(condition).limit(10)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the plan for this functionality? Given that this PR implements $filter, I'd expect the functionality added in this PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would array.limit(10) do? I don't see a $limit expression for arrays outside the context of a filter

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stIncMale MQL has added this parameter for the sake of convenience. This convenience helps programmers writing MQL (with query size), and possibly the server in interpreting MQL. However, it is detrimental for users of our API, because it unnecessarily introduces two ways of doing things, and the mechanism that it introduces is inelegant and bloats our API. So I do not think it should be implemented.

@jyemin I believe that method that would correspond to limit is $firstN. I do not like that the names are inconsistent, and would like to address that, but do not see a clean way to do so.

@stIncMale stIncMale mentioned this pull request Nov 3, 2022
@katcharov katcharov requested a review from jyemin November 3, 2022 22:11
@katcharov katcharov requested a review from stIncMale November 7, 2022 21:05
@katcharov katcharov merged commit 2c83ce7 into mongodb:expressions Nov 9, 2022
@katcharov katcharov deleted the expressions-array-init branch November 9, 2022 15:39
katcharov added a commit that referenced this pull request Jan 30, 2023
katcharov added a commit that referenced this pull request Jan 31, 2023
katcharov added a commit that referenced this pull request Jan 31, 2023
* Implement boolean expressions (#1025)

JAVA-4779

* Implement filter, map, reduce (#1031)

JAVA-4781

* Implement eq, ne, gt, gte, lt, lte (#1033)

JAVA-4784

* Implement string expressions (#1036)

JAVA-4801

* Implement arithmetic expressions (#1037)

Implement arithmetic expressions (from top 50, and others)

JAVA-4803

* Implement array expressions (#1043)

JAVA-4805

* Implement date expressions (#1045)

JAVA-4804

* Implement conversion/type expressions (#1050)

JAVA-4802

* Implement document expressions (#1052)

JAVA-4782

* Replace reduce with individual reductions (#1053)

JAVA-4814

* Implement map expressions (#1054)

JAVA-4817

* Implement switch expression (#1055)

JAVA-4813

* Test expressions in context (#1057)

JAVA-4820

* Add javadoc for boolean, date, number, integer, and expression (#1059)

 JAVA-4799

* Update and add documentation (#1059)

* Fix, tests

 JAVA-4799

* Add `@MqlUnchecked` and a few usage examples (#1059)

 JAVA-4799

* Add has to document, add tests (#1070)

 JAVA-4799

* Add javadocs for remaining classes (#1070)

 JAVA-4799

* 5.2 annotations (#1070)

 JAVA-4799

* 5.0 annotations (#1070)

 JAVA-4799

* 4.4 annotations (#1070)

 JAVA-4799

* 4.2 annotations (#1070)

 JAVA-4799

* 4.0 annotations (#1070)

 JAVA-4799

* Update and add documentation, add tests, fix minor issues (#1070)

Rename extractBsonValue

Fix access modifiers

Remove excess comments

Update docs

Fix: behaviour of get

Add notNull to API, add notNullApi test

Fix docs/annotations, tests

Fix docs, annotations, since

Fix docs

Revert external

Add missing MqlUnchecked

Fix missing null checks

Checkstyle

JAVA-4799

* Rename to Mql (automated) (#1073)

JAVA-3879

* Rename methods (automated) (#1073)

JAVA-3879

* Update naming, terms, and missing checks and annotations (#1073)

JAVA-3879

---------

Co-authored-by: Valentin Kovalenko <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants