-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Implement filter, map, reduce #1031
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement filter, map, reduce #1031
Conversation
* @param in the associative accumulation function | ||
* @return the reduced value | ||
*/ | ||
T reduce(T initialValue, BiFunction<T, T, T> in); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type of the reduced value can be different than the type of the array elements, e.g.
{
$reduce: {
input: ["1", "2", "3"],
initialValue: 0,
in: { $sum : [{$toInt: "$$this"}, "$$value.sum"] }
}
This can likely be dealt with by introducing a type parameter to the function definition to represent the reduced type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect that we should force this to be expressed as:
ofStringArray("1", "2", "3", "4")
.map(v -> v.toInt())
.reduce(of(0), (a, b) -> a.add(b));
Or perhaps as:
ofStringArray("1", "2", "3", "4").addOn(v -> v.toInt());
This last variant is appealing because it hides the default value and presents reduction in a simplified way. If it can be made to work, I would argue for not exposing reduce
at all. However, I'm not sure how to get the following to work:
ArrayExpression<IntegerExpression> numbers = ofIntegerArray(1, 2, 3, 4);
numbers.add(); // what interface is "add" on, if we want type safety?
* @param in the associative accumulation function | ||
* @return the reduced value | ||
*/ | ||
T reduce(T initialValue, BiFunction<T, T, T> in); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you think of a way to express multiple reductions, so we can support something like:
{
$reduce: {
input: ["1", "2", "3"],
initialValue: {sum: 0, product: 0, concat: ""},
in: {
sum : { $sum : [{$toInt: "$$this"}, "$$value.sum"] },
product: { $multiply : [{$toInt: "$$this"}, "$$value.product"] }
concat: { $concat : ["$$this", "$$value.concat"] }
}
}
}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be accomplished with the existing method:
assertExpression(BsonDocument.parse("{'sum': 10, 'product': 24, 'concat': '1234'}"),
ofStringArray("1", "2", "3", "4")
// map:
.<DocumentExpression>map(v -> ofEmptyDocument()
.setField("sum", v.toInt())
.setField("product", v.toInt())
.setField("concat", v))
// identity:
.reduce(ofEmptyDocument()
.setField("sum", of(0))
.setField("product", of(1))
.setField("concat", of("")),
// reduction:
(a, b) -> ofEmptyDocument()
.setField("sum", a.getFieldInteger("sum").add(b.getFieldInteger("sum")))
.setField("product", a.getFieldInteger("product").multiply(b.getFieldInteger("product")))
.setField("concat", a.getFieldString("concat").concat(b.getFieldString("concat")))));
Above test passes. However, I am unsure if the server knows how to move a map into a reduction, if such a move is needed to help performance.
Compact with a schema object:
ofStringArray("1", "2", "3", "4")
.map(v -> new Result(v.toInt(), v.toInt(), v))
.reduce(new Result(0, 1, ""), (a, b) -> new Result(
a.getSum().add(b.getSum()),
a.getProduct().multiply(b.getProduct()),
a.getConcat().concat(b.getConcat())) ))
Could try to push it to something approximating:
ofStringArray("1", "2", "3", "4").reduce(v -> result()
.set("sum", v -> v.toInt().add())
.set("product", v -> v.toInt().multiply())
.set("concat", v -> v.concat()))
I think I'd need to have a closer look at $group. I wonder if this last variant (which is basically syntax sugar) obscures what is actually going on (a map followed by a reduce). On the other hand, I like that this last variant hides the identity value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What MQL does that first bit of code ("with the existing method") generate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{
"$reduce":{
"input":{
"$map":{
"input":[
"1",
"2",
"3",
"4"
],
"in":{
"$setField":{
"field":"concat",
"input":{
"$setField":{
"field":"product",
"input":{
"$setField":{
"field":"sum",
"input":{
"$literal":{
}
},
"value":{
"$toInt":"$$this"
}
}
},
"value":{
"$toInt":"$$this"
}
}
},
"value":"$$this"
}
}
}
},
"initialValue":{
"$setField":{
"field":"concat",
"input":{
"$setField":{
"field":"product",
"input":{
"$setField":{
"field":"sum",
"input":{
"$literal":{
}
},
"value":0
}
},
"value":1
}
},
"value":""
}
},
"in":{
"$setField":{
"field":"concat",
"input":{
"$setField":{
"field":"product",
"input":{
"$setField":{
"field":"sum",
"input":{
"$literal":{
}
},
"value":{
"$add":[
{
"$getField":{
"input":"$$value",
"field":"sum"
}
},
{
"$getField":{
"input":"$$this",
"field":"sum"
}
}
]
}
}
},
"value":{
"$multiply":[
{
"$getField":{
"input":"$$value",
"field":"product"
}
},
{
"$getField":{
"input":"$$this",
"field":"product"
}
}
]
}
}
},
"value":{
"$concat":[
{
"$getField":{
"input":"$$value",
"field":"concat"
}
},
{
"$getField":{
"input":"$$this",
"field":"concat"
}
}
]
}
}
}
}
}
driver-core/src/main/com/mongodb/client/model/expressions/MqlExpression.java
Outdated
Show resolved
Hide resolved
driver-core/src/main/com/mongodb/client/model/expressions/MqlExpression.java
Outdated
Show resolved
Hide resolved
driver-core/src/main/com/mongodb/client/model/expressions/MqlExpression.java
Outdated
Show resolved
Hide resolved
driver-core/src/main/com/mongodb/client/model/expressions/MqlExpression.java
Outdated
Show resolved
Hide resolved
driver-core/src/main/com/mongodb/client/model/expressions/MqlExpression.java
Show resolved
Hide resolved
} | ||
|
||
@Override | ||
public ArrayExpression<T> filter(final Function<T, BooleanExpression> cond) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not seem possible to specify the limit
via this API. Is this this piece of MQL $filter
functionality deliberately omitted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should implement this parameter in something like the following way instead:
array.filter(condition).limit(10)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the plan for this functionality? Given that this PR implements $filter
, I'd expect the functionality added in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would array.limit(10)
do? I don't see a $limit
expression for arrays outside the context of a filter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stIncMale MQL has added this parameter for the sake of convenience. This convenience helps programmers writing MQL (with query size), and possibly the server in interpreting MQL. However, it is detrimental for users of our API, because it unnecessarily introduces two ways of doing things, and the mechanism that it introduces is inelegant and bloats our API. So I do not think it should be implemented.
@jyemin I believe that method that would correspond to limit is $firstN. I do not like that the names are inconsistent, and would like to address that, but do not see a clean way to do so.
driver-core/src/main/com/mongodb/client/model/expressions/ArrayExpression.java
Outdated
Show resolved
Hide resolved
* Implement boolean expressions (#1025) JAVA-4779 * Implement filter, map, reduce (#1031) JAVA-4781 * Implement eq, ne, gt, gte, lt, lte (#1033) JAVA-4784 * Implement string expressions (#1036) JAVA-4801 * Implement arithmetic expressions (#1037) Implement arithmetic expressions (from top 50, and others) JAVA-4803 * Implement array expressions (#1043) JAVA-4805 * Implement date expressions (#1045) JAVA-4804 * Implement conversion/type expressions (#1050) JAVA-4802 * Implement document expressions (#1052) JAVA-4782 * Replace reduce with individual reductions (#1053) JAVA-4814 * Implement map expressions (#1054) JAVA-4817 * Implement switch expression (#1055) JAVA-4813 * Test expressions in context (#1057) JAVA-4820 * Add javadoc for boolean, date, number, integer, and expression (#1059) JAVA-4799 * Update and add documentation (#1059) * Fix, tests JAVA-4799 * Add `@MqlUnchecked` and a few usage examples (#1059) JAVA-4799 * Add has to document, add tests (#1070) JAVA-4799 * Add javadocs for remaining classes (#1070) JAVA-4799 * 5.2 annotations (#1070) JAVA-4799 * 5.0 annotations (#1070) JAVA-4799 * 4.4 annotations (#1070) JAVA-4799 * 4.2 annotations (#1070) JAVA-4799 * 4.0 annotations (#1070) JAVA-4799 * Update and add documentation, add tests, fix minor issues (#1070) Rename extractBsonValue Fix access modifiers Remove excess comments Update docs Fix: behaviour of get Add notNull to API, add notNullApi test Fix docs/annotations, tests Fix docs, annotations, since Fix docs Revert external Add missing MqlUnchecked Fix missing null checks Checkstyle JAVA-4799 * Rename to Mql (automated) (#1073) JAVA-3879 * Rename methods (automated) (#1073) JAVA-3879 * Update naming, terms, and missing checks and annotations (#1073) JAVA-3879 --------- Co-authored-by: Valentin Kovalenko <[email protected]>
JAVA-4781
(Javadocs are tentative, as per JAVA-4799)