22 changes: 19 additions & 3 deletions src/transforms/groupby.js
@@ -137,8 +137,22 @@ function pasteArray(newTrace, trace, j, a) {
);
}

// In order for groups to apply correctly to other transform data (e.g.
// a filter transform), we have to break the connection and clone the
// transforms so that each group writes grouped values into a different
// destination. This function does not break the array reference
// connection between the split transforms it creates. That's handled in
// initialize, which creates a new empty array for each arrayAttr.
function cloneTransforms(newTrace) {
Contributor

Interesting fix 👍

Contributor

@rreusser is that something we should do for all transforms?

Collaborator

Oh hah - of course, because the original Lib.extendDeepNoArrays({}, trace) doesn't dive into the transforms array. Do we need a Lib.extendDeepOnlyTheseArrays({}, trace, ['transforms']) ? 🙈

> is that something we should do for all transforms?

This only applies to transforms that generate multiple traces, right? I do kind of like the idea of recursive transforms, which might be more inherently robust, but until we do something like that, I think we should manage this case-by-case.

Contributor Author

So… arrayAttrs does pick up on data_array arrays nested inside transforms. The particular challenge is executing a transform after another has created expanded traces. If the first transform splits the trace into multiple traces, then each expanded trace needs its own copy of the other transforms since they have, for example, different split target or groups data. By extending deep with no arrays and then cloning the data arrays (which are in arrayAttrs), they all get their own copy that can be filtered appropriately.
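
To illustrate the sharing problem described above, here is a minimal sketch. `extendDeepKeepArrays` is a hypothetical stand-in written for this example (it is not the real `Lib.extendDeepNoArrays`), and it assumes arrays are passed through by reference while plain objects are copied. `cloneTransformsSketch` mirrors the idea of `cloneTransforms` from the diff.

```javascript
// Hypothetical stand-in: deep-copies plain objects, shares arrays by reference.
function extendDeepKeepArrays(target, source) {
    Object.keys(source).forEach(function(key) {
        var val = source[key];
        if(Array.isArray(val)) {
            target[key] = val; // arrays are shared, not copied
        } else if(val && typeof val === 'object') {
            target[key] = extendDeepKeepArrays({}, val);
        } else {
            target[key] = val;
        }
    });
    return target;
}

function makeTrace() {
    return {
        x: [1, 2, 3, 4],
        transforms: [
            {type: 'groupby', groups: ['a', 'a', 'b', 'b']},
            {type: 'filter', target: [1, 2, 3, 4]}
        ]
    };
}

// Without cloning: both expanded traces point at the same transform objects,
// so a write through one trace is visible through the other.
var trace1 = makeTrace();
var sharedA = extendDeepKeepArrays({}, trace1);
var sharedB = extendDeepKeepArrays({}, trace1);
sharedA.transforms[1].target = [1, 2];
var leaked = sharedB.transforms[1].target;

// With cloning: each expanded trace gets its own transform objects.
function cloneTransformsSketch(newTrace) {
    var transforms = newTrace.transforms;
    newTrace.transforms = [];
    for(var j = 0; j < transforms.length; j++) {
        newTrace.transforms[j] = extendDeepKeepArrays({}, transforms[j]);
    }
}

var trace2 = makeTrace();
var ownA = extendDeepKeepArrays({}, trace2);
var ownB = extendDeepKeepArrays({}, trace2);
cloneTransformsSketch(ownA);
cloneTransformsSketch(ownB);
ownA.transforms[1].target = [3, 4];
var isolated = ownB.transforms[1].target;
```

In this sketch `leaked` ends up as the two-element array written through `sharedA`, while `isolated` still holds the original four-element target.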

Contributor Author

Oh, I missed a bit of subtlety. Let me check on the identity of the split transforms. Unless I'm mistaken, something somewhere is cloning those… Let me dig that up.

Contributor Author

@rreusser, Jul 26, 2017

@etpinard I wondered the same. The downfall of transforms at the moment is that there are too many corner cases and different things transforms might need to do to lock them down to a completely airtight state such that they can only do what they need to do. The result is that there are some patterns like this that should not be a concern of individual transforms at all but which might need to be applied to them all separately. Or if a clear pattern arises, then it can be abstracted. At the moment, the sample size is just a bit too small for me to see overarching patterns.

Contributor

> Or if a clear pattern arises, then it can be abstracted. At the moment, the sample size is just a bit too small for me to see overarching patterns.

Very good answer. Thanks!

Contributor Author

@alexcjohnson Never mind. I read my code more carefully. I'm just manually transferring the transforms. 👍

var transforms = newTrace.transforms;
newTrace.transforms = [];
for(var j = 0; j < transforms.length; j++) {
newTrace.transforms[j] = Lib.extendDeepNoArrays({}, transforms[j]);
}
}

function transformOne(trace, state) {
- var i;
+ var i, j;
var opts = state.transform;
var groups = trace.transforms[state.transformIndex].groups;

@@ -163,12 +177,14 @@ function transformOne(trace, state) {

var newTrace = newData[i] = Lib.extendDeepNoArrays({}, trace);

cloneTransforms(newTrace);

arrayAttrs.forEach(initializeArray.bind(null, newTrace));

- for(var j = 0; j < len; j++) {
+ for(j = 0; j < len; j++) {
if(groups[j] !== groupName) continue;

- arrayAttrs.forEach(pasteArray.bind(0, newTrace, trace, j));
+ arrayAttrs.forEach(pasteArray.bind(null, newTrace, trace, j));
Collaborator

This isn't new here - except that the change you made reminded me how bind always confuses me - but it occurs to me that there's no reason to use it at all if we unwind the forEach to a for. Also pasteArray is a bit inefficient in the way it calls Lib.nestedProperty(newTrace, a) twice.
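
A rough sketch of the suggestion above, unwinding the `forEach` plus `bind` into a plain `for` loop and looking up each destination array only once per attribute. `nestedPropertySketch` is a hypothetical dotted-path accessor written for this example and is not `Lib.nestedProperty`. `pastePoint` is likewise an illustrative name.

```javascript
// Hypothetical dotted-path accessor (assumes every path segment exists).
function nestedPropertySketch(obj, path) {
    var parts = path.split('.');
    var last = parts.pop();
    var container = obj;
    for(var i = 0; i < parts.length; i++) {
        container = container[parts[i]];
    }
    return {
        get: function() { return container[last]; }
    };
}

// Copy point j of every array attribute from trace into newTrace.
function pastePoint(newTrace, trace, j, arrayAttrs) {
    for(var k = 0; k < arrayAttrs.length; k++) {
        var attr = arrayAttrs[k];
        // One lookup per side, then push in place instead of concat + set.
        var dest = nestedPropertySketch(newTrace, attr).get();
        var src = nestedPropertySketch(trace, attr).get();
        dest.push(src[j]);
    }
}

var srcTrace = {x: [1, 2, 3], marker: {size: [10, 20, 30]}};
var destTrace = {x: [], marker: {size: []}};
var attrs = ['x', 'marker.size'];

pastePoint(destTrace, srcTrace, 0, attrs);
pastePoint(destTrace, srcTrace, 2, attrs);
```

Pushing into the existing destination array avoids allocating a new array on every point, which is the kind of saving the comment hints at.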

Contributor Author

🎉 🎉 🎉 Glad to change 😄

Contributor Author

@rreusser, Jul 26, 2017

@alexcjohnson @etpinard I've refactored… actually a fair amount of groupby. It was mostly trivial reordering and inlining in order to cut down on function calls and slice and concat operations, just to streamline things a bit function-call and memory-wise.

Oh, and also it was iterating over all points for each group, which is unnecessary. Now it just iterates over all points once and uses an index to pick the right expanded trace destination.
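
The single-pass idea described above could look roughly like the following. This is a simplified sketch under the assumption of a single `x` array per trace. `groupOnce` and `indexByGroup` are illustrative names, not the identifiers used in the refactored groupby.

```javascript
// Group points in one pass: an index map sends each point straight to
// the expanded trace for its group, in first-seen group order.
function groupOnce(x, groups) {
    var indexByGroup = {};
    var out = [];

    for(var j = 0; j < groups.length; j++) {
        var name = groups[j];
        var idx = indexByGroup[name];

        if(idx === undefined) {
            // First time this group is seen: create its destination trace.
            idx = indexByGroup[name] = out.length;
            out.push({name: name, x: []});
        }

        out[idx].x.push(x[j]);
    }

    return out;
}

var grouped = groupOnce([1, 2, 3, 4, 5, 6, 7], [1, 1, 1, 2, 2, 2, 2]);
```

Here the work is proportional to the number of points, rather than the number of points times the number of groups.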

}

newTrace.name = groupName;
163 changes: 163 additions & 0 deletions test/jasmine/tests/transform_multi_test.js
@@ -727,3 +727,166 @@ describe('restyle applied on transforms:', function() {
});

});

describe('supplyDefaults with groupby + filter', function() {
function calcDatatoTrace(calcTrace) {
return calcTrace[0].trace;
}

function _transform(data, layout) {
var gd = {
data: data,
layout: layout || {}
};

Plots.supplyDefaults(gd);
Plots.doCalcdata(gd);

return gd.calcdata.map(calcDatatoTrace);
}

it('filter + groupby with blank target', function() {
var out = _transform([{
x: [1, 2, 3, 4, 5, 6, 7],
y: [4, 6, 5, 7, 6, 8, 9],
transforms: [{
type: 'filter',
operation: '<',
value: 6.5
}, {
type: 'groupby',
groups: [1, 1, 1, 2, 2, 2, 2]
}]
}]);

expect(out[0].x).toEqual([1, 2, 3]);
expect(out[0].y).toEqual([4, 6, 5]);

expect(out[1].x).toEqual([4, 5, 6]);
expect(out[1].y).toEqual([7, 6, 8]);
});

it('filter + groupby', function() {
var out = _transform([{
x: [5, 4, 3],
y: [6, 5, 4],
}, {
x: [1, 2, 3, 4, 5, 6, 7],
y: [4, 6, 5, 7, 8, 9, 10],
transforms: [{
type: 'filter',
target: [1, 2, 3, 4, 5, 6, 7],
operation: '<',
value: 6.5
}, {
type: 'groupby',
groups: [1, 1, 1, 2, 2, 2, 2]
}]
}]);

expect(out[0].x).toEqual([5, 4, 3]);
expect(out[0].y).toEqual([6, 5, 4]);

expect(out[1].x).toEqual([1, 2, 3]);
expect(out[1].y).toEqual([4, 6, 5]);

expect(out[2].x).toEqual([4, 5, 6]);
expect(out[2].y).toEqual([7, 8, 9]);
});

it('groupby + filter', function() {
var out = _transform([{
x: [1, 2, 3, 4, 5, 6, 7],
y: [4, 6, 5, 7, 6, 8, 9],
transforms: [{
type: 'groupby',
groups: [1, 1, 1, 2, 2, 2, 2]
}, {
type: 'filter',
target: [1, 2, 3, 4, 5, 6, 7],
operation: '<',
value: 6.5
}]
}]);

expect(out[0].x).toEqual([1, 2, 3]);
expect(out[0].y).toEqual([4, 6, 5]);

expect(out[1].x).toEqual([4, 5, 6]);
expect(out[1].y).toEqual([7, 6, 8]);
});

it('groupby + groupby', function() {
var out = _transform([{
x: [1, 2, 3, 4, 5, 6, 7, 8],
y: [4, 6, 5, 7, 6, 8, 9, 10],
transforms: [{
type: 'groupby',
groups: [1, 1, 1, 1, 2, 2, 2, 2]
}, {
type: 'groupby',
groups: [3, 4, 3, 4, 3, 4, 3, 5],
}]
}]);
// | | | | | | | |
// v v v v v v v v
// Trace number: 0 1 0 1 2 3 2 4

expect(out.length).toEqual(5);
expect(out[0].x).toEqual([1, 3]);
expect(out[1].x).toEqual([2, 4]);
expect(out[2].x).toEqual([5, 7]);
expect(out[3].x).toEqual([6]);
expect(out[4].x).toEqual([8]);
});

it('groupby + groupby + filter', function() {
var out = _transform([{
x: [1, 2, 3, 4, 5, 6, 7, 8],
y: [4, 6, 5, 7, 6, 8, 9, 10],
transforms: [{
type: 'groupby',
groups: [1, 1, 1, 1, 2, 2, 2, 2]
}, {
type: 'groupby',
groups: [3, 4, 3, 4, 3, 4, 3, 5],
}, {
type: 'filter',
target: [1, 2, 3, 4, 5, 6, 7, 8],
operation: '<',
value: 4.5
}]
}]);
// | | | | | | | |
// v v v v v v v v
// Trace number: 0 1 0 1 2 3 2 4

expect(out.length).toEqual(5);
expect(out[0].x).toEqual([1, 3]);
expect(out[1].x).toEqual([2, 4]);
expect(out[2].x).toEqual([]);
expect(out[3].x).toEqual([]);
expect(out[4].x).toEqual([]);
});

it('filter + filter', function() {
var out = _transform([{
x: [1, 2, 3, 4, 5, 6, 7],
y: [4, 6, 5, 7, 8, 9, 10],
transforms: [{
type: 'filter',
target: [1, 2, 3, 4, 5, 6, 7],
operation: '<',
value: 6.5
}, {
type: 'filter',
target: [1, 2, 3, 4, 5, 6, 7],
operation: '>',
value: 1.5
}]
}]);

expect(out[0].x).toEqual([2, 3, 4, 5, 6]);
expect(out[0].y).toEqual([6, 5, 7, 8, 9]);
});
});
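
The trace numbering annotated in the 'groupby + groupby' test can be reproduced with a small standalone sketch. It does not call plotly.js. `splitBy` is a hypothetical helper that assumes the second transform's `groups` array is split alongside the data by the first groupby, which is the behavior the test expects.

```javascript
// Split parallel arrays into one bucket per distinct group value,
// keeping groups in first-seen order.
function splitBy(groups, arrays) {
    var indexByGroup = {};
    var buckets = [];

    for(var j = 0; j < groups.length; j++) {
        var key = groups[j];
        if(indexByGroup[key] === undefined) {
            indexByGroup[key] = buckets.length;
            buckets.push(arrays.map(function() { return []; }));
        }
        var bucket = buckets[indexByGroup[key]];
        for(var k = 0; k < arrays.length; k++) {
            bucket[k].push(arrays[k][j]);
        }
    }

    return buckets;
}

var x = [1, 2, 3, 4, 5, 6, 7, 8];
var groups1 = [1, 1, 1, 1, 2, 2, 2, 2];
var groups2 = [3, 4, 3, 4, 3, 4, 3, 5];

var expandedX = [];
splitBy(groups1, [x, groups2]).forEach(function(first) {
    // first[0] is this group's x, first[1] is its slice of groups2
    splitBy(first[1], [first[0]]).forEach(function(second) {
        expandedX.push(second[0]);
    });
});
// expandedX: [[1, 3], [2, 4], [5, 7], [6], [8]]
```

The five resulting arrays line up with the `out[0].x` through `out[4].x` expectations in that test.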