Fix crud.select() + after behavior problems #295

oleg-jukovec · 2022-05-24T09:00:41Z

It seems to me that now there are logical errors with select + after:

Always a smaller key is choosed to start the select when has_keydef.
Unsymmetric behavior on ascending and descending order with ER/REQ.
select(nil) + after always traverse to the after from beginning.

The patch fixes the problems.

test/integration/select_test.lua

DifferentialOrange

Thank you for your patch! You've found some really amusing bugs and misbehavings.

Does it block PR about select(nil) warning or they exist independently?

Although the test behavior seems valid, I think that this PR should be additionally reviewed with Alexander since he seems to be more in context than me.

CHANGELOG.md

test/integration/select_test.lua

crud/select/executor.lua

oleg-jukovec · 2022-05-26T18:01:45Z

I have updated the pull request:

I added a fix for LE->LT (the first first).
I left only the necessary tests.
I tried to reformulate changelog entries and commit messages.
Now EQ/REQ are not replaced by GE/LE (the third commit).

oleg-jukovec · 2022-05-26T18:03:56Z

Does it block PR about select(nil) warning or they exist independently?

It does not. But after the pull request we can delete a doc entry:

**Note:**

Pagination with parameter ``after`` leads to a full scan even with the condition '=' or '==' on the index fields.
...

from #277.

CHANGELOG.md

test/unit/select_plan_test.lua

DifferentialOrange

LGTM after restoring CHANGLELOG entry and rebasing to master.

Please, request additional approve from Alexander.

README.md

Totktonada · 2022-06-09T21:20:06Z

General feedback

To be honest, it is not easy to spot ideas behind those commits. I'll not insist, but I think that more detailed descriptions would be a good step toward good query planner/executor.

I'll share my thoughts on how I would describe such changes.

Abstract example of description of planner/executor change

select: fix <...>

Let's consider the following space:

| id | age |
| -- | --- |
|  1 |  32 |
|  2 |  22 |
|  3 |  33 |
|  4 |  24 |
|  5 |  23 |

We're traverse over the space with sorting by the `age` field with the
following condition: `age <= 30`. The page size is 2.

To make the explanation simpler, let's look on the same table, but
sorted by `age`:

| id | age |
| -- | --- |
|  2 |  22 |
|  5 |  23 |
|  4 |  24 |
|  1 |  32 |
|  3 |  33 |

And filtered by `age <= 30`:

| id | age |
| -- | --- |
|  2 |  22 |
|  5 |  23 |
|  4 |  24 |

First, we got the first page:

> crud.select('<...>', {{'<=', 'age', 30}}, {first = 2})

| id | age |
| -- | --- |
|  2 |  22 |
|  5 |  23 |

Now we want to fetch the next page:

> crud.select('<...>', {{'<=', 'age', 30}}, {first = 2, after = {5, 23}})

| id | age |
| -- | --- |
|  4 |  24 |

Then we step back to the previous page:

> crud.select('<...>', {{'<=', 'age', 30}}, {first = -2, after = {4, 24}})

| id | age |
| -- | --- |
|  2 |  22 |
|  5 |  23 |

It is where the problem appears. Before the change the response was the
following:

| id | age |
| -- | --- |
|  2 |  22 |

Several details about the query planner and executor are important here.

1. The plan contains `foo` and `bar` fields, which normally corresponds
   to a first select condition backed by an index. However in presence
   of `after` it is changed to <...>. Extra post-filter is added to the
   `baz` field. A stop condition is added to the `fiz` field.
2. When `first` value is negative the planner performs yet another
   transformation: <...>.
3. The executor just uses `foo` and `bar` disregarding `after` and
   `first` presence. However if `after` is present, it <...>.

The problem was in the wrong transformation in the query planner: it
takes into account <...>, but missed the case, when <...>. The patch
fixes the planner to support both cases.

There is another scenario, which should be taken into account: if the
index is non-unique, then <...>. Let's consider the example table
(sorted by `age`):

| id | age |
| -- | --- |
|  2 |  22 |
|  6 |  22 |
|  7 |  22 |
|  5 |  23 |
|  4 |  24 |
|  1 |  32 |
|  3 |  33 |

If we're on the second page and want to get the first one:

> crud.select('<...>', {{'>=', 'age', 30}}, {first = -2, after = {7, 22}})

| id | age |
| -- | --- |
|  2 |  22 |
|  6 |  22 |

The result is anyway correct, because it is handled by the following
code in the executor:

<..code here..>

As result the changed code produces correct result here as well.

Maybe my example is too wordy, but I would like to see some explanations. Not necessary so large ones.

We discussed the changes with @oleg-jukovec and @DifferentialOrange and now I understood them better. I'll summarize my understanding and why I think that the changes are okay.

1. select: fix LE and after results when the condition value < after

If the index is non-unique, the page border can be in a middle of sequence tuples with the same values of the key. Before the fix we would jump over all those tuples. Now it works correctly.

It was not obvious on the first glance, why we can relax the condition and internally use <= instead of <. Naive pagination implementation would just use < to get the next page (I assume descending order, that's why it is the next page, not previous), but it would work only for unique indexes. (Partial keys are irrelevant here, because we always can extract a full key from the after tuple).

The executor performs scrolling to the after tuple in the scroll_to_after_tuple() function. So in the case of the unique index we'll lookup one more tuple. In case of non-unique index, the scrolling is inevitable (while tarantool/tarantool#5282 is not implemented).

There is generate_value() machinery in the executor: it replaces the key to start index scan with the key extracted from the after tuple. It actually performs the replacement when the key from the after tuple is after the key from the condition (with correspondence to basic direction imposed by conditions). IOW, when replacing the value allows to save some CPU cycles on linear scanning and pass the better start value to the index scan (the index works logarithmically).

Since < (before the patch) and <= (after the patch) index iterator is applied to the resulting scan value, including one we get from the generate_value() function, we should make sure that the proposed change works good for both cases: when we replaced the scan value and when we don't.

When we replaced it, scroll_to_after_tuple() will compensate it. Okay.

When we don't do that, the after tuple is either before the original scan value (and so scroll_to_after_tuple() will be no-op) or we have no after tuple and the < to <= replacement was not performed in the query planner. Okay too.

The patch looks correct for me.

Here we have the headroom for a little optimization:

Use > and < for paginating over an unique index (when after is passed).

I would like to see this possibility mentioned somewhere in developer docs (say, in the scope of #292).

2. select: fix traverse start key in case condition + after

It's logical bugfix for travising in the descending order (< or <= in the first indexed condition). Okay.

3. select: fix EQ/REQ and after results

Wow. This change is not simple. I'll put here how I explained it for myself (based on the code and what Oleg tells me on the meeting). Some context is necessary here.

How a EQ/REQ condition + after was handled before the patch?

The condition is added as a post-filter.

Technically it is performed by this block of the code (from plan.new()):

-- Since we use pagination we should continue iteration since after tuple.
-- Such iterator could be run only with index:pairs(key(after_tuple), 'GT/LT').
-- To preserve original condition we should manually inject it in `filter_conditions`
-- See function `parse` in crud/select/filters.lua file for details.
if scan_after_tuple ~= nil then
    if scan_iter == box.index.REQ or scan_iter == box.index.EQ then
        table.insert(conditions, conditions[scan_condition_num])
    end
end

We add the condition once again into the array of conditions. It is the trick to overrun the following if-condition in the parse() function in crud/compare/filters.lua:

local filter_conditions = {}
for i, condition in ipairs(conditions) do
    if i ~= opts.scan_condition_num then          -- <-- THIS ONE
        <...>
        table.insert(filter_conditions, {<...>})
    end
end

So the condition is present twice in the array of conditions, but ignored only once.

The iterator is changed to GE or LT (yes, LE should be here, at least for non-unique index; it's a bug).
A stop condition should logically stop the traversal if we met a tuple that does not correspond to the given EQ/REQ condition. However we have no code that would do that. It is the bug.

See the is_early_exit_possible() function in crud/compare/filters.lua. There is no condition_iter == box.index.EQ or condition_iter == box.index.REQ case.

I guess we can add it, but it should be enabled only for the condition that corresponds to the index we iterate over. Adding it for arbitrary condition would make usual filtering conditions be stop conditions, it is incorrect.

There are several problems in this algorithm:

LT instead of LE.
No stop condition.
The 'potentially long select' warning (PR Log empty or nil select calls #277) is confused by the GE/LT in the query plan.

The first problem is the trivial. We can fix the remaining problems in different ways:

Mark stop conditions explicitly in the query planner and use this information in is_early_exit_possible(). Mark 'GE bounded by EQ' and 'LE bounded by REQ' queries in the query plan to use this information in the code that emits the warning.
Don't add the post-filter, don't replace EQ/REQ with GE/LE, don't replace the key from the first index-backed condition with the key extracted from the after tuple.

The second approach was choosen here. As I know from Oleg's words, he tried to implement something in the spirit of the first one, but met problems.

There are downsides in the choosen approach. We always start from the same key from conditions and linearly traverse the tuples to the after tuple. It is possible to have many same-by-the-key tuples in the case of partial key or non-unique index. While it is inevitable for a non-unique index (till we'll have tarantool/tarantool#5282 implemented), it is technically possible to extract the full key from the after tuple (in case of the partial key in the condition).

I guess that the requests with partial keys and non-unique index with large sequences of the-same-key tuples are rare, so I can accept it as the temporary solution. Please, file an issue regarding the problem described above.

I'll also add an inline comment regarding the implementation (see below).

To sum up: this patch is okay as the temporary solution.

4. select: improve behavior of select(nil) + after

That's nice.

5. doc: fix crud.select the first parameter desription

It's okay.

(Typo in the commit header: desription -> description.)

Summary

The patchset is okay for me.

I expect that you'll work on the patch descriptions to some extent (until it'll be too tiresome), that you'll file a follow up issue regarding pagination with the partial key condition (see comments to the the fourth patch), that you'll mention the little optimization opportunity for unique index scan during work on #292 (see comments to the first patch).

crud/select/executor.lua

Totktonada · 2022-06-09T21:33:10Z

What also possibly worth to mention in the developer documentation (#292): the key replacement¹ code is spread across the query planner and the executor. There is the code that handles the first < 0 case in the query planner:

if opts.first ~= nil then
    total_tuples_count = math.abs(opts.first)

    if opts.first < 0 then
        scan_iter = utils.invert_tarantool_iter(scan_iter)

        -- scan condition becomes border consition
        scan_condition_num = nil

        if scan_after_tuple ~= nil then                                -- <-- FROM HERE
            if has_keydef then
                local key_def = keydef_lib.new(scan_index.parts)
                scan_value = key_def:extract_key(scan_after_tuple)
            else
                scan_value = utils.extract_key(scan_after_tuple, scan_index.parts)
            end
        else
            scan_value = nil
        end                                                            -- <-- TO HERE
    end
end

However the tightly coupled (or even duplicating) code is in the generate_value() in the executor. That's strange. We should at least explain it or, better, rearrange the code to make it less strange.

Replace the key extracted from the first condition backed by an index with the key extracted from the after tuple. ↩

oleg-jukovec · 2022-06-10T09:00:02Z

There is the code that handles the first < 0 case in the query planner:

it's for case:

iterator: >
scan value: 1
after value: 5

after the code:

iterator: <
scan value: 5
after value: 5
filter (and stop condition): > 1

The executor at now just choose a bigger value from scan value and after value to start execution from a nearest value to expected results. There is some logic in that, but it seems to me that the planner code and the executor code are too closely depend on each other. Maybe I will do a small refactoring in #292 to make it clearer.

LE with after has always been replaced by LT. The idea was to start traverse not from the after value, but from a next value. The micro-optimization allowed to fetch one less tuple per time. But this caused the condition value to be skipped if the value is less than the after value.

A comparator that choosed a starting key for select from a condition value and after always returned a smaller one. This led to an unsymmetric behavior depending on the direction of a traverse. Plus conditions didn't work correctly in all cases.

crud.select reverses order of the results from the cluster in case of negative first. It is necessary because the pagination results should be in the same order when moving in both directions. The reverse function did not work correctly.

EQ with after has always been replaced by LT. The idea was to start traverse not from the after value, but from a next value. The micro-optimization allowed to fetch one less tuple per time. It worked well for unique indexes, but this led to wrong results for non-unique indexes.

Before the patch we copy an original EQ/REQ condition to post-filter and change the original to GE/LE. The problem is that EQ/REQ can't be a stop condition. As a result EQ/REQ + after unexpectedly became a full scan. The patch removes this behavior, the query becomes more expected and in base executes as regular EQ/REQ. We had an idea to replace EQ with GE+LE. This would help to skip less tuples in the executor inside scroll_to_after_tuple. But unfortunately, due to a bug[1], this will break the current behavior. So we need to fix the bug first. In additional, the 'potentially long select'[2] warning is confused by the GE/LE in the query plan. This patch only changes the internal behavior and does not affect a client code. 1. #301 2. #277

Previously, the traverse always started from the beginning of the index, now after the after key.

oleg-jukovec · 2022-06-10T11:43:14Z

@Totktonada Thank you, I've updated the pull request:

Rewrote the commit message for the LT->LE commit.
Split EQ/REQ commit into two separate (the bug fix + change in an internal behavior). Rewrote the commit messages.
Added more examples in CHANGELOG.md for non-obvious cases.
Added a new fix, see commit select: fix result order with negative first.

I didn't go into too much detail in the commit messages to keep them short as possible. But I hope that the patches have become clearer together with the examples in CHANGELOG.md. I will write more detailed documentation about the planner/executor logic in #292.

But I can add something if you think it is necessary.

Totktonada · 2022-06-10T17:15:02Z

I guess that the requests with partial keys and non-unique index with large sequences of the-same-key tuples are rare, so I can accept it as the temporary solution. Please, file an issue regarding the problem described above.

It is the only thing that I expect from you.

The patchset looks okay for me.

Overview This is a bugfix release. Several cases of pagination using select with after was fixed or improved. Warning has been added to potentially long select and count calls. Storage select errors were reworked to be consistent with other calls errors. Breaking changes There are no breaking changes in the release. New features * Optimize `crud.select()` without conditions and with `after` (PR #295). * A critical log entry containing the current stack traceback is created upon potentially long `select` and `count` calls — an user can explicitly request a full scan through by passing `fullscan=true` to `select` or `count` options table argument in which case a log entry will not be created (#276). Bugfixes * Make select error description more informative when merger built-in module or tuple-merger external module is used in case of disabled/uninit storage (#229). * `crud.select()` if a condition is '<=' and it's value < `after` (PR #295). Previous releases: tarantool> crud.select('developers', > {{'<=', 'id', 3}}, {first = 10, after = rows[5]}).rows --- - - [2, 401, 'Sergey', 'Allred', 21] - [1, 477, 'Alexey', 'Adams', 20] ... After this release: tarantool> crud.select('developers', > {{'<=', 'id', 3}}, {first = 10, after = rows[5]}).rows --- - - [3, 2804, 'Pavel', 'Adams', 27] - [2, 401, 'Sergey', 'Allred', 21] - [1, 477, 'Alexey', 'Adams', 20] ... * `crud.select()` filtration by a first condition if the condition is '>' or '>=' and it's value > `after` (PR #295). Previous releases: tarantool> crud.select('developers', > {{'>=', 'id', 5}}, {first = 10, after = rows[2]}).rows --- - - [3, 2804, 'Pavel', 'Adams', 27] - [4, 1161, 'Mikhail', 'Liston', 51] - [5, 1172, 'Dmitry', 'Jacobi', 16] - [6, 1064, 'Alexey', 'Sidorov', 31] ... After this release: tarantool> crud.select('developers', > {{'>=', 'id', 5}}, {first = 10, after = rows[2]}).rows --- - [5, 1172, 'Dmitry', 'Jacobi', 16] - [6, 1064, 'Alexey', 'Sidorov', 31] ... * `crud.select()` results order with negative `first` (PR #295). * `crud.select()` if a condition is '=' or '==' with negative `first` (PR #295). Suppose we have a non-unique secondary index by the field `age` field and a space: tarantool> crud.select('developers', nil, {first = 10}) --- - metadata: [{'name': 'id', 'type': 'unsigned'}, {'name': 'bucket_id', 'type': 'unsigned'}, {'name': 'name', 'type': 'string'}, {'name': 'surname', 'type': 'string'}, {'name': 'age', 'type': 'number'}] rows: - [1, 477, 'Alexey', 'Adams', 20] - [2, 401, 'Sergey', 'Allred', 27] - [3, 2804, 'Pavel', 'Adams', 27] - [4, 1161, 'Mikhail', 'Liston', 27] - [5, 1172, 'Dmitry', 'Jacobi', 27] - [6, 1064, 'Alexey', 'Sidorov', 31] - null ... Previous releases: tarantool> crud.select('developers', > {{'=', 'age', 27}}, {first = -10, after = rows[4]}).rows --- - [] ... After this release: tarantool> crud.select('developers', > {{'=', 'age', 27}}, {first = -10, after = rows[4]}).rows --- - - [2, 401, 'Sergey', 'Allred', 27] - [3, 2804, 'Pavel', 'Adams', 27] ...

oleg-jukovec requested review from Totktonada and DifferentialOrange May 24, 2022 09:00

oleg-jukovec changed the title ~~Oleg jukovec/fix after behavior~~ Fix crud.select() + after behavior problems May 24, 2022

DifferentialOrange reviewed May 25, 2022

View reviewed changes

test/integration/select_test.lua Outdated Show resolved Hide resolved

test/integration/select_test.lua Outdated Show resolved Hide resolved

test/integration/select_test.lua Outdated Show resolved Hide resolved

oleg-jukovec force-pushed the oleg-jukovec/fix-after-behavior branch 3 times, most recently from 7d506d4 to caec420 Compare May 25, 2022 08:31

DifferentialOrange reviewed May 26, 2022

View reviewed changes

CHANGELOG.md Outdated Show resolved Hide resolved

test/integration/select_test.lua Outdated Show resolved Hide resolved

crud/select/executor.lua Show resolved Hide resolved

oleg-jukovec force-pushed the oleg-jukovec/fix-after-behavior branch 7 times, most recently from 29a1fd7 to 60ffb84 Compare May 26, 2022 17:55

oleg-jukovec mentioned this pull request May 27, 2022

Log empty or nil select calls #277

Merged

DifferentialOrange reviewed May 27, 2022

View reviewed changes

CHANGELOG.md Outdated Show resolved Hide resolved

DifferentialOrange reviewed May 27, 2022

View reviewed changes

test/unit/select_plan_test.lua Outdated Show resolved Hide resolved

oleg-jukovec force-pushed the oleg-jukovec/fix-after-behavior branch 2 times, most recently from b9fd43f to 29da4ae Compare May 27, 2022 14:19

DifferentialOrange approved these changes May 30, 2022

View reviewed changes

oleg-jukovec force-pushed the oleg-jukovec/fix-after-behavior branch from 29da4ae to 4433dd3 Compare May 30, 2022 10:42

Totktonada reviewed Jun 8, 2022

View reviewed changes

README.md Outdated Show resolved Hide resolved

oleg-jukovec force-pushed the oleg-jukovec/fix-after-behavior branch from 4433dd3 to a556645 Compare June 9, 2022 05:56

Totktonada reviewed Jun 9, 2022

View reviewed changes

crud/select/executor.lua Outdated Show resolved Hide resolved

Totktonada approved these changes Jun 9, 2022

View reviewed changes

oleg-jukovec force-pushed the oleg-jukovec/fix-after-behavior branch 2 times, most recently from daaa534 to 2ccfa53 Compare June 10, 2022 08:39

oleg-jukovec added 7 commits June 10, 2022 13:34

select: fix result order with negative first

c189595

crud.select reverses order of the results from the cluster in case of negative first. It is necessary because the pagination results should be in the same order when moving in both directions. The reverse function did not work correctly.

select: improve behavior of select(nil) + after

58efb65

Previously, the traverse always started from the beginning of the index, now after the after key.

doc: fix crud.select the first parameter desription

ea9c588

oleg-jukovec force-pushed the oleg-jukovec/fix-after-behavior branch from 2ccfa53 to ea9c588 Compare June 10, 2022 11:32

DifferentialOrange merged commit 68b7820 into master Jun 15, 2022

DifferentialOrange deleted the oleg-jukovec/fix-after-behavior branch June 15, 2022 06:41

DifferentialOrange mentioned this pull request Jun 15, 2022

Release 0.11.3 #303

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix crud.select() + after behavior problems #295

Fix crud.select() + after behavior problems #295

Uh oh!

oleg-jukovec commented May 24, 2022 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

DifferentialOrange left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

oleg-jukovec commented May 26, 2022

Uh oh!

oleg-jukovec commented May 26, 2022

Uh oh!

Uh oh!

Uh oh!

DifferentialOrange left a comment

Uh oh!

Uh oh!

Totktonada commented Jun 9, 2022

Uh oh!

Uh oh!

Totktonada commented Jun 9, 2022

Uh oh!

oleg-jukovec commented Jun 10, 2022

Uh oh!

oleg-jukovec commented Jun 10, 2022

Uh oh!

Totktonada commented Jun 10, 2022

Uh oh!

Uh oh!

Fix crud.select() + after behavior problems #295

Fix crud.select() + after behavior problems #295

Uh oh!

Conversation

oleg-jukovec commented May 24, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

DifferentialOrange left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

oleg-jukovec commented May 26, 2022

Uh oh!

oleg-jukovec commented May 26, 2022

Uh oh!

Uh oh!

Uh oh!

DifferentialOrange left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Totktonada commented Jun 9, 2022

General feedback

1. select: fix LE and after results when the condition value < after

2. select: fix traverse start key in case condition + after

3. select: fix EQ/REQ and after results

4. select: improve behavior of select(nil) + after

5. doc: fix crud.select the first parameter desription

Summary

Uh oh!

Uh oh!

Totktonada commented Jun 9, 2022

Footnotes

Uh oh!

oleg-jukovec commented Jun 10, 2022

Uh oh!

oleg-jukovec commented Jun 10, 2022

Uh oh!

Totktonada commented Jun 10, 2022

Uh oh!

Uh oh!

oleg-jukovec commented May 24, 2022 •

edited

Loading