Skip to content

Fix crud.select() + after behavior problems #295

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jun 15, 2022

Conversation

oleg-jukovec
Copy link
Contributor

@oleg-jukovec oleg-jukovec commented May 24, 2022

It seems to me that now there are logical errors with select + after:

  1. Always a smaller key is choosed to start the select when has_keydef.
  2. Unsymmetric behavior on ascending and descending order with ER/REQ.
  3. select(nil) + after always traverse to the after from beginning.

The patch fixes the problems.

@oleg-jukovec oleg-jukovec changed the title Oleg jukovec/fix after behavior Fix crud.select() + after behavior problems May 24, 2022
@oleg-jukovec oleg-jukovec force-pushed the oleg-jukovec/fix-after-behavior branch 3 times, most recently from 7d506d4 to caec420 Compare May 25, 2022 08:31
Copy link
Member

@DifferentialOrange DifferentialOrange left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your patch! You've found some really amusing bugs and misbehavings.

Does it block PR about select(nil) warning or they exist independently?

Although the test behavior seems valid, I think that this PR should be additionally reviewed with Alexander since he seems to be more in context than me.

@oleg-jukovec oleg-jukovec force-pushed the oleg-jukovec/fix-after-behavior branch 7 times, most recently from 29a1fd7 to 60ffb84 Compare May 26, 2022 17:55
@oleg-jukovec
Copy link
Contributor Author

I have updated the pull request:

  • I added a fix for LE->LT (the first first).
  • I left only the necessary tests.
  • I tried to reformulate changelog entries and commit messages.
  • Now EQ/REQ are not replaced by GE/LE (the third commit).

@oleg-jukovec
Copy link
Contributor Author

Does it block PR about select(nil) warning or they exist independently?

It does not. But after the pull request we can delete a doc entry:

**Note:**

Pagination with parameter ``after`` leads to a full scan even with the condition '=' or '==' on the index fields.
...

from #277.

@oleg-jukovec oleg-jukovec force-pushed the oleg-jukovec/fix-after-behavior branch 2 times, most recently from b9fd43f to 29da4ae Compare May 27, 2022 14:19
Copy link
Member

@DifferentialOrange DifferentialOrange left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after restoring CHANGLELOG entry and rebasing to master.

Please, request additional approve from Alexander.

@oleg-jukovec oleg-jukovec force-pushed the oleg-jukovec/fix-after-behavior branch from 29da4ae to 4433dd3 Compare May 30, 2022 10:42
@oleg-jukovec oleg-jukovec force-pushed the oleg-jukovec/fix-after-behavior branch from 4433dd3 to a556645 Compare June 9, 2022 05:56
@Totktonada
Copy link
Member

General feedback

To be honest, it is not easy to spot ideas behind those commits. I'll not insist, but I think that more detailed descriptions would be a good step toward good query planner/executor.

I'll share my thoughts on how I would describe such changes.

Abstract example of description of planner/executor change
select: fix <...>

Let's consider the following space:

| id | age |
| -- | --- |
|  1 |  32 |
|  2 |  22 |
|  3 |  33 |
|  4 |  24 |
|  5 |  23 |

We're traverse over the space with sorting by the `age` field with the
following condition: `age <= 30`. The page size is 2.

To make the explanation simpler, let's look on the same table, but
sorted by `age`:

| id | age |
| -- | --- |
|  2 |  22 |
|  5 |  23 |
|  4 |  24 |
|  1 |  32 |
|  3 |  33 |

And filtered by `age <= 30`:

| id | age |
| -- | --- |
|  2 |  22 |
|  5 |  23 |
|  4 |  24 |

First, we got the first page:

> crud.select('<...>', {{'<=', 'age', 30}}, {first = 2})

| id | age |
| -- | --- |
|  2 |  22 |
|  5 |  23 |

Now we want to fetch the next page:

> crud.select('<...>', {{'<=', 'age', 30}}, {first = 2, after = {5, 23}})

| id | age |
| -- | --- |
|  4 |  24 |

Then we step back to the previous page:

> crud.select('<...>', {{'<=', 'age', 30}}, {first = -2, after = {4, 24}})

| id | age |
| -- | --- |
|  2 |  22 |
|  5 |  23 |

It is where the problem appears. Before the change the response was the
following:

| id | age |
| -- | --- |
|  2 |  22 |

Several details about the query planner and executor are important here.

1. The plan contains `foo` and `bar` fields, which normally corresponds
   to a first select condition backed by an index. However in presence
   of `after` it is changed to <...>. Extra post-filter is added to the
   `baz` field. A stop condition is added to the `fiz` field.
2. When `first` value is negative the planner performs yet another
   transformation: <...>.
3. The executor just uses `foo` and `bar` disregarding `after` and
   `first` presence. However if `after` is present, it <...>.

The problem was in the wrong transformation in the query planner: it
takes into account <...>, but missed the case, when <...>. The patch
fixes the planner to support both cases.

There is another scenario, which should be taken into account: if the
index is non-unique, then <...>. Let's consider the example table
(sorted by `age`):

| id | age |
| -- | --- |
|  2 |  22 |
|  6 |  22 |
|  7 |  22 |
|  5 |  23 |
|  4 |  24 |
|  1 |  32 |
|  3 |  33 |

If we're on the second page and want to get the first one:

> crud.select('<...>', {{'>=', 'age', 30}}, {first = -2, after = {7, 22}})

| id | age |
| -- | --- |
|  2 |  22 |
|  6 |  22 |

The result is anyway correct, because it is handled by the following
code in the executor:

<..code here..>

As result the changed code produces correct result here as well.

Maybe my example is too wordy, but I would like to see some explanations. Not necessary so large ones.

We discussed the changes with @oleg-jukovec and @DifferentialOrange and now I understood them better. I'll summarize my understanding and why I think that the changes are okay.

1. select: fix LE and after results when the condition value < after

If the index is non-unique, the page border can be in a middle of sequence tuples with the same values of the key. Before the fix we would jump over all those tuples. Now it works correctly.

It was not obvious on the first glance, why we can relax the condition and internally use <= instead of <. Naive pagination implementation would just use < to get the next page (I assume descending order, that's why it is the next page, not previous), but it would work only for unique indexes. (Partial keys are irrelevant here, because we always can extract a full key from the after tuple).

The executor performs scrolling to the after tuple in the scroll_to_after_tuple() function. So in the case of the unique index we'll lookup one more tuple. In case of non-unique index, the scrolling is inevitable (while tarantool/tarantool#5282 is not implemented).

There is generate_value() machinery in the executor: it replaces the key to start index scan with the key extracted from the after tuple. It actually performs the replacement when the key from the after tuple is after the key from the condition (with correspondence to basic direction imposed by conditions). IOW, when replacing the value allows to save some CPU cycles on linear scanning and pass the better start value to the index scan (the index works logarithmically).

Since < (before the patch) and <= (after the patch) index iterator is applied to the resulting scan value, including one we get from the generate_value() function, we should make sure that the proposed change works good for both cases: when we replaced the scan value and when we don't.

When we replaced it, scroll_to_after_tuple() will compensate it. Okay.

When we don't do that, the after tuple is either before the original scan value (and so scroll_to_after_tuple() will be no-op) or we have no after tuple and the < to <= replacement was not performed in the query planner. Okay too.

The patch looks correct for me.

Here we have the headroom for a little optimization:

  • Use > and < for paginating over an unique index (when after is passed).

I would like to see this possibility mentioned somewhere in developer docs (say, in the scope of #292).

2. select: fix traverse start key in case condition + after

It's logical bugfix for travising in the descending order (< or <= in the first indexed condition). Okay.

3. select: fix EQ/REQ and after results

Wow. This change is not simple. I'll put here how I explained it for myself (based on the code and what Oleg tells me on the meeting). Some context is necessary here.

How a EQ/REQ condition + after was handled before the patch?

  • The condition is added as a post-filter.

    Technically it is performed by this block of the code (from plan.new()):

    -- Since we use pagination we should continue iteration since after tuple.
    -- Such iterator could be run only with index:pairs(key(after_tuple), 'GT/LT').
    -- To preserve original condition we should manually inject it in `filter_conditions`
    -- See function `parse` in crud/select/filters.lua file for details.
    if scan_after_tuple ~= nil then
        if scan_iter == box.index.REQ or scan_iter == box.index.EQ then
            table.insert(conditions, conditions[scan_condition_num])
        end
    end

    We add the condition once again into the array of conditions. It is the trick to overrun the following if-condition in the parse() function in crud/compare/filters.lua:

    local filter_conditions = {}
    for i, condition in ipairs(conditions) do
        if i ~= opts.scan_condition_num then          -- <-- THIS ONE
            <...>
            table.insert(filter_conditions, {<...>})
        end
    end

    So the condition is present twice in the array of conditions, but ignored only once.

  • The iterator is changed to GE or LT (yes, LE should be here, at least for non-unique index; it's a bug).

  • A stop condition should logically stop the traversal if we met a tuple that does not correspond to the given EQ/REQ condition. However we have no code that would do that. It is the bug.

    See the is_early_exit_possible() function in crud/compare/filters.lua. There is no condition_iter == box.index.EQ or condition_iter == box.index.REQ case.

    I guess we can add it, but it should be enabled only for the condition that corresponds to the index we iterate over. Adding it for arbitrary condition would make usual filtering conditions be stop conditions, it is incorrect.

There are several problems in this algorithm:

The first problem is the trivial. We can fix the remaining problems in different ways:

  • Mark stop conditions explicitly in the query planner and use this information in is_early_exit_possible(). Mark 'GE bounded by EQ' and 'LE bounded by REQ' queries in the query plan to use this information in the code that emits the warning.
  • Don't add the post-filter, don't replace EQ/REQ with GE/LE, don't replace the key from the first index-backed condition with the key extracted from the after tuple.

The second approach was choosen here. As I know from Oleg's words, he tried to implement something in the spirit of the first one, but met problems.

There are downsides in the choosen approach. We always start from the same key from conditions and linearly traverse the tuples to the after tuple. It is possible to have many same-by-the-key tuples in the case of partial key or non-unique index. While it is inevitable for a non-unique index (till we'll have tarantool/tarantool#5282 implemented), it is technically possible to extract the full key from the after tuple (in case of the partial key in the condition).

I guess that the requests with partial keys and non-unique index with large sequences of the-same-key tuples are rare, so I can accept it as the temporary solution. Please, file an issue regarding the problem described above.

I'll also add an inline comment regarding the implementation (see below).

To sum up: this patch is okay as the temporary solution.

4. select: improve behavior of select(nil) + after

That's nice.

5. doc: fix crud.select the first parameter desription

It's okay.

(Typo in the commit header: desription -> description.)

Summary

The patchset is okay for me.

I expect that you'll work on the patch descriptions to some extent (until it'll be too tiresome), that you'll file a follow up issue regarding pagination with the partial key condition (see comments to the the fourth patch), that you'll mention the little optimization opportunity for unique index scan during work on #292 (see comments to the first patch).

@Totktonada
Copy link
Member

What also possibly worth to mention in the developer documentation (#292): the key replacement1 code is spread across the query planner and the executor. There is the code that handles the first < 0 case in the query planner:

if opts.first ~= nil then
    total_tuples_count = math.abs(opts.first)

    if opts.first < 0 then
        scan_iter = utils.invert_tarantool_iter(scan_iter)

        -- scan condition becomes border consition
        scan_condition_num = nil

        if scan_after_tuple ~= nil then                                -- <-- FROM HERE
            if has_keydef then
                local key_def = keydef_lib.new(scan_index.parts)
                scan_value = key_def:extract_key(scan_after_tuple)
            else
                scan_value = utils.extract_key(scan_after_tuple, scan_index.parts)
            end
        else
            scan_value = nil
        end                                                            -- <-- TO HERE
    end
end

However the tightly coupled (or even duplicating) code is in the generate_value() in the executor. That's strange. We should at least explain it or, better, rearrange the code to make it less strange.

Footnotes

  1. Replace the key extracted from the first condition backed by an index with the key extracted from the after tuple.

@oleg-jukovec oleg-jukovec force-pushed the oleg-jukovec/fix-after-behavior branch 2 times, most recently from daaa534 to 2ccfa53 Compare June 10, 2022 08:39
@oleg-jukovec
Copy link
Contributor Author

There is the code that handles the first < 0 case in the query planner:

it's for case:

iterator: >
scan value: 1
after value: 5

after the code:

iterator: <
scan value: 5
after value: 5
filter (and stop condition): > 1

The executor at now just choose a bigger value from scan value and after value to start execution from a nearest value to expected results. There is some logic in that, but it seems to me that the planner code and the executor code are too closely depend on each other. Maybe I will do a small refactoring in #292 to make it clearer.

LE with after has always been replaced by LT. The idea was to start
traverse not from the after value, but from a next value. The
micro-optimization allowed to fetch one less tuple per time.

But this caused the condition value to be skipped if the value is less
than the after value.
A comparator that choosed a starting key for select from a condition
value and after always returned a smaller one. This led to an
unsymmetric behavior depending on the direction of a traverse. Plus
conditions didn't work correctly in all cases.
crud.select reverses order of the results from the cluster in case
of negative first. It is necessary because the pagination results
should be in the same order when moving in both directions. The reverse
function did not work correctly.
EQ with after has always been replaced by LT. The idea was to start
traverse not from the after value, but from a next value. The
micro-optimization allowed to fetch one less tuple per time.

It worked well for unique indexes, but this led to wrong results for
non-unique indexes.
Before the patch we copy an original EQ/REQ condition to post-filter
and change the original to GE/LE. The problem is that EQ/REQ can't be
a stop condition. As a result EQ/REQ + after unexpectedly became a
full scan.

The patch removes this behavior, the query becomes more expected and
in base executes as regular EQ/REQ.

We had an idea to replace EQ with GE+LE. This would help to skip less
tuples in the executor inside scroll_to_after_tuple. But unfortunately,
due to a bug[1], this will break the current behavior. So we need to
fix the bug first.

In additional, the 'potentially long select'[2] warning is confused by
the GE/LE in the query plan.

This patch only changes the internal behavior and does not affect a
client code.

1. #301
2. #277
Previously, the traverse always started from the beginning of the index,
now after the after key.
@oleg-jukovec oleg-jukovec force-pushed the oleg-jukovec/fix-after-behavior branch from 2ccfa53 to ea9c588 Compare June 10, 2022 11:32
@oleg-jukovec
Copy link
Contributor Author

@Totktonada Thank you, I've updated the pull request:

  1. Rewrote the commit message for the LT->LE commit.
  2. Split EQ/REQ commit into two separate (the bug fix + change in an internal behavior). Rewrote the commit messages.
  3. Added more examples in CHANGELOG.md for non-obvious cases.
  4. Added a new fix, see commit select: fix result order with negative first.

I didn't go into too much detail in the commit messages to keep them short as possible. But I hope that the patches have become clearer together with the examples in CHANGELOG.md. I will write more detailed documentation about the planner/executor logic in #292.

But I can add something if you think it is necessary.

@Totktonada
Copy link
Member

I guess that the requests with partial keys and non-unique index with large sequences of the-same-key tuples are rare, so I can accept it as the temporary solution. Please, file an issue regarding the problem described above.

It is the only thing that I expect from you.

The patchset looks okay for me.

@DifferentialOrange DifferentialOrange merged commit 68b7820 into master Jun 15, 2022
@DifferentialOrange DifferentialOrange deleted the oleg-jukovec/fix-after-behavior branch June 15, 2022 06:41
DifferentialOrange added a commit that referenced this pull request Jun 15, 2022
Overview

    This is a bugfix release. Several cases of pagination using select
    with after was fixed or improved. Warning has been added to
    potentially long select and count calls. Storage select errors
    were reworked to be consistent with other calls errors.

Breaking changes

    There are no breaking changes in the release.

New features

    * Optimize `crud.select()` without conditions and with
    `after` (PR #295).

    * A critical log entry containing the current stack traceback is
    created upon potentially long `select` and `count` calls —
    an user can explicitly request a full scan through by passing
    `fullscan=true` to `select` or `count` options table argument
    in which case a log entry will not be created (#276).

Bugfixes

    * Make select error description more informative when merger
    built-in module or tuple-merger external module is used
    in case of disabled/uninit storage (#229).

    * `crud.select()` if a condition is '<=' and it's
    value < `after` (PR #295).

    Previous releases:

    tarantool> crud.select('developers',
             > {{'<=', 'id', 3}}, {first = 10, after = rows[5]}).rows
    ---
    - - [2, 401, 'Sergey', 'Allred', 21]
      - [1, 477, 'Alexey', 'Adams', 20]
    ...

    After this release:

    tarantool> crud.select('developers',
             > {{'<=', 'id', 3}}, {first = 10, after = rows[5]}).rows
    ---
    - - [3, 2804, 'Pavel', 'Adams', 27]
      - [2, 401, 'Sergey', 'Allred', 21]
      - [1, 477, 'Alexey', 'Adams', 20]
    ...

    * `crud.select()` filtration by a first condition if the condition
    is '>' or '>=' and it's value > `after` (PR #295).

    Previous releases:

    tarantool> crud.select('developers',
             > {{'>=', 'id', 5}}, {first = 10, after = rows[2]}).rows
    ---
    - - [3, 2804, 'Pavel', 'Adams', 27]
      - [4, 1161, 'Mikhail', 'Liston', 51]
      - [5, 1172, 'Dmitry', 'Jacobi', 16]
      - [6, 1064, 'Alexey', 'Sidorov', 31]
    ...

    After this release:

    tarantool> crud.select('developers',
             > {{'>=', 'id', 5}}, {first = 10, after = rows[2]}).rows
    ---
      - [5, 1172, 'Dmitry', 'Jacobi', 16]
      - [6, 1064, 'Alexey', 'Sidorov', 31]
    ...

    * `crud.select()` results order with negative `first` (PR #295).
    * `crud.select()` if a condition is '=' or '==' with
    negative `first` (PR #295).

    Suppose we have a non-unique secondary index by the field `age`
    field and a space:

    tarantool> crud.select('developers', nil, {first = 10})
    ---
    - metadata: [{'name': 'id', 'type': 'unsigned'},
        {'name': 'bucket_id', 'type': 'unsigned'},
        {'name': 'name', 'type': 'string'},
        {'name': 'surname', 'type': 'string'},
        {'name': 'age', 'type': 'number'}]
      rows:
      - [1, 477, 'Alexey', 'Adams', 20]
      - [2, 401, 'Sergey', 'Allred', 27]
      - [3, 2804, 'Pavel', 'Adams', 27]
      - [4, 1161, 'Mikhail', 'Liston', 27]
      - [5, 1172, 'Dmitry', 'Jacobi', 27]
      - [6, 1064, 'Alexey', 'Sidorov', 31]
    - null
    ...

    Previous releases:

    tarantool> crud.select('developers',
             > {{'=', 'age', 27}}, {first = -10, after = rows[4]}).rows
    ---
    - []
    ...

    After this release:

    tarantool> crud.select('developers',
             > {{'=', 'age', 27}}, {first = -10, after = rows[4]}).rows
    ---
    - - [2, 401, 'Sergey', 'Allred', 27]
      - [3, 2804, 'Pavel', 'Adams', 27]
    ...
DifferentialOrange added a commit that referenced this pull request Jun 15, 2022
Overview

    This is a bugfix release. Several cases of pagination using select
    with after was fixed or improved. Warning has been added to
    potentially long select and count calls. Storage select errors
    were reworked to be consistent with other calls errors.

Breaking changes

    There are no breaking changes in the release.

New features

    * Optimize `crud.select()` without conditions and with
    `after` (PR #295).

    * A critical log entry containing the current stack traceback is
    created upon potentially long `select` and `count` calls —
    an user can explicitly request a full scan through by passing
    `fullscan=true` to `select` or `count` options table argument
    in which case a log entry will not be created (#276).

Bugfixes

    * Make select error description more informative when merger
    built-in module or tuple-merger external module is used
    in case of disabled/uninit storage (#229).

    * `crud.select()` if a condition is '<=' and it's
    value < `after` (PR #295).

    Previous releases:

    tarantool> crud.select('developers',
             > {{'<=', 'id', 3}}, {first = 10, after = rows[5]}).rows
    ---
    - - [2, 401, 'Sergey', 'Allred', 21]
      - [1, 477, 'Alexey', 'Adams', 20]
    ...

    After this release:

    tarantool> crud.select('developers',
             > {{'<=', 'id', 3}}, {first = 10, after = rows[5]}).rows
    ---
    - - [3, 2804, 'Pavel', 'Adams', 27]
      - [2, 401, 'Sergey', 'Allred', 21]
      - [1, 477, 'Alexey', 'Adams', 20]
    ...

    * `crud.select()` filtration by a first condition if the condition
    is '>' or '>=' and it's value > `after` (PR #295).

    Previous releases:

    tarantool> crud.select('developers',
             > {{'>=', 'id', 5}}, {first = 10, after = rows[2]}).rows
    ---
    - - [3, 2804, 'Pavel', 'Adams', 27]
      - [4, 1161, 'Mikhail', 'Liston', 51]
      - [5, 1172, 'Dmitry', 'Jacobi', 16]
      - [6, 1064, 'Alexey', 'Sidorov', 31]
    ...

    After this release:

    tarantool> crud.select('developers',
             > {{'>=', 'id', 5}}, {first = 10, after = rows[2]}).rows
    ---
      - [5, 1172, 'Dmitry', 'Jacobi', 16]
      - [6, 1064, 'Alexey', 'Sidorov', 31]
    ...

    * `crud.select()` results order with negative `first` (PR #295).
    * `crud.select()` if a condition is '=' or '==' with
    negative `first` (PR #295).

    Suppose we have a non-unique secondary index by the field `age`
    field and a space:

    tarantool> crud.select('developers', nil, {first = 10})
    ---
    - metadata: [{'name': 'id', 'type': 'unsigned'},
        {'name': 'bucket_id', 'type': 'unsigned'},
        {'name': 'name', 'type': 'string'},
        {'name': 'surname', 'type': 'string'},
        {'name': 'age', 'type': 'number'}]
      rows:
      - [1, 477, 'Alexey', 'Adams', 20]
      - [2, 401, 'Sergey', 'Allred', 27]
      - [3, 2804, 'Pavel', 'Adams', 27]
      - [4, 1161, 'Mikhail', 'Liston', 27]
      - [5, 1172, 'Dmitry', 'Jacobi', 27]
      - [6, 1064, 'Alexey', 'Sidorov', 31]
    - null
    ...

    Previous releases:

    tarantool> crud.select('developers',
             > {{'=', 'age', 27}}, {first = -10, after = rows[4]}).rows
    ---
    - []
    ...

    After this release:

    tarantool> crud.select('developers',
             > {{'=', 'age', 27}}, {first = -10, after = rows[4]}).rows
    ---
    - - [2, 401, 'Sergey', 'Allred', 27]
      - [3, 2804, 'Pavel', 'Adams', 27]
    ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants