-
Notifications
You must be signed in to change notification settings - Fork 15
Fix crud.select() + after behavior problems #295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
7d506d4
to
caec420
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your patch! You've found some really amusing bugs and misbehavings.
Does it block PR about select(nil)
warning or they exist independently?
Although the test behavior seems valid, I think that this PR should be additionally reviewed with Alexander since he seems to be more in context than me.
29a1fd7
to
60ffb84
Compare
I have updated the pull request:
|
It does not. But after the pull request we can delete a doc entry:
from #277. |
b9fd43f
to
29da4ae
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM after restoring CHANGLELOG entry and rebasing to master.
Please, request additional approve from Alexander.
29da4ae
to
4433dd3
Compare
4433dd3
to
a556645
Compare
General feedbackTo be honest, it is not easy to spot ideas behind those commits. I'll not insist, but I think that more detailed descriptions would be a good step toward good query planner/executor. I'll share my thoughts on how I would describe such changes. Abstract example of description of planner/executor changeselect: fix <...>
Let's consider the following space:
| id | age |
| -- | --- |
| 1 | 32 |
| 2 | 22 |
| 3 | 33 |
| 4 | 24 |
| 5 | 23 |
We're traverse over the space with sorting by the `age` field with the
following condition: `age <= 30`. The page size is 2.
To make the explanation simpler, let's look on the same table, but
sorted by `age`:
| id | age |
| -- | --- |
| 2 | 22 |
| 5 | 23 |
| 4 | 24 |
| 1 | 32 |
| 3 | 33 |
And filtered by `age <= 30`:
| id | age |
| -- | --- |
| 2 | 22 |
| 5 | 23 |
| 4 | 24 |
First, we got the first page:
> crud.select('<...>', {{'<=', 'age', 30}}, {first = 2})
| id | age |
| -- | --- |
| 2 | 22 |
| 5 | 23 |
Now we want to fetch the next page:
> crud.select('<...>', {{'<=', 'age', 30}}, {first = 2, after = {5, 23}})
| id | age |
| -- | --- |
| 4 | 24 |
Then we step back to the previous page:
> crud.select('<...>', {{'<=', 'age', 30}}, {first = -2, after = {4, 24}})
| id | age |
| -- | --- |
| 2 | 22 |
| 5 | 23 |
It is where the problem appears. Before the change the response was the
following:
| id | age |
| -- | --- |
| 2 | 22 |
Several details about the query planner and executor are important here.
1. The plan contains `foo` and `bar` fields, which normally corresponds
to a first select condition backed by an index. However in presence
of `after` it is changed to <...>. Extra post-filter is added to the
`baz` field. A stop condition is added to the `fiz` field.
2. When `first` value is negative the planner performs yet another
transformation: <...>.
3. The executor just uses `foo` and `bar` disregarding `after` and
`first` presence. However if `after` is present, it <...>.
The problem was in the wrong transformation in the query planner: it
takes into account <...>, but missed the case, when <...>. The patch
fixes the planner to support both cases.
There is another scenario, which should be taken into account: if the
index is non-unique, then <...>. Let's consider the example table
(sorted by `age`):
| id | age |
| -- | --- |
| 2 | 22 |
| 6 | 22 |
| 7 | 22 |
| 5 | 23 |
| 4 | 24 |
| 1 | 32 |
| 3 | 33 |
If we're on the second page and want to get the first one:
> crud.select('<...>', {{'>=', 'age', 30}}, {first = -2, after = {7, 22}})
| id | age |
| -- | --- |
| 2 | 22 |
| 6 | 22 |
The result is anyway correct, because it is handled by the following
code in the executor:
<..code here..>
As result the changed code produces correct result here as well. Maybe my example is too wordy, but I would like to see some explanations. Not necessary so large ones. We discussed the changes with @oleg-jukovec and @DifferentialOrange and now I understood them better. I'll summarize my understanding and why I think that the changes are okay. 1. select: fix LE and after results when the condition value < afterIf the index is non-unique, the page border can be in a middle of sequence tuples with the same values of the key. Before the fix we would jump over all those tuples. Now it works correctly. It was not obvious on the first glance, why we can relax the condition and internally use The executor performs scrolling to the There is Since When we replaced it, When we don't do that, the The patch looks correct for me. Here we have the headroom for a little optimization:
I would like to see this possibility mentioned somewhere in developer docs (say, in the scope of #292). 2. select: fix traverse start key in case condition + afterIt's logical bugfix for travising in the descending order ( 3. select: fix EQ/REQ and after resultsWow. This change is not simple. I'll put here how I explained it for myself (based on the code and what Oleg tells me on the meeting). Some context is necessary here. How a EQ/REQ condition +
There are several problems in this algorithm:
The first problem is the trivial. We can fix the remaining problems in different ways:
The second approach was choosen here. As I know from Oleg's words, he tried to implement something in the spirit of the first one, but met problems. There are downsides in the choosen approach. We always start from the same key from conditions and linearly traverse the tuples to the I guess that the requests with partial keys and non-unique index with large sequences of the-same-key tuples are rare, so I can accept it as the temporary solution. Please, file an issue regarding the problem described above. I'll also add an inline comment regarding the implementation (see below). To sum up: this patch is okay as the temporary solution. 4. select: improve behavior of select(nil) + afterThat's nice. 5. doc: fix crud.select the first parameter desriptionIt's okay. (Typo in the commit header: desription -> description.) SummaryThe patchset is okay for me. I expect that you'll work on the patch descriptions to some extent (until it'll be too tiresome), that you'll file a follow up issue regarding pagination with the partial key condition (see comments to the the fourth patch), that you'll mention the little optimization opportunity for unique index scan during work on #292 (see comments to the first patch). |
What also possibly worth to mention in the developer documentation (#292): the key replacement1 code is spread across the query planner and the executor. There is the code that handles the if opts.first ~= nil then
total_tuples_count = math.abs(opts.first)
if opts.first < 0 then
scan_iter = utils.invert_tarantool_iter(scan_iter)
-- scan condition becomes border consition
scan_condition_num = nil
if scan_after_tuple ~= nil then -- <-- FROM HERE
if has_keydef then
local key_def = keydef_lib.new(scan_index.parts)
scan_value = key_def:extract_key(scan_after_tuple)
else
scan_value = utils.extract_key(scan_after_tuple, scan_index.parts)
end
else
scan_value = nil
end -- <-- TO HERE
end
end However the tightly coupled (or even duplicating) code is in the Footnotes
|
daaa534
to
2ccfa53
Compare
it's for case: iterator: > after the code: iterator: < The executor at now just choose a bigger value from scan value and after value to start execution from a nearest value to expected results. There is some logic in that, but it seems to me that the planner code and the executor code are too closely depend on each other. Maybe I will do a small refactoring in #292 to make it clearer. |
LE with after has always been replaced by LT. The idea was to start traverse not from the after value, but from a next value. The micro-optimization allowed to fetch one less tuple per time. But this caused the condition value to be skipped if the value is less than the after value.
A comparator that choosed a starting key for select from a condition value and after always returned a smaller one. This led to an unsymmetric behavior depending on the direction of a traverse. Plus conditions didn't work correctly in all cases.
crud.select reverses order of the results from the cluster in case of negative first. It is necessary because the pagination results should be in the same order when moving in both directions. The reverse function did not work correctly.
EQ with after has always been replaced by LT. The idea was to start traverse not from the after value, but from a next value. The micro-optimization allowed to fetch one less tuple per time. It worked well for unique indexes, but this led to wrong results for non-unique indexes.
Before the patch we copy an original EQ/REQ condition to post-filter and change the original to GE/LE. The problem is that EQ/REQ can't be a stop condition. As a result EQ/REQ + after unexpectedly became a full scan. The patch removes this behavior, the query becomes more expected and in base executes as regular EQ/REQ. We had an idea to replace EQ with GE+LE. This would help to skip less tuples in the executor inside scroll_to_after_tuple. But unfortunately, due to a bug[1], this will break the current behavior. So we need to fix the bug first. In additional, the 'potentially long select'[2] warning is confused by the GE/LE in the query plan. This patch only changes the internal behavior and does not affect a client code. 1. #301 2. #277
Previously, the traverse always started from the beginning of the index, now after the after key.
2ccfa53
to
ea9c588
Compare
@Totktonada Thank you, I've updated the pull request:
I didn't go into too much detail in the commit messages to keep them short as possible. But I hope that the patches have become clearer together with the examples in CHANGELOG.md. I will write more detailed documentation about the planner/executor logic in #292. But I can add something if you think it is necessary. |
It is the only thing that I expect from you. The patchset looks okay for me. |
Overview This is a bugfix release. Several cases of pagination using select with after was fixed or improved. Warning has been added to potentially long select and count calls. Storage select errors were reworked to be consistent with other calls errors. Breaking changes There are no breaking changes in the release. New features * Optimize `crud.select()` without conditions and with `after` (PR #295). * A critical log entry containing the current stack traceback is created upon potentially long `select` and `count` calls — an user can explicitly request a full scan through by passing `fullscan=true` to `select` or `count` options table argument in which case a log entry will not be created (#276). Bugfixes * Make select error description more informative when merger built-in module or tuple-merger external module is used in case of disabled/uninit storage (#229). * `crud.select()` if a condition is '<=' and it's value < `after` (PR #295). Previous releases: tarantool> crud.select('developers', > {{'<=', 'id', 3}}, {first = 10, after = rows[5]}).rows --- - - [2, 401, 'Sergey', 'Allred', 21] - [1, 477, 'Alexey', 'Adams', 20] ... After this release: tarantool> crud.select('developers', > {{'<=', 'id', 3}}, {first = 10, after = rows[5]}).rows --- - - [3, 2804, 'Pavel', 'Adams', 27] - [2, 401, 'Sergey', 'Allred', 21] - [1, 477, 'Alexey', 'Adams', 20] ... * `crud.select()` filtration by a first condition if the condition is '>' or '>=' and it's value > `after` (PR #295). Previous releases: tarantool> crud.select('developers', > {{'>=', 'id', 5}}, {first = 10, after = rows[2]}).rows --- - - [3, 2804, 'Pavel', 'Adams', 27] - [4, 1161, 'Mikhail', 'Liston', 51] - [5, 1172, 'Dmitry', 'Jacobi', 16] - [6, 1064, 'Alexey', 'Sidorov', 31] ... After this release: tarantool> crud.select('developers', > {{'>=', 'id', 5}}, {first = 10, after = rows[2]}).rows --- - [5, 1172, 'Dmitry', 'Jacobi', 16] - [6, 1064, 'Alexey', 'Sidorov', 31] ... * `crud.select()` results order with negative `first` (PR #295). * `crud.select()` if a condition is '=' or '==' with negative `first` (PR #295). Suppose we have a non-unique secondary index by the field `age` field and a space: tarantool> crud.select('developers', nil, {first = 10}) --- - metadata: [{'name': 'id', 'type': 'unsigned'}, {'name': 'bucket_id', 'type': 'unsigned'}, {'name': 'name', 'type': 'string'}, {'name': 'surname', 'type': 'string'}, {'name': 'age', 'type': 'number'}] rows: - [1, 477, 'Alexey', 'Adams', 20] - [2, 401, 'Sergey', 'Allred', 27] - [3, 2804, 'Pavel', 'Adams', 27] - [4, 1161, 'Mikhail', 'Liston', 27] - [5, 1172, 'Dmitry', 'Jacobi', 27] - [6, 1064, 'Alexey', 'Sidorov', 31] - null ... Previous releases: tarantool> crud.select('developers', > {{'=', 'age', 27}}, {first = -10, after = rows[4]}).rows --- - [] ... After this release: tarantool> crud.select('developers', > {{'=', 'age', 27}}, {first = -10, after = rows[4]}).rows --- - - [2, 401, 'Sergey', 'Allred', 27] - [3, 2804, 'Pavel', 'Adams', 27] ...
Overview This is a bugfix release. Several cases of pagination using select with after was fixed or improved. Warning has been added to potentially long select and count calls. Storage select errors were reworked to be consistent with other calls errors. Breaking changes There are no breaking changes in the release. New features * Optimize `crud.select()` without conditions and with `after` (PR #295). * A critical log entry containing the current stack traceback is created upon potentially long `select` and `count` calls — an user can explicitly request a full scan through by passing `fullscan=true` to `select` or `count` options table argument in which case a log entry will not be created (#276). Bugfixes * Make select error description more informative when merger built-in module or tuple-merger external module is used in case of disabled/uninit storage (#229). * `crud.select()` if a condition is '<=' and it's value < `after` (PR #295). Previous releases: tarantool> crud.select('developers', > {{'<=', 'id', 3}}, {first = 10, after = rows[5]}).rows --- - - [2, 401, 'Sergey', 'Allred', 21] - [1, 477, 'Alexey', 'Adams', 20] ... After this release: tarantool> crud.select('developers', > {{'<=', 'id', 3}}, {first = 10, after = rows[5]}).rows --- - - [3, 2804, 'Pavel', 'Adams', 27] - [2, 401, 'Sergey', 'Allred', 21] - [1, 477, 'Alexey', 'Adams', 20] ... * `crud.select()` filtration by a first condition if the condition is '>' or '>=' and it's value > `after` (PR #295). Previous releases: tarantool> crud.select('developers', > {{'>=', 'id', 5}}, {first = 10, after = rows[2]}).rows --- - - [3, 2804, 'Pavel', 'Adams', 27] - [4, 1161, 'Mikhail', 'Liston', 51] - [5, 1172, 'Dmitry', 'Jacobi', 16] - [6, 1064, 'Alexey', 'Sidorov', 31] ... After this release: tarantool> crud.select('developers', > {{'>=', 'id', 5}}, {first = 10, after = rows[2]}).rows --- - [5, 1172, 'Dmitry', 'Jacobi', 16] - [6, 1064, 'Alexey', 'Sidorov', 31] ... * `crud.select()` results order with negative `first` (PR #295). * `crud.select()` if a condition is '=' or '==' with negative `first` (PR #295). Suppose we have a non-unique secondary index by the field `age` field and a space: tarantool> crud.select('developers', nil, {first = 10}) --- - metadata: [{'name': 'id', 'type': 'unsigned'}, {'name': 'bucket_id', 'type': 'unsigned'}, {'name': 'name', 'type': 'string'}, {'name': 'surname', 'type': 'string'}, {'name': 'age', 'type': 'number'}] rows: - [1, 477, 'Alexey', 'Adams', 20] - [2, 401, 'Sergey', 'Allred', 27] - [3, 2804, 'Pavel', 'Adams', 27] - [4, 1161, 'Mikhail', 'Liston', 27] - [5, 1172, 'Dmitry', 'Jacobi', 27] - [6, 1064, 'Alexey', 'Sidorov', 31] - null ... Previous releases: tarantool> crud.select('developers', > {{'=', 'age', 27}}, {first = -10, after = rows[4]}).rows --- - [] ... After this release: tarantool> crud.select('developers', > {{'=', 'age', 27}}, {first = -10, after = rows[4]}).rows --- - - [2, 401, 'Sergey', 'Allred', 27] - [3, 2804, 'Pavel', 'Adams', 27] ...
It seems to me that now there are logical errors with
select
+after
:select
when has_keydef.select(nil)
+after
always traverse to theafter
from beginning.The patch fixes the problems.