Skip to content

Allows API pagination to return blank pages #12475

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 11 commits into from

Conversation

stevethedev
Copy link

Description

This update prevents the API pagination from always returning data after the last page results. Prior to this patch, the searchCriteria[currentPage] value would always return the last page that would have returned content, rather than the page that was requested (i.e. given a data set of 10 items, and a page-length of 10, pages [-∞ .. ∞] would all return 10 items).

  • This patch preserves all previously existing functionality through additive changes.
  • This patch does not affect underflow, where pages [-∞ .. 0] are simplified to "Page 1".
  • This patch only affects the API, so paging behaves exactly the same as before in all other contexts.
  • The new paging behavior is enabled by the function Collection Collection::setIsPageLimited(bool = true).
  • The new paging behavior is disabled by the function Collection Collection::setIsPageLimited(false)
  • The new paging behavior can be checked with bool Collection::getIsPageLimited().

Fixed Issues (if relevant)

  1. REST API page size & current page search criteria options return item(s) past end of data set #8099: REST API page size & current page search criteria options return item(s) past end of data set

Manual testing scenarios

  1. Make API request for products specifying page size as 1, current page as 99999, eg:
GET /rest/default/V1/products/?searchCriteria[pageSize]=1&searchCriteria[currentPage]=9999 
HTTP/1.1
Host: m2.localhost:8000
Authorization: OAuth oauth_consumer_key="sitoq7tu4b1aj7kikj4irkog8tggh1ch",oauth_token="kr3bly2kbbnrr9flyws9r4mxdaagxngo",oauth_signature_method="HMAC-SHA1",oauth_timestamp="1484095699",oauth_nonce="TivkWr",oauth_version="1.0",oauth_signature="Q02C2wk4AgkDPkYAbACUoLMoXa4%3D"
Cache-Control: no-cache

@magento-cicd2
Copy link
Contributor

magento-cicd2 commented Nov 28, 2017

CLA assistant check
Thank you for your submission, we really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

@magento-engcom-team magento-engcom-team added bugfix Reproduced on 2.1.x The issue has been reproduced on latest 2.1 release Reproduced on 2.2.x The issue has been reproduced on latest 2.2 release Reproduced on 2.3.x The issue has been reproduced on latest 2.3 release Component: Framework/Webapi USE ONLY for FRAMEWORK RELATED BUG! E.g If bug related to Catalog WEB API use just Catalog labels Nov 28, 2017
@orlangur orlangur self-assigned this Nov 29, 2017
@orlangur
Copy link
Contributor

This patch only affects the API, so paging behaves exactly the same as before in all other contexts.

Such change would be pretty confusing.

This patch does not affect underflow, where pages [-∞ .. 0] are simplified to "Page 1".

That confuses even more.

The new paging behavior is enabled by the function Collection Collection::setIsPageLimited(bool = true)

Changing class behavior with some switches/flags is never a good idea.

There was already such discussion in #9255. From the issue you mentioned I see that API returns total amount of records: "total_count": 2048

This update prevents the API pagination from always returning data after the last page results. Prior to this patch, the searchCriteria[currentPage] value would always return the last page that would have returned content, rather than the page that was requested (i.e. given a data set of 10 items, and a page-length of 10, pages [-∞ .. ∞] would all return 10 items).

Could you please add more details why you need to access invalid page at all? From the limitations of patch you described it seems like there is some bug with last page determination in your code but pagination works fine with underflow. Why not fix invalid pagination instead of hacking collection class?

@stevethedev
Copy link
Author

stevethedev commented Nov 29, 2017

Such change would be pretty confusing.

I agree that this should, ideally, be universal. In an ideal world, this behavior wouldn't be specifically hard-coded into the Collection class. Unfortunately, it is hard-coded, and there's already a system built around that assumption. If you would prefer that this change affect all collections, that is a simple enough change to make and I will do that. However, this change preserves backward-compatibility and only addresses the undesirable behavior from within the API.

That confuses even more.

This limit is hard-coded into the Zend_Db_Select class, and it deviates from how MariaDB and MySQL behave. At least this protects users and programmers from bad input, though — MariaDB and MySQL will return an error for any LIMIT [a] OFFSET [b] where either a < 0 or b < 0. It would make sense for an API to simply return the lowest valid page, which is always Page 1 (or OFFSET 0, if you prefer).

Changing class behavior with some switches/flags is never a good idea.

You have members of this class which do exactly this. Of course, this wouldn't be necessary if the Collection class behaved like the underlying SQL database and returned an empty array instead of returning the wrong data.

Could you please add more details why you need to access invalid page at all?

  1. Because it is objectively wrong for an API to return Page 10 if I requested Page 10,000.
  2. Because this is how the underlying MySQL or MariaDB database behaves.
  3. Because this is pretty standard behavior for how developers expect paging to behave, including the Magento Blog and the Magento Marketplace if you go to page 30 of 29.
  4. Because while (current page has data) is not an uncommon technique for reading from APIs.
  5. Because pagination is implemented as part of the search criteria, and Magento doesn't read /V1/products?searchCriteria[filter_groups][0][filters][0][field]=name&searchCriteria[filter_groups][0][filters][0][value]= and assume that because I have no products with an empty string in the "name" field that I actually wanted products that do not meet those criteria.

From the limitations of patch you described it seems like there is some bug with last page determination in your code but pagination works fine with underflow.

It's not that "pagination works fine with underflow". A request for Page -1 should also return an empty array, IMO, but this behavior is hard-coded into the Zend_Db_Select class, and that's a very low-level class to modify for what should be a localized problem. Rather than detecting the error and returning an empty set, the design decision was to default to the first page. This is also predictable because a negative page-number is always, always, always invalid. A positive page-number is not always invalid.

These also aren't limitations of the patch. They're the related behaviors that I did not touch. I think that Page -1 returning data from Page 1 is just as wrong as Page 1 returning data from Page 7. I'm just a reasonable person, and I understand why a system would enforce a lower-limit of "Page 1" in a paginated system.

Why not fix invalid pagination instead of hacking collection class?

Because this is the highest-level in the call-stack where such a change makes sense, and the alternative is to edit every API endpoint individually. The odd behavior originates from the collection class because the collection class deviates in behavior from the underlying database. An empty array is a valid response to a query that returns no results. A populated array is not a valid response to a query that returns no results.

@orlangur
Copy link
Contributor

@stevethedev, thanks for detailed explanation,

You have members of this class which do exactly this.

Which is most likely Magento 1 legacy code and does not prove flags should ever be used ;)

https://marketplace.magento.com/extensions/marketing-automation-email-marketing.html#q=&idx=m2_cloud_prod_default_products&p=29

Ok, so we would display standard message "No products matching your selection".

I think that Page -1 returning data from Page 1 is just as wrong as Page 1 returning data from Page 7

Aha, okiz. Please share your thoughts in underlying issue regarding consistent fix VS proper error returned in API.

@stevethedev
Copy link
Author

Which is most likely Magento 1 legacy code and does not prove flags should ever be used ;)

I agree. I've pushed up a new version that addresses this issue in the API instead of in the Collection.

Ok, so we would display standard message "No products matching your selection".

I would prefer to receive an empty array from the API. An empty array unambiguously communicates two things:

  1. That my search request was properly formed.
  2. That my search request did not return any data.

This update simplifies communication with the API.

The specific use-case where the current behavior becomes a problem is if we use API Paging and JavaScript Promises as a way to throttle requests to a Magento 2 server. This significantly simplifies the way that content is read from the API:

// Node.js v8.9.1
async function getItems(endpoint, batchSize = 100) {
    const items = [];
    let page = 0;
    let batch = null;

    do {
        batch = await requestPage(endpoint, ++page, batchSize);
        items.push(...batch);
    } while (batch.length);

    return items;
}

Aha, okiz. Please share your thoughts in underlying issue regarding consistent fix VS proper error returned in API.

I think that my newest updates to the Pull Request probably address this issue on the API level. My primary concern is how Magento interacts with the world "outside" of Magento. My main complaint is that the Magento API returns content in a way that makes consumption needlessly complex.

@orlangur orlangur added this to the December 2017 milestone Dec 7, 2017
$searchBuilder->setPageSize(1);
$searchBuilder->setCurrentPage(2);
$pageSize = 1;
$expectedPageCount = ceil(count($expectedResult) / $pageSize);
Copy link
Contributor

@orlangur orlangur Dec 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a good idea to put logic into test. Why this integration test is even affected by API-only changes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$searchBuilder = $this->_objectManager->create(\Magento\Framework\Api\SearchCriteriaBuilder::class); ah, ok, so this seems to be widely used in the system besides API.

Copy link
Author

@stevethedev stevethedev Dec 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The API's code is used outside of the main API.

I've also found, while diagnosing the failed tests, that two assumptions exist in several places:

  1. Page < 1 should retrieve Page 1. I could not find the source of this assumption. I set the API to assume that any page number less than 1 should be set to Page 1 to avoid breaking existing behavior. We had previously discussed underflow, and this abides by that discussion.
  2. Page max + 1 should return Page max. In this instance, the test was passing because Page 2 of 1 was returning Page 1. In the AddressRepositoryTest, there are tests that should return only 1 page, but the test always sets "Page 2". A side-effect of the behavior this PR addresses is that the tests could pass if any number of addresses were returned (because Page 2 of 1 would return Page 1, and Page 2 of 3 would return Page 2). This test now assumes that it should retrieve the last page, which seemed to be the original intent of the test.

I haven't fixed all of the failures from the integration tests, yet, but all of the issues I've found recently are because of the two points I listed above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$pageSize = 1;
$expectedPageCount = ceil(count($expectedResult) / $pageSize);

is it different from

$expectedPageCount = count($expectedResult);

?

  1. There is no place between WebAPI and search builder which could be changed to not affect integration tests at all but only WebAPI?
  2. If so, maybe it makes more sense to affect Web UI pagination as well - by changing underlying collection logic and nothing more?

Copy link
Author

@stevethedev stevethedev Dec 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{code} is it different from {code}

They are effectively the same. The first one is just for clarity, but could be easily updated to the second one. Would you prefer the 1-line version over the 2-line version?

  1. There is no place between the WebAPI and search builder which could be changed to not affect integration tests at all but only WebAPI?

These changes invalidate the assumption that (2 of 1) === (1 of 1). Any tests that make this assumption at the API level (such as this one) will need to be updated regardless of where the fix is applied.

  1. If so, maybe it makes more sense to affect Web UI pagination as well - by changing underlying collection logic and nothing more?

This is an option, but it would affect more of the system. This change would be at ./lib/internal/Magento/Framework/Data/Collection.php:239:

    public function getCurPage($displacement = 0)
    {
        if ($this->_curPage + $displacement < 1) {
            return 1;
/* These lines would be removed
        } elseif ($this->_curPage + $displacement > $this->getLastPageNumber()) {
            return $this->getLastPageNumber();
*/
        } else {
            return $this->_curPage + $displacement;
        }
    }

However, tests that assume that Page 2 of 1 should return Page 1 would still need to be updated. That's the logic being changed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, tests like this would need to go:

    public function testGetCurPage()
    {
        $this->_model->setCurPage(10);
        $this->assertEquals(1, $this->_model->getCurPage());
    }

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this test data would need to change:

    public function getPriceFormatDataProvider()
    {
        return [
            ['en_US', ['decimalSymbol' => '.', 'groupSymbol' => ',']],
            ['de_DE', ['decimalSymbol' => ',', 'groupSymbol' => '.']],
            ['de_CH', ['decimalSymbol' => '.', 'groupSymbol' => '\'']], //<--- Right Here
            ['uk_UA', ['decimalSymbol' => ',', 'groupSymbol' => ' ']]
        ];
    }

That value should be , not \'. This test should not be passing right now.

@@ -328,7 +328,7 @@ public function testSearchAddresses($filters, $filterGroup, $filterOrders, $expe
}

$searchBuilder->setPageSize(1);
$searchBuilder->setCurrentPage(2);
$searchBuilder->setCurrentPage(count($expectedResult));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is necessary because some of the tests only expect 1 page back. This change ensures that tests that expect 1 page do not request the second page.

$this->_model->setCurPage(0);
$this->assertEquals(1, $this->_model->getCurPage());

$this->_model->setCurPage(1);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change removes the assumption that Page 10 of 1 should return Page 1, but preserves the assumption that Pages < 1 return Page 1.

@stevethedev
Copy link
Author

@orlangur

Has there been any further movement on this PR?

@orlangur
Copy link
Contributor

@stevenvdp not yet, we need to understand whether a deeper fix (right on collection) is better. No changes needed from your side currently.

@okorshenko okorshenko modified the milestones: December 2017, January 2018 Jan 8, 2018
@okorshenko okorshenko modified the milestones: January 2018, February 2018 Feb 7, 2018
@okorshenko okorshenko modified the milestones: February 2018, March 2018 Mar 1, 2018
@okorshenko okorshenko modified the milestones: March 2018, April 2018 Apr 16, 2018
@okorshenko okorshenko modified the milestones: April 2018, May 2018 May 18, 2018
@okorshenko okorshenko removed this from the May 2018 milestone May 31, 2018
@yaroslav-zenin
Copy link
Contributor

really wait for merge this PR

@sivaschenko
Copy link
Member

sivaschenko commented Nov 5, 2018

Closed due to inactivity. @stevethedev , Please feel free to port this pull request to 2.3-develop branch as 2.2-develop does not accept improvement anymore (see the porting section of the contributor guide for more details).

Also please sign our Contributor License Agreement before we can accept your contributions.

@sivaschenko sivaschenko self-assigned this Nov 5, 2018
@sivaschenko sivaschenko closed this Nov 5, 2018
@springimport
Copy link

That's all you need to know about m2. The whole approach to business in one thing.

@orlangur
Copy link
Contributor

orlangur commented Nov 6, 2018

@springimport this is not about m2, this is my personal fault actually. Raised this problem once in Slack without much attention, will raise again this week, luckily we now have an #appdesign dedicated channel and architecture repo for such discussions.

@springimport
Copy link

@orlangur Nice.

@mshifo
Copy link

mshifo commented Mar 15, 2020

the issue still exists

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix Component: Framework/Webapi USE ONLY for FRAMEWORK RELATED BUG! E.g If bug related to Catalog WEB API use just Catalog improvement Release Line: 2.2 Reproduced on 2.1.x The issue has been reproduced on latest 2.1 release Reproduced on 2.2.x The issue has been reproduced on latest 2.2 release Reproduced on 2.3.x The issue has been reproduced on latest 2.3 release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants