From c10075523946b349ecd19fdd4efb1bf44317694a Mon Sep 17 00:00:00 2001
From: jason-price-mongodb <69260375+jason-price-mongodb@users.noreply.github.com>
Date: Mon, 18 Oct 2021 09:44:03 -0700
Subject: [PATCH] DOCS-14855 reword-sample (#6012)

Co-authored-by: jason-price-mongodb
---
 .../reference/operator/aggregation/sample.txt | 33 +++++++++++--------
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/source/reference/operator/aggregation/sample.txt b/source/reference/operator/aggregation/sample.txt
index 90bf5631b1..998c295cb2 100644
--- a/source/reference/operator/aggregation/sample.txt
+++ b/source/reference/operator/aggregation/sample.txt
@@ -17,32 +17,37 @@ Definition
 
    .. versionadded:: 3.2
 
-   Randomly selects the specified number of documents from its input.
+   Randomly selects the specified number of documents from the
+   input documents.
 
    The :pipeline:`$sample` stage has the following syntax:
 
    .. code-block:: javascript
 
-      { $sample: { size: } }
+      { $sample: { size: } }
+
+   ``N`` is the number of documents to randomly select.
 
 Behavior
 --------
 
-:pipeline:`$sample` uses one of two methods to obtain N random
-documents, depending on the size of the collection, the size of N,
-and ``$sample``'s position in the pipeline.
+If all of the following conditions are true, :pipeline:`$sample` uses a
+pseudo-random cursor to select the ``N`` documents:
+
+- :pipeline:`$sample` is the first stage of the pipeline.
+- ``N`` is less than 5% of the total documents in the collection.
+- The collection contains more than 100 documents.
+
+If any of the previous conditions are false, :pipeline:`$sample`:
 
-If all the following conditions are met, ``$sample`` uses a
-pseudo-random cursor to select documents:
+- Reads all documents that are output from a preceding aggregation
+  stage or a collection scan.
+- Performs a random sort to select ``N`` documents.
 
-- ``$sample`` is the first stage of the pipeline
-- N is less than 5% of the total documents in the collection
-- The collection contains more than 100 documents
+.. note::
 
-If any of the above conditions are NOT met, ``$sample`` performs a
-collection scan followed by a random sort to select N documents. In
-this case, the :pipeline:`$sample` stage is subject to the
-:ref:`sort memory restrictions `.
+   Random sorts are subject to the :ref:`sort memory restrictions
+   `.
 
 MMAPv1 May Return Duplicate Documents
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
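
Note on the Behavior hunk: the reworded text describes a three-condition rule for choosing between a pseudo-random cursor and a random sort. The sketch below models only those documented conditions. The function ``chooseSampleMethod``, its parameter names, and the example document count are hypothetical and are not part of MongoDB or this patch; the server's actual decision logic is not shown here.

```javascript
// Illustrative only: encodes the three conditions listed in the Behavior section.
// `chooseSampleMethod` and its parameters are hypothetical names.
function chooseSampleMethod({ isFirstStage, size, collectionCount }) {
  const usesCursor =
    isFirstStage &&                      // $sample is the first pipeline stage
    size * 100 < 5 * collectionCount &&  // N is less than 5% of the documents
    collectionCount > 100;               // collection has more than 100 documents
  return usesCursor ? "pseudo-random cursor" : "random sort";
}

// A $sample stage with N = 3; the collection size of 1000 is made up.
const pipeline = [{ $sample: { size: 3 } }];

const method = chooseSampleMethod({
  isFirstStage: true,
  size: pipeline[0].$sample.size,
  collectionCount: 1000,
});
console.log(method); // prints "pseudo-random cursor"
```

The 5% check uses integer arithmetic (``size * 100 < 5 * collectionCount``) so that a boundary case such as N = 50 against 1000 documents falls on the random-sort side, matching the strict "less than" wording in the patch.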