diff --git a/intermediate/xarray_and_dask.ipynb b/intermediate/xarray_and_dask.ipynb
index 4de7a1ce..52b7decf 100644
--- a/intermediate/xarray_and_dask.ipynb
+++ b/intermediate/xarray_and_dask.ipynb
@@ -230,9 +230,6 @@
    "cell_type": "code",
    "execution_count": null,
    "metadata": {
-    "jupyter": {
-     "outputs_hidden": true
-    },
     "tags": []
    },
    "outputs": [],
@@ -246,16 +243,13 @@
    "source": [
     "### Exercise\n",
     "\n",
-    "Try calling `mean.values` and `mean.data`. Do you understand the difference?"
+    "Try calling `ds.air.values` and `ds.air.data`. Do you understand the difference?"
    ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
    "metadata": {
-    "jupyter": {
-     "outputs_hidden": true
-    },
     "tags": []
    },
    "outputs": [],
@@ -331,7 +325,7 @@
     "2. `.load()` replaces the dask array in the xarray object with a numpy array.\n",
     "   This is equivalent to `ds = ds.compute()`\n",
     "   \n",
-    "**Tip:** There is a third option : \"persisting\". `.persist()` loads the values into distributed RAM. The values are computed but remain distributed across workers. So `ds.air.persist()` still returns a dask array. This is useful if you will be repeatedly using a dataset for computation but it is too large to load into local memory. You will see a persistent task on the dashboard. See the [dask user guide](https://docs.dask.org/en/latest/api.html#dask.persist) for more on persisting\n"
+    "**Tip:** There is a third option: \"persisting\". `.persist()` loads the values into distributed RAM. The values are computed but remain distributed across workers, so `ds.air.persist()` still returns a dask array. This is useful if you will be repeatedly using a dataset for computation but it is too large to load into local memory. You will see a persistent task on the dashboard. See the [dask user guide](https://docs.dask.org/en/latest/api.html#dask.persist) for more on persisting"
    ]
   },
   {
@@ -347,9 +341,6 @@
    "cell_type": "code",
    "execution_count": null,
    "metadata": {
-    "jupyter": {
-     "outputs_hidden": true
-    },
     "tags": []
    },
    "outputs": [],
@@ -446,7 +437,10 @@
     "\n",
     "You can use any kind of Dask cluster. This step is completely independent of\n",
     "xarray. While not strictly necessary, the dashboard provides a nice learning\n",
-    "tool."
+    "tool.\n",
+    "\n",
+    "By default, Dask uses the current working directory for writing temporary files.\n",
+    "If that location is unsuitable, pass a scratch folder to the `Client` constructor instead, e.g. `Client(local_directory='/tmp')`."
    ]
   },
   {
@@ -464,10 +458,17 @@
     "# if os.environ.get('JUPYTERHUB_USER'):\n",
     "#     dask.config.set(**{\"distributed.dashboard.link\": \"/user/{JUPYTERHUB_USER}/proxy/{port}/status\"})\n",
     "\n",
-    "client = Client(local_directory='/tmp')\n",
+    "client = Client()\n",
     "client"
    ]
   },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -483,9 +484,6 @@
    "cell_type": "code",
    "execution_count": null,
    "metadata": {
-    "jupyter": {
-     "outputs_hidden": true
-    },
     "tags": []
    },
    "outputs": [],
@@ -539,9 +537,6 @@
    "cell_type": "code",
    "execution_count": null,
    "metadata": {
-    "jupyter": {
-     "outputs_hidden": true
-    },
     "tags": []
    },
    "outputs": [],
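
Note on the exercise this patch rewrites (`ds.air.values` vs `ds.air.data`): a minimal sketch of the distinction, using a synthetic dask-chunked `DataArray` rather than the tutorial's air temperature dataset — `air` here is a stand-in name, not the tutorial's variable:

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the tutorial's `ds.air`: any dask-chunked
# DataArray behaves the same way.
air = xr.DataArray(np.ones((4, 3)), dims=("time", "lat")).chunk({"time": 2})

print(type(air.data))    # the underlying dask array itself -- no computation
print(type(air.values))  # a numpy array -- .values computes and loads the data
```

`.data` hands back whatever container xarray wraps (here a lazy dask array), while `.values` always materializes a numpy array, triggering computation on chunked data.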