From 77d15ebae66d040aff5289a959ee680acfa118ff Mon Sep 17 00:00:00 2001 From: Alex Drlica-Wagner Date: Fri, 10 Aug 2018 17:07:32 +0000 Subject: [PATCH 01/14] Adding ProcessCcd; removing ProcessEimage --- ImageProcessing/ProcessCcd.ipynb | 122 +++++++++++++++++ ImageProcessing/ProcessEimage.ipynb | 197 ---------------------------- 2 files changed, 122 insertions(+), 197 deletions(-) create mode 100644 ImageProcessing/ProcessCcd.ipynb delete mode 100644 ImageProcessing/ProcessEimage.ipynb diff --git a/ImageProcessing/ProcessCcd.ipynb b/ImageProcessing/ProcessCcd.ipynb new file mode 100644 index 00000000..987ac890 --- /dev/null +++ b/ImageProcessing/ProcessCcd.ipynb @@ -0,0 +1,122 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# ProcessCcd \n", + "\n", + "Owner(s): **Alex Drlica-Wagner (@kadrlica)**\n", + "\n", + "Last Verified to Run: **2018-08-10**\n", + "\n", + "Verified Stack Release: **16.0**\n", + "\n", + "This notebook seeks to mostly replicate the first two steps of the LSST Stack [\"Getting started tutorials\"](https://pipelines.lsst.io/getting-started/index.html#getting-started-tutorials). We first setup a working directory to point to an installation of the HSC data (more details on available data sets can be found in [DataInventory.ipynb](https://github.com/LSSTScienceCollaborations/StackClub/blob/master/Basics/DataInventory.ipynb))." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!mkdir DATA" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!echo \"lsst.obs.hsc.HscMapper\" > DATA/_mapper" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!ingestImages.py DATA /project/shared/data/ci_hsc/raw/*.fits --mode=link" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!installTransmissionCurves.py DATA" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!ln -s /project/shared/data/ci_hsc/CALIB/ DATA/CALIB" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!mkdir -p DATA/ref_cats\n", + "!ln -s /project/shared/data/ci_hsc/ps1_pv3_3pi_20170110 DATA/ref_cats/ps1_pv3_3pi_20170110" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!processCcd.py DATA --rerun processCcdOutputs --id --show data" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!processCcd.py DATA --rerun processCcdOutputs --id filter=HSC-I --show data" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!processCcd.py DATA --rerun processCcdOutputs --id" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "LSST", + "language": "python", + "name": "lsst" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.2" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} 
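A note on the `--id` selections used in the new notebook above (`--id`, `--id filter=HSC-I`, and a bare `--id`): they select datasets by data-ID key=value pairs. As a rough illustration only (plain Python, not Stack code — `parse_data_id` is a hypothetical helper, not a real pipeline function), such a selection string can be thought of as parsing into the kind of `dataId` dictionary the Butler consumes:

```python
def parse_data_id(selection):
    """Parse an '--id'-style selection string such as 'visit=903334 ccd=16'
    into a dict of data ID keys, coercing integer-like values to int."""
    data_id = {}
    for token in selection.split():
        # Each token looks like key=value; partition keeps the value intact
        # even if it contains no '=' (value is then the empty string).
        key, _, value = token.partition("=")
        data_id[key] = int(value) if value.isdigit() else value
    return data_id

print(parse_data_id("visit=903334 ccd=16"))   # -> {'visit': 903334, 'ccd': 16}
print(parse_data_id("filter=HSC-I"))          # -> {'filter': 'HSC-I'}
```

An empty selection (a bare `--id`) then behaves as a wildcard: the empty dictionary matches every raw dataset in the repository.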
diff --git a/ImageProcessing/ProcessEimage.ipynb b/ImageProcessing/ProcessEimage.ipynb deleted file mode 100644 index 137777cb..00000000 --- a/ImageProcessing/ProcessEimage.ipynb +++ /dev/null @@ -1,197 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Single Exposure Processing\n", - "\n", - "This is intended to walk you through the processing pipeline on jupyterlab. It builds on the first two hands-on tutorials in the LSST [\"Getting started\" tutorial series](https://pipelines.lsst.io/getting-started/index.html#getting-started-tutorial). It is intended for anyone getting started with using the LSST Science Pipelines for data processing. \n", - "\n", - "The goal of this tutorial is to setup a Butler for a simulated LSST data set and to run the `processCCD.py` pipeline task to produced reduced images." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Setting up the data repository\n", - "\n", - "Sample data for this tutorial comes from the `twinkles` LSST simulation and is available in a shared directory on `jupyterlab`. We will make a copy of the input data in our current directory:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "!if [ ! -d DATA ]; then cp -r /project/shared/data/Twinkles_subset/input_data_v2 DATA; fi" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Inside the data directory you'll see a directory structure that looks like this" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "!ls -lh DATA/" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The Butler uses a mapper to find and organize data in a format specific to each camera. 
Here we're using `lsst.obs.lsstSim.LsstSimMapper` mapper for the Twinkles simulated data:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "cat DATA/_mapper" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "All of the relavent images and calibrations have already been ingested into the Butler for this data set." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Reviewing what data will be processed\n", - "\n", - "We'll now process individual raw LSST simulated images in the Butler `DATA` repository into calibrated exposures. We’ll use the `processCcd.py` command-line task to remove instrumental signatures with dark, bias and flat field calibration images. `processCcd.py` will also use the reference catalog to establish a preliminary WCS and photometric zeropoint solution.\n", - "\n", - "First we'll examine the set of exposures available in the Twinkles data set using the Butler" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now we'll do a similar thing using the `processEimageTask` from the LSST pipeline. **There is a bit of ugliness here because the `processEimage.py` command line script is only python2 compatible so we need to parse the arguments through the API. 
This has the nasty habit of trying to exit after the args.**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from lsst.obs.lsstSim.processEimage import ProcessEimageTask" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "args = 'DATA --rerun process-eimage --id filter=r --show data'\n", - "ProcessEimageTask.parseAndRun(args=args.split())\n", - "\n", - "# BUG: the command above exits early, due to a namespace problem:\n", - "# /opt/lsst/software/stack/stack/miniconda3-4.3.21-10a4fa6/Linux64/pipe_base/15.0/python/lsst/pipe/base/argumentParser.py in parse_args(self, config, args, log, override)\n", - "# 628 \n", - "# 629 if namespace.show and \"run\" not in namespace.show:\n", - "# --> 630 sys.exit(0)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The important arguments here are `--id` and `--show data`.\n", - "\n", - "The `--id` argument allows you to select datasets to process by their data IDs. Data IDs describe individual datasets in the Butler repository. Datasets also have types, and each command-line task will only process data of certain types. In this case, `processEimage.py` will processes raw simulated e-images **(need more description of e-images)**.\n", - "\n", - "In the above command, the `--id filter=r` argument selects data from the r filter. Specifying `--id` without any arguments acts as a wildcard that selects all raw-type data in the repository.\n", - "\n", - "The `--show data` argument puts `processEimage.py` into a dry-run mode that prints a list of data IDs to standard output that would be processed according to the `--id` argument rather than actually processing the data. 
\n", - "\n", - "Notice the keys that describe each data ID, such as the visit (exposure identifier), raft (identifies a specific LSST camera raft), sensor (identifies an individual ccd on a raft) and filter, among others. With these keys you can select exactly what data you want to process." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next we perform the same task directly with the Butler:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import lsst.daf.persistence as dafPersist" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "butler = dafPersist.Butler(inputs='DATA')\n", - "butler.queryMetadata('eimage', ['visit', 'raft', 'sensor','filter'], dataId={'filter': 'r'})" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Processing data\n", - "\n", - "Now we'll move on to actually process some of the Twinkles data. To do this, we'll remove the `--show data` argument." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "args = 'DATA --rerun process-eimage --id filter=r --show data'\n", - "# The command below also exits early - see the error message above.\n", - "ProcessEimageTask.parseAndRun(args=args.split())" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "LSST_Stack (Python 3)", - "language": "python", - "name": "lsst_stack" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.2" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} From 966414962c4a2a839b58b799477f79afd51e575b Mon Sep 17 00:00:00 2001 From: Alex Drlica-Wagner Date: Fri, 10 Aug 2018 17:59:11 +0000 Subject: [PATCH 02/14] Adding some documentation --- ImageProcessing/ProcessCcd.ipynb | 77 ++++++++++++++++++++++++++------ 1 file changed, 64 insertions(+), 13 deletions(-) diff --git a/ImageProcessing/ProcessCcd.ipynb b/ImageProcessing/ProcessCcd.ipynb index 987ac890..bf9f2fe1 100644 --- a/ImageProcessing/ProcessCcd.ipynb +++ b/ImageProcessing/ProcessCcd.ipynb @@ -6,13 +6,41 @@ "source": [ "# ProcessCcd \n", "\n", - "Owner(s): **Alex Drlica-Wagner (@kadrlica)**\n", + "
Owner: **Alex Drlica-Wagner** ([@kadrlica](https://github.com/LSSTScienceCollaborations/StackClub/issues/new?body=@kadrlica))\n", + "
Last Verified to Run: **2018-08-10**\n", + "
Verified Stack Release: **v16.0**\n", "\n", - "Last Verified to Run: **2018-08-10**\n", + "## Learning Objectives:\n", "\n", - "Verified Stack Release: **16.0**\n", + "This notebook seeks to mostly replicate the first two steps of the LSST Stack [\"Getting started tutorials\"](https://pipelines.lsst.io/getting-started/index.html#getting-started-tutorials). We first setup a working directory to point to an installation of the HSC data (more details on available data sets can be found in [DataInventory.ipynb](https://github.com/LSSTScienceCollaborations/StackClub/blob/master/Basics/DataInventory.ipynb)).\n", "\n", - "This notebook seeks to mostly replicate the first two steps of the LSST Stack [\"Getting started tutorials\"](https://pipelines.lsst.io/getting-started/index.html#getting-started-tutorials). We first setup a working directory to point to an installation of the HSC data (more details on available data sets can be found in [DataInventory.ipynb](https://github.com/LSSTScienceCollaborations/StackClub/blob/master/Basics/DataInventory.ipynb))." + "After working through this tutorial you should be able to:\n", + "\n", + "* Ingest images from a shared data set into your personal `Butler`\n", + "* Investigate the content of a dataset using the `Butler` from the command line with `processCcd.py` \n", + "* Run the `processCcd.py` command line task to process images\n", + "* Interface with the `processCcd` API directly in python.\n", + "\n", + "## Logistics\n", + "This notebook is intended to be runnable on `lsst-lspdev.ncsa.illinois.edu` from a local git clone of https://github.com/LSSTScienceCollaborations/StackClub." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Command Line Processing\n", + "\n", + "In this first step we perform command line processing of images following the getting started tutorials [here](https://pipelines.lsst.io/getting-started/data-setup.html#) and [here](https://pipelines.lsst.io/getting-started/processccd.html). Much more documentation can be found at those links, so we go through this pretty quickly..." ] }, { "cell_type": "code", @@ -21,7 +49,8 @@ "metadata": {}, "outputs": [], "source": [ - "!mkdir DATA" + "# Define an environment variable that points to the shared data directory\n", + "os.environ['CI_HSC_DIR'] = \"/project/shared/data/ci_hsc\"" ] }, { @@ -30,16 +59,26 @@ "metadata": {}, "outputs": [], "source": [ + "# Create our DATA working directory and add the HSC mapper\n", + "# NOTE: We may want to delete an existing data directory first?\n", + "!mkdir DATA\n", "!echo \"lsst.obs.hsc.HscMapper\" > DATA/_mapper" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next we ingest the images and transmission curves from the shared data store into our own private directory." + ] + }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ - "!ingestImages.py DATA /project/shared/data/ci_hsc/raw/*.fits --mode=link" + "!ingestImages.py DATA $CI_HSC_DIR/raw/*.fits --mode=link" ] }, { @@ -51,13 +90,20 @@ "!installTransmissionCurves.py DATA" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next we link to the calibration and reference catalogs." 
+ ] + }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ - "!ln -s /project/shared/data/ci_hsc/CALIB/ DATA/CALIB" + "!ln -s $CI_HSC_DIR/CALIB/ DATA/CALIB" ] }, { @@ -67,7 +113,14 @@ "outputs": [], "source": [ "!mkdir -p DATA/ref_cats\n", - "!ln -s /project/shared/data/ci_hsc/ps1_pv3_3pi_20170110 DATA/ref_cats/ps1_pv3_3pi_20170110" + "!ln -s $CI_HSC_DIR/ps1_pv3_3pi_20170110 DATA/ref_cats/ps1_pv3_3pi_20170110" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We'd like to investigate what data we have available (the `--show-data` flag)." ] }, { @@ -80,12 +133,10 @@ ] }, { - "cell_type": "code", - "execution_count": null, + "cell_type": "markdown", "metadata": {}, - "outputs": [], "source": [ - "!processCcd.py DATA --rerun processCcdOutputs --id filter=HSC-I --show data" + "We choose one CCD `(visit=903334, ccd=16)` and pass it to the `processCcd` command line task" ] }, { @@ -94,7 +145,7 @@ "metadata": {}, "outputs": [], "source": [ - "!processCcd.py DATA --rerun processCcdOutputs --id" + "!processCcd.py DATA --rerun processCcdOutputs --id visit=903334 ccd=16" ] } ], From 395d477f6695dd98f6c324459502a83ce8d7a870 Mon Sep 17 00:00:00 2001 From: Alex Drlica-Wagner Date: Fri, 14 Sep 2018 16:08:43 +0000 Subject: [PATCH 03/14] Working on a deeper dive into the underlying code --- ImageProcessing/ProcessCcd.ipynb | 123 ++++++++++++++++++++++--------- 1 file changed, 90 insertions(+), 33 deletions(-) diff --git a/ImageProcessing/ProcessCcd.ipynb b/ImageProcessing/ProcessCcd.ipynb index bf9f2fe1..9c459061 100644 --- a/ImageProcessing/ProcessCcd.ipynb +++ b/ImageProcessing/ProcessCcd.ipynb @@ -12,14 +12,12 @@ "\n", "## Learning Objectives:\n", "\n", - "This notebook seeks to mostly replicate the first two steps of the LSST Stack [\"Getting started tutorials\"](https://pipelines.lsst.io/getting-started/index.html#getting-started-tutorials). 
We first setup a working directory to point to an installation of the HSC data (more details on available data sets can be found in [DataInventory.ipynb](https://github.com/LSSTScienceCollaborations/StackClub/blob/master/Basics/DataInventory.ipynb)).\n", + "This notebook seeks to teach users how to unpack a command line task, specifically `processCcd.py`, to get at the python API functionality that is being called. It is a digression from Justin Myles implementation of the HSC rerun processing [link].\n", "\n", "After working through this tutorial you should be able to:\n", "\n", - "* Ingest images from a shared data set into your personal `Butler`\n", - "* Investigate the content of a dataset using the `Butler` from the command line with `processCcd.py` \n", - "* Run the `processCcd.py` command line task to process images\n", - "* Interface with the `processCcd` API directly in python.\n", + "* Find the source code for a command line task\n", + "* Investigate and run those tasks in python\n", "\n", "## Logistics\n", "This notebook is intended to be runnable on `lsst-lspdev.ncsa.illinois.edu` from a local git clone of https://github.com/LSSTScienceCollaborations/StackClub." @@ -40,80 +38,139 @@ "source": [ "## Command Line Processing\n", "\n", - "In this first step we perform command line processing of images following the getting started tutorials [here](https://pipelines.lsst.io/getting-started/data-setup.html#) and [here](https://pipelines.lsst.io/getting-started/processccd.html). Much more documentation can be found at those links, so we go through this pretty quickly..." + "In this first step we perform command line processing of images following the getting started tutorials [here](https://pipelines.lsst.io/getting-started/data-setup.html#) and [here](https://pipelines.lsst.io/getting-started/processccd.html). 
We are specifically interested in digging into the following line:\n", + "\n", + "```\n", + "processCcd.py $DATADIR --rerun processCcdOutputs --id\n", + "```" ] }, { - "cell_type": "code", - "execution_count": null, + "cell_type": "markdown", "metadata": {}, - "outputs": [], "source": [ - "# Define an environment variable that points to the shared data directory\n", - "os.environ['CI_HSC_DIR'] = \"/project/shared/data/ci_hsc\"" + "We start by tracking down the location of the `processCcd.py` shell script" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 1, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "/opt/lsst/software/stack/stack/miniconda3-4.3.21-10a4fa6/Linux64/pipe_tasks/16.0+1/bin/processCcd.py\n" + ] + } + ], "source": [ - "# Create our DATA working directory in add the HSC mapper\n", - "# NOTE: We may want to delete an existing data directory first?\n", - "!mkdir DATA\n", - "!echo \"lsst.obs.hsc.HscMapper\" > DATA/_mapper" + "!(which processCcd.py)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Next we ingest the images and transmission curves from the shared data store into our own private directory." + "This is the proverbial \"end of the thread\". Our goal is to pull on this thread to unravel the python/C++ functions that are being called under the hood. 
We start by taking a peek at this script" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "#!/usr/bin/env python\n", "#\n", "# LSST Data Management System\n", "# Copyright 2008, 2009, 2010 LSST Corporation.\n", "#\n", "# This product includes software developed by the\n", "# LSST Project (http://www.lsst.org/).\n", "#\n", "# This program is free software: you can redistribute it and/or modify\n", "# it under the terms of the GNU General Public License as published by\n", "# the Free Software Foundation, either version 3 of the License, or\n", "# (at your option) any later version.\n", "#\n", "# This program is distributed in the hope that it will be useful,\n", "# but WITHOUT ANY WARRANTY; without even the implied warranty of\n", "# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\n", "# GNU General Public License for more details.\n", "#\n", "# You should have received a copy of the LSST License Statement and\n", "# the GNU General Public License along with this program. If not,\n", "# see .\n", "#\n", "from lsst.pipe.tasks.processCcd import ProcessCcdTask\n", "\n", "ProcessCcdTask.parseAndRun()\n" ] } ], "source": [ "!cat $(which processCcd.py)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Ok, this hasn't gotten us very far, but we now have the next link in our chain:\n", "```\n", "from lsst.pipe.tasks.processCcd import ProcessCcdTask\n", "```\n", "This seems promising! \n", "\n", "There are two ways we can proceed from here. 
One is to Google `lsst.pipe.tasks.processCcd`, which will take us to this [doxygen page](http://doxygen.lsst.codes/stack/doxygen/x_masterDoxyDoc/classlsst_1_1pipe_1_1tasks_1_1process_ccd_1_1_process_ccd_task.html) and/or this soure code on [GitHub](https://github.com/lsst/pipe_tasks/blob/master/python/lsst/pipe/tasks/processCcd.py). The second approach is to do the import oursleves...\n" ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": 7, "metadata": {}, + "outputs": [], "source": [ - "Next we link to the calibration and reference catalogs." + "import lsst.pipe.tasks.processCcd\n", + "from lsst.pipe.tasks.processCcd import ProcessCcdTask, ProcessCcdConfig" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 5, "metadata": {}, "outputs": [], "source": [ - "!ln -s $CI_HSC_DIR/CALIB/ DATA/CALIB" + "#help(lsst.pipe.tasks.processCcd)" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 13, "metadata": {}, - "outputs": [], + "outputs": [ + { + "ename": "KeyError", + "evalue": "'*'", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mc\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mProcessCcdConfig\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mc\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mformatHistory\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'*'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m/opt/lsst/software/stack/stack/miniconda3-4.3.21-10a4fa6/Linux64/pex_config/16.0/python/lsst/pex/config/config.py\u001b[0m in \u001b[0;36mformatHistory\u001b[0;34m(self, name, **kwargs)\u001b[0m\n\u001b[1;32m 698\u001b[0m \"\"\"\n\u001b[1;32m 
699\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mlsst\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpex\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mconfig\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mhistory\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mpexHist\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 700\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mpexHist\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mformat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mname\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 701\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 702\u001b[0m \"\"\"\n", + "\u001b[0;32m/opt/lsst/software/stack/stack/miniconda3-4.3.21-10a4fa6/Linux64/pex_config/16.0/python/lsst/pex/config/history.py\u001b[0m in \u001b[0;36mformat\u001b[0;34m(config, name, writeSourceLine, prefix, verbose)\u001b[0m\n\u001b[1;32m 142\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 143\u001b[0m \u001b[0moutputs\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 144\u001b[0;31m \u001b[0;32mfor\u001b[0m \u001b[0mvalue\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mstack\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlabel\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mconfig\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mhistory\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mname\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 145\u001b[0m \u001b[0moutput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 146\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mframe\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mstack\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mKeyError\u001b[0m: '*'" + ] + } + ], "source": [ - "!mkdir -p DATA/ref_cats\n", - "!ln -s 
$CI_HSC_DIR/ps1_pv3_3pi_20170110 DATA/ref_cats/ps1_pv3_3pi_20170110" + "c = ProcessCcdConfig()\n", + "c.formatHistory('*')" ] }, { From d76f0dc6f24dbeef9cf8646ca9d244b95ba3ddc5 Mon Sep 17 00:00:00 2001 From: Alex Drlica-Wagner Date: Fri, 14 Sep 2018 20:37:21 +0000 Subject: [PATCH 04/14] Diving into how the fringe subtraction is done. --- ImageProcessing/ProcessCcd.ipynb | 244 ++++++++++++++++++++++++++----- 1 file changed, 211 insertions(+), 33 deletions(-) diff --git a/ImageProcessing/ProcessCcd.ipynb b/ImageProcessing/ProcessCcd.ipynb index 9c459061..3ec7b1cf 100644 --- a/ImageProcessing/ProcessCcd.ipynb +++ b/ImageProcessing/ProcessCcd.ipynb @@ -12,7 +12,7 @@ "\n", "## Learning Objectives:\n", "\n", - "This notebook seeks to teach users how to unpack a command line task, specifically `processCcd.py`, to get at the python API functionality that is being called. It is a digression from Justin Myles implementation of the HSC rerun processing [link].\n", + "This notebook seeks to teach users how to unpack a command line task, specifically `processCcd.py`, and access the python API functionality that is being called. 
This notebook is a digression from Justin Myles's script that runs the HSC rerun processing [link].\n", "\n", "After working through this tutorial you should be able to:\n", "\n", @@ -25,11 +25,12 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 1, "metadata": {}, "outputs": [], "source": [ - "import os" + "import os\n", + "import pydoc" ] }, { @@ -54,7 +55,7 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 2, "metadata": {}, "outputs": [ { @@ -78,7 +79,7 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 3, "metadata": {}, "outputs": [ { @@ -121,18 +122,19 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Ok, this hasn't gotten us very far, but we now have the next link in our chain:\n", + "Ok, this hasn't gotten us very far, but after getting through the stock header, we now have the next link in our chain:\n", "```\n", "from lsst.pipe.tasks.processCcd import ProcessCcdTask\n", "```\n", - "This seems promising! \n", "\n", - "There are two ways we can proceed from here. One is to Google `lsst.pipe.tasks.processCcd`, which will take us to this [doxygen page](http://doxygen.lsst.codes/stack/doxygen/x_masterDoxyDoc/classlsst_1_1pipe_1_1tasks_1_1process_ccd_1_1_process_ccd_task.html) and/or this soure code on [GitHub](https://github.com/lsst/pipe_tasks/blob/master/python/lsst/pipe/tasks/processCcd.py). The second approach is to do the import oursleves...\n" + "There are two ways we can proceed from here. One is to [Google](http://lmgtfy.com/?q=lsst.pipe.tasks.processCcd) `lsst.pipe.tasks.processCcd`, which will take us to this [doxygen page](http://doxygen.lsst.codes/stack/doxygen/x_masterDoxyDoc/classlsst_1_1pipe_1_1tasks_1_1process_ccd_1_1_process_ccd_task.html) and/or the source code on [GitHub](https://github.com/lsst/pipe_tasks/blob/master/python/lsst/pipe/tasks/processCcd.py). 
\n", + "\n", + "The second approach is to import the class ourselves and try to investigate it interactively.\n" ] }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "import lsst.pipe.tasks.processCcd\n", "from lsst.pipe.tasks.processCcd import ProcessCcdTask, ProcessCcdConfig" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can get to the source code for these classes directly using the [`stackclub` toolkit module](https://stackclub.readthedocs.io/), as shown in the [FindingDocs.ipynb](https://github.com/LSSTScienceCollaborations/StackClub/blob/master/GettingStarted/FindingDocs.ipynb)" + ] + }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "[lsst.pipe.tasks.processCcd](https://github.com/lsst/pipe_tasks/blob/master/python/lsst/pipe/tasks/processCcd.py)" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[lsst.pipe.tasks.processCcd](https://github.com/lsst/pipe_tasks/blob/master/python/lsst/pipe/tasks/processCcd.py)\n" + ] + } + ], + "source": [ + "from stackclub import where_is\n", + "where_is(ProcessCcdConfig)" + ] + }, + { + "cell_type": "markdown", "metadata": {}, - "outputs": [], "source": [ - "#help(lsst.pipe.tasks.processCcd)" + "Next we can create an instance of the `ProcessCcdConfig` and try calling the `help` method (commented out for brevity). What we are really interested in are the \"Data descriptors\", which we can print directly after capturing the documentation output by `help`." 
] }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 48, "metadata": {}, "outputs": [ { - "ename": "KeyError", - "evalue": "'*'", - "output_type": "error", - "traceback": [ - "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mc\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mProcessCcdConfig\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mc\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mformatHistory\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'*'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", - "\u001b[0;32m/opt/lsst/software/stack/stack/miniconda3-4.3.21-10a4fa6/Linux64/pex_config/16.0/python/lsst/pex/config/config.py\u001b[0m in \u001b[0;36mformatHistory\u001b[0;34m(self, name, **kwargs)\u001b[0m\n\u001b[1;32m 698\u001b[0m \"\"\"\n\u001b[1;32m 699\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mlsst\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpex\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mconfig\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mhistory\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mpexHist\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 700\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mpexHist\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mformat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mname\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 701\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 702\u001b[0m \"\"\"\n", - "\u001b[0;32m/opt/lsst/software/stack/stack/miniconda3-4.3.21-10a4fa6/Linux64/pex_config/16.0/python/lsst/pex/config/history.py\u001b[0m in \u001b[0;36mformat\u001b[0;34m(config, name, 
writeSourceLine, prefix, verbose)\u001b[0m\n\u001b[1;32m 142\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 143\u001b[0m \u001b[0moutputs\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 144\u001b[0;31m \u001b[0;32mfor\u001b[0m \u001b[0mvalue\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mstack\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlabel\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mconfig\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mhistory\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mname\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 145\u001b[0m \u001b[0moutput\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 146\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mframe\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mstack\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;31mKeyError\u001b[0m: '*'" + "name": "stdout", + "output_type": "stream", + "text": [ + " | ----------------------------------------------------------------------\n", + " | Data descriptors defined here:\n", + " | \n", + " | isr\n", + " | Task to perform instrumental signature removal or load a post-ISR image; ISR consists of:\n", + " | - assemble raw amplifier images into an exposure with image, variance and mask planes\n", + " | - perform bias subtraction, flat fielding, etc.\n", + " | - mask known bad pixels\n", + " | - provide a preliminary WCS\n", + " | (`ConfigurableInstance`, default ````)\n", + " | \n", + " | charImage\n", + " | Task to characterize a science exposure:\n", + " | - detect sources, usually at high S/N\n", + " | - estimate the background, which is subtracted from the image and returned as field \"background\"\n", + " | - estimate a PSF model, which is added to the exposure\n", + " | - interpolate over defects and cosmic rays, updating the image, variance and mask 
planes\n", + " | (`ConfigurableInstance`, default ````)\n", + " | \n", + " | doCalibrate\n", + " | Perform calibration? (`bool`, default ``True``)\n", + " | \n", + " | calibrate\n", + " | Task to perform astrometric and photometric calibration:\n", + " | - refine the WCS in the exposure\n", + " | - refine the Calib photometric calibration object in the exposure\n", + " | - detect sources, usually at low S/N\n", + " | (`ConfigurableInstance`, default ````)\n", + " | \n" ] } ], "source": [ - "c = ProcessCcdConfig()\n", - "c.formatHistory('*')" + "config = ProcessCcdConfig()\n", + "#help(config)\n", + "helplist = pydoc.render_doc(config).split('\\n')\n", + "print('\\n'.join(helplist[18:47]))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The first step is to try to get at the documentation through the `help` function (commented out below for brevity). We can find " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "We'd like to investigate what data we have available (the `--show-data` flag)." + "The `ProcessCcdConfig` object contains of both raw configurables like `doCalibrate` and other configs, like `isr`. To investigate one of these in more detail, we can import it and query it's \"Data descriptors\".\n" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 70, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " | \n", + " | Data descriptors defined here:\n", + " | \n", + " | doBias\n", + " | Apply bias frame correction? (`bool`, default ``True``)\n", + " | \n", + " | doDark\n", + " | Apply dark frame correction? (`bool`, default ``True``)\n", + " | \n", + " | doFlat\n", + " | Apply flat field correction? (`bool`, default ``True``)\n", + " | \n", + " | doFringe\n", + " | Apply fringe correction? (`bool`, default ``True``)\n", + " | \n", + " | doDefect\n", + " | Apply correction for CCD defects, e.g. hot pixels? 
(`bool`, default ``True``)\n", + " | \n", + " | doAddDistortionModel\n", + " | Apply a distortion model based on camera geometry to the WCS? (`bool`, default ``True``)\n", + " | \n", + " | doWrite\n", + " | Persist postISRCCD? (`bool`, default ``True``)\n", + " | ...\n" + ] + } + ], "source": [ - "!processCcd.py DATA --rerun processCcdOutputs --id --show data" + "from lsst.ip.isr.isrTask import IsrTask, IsrTaskConfig\n", + "print('\\n'.join(pydoc.render_doc(IsrTaskConfig).split('\\n')[16:40]) + '...')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "We choose one CCD `(visit=903334, ccd=16)` and pass it to the `processCcd` command line task" + "These configurationas are pretty self-explanitory, but say that we really want to understand what `doFringe` is doing. Inorder to get that information we need to go to the source code." ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 63, + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "[lsst.ip.isr.isrTask](https://github.com/lsst/ip_isr/blob/master/python/lsst/ip/isr/isrTask.py)" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[lsst.ip.isr.isrTask](https://github.com/lsst/ip_isr/blob/master/python/lsst/ip/isr/isrTask.py)\n" + ] + } + ], + "source": [ + "where_is(IsrTaskConfig)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can then search this file for `doFringe` and we find [several lines](https://github.com/lsst/ip_isr/blob/cc4efb7d763d3663c9e989339505df9654f23fd9/python/lsst/ip/isr/isrTask.py#L597-L598) that look like this:\n", + "\n", + " if self.config.doFringe and not self.config.fringeAfterFlat:\n", + " self.fringe.run(ccdExposure, **fringes.getDict())\n", + " \n", + "If we want to go deeper to see what `fringe.run` does, we can repeat the above process" + ] + }, + { + "cell_type": "code", + "execution_count": 
75, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n" + ] + }, + { + "data": { + "text/markdown": [ + "[lsst.ip.isr.fringe](https://github.com/lsst/ip_isr/blob/master/python/lsst/ip/isr/fringe.py)" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[lsst.ip.isr.fringe](https://github.com/lsst/ip_isr/blob/master/python/lsst/ip/isr/fringe.py)\n" + ] + } + ], + "source": [ + "isr_task = IsrTask()\n", + "print(isr_task.fringe)\n", + "import lsst.ip.isr.fringe\n", + "where_is(lsst.ip.isr.fringe.FringeTask)" + ] + }, + { + "cell_type": "markdown", "metadata": {}, - "outputs": [], "source": [ - "!processCcd.py DATA --rerun processCcdOutputs --id visit=903334 ccd=16" + "We finally make our way to the source code for [FringeTask.run](https://github.com/lsst/ip_isr/blob/cc4efb7d763d3663c9e989339505df9654f23fd9/python/lsst/ip/isr/fringe.py#L104), which gives us details on the fringe subtraction." ] } ], From 6785a633ed2d0ca50be4c7c646492035c6954e36 Mon Sep 17 00:00:00 2001 From: Alex Drlica-Wagner Date: Fri, 14 Sep 2018 20:49:43 +0000 Subject: [PATCH 05/14] More appropriate name for pipeline task tutorial --- ImageProcessing/PipelineTasks.py | 235 ++++++++++++++++++ ImageProcessing/ProcessCcd.ipynb | 408 ------------------------------- ImageProcessing/README.rst | 12 +- 3 files changed, 241 insertions(+), 414 deletions(-) create mode 100644 ImageProcessing/PipelineTasks.py delete mode 100644 ImageProcessing/ProcessCcd.ipynb diff --git a/ImageProcessing/PipelineTasks.py b/ImageProcessing/PipelineTasks.py new file mode 100644 index 00000000..43fe96c9 --- /dev/null +++ b/ImageProcessing/PipelineTasks.py @@ -0,0 +1,235 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Pipeline Tasks\n", + "\n", + "
Owner: **Alex Drlica-Wagner** ([@kadrlica](https://github.com/LSSTScienceCollaborations/StackClub/issues/new?body=@kadrlica))\n", + "
Last Verified to Run: **2018-08-10**\n", + "
Verified Stack Release: **v16.0**\n", + "\n", + "## Learning Objectives:\n", + "\n", + "This notebook seeks to teach users how to unpack a pipeline task. As an example, we focus on `processCcd.py`, with the goal of diving into the configuration, interface, and structure of pipeline tasks. This notebook is a digression from Justin Myles's script that demonstrates how to run a series of pipeline tasks from the command line to rerun HSC data processing [link].\n", + "\n", + "After working through this tutorial you should be able to:\n", + "\n", + "* Find the source code for a pipeline task\n", + "* Configure (and investigate the configuration of) pipeline tasks\n", + "* Investigate and run those tasks in Python\n", + "\n", + "## Logistics\n", + "This notebook is intended to be runnable on `lsst-lspdev.ncsa.illinois.edu` from a local git clone of https://github.com/LSSTScienceCollaborations/StackClub." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import pydoc" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Command Line Processing\n", + "\n", + "In this first step we perform command line processing of images following the getting started tutorials [here](https://pipelines.lsst.io/getting-started/data-setup.html#) and [here](https://pipelines.lsst.io/getting-started/processccd.html). We are specifically interested in digging into the following line:\n", + "\n", + "```\n", + "processCcd.py $DATADIR --rerun processCcdOutputs --id\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We start by tracking down the location of the `processCcd.py` shell script" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!(which processCcd.py)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This is the proverbial \"end of the thread\". 
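As an aside, the same lookup can be done from Python with the standard library. This is a minimal sketch (the exact install path, and whether `processCcd.py` is on the `PATH` at all, depends on having a Stack environment set up):

```python
import shutil

def find_task_script(name):
    """Locate an executable on the current PATH (what the shell `which` does)."""
    return shutil.which(name)

# In a Stack environment this resolves to something like
# .../Linux64/pipe_tasks/16.0/bin/processCcd.py; elsewhere it returns None.
print(find_task_script("processCcd.py"))
```

This can be handy when scripting over many command-line tasks at once instead of shelling out to `which` for each one.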
Our goal is to pull on this thread to unravel the Python/C++ functions that are being called under the hood. We start by taking a peek at this script" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!cat $(which processCcd.py)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "OK, this hasn't gotten us very far, but after getting through the stock header, we now have the next link in our chain:\n", + "```\n", + "from lsst.pipe.tasks.processCcd import ProcessCcdTask\n", + "```\n", + "\n", + "There are two ways we can proceed from here. One is to [Google](http://lmgtfy.com/?q=lsst.pipe.tasks.processCcd) `lsst.pipe.tasks.processCcd`, which will take us to this [doxygen page](http://doxygen.lsst.codes/stack/doxygen/x_masterDoxyDoc/classlsst_1_1pipe_1_1tasks_1_1process_ccd_1_1_process_ccd_task.html) and/or the source code on [GitHub](https://github.com/lsst/pipe_tasks/blob/master/python/lsst/pipe/tasks/processCcd.py). 
\n", + "\n", + "The second approach is to import the class ourselves and try to investigate it interactively.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import lsst.pipe.tasks.processCcd\n", + "from lsst.pipe.tasks.processCcd import ProcessCcdTask, ProcessCcdConfig" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can get to the source code for these classes directly using the [`stackclub` toolkit module](https://stackclub.readthedocs.io/), as shown in the [FindingDocs.ipynb](https://github.com/LSSTScienceCollaborations/StackClub/blob/master/GettingStarted/FindingDocs.ipynb)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from stackclub import where_is\n", + "where_is(ProcessCcdConfig)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next we can create an instance of the `ProcessCcdConfig` and try calling the `help` method (commented out for brevity). What we are really interested in are the \"Data descriptors\", which we can print directly after capturing the documentation output by `help`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "config = ProcessCcdConfig()\n", + "#help(config)\n", + "helplist = pydoc.render_doc(config).split('\\n')\n", + "print('\\n'.join(helplist[18:47]))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `help` output above lists each configuration parameter as a \"Data descriptor\", together with a short description, its type, and its default value." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `ProcessCcdConfig` object contains both simple configuration fields, like `doCalibrate`, and nested sub-task configurations, like `isr`. 
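The hard-coded slice `helplist[18:47]` is fragile: it silently breaks if the docstring layout changes between Stack versions. A more robust variant is to search the rendered help for the "Data descriptors" section by name. A self-contained sketch, using a stand-in class since the real `ProcessCcdConfig` is only importable inside the Stack:

```python
import pydoc

class DemoConfig:
    """A stand-in for a pex_config Config object (illustrative only)."""
    doCalibrate = True

# pydoc.render_doc returns the same text that help() prints to the screen
lines = pydoc.render_doc(DemoConfig).split('\n')

# Locate the "Data descriptors" section by name instead of hard-coding a slice
start = next(i for i, line in enumerate(lines) if 'descriptors' in line)
print('\n'.join(lines[start:start + 4]))
```

The same search-by-heading trick works on the real config object, so the cell keeps working even if the preamble of the help text grows or shrinks.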
To investigate one of these in more detail, we can import it and query its \"Data descriptors\".\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from lsst.ip.isr.isrTask import IsrTask, IsrTaskConfig\n", + "print('\\n'.join(pydoc.render_doc(IsrTaskConfig).split('\\n')[16:40]) + '...')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "These configuration parameters are pretty self-explanatory, but say that we really want to understand what `doFringe` is doing. In order to get that information we need to go to the source code." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "where_is(IsrTaskConfig)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can then search this file for `doFringe` and we find [several lines](https://github.com/lsst/ip_isr/blob/cc4efb7d763d3663c9e989339505df9654f23fd9/python/lsst/ip/isr/isrTask.py#L597-L598) that look like this:\n", + "\n", + " if self.config.doFringe and not self.config.fringeAfterFlat:\n", + " self.fringe.run(ccdExposure, **fringes.getDict())\n", + " \n", + "If we want to go deeper to see what `fringe.run` does, we can repeat the above process" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "isr_task = IsrTask()\n", + "print(isr_task.fringe)\n", + "import lsst.ip.isr.fringe\n", + "where_is(lsst.ip.isr.fringe.FringeTask)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We finally make our way to the source code for [FringeTask.run](https://github.com/lsst/ip_isr/blob/cc4efb7d763d3663c9e989339505df9654f23fd9/python/lsst/ip/isr/fringe.py#L104), which gives us details on the fringe subtraction." 
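Under the hood, `where_is` is doing standard source introspection: the `inspect` module alone can recover the defining module and source file for any pure-Python object. A sketch, demonstrated on `pydoc` itself since the LSST modules are only importable inside the Stack:

```python
import inspect
import pydoc

# Map an object back to the module and source file where it is defined,
# which is essentially what where_is() automates (plus the GitHub link)
target = pydoc.render_doc
print(inspect.getmodule(target).__name__)  # 'pydoc'
print(inspect.getsourcefile(target))       # a path ending in pydoc.py
```

This is a useful fallback when the `stackclub` toolkit is not installed in the environment you are working in.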
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "LSST", + "language": "python", + "name": "lsst" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.2" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/ImageProcessing/ProcessCcd.ipynb b/ImageProcessing/ProcessCcd.ipynb deleted file mode 100644 index 3ec7b1cf..00000000 --- a/ImageProcessing/ProcessCcd.ipynb +++ /dev/null @@ -1,408 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# ProcessCcd \n", - "\n", - "
Owner: **Alex Drlica-Wagner** ([@kadrlica](https://github.com/LSSTScienceCollaborations/StackClub/issues/new?body=@kadrlica))\n", - "
Last Verified to Run: **2018-08-10**\n", - "
Verified Stack Release: **v16.0**\n", - "\n", - "## Learning Objectives:\n", - "\n", - "This notebook seeks to teach users how to unpack a command line task, specifically `processCcd.py`, and access the python API functionality that is being called. This notebook is a digression from Justin Myles script that runs the HSC rerun processing [link].\n", - "\n", - "After working through this tutorial you should be able to:\n", - "\n", - "* Find the source code for a command line task\n", - "* Investigate and run those tasks in python\n", - "\n", - "## Logistics\n", - "This notebook is intended to be runnable on `lsst-lspdev.ncsa.illinois.edu` from a local git clone of https://github.com/LSSTScienceCollaborations/StackClub." - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "import pydoc" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Command Line Processing\n", - "\n", - "In this first step we perform command line processing of images following the getting started tutorials [here](https://pipelines.lsst.io/getting-started/data-setup.html#) and [here](https://pipelines.lsst.io/getting-started/processccd.html). We are specifically interested in digging into the following line:\n", - "\n", - "```\n", - "processCcd.py $DATADIR --rerun processCcdOutputs --id\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We start by tracking down the location of the `processCcd.py` shell script" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "/opt/lsst/software/stack/stack/miniconda3-4.3.21-10a4fa6/Linux64/pipe_tasks/16.0+1/bin/processCcd.py\n" - ] - } - ], - "source": [ - "!(which processCcd.py)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "This is the proverbial \"end of the thread\". 
Our goal is to pull on this thread to unravel the python/C++ functions that are being called under the hood. We start by taking a peak in this script" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "#!/usr/bin/env python\n", - "#\n", - "# LSST Data Management System\n", - "# Copyright 2008, 2009, 2010 LSST Corporation.\n", - "#\n", - "# This product includes software developed by the\n", - "# LSST Project (http://www.lsst.org/).\n", - "#\n", - "# This program is free software: you can redistribute it and/or modify\n", - "# it under the terms of the GNU General Public License as published by\n", - "# the Free Software Foundation, either version 3 of the License, or\n", - "# (at your option) any later version.\n", - "#\n", - "# This program is distributed in the hope that it will be useful,\n", - "# but WITHOUT ANY WARRANTY; without even the implied warranty of\n", - "# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\n", - "# GNU General Public License for more details.\n", - "#\n", - "# You should have received a copy of the LSST License Statement and\n", - "# the GNU General Public License along with this program. If not,\n", - "# see .\n", - "#\n", - "from lsst.pipe.tasks.processCcd import ProcessCcdTask\n", - "\n", - "ProcessCcdTask.parseAndRun()\n" - ] - } - ], - "source": [ - "!cat $(which processCcd.py)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Ok, this hasn't gotten us very far, but after getting through the stock header, we now have the next link in our chain:\n", - "```\n", - "from lsst.pipe.tasks.processCcd import ProcessCcdTask\n", - "```\n", - "\n", - "There are two ways we can proceed from here. 
One is to [Google](http://lmgtfy.com/?q=lsst.pipe.tasks.processCcd) `lsst.pipe.tasks.processCcd`, which will take us to this [doxygen page](http://doxygen.lsst.codes/stack/doxygen/x_masterDoxyDoc/classlsst_1_1pipe_1_1tasks_1_1process_ccd_1_1_process_ccd_task.html) and/or the soure code on [GitHub](https://github.com/lsst/pipe_tasks/blob/master/python/lsst/pipe/tasks/processCcd.py). \n", - "\n", - "The second approach is to do the import the class oursleves and try to investigate it interactively.\n" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [], - "source": [ - "import lsst.pipe.tasks.processCcd\n", - "from lsst.pipe.tasks.processCcd import ProcessCcdTask, ProcessCcdConfig" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We can get to the source code for these classes directly using the [`stackclub` toolkit module](https://stackclub.readthedocs.io/), as shown in the [FindingDocs.ipynb](https://github.com/LSSTScienceCollaborations/StackClub/blob/master/GettingStarted/FindingDocs.ipynb)" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [ - { - "data": { - "text/markdown": [ - "[lsst.pipe.tasks.processCcd](https://github.com/lsst/pipe_tasks/blob/master/python/lsst/pipe/tasks/processCcd.py)" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[lsst.pipe.tasks.processCcd](https://github.com/lsst/pipe_tasks/blob/master/python/lsst/pipe/tasks/processCcd.py)\n" - ] - } - ], - "source": [ - "from stackclub import where_is\n", - "where_is(ProcessCcdConfig)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next we can create an instance of the `ProcessCcdConfig` and try calling the `help` method (commented out for brevity). 
What we are really interested in are the \"Data descriptors\", which we can print directly after capturing the documentation output by `help`." - ] - }, - { - "cell_type": "code", - "execution_count": 48, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - " | ----------------------------------------------------------------------\n", - " | Data descriptors defined here:\n", - " | \n", - " | isr\n", - " | Task to perform instrumental signature removal or load a post-ISR image; ISR consists of:\n", - " | - assemble raw amplifier images into an exposure with image, variance and mask planes\n", - " | - perform bias subtraction, flat fielding, etc.\n", - " | - mask known bad pixels\n", - " | - provide a preliminary WCS\n", - " | (`ConfigurableInstance`, default ````)\n", - " | \n", - " | charImage\n", - " | Task to characterize a science exposure:\n", - " | - detect sources, usually at high S/N\n", - " | - estimate the background, which is subtracted from the image and returned as field \"background\"\n", - " | - estimate a PSF model, which is added to the exposure\n", - " | - interpolate over defects and cosmic rays, updating the image, variance and mask planes\n", - " | (`ConfigurableInstance`, default ````)\n", - " | \n", - " | doCalibrate\n", - " | Perform calibration? 
(`bool`, default ``True``)\n", - " | \n", - " | calibrate\n", - " | Task to perform astrometric and photometric calibration:\n", - " | - refine the WCS in the exposure\n", - " | - refine the Calib photometric calibration object in the exposure\n", - " | - detect sources, usually at low S/N\n", - " | (`ConfigurableInstance`, default ````)\n", - " | \n" - ] - } - ], - "source": [ - "config = ProcessCcdConfig()\n", - "#help(config)\n", - "helplist = pydoc.render_doc(config).split('\\n')\n", - "print('\\n'.join(helplist[18:47]))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The first step is to try to get at the documentation through the `help` function (commented out below for brevity). We can find " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The `ProcessCcdConfig` object contains of both raw configurables like `doCalibrate` and other configs, like `isr`. To investigate one of these in more detail, we can import it and query it's \"Data descriptors\".\n" - ] - }, - { - "cell_type": "code", - "execution_count": 70, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - " | \n", - " | Data descriptors defined here:\n", - " | \n", - " | doBias\n", - " | Apply bias frame correction? (`bool`, default ``True``)\n", - " | \n", - " | doDark\n", - " | Apply dark frame correction? (`bool`, default ``True``)\n", - " | \n", - " | doFlat\n", - " | Apply flat field correction? (`bool`, default ``True``)\n", - " | \n", - " | doFringe\n", - " | Apply fringe correction? (`bool`, default ``True``)\n", - " | \n", - " | doDefect\n", - " | Apply correction for CCD defects, e.g. hot pixels? (`bool`, default ``True``)\n", - " | \n", - " | doAddDistortionModel\n", - " | Apply a distortion model based on camera geometry to the WCS? (`bool`, default ``True``)\n", - " | \n", - " | doWrite\n", - " | Persist postISRCCD? 
(`bool`, default ``True``)\n", - " | ...\n" - ] - } - ], - "source": [ - "from lsst.ip.isr.isrTask import IsrTask, IsrTaskConfig\n", - "print('\\n'.join(pydoc.render_doc(IsrTaskConfig).split('\\n')[16:40]) + '...')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "These configurationas are pretty self-explanitory, but say that we really want to understand what `doFringe` is doing. Inorder to get that information we need to go to the source code." - ] - }, - { - "cell_type": "code", - "execution_count": 63, - "metadata": {}, - "outputs": [ - { - "data": { - "text/markdown": [ - "[lsst.ip.isr.isrTask](https://github.com/lsst/ip_isr/blob/master/python/lsst/ip/isr/isrTask.py)" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[lsst.ip.isr.isrTask](https://github.com/lsst/ip_isr/blob/master/python/lsst/ip/isr/isrTask.py)\n" - ] - } - ], - "source": [ - "where_is(IsrTaskConfig)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We can then search this file for `doFringe` and we find [several lines](https://github.com/lsst/ip_isr/blob/cc4efb7d763d3663c9e989339505df9654f23fd9/python/lsst/ip/isr/isrTask.py#L597-L598) that look like this:\n", - "\n", - " if self.config.doFringe and not self.config.fringeAfterFlat:\n", - " self.fringe.run(ccdExposure, **fringes.getDict())\n", - " \n", - "If we want to go deeper to see what `fringe.run` does, we can repeat the above process" - ] - }, - { - "cell_type": "code", - "execution_count": 75, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "data": { - "text/markdown": [ - "[lsst.ip.isr.fringe](https://github.com/lsst/ip_isr/blob/master/python/lsst/ip/isr/fringe.py)" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - 
"text": [ - "[lsst.ip.isr.fringe](https://github.com/lsst/ip_isr/blob/master/python/lsst/ip/isr/fringe.py)\n" - ] - } - ], - "source": [ - "isr_task = IsrTask()\n", - "print(isr_task.fringe)\n", - "import lsst.ip.isr.fringe\n", - "where_is(lsst.ip.isr.fringe.FringeTask)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We finally make our way to the source code for [FringeTask.run](https://github.com/lsst/ip_isr/blob/cc4efb7d763d3663c9e989339505df9654f23fd9/python/lsst/ip/isr/fringe.py#L104), which gives us details on the fringe subtraction." - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "LSST", - "language": "python", - "name": "lsst" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.2" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} diff --git a/ImageProcessing/README.rst b/ImageProcessing/README.rst index 36bdd516..e5176be4 100644 --- a/ImageProcessing/README.rst +++ b/ImageProcessing/README.rst @@ -14,13 +14,13 @@ This folder contains a set of tutorial notebooks exploring the image processing - Owner - * - **ProcessEimage.ipynb** - - How to process a simulated "e-image" using the DM Stack. - - `ipynb `_, - `rendered `_ + * - **PipelineTasks.ipynb** + - Take a deep dive into the configuration, interface, and structure of pipeline tasks. + - `ipynb `_, + `rendered `_ - .. image:: https://github.com/LSSTScienceCollaborations/StackClub/blob/rendered/ImageProcessing/log/ProcessEimage.svg - :target: https://github.com/LSSTScienceCollaborations/StackClub/blob/rendered/ImageProcessing/log/ProcessEimage.log + .. 
image:: https://github.com/LSSTScienceCollaborations/StackClub/blob/rendered/ImageProcessing/log/PipelineTasks.svg + :target: https://github.com/LSSTScienceCollaborations/StackClub/blob/rendered/ImageProcessing/log/PipelineTasks.log - `Alex Drlica-Wagner `_ From 64200fdc4599f46c03a0844e01caaaef1a31ee2b Mon Sep 17 00:00:00 2001 From: Alex Drlica-Wagner Date: Sat, 15 Sep 2018 02:43:50 +0000 Subject: [PATCH 06/14] fix suffix typo --- ImageProcessing/{PipelineTasks.py => PipelineTasks.ipynb} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename ImageProcessing/{PipelineTasks.py => PipelineTasks.ipynb} (100%) diff --git a/ImageProcessing/PipelineTasks.py b/ImageProcessing/PipelineTasks.ipynb similarity index 100% rename from ImageProcessing/PipelineTasks.py rename to ImageProcessing/PipelineTasks.ipynb From 3464543e0ad9c9c4d76db8c03f14db099db5596e Mon Sep 17 00:00:00 2001 From: Alex Drlica-Wagner Date: Sat, 15 Sep 2018 02:50:52 +0000 Subject: [PATCH 07/14] Adding some detail --- ImageProcessing/PipelineTasks.ipynb | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/ImageProcessing/PipelineTasks.ipynb b/ImageProcessing/PipelineTasks.ipynb index 43fe96c9..29f3c048 100644 --- a/ImageProcessing/PipelineTasks.ipynb +++ b/ImageProcessing/PipelineTasks.ipynb @@ -38,9 +38,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Command Line Processing\n", + "## Diving into a Pipeline Task\n", "\n", - "In this first step we perform command line processing of images following the getting started tutorials [here](https://pipelines.lsst.io/getting-started/data-setup.html#) and [here](https://pipelines.lsst.io/getting-started/processccd.html). We are specifically interested in digging into the following line:\n", + "Our goal is to dive into the inner workings of `processCcd.py`. 
We pick up from the command line processing described in the getting started tutorials [here](https://pipelines.lsst.io/getting-started/data-setup.html#) and [here](https://pipelines.lsst.io/getting-started/processccd.html), as well as Justin Myles's HSC reprocessing notebook [here](). We are specifically interested in digging into the following line:\n", + "\n", + "```\n", + "processCcd.py $DATADIR --rerun processCcdOutputs --id\n", @@ -124,7 +124,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Next we can create an instance of the `ProcessCcdConfig` and try calling the `help` method (commented out for brevity). What we are really interested in are the \"Data descriptors\", which we can print directly after capturing the documentation output by `help`." + "# Diving into a Task Config\n", + "\n", + "Pipeline tasks are controlled and tweaked through their associated `TaskConfig` objects. To investigate the configuration parameters of the `ProcessCcdTask`, we create an instance of the `ProcessCcdConfig` and try calling the `help` method (commented out for brevity). What we are really interested in are the \"Data descriptors\", which we can print directly after capturing the documentation output by `help`." 
] }, { From 449e8512a2a763063972629fd6bae7e3a68eaf4b Mon Sep 17 00:00:00 2001 From: Alex Drlica-Wagner Date: Fri, 21 Sep 2018 17:42:06 +0000 Subject: [PATCH 08/14] comment edit --- ImageProcessing/PipelineTasks.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ImageProcessing/PipelineTasks.ipynb b/ImageProcessing/PipelineTasks.ipynb index 29f3c048..246ff98d 100644 --- a/ImageProcessing/PipelineTasks.ipynb +++ b/ImageProcessing/PipelineTasks.ipynb @@ -209,7 +209,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "We finally make our way to the source code for [FringeTask.run](https://github.com/lsst/ip_isr/blob/cc4efb7d763d3663c9e989339505df9654f23fd9/python/lsst/ip/isr/fringe.py#L104), which gives us details on the fringe subtraction." + "We finally make our way to the source code for [FringeTask.run](https://github.com/lsst/ip_isr/blob/cc4efb7d763d3663c9e989339505df9654f23fd9/python/lsst/ip/isr/fringe.py#L104), which gives us details on how the fringe correction is performed (i.e. by creating a fringe image and subtracting it from the data image)." 
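The essence of the correction in `FringeTask.run` is to scale a fringe frame to match the fringing in the science image and subtract it. This toy sketch shows only that scale-and-subtract arithmetic; it is not the Stack's `FringeTask` algorithm, which measures the fringe amplitude robustly from many small image subregions:

```python
def subtract_fringe(data, fringe):
    """Scale a fringe pattern to the data by least squares and subtract it.

    Illustrative only: this is just the core operation FringeTask performs
    once the fringe amplitude has been measured.
    """
    # Best-fit amplitude a minimizing sum((data - a * fringe)**2)
    a = sum(d * f for d, f in zip(data, fringe)) / sum(f * f for f in fringe)
    return [d - a * f for d, f in zip(data, fringe)], a

fringe = [0.5, -0.3, 0.8, -0.6, 0.1]   # toy unit-amplitude fringe pattern
data = [3.0 * f for f in fringe]       # fake science pixels: pure fringing
corrected, amp = subtract_fringe(data, fringe)
print(amp)         # ~3.0
print(corrected)   # residuals near zero
```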
] } ], From 31ec1a1cf79a5b97a1ab4a063054fe6265361b1d Mon Sep 17 00:00:00 2001 From: Alex Drlica-Wagner Date: Fri, 21 Sep 2018 17:42:39 +0000 Subject: [PATCH 09/14] Renaming notebook --- .../{PipelineTasks.ipynb => PipelineProcessingAPI.ipynb} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename ImageProcessing/{PipelineTasks.ipynb => PipelineProcessingAPI.ipynb} (100%) diff --git a/ImageProcessing/PipelineTasks.ipynb b/ImageProcessing/PipelineProcessingAPI.ipynb similarity index 100% rename from ImageProcessing/PipelineTasks.ipynb rename to ImageProcessing/PipelineProcessingAPI.ipynb From 0a71d82677816b0e6aed3621ce3be15e83a7e7f3 Mon Sep 17 00:00:00 2001 From: Alex Drlica-Wagner Date: Fri, 12 Oct 2018 17:37:53 +0000 Subject: [PATCH 10/14] for working with merlin --- ImageProcessing/PipelineProcessingAPI.ipynb | 93 +++++++++++++++++++++ 1 file changed, 93 insertions(+) diff --git a/ImageProcessing/PipelineProcessingAPI.ipynb b/ImageProcessing/PipelineProcessingAPI.ipynb index 246ff98d..1067b23b 100644 --- a/ImageProcessing/PipelineProcessingAPI.ipynb +++ b/ImageProcessing/PipelineProcessingAPI.ipynb @@ -211,6 +211,99 @@ "source": [ "We finally make our way to the source code for [FringeTask.run](https://github.com/lsst/ip_isr/blob/cc4efb7d763d3663c9e989339505df9654f23fd9/python/lsst/ip/isr/fringe.py#L104), which gives us details on how the fringe correction is performed (i.e. by creating a fringe image and subtracting it from the data image)." ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Running a Task\n", + "\n", + "Now that we've figured out how to investigate the config for a task, let's try to run the task itself." 
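Before running the real task, it may help to see the Task/Config pattern in miniature: config parameters are descriptors carrying a docstring and a default, and a task is constructed from a config instance. The following is a hypothetical sketch that mimics (but is not) the `lsst.pex.config` machinery:

```python
class Field:
    """Descriptor holding a documented, defaulted config parameter."""
    def __init__(self, doc, default):
        self.doc, self.default = doc, default
    def __set_name__(self, owner, name):
        self.name = name
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # class-level access exposes the Field itself
        return obj.__dict__.get(self.name, self.default)
    def __set__(self, obj, value):
        obj.__dict__[self.name] = value

class IsrConfig:
    """Toy config with documented boolean switches."""
    doBias = Field("Apply bias frame correction?", True)
    doFlat = Field("Apply flat field correction?", True)

class IsrLikeTask:
    """Toy task constructed from a config instance."""
    ConfigClass = IsrConfig
    def __init__(self, config=None):
        self.config = config or self.ConfigClass()

config = IsrConfig()
config.doFlat = False  # override one default
task = IsrLikeTask(config)
```

This is why `help()` on a real config lists the parameters under "Data descriptors": each field is literally a Python descriptor on the config class.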
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# First we get the raw image\n", + "from lsst.daf.persistence import Butler\n", + "datadir = '/project/stack-club/validation_data_hsc/data'\n", + "butler = Butler(datadir)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we create the config and task instances" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Since we don't want to do calibration processing we turn off these configs\n", + "config = ProcessCcdConfig()\n", + "config.isr.doBias = False\n", + "config.isr.doDark = False\n", + "config.isr.doFlat = False" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "task = ProcessCcdTask(butler=butler, config=config)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "#butler.queryMetadata('raw',['visit','ccd'])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "subset = butler.subset('raw', dataId={'visit':903332})" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dRef = butler.dataRef('raw', dataId={'visit':903332,'ccd':25})" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "task.run(dRef)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] } ], "metadata": { From 99ad9fcc93531a0e61454ed4110baa9b2732eff5 Mon Sep 17 00:00:00 2001 From: Alex Drlica-Wagner Date: Fri, 2 Nov 2018 16:40:42 +0000 Subject: [PATCH 11/14] processCcd works from command line --- ImageProcessing/PipelineProcessingAPI.ipynb | 100 +++++++++++++++++--- 1 file changed, 87 insertions(+), 13 deletions(-) diff 
--git a/ImageProcessing/PipelineProcessingAPI.ipynb b/ImageProcessing/PipelineProcessingAPI.ipynb index 1067b23b..7617a8b9 100644 --- a/ImageProcessing/PipelineProcessingAPI.ipynb +++ b/ImageProcessing/PipelineProcessingAPI.ipynb @@ -8,8 +8,20 @@ "\n", "
Owner: **Alex Drlica-Wagner** ([@kadrlica](https://github.com/LSSTScienceCollaborations/StackClub/issues/new?body=@kadrlica))\n", "
Last Verified to Run: **2018-08-10**\n", - "
Verified Stack Release: **v16.0**\n", - "\n", + "
Verified Stack Release: **v16.0**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ "## Learning Objectives:\n", "\n", "This notebook seeks to teach users how to unpack a pipeline tasks. As an example, we focus on `processCcd.py`, with the goal of diving into the configuration, interface, and structure of pipeline tasks. This notebook is a digression from Justin Myles script that demonstrates how to run a series of pipeline tasks from the command line to rerun HSC data processing [link].\n", @@ -24,6 +36,29 @@ "This notebook is intended to be runnable on `lsst-lspdev.ncsa.illinois.edu` from a local git clone of https://github.com/LSSTScienceCollaborations/StackClub." ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# What version of the Stack are we using?\n", + "! echo $HOSTNAME\n", + "! eups list -s | grep lsst_distrib" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Filter some warnings printed by v16.0 of the stack\n", + "import warnings\n", + "warnings.simplefilter(\"ignore\", category=FutureWarning)\n", + "warnings.simplefilter(\"ignore\", category=UserWarning)" + ] + }, { "cell_type": "code", "execution_count": null, @@ -228,9 +263,36 @@ "outputs": [], "source": [ "# First we get the raw image\n", + "import shutil\n", + "repodir = '/project/stack-club/validation_data_hsc'\n", + "datadir = os.path.join(repodir,'data')\n", + "outdir = '/home/kadrlica/tmpdir/'\n", + "# output directory cannot exist (wait for Gen3 Butler...)\n", + "if os.path.exists(outdir):\n", + " shutil.rmtree(outdir)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Setup the astrometry reference catalogs following:\n", + "# 
https://github.com/lsst/validation_data_hsc\n", + "!export SETUP_ASTROMETRY_NET_DATA=\"astrometry_net_data sdss-dr9-fink-v5b\"\n", + "!export ASTROMETRY_NET_DATA_DIR={repodir + '/sdss-dr9-fink-v5b'}" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ "from lsst.daf.persistence import Butler\n", - "datadir = '/project/stack-club/validation_data_hsc/data'\n", - "butler = Butler(datadir)" + "#butler = Butler(datadir) # specify output in the butler...\n", + "butler = Butler(inputs=datadir, outputs=outdir)" ] }, { @@ -250,7 +312,8 @@ "config = ProcessCcdConfig()\n", "config.isr.doBias = False\n", "config.isr.doDark = False\n", - "config.isr.doFlat = False" + "config.isr.doFlat = False\n", + "config.doCalibrate = False" ] }, { @@ -268,7 +331,9 @@ "metadata": {}, "outputs": [], "source": [ - "#butler.queryMetadata('raw',['visit','ccd'])" + "# Find the available data sets\n", + "ccds = butler.queryMetadata('raw',['visit','ccd'])\n", + "print(ccds[10])" ] }, { @@ -277,7 +342,7 @@ "metadata": {}, "outputs": [], "source": [ - "subset = butler.subset('raw', dataId={'visit':903332})" + "#subset = butler.subset('raw', dataId={'visit':903332, 'ccd':25})" ] }, { @@ -286,7 +351,7 @@ "metadata": {}, "outputs": [], "source": [ - "dRef = butler.dataRef('raw', dataId={'visit':903332,'ccd':25})" + "dataRef = butler.dataRef('raw', dataId={'visit':903332,'ccd':25})" ] }, { @@ -295,15 +360,24 @@ "metadata": {}, "outputs": [], "source": [ - "task.run(dRef)" + "task.run(dataRef)" ] }, { - "cell_type": "code", - "execution_count": null, + "cell_type": "raw", "metadata": {}, - "outputs": [], - "source": [] + "source": [ + "# This appears to work when called from the shell:\n", + "\n", + "cd /project/stack-club/validation_data_hsc\n", + "export OMP_NUM_THREADS=1\n", + "export SETUP_ASTROMETRY_NET_DATA=\"astrometry_net_data sdss-dr9-fink-v5b\"\n", + "export ASTROMETRY_NET_DATA_DIR=sdss-dr9-fink-v5b\n", + "input=\"data\"\n", + 
"output=\"/home/kadrlica/tmpdir\"\n", + "rm -rf $output\n", + "processCcd.py $input --calib CALIB --output $output --id ccd=25 visit=903982" + ] } ], "metadata": { From ec41f84c50d6c2ed38a15915b4ff1f38d7c7d8fd Mon Sep 17 00:00:00 2001 From: Alex Drlica-Wagner Date: Fri, 2 Nov 2018 18:10:22 +0000 Subject: [PATCH 12/14] Things are running --- ImageProcessing/PipelineProcessingAPI.ipynb | 369 +++++++++++++++++--- 1 file changed, 315 insertions(+), 54 deletions(-) diff --git a/ImageProcessing/PipelineProcessingAPI.ipynb b/ImageProcessing/PipelineProcessingAPI.ipynb index 7617a8b9..1f4cdf1a 100644 --- a/ImageProcessing/PipelineProcessingAPI.ipynb +++ b/ImageProcessing/PipelineProcessingAPI.ipynb @@ -11,13 +11,6 @@ "
Verified Stack Release: **v16.0**" ] }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, { "cell_type": "markdown", "metadata": {}, @@ -38,9 +31,18 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "jld-lab-kadrlica-r160\n", + "lsst_distrib 16.0+1 \tcurrent v16_0 setup\n" + ] + } + ], "source": [ "# What version of the Stack are we using?\n", "! echo $HOSTNAME\n", @@ -49,7 +51,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 2, "metadata": {}, "outputs": [], "source": [ @@ -61,7 +63,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 3, "metadata": {}, "outputs": [], "source": [ @@ -91,9 +93,17 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "/opt/lsst/software/stack/stack/miniconda3-4.3.21-10a4fa6/Linux64/pipe_tasks/16.0+1/bin/processCcd.py\n" + ] + } + ], "source": [ "!(which processCcd.py)" ] @@ -107,9 +117,41 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "#!/usr/bin/env python\n", + "#\n", + "# LSST Data Management System\n", + "# Copyright 2008, 2009, 2010 LSST Corporation.\n", + "#\n", + "# This product includes software developed by the\n", + "# LSST Project (http://www.lsst.org/).\n", + "#\n", + "# This program is free software: you can redistribute it and/or modify\n", + "# it under the terms of the GNU General Public License as published by\n", + "# the Free Software Foundation, either version 3 of the License, or\n", + "# (at your option) any later 
version.\n", + "#\n", + "# This program is distributed in the hope that it will be useful,\n", + "# but WITHOUT ANY WARRANTY; without even the implied warranty of\n", + "# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\n", + "# GNU General Public License for more details.\n", + "#\n", + "# You should have received a copy of the LSST License Statement and\n", + "# the GNU General Public License along with this program. If not,\n", + "# see .\n", + "#\n", + "from lsst.pipe.tasks.processCcd import ProcessCcdTask\n", + "\n", + "ProcessCcdTask.parseAndRun()\n" + ] + } + ], "source": [ "!cat $(which processCcd.py)" ] @@ -130,7 +172,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 6, "metadata": {}, "outputs": [], "source": [ @@ -147,9 +189,29 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "[lsst.pipe.tasks.processCcd](https://github.com/lsst/pipe_tasks/blob/master/python/lsst/pipe/tasks/processCcd.py)" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[lsst.pipe.tasks.processCcd](https://github.com/lsst/pipe_tasks/blob/master/python/lsst/pipe/tasks/processCcd.py)\n" + ] + } + ], "source": [ "from stackclub import where_is\n", "where_is(ProcessCcdConfig)" @@ -166,9 +228,45 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " | ----------------------------------------------------------------------\n", + " | Data descriptors defined here:\n", + " | \n", + " | isr\n", + " | Task to perform instrumental signature removal or load a post-ISR image; ISR consists of:\n", + " | - assemble raw amplifier images into an exposure with 
image, variance and mask planes\n", + " | - perform bias subtraction, flat fielding, etc.\n", + " | - mask known bad pixels\n", + " | - provide a preliminary WCS\n", + " | (`ConfigurableInstance`, default ````)\n", + " | \n", + " | charImage\n", + " | Task to characterize a science exposure:\n", + " | - detect sources, usually at high S/N\n", + " | - estimate the background, which is subtracted from the image and returned as field \"background\"\n", + " | - estimate a PSF model, which is added to the exposure\n", + " | - interpolate over defects and cosmic rays, updating the image, variance and mask planes\n", + " | (`ConfigurableInstance`, default ````)\n", + " | \n", + " | doCalibrate\n", + " | Perform calibration? (`bool`, default ``True``)\n", + " | \n", + " | calibrate\n", + " | Task to perform astrometric and photometric calibration:\n", + " | - refine the WCS in the exposure\n", + " | - refine the Calib photometric calibration object in the exposure\n", + " | - detect sources, usually at low S/N\n", + " | (`ConfigurableInstance`, default ````)\n", + " | \n" + ] + } + ], "source": [ "config = ProcessCcdConfig()\n", "#help(config)\n", @@ -192,9 +290,40 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " | \n", + " | Data descriptors defined here:\n", + " | \n", + " | doBias\n", + " | Apply bias frame correction? (`bool`, default ``True``)\n", + " | \n", + " | doDark\n", + " | Apply dark frame correction? (`bool`, default ``True``)\n", + " | \n", + " | doFlat\n", + " | Apply flat field correction? (`bool`, default ``True``)\n", + " | \n", + " | doFringe\n", + " | Apply fringe correction? (`bool`, default ``True``)\n", + " | \n", + " | doDefect\n", + " | Apply correction for CCD defects, e.g. hot pixels? 
(`bool`, default ``True``)\n", + " | \n", + " | doAddDistortionModel\n", + " | Apply a distortion model based on camera geometry to the WCS? (`bool`, default ``True``)\n", + " | \n", + " | doWrite\n", + " | Persist postISRCCD? (`bool`, default ``True``)\n", + " | ...\n" + ] + } + ], "source": [ "from lsst.ip.isr.isrTask import IsrTask, IsrTaskConfig\n", "print('\\n'.join(pydoc.render_doc(IsrTaskConfig).split('\\n')[16:40]) + '...')" @@ -209,9 +338,29 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "[lsst.ip.isr.isrTask](https://github.com/lsst/ip_isr/blob/master/python/lsst/ip/isr/isrTask.py)" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[lsst.ip.isr.isrTask](https://github.com/lsst/ip_isr/blob/master/python/lsst/ip/isr/isrTask.py)\n" + ] + } + ], "source": [ "where_is(IsrTaskConfig)" ] @@ -230,9 +379,36 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n" + ] + }, + { + "data": { + "text/markdown": [ + "[lsst.ip.isr.fringe](https://github.com/lsst/ip_isr/blob/master/python/lsst/ip/isr/fringe.py)" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[lsst.ip.isr.fringe](https://github.com/lsst/ip_isr/blob/master/python/lsst/ip/isr/fringe.py)\n" + ] + } + ], "source": [ "isr_task = IsrTask()\n", "print(isr_task.fringe)\n", @@ -258,7 +434,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 23, "metadata": {}, "outputs": [], "source": [ @@ -266,32 +442,45 @@ "import shutil\n", "repodir = 
'/project/stack-club/validation_data_hsc'\n", "datadir = os.path.join(repodir,'data')\n", - "outdir = '/home/kadrlica/tmpdir/'\n", - "# output directory cannot exist (wait for Gen3 Butler...)\n", - "if os.path.exists(outdir):\n", - " shutil.rmtree(outdir)" + "outdir = '/home/kadrlica/tmpdir/'" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "First, we run the task as we would from the command line. All that we are doing here is parsing the `cmdline` string as if it were the arguments passed to `processCcdTask.py` on the command line. Unfortunately, the task does not print its output to a notebook cell, so we just need to wait for ~1 minute for this to run. We turn off several optional subtask steps so that things run faster..." ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 19, "metadata": {}, "outputs": [], "source": [ - "# Setup the astrometry reference catalogs following:\n", - "# https://github.com/lsst/validation_data_hsc\n", - "!export SETUP_ASTROMETRY_NET_DATA=\"astrometry_net_data sdss-dr9-fink-v5b\"\n", - "!export ASTROMETRY_NET_DATA_DIR={repodir + '/sdss-dr9-fink-v5b'}" + "%%timeit\n", + "task = ProcessCcdTask()\n", + "cmdline = '{0}/data --calib {0}/CALIB --output {1} --id ccd=25 visit=903982'.format(repodir,outdir)\n", + "cmdline += ' --config doCalibrate=False isr.doBias=False isr.doDark=False isr.doFlat=False'\n", + "struct = task.parseAndRun(cmdline.split())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The next level in complexity is to try to call `task.run`. In order to do this we need to provide a `dataRef`. To do this, we create a butler."
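The butler/dataRef relationship can be pictured as a mapping from a dataset type plus a dataId to a dataset, with the dataRef bundling the lookup key. A toy sketch (hypothetical classes, not the `lsst.daf.persistence` API):

```python
class ToyButler:
    """Toy repository: maps (datasetType, dataId) keys to datasets."""
    def __init__(self, repo):
        self.repo = repo  # {(datasetType, sorted dataId items): dataset}
    def get(self, datasetType, dataId):
        return self.repo[(datasetType, tuple(sorted(dataId.items())))]
    def dataRef(self, datasetType, dataId):
        return ToyDataRef(self, datasetType, dataId)

class ToyDataRef:
    """Bundles a dataset type with a specific dataId for later retrieval."""
    def __init__(self, butler, datasetType, dataId):
        self.butler, self.datasetType, self.dataId = butler, datasetType, dataId
    def get(self):
        return self.butler.get(self.datasetType, self.dataId)

repo = {('raw', (('ccd', 25), ('visit', 903332))): 'raw pixels'}
butler = ToyButler(repo)
dataRef = butler.dataRef('raw', dataId={'visit': 903332, 'ccd': 25})
print(dataRef.get())
```

Passing a dataRef rather than the raw data lets a task fetch every input it needs (image, calibrations, metadata) through one handle, which is why `task.run` takes a dataRef here.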
] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "from lsst.daf.persistence import Butler\n", - "#butler = Butler(datadir) # specify output in the butler...\n", + "# output directory cannot exist (wait for Gen3 Butler...)\n", + "if os.path.exists(outdir): shutil.rmtree(outdir)\n", "butler = Butler(inputs=datadir, outputs=outdir)" ] }, @@ -304,7 +493,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 30, "metadata": {}, "outputs": [], "source": [ @@ -318,7 +507,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 33, "metadata": {}, "outputs": [], "source": [ @@ -327,9 +516,17 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], + "execution_count": 27, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "(903332, 10)\n" + ] + } + ], "source": [ "# Find the available data sets\n", "ccds = butler.queryMetadata('raw',['visit','ccd'])\n", @@ -347,7 +544,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 28, "metadata": {}, "outputs": [], "source": [ @@ -356,11 +553,34 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 36, "metadata": {}, "outputs": [], "source": [ - "task.run(dataRef)" + "# Setup the astrometry reference catalogs following:\n", + "# https://github.com/lsst/validation_data_hsc\n", + "!export SETUP_ASTROMETRY_NET_DATA=\"astrometry_net_data sdss-dr9-fink-v5b\"\n", + "!export ASTROMETRY_NET_DATA_DIR={repodir + '/sdss-dr9-fink-v5b'}\n", + "\n", + "# TODO: Create the astrometry reference object to pass to the task" + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "40.8 s ± 476 ms per loop (mean ± std. dev. 
of 7 runs, 1 loop each)\n" + ] + } + ], + "source": [ + "%%timeit\n", + "struct = task.run(dataRef)" ] }, { @@ -378,6 +598,47 @@ "rm -rf $output\n", "processCcd.py $input --calib CALIB --output $output --id ccd=25 visit=903982" ] + }, + { + "cell_type": "raw", + "metadata": {}, + "source": [ + "repo=/project/stack-club/validation_data_hsc\n", + "output=/home/kadrlica/tmpdir\n", + "processCcd.py $repo/data --calib $repo/CALIB --output $output --id ccd=25 visit=903982" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Running ISR Task" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": {}, + "outputs": [], + "source": [ + "# Try to get the task outputs into the notebook (and failing)\n", + "import sys,logging\n", + "logger = logging.getLogger()\n", + "logger.setLevel(logging.DEBUG)\n", + "\n", + "# Create STDERR handler\n", + "handler = logging.StreamHandler(sys.stderr)\n", + "# ch.setLevel(logging.DEBUG)\n", + "\n", + "# Create formatter and add it to the handler\n", + "formatter = logging.Formatter('%(name)s - %(levelname)s - %(message)s')\n", + "handler.setFormatter(formatter)\n", + "\n", + "# Set STDERR handler as the only handler \n", + "logger.handlers = [handler]" + ] } ], "metadata": { From 76c7018f3df481df1467787896273b578254bc4c Mon Sep 17 00:00:00 2001 From: Alex Drlica-Wagner Date: Fri, 9 Nov 2018 19:17:17 +0000 Subject: [PATCH 13/14] running ISR task --- ImageProcessing/PipelineProcessingAPI.ipynb | 511 ++++++++------------ 1 file changed, 212 insertions(+), 299 deletions(-) diff --git a/ImageProcessing/PipelineProcessingAPI.ipynb b/ImageProcessing/PipelineProcessingAPI.ipynb index 1f4cdf1a..f88b8166 100644 --- a/ImageProcessing/PipelineProcessingAPI.ipynb +++ b/ImageProcessing/PipelineProcessingAPI.ipynb @@ -4,7 +4,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Pipeline Tasks\n", + "# Pipeline Task API\n", "\n", "
Owner: **Alex Drlica-Wagner** ([@kadrlica](https://github.com/LSSTScienceCollaborations/StackClub/issues/new?body=@kadrlica))\n", "
Last Verified to Run: **2018-08-10**\n", @@ -31,18 +31,9 @@ }, { "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "jld-lab-kadrlica-r160\n", - "lsst_distrib 16.0+1 \tcurrent v16_0 setup\n" - ] - } - ], + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "# What version of the Stack are we using?\n", "! echo $HOSTNAME\n", @@ -51,7 +42,7 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -63,12 +54,23 @@ }, { "cell_type": "code", - "execution_count": 3, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ + "%matplotlib inline\n", "import os\n", - "import pydoc" + "import pydoc\n", + "import matplotlib.pyplot as plt" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import lsst.afw.display as afwDisplay" ] }, { @@ -93,17 +95,9 @@ }, { "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "/opt/lsst/software/stack/stack/miniconda3-4.3.21-10a4fa6/Linux64/pipe_tasks/16.0+1/bin/processCcd.py\n" - ] - } - ], + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "!(which processCcd.py)" ] @@ -117,41 +111,9 @@ }, { "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "#!/usr/bin/env python\n", - "#\n", - "# LSST Data Management System\n", - "# Copyright 2008, 2009, 2010 LSST Corporation.\n", - "#\n", - "# This product includes software developed by the\n", - "# LSST Project (http://www.lsst.org/).\n", - "#\n", - "# This program is free software: you can redistribute it and/or modify\n", - "# it under the terms of the GNU General Public License as published by\n", - "# the Free Software Foundation, either 
version 3 of the License, or\n", - "# (at your option) any later version.\n", - "#\n", - "# This program is distributed in the hope that it will be useful,\n", - "# but WITHOUT ANY WARRANTY; without even the implied warranty of\n", - "# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\n", - "# GNU General Public License for more details.\n", - "#\n", - "# You should have received a copy of the LSST License Statement and\n", - "# the GNU General Public License along with this program. If not,\n", - "# see .\n", - "#\n", - "from lsst.pipe.tasks.processCcd import ProcessCcdTask\n", - "\n", - "ProcessCcdTask.parseAndRun()\n" - ] - } - ], + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "!cat $(which processCcd.py)" ] @@ -172,7 +134,7 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -189,29 +151,9 @@ }, { "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [ - { - "data": { - "text/markdown": [ - "[lsst.pipe.tasks.processCcd](https://github.com/lsst/pipe_tasks/blob/master/python/lsst/pipe/tasks/processCcd.py)" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[lsst.pipe.tasks.processCcd](https://github.com/lsst/pipe_tasks/blob/master/python/lsst/pipe/tasks/processCcd.py)\n" - ] - } - ], + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "from stackclub import where_is\n", "where_is(ProcessCcdConfig)" @@ -226,54 +168,6 @@ "Pipeline tasks are controlled and tweaked through there associated `TaskConfig` objects. To investigate the configuration parameters of the `ProcessCcdTask`, we create an instance of the `ProcessCcdConfig` and try calling the `help` method (commented out for brevity). 
What we are really interested in are the \"Data descriptors\", which we can print directly after capturing the documentation output by `help`." ] }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - " | ----------------------------------------------------------------------\n", - " | Data descriptors defined here:\n", - " | \n", - " | isr\n", - " | Task to perform instrumental signature removal or load a post-ISR image; ISR consists of:\n", - " | - assemble raw amplifier images into an exposure with image, variance and mask planes\n", - " | - perform bias subtraction, flat fielding, etc.\n", - " | - mask known bad pixels\n", - " | - provide a preliminary WCS\n", - " | (`ConfigurableInstance`, default ````)\n", - " | \n", - " | charImage\n", - " | Task to characterize a science exposure:\n", - " | - detect sources, usually at high S/N\n", - " | - estimate the background, which is subtracted from the image and returned as field \"background\"\n", - " | - estimate a PSF model, which is added to the exposure\n", - " | - interpolate over defects and cosmic rays, updating the image, variance and mask planes\n", - " | (`ConfigurableInstance`, default ````)\n", - " | \n", - " | doCalibrate\n", - " | Perform calibration? 
(`bool`, default ``True``)\n", - " | \n", - " | calibrate\n", - " | Task to perform astrometric and photometric calibration:\n", - " | - refine the WCS in the exposure\n", - " | - refine the Calib photometric calibration object in the exposure\n", - " | - detect sources, usually at low S/N\n", - " | (`ConfigurableInstance`, default ````)\n", - " | \n" - ] - } - ], - "source": [ - "config = ProcessCcdConfig()\n", - "#help(config)\n", - "helplist = pydoc.render_doc(config).split('\\n')\n", - "print('\\n'.join(helplist[18:47]))" - ] - }, { "cell_type": "markdown", "metadata": {}, @@ -290,40 +184,9 @@ }, { "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - " | \n", - " | Data descriptors defined here:\n", - " | \n", - " | doBias\n", - " | Apply bias frame correction? (`bool`, default ``True``)\n", - " | \n", - " | doDark\n", - " | Apply dark frame correction? (`bool`, default ``True``)\n", - " | \n", - " | doFlat\n", - " | Apply flat field correction? (`bool`, default ``True``)\n", - " | \n", - " | doFringe\n", - " | Apply fringe correction? (`bool`, default ``True``)\n", - " | \n", - " | doDefect\n", - " | Apply correction for CCD defects, e.g. hot pixels? (`bool`, default ``True``)\n", - " | \n", - " | doAddDistortionModel\n", - " | Apply a distortion model based on camera geometry to the WCS? (`bool`, default ``True``)\n", - " | \n", - " | doWrite\n", - " | Persist postISRCCD? 
(`bool`, default ``True``)\n", - " | ...\n" - ] - } - ], + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "from lsst.ip.isr.isrTask import IsrTask, IsrTaskConfig\n", "print('\\n'.join(pydoc.render_doc(IsrTaskConfig).split('\\n')[16:40]) + '...')" @@ -338,29 +201,9 @@ }, { "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [ - { - "data": { - "text/markdown": [ - "[lsst.ip.isr.isrTask](https://github.com/lsst/ip_isr/blob/master/python/lsst/ip/isr/isrTask.py)" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[lsst.ip.isr.isrTask](https://github.com/lsst/ip_isr/blob/master/python/lsst/ip/isr/isrTask.py)\n" - ] - } - ], + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "where_is(IsrTaskConfig)" ] @@ -379,36 +222,9 @@ }, { "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "data": { - "text/markdown": [ - "[lsst.ip.isr.fringe](https://github.com/lsst/ip_isr/blob/master/python/lsst/ip/isr/fringe.py)" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[lsst.ip.isr.fringe](https://github.com/lsst/ip_isr/blob/master/python/lsst/ip/isr/fringe.py)\n" - ] - } - ], + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "isr_task = IsrTask()\n", "print(isr_task.fringe)\n", @@ -434,103 +250,106 @@ }, { "cell_type": "code", - "execution_count": 23, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ - "# First we get the raw image\n", + "# First we setup the directories for the raw image and calibration.\n", "import shutil\n", - "repodir = '/project/stack-club/validation_data_hsc'\n", - "datadir = os.path.join(repodir,'data')\n", + "basedir = 
'/project/stack-club/validation_data_hsc'\n", "repodir = os.path.join(basedir,'data')\n", "# By default the calibration directory lives in ${repodir}/CALIB; \n", "# however, in the validation_data_hsc it lives in ${basedir}/CALIB\n", "calibdir = os.path.join(basedir,'CALIB')\n", "# The directory for our processed output\n", "outdir = '/home/kadrlica/tmpdir/'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Next, we create a butler to manage the data repositories\n", "from lsst.daf.persistence import Butler\n", "# output directory cannot exist (wait for Gen3 Butler...)\n", "if os.path.exists(outdir): shutil.rmtree(outdir)\n", "\n", "# This strange structure for Butler initialization is a 'feature' of the Gen2 butler \n", "# and should be replaced in Gen3\n", "butler = Butler(inputs={'root': repodir, 'mapperArgs': {'calibRoot': calibdir}}, \n", " outputs={'root': outdir})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can use the butler to list the available data sets, and to choose the CCD that we want to process."
] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ - "The next level in complexity is to try to call `task.run`. In order to do this we need to provide a `dataRef`. To do this, we create a butler." + "# We can list available data sets\n", + "ccds = butler.queryMetadata('raw',['visit','ccd'])\n", + "print(ccds[0:10],'...')" ] }, { "cell_type": "code", - "execution_count": 24, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ - "from lsst.daf.persistence import Butler\n", - "# output directory cannot exist (wait for Gen3 Butler...)\n", - "if os.path.exists(outdir): shutil.rmtree(outdir)\n", - "butler = Butler(inputs=datadir, outputs=outdir)" + "# We choose one of these CCDs as our dataRef\n", + "dataRef = butler.dataRef('raw', dataId={'visit':903332,'ccd':25})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Next, we create the config and task instances" + "Now we are prepared to create our `IsrTaskConfig`. At this point, we explicitly enable the bias, dark, and flat corrections that we want the task to apply." 
] }, { "cell_type": "code", - "execution_count": 30, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ - "# Since we don't want to do calibration processing we turn off these configs\n", - "config = ProcessCcdConfig()\n", - "config.isr.doBias = False\n", - "config.isr.doDark = False\n", - "config.isr.doFlat = False\n", - "config.doCalibrate = False" + "config = IsrTaskConfig()\n", + "config.doBias = True\n", + "config.doDark = True\n", + "config.doFlat = True" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "With the `config` created and configured, we can now create the `IsrTask`" ] }, { "cell_type": "code", - "execution_count": 33, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ - "task = ProcessCcdTask(butler=butler, config=config)" + "isr_task = IsrTask(config)" ] }, { - "cell_type": "code", - "execution_count": 27, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "(903332, 10)\n" - ] - } - ], - "source": [ - "# Find the available data sets\n", - "ccds = butler.queryMetadata('raw',['visit','ccd'])\n", - "print(ccds[10])" + "cell_type": "markdown", + "metadata": {}, + "source": [ + "And run it!" 
] }, { @@ -539,48 +358,56 @@ "metadata": {}, "outputs": [], "source": [ - "#subset = butler.subset('raw', dataId={'visit':903332, 'ccd':25})" + "%%timeit\n", + "struct = isr_task.runDataRef(dataRef)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The output is a `struct` containing the processed exposure:" ] }, { "cell_type": "code", - "execution_count": 28, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ - "dataRef = butler.dataRef('raw', dataId={'visit':903332,'ccd':25})" + "print(struct)" ] }, { "cell_type": "code", - "execution_count": 36, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ - "# Setup the astrometry reference catalogs following:\n", - "# https://github.com/lsst/validation_data_hsc\n", - "!export SETUP_ASTROMETRY_NET_DATA=\"astrometry_net_data sdss-dr9-fink-v5b\"\n", - "!export ASTROMETRY_NET_DATA_DIR={repodir + '/sdss-dr9-fink-v5b'}\n", + "afwDisplay.setDefaultBackend('matplotlib') \n", + "# ADW: why can't we set the backend before this cell?\n", "\n", - "# TODO: Create the astrometry reference object to pass to the task" + "figure = plt.figure(1,figsize=(10,10))\n", + "afw_display = afwDisplay.Display(1)\n", + "afw_display.scale('asinh', 'zscale')\n", + "afw_display.mtv(dataRef.get())\n", + "plt.title(\"Raw\")\n", + "\n", + "figure = plt.figure(2,figsize=(12,12))\n", + "afw_display = afwDisplay.Display(2)\n", + "afw_display.scale('asinh', 'zscale')\n", + "afw_display.mtv(struct.exposure)\n", + "plt.title(\"ISR 'Corrected'\")" ] }, { "cell_type": "code", - "execution_count": 35, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "40.8 s ± 476 ms per loop (mean ± std. dev. 
of 7 runs, 1 loop each)\n" - ] - } - ], + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ - "%%timeit\n", - "struct = task.run(dataRef)" + "for maskName, maskBit in struct.exposure.mask.getMaskPlaneDict().items():\n", + "    print('{}: {}'.format(afw_display.getMaskPlaneColor(maskName),maskName))" ] }, { @@ -614,15 +441,7 @@ "metadata": {}, "outputs": [], "source": [ - "# Running ISR Task" - ] - }, - { - "cell_type": "code", - "execution_count": 21, - "metadata": {}, - "outputs": [], - "source": [ + "# https://github.com/ipython/ipykernel/issues/111#issuecomment-237089618\n", "# Try to get the task outputs into the notebook (and failing)\n", "import sys,logging\n", "logger = logging.getLogger()\n", @@ -639,6 +458,100 @@ "# Set STDERR handler as the only handler \n", "logger.handlers = [handler]" ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Run processCcdTask" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "First, we run the task as we would from the command line. All that we are doing here is parsing the `cmdline` string as if it were the arguments passed to `processCcdTask.py` on the command line. Unfortunately, the task does not print its output to a notebook cell, so we just need to wait for ~1 minute for this to run. We turn off several optional subtask steps so that things run faster..." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Create the task\n", + "task = ProcessCcdTask()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%timeit\n", + "# output directory cannot exist (wait for Gen3 Butler...)\n", + "if os.path.exists(outdir): shutil.rmtree(outdir)\n", + "cmdline = '{0} --calib {1} --output {2} --id ccd=25 visit=903982'.format(repodir,calibdir,outdir)\n", + "cmdline += ' --config doCalibrate=False isr.doBias=False isr.doDark=False isr.doFlat=False'\n", + "struct = task.parseAndRun(cmdline.split())" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "config = ProcessCcdConfig()\n", + "#help(config)\n", + "helplist = pydoc.render_doc(config).split('\\n')\n", + "print('\\n'.join(helplist[18:47]))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The next level in complexity is to call `task.run` directly. In order to do this we need to provide a `dataRef`, so we first create a butler." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Since we don't want to do calibration processing we turn off these configs\n", + "config = ProcessCcdConfig()\n", + "config.isr.doBias = False\n", + "config.isr.doDark = False\n", + "config.isr.doFlat = False\n", + "config.doCalibrate = False" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "task = ProcessCcdTask(butler=butler, config=config)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Setup the astrometry reference catalogs following:\n", + "# https://github.com/lsst/validation_data_hsc\n", + "#!export SETUP_ASTROMETRY_NET_DATA=\"astrometry_net_data sdss-dr9-fink-v5b\"\n", + "#!export ASTROMETRY_NET_DATA_DIR={repodir + '/sdss-dr9-fink-v5b'}\n", + "\n", + "# TODO: Create the astrometry reference object to pass to the task" + ] } ], "metadata": { From aaf674caf1b2b18aff92b61851ab1c250d0b8caa Mon Sep 17 00:00:00 2001 From: Alex Drlica-Wagner Date: Wed, 5 Dec 2018 13:43:55 +0000 Subject: [PATCH 14/14] minor tweaks --- ImageProcessing/PipelineProcessingAPI.ipynb | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/ImageProcessing/PipelineProcessingAPI.ipynb b/ImageProcessing/PipelineProcessingAPI.ipynb index f88b8166..419fc893 100644 --- a/ImageProcessing/PipelineProcessingAPI.ipynb +++ b/ImageProcessing/PipelineProcessingAPI.ipynb @@ -278,7 +278,7 @@ "\n", "# This strange structure for Butler initialization is a 'feature' of the Gen2 butler \n", "# and should be replaced in Gen3\n", - "butler = Butler(inputs={'root': datadir, 'mapperArgs': {'calibRoot': calibdir}}, \n", + "butler = Butler(inputs={'root': repodir, 'mapperArgs': {'calibRoot': calibdir}}, \n", " outputs={'root': outdir})" ] }, @@ -358,7 +358,7 @@ "metadata": {}, "outputs": [], "source": [ - "%%timeit\n", + "%%time\n", "struct = 
isr_task.runDataRef(dataRef)" ] }, @@ -393,7 +393,7 @@ "afw_display.mtv(dataRef.get())\n", "plt.title(\"Raw\")\n", "\n", - "figure = plt.figure(2,figsize=(12,12))\n", + "figure = plt.figure(2,figsize=(10,10))\n", "afw_display = afwDisplay.Display(2)\n", "afw_display.scale('asinh', 'zscale')\n", "afw_display.mtv(struct.exposure)\n",