diff --git a/src/faq.md b/src/faq.md
index 8e7f41a2..1b4419bb 100644
--- a/src/faq.md
+++ b/src/faq.md
@@ -40,10 +40,10 @@ outputs:
 
 ## Rename an input file
 
-This example shows how you can change the name of an input file
+This example demonstrates how to change the name of an input file
 as part of a tool description.
 This could be useful when you are taking files produced from another
-step in a workflow and don't want to work with the default names that these
+step in a workflow, and don't want to work with the default names that these
 files were given when they were created.
 
 ```yaml
@@ -58,7 +58,7 @@ requirements:
 
 ## Rename an output file
 
-This example shows how you can change the name of an output file
+This example demonstrates how to change the name of an output file
 from the default name given to it by a tool:
 
 ```yaml
@@ -79,6 +79,61 @@ outputs:
       outputEval: ${self[0].basename=inputs.otu_table_name; return self;}
 ```
 
+## Referencing a local script
+
+There are two ways to reference a local script:
+
+The first method involves adding the folder containing your scripts to the `PATH` environment variable. This allows you to run the shell script without using `sh` or `bash` commands.
+
+Start by adding a _shebang_ at the top of your file:
+
+```{code-block}
+#!/bin/bash
+```
+
+After that, make the script executable with the command `chmod +x scriptname.sh`.
+
+Finally, modify your `PATH` to add the directory where your script is located.
+(It is good practice to use `$HOME/bin` for storing your own scripts).
+
+```bash
+export PATH=$PATH:/appropriate/directory
+```
+
+Now you can use `baseCommand: scriptname.sh` to run the script directly.
+
+```cwl
+#!/usr/bin/env cwl-runner
+cwlVersion: v1.0
+class: CommandLineTool
+baseCommand: scriptname.sh
+```
+
+When you wish to share your work later, you can place your script in a software container in the Docker format.
+
+The second method involves including an input of `type: File` in the tool description itself:
+
+```cwl
+class: CommandLineTool
+
+inputs:
+  my_script:
+    type: File
+    inputBinding:
+      position: 0
+
+
+  # other inputs go here
+
+baseCommand: sh
+
+outputs: []
+```
+
+```{note}
+In CWL, everything must be stated explicitly.
+```
+
 ## Setting `self`-based input bindings for optional inputs
 
 Currently, `cwltool` can't cope with missing optional inputs if their
@@ -108,8 +163,8 @@ outputs: []
 
 ## Model a "one-or-the-other" parameter
 
-Below is an example of how
-you can specify different strings to be added to a command line
+Below is an example showing how
+to specify different strings to be added to a command line,
 based on the value given to a Boolean parameter.
 
 ```yaml
@@ -265,7 +320,7 @@ For command-line flags that are either **mutually exclusive** or **dependent**,
 ```
 ## Setting Mutually Exclusive Parameters
 
-In order to properly set fields in a record input type, you need to pass a dictionary to the input to properly set the parameters. This is done by using inline javascript and returning the dictionary with the key of the field you want to set. The source field is set to indicate the input from the workflow to be used as the value.
+To properly set fields in a record input type, you need to pass a dictionary to the input. This is done by using inline JavaScript and returning the dictionary with the key of the field you want to set. The source field is set to indicate the input from the workflow to be used as the value.
 
 ```yaml
 steps:
diff --git a/src/introduction/basic-concepts.md b/src/introduction/basic-concepts.md
index 30459c8a..1c81fde7 100644
--- a/src/introduction/basic-concepts.md
+++ b/src/introduction/basic-concepts.md
@@ -1,10 +1,10 @@
 # Basic Concepts
 
-This section describes the basic concepts for users to get started working with
+This section describes the basic concepts that users need to get started with
 Common Workflow Language (CWL) workflows. Readers are expected to be familiar
-with workflow managers, YAML, and comfortable following instructions for the
-command-line. The other sections of the user guide cover the same concepts but
-in more detail. If you are already familiar with CWL or looking for more advanced
+with workflow managers and YAML, and to be comfortable following instructions on the
+command line. The other sections of the user guide cover the same concepts, but
+in more detail. If you are already familiar with CWL or you are looking for more advanced
 content, you may want to skip this section.
 
 ## The CWL specification
@@ -24,7 +24,7 @@ is the {{ cwl_version }}.
 
 The specification version can have up to three numbers separated by `.`'s (dots).
 The first number is the major release, used for backward-incompatible changes like
-the removal of deprecated features. The second is the minor release number,
+the removal of deprecated features. The second number is the minor release,
 used for new features or smaller changes that are backward-compatible. The last
 number is used for bug fixes, like typos and other corrections to the specification.
 
@@ -37,7 +37,7 @@ the end of this section to [learn more](#learn-more) about it.
 ## Implementations
 
 An implementation of the CWL specification is any software written following
-what is defined in a version of the specification document. Implementations may
+what is defined in a version of the specification document. However, implementations may
 not implement every aspect of the specification.
 
 CWL implementations are licensed under both Open Source and commercial licenses.
@@ -47,6 +47,7 @@ in parallel across many nodes.
 
 % TODO: add a link to the Core Concepts -> Requirements section below?
 
+
 ```{graphviz}
 :name: specification-and-implementations-graph
 :caption: CWL specification, implementations, and other tools.
@@ -125,13 +126,13 @@ takes inputs and produces outputs like a command-line tool.
 
 The workflow is a process that contains steps. Steps can be other workflows
 (nested workflows), command-line tools, or expression tools.
-The inputs of a workflow can be passed to any of its steps, and
+The inputs of a workflow can be passed to any of its steps, while
 the outputs produced by its steps can be used in the final output of
 the workflow.
 
 Operation is an abstract process that also takes inputs, produces outputs, and
 can be used in a workflow. But it is a special operation
-not so commonly used. It is discussed in another section.
+not so commonly used. It is discussed in the [Operations section](../topics/operations.md) of this user guide.
 
 The CWL specification allows for implementations to provide extra functionality
 and specify prerequisites to workflows through *requirements*.
@@ -149,12 +150,12 @@ runners.
 
 Hints are similar to requirements, but while requirements list features that are
 required, hints list optional features. Requirements are explained
-in detail in another section.
+in detail in the [Requirements](../topics/requirements-and-hints.md) section.
 
 ## FAIR workflows
 
 > The FAIR principles have laid a foundation for sharing and publishing
-> digital assets and, in particular, data. The FAIR principles emphasize
+> digital assets, and in particular, data. The FAIR principles emphasize
 > machine accessibility and that all digital assets should be Findable,
 > Accessible, Interoperable, and Reusable. Workflows encode the methods
 > by which the scientific process is conducted and via which data are
@@ -164,11 +165,11 @@ in detail in another section.
 > Workflows Community Initiative.
 
 CWL has roots in "make" and many similar tools that determine order of
-execution based on dependencies between tasks. However, unlike "make", CWL
+execution, based on dependencies between tasks. However, unlike "make", CWL
 tasks are isolated, and you must be explicit about your inputs and outputs.
 
 The benefit of explicitness and isolation are flexibility, portability, and
-scalability: tools and workflows described with CWL can transparently leverage
+scalability; tools and workflows described with CWL can transparently leverage
 technologies such as Docker and be used with CWL implementations from different
 vendors.
 
diff --git a/src/introduction/prerequisites.md b/src/introduction/prerequisites.md
index f739add6..20a654a4 100644
--- a/src/introduction/prerequisites.md
+++ b/src/introduction/prerequisites.md
@@ -1,18 +1,18 @@
 # Prerequisites
 
-% This page supersedes the old setup.md. We used that page as reference while
+% This page supersedes the old setup page: setup.md. We used that page as a reference while
 % writing this documentation.
 
 The software and configurations listed in this section are prerequisites for
 following this user guide. The CWL standards are implemented by many different
 workflow runners and platforms. This list of requirements focuses on the CWL reference runner,
-`cwltool`. You can use another CWL compatible runner or workflow systems but the results and
+`cwltool`. You can use another CWL-compatible runner or workflow system, but the results and
 interface may look different (though the exact workflow outputs should be identical).
 
 ```{admonition} CWL Implementations
 
-There are many implementations of the CWL standards. Some are complete CWL runners,
-others are plug-ins or extensions to workflow engines. We have a better
+There are many implementations of the CWL standards. Some are complete CWL runners, while
+others could be plug-ins or extensions to workflow engines. We have a better
 explanation in the [Implementations](basic-concepts.md#implementations) section.
 ```
 
@@ -26,7 +26,7 @@ of the following options for your operating system:
 - Windows
 
 ```{note}
-If you are using Windows, you will have to install the Windows Subsystem for Linux 2.
+If you are using Windows, you will have to install the [Windows Subsystem for Linux 2](https://learn.microsoft.com/en-us/windows/wsl/install) (WSL2).
 Visit the `cwltool` [documentation](https://github.com/common-workflow-language/cwltool/blob/main/README.rst#ms-windows-users)
 for details on installing WSL2.
 Your operating system also needs internet access and a recent version of Python (3.6+).
@@ -59,7 +59,7 @@ $ (venv) pip install cwltool
 ```
 
 ```{note}
-You can find the `cwl-runner` source code [here](https://github.com/common-workflow-language/cwltool/tree/main/cwlref-runner).
+You can find the `cwl-runner` [source code](https://github.com/common-workflow-language/cwltool/tree/main/cwlref-runner) in the `cwltool` repository.
 Visit the `cwltool` [documentation](https://github.com/common-workflow-language/cwltool#install)
 for other ways to install `cwltool` with `apt` and `conda`.
 ```
@@ -71,10 +71,10 @@ Let's use a simple CWl tool description `true.cwl` with `cwltool`.
 :name: true.cwl
 ```
 
-The `cwltool` command has an option to validate CWL tool and workflow descriptionss. It will parse the
-CWL document, look for syntax errors, and verify that the descriptions are compliant
-with the CWL standards, without running it. To validate CWL workflows (or even a
-standalone command line tool description like above) pass the `--validate` option
+The `cwltool` command has an option to validate CWL tool and workflow descriptions. This option will parse the
+CWL document, look for syntax errors, and verify that the workflow descriptions are compliant
+with the CWL standards. These checks are performed without running the document. To validate CWL workflows (or even a
+standalone command line tool description like the one above), pass the `--validate` option
 to the `cwltool` command:
 
 ```{runcmd} cwltool --validate true.cwl
@@ -91,7 +91,7 @@ You can run the CWL tool description by omitting the `--validate` option:
 
 ### cwl-runner Python module
 
-`cwl-runner` is an implementation-agnostic alias for CWL Runners.
+`cwl-runner` is an implementation-agnostic alias for CWL Runners. This simply means that the `cwl-runner` alias command can be invoked independently, and is not reliant on a particular CWL runner.
 Users can invoke `cwl-runner` instead of invoking a CWL runner like `cwltool`
 directly. The `cwl-runner` alias command then chooses the correct CWL runner.
 This is convenient for environments with multiple CWL runners.
@@ -106,7 +106,7 @@ an alias for `cwltool` under the name `cwl-runner`
 $ pip install cwlref-runner
 ```
 
-Now you can validate and run your workflow with `cwl-runner` executable,
+Now you can validate and run your workflow with the `cwl-runner` executable,
 which will invoke `cwltool`. You should have the same results and output
 as in the previous section.
 
@@ -120,9 +120,9 @@ as in the previous section.
 :caption: Running `true.cwl` with `cwl-runner`.
 ```
 
-Another way to execute `cwl-runner` is invoking the file directly. For that,
-the first thing you need to copy `true.cwl` workflow into a new file
-`true_shebang.cwl` and include a special first line, a *shebang*:
+Another way to execute `cwl-runner` is by invoking the file directly. For that,
+the first thing you need to do is copy the `true.cwl` workflow into a new file,
+`true_shebang.cwl`, and include a special first line, a *shebang*:
 
 ```{literalinclude} /_includes/cwl/true_shebang.cwl
 :language: cwl
@@ -139,7 +139,7 @@ Now you can make the file `true_shebang.cwl` executable with `chmod u+x`.
 $ chmod u+x true.cwl
 ```
 
-And finally you can execute it directly in the command-line and the program
+And finally, you can execute it directly from the command line. On execution, the program
 specified in the shebang (`cwl-runner`) will be used to execute the
 rest of the file.
 
@@ -152,9 +152,8 @@ rest of the file.
 The *shebang* is the two-character sequence `#!` at the beginning of a
 script. When the script is executable, the operating system will execute
 the script using the executable specified after the shebang. It is
-considered a good practice to use `/usr/bin/env <executable>` since it
-looks for the `<executable>` program in the system `PATH`, instead of
-using a hard-coded location.
+considered a good practice to use `/usr/bin/env <executable>` rather than using a hard-coded location, since `/usr/bin/env <executable>`
+looks for the `<executable>` program in the system `PATH`.
 ```
 
 ## Text Editor
@@ -164,7 +163,7 @@ an editor with YAML support. Popular editors are Visual Studio Code, Sublime,
 WebStorm, vim/neovim, and Emacs.
 
 There are extensions for Visual Studio Code and WebStorm that provide
-integration with CWL, with customized syntax highlighting and better
+integration with CWL, and features such as customized syntax highlighting and better
 auto-complete:
 
 - Visual Studio Code with the Benten (CWL) plugin -
@@ -182,7 +181,7 @@ Follow the instructions in the Docker documentation to install it for your
 operating system: .
 
 You do not need to know how to write and build Docker containers. In the
In the -rest of the user guide we will use existing Docker images for running +rest of the user guide, we will use existing Docker images for running examples, and to clarify the differences between the execution models with and without containers. diff --git a/src/topics/yaml-guide.md b/src/topics/yaml-guide.md index 6d12538e..9475f83a 100644 --- a/src/topics/yaml-guide.md +++ b/src/topics/yaml-guide.md @@ -5,7 +5,7 @@ [YAML][yaml] is a file format designed to be readable by both computers and humans. -This guide introduces the features of YAML +This guide introduces the features of YAML that are relevant when writing CWL descriptions and input parameter files. ```{note} @@ -27,7 +27,7 @@ Fundamentally, a file written in YAML consists of a set of _key-value pairs_. Each pair is written as `key: value`, where whitespace after the `:` is required. Key names in CWL files should not contain whitespace - -We use [_camelCase_][camelCase] for multi-word key names +[_camelCase_][camelCase] is used for multi-word key names that have special meaning in the CWL specification and underscored key names otherwise. For example: @@ -48,7 +48,7 @@ numeric (integer, floating point, or scientific representation), Boolean (`true` or `false`), or more complex nested types (see below). -Values may be wrapped in quotation marks +Values may be wrapped in quotation marks, but be aware that this may change the way that they are interpreted i.e. `"1234"` will be treated as a character string , while `1234` will be treated as an integer. @@ -80,7 +80,7 @@ be sure to add at least one space before the `#`! When describing a tool or workflow with CWL, it is usually necessary to construct more complex, nested representations. -Called _maps_, +Referred to as _maps_, these hierarchical structures are described in YAML by providing additional key-value pairs as the value of any key. 
 These pairs (sometimes referred to as "children") are written
@@ -101,7 +101,7 @@ inputs: # this key has an object value
       prefix: -f
 ```
 
-The YAML above illustrates how you can build up complex nested object
+The YAML above illustrates how to build up complex nested object
 descriptions relatively quickly.
 The `inputs` map contains a single key, `example_flag`,
 which itself contains two keys, `type` and `inputBinding`,
@@ -126,7 +126,7 @@ graph TD
 
 ## Arrays
 
-In certain circumstances it is necessary to provide
+In certain circumstances, it is necessary to provide
 multiple values or objects for a single key.
 As we've already seen in the [Maps](#maps) section above,
 more than one key-value pair can be mapped to a single key.
@@ -166,8 +166,8 @@ exclusive_parameters:
 
 ## JSON Style
 
-YAML is based on [JavaScript Object Notation (JSON)][json]
-and maps and arrays can also be defined in YAML using the native JSON syntax.
+YAML is based on [JavaScript Object Notation (JSON)][json].
+Maps and arrays can also be defined in YAML using the native JSON syntax.
 For example:
 
 ```yaml
@@ -182,13 +182,13 @@ inputs: {example_flag: {type: boolean, inputBinding: {position: 1, prefix: -f}}}
 ```
 
 Native JSON can be useful
-to indicate where a field is being left intentionally empty
+in indicating where a field is intentionally left empty
 (such as `[]` for an empty array),
-and where it makes more sense
+as well as where it makes more sense
 for the values to be located on the same line
-(such as when providing option flags and their values in a shell command).
+(for example, when providing option flags and their values in a shell command).
 However, as the second example above shows,
-it can severely affect the readability of a YAML file
+it can severely affect the readability of a YAML file,
 and should be used sparingly.
 
 ## Reference