Commit bb0bcc3

Merge branch 'master' into entryname_fix
2 parents b152532 + 15039a4 commit bb0bcc3

280 files changed, +5415 -3928 lines changed


.gitignore

+15
@@ -15,3 +15,18 @@ eggs/
 *~
 \#*\#
 .desktop
+
+# virtualenv
+venv/
+venv3/
+
+# pycharm
+.idea/
+
+# typeshed repo
+typeshed/2and3/schema_salad
+typeshed/2and3/ruamel/yaml
+
+
+#mypy
+.mypy_cache/

Makefile

+13-13
@@ -152,26 +152,26 @@ list-author-emails:
 	@git log --format='%aN,%aE' | sort -u | grep -v 'root'
 
 
-mypy: ${PYSOURCES}
-	rm -Rf typeshed/2.7/ruamel/yaml
+mypy2: ${PYSOURCES}
+	rm -Rf typeshed/2and3/ruamel/yaml
 	ln -s $(shell python -c 'from __future__ import print_function; import ruamel.yaml; import os.path; print(os.path.dirname(ruamel.yaml.__file__))') \
-		typeshed/2.7/ruamel/yaml
-	rm -Rf typeshed/2.7/schema_salad
+		typeshed/2and3/ruamel/yaml
+	rm -Rf typeshed/2and3/schema_salad
 	ln -s $(shell python -c 'from __future__ import print_function; import schema_salad; import os.path; print(os.path.dirname(schema_salad.__file__))') \
-		typeshed/2.7/schema_salad
-	MYPYPATH=typeshed/2.7 mypy --py2 --disallow-untyped-calls \
-		--warn-redundant-casts --warn-unused-ignores --fast-parser \
+		typeshed/2and3/schema_salad
+	MYPYPATH=$$MYPYPATH:typeshed/2.7:typeshed/2and3 mypy --py2 --disallow-untyped-calls \
+		--warn-redundant-casts --warn-unused-ignores \
 		cwltool
 
 mypy3: ${PYSOURCES}
-	rm -Rf typeshed/3/ruamel/yaml
+	rm -Rf typeshed/2and3/ruamel/yaml
 	ln -s $(shell python3 -c 'from __future__ import print_function; import ruamel.yaml; import os.path; print(os.path.dirname(ruamel.yaml.__file__))') \
-		typeshed/3/ruamel/yaml
-	rm -Rf typeshed/3/schema_salad
+		typeshed/2and3/ruamel/yaml
+	rm -Rf typeshed/2and3/schema_salad
 	ln -s $(shell python3 -c 'from __future__ import print_function; import schema_salad; import os.path; print(os.path.dirname(schema_salad.__file__))') \
-		typeshed/3/schema_salad
-	MYPYPATH=typeshed/3 mypy --disallow-untyped-calls \
-		--warn-redundant-casts --warn-unused-ignores --fast-parser \
+		typeshed/2and3/schema_salad
+	MYPYPATH=$$MYPYPATH:typeshed/3:typeshed/2and3 mypy --disallow-untyped-calls \
+		--warn-redundant-casts --warn-unused-ignores \
 		cwltool
 
 FORCE:
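
The `rm -Rf` / `ln -s $(shell python -c '...')` pairs in the `mypy2` and `mypy3` targets find the directory of an installed package and link it under `typeshed/2and3`. A rough Python 3 sketch of the same idea is below. The helper names `package_dir` and `link_package` are hypothetical and are not part of cwltool, and unlike the Makefile the sketch also creates missing parent directories.

```python
import importlib
import os
import shutil


def package_dir(module_name):
    """Return the directory holding an importable package.

    Mirrors the Makefile one-liner: os.path.dirname(<module>.__file__).
    """
    module = importlib.import_module(module_name)
    return os.path.dirname(os.path.abspath(module.__file__))


def link_package(module_name, link_path):
    """Replace link_path with a symlink to the package directory.

    Hypothetical helper for illustration only.
    """
    # Rough equivalent of `rm -Rf link_path`: drop a stale link, file, or directory.
    if os.path.islink(link_path) or os.path.isfile(link_path):
        os.remove(link_path)
    elif os.path.isdir(link_path):
        shutil.rmtree(link_path)

    # Assumption: create the parent directory if needed (the Makefile does not).
    parent = os.path.dirname(link_path)
    if parent:
        os.makedirs(parent, exist_ok=True)

    # Rough equivalent of `ln -s <package dir> link_path`.
    os.symlink(package_dir(module_name), link_path)
    return link_path
```

Under these assumptions, `link_package("ruamel.yaml", "typeshed/2and3/ruamel/yaml")` would correspond to the first two recipe lines of either target, provided `ruamel.yaml` is importable by the interpreter running the sketch.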

README.rst

+245-15
@@ -139,6 +139,212 @@ The easiest way to use cwltool to run a tool or workflow from Python is to use a
 
   # result["out"] == "foo"
 
+Leveraging SoftwareRequirements (Beta)
+--------------------------------------
+
+CWL tools may be decorated with ``SoftwareRequirement`` hints that cwltool
+may in turn use to resolve to packages in various package managers or
+dependency management systems such as `Environment Modules
+<http://modules.sourceforge.net/>`__.
+
+Using ``SoftwareRequirement`` hints with cwltool requires an optional
+dependency, so be sure to specify the ``deps`` modifier when
+installing cwltool. For instance::
+
+  $ pip install 'cwltool[deps]'
+
+Installing cwltool in this fashion enables several new command line options.
+The most general of these options is ``--beta-dependency-resolvers-configuration``.
+This option allows one to specify a dependency resolvers configuration file.
+This file may be specified as either XML or YAML and very simply describes various
+plugins to enable to "resolve" ``SoftwareRequirement`` dependencies.
+
+To discuss some of these plugins and how to configure them, first consider the
+following ``hint`` definition for an example CWL tool.
+
+.. code:: yaml
+
+  SoftwareRequirement:
+    packages:
+    - package: seqtk
+      version:
+      - r93
+
+Now imagine deploying cwltool on a cluster with Software Modules installed
+and that a ``seqtk`` module is available at version ``r93``. This means cluster
+users likely won't have the ``seqtk`` binary on their ``PATH`` by default, but after
+sourcing this module with the command ``modulecmd sh load seqtk/r93`` ``seqtk`` is
+available on the ``PATH``. A simple dependency resolvers configuration file, called
+``dependency-resolvers-conf.yml`` for instance, that would enable cwltool to source
+the correct module environment before executing the above tool would simply be:
+
+.. code:: yaml
+
+  - type: module
+
+The outer list indicates that one plugin is being enabled; the plugin parameters are
+defined as a dictionary for this one list item. The plugin above has only one required
+parameter, ``type``, which defines the plugin type. This parameter
+is required for all plugins. The available plugins and the parameters
+available for each are documented (incompletely) `here
+<https://docs.galaxyproject.org/en/latest/admin/dependency_resolvers.html>`__.
+Unfortunately, this documentation is in the context of Galaxy tool ``requirement`` s instead of CWL ``SoftwareRequirement`` s, but the concepts map fairly directly.
+
+cwltool is distributed with an example of such a seqtk tool and a corresponding sample
+job. It could be executed from the cwltool root using a dependency resolvers
+configuration file such as the above one using the command::
+
+  cwltool --beta-dependency-resolvers-configuration /path/to/dependency-resolvers-conf.yml \
+      tests/seqtk_seq.cwl \
+      tests/seqtk_seq_job.json
+
+This example demonstrates both that cwltool can leverage
+existing software installations and also handle workflows with dependencies
+on different versions of the same software and libraries. However, the above
+example does require an existing module setup, so it is impossible to test this example
+"out of the box" with cwltool. For a more isolated test that demonstrates all
+the same concepts, the resolver plugin type ``galaxy_packages`` can be used.
+
+"Galaxy packages" are a lighter weight alternative to Environment Modules that are
+really just defined by a way to lay out directories into packages and versions
+to find little scripts that are sourced to modify the environment. They have
+been used for years in the Galaxy community to adapt Galaxy tools to cluster
+environments but require neither knowledge of Galaxy nor any special tools to
+set up. These should work just fine for CWL tools.
+
+The cwltool source code repository's test directory is set up with a very simple
+directory that defines a set of "Galaxy packages" (but really just defines one
+package named ``random-lines``). The directory layout is simply::
+
+  tests/test_deps_env/
+    random-lines/
+      1.0/
+        env.sh
+
+If the ``galaxy_packages`` plugin is enabled and pointed at the
+``tests/test_deps_env`` directory in cwltool's root and a ``SoftwareRequirement``
+such as the following is encountered:
+
+.. code:: yaml
+
+  hints:
+    SoftwareRequirement:
+      packages:
+      - package: 'random-lines'
+        version:
+        - '1.0'
+
+then cwltool will simply find that ``env.sh`` file and source it before executing
+the corresponding tool. That ``env.sh`` script is only responsible for modifying
+the job's ``PATH`` to add the required binaries.
+
+This is a full example that works since resolving "Galaxy packages" has no
+external requirements. Try it out by executing the following command from cwltool's
+root directory::
+
+  cwltool --beta-dependency-resolvers-configuration tests/test_deps_env_resolvers_conf.yml \
+      tests/random_lines.cwl \
+      tests/random_lines_job.json
+
+The resolvers configuration file in the above example was simply:
+
+.. code:: yaml
+
+  - type: galaxy_packages
+    base_path: ./tests/test_deps_env
+
+It is possible that the ``SoftwareRequirement`` s in a given CWL tool will not
+match the module names for a given cluster. Such requirements can be re-mapped
+to specific deployed packages and/or versions using another file specified using
+the resolver plugin parameter ``mapping_files``. We will
+demonstrate this using ``galaxy_packages`` but the concepts apply equally well
+to Environment Modules or Conda packages (described below) for instance.
+
+So consider the resolvers configuration file
+(``tests/test_deps_env_resolvers_conf_rewrite.yml``):
+
+.. code:: yaml
+
+  - type: galaxy_packages
+    base_path: ./tests/test_deps_env
+    mapping_files: ./tests/test_deps_mapping.yml
+
+And the corresponding mapping configuration file (``tests/test_deps_mapping.yml``):
+
+.. code:: yaml
+
+  - from:
+      name: randomLines
+      version: 1.0.0-rc1
+    to:
+      name: random-lines
+      version: '1.0'
+
+This says that if cwltool encounters a requirement of ``randomLines`` at version
+``1.0.0-rc1`` in a tool, it should rewrite it for our specific plugin as ``random-lines`` at
+version ``1.0``. cwltool has such a test tool called ``random_lines_mapping.cwl``
+that contains such a source ``SoftwareRequirement``. To try out this example with
+mapping, execute the following command from the cwltool root directory::
+
+  cwltool --beta-dependency-resolvers-configuration tests/test_deps_env_resolvers_conf_rewrite.yml \
+      tests/random_lines_mapping.cwl \
+      tests/random_lines_job.json
+
+The previous examples demonstrated leveraging existing infrastructure to
+provide requirements for CWL tools. If instead a real package manager is used,
+cwltool has the opportunity to install requirements as needed. While initial
+support for Homebrew/Linuxbrew plugins is available, the most developed such
+plugin is for the `Conda <https://conda.io/docs/#>`__ package manager. Conda has the nice properties
+of allowing multiple versions of a package to be installed simultaneously,
+not requiring elevated permissions to install Conda itself or packages using
+Conda, and being cross platform. For these reasons, cwltool may run as a normal
+user, install its own Conda environment and manage multiple versions of Conda packages
+on both Linux and Mac OS X.
+
+The Conda plugin can be endlessly configured, but a sensible set of defaults
+that has proven a powerful stack for dependency management within the Galaxy tool
+development ecosystem can be enabled by simply passing cwltool the
+``--beta-conda-dependencies`` flag.
+
+With this we can use the seqtk example above without Docker and without
+any externally managed services; cwltool should install everything it needs
+and create an environment for the tool. Try it out with the following command::
+
+  cwltool --beta-conda-dependencies tests/seqtk_seq.cwl tests/seqtk_seq_job.json
+
+The CWL specification allows URIs to be attached to ``SoftwareRequirement`` s
+that allow disambiguation of package names. If the mapping files described above
+allow deployers to adapt tools to their infrastructure, this mechanism allows
+tools to adapt their requirements to multiple package managers. To demonstrate
+this within the context of the seqtk example, we can simply break the package name we
+use and then specify a specific Conda package as follows:
+
+.. code:: yaml
+
+  hints:
+    SoftwareRequirement:
+      packages:
+      - package: seqtk_seq
+        version:
+        - '1.2'
+        specs:
+        - https://anaconda.org/bioconda/seqtk
+        - https://packages.debian.org/sid/seqtk
+
+The example can be executed using the command::
+
+  cwltool --beta-conda-dependencies tests/seqtk_seq_wrong_name.cwl tests/seqtk_seq_job.json
+
+The plugin framework for managing resolution of these software requirements
+is maintained as part of `galaxy-lib <https://github.com/galaxyproject/galaxy-lib>`__, a small, portable subset of the Galaxy
+project. More information on configuration and implementation can be found
+at the following links:
+
+- `Dependency Resolvers in Galaxy <https://docs.galaxyproject.org/en/latest/admin/dependency_resolvers.html>`__
+- `Conda for [Galaxy] Tool Dependencies <https://docs.galaxyproject.org/en/latest/admin/conda_faq.html>`__
+- `Mapping Files - Implementation <https://github.com/galaxyproject/galaxy/commit/495802d229967771df5b64a2f79b88a0eaf00edb>`__
+- `Specifications - Implementation <https://github.com/galaxyproject/galaxy/commit/81d71d2e740ee07754785306e4448f8425f890bc>`__
+- `Initial cwltool Integration Pull Request <https://github.com/common-workflow-language/cwltool/pull/214>`__
 
 Cwltool control flow
 --------------------
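
The README text added in the hunk above describes two steps for the `galaxy_packages` resolver: a mapping file rewrites a requirement's name and version, and the resolver then looks for `<base_path>/<package>/<version>/env.sh`. The Python 3 sketch below only illustrates those two steps as described. It is not the galaxy-lib implementation, the function names `apply_mapping` and `find_env_script` are hypothetical, and the mapping rules are written as Python data instead of being loaded from YAML.

```python
import os
import tempfile

# Rules mirroring tests/test_deps_mapping.yml from the README, as plain Python data.
RULES = [
    {
        "from": {"name": "randomLines", "version": "1.0.0-rc1"},
        "to": {"name": "random-lines", "version": "1.0"},
    }
]


def apply_mapping(name, version, rules):
    """Rewrite (name, version) with the first matching from/to rule, if any."""
    for rule in rules:
        src, dst = rule["from"], rule["to"]
        if src["name"] == name and str(src["version"]) == str(version):
            return dst["name"], str(dst["version"])
    return name, str(version)


def find_env_script(base_path, name, version):
    """Return <base_path>/<name>/<version>/env.sh when it exists, else None."""
    candidate = os.path.join(base_path, name, str(version), "env.sh")
    return candidate if os.path.isfile(candidate) else None


def demo():
    """Build the README's directory layout in a temp dir and resolve against it."""
    with tempfile.TemporaryDirectory() as base:
        package_version_dir = os.path.join(base, "random-lines", "1.0")
        os.makedirs(package_version_dir)
        with open(os.path.join(package_version_dir, "env.sh"), "w") as handle:
            handle.write("# placeholder script; a real env.sh would adjust PATH\n")
        name, version = apply_mapping("randomLines", "1.0.0-rc1", RULES)
        found = find_env_script(base, name, version) is not None
        return name, version, found
```

In this sketch a requirement with no matching rule passes through unchanged, which is an assumption about how unmapped requirements behave; the real resolver plugins may apply further logic that is not modeled here.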
@@ -207,43 +413,67 @@ Extension points
 The following functions can be provided to main(), to load_tool(), or to the
 executor to override or augment the listed behaviors.
 
-executor(tool, job_order_object, **kwargs)
-  (Process, Dict[Text, Any], **Any) -> Tuple[Dict[Text, Any], Text]
+executor
+  ::
+
+    executor(tool, job_order_object, **kwargs)
+      (Process, Dict[Text, Any], **Any) -> Tuple[Dict[Text, Any], Text]
 
   A toplevel workflow execution loop, should synchronously execute a process
   object and return an output object.
 
-makeTool(toolpath_object, **kwargs)
-  (Dict[Text, Any], **Any) -> Process
+makeTool
+  ::
+
+    makeTool(toolpath_object, **kwargs)
+      (Dict[Text, Any], **Any) -> Process
 
   Construct a Process object from a document.
 
-selectResources(request)
-  (Dict[Text, int]) -> Dict[Text, int]
+selectResources
+  ::
+
+    selectResources(request)
+      (Dict[Text, int]) -> Dict[Text, int]
 
   Take a resource request and turn it into a concrete resource assignment.
 
-versionfunc()
-  () -> Text
+versionfunc
+  ::
+
+    versionfunc()
+      () -> Text
 
   Return version string.
 
-make_fs_access(basedir)
-  (Text) -> StdFsAccess
+make_fs_access
+  ::
+
+    make_fs_access(basedir)
+      (Text) -> StdFsAccess
 
   Return a file system access object.
 
-fetcher_constructor(cache, session)
-  (Dict[unicode, unicode], requests.sessions.Session) -> Fetcher
+fetcher_constructor
+  ::
+
+    fetcher_constructor(cache, session)
+      (Dict[unicode, unicode], requests.sessions.Session) -> Fetcher
 
   Construct a Fetcher object with the supplied cache and HTTP session.
 
-resolver(document_loader, document)
-  (Loader, Union[Text, dict[Text, Any]]) -> Text
+resolver
+  ::
+
+    resolver(document_loader, document)
+      (Loader, Union[Text, dict[Text, Any]]) -> Text
 
   Resolve a relative document identifier to an absolute one which can be fetched.
 
 logger_handler
-  logging.Handler
+  ::
+
+    logger_handler
+      logging.Handler
 
   Handler object for logging.
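
Two of the extension points listed in this hunk are simple enough to sketch. The Python 3 example below shows a `selectResources`-shaped function (a dict of ints in, a dict of ints out) and a `logging.Handler` subclass that could serve as a `logger_handler`. The request and result key names (`coresMin`, `ramMin`, `cores`, `ram`) and the class name `ListHandler` are assumptions made for illustration and are not taken from this diff.

```python
import logging


def select_resources(request):
    """Turn a resource request into a concrete assignment.

    Assumption: the request carries "coresMin" and "ramMin" keys and the
    result uses "cores" and "ram"; this sketch simply grants the minimums.
    """
    return {
        "cores": request["coresMin"],
        "ram": request["ramMin"],
    }


class ListHandler(logging.Handler):
    """A logging.Handler that keeps formatted messages in memory."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        # format() falls back to the default "%(message)s" formatter.
        self.messages.append(self.format(record))
```

How these objects are handed to cwltool (for example as keyword arguments to `main()`) is described only loosely in the README text above, so the exact call is left out here.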

cwlref-runner/setup.py

+1-1
@@ -1,5 +1,5 @@
 #!/usr/bin/env python
-
+from __future__ import absolute_import
 import os
 
 from setuptools import setup, find_packages

cwltool.py

+1
@@ -1,4 +1,5 @@
 #!/usr/bin/env python
+from __future__ import absolute_import
 """Convenience entry point for cwltool.
 
 This can be used instead of the recommended method of `./setup.py install`

cwltool/__init__.py

+1
@@ -1 +1,2 @@
+from __future__ import absolute_import
 __author__ = '[email protected]'

cwltool/__main__.py

+1
@@ -1,3 +1,4 @@
+from __future__ import absolute_import
 import sys
 
 from . import main
