Closed
Description
I am attempting to associate a secondaryFile (BAI) with each File in an array (of BAMs).
I get the desired behavior with the Jan26 release [0], but the following Jan27 release [1] breaks our cwl.
This is the cwl snippet:
cwlVersion: "cwl:draft-2"
requirements:
- import: node-engine.cwl
- import: envvar-global.cwl
- class: DockerRequirement
dockerPull: quay.io/___
class: CommandLineTool
inputs:
- id: "#input_bam_path"
type:
type: array
items: File
inputBinding:
prefix: --bam_path
secondaryFiles:
- engine: node-engine.cwl
script: |
{
return {"path": $self.path.slice(0,-4)+".bai", "class": "File"};
}
Running with --debug gives the desired bindings with the Jan26 release [2], but not with the Jan27 release [3].
I ran a diff [4] of the two releases. It looks like secondaryFiles are now in schema instead of binding. Is the post-Jan27 behavior the desired one for our cwl?
[2]
{
"secondaryFiles": [
"${\nreturn {\"path\": self.path.slice(0,-4)+\".bai\", \"class\": \"File\"};\n}\n"
],
"prefix": "--bam_path",
"do_eval": {
"path": "/tmp/job633706396_ubuntu/SCRATCH/47b42e81-2500-4ebc-a0c2-acd3187cc2f0_513/test/C440.TCGA-IN-8462-01A-11D-2340-08.1.bam",
"class": "File",
"secondaryFiles": [
{
"path": "/tmp/job633706396_ubuntu/SCRATCH/47b42e81-2500-4ebc-a0c2-acd3187cc2f0_513/test/C440.TCGA-IN-8462-01A-11D-2340-08.1.bai",
"class": "File"
}
]
},
"valueFrom": {
"path": "/tmp/job633706396_ubuntu/SCRATCH/47b42e81-2500-4ebc-a0c2-acd3187cc2f0_513/test/C440.TCGA-IN-8462-01A-11D-2340-08.1.bam",
"class": "File",
"secondaryFiles": [
{
"path": "/tmp/job633706396_ubuntu/SCRATCH/47b42e81-2500-4ebc-a0c2-acd3187cc2f0_513/test/C440.TCGA-IN-8462-01A-11D-2340-08.1.bai",
"class": "File"
}
]
},
"position": [
0,
0,
"input_bam_path",
"input_bam_path"
]
},
[3]
{
"position": [
0,
0,
"input_bam_path",
"input_bam_path"
],
"prefix": "--bam_path",
"do_eval": {
"path": "/tmp/job557517512_test/C440.TCGA-IN-8462-01A-11D-2340-08.1.bam",
"class": "File"
},
"valueFrom": {
"path": "/tmp/job557517512_test/C440.TCGA-IN-8462-01A-11D-2340-08.1.bam",
"class": "File"
}
},
[4]
https://gist.github.com/jeremiahsavage/c82b38027be30eccbf9b47361c8f7fbb
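For illustration only, here is a minimal JavaScript sketch of what the secondaryFiles expression in the snippet computes and of the per-item result shown in [2]. The helper names `baiFor` and `attachSecondaryFiles` and the sample paths are hypothetical; this is not cwltool's implementation, just a model of the desired outcome.

```javascript
// Mirrors the expression from the cwl snippet: drop the 4-character ".bam"
// suffix and append ".bai".
function baiFor(self) {
  return { path: self.path.slice(0, -4) + ".bai", class: "File" };
}

// Hypothetical helper: attach the derived BAI to every File in an array input.
function attachSecondaryFiles(files) {
  return files.map(function (file) {
    return {
      path: file.path,
      class: file.class,
      secondaryFiles: [baiFor(file)],
    };
  });
}

// Made-up sample paths, standing in for the array of BAMs.
const bams = [
  { path: "/data/sample1.bam", class: "File" },
  { path: "/data/sample2.bam", class: "File" },
];

const bound = attachSecondaryFiles(bams);
console.log(JSON.stringify(bound, null, 2));
```

Each element of `bound` has the same shape as the "valueFrom" object in [2]: a File with a one-element secondaryFiles list. In [3] that list is missing from every item.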
Activity
tetron commented on May 28, 2016
What happens if you try to run it with the latest cwltool?
You are correct that secondaryFiles moved up one level between draft-2 and draft-3. However, if you are writing draft-2 documents you should be using the draft-2 syntax. Recent versions of cwltool (since 3 weeks ago or so) have tightened up validation for specific spec versions. Prior to that, cwltool incorrectly accepted invalid documents that blended draft-2 and draft-3 syntax; it is possible that is where you are getting into trouble. Does that help?
jeremiahsavage commented on May 31, 2016
With latest cwltool 1.0.20160523144113 the secondaryFile is not picked up.
I'll try altering the syntax.
jeremiahsavage commented on Jun 1, 2016
I've created a self-contained test (just used echo) in cwl:draft-3 which also doesn't associate secondaryFiles. This is the debug output:
and this is the cwl:
Shenglai commented on Jun 2, 2016
With latest cwltool 1.0.20160523144113 the secondaryFile is not picked up even with cwlVersion: "cwl:draft-2".
The cwl is:
and the debug output is:
jeremiahsavage commented on Jun 2, 2016
Fixed with #91
jeremiahsavage commented on Jun 2, 2016
Actually, the fix only works if the cwl is draft-2, and causes a crash if it is draft-3. So don't merge.
jeremiahsavage commented on Jun 3, 2016
Now tested successfully with draft-3 using the below cwl and output (also tested with a gatk indelrealigner tool that requires a bai for each bam).
cwl:
output:
mr-c commented on Jun 6, 2016
Hello @jeremiahsavage, thank you for your issue and PR.
I'm trying to understand your use case better: As @chapmanb points out in https://groups.google.com/d/msg/common-workflow-language/u9q03lFBHpQ/L9on3M1MAgAJ it would seem that the secondaryFiles with a string value of .bai or ^.bai should meet your needs.
jeremiahsavage commented on Jun 6, 2016
Hi @mr-c. Yes, I've tried using the .bai and ^.bai methods suggested. But using the latest (post January 26) version of cwltool, secondaryFiles were not attached to items of an array. For Jan26 and before, the bug does not exist. Perhaps @chapmanb is using a branch prior to Jan26?
I used a lot of print() statements to see that secondaryFiles was not being passed to each item of the array at the point the patch touches.
The fix allows cwl written in draft-2 or draft-3 to pass secondaryFiles through arrays.
For example:
cwl:draft-2 case:
cwl: https://gist.github.com/jeremiahsavage/f49146e32e098697494d74b20ea10526
latest cwltool debug output (no bai): https://gist.github.com/jeremiahsavage/72172b5eeca50a654ee1732bd131317e
with patch debug output (gets bai): https://gist.github.com/jeremiahsavage/a8b923d9e50ed6ba34875f7ccae4d206
cwl:draft-3 case:
cwl (^.bai): https://gist.github.com/jeremiahsavage/760ffdae6a5220d327b997269b5f52ee
latest cwltool debug output (no bai): https://gist.github.com/jeremiahsavage/760ffdae6a5220d327b997269b5f52ee
with patch debug output (gets bai): https://gist.github.com/jeremiahsavage/f9b3e989d5d185015e1d7317e14591e1
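For readers unfamiliar with the string forms, here is a rough JavaScript sketch of how the .bai and ^.bai patterns are meant to resolve against a primary file path, as I understand the CWL rule: each leading caret removes the last extension, then the remainder is appended. The function name `resolveSecondary` and the sample paths are hypothetical, and this is not cwltool's code.

```javascript
// Resolve a secondaryFiles pattern such as ".bai" or "^.bai" against a
// primary file path. Each leading "^" strips the last extension (the final
// "." in the file name and everything after it); the rest is appended.
function resolveSecondary(primaryPath, pattern) {
  let base = primaryPath;
  let suffix = pattern;
  while (suffix.startsWith("^")) {
    const slash = base.lastIndexOf("/");
    const dot = base.lastIndexOf(".");
    // Only strip when the dot is part of the file name, not a directory name.
    if (dot > slash) {
      base = base.slice(0, dot);
    }
    suffix = suffix.slice(1);
  }
  return base + suffix;
}

// Made-up array of BAM paths; the pattern has to be applied to every item.
const bamPaths = ["/data/sample1.bam", "/data/sample2.bam"];
const resolved = bamPaths.map(function (p) {
  return {
    plain: resolveSecondary(p, ".bai"),
    caret: resolveSecondary(p, "^.bai"),
  };
});
console.log(resolved);
```

With these inputs, ".bai" yields sample1.bam.bai and "^.bai" yields sample1.bai. The behavior reported in this issue is that neither form gets applied to the items when the input is an array.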
chapmanb commented on Jun 7, 2016
Jeremiah;
I'm using 1.0.20160427142240 from bioconda (https://bioconda.github.io/). I'm not sure about the very latest version; Peter would be most helpful on assessing that. It would be useful to know if you still run into issues with the version in bioconda. You can install with:
Sorry to not have a good idea why you're seeing this but hopefully this helps some.
jeremiahsavage commented on Jun 7, 2016
Hi Brad and Michael,
I've created a true minimal test case to show the issue and the fix. In this test, the process will actually fail instead of just showing the pickup of the secondaryFile in the --debug output I was reporting before.
Three-step instructions are at:
https://github.com/jeremiahsavage/array_secondary
Docker has to be involved: when docker is not used, paths are not redirected (to something like /var/lib/cwl/job445880475), so the issue of secondaryFiles not passing is masked.
I've tested with the latest bioconda (to show the issue) and with the fix (in a python virtualenv). The output is in the README.md
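To illustrate why Docker exposes the problem, here is a simplified JavaScript model. It assumes that only files known to the runner (each input File plus its attached secondaryFiles) become visible under the redirected job directory. The function names `baseName` and `stagedPaths` and the host paths are hypothetical, and this is not how cwltool actually stages files.

```javascript
// Return the file name portion of a path.
function baseName(p) {
  return p.slice(p.lastIndexOf("/") + 1);
}

// Hypothetical model: list the paths visible inside the container when each
// input File (and any attached secondaryFiles) is redirected into jobDir.
function stagedPaths(files, jobDir) {
  const visible = [];
  for (const file of files) {
    visible.push(jobDir + "/" + baseName(file.path));
    for (const sec of file.secondaryFiles || []) {
      visible.push(jobDir + "/" + baseName(sec.path));
    }
  }
  return visible;
}

const jobDir = "/var/lib/cwl/job445880475";

// Array item without its secondaryFile attached (the reported behavior).
const withoutBai = [{ path: "/home/user/test/sample.bam", class: "File" }];

// Array item with its secondaryFile attached (the desired behavior).
const withBai = [
  {
    path: "/home/user/test/sample.bam",
    class: "File",
    secondaryFiles: [{ path: "/home/user/test/sample.bai", class: "File" }],
  },
];

console.log(stagedPaths(withoutBai, jobDir));
console.log(stagedPaths(withBai, jobDir));
```

Under this model, a tool that looks for sample.bai next to sample.bam finds it only in the second case. Without Docker the tool reads the original directory, where the bai already sits beside the bam, so the missing attachment goes unnoticed.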
mr-c commented on Jun 7, 2016
Thank you Jeremiah for your detailed debugging. We are a bit swamped with
getting some other changes done in time for 1.0 of the standard. My work
day is over, but I will take a look tomorrow.
On 7 Jun 2016 8:09 p.m., "Jeremiah H. Savage" notifications@github.com wrote:
jeremiahsavage commented on Jun 23, 2016
Just a ping. It would be great if this was fixed in 1.0
test: https://github.com/jeremiahsavage/array_secondary
fix: #91
mr-c commented on Oct 6, 2016
Fix is now in #170 (but needs some assistance)
kmhernan commented on Oct 6, 2016
@mr-c I am having this issue with v1.0 and cwltool 1.0.20160913171024
For example:
Will not bind the secondaryFiles and no information about them is available in the --debug output.