Skip to content

Required secondaryFiles outputs do not work in CWL CommandLineTools which run in Docker #1869

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ThomasHickman opened this issue Jun 30, 2023 · 4 comments · Fixed by #1870
Closed

Comments

@ThomasHickman
Copy link
Member

Expected Behavior

If you have a CommandLineTool with an output of something like:

  output:
    type: File
    secondaryFiles:
      - pattern: ^.ext2
        required: true
    outputBinding:
      glob: file.ext1

cwltool should succeed if output files with extensions .ext1 and .ext2 exist in the output folder, when using:

requirements:
  DockerRequirement:
    dockerPull: ubuntu:20.04

Actual Behavior

This causes an error.

Workflow Code

#!/usr/bin/env cwl-runner

cwlVersion: v1.2
class: CommandLineTool
requirements:
  DockerRequirement:
    dockerPull: ubuntu:20.04

baseCommand:
    - bash
    - -c
    - touch file.ext1 && touch file.ext2

inputs: []


outputs:
  output:
    type: File
    secondaryFiles:
      - pattern: ^.ext2
        required: true
    outputBinding:
      glob: file.ext1

Full Traceback

$ cwltool --debug test.cwl
INFO /home/user/.local/bin/cwltool 3.1.20230624081518
INFO Resolved 'test.cwl' to 'file:///home/user/test.cwl'
DEBUG Parsed job order from command line: {
    "id": "test.cwl"
}
DEBUG [job test.cwl] initializing from file:///home/user/test.cwl
DEBUG [job test.cwl] {}
DEBUG [job test.cwl] path mappings is {}
DEBUG [job test.cwl] command line bindings is [
    {
        "position": [
            -1000000,
            0
        ],
        "datum": "bash"
    },
    {
        "position": [
            -1000000,
            1
        ],
        "datum": "-c"
    },
    {
        "position": [
            -1000000,
            2
        ],
        "datum": "touch file.ext1 && touch file.ext2"
    }
]
DEBUG [job test.cwl] initial work dir {}
INFO [job test.cwl] /tmp/hr1231gc$ docker \
    run \
    -i \
    --mount=type=bind,source=/tmp/hr1231gc,target=/ejjPTP \
    --mount=type=bind,source=/tmp/m9f50qdw,target=/tmp \
    --workdir=/ejjPTP \
    --read-only=true \
    --net=none \
    --user=1000:1000 \
    --rm \
    --cidfile=/tmp/6_1oisyz/20230630162631-435717.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/ejjPTP \
    ubuntu:20.04 \
    bash \
    -c \
    'touch file.ext1 && touch file.ext2'
INFO [job test.cwl] Max memory used: 0MiB
ERROR [job test.cwl] Job error:
("Error collecting output for parameter 'output': test.cwl:20:5: Missing required secondary file '/ejjPTP/file.ext2'", {})
WARNING [job test.cwl] completed permanentFail
DEBUG [job test.cwl] outputs {}
DEBUG [job test.cwl] Removing input staging directory /tmp/ov123yse
DEBUG [job test.cwl] Removing temporary directory /tmp/m9f50qdw
DEBUG Removing intermediate output directory /tmp/hr1231gc
{}WARNING Final process status is permanentFail

ERROR [job test.cwl] Job error:
("Error collecting output for parameter 'output': test.cwl:20:5: Missing required secondary file '/BBMzhN/file.ext2'", {})
WARNING [job test.cwl] completed permanentFail
DEBUG [job test.cwl] outputs {}
DEBUG [job test.cwl] Removing input staging directory /tmp/bhsgvfzk
DEBUG [job test.cwl] Removing temporary directory /tmp/afj9_yrv
DEBUG Removing intermediate output directory /tmp/h6ucwttr
{}
WARNING Final process status is permanentFail

Your Environment

  • cwltool version: 3.1.20220830195442
@mr-c
Copy link
Member

mr-c commented Jul 1, 2023

Hello @ThomasHickman

  • cwltool version: 3.1.20220830195442

Can you try a more recent release? The latest version is dated 2023-06-24: https://pypi.org/project/cwltool/3.1.20230624081518/

@mr-c
Copy link
Member

mr-c commented Jul 1, 2023

I tried with the latest version and can confirm this incorrect behavior with docker (version 24.0.2), --podman (version 4.3.1), --singularity (version 3.11.0), and --udocker (version 1.3.10).

@mr-c
Copy link
Member

mr-c commented Jul 1, 2023

Even old cwltool has this problem, it seems. I tested version 3.1.20211107152837.

Changing the syntax a bit fixes it.

Note from the CWL v1.2 standard:

Default value for required field is true for secondary files on input and false for secondary files on output.

outputs:
  output:
    type: File
    secondaryFiles:
      - ^.ext2

Even though it is not required, it is still found:

INFO [job 1869.cwl] completed success
{
    "output": {
        "location": "file:///home/michael/cwltool/file.ext1",
        "basename": "file.ext1",
        "class": "File",
        "checksum": "sha1$da39a3ee5e6b4b0d3255bfef95601890afd80709",
        "size": 0,
        "secondaryFiles": [
            {
                "basename": "file.ext2",
                "location": "file:///home/michael/cwltool/file.ext2",
                "class": "File",
                "checksum": "sha1$da39a3ee5e6b4b0d3255bfef95601890afd80709",
                "size": 0,
                "path": "/home/michael/cwltool/file.ext2"
            }
        ],
        "path": "/home/michael/cwltool/file.ext1"
    }
}

@mr-c
Copy link
Member

mr-c commented Jul 1, 2023

Okay, here is a fix; I'll add a test and a new cwltool release and I'll submit a new conformance test to be included in CWL v1.2.1+ ; thank you @ThomasHickman !

diff --git a/cwltool/command_line_tool.py b/cwltool/command_line_tool.py
index 53805b00..854d2725 100644
--- a/cwltool/command_line_tool.py
+++ b/cwltool/command_line_tool.py
@@ -1454,7 +1454,10 @@ class CommandLineTool(Process):
                                         continue
                                     if isinstance(sfitem, str):
                                         sfitem = {"path": pathprefix + sfitem}
-                                    if not fs_access.exists(sfitem["path"]) and sf_required:
+                                    if (
+                                        not fs_access.exists(revmap(sfitem)["location"])
+                                        and sf_required
+                                    ):
                                         raise WorkflowException(
                                             "Missing required secondary file '%s'"
                                             % (sfitem["path"])

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants