Skip to content

Use existing location if a secondary file has one. #1380

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 6 commits into from

Conversation

DailyDreaming
Copy link
Contributor

@DailyDreaming DailyDreaming commented Nov 30, 2020

This allows Toil to pass v1.2 conformance test 307.

In Toil, we pass the following (example) inputs to the executor:

{
    "inputWithSecondary": {
        "class": "File",
        "location": "toilfs:0:files/no-job/file-1120f3b05d62444083a9ff5c60c7f772/secondary_file_test.txt",
        "size": 0,
        "basename": "secondary_file_test.txt",
        "nameroot": "secondary_file_test",
        "nameext": ".txt",
        "secondaryFiles": [
            {
                "location": "toilfs:0:files/no-job/file-ccead684521d44f98ff81b118f37159b/secondary_file_test.txt.accessory",
                "class": "File",
                "basename": "secondary_file_test.accessory",
                "nameroot": "secondary_file_test",
                "nameext": ".accessory",
                "size": 0
            }
        ]
    }
}

When cwltool makes its builder, it swaps:

toilfs:0:files/no-job/file-ccead684521d44f98ff81b118f37159b/secondary_file_test.txt.accessory

for

toilfs:0:files/no-job/file-1120f3b05d62444083a9ff5c60c7f772/secondary_file_test.txt.accessory

The bolded portion is the same basedir as the primary file, which creates a path to a file that does not exist.

It does this here:

d_location[0 : d_location.rindex("/") + 1]

This seems to be because cwltool:

  1. Assumes that primary and secondary files are in the same directory and thus have the same interchangeable basedir in the toilfs: schema.
  2. Does not check to see if secondaryfiles have been resolved yet. (the list has resolved secondary files in toil because we process with a builder already before it runs a builder again in cwltool)

Two possible proposed solutions:

  • The easier solution imo (this PR) is to check for an already processed secondary file before fabricating a new secondary file's entry.
  • An alternative solution might be to force toil to place secondary files at the same basedir as the primary file if the first solution isn't preferable.

@codecov
Copy link

codecov bot commented Nov 30, 2020

Codecov Report

Merging #1380 (f40a017) into main (78fe9d4) will decrease coverage by 0.07%.
The diff coverage is 7.14%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1380      +/-   ##
==========================================
- Coverage   54.25%   54.17%   -0.08%     
==========================================
  Files          42       42              
  Lines        7767     7780      +13     
  Branches     1976     1984       +8     
==========================================
+ Hits         4214     4215       +1     
- Misses       2970     2981      +11     
- Partials      583      584       +1     
Impacted Files Coverage Δ
cwltool/builder.py 45.48% <7.14%> (-1.25%) ⬇️
cwltool/job.py 41.05% <0.00%> (-0.30%) ⬇️
cwltool/process.py 69.46% <0.00%> (+0.04%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 78fe9d4...f40a017. Read the comment docs.

@mr-c mr-c requested a review from tetron November 30, 2020 22:46
Copy link
Member

@mr-c mr-c left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DailyDreaming
Copy link
Contributor Author

@mr-c The linting should be addressed (although the checking is perhaps a bit verbose; please advise if I you'd prefer I handle it differently). One test is failing because run_test.sh in cwl-v1.2 no longer has executable permissions. I made a PR here and tests should pass once it's merged: common-workflow-language/cwl-v1.2#74

@DailyDreaming DailyDreaming requested a review from mr-c December 7, 2020 05:16
@DailyDreaming DailyDreaming force-pushed the check-for-processed-secondary-files branch 2 times, most recently from 008e6e6 to bf2aa5d Compare December 7, 2020 06:30
Cruft.

Cruft.
@DailyDreaming DailyDreaming force-pushed the check-for-processed-secondary-files branch from f1959ff to 0434297 Compare December 7, 2020 06:40
@mr-c
Copy link
Member

mr-c commented Dec 7, 2020

We are out of Travis CI coins, sorry @DailyDreaming ; If you have Travis CI coins in your personal account, you could fork the repo and run them there

@DailyDreaming
Copy link
Contributor Author

@mr-c Sorry I used your coins. Will be much more conservative about running tests in the future.

All tests passed when run earlier except for the cwl-v1.2 tests, which passed locally for me after the execution bit fix. I'll just wait on Travis here then.

@mr-c
Copy link
Member

mr-c commented Dec 7, 2020

@DailyDreaming Don't worry about it! There are a lot of projects that pull from the same bucket of coins.

@DailyDreaming
Copy link
Contributor Author

Seems like this may have been fixed already. Updating cwltool in Toil seems to have resolved this issue. Closing!

@DailyDreaming DailyDreaming deleted the check-for-processed-secondary-files branch December 17, 2020 19:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants