Description of the bug
I have been processing my data with fetchngs for over a month and am quite satisfied with the results. However, I recently ran into problems when forcing data download via sratools. Everything worked fine in May, but I had to reprocess and therefore redownload some of the samples in June, and the pipeline now fails with an error when fetching the data with prefetch. I vaguely remember reading that the SRA changed its data storage policies (or something similar) around the beginning of June. Both the error itself and the timing (the same pipeline command that worked in May fails in June) hint at a connection to this change.

The .command.log file of the affected jobs shows the core of the issue: prefetch does not download the usual *.sra file but something called *.sralite. prefetch just puts this file in the temp directory and not in the ./temp_dir/SRAsomething directory expected by vdb-validate, so the subsequent vdb-validate command does not find it. This in turn causes the pipeline to fail.

I haven't checked whether vdb-validate also accepts the *.sralite file. If it does, the problem could be resolved by checking whether prefetch generated the expected folder or the *.sralite file and handling the two cases accordingly. In the meantime, downloading the failing samples via the ENA FTP is still possible, so a temporary workaround is to download everything I can with sratools and fetch the rest from the FTP.
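A minimal sketch of the kind of check described above, written in Python for illustration only. The function name `locate_prefetch_output` and the candidate file layouts are assumptions based on the terminal output in this report, not on the pipeline's actual code, and whether vdb-validate accepts the resulting *.sralite path has not been verified.

```python
from pathlib import Path
from typing import Optional


def locate_prefetch_output(temp_dir: str, accession: str) -> Optional[Path]:
    """Return the first path that looks like prefetch output, or None.

    Hypothetical helper: the candidate layouts below are assumptions
    drawn from this report, not documented prefetch behaviour.
    """
    base = Path(temp_dir)
    candidates = [
        base / accession,               # expected layout: a per-accession directory
        base / f"{accession}.sra",      # assumed: a bare .sra file in the temp directory
        base / f"{accession}.sralite",  # layout seen in this report's terminal output
    ]
    for candidate in candidates:
        if candidate.exists():
            return candidate
    return None


if __name__ == "__main__":
    # Example call with the accession from this report.
    found = locate_prefetch_output("./temp_dir", "ERR1141695")
    if found is None:
        print("no prefetch output found; fall back to the ENA FTP download")
    else:
        print(f"pass this path on to validation: {found}")
```

The directory is checked first so that runs that still produce the old layout behave as before; the *.sralite case is only a fallback.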
Command used and terminal output
nextflow run nf-core/fetchngs ... --force_sratools_download
2022-06-24T14:44:39 prefetch.2.11.0 int: self NULL while reading file within network system module - cannot Make Compute Environment Token
2022-06-24T14:44:40 prefetch.2.11.0: 1) Downloading 'ERR1141695.sralite'...
2022-06-24T14:44:40 prefetch.2.11.0: Downloading via HTTPS...
|-------------------------------------------------- 100%
2022-06-24T14:45:04 prefetch.2.11.0: HTTPS download succeed
2022-06-24T14:45:05 prefetch.2.11.0: 'ERR1141695.sralite' is valid
2022-06-24T14:45:05 prefetch.2.11.0: 1) 'ERR1141695.sralite' was downloaded successfully
2022-06-24T14:45:06 vdb-validate.2.11.0 info: 'ERR1141695' could not be found
Relevant files
No response
System information
No response