Bug Report
Description
dvc pull fails to fetch some files from an Azure remote, even though the failing files do exist on the remote. This happens for both image and text data.
Output of dvc pull -v for one of the failing files:
2022-04-20 14:31:37,043 ERROR: failed to transfer 'md5: 2b764da58921d765c111b8dfddb181d2'
------------------------------------------------------------
Traceback (most recent call last):
File "C:\Users\user\Miniconda3\envs\test\lib\site-packages\dvc\data\transfer.py", line 25, in wrapper
func(fs_path, *args, **kwargs)
File "C:\Users\user\Miniconda3\envs\test\lib\site-packages\dvc\data\transfer.py", line 162, in func
return dest.add(
File "C:\Users\user\Miniconda3\envs\test\lib\site-packages\dvc\objects\db.py", line 117, in add
self._add_file(
File "C:\Users\user\Miniconda3\envs\test\lib\site-packages\dvc\objects\db.py", line 89, in _add_file
return fs.utils.transfer(
File "C:\Users\user\Miniconda3\envs\test\lib\site-packages\dvc\fs\utils.py", line 96, in transfer
_try_links(links, from_fs, from_path, to_fs, to_path)
File "C:\Users\user\Miniconda3\envs\test\lib\site-packages\dvc\fs\utils.py", line 66, in _try_links
return _copy(from_fs, from_path, to_fs, to_path)
File "C:\Users\user\Miniconda3\envs\test\lib\site-packages\dvc\fs\utils.py", line 47, in _copy
return from_fs.download_file(from_path, to_path)
File "C:\Users\user\Miniconda3\envs\test\lib\site-packages\dvc\fs\base.py", line 292, in download
return self._download_file(from_info, to_info, callback=callback)
File "C:\Users\user\Miniconda3\envs\test\lib\site-packages\dvc\fs\base.py", line 354, in _download_file
self.get_file(from_info, tmp_file, callback=callback)
File "C:\Users\user\Miniconda3\envs\test\lib\site-packages\dvc\fs\fsspec_wrapper.py", line 251, in get_file
total: int = self.getsize(from_info)
File "C:\Users\user\Miniconda3\envs\test\lib\site-packages\dvc\fs\base.py", line 177, in getsize
return self.info(path).get("size")
File "C:\Users\user\Miniconda3\envs\test\lib\site-packages\dvc\fs\fsspec_wrapper.py", line 106, in info
return self.fs.info(path)
File "C:\Users\user\Miniconda3\envs\test\lib\site-packages\adlfs\spec.py", line 627, in info
return sync(self.loop, self._info, path, refresh)
File "C:\Users\user\Miniconda3\envs\test\lib\site-packages\fsspec\asyn.py", line 65, in sync
raise return_result
File "C:\Users\user\Miniconda3\envs\test\lib\site-packages\fsspec\asyn.py", line 25, in _runner
result[0] = await coro
File "C:\Users\user\Miniconda3\envs\test\lib\site-packages\adlfs\spec.py", line 654, in _info
out = await self._ls(path, invalidate_cache=invalidate_cache, **kwargs)
File "C:\Users\user\Miniconda3\envs\test\lib\site-packages\adlfs\spec.py", line 876, in _ls
raise FileNotFoundError
FileNotFoundError
------------------------------------------------------------
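To double-check that a failing object really is on the remote, it helps to know which blob path the md5 from the error should map to. The snippet below is a minimal sketch, assuming the usual DVC 2.x remote layout (the first two hex characters of the md5 form a directory, the rest is the file name). The function name remote_object_path and the prefix "mycontainer/dvc-store" are hypothetical and not part of DVC; substitute the container and path from your own remote URL.

```python
import posixpath


def remote_object_path(md5: str, remote_prefix: str = "") -> str:
    """Build the blob path an md5 is assumed to map to on a DVC 2.x remote.

    Hypothetical helper, not a DVC API. Assumes the layout
    <prefix>/<first 2 hex chars>/<remaining chars>.
    """
    md5 = md5.strip().lower()
    # Directory objects carry a ".dir" suffix; validate only the hex part.
    hex_part = md5.split(".", 1)[0]
    if len(hex_part) != 32 or any(c not in "0123456789abcdef" for c in hex_part):
        raise ValueError(f"not a valid md5 hex digest: {md5!r}")
    # Remote paths always use forward slashes, even on Windows.
    return posixpath.join(remote_prefix, md5[:2], md5[2:])


if __name__ == "__main__":
    # md5 taken from the error message above; the prefix is a placeholder.
    print(remote_object_path("2b764da58921d765c111b8dfddb181d2", "mycontainer/dvc-store"))
```

The printed path can then be compared with what the Azure portal or a storage browser shows for the container. If the blob is present at that exact path, the FileNotFoundError more likely comes from how the client lists or resolves the path than from missing data.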
I tried several versions of DVC; the error message above is from this one:
DVC version: 2.10.1 (conda)
---------------------------------
Platform: Python 3.9.12 on Windows-10-10.0.19042-SP0
Supports:
azure (adlfs = 2022.4.0, knack = 0.6.3, azure-identity = 1.9.0),
webhdfs (fsspec = 2022.3.0),
http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6)
Cache types: hardlink
Cache directory: NTFS on C:\
Caches: local
Remotes: azure
Workspace directory: NTFS on C:\
Repo: dvc, git
Remotes: azure
Workspace directory: NTFS on F:\
Repo: dvc, git