Bug Report
Description
I have DVC set up in the root of my project folder, which is at
C:\Users\raylu\Documents\Github\audit-engine
The stage file is located at
resources\WI_Ozaukee_20201103\dvc\precheck\dvc.yaml
I run this command:
dvc status -R -v -v -v --show-json resources\WI_Ozaukee_20201103\dvc
I expect it to walk only the subtree under
C:\Users\raylu\Documents\Github\audit-engine\resources\WI_Ozaukee_20201103\dvc
to look for dvc.yaml stage files. Instead, it appears to walk the full tree below
C:\Users\raylu\Documents\Github\audit-engine
and this takes 75 seconds (there are 112 GB of data).
This is just a hunch, but to test it we temporarily moved the .dvc folder into
C:\Users\raylu\Documents\Github\audit-engine\resources\WI_Ozaukee_20201103\dvc
and it then takes only 5.6 seconds, which is still fairly long. It should probably take only a second or two, because getting the ETags from the three S3 files is very fast and DVC only needs to find one stage file. Something seems wrong here.
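To check the hunch independently of DVC, the cost of a plain directory walk from the repo root can be compared with a walk from the target folder. The following is only a rough sketch: `find_stage_files` and `compare_walks` are hypothetical helper names, and `os.walk` is a stand-in that says nothing about how DVC actually traverses the tree.

```python
import os
import time


def find_stage_files(top, stage_name="dvc.yaml"):
    """Walk `top` and return (paths of stage files found, directories visited)."""
    found = []
    visited = 0
    for dirpath, _dirnames, filenames in os.walk(top):
        visited += 1
        if stage_name in filenames:
            found.append(os.path.join(dirpath, stage_name))
    return found, visited


def compare_walks(repo_root, target):
    """Time a walk from the repo root against a walk from the target only."""
    results = {}
    for label, top in (("repo_root", repo_root), ("target", target)):
        start = time.perf_counter()
        found, visited = find_stage_files(top)
        results[label] = {
            "stage_files": len(found),
            "dirs_visited": visited,
            "seconds": time.perf_counter() - start,
        }
    return results


if __name__ == "__main__":
    # Paths taken from this report; adjust them for another machine.
    root = r"C:\Users\raylu\Documents\Github\audit-engine"
    target = os.path.join(root, "resources", "WI_Ozaukee_20201103", "dvc")
    for label, stats in compare_walks(root, target).items():
        print(label, stats)
```

If the repo-root walk alone accounts for most of the 75 seconds, that would support the idea that the whole tree is being traversed rather than just the target.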
Reproduce
To reproduce this, DVC must be configured with no SCM, no remote, and no cache, and dvc status must be run with -R so that it can find the dvc.yaml stage files. We have only one.
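For anyone trying to reproduce the timings, a small harness like the one below may help. It is a sketch only: `time_command` and `time_dvc_status` are hypothetical helpers, the command line is the one quoted in this report, and `dvc` is assumed to be on the PATH.

```python
import subprocess
import time


def time_command(args, cwd=None):
    """Run a command, discard its output, and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(
        args,
        cwd=cwd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return completed.returncode, time.perf_counter() - start


def time_dvc_status(repo_root, target):
    """Time the status command from this report, run from the repo root.

    Assumes the `dvc` executable is installed and reachable on the PATH.
    """
    args = ["dvc", "status", "-R", "-v", "-v", "-v", "--show-json", target]
    return time_command(args, cwd=repo_root)
```

Calling `time_dvc_status(r"C:\Users\raylu\Documents\Github\audit-engine", r"resources\WI_Ozaukee_20201103\dvc")` once with .dvc in the repo root and once with it moved into the target folder should give the two numbers to compare.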
Expected
See above.
Environment information
Output of dvc doctor:
$ dvc doctor
DVC version: 2.6.4 (pip)
---------------------------------
Platform: Python 3.7.6 on Windows-10-10.0.19041-SP0
Supports:
http (requests = 2.24.0),
https (requests = 2.24.0),
s3 (s3fs = 2021.8.0, boto3 = 1.17.106)
Additional Information (if any):
I will attach the profile dump and plot.
Profile Dump
https://cdn.discordapp.com/attachments/882823608949411850/884465153716920380/dump.prof
https://cdn.discordapp.com/attachments/882823608949411850/884467942111203348/image_output.png
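The dump can be inspected without a plotting tool by loading it with Python's standard `pstats` module. This is a minimal sketch: `top_cumulative` is a hypothetical helper, and it assumes that dump.prof was written by cProfile and is read with a compatible Python version.

```python
import io
import pstats


def top_cumulative(profile_path, limit=20):
    """Return a text report of the `limit` entries with the highest cumulative time."""
    buffer = io.StringIO()
    stats = pstats.Stats(profile_path, stream=buffer)
    # strip_dirs() shortens file paths; "cumulative" sorts by time spent in a
    # function including everything it calls.
    stats.strip_dirs().sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()
```

Running `print(top_cumulative("dump.prof"))` on the downloaded file should show whether directory-walking calls dominate the cumulative time.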