run: run is computing checksums even though --no-exec is specified #5368

Closed
Honzys opened this issue Jan 29, 2021 · 6 comments
Labels
awaiting response we are waiting for your reply, please respond! :)

Comments

@Honzys
Contributor

Honzys commented Jan 29, 2021

Bug Report

run: run cache is not ignored when --no-exec

Description

When you specify the --no-exec param, the dvc run command should behave as if --no-run-cache was also specified. I guess it makes no sense to check the cache when --no-exec is set, or am I wrong?

The problem is that when I run dvc run --no-exec ... with a dependency pointing to a large folder, the command takes a long time to execute. After the stage has been created, actually running it with dvc repro also takes a long time.

Our use-case is that we generate all the stages with --no-exec param before we run it for real. If we somehow change the pipeline we would rerun the "generation of all stages with --no-exec param" and then again run it for real after it's refreshed.

Would it make sense to set --no-run-cache to True when --no-exec is specified? Or am I overlooking something?

Thank you very much !

Reproduce

Example:

  1. Create a large folder, ideally with lots of images
  2. dvc run --no-exec --force --deps folder_path -n stage # This should complete in an instant
  3. dvc repro stage # This should take some time to compute checksums for the dependencies
  4. dvc run --no-exec --force --deps folder_path -n stage # This should complete in an instant (but it actually takes some time)
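
The four steps above could be turned into a runnable script along these lines. This is only a sketch: the file names, the file count, and the placeholder command `echo done` are assumptions (the original steps give no command), it has not been checked against the reporter's environment, and on a fast local disk the gap between steps 2 and 4 may be much smaller than on a NAS.

```python
import os
import subprocess
import tempfile
import time
from pathlib import Path


def make_many_small_files(root: Path, count: int = 1000) -> None:
    """Populate `root` with `count` tiny files standing in for the image folder."""
    root.mkdir(parents=True, exist_ok=True)
    for i in range(count):
        (root / f"img_{i:06d}.bin").write_bytes(os.urandom(64))


def dvc_run_no_exec(deps: str, name: str, command: str) -> list:
    """Build the `dvc run --no-exec` invocation used in steps 2 and 4."""
    return ["dvc", "run", "--no-exec", "--force", "--deps", deps, "-n", name, command]


def timed(argv: list, cwd: Path) -> float:
    """Run a command in `cwd` and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, cwd=cwd, check=True)
    return time.perf_counter() - start


def reproduce(count: int = 1000) -> None:
    """Run steps 1-4 in a scratch repo; needs git and a dvc 1.x CLI on PATH."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        subprocess.run(["git", "init", "-q"], cwd=repo, check=True)
        subprocess.run(["dvc", "init", "-q"], cwd=repo, check=True)
        make_many_small_files(repo / "folder_path", count)  # step 1
        cmd = dvc_run_no_exec("folder_path", "stage", "echo done")  # placeholder command
        print("step 2:", timed(cmd, repo))
        print("step 3:", timed(["dvc", "repro", "stage"], repo))
        print("step 4:", timed(cmd, repo))
```

Calling `reproduce()` prints the three timings; nothing runs on import, so the helpers can be reused with a real dependency folder.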

Expected

I would expect steps 2 and 4 to complete in approximately the same amount of time.

Environment information

Output of dvc version:

DVC version: 1.11.13 (pip)
---------------------------------
Platform: Python 3.6.8 on Linux-4.18.0-259.el8.x86_64-x86_64-with-centos-8
Supports: http, https, s3
Cache types: reflink, hardlink, symlink
Cache directory: xfs on *********
Caches: local
Remotes: s3
Workspace directory: xfs on *********
Repo: dvc, git
@efiop
Contributor

efiop commented Feb 1, 2021

Hi @Honzys !

I'm not able to reproduce the issue with 1.11.13 or newer dvc. I see that the commands you've provided are not the actual commands that you've used (e.g. there is no command in the dvc run), so probably something important is missing here. Could you provide actual commands or provide a reproducer script that actually runs, please?

@efiop added the "awaiting response" label Feb 1, 2021
@Honzys
Contributor Author

Honzys commented Feb 1, 2021

@efiop Thank you for your response.

I am sorry, I wrote the reproducer as a "pseudocode" just to show the basic idea.

Unfortunately, I cannot create an exact reproducer for this issue (since I cannot provide you with the dependency folder). But I can show you some other info.

When I run dvc run as in step 4 above, in verbose mode, and assuming the stage was already reproduced once with dvc repro, this is the output:

$ dvc run --verbose -n some_stage --deps folder_on_nas --wdir . --no-exec --force whatever_command_we_want
2021-02-01 19:36:15,560 DEBUG: Check for update is enabled.
2021-02-01 19:36:15,573 DEBUG: fetched: [(3,)]                        
2021-02-01 19:36:15,598 DEBUG: Assuming '/opt/dvc_cache/d3/a47aba9c5976a1635bce713740ac31' is unchanged since it is read-only
2021-02-01 19:36:15,599 DEBUG: Path '/home/dev/output.h5' inode '5980356654'
2021-02-01 19:36:15,599 DEBUG: fetched: [('1612034172963527168', '4917983', 'd3a47aba9c5976a1635bce713740ac31', '1612207342449105664')]
^C2021-02-01 19:38:03,992 DEBUG: fetched: [(76,)]
2021-02-01 19:38:04,026 ERROR: interrupted by the user
------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/venv/lib/python3.6/site-packages/dvc/main.py", line 90, in main
    ret = cmd.run()
  File "/opt/venv/lib/python3.6/site-packages/dvc/command/run.py", line 60, in run
    desc=self.args.desc,
  File "/opt/venv/lib/python3.6/site-packages/dvc/repo/__init__.py", line 54, in wrapper
    return f(repo, *args, **kwargs)
  File "/opt/venv/lib/python3.6/site-packages/dvc/repo/scm_context.py", line 4, in run
    result = method(repo, *args, **kw)
  File "/opt/venv/lib/python3.6/site-packages/dvc/repo/run.py", line 117, in run
    if kwargs.get("run_cache", True) and stage.can_be_skipped:
  File "/opt/venv/lib/python3.6/site-packages/dvc/stage/__init__.py", line 387, in can_be_skipped
    if self.is_cached and not self.is_callback and not self.always_changed:
  File "/opt/venv/lib/python3.6/site-packages/dvc/stage/__init__.py", line 681, in is_cached
    return self.name in self.dvcfile.stages and super().is_cached
  File "/opt/venv/lib/python3.6/site-packages/dvc/stage/__init__.py", line 405, in is_cached
    self.save_deps()
  File "/opt/venv/lib/python3.6/site-packages/dvc/stage/__init__.py", line 443, in save_deps
    dep.save()
  File "/opt/venv/lib/python3.6/site-packages/dvc/output/base.py", line 276, in save
    self.hash_info = self.get_hash()
  File "/opt/venv/lib/python3.6/site-packages/dvc/output/base.py", line 186, in get_hash
    return self.tree.get_hash(self.path_info)
  File "/opt/venv/lib/python3.6/site-packages/funcy/decorators.py", line 39, in wrapper
    return deco(call, *dargs, **dkwargs)
  File "/opt/venv/lib/python3.6/site-packages/dvc/tree/base.py", line 45, in use_state
    return call()
  File "/opt/venv/lib/python3.6/site-packages/funcy/decorators.py", line 60, in __call__
    return self._func(*self._args, **self._kwargs)
  File "/opt/venv/lib/python3.6/site-packages/dvc/tree/base.py", line 271, in get_hash
    hash_info = self.state.get(path_info)
  File "/opt/venv/lib/python3.6/site-packages/dvc/state.py", line 446, in get
    actual_mtime, actual_size = get_mtime_and_size(path, self.tree)
  File "/opt/venv/lib/python3.6/site-packages/dvc/utils/fs.py", line 40, in get_mtime_and_size
    stats = tree.stat(file_path)
  File "/opt/venv/lib/python3.6/site-packages/dvc/tree/local.py", line 162, in stat
    return os.stat(path)
KeyboardInterrupt
------------------------------------------------------------
2021-02-01 19:38:04,062 DEBUG: Analytics is enabled.
2021-02-01 19:38:04,121 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/tmp/tmpl552fx8n']'
2021-02-01 19:38:04,123 DEBUG: Spawned '['daemon', '-q', 'analytics', '/tmp/tmpl552fx8n']'

As you can see, it is stuck for quite some time computing the hash (I guess for the dependencies provided) until I interrupted it. Since the dependency is a folder on our NAS with a lot of small images, computing a checksum for it is expected to take quite some time. But the question is: should the hash be computed even though the --no-exec param is provided?
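
To illustrate why a folder with many small files is slow here: the traceback ends in a per-file `os.stat` call, so the cost grows with the number of files, and each call is a round trip on network storage. The function below is a hypothetical sketch of such a walk, not DVC's actual `get_mtime_and_size`.

```python
import os
import tempfile
from pathlib import Path


def walk_mtime_and_size(root):
    """Stat every file under `root`; return (newest mtime, total size, stat calls)."""
    newest, total, calls = 0.0, 0, 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.stat(os.path.join(dirpath, name))  # one call per file
            calls += 1
            total += st.st_size
            newest = max(newest, st.st_mtime)
    return newest, total, calls


with tempfile.TemporaryDirectory() as tmp:
    for i in range(200):
        (Path(tmp) / f"img_{i}.bin").write_bytes(b"x" * 10)
    _, size, calls = walk_mtime_and_size(tmp)
    print(size, calls)  # 2000 200
```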

Shouldn't this line https://github.com/iterative/dvc/blob/1.11.13/dvc/repo/run.py#L117
be changed from:

if kwargs.get("run_cache", True) and stage.can_be_skipped:

to

if not no_exec and kwargs.get("run_cache", True) and stage.can_be_skipped:
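
The effect of the proposed condition can be shown with a small mock. `FakeStage` and the two check functions are hypothetical stand-ins written for this illustration, not DVC classes; only the boolean expressions are taken from the lines above.

```python
class FakeStage:
    """Stand-in for a stage whose skip check hashes its dependencies."""

    def __init__(self):
        self.hash_calls = 0

    @property
    def can_be_skipped(self):
        # In the traceback, reading this property is what leads to dep.save().
        self.hash_calls += 1
        return True


def original_check(stage, no_exec, **kwargs):
    return bool(kwargs.get("run_cache", True) and stage.can_be_skipped)


def proposed_check(stage, no_exec, **kwargs):
    # `and` short-circuits: with no_exec=True, can_be_skipped is never read.
    return bool(not no_exec and kwargs.get("run_cache", True) and stage.can_be_skipped)


before, after = FakeStage(), FakeStage()
original_check(before, no_exec=True)
proposed_check(after, no_exec=True)
print(before.hash_calls, after.hash_calls)  # 1 0
```

The same short-circuit applies when `run_cache` is false, which would be consistent with the --no-run-cache run finishing instantly.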

If I run the exact same stage with --no-run-cache it is completed instantly - which is what I would expect.

$ dvc run --verbose -n some_stage --deps folder_on_nas --wdir . --no-exec --no-run-cache --force whatever_command_we_want
2021-02-01 19:53:33,687 DEBUG: Check for update is enabled.
2021-02-01 19:53:33,700 DEBUG: fetched: [(3,)]
Modifying stage 'some_stage' in 'dvc.yaml'

To track the changes with git, run:

        git add dvc.yaml
2021-02-01 19:53:34,154 DEBUG: fetched: [(76,)]
2021-02-01 19:53:34,190 DEBUG: Analytics is enabled.
2021-02-01 19:53:34,284 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/tmp/tmp_4thwkz3']'
2021-02-01 19:53:34,286 DEBUG: Spawned '['daemon', '-q', 'analytics', '/tmp/tmp_4thwkz3']'

IMHO checking the hashes is unnecessary, since the stage shouldn't actually be run when the --no-exec param is set.
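
Until the check is skipped automatically, a stage-generation script could pass --no-run-cache alongside --no-exec as a workaround. The sketch below assumes a hypothetical list of (name, deps, command) tuples; the helper names are made up for illustration, and only the dvc run flags shown in this thread are used.

```python
import subprocess


def build_stage_command(name, deps, command, workaround=True):
    """Return the argv that registers one stage without executing it."""
    argv = ["dvc", "run", "--no-exec", "--force", "-n", name]
    if workaround:
        # Avoids the run-cache check that triggers dependency hashing.
        argv.append("--no-run-cache")
    for dep in deps:
        argv += ["--deps", dep]
    argv.append(command)  # options must come before the command
    return argv


def generate_stages(stages, workaround=True):
    """Register every stage; needs the dvc CLI on PATH inside a DVC repo."""
    for name, deps, command in stages:
        subprocess.run(build_stage_command(name, deps, command, workaround), check=True)


# Hypothetical pipeline definition, for illustration only.
STAGES = [("some_stage", ["folder_on_nas"], "whatever_command_we_want")]
print(build_stage_command(*STAGES[0]))
```

Passing `workaround=False` gives the plain --no-exec invocation, which should be enough once a release containing the fix is installed.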

Please let me know if you need more info on this, I am not sure that I described the issue clearly.

@Honzys changed the title from "run: run cache is not ignored when --no-exec is specified" to "run: run is computing checksums even though --no-exec is specified" Feb 1, 2021
@efiop
Contributor

efiop commented Feb 1, 2021

Thank you @Honzys ! That makes sense! That's an old piece of legacy caching that we were meaning to get rid of. We'll bypass it in 1.11.x and will likely get rid of it in 2.0 soon. Thank you for the feedback! 🙏

@efiop
Contributor

efiop commented Feb 1, 2021

@Honzys Btw, would you like to submit a PR for 1.11 branch with that not no_exec fix you've tried? 🙂

@Honzys
Contributor Author

Honzys commented Feb 1, 2021

@Honzys Btw, would you like to submit a PR for 1.11 branch with that not no_exec fix you've tried?

Yes, I can submit a PR.

Honzys added a commit to Honzys/dvc that referenced this issue Feb 1, 2021
Prevent computing hashes of dependencies when
no_exec is set.

Fixes iterative#5368
efiop pushed a commit that referenced this issue Feb 2, 2021
Prevent computing hashes of dependencies when
no_exec is set.

Fixes #5368

Co-authored-by: Jan Stratil <[email protected]>
@efiop efiop mentioned this issue Feb 2, 2021
11 tasks
@efiop
Contributor

efiop commented Feb 2, 2021

1.11 is fixed by #5380 and 2.0 will remove that logic in #4841
