
dvc run with --overwrite-dvcfile is not caching runs #2843


Closed
SdgJlbl opened this issue Nov 25, 2019 · 4 comments · Fixed by iterative/dvc.org#832
Labels
awaiting response we are waiting for your reply, please respond! :) question I have a question?

Comments

@SdgJlbl
Contributor

SdgJlbl commented Nov 25, 2019

Please provide information about your setup
DVC version 0.70.0
Ubuntu 18.04
Installed via pip in a conda env

According to the documentation (and, to the best of my knowledge, as was the case in previous versions of DVC), dvc run should not re-execute the command if nothing has changed. This is no longer the case.

To reproduce:

mkdir tmp
cd tmp
git init
dvc init
echo "echo 'RUNNING'; echo 'hello' > output.out" > cmd.sh
chmod +x cmd.sh
dvc run --overwrite-dvcfile -o output.out ./cmd.sh 
# this prints "RUNNING" as the command is executed
dvc run --overwrite-dvcfile -o output.out ./cmd.sh 
# the run should be cached but it's not and "RUNNING" is printed again
# expected output: "Stage is cached, skipping."

@ghost ghost added the triage Needs to be triaged label Nov 25, 2019
@efiop efiop added awaiting response we are waiting for your reply, please respond! :) question I have a question? labels Dec 2, 2019
@ghost ghost removed the triage Needs to be triaged label Dec 2, 2019
@efiop
Contributor

efiop commented Dec 2, 2019

Hi @SdgJlbl !

Sorry for the delay, the notification got lost in the flow 🙁 The command that you are using doesn't have any dependencies, so it is treated as a "callback stage", meaning that it is always considered changed. Here is the piece of code that is responsible: https://github.com/iterative/dvc/blob/0.71.0/dvc/stage.py#L527. It has been like that for a very long time. Regular "non-callback" stages still use the build cache. Btw, which doc were you talking about? Could you post a link, please? Which dvc version did you upgrade from?
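
To illustrate the callback-stage rule described in this comment, here is a minimal Python sketch. It is not DVC's actual code: the `Stage` class, its fields, and the `deps_changed` parameter are hypothetical names chosen for illustration, and the real logic lives in the linked `dvc/stage.py`. The sketch only assumes what the comment states, namely that a stage with a command and no dependencies is always considered changed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Stage:
    """Hypothetical, simplified stand-in for a DVC stage (not DVC's real class)."""

    cmd: Optional[str]
    deps: List[str] = field(default_factory=list)
    outs: List[str] = field(default_factory=list)

    @property
    def is_callback(self) -> bool:
        # Assumption: a callback stage runs a command but declares no dependencies.
        return self.cmd is not None and len(self.deps) == 0

    def changed(self, deps_changed: bool = False) -> bool:
        # Callback stages are always treated as changed, so they always re-run.
        if self.is_callback:
            return True
        # Otherwise the stage is changed only if its dependencies changed.
        return deps_changed


# Mirrors the reproduction above: an output, a command, no dependencies.
no_deps = Stage(cmd="./cmd.sh", outs=["output.out"])
# Same stage, but with the script declared as a dependency.
with_deps = Stage(cmd="./cmd.sh", deps=["cmd.sh"], outs=["output.out"])

print(no_deps.changed())    # True
print(with_deps.changed())  # False
```

Under this model, the stage from the reproduction steps always re-runs, while the variant that declares a dependency is skipped as long as that dependency is unchanged.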

@SdgJlbl
Contributor Author

SdgJlbl commented Dec 3, 2019

Thank you for your reply, I have checked the behaviour with a dependency and it is actually behaving as it should.

I was referring to the reference documentation https://dvc.org/doc/command-reference/run. It states that:

if an exactly equal DVC-file exists (same list of outputs and inputs, the same command to run), which has been already executed and is up to date, dvc run won't normally execute the command again (thus "build cache")

with no explicit mention that a step with no dependency will never be cached.

This was not the case in version 0.34, where even steps with no dependencies were cached, hence my tests breaking :)
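
The documentation passage quoted above can be read as an equality check between the existing DVC-file and the stage being requested. The sketch below models that reading in Python, with the no-dependency exception discussed in this thread added as an explicit condition. `StageSpec`, `can_skip`, and the `up_to_date` flag are hypothetical names for illustration only, not part of DVC's API.

```python
from typing import NamedTuple, Optional, Tuple


class StageSpec(NamedTuple):
    """Hypothetical summary of what a DVC-file records for a stage."""

    cmd: str
    deps: Tuple[str, ...]
    outs: Tuple[str, ...]


def can_skip(existing: Optional[StageSpec], new: StageSpec, up_to_date: bool) -> bool:
    """Return True if the command does not need to be executed again."""
    if existing is None:
        # No previous DVC-file, so there is nothing to reuse.
        return False
    if not new.deps:
        # Assumption from this thread: stages without dependencies are never cached.
        return False
    # Same command, inputs, and outputs, already executed and up to date.
    return existing == new and up_to_date


cached = StageSpec(cmd="./cmd.sh", deps=("cmd.sh",), outs=("output.out",))
uncached = StageSpec(cmd="./cmd.sh", deps=(), outs=("output.out",))

print(can_skip(cached, cached, up_to_date=True))      # True
print(can_skip(uncached, uncached, up_to_date=True))  # False
```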

@efiop
Contributor

efiop commented Dec 3, 2019

@SdgJlbl Thanks for the info! Created iterative/dvc.org#832 to clarify that in the docs.

Indeed, 0.34 didn't have that logic in place. Do you still need build cache for no-deps dvc-files? If so, could you describe your scenario a bit more, please?

@SdgJlbl
Contributor Author

SdgJlbl commented Dec 3, 2019

Actually, in my real pipeline, I never use dvc without dependencies. It was just a use case popping up in some tests where we didn't bother adding a dependency as input (because uh tests ^^).
But I understand the logic behind considering that one probably always wants to execute a no-deps step, thanks for the clarification.
