Stating dependencies between scripts/modules #1401

Closed
@erdnaavlis

Hello!

I hope the explanation below is clear. Please let me know otherwise.

Say I have a utils.py with some awesome helpful classes that I reuse frequently in a certain DVC-tracked repo.

Say I have:

  • script1.py that uses utils.py and that takes data0.csv and processes it to data1.csv
  • script2.py that uses utils.py and that takes data1.csv and processes it to data2.csv
  • script3.py that uses utils.py and that takes data2.csv and processes it to data3.csv
  • etc ...

(In the example above all scripts are part of the same pipeline, but they could be from different pipelines.)

What could perhaps be improved is that, as far as I know, each data*.csv has to list both the corresponding script and utils.py among its dependencies. This cascades if utils.py depends on utils1.py, which in turn depends on utils2.py, and so on. In that case, every time utils.py is a dependency, I have to remember to include the other utils*.py files as dependencies as well.
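To make the cascade concrete, here is a minimal sketch of a workaround that lives outside DVC: it walks the local imports of a script and prints the full list of `-d` flags. It assumes all modules sit in the same directory as the script, that imports are plain `import x` / `from x import y` statements, and that the command takes the `dvc run -d <dep> -o <out> <command>` form. The helper names (`local_imports`, `transitive_deps`, `dvc_run_args`) are hypothetical and not part of DVC.

```python
import ast
from pathlib import Path


def local_imports(script: Path) -> set[Path]:
    """Return the sibling .py files that `script` imports directly."""
    tree = ast.parse(script.read_text())
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    # Assumption: a module is "local" if <name>.py exists next to the script.
    candidates = (script.parent / f"{name}.py" for name in names)
    return {path for path in candidates if path.is_file()}


def transitive_deps(script: Path) -> list[Path]:
    """Return `script` plus every local module it imports, directly or indirectly."""
    seen = set()
    stack = [script]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; also guards against circular imports
        seen.add(current)
        stack.extend(local_imports(current))
    return sorted(seen)


def dvc_run_args(script: Path, data_in: str, data_out: str) -> list[str]:
    """Build a `dvc run` argument list with all code dependencies spelled out."""
    args = ["dvc", "run"]
    for dep in transitive_deps(script):
        args += ["-d", dep.name]
    args += ["-d", data_in, "-o", data_out, "python", script.name]
    return args


if __name__ == "__main__":
    # Example: print the command for the first stage of the pipeline above.
    script = Path("script1.py")
    if script.is_file():
        print(" ".join(dvc_run_args(script, "data0.csv", "data1.csv")))
```

This only generates the explicit dependency list; it does not make DVC itself aware that utils.py implies utils1.py, so the command would have to be regenerated whenever the imports change.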

Is there a way in DVC to say that scriptB.py depends on scriptA.py, so that whenever scriptB.py is a dependency, scriptA.py is implicitly a dependency too?
Something like a variant of, or an alternative to, dvc run where the "output" is not a data file but a .py file?

Thanks in advance!
