Wildcards in pipeline dependencies #5252
Comments
You can specify what to ignore in the [...]. Regarding the globbing, I am not sure, as it might hurt reproducibility.
Of course, I should have thought of that! That actually solves my problem as I can just add [...]. I will just close the issue since I don't need it anymore, but it could still be useful for someone who has multiple file types in a directory and only wants to specify a subset of them (not sure how niche that would be, though). I think it would actually be better for reproducibility, as it's easier to glob than to specify individual files in that directory and maintain that list over time in the [...].
DVC works best if the outputs and the dependencies are immutable. Adding glob support might make it inconsistent, as the result would then depend on the workspace. We might need to support this for #331, but I am not sure of other scenarios where it could be useful (for example, in yours, you just needed to specify the directory as a dependency).
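To make the workspace-dependence concern concrete, here is a minimal sketch in plain Python. It does not use or model DVC itself; `snapshot` and `demo` are hypothetical helper names, and the file names are made up for illustration. It only shows that the same glob pattern can resolve to a different file set once an extra file appears locally.

```python
import tempfile
from pathlib import Path


def snapshot(root, pattern="*.py"):
    """Return the sorted file names directly under root that match pattern."""
    return sorted(p.name for p in Path(root).glob(pattern))


def demo():
    """Show that one glob pattern yields different results as the workspace changes."""
    with tempfile.TemporaryDirectory() as workspace:
        root = Path(workspace)
        (root / "load.py").write_text("# source file everyone has\n")
        before = snapshot(root)

        # A scratch file appears in one person's workspace only.
        # The pattern itself has not changed.
        (root / "scratch.py").write_text("# local experiment\n")
        after = snapshot(root)
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print("before:", before)
    print("after: ", after)
```

An explicit file list would stay the same in both cases, which is roughly the trade-off being discussed here: a glob is easier to maintain, while an explicit list is pinned in the pipeline file.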
In pipeline dependencies, it would be a lot easier and more robust if we could include wildcards. For instance, let's say I have a `data` step in my pipeline which makes use of some source files. I'd like to specify all relevant Python files in a given directory as dependencies in `dvc.yaml`.

However, that doesn't work, as wildcards aren't recognised. Instead, I can specify the directory itself (`src/data`), but this appears to include `__pycache__` and `.ipynb_checkpoints` (and anything else that may be in that directory), which is obviously not desired behaviour and results in the DVC hash being different from that of someone who has freshly cloned the repo.
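One possible stopgap, sketched here under stated assumptions, is to expand the wildcard outside DVC and paste the resulting explicit list into the pipeline file. The helpers `expand_deps` and `format_deps` and the constant `EXCLUDED_DIRS` are hypothetical names, not part of DVC; the script only uses Python's standard library and prints a `deps:`-style list for `src/data`, the directory mentioned above.

```python
from pathlib import Path

# Hypothetical exclusion list: directories that should never count as dependencies.
EXCLUDED_DIRS = {"__pycache__", ".ipynb_checkpoints"}


def expand_deps(root, pattern="*.py"):
    """Return sorted POSIX-style paths under root matching pattern,
    skipping any file that lives inside an excluded directory."""
    root = Path(root)
    matches = []
    for path in root.rglob(pattern):
        if not path.is_file():
            continue
        # Compare the path components below root against the exclusion list.
        if EXCLUDED_DIRS.intersection(path.relative_to(root).parts):
            continue
        matches.append(path.as_posix())
    return sorted(matches)


def format_deps(paths):
    """Render the paths as a YAML-style list under a 'deps:' key."""
    return "\n".join(["deps:"] + [f"  - {p}" for p in paths])


if __name__ == "__main__":
    print(format_deps(expand_deps("src/data")))
```

Because the output is sorted and excludes cache directories, two people running it on a fresh clone and on a used workspace should get the same list, provided the tracked `.py` files themselves are the same. The list still has to be regenerated whenever files are added or removed.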