dvc gc -a fails if dvc.yaml has the wrong format in a branch #3885

Closed
@courentin

Description

Hello!

When running dvc gc -a -v, I get this error:

2020-05-27 12:41:57,820 ERROR: 'dvc.yaml' format error: extra keys not allowed @ data['stages']['eval_classifier_fr']['metrics_no_cache']
------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/***/.local/share/virtualenvs/speech-XlnoNrSS/lib/python3.6/site-packages/dvc/dvcfile.py", line 113, in validate
    cls.SCHEMA(d)
  File "/Users/***/.local/share/virtualenvs/speech-XlnoNrSS/lib/python3.6/site-packages/voluptuous/schema_builder.py", line 272, in __call__
    return self._compiled([], data)
  File "/Users/***/.local/share/virtualenvs/speech-XlnoNrSS/lib/python3.6/site-packages/voluptuous/schema_builder.py", line 594, in validate_dict
    return base_validate(path, iteritems(data), out)
  File "/Users/***/.local/share/virtualenvs/speech-XlnoNrSS/lib/python3.6/site-packages/voluptuous/schema_builder.py", line 432, in validate_mapping
    raise er.MultipleInvalid(errors)
voluptuous.error.MultipleInvalid: extra keys not allowed @ data['stages']['eval_classifier_fr']['metrics_no_cache']

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/***/.local/share/virtualenvs/speech-XlnoNrSS/lib/python3.6/site-packages/dvc/main.py", line 53, in main
    ret = cmd.run()
  File "/Users/***/.local/share/virtualenvs/speech-XlnoNrSS/lib/python3.6/site-packages/dvc/command/gc.py", line 57, in run
    workspace=self.args.workspace,
  File "/Users/***/.local/share/virtualenvs/speech-XlnoNrSS/lib/python3.6/site-packages/dvc/repo/__init__.py", line 25, in wrapper
    ret = f(repo, *args, **kwargs)
  File "/Users/***/.local/share/virtualenvs/speech-XlnoNrSS/lib/python3.6/site-packages/dvc/repo/gc.py", line 73, in gc
    jobs=jobs,
  File "/Users/***/.local/share/virtualenvs/speech-XlnoNrSS/lib/python3.6/site-packages/dvc/repo/__init__.py", line 297, in used_cache
    for stage, filter_info in pairs:
  File "/Users/***/.local/share/virtualenvs/speech-XlnoNrSS/lib/python3.6/site-packages/dvc/repo/__init__.py", line 293, in <genexpr>
    for target in targets
  File "/Users/***/.local/share/virtualenvs/speech-XlnoNrSS/lib/python3.6/site-packages/dvc/repo/__init__.py", line 235, in collect_granular
    return [(stage, None) for stage in self.stages]
  File "/Users/***/.local/share/virtualenvs/speech-XlnoNrSS/lib/python3.6/site-packages/funcy/objects.py", line 28, in __get__
    res = instance.__dict__[self.fget.__name__] = self.fget(instance)
  File "/Users/***/.local/share/virtualenvs/speech-XlnoNrSS/lib/python3.6/site-packages/dvc/repo/__init__.py", line 437, in stages
    return self._collect_stages()
  File "/Users/***/.local/share/virtualenvs/speech-XlnoNrSS/lib/python3.6/site-packages/dvc/repo/__init__.py", line 454, in _collect_stages
    stage_loader = Dvcfile(self, path).stages
  File "/Users/***/.local/share/virtualenvs/speech-XlnoNrSS/lib/python3.6/site-packages/dvc/dvcfile.py", line 222, in stages
    data, _ = self._load()
  File "/Users/***/.local/share/virtualenvs/speech-XlnoNrSS/lib/python3.6/site-packages/dvc/dvcfile.py", line 106, in _load
    self.validate(d, self.relpath)
  File "/Users/***/.local/share/virtualenvs/speech-XlnoNrSS/lib/python3.6/site-packages/dvc/dvcfile.py", line 115, in validate
    raise StageFileFormatError(f"'{fname}' format error: {exc}")
dvc.stage.exceptions.StageFileFormatError: 'dvc.yaml' format error: extra keys not allowed @ data['stages']['eval_classifier_fr']['metrics_no_cache']
------------------------------------------------------------

I figured out that on some older branches on my local machine, dvc.yaml still contains the key metrics_no_cache, which the current schema rejects.

I would expect dvc gc -a to be resilient to format errors in dvc.yaml on older branches, or at least to give a clearer error message that names the offending branch (I had to run my debugger to find out which branch has the invalid dvc.yaml).
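
To illustrate the behavior I have in mind, here is a minimal sketch. It is not DVC's implementation and it does not use DVC's real schema: ALLOWED_STAGE_KEYS, find_extra_keys and check_all_branches are hypothetical names, and the input is assumed to be a dict mapping each branch name to its already-parsed dvc.yaml. Instead of raising on the first invalid file, it collects every offending key together with the branch it came from.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical subset of stage keys, for illustration only.
# This is NOT DVC's actual schema.
ALLOWED_STAGE_KEYS = {"cmd", "wdir", "deps", "params", "outs", "metrics", "plots"}


def find_extra_keys(dvc_yaml: Optional[Dict[str, Any]]) -> List[str]:
    """Return a path (in the style of the error above) for each unknown stage key."""
    problems = []
    stages = (dvc_yaml or {}).get("stages") or {}
    for stage_name, stage in stages.items():
        # A stage with no body parses to None; treat it as empty.
        for key in stage or {}:
            if key not in ALLOWED_STAGE_KEYS:
                problems.append(f"data['stages']['{stage_name}']['{key}']")
    return problems


def check_all_branches(
    files_by_branch: Dict[str, Optional[Dict[str, Any]]]
) -> List[Tuple[str, str]]:
    """Check every branch and collect (branch, path) pairs instead of
    stopping at the first format error."""
    report = []
    for branch, parsed in files_by_branch.items():
        for path in find_extra_keys(parsed):
            report.append((branch, path))
    return report


if __name__ == "__main__":
    # Made-up branch names and contents, only to show the report format.
    example = {
        "master": {"stages": {"train": {"cmd": "python train.py"}}},
        "old-experiment": {
            "stages": {
                "eval_classifier_fr": {
                    "cmd": "python eval.py",
                    "metrics_no_cache": ["metrics.json"],
                }
            }
        },
    }
    for branch, path in check_all_branches(example):
        print(f"{branch}: 'dvc.yaml' format error: extra keys not allowed @ {path}")
```

With a report like this, the command could either skip the invalid revisions with a warning or fail once with a message that lists every affected branch. Which of the two is safer for gc is a design decision for the maintainers.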

Here is my config:

DVC version: 1.0.0a5
Python version: 3.6.8
Platform: Darwin-19.3.0-x86_64-i386-64bit
Binary: False
Package: pip
Supported remotes: http, https, s3
Cache: reflink - supported, hardlink - supported, symlink - supported
Filesystem type (cache directory): ('apfs', '/dev/disk1s2')
Repo: dvc, git
Filesystem type (workspace): ('apfs', '/dev/disk1s2')

Labels: enhancement (Enhances DVC), p2-medium (Medium priority, should be done, but less important)
