This repository was archived by the owner on Sep 11, 2023. It is now read-only.

Dataset Validation script #138

Closed
@JackKelly


Detailed Description

A script which goes through all the pre-prepared batches and checks:

  • That there's no overlap between train, test, and validation sets :)
  • That every batch contains the fields we'd expect
  • That every value is within the range we'd expect
  • That the duration of each example is what we'd expect
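
A rough sketch of what such a script could look like, just to make the four checks above concrete. Everything here is an assumption for illustration: the batch layout (a dict of per-example lists), the field names `example_id`, `timestamps` and `values`, the valid range, and the expected duration are all hypothetical and not taken from the real pre-prepared batches. Overlap is checked on example IDs here; the real check might need to compare time periods instead.

```python
# Hypothetical batch layout: a dict where each field holds one entry per example.
# None of these names or constants come from the real dataset.
EXPECTED_FIELDS = {"example_id", "timestamps", "values"}
VALUE_RANGE = (0.0, 1.0)   # assumed valid range for every entry in "values"
EXPECTED_DURATION = 3      # assumed number of timesteps per example


def check_no_overlap(splits):
    """Return error strings for example IDs that appear in more than one split."""
    ids = {
        name: {eid for batch in batches for eid in batch.get("example_id", [])}
        for name, batches in splits.items()
    }
    errors = []
    names = sorted(ids)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            shared = ids[first] & ids[second]
            if shared:
                errors.append(f"{first}/{second} overlap: {sorted(shared)}")
    return errors


def check_batch(batch):
    """Return error strings for missing fields, bad durations, or bad values."""
    missing = EXPECTED_FIELDS - set(batch)
    if missing:
        # Without the expected fields the remaining checks cannot run.
        return [f"missing fields: {sorted(missing)}"]
    errors = []
    low, high = VALUE_RANGE
    for eid, times, values in zip(
        batch["example_id"], batch["timestamps"], batch["values"]
    ):
        if len(times) != EXPECTED_DURATION or len(values) != EXPECTED_DURATION:
            errors.append(f"example {eid}: unexpected duration {len(times)}")
        # "not (low <= v <= high)" also flags NaN, which fails every comparison.
        if any(not (low <= v <= high) for v in values):
            errors.append(f"example {eid}: value out of range")
    return errors


def validate(splits):
    """Run all checks over every batch in every split and collect the errors."""
    errors = check_no_overlap(splits)
    for name, batches in splits.items():
        for index, batch in enumerate(batches):
            errors += [f"{name}[{index}]: {e}" for e in check_batch(batch)]
    return errors
```

Collecting errors into a list, rather than raising on the first failure, means one pass over the whole dataset reports every problem at once, which seems useful for a script that may take a long time to run.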

Context

Hopefully our unit tests will catch these bugs. But, just to be super careful, it might be nice to also have a script which goes through the entire pre-prepared dataset and checks for these issues!

Metadata

Labels

  • data: New data source or feature; or modification of existing data source
  • enhancement: New feature or request
