This repository was archived by the owner on Sep 11, 2023. It is now read-only.
Dataset Validation script #138
Closed
Description
Detailed Description
A script which goes through all the pre-prepared batches and checks:
- That there's no overlap between train, test, and validation sets :)
- That every batch contains the fields we'd expect
- That every value is within the range we'd expect
- That the duration of each example is what we'd expect
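A minimal sketch of how the four checks above might be wired together. It is not the project's actual code. Everything named here is an assumption made for illustration: the in-memory batch layout (a dict with an `example_ids` list plus one list of time series per field), the `pv_yield` field and its range, and the expected length of 4 timesteps. A real script would load the pre-prepared batches from disk and use the real schema.

```python
from typing import Dict, List, Mapping, Sequence

# Hypothetical schema: field name -> (min, max) allowed value.
EXPECTED_RANGES = {"pv_yield": (0.0, 1.0)}
# Hypothetical duration of each example, in timesteps.
EXPECTED_LENGTH = 4

Batch = Mapping[str, Sequence]


def validate_batches(splits: Mapping[str, Sequence[Batch]]) -> List[str]:
    """Return a list of human-readable problems found in `splits`.

    `splits` maps a split name ("train", "test", "validation") to its batches.
    """
    errors: List[str] = []
    seen: Dict[object, str] = {}  # example id -> split it was first seen in
    required = {"example_ids"} | set(EXPECTED_RANGES)

    for split_name, batches in splits.items():
        for i, batch in enumerate(batches):
            where = f"{split_name}[{i}]"

            # Check 2: every batch contains the expected fields.
            missing = required - set(batch)
            if missing:
                errors.append(f"{where}: missing fields {sorted(missing)}")
                continue

            # Check 1: no example appears in more than one split.
            for example_id in batch["example_ids"]:
                first_split = seen.setdefault(example_id, split_name)
                if first_split != split_name:
                    errors.append(
                        f"{where}: example {example_id} also in {first_split}"
                    )

            for field, (lo, hi) in EXPECTED_RANGES.items():
                for n, series in enumerate(batch[field]):
                    # Check 4: each example has the expected duration.
                    if len(series) != EXPECTED_LENGTH:
                        errors.append(
                            f"{where}: {field}[{n}] has length {len(series)}"
                        )
                    # Check 3: every value is in range (NaN also fails this).
                    if any(not (lo <= v <= hi) for v in series):
                        errors.append(f"{where}: {field}[{n}] out of range")
    return errors
```

Collecting every problem into a list, rather than raising on the first one, means a single pass over the whole dataset reports all issues at once. A caller could print the list and exit with a non-zero status if it is non-empty.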
Context
Hopefully our unit tests will catch these bugs. But, just to be super careful, it might be nice to also have a script that goes through the entire pre-prepared dataset and checks for these issues!