Metadata API: test invalid input values for attributes #1460


Merged
jku merged 2 commits into theupdateframework:develop from invalid-attributes-tests
Jul 7, 2021

Conversation

MVrachev
Collaborator

@MVrachev MVrachev commented Jun 22, 2021

Fixes #1434

Description of the changes being introduced by the pull request:

A while ago we decided that it's best to research each of the individual
attributes one by one and identify what level of validation each needs
given how we use it:
#1366 (comment).

This work is ongoing and a couple of commits have already been merged
for it:
- theupdateframework@6c5d970
- theupdateframework@f20664d
- theupdateframework@41afb1e

We want to be able to test the validation we add for attributes against
known bad values. We do that with the table testing we added using
decorators for our metadata classes defined in the New API:
#1416.
This gives us an easy way to add new cases for each attribute without
depending on external files.
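
To make the pattern concrete, here is a minimal, self-contained sketch of this kind of decorator-driven table testing. The decorator and the toy `Signed` class are simplified stand-ins written for illustration, not the actual tuf implementation:

```python
import json
import unittest
from typing import Callable, Dict

# A dataset maps a human-readable case name to a JSON string.
DataSet = Dict[str, str]


def run_sub_tests_with_dataset(dataset: DataSet):
    """Rerun the decorated test once per dataset entry, using subTest
    so every failing case is reported individually."""
    def decorator(test_func: Callable) -> Callable:
        def wrapper(self):
            for case_name, case_data in dataset.items():
                with self.subTest(case=case_name):
                    test_func(self, case_data)
        return wrapper
    return decorator


class Signed:
    """Toy stand-in for a metadata class with a from_dict constructor."""
    @classmethod
    def from_dict(cls, signed_dict: dict) -> "Signed":
        spec_version = signed_dict["spec_version"]  # missing key -> KeyError
        if not isinstance(spec_version, str):
            raise TypeError("spec_version must be a string")
        if not spec_version:
            raise ValueError("spec_version must be non-empty")
        obj = cls()
        obj.spec_version = spec_version
        return obj


class TestSerialization(unittest.TestCase):
    # Each entry is one invalid-input case; adding a case is one line.
    invalid_signed: DataSet = {
        "no spec_version": '{"version": 1}',
        "empty str spec_version": '{"spec_version": ""}',
        "int spec_version": '{"spec_version": 1}',
    }

    @run_sub_tests_with_dataset(invalid_signed)
    def test_invalid_signed_serialization(self, test_case_data: str):
        case_dict = json.loads(test_case_data)
        with self.assertRaises((KeyError, ValueError, TypeError)):
            Signed.from_dict(case_dict)
```

Running this class under the normal unittest runner executes all three cases as sub-tests of one test method, so new cases can be added to the dataset without touching the test code or any external files.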

It's important to note that I haven't added invalid tests for all
attributes, because the metadata research isn't finished yet.

When new validation is added for an attribute, we should make sure to add
invalid tests in TestInvalidSerialization as well.

Please verify and check that the pull request fulfills the following
requirements:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

@MVrachev MVrachev changed the title Invalid attributes tests Metadata API: test invalid input values for attributes Jun 22, 2021
@MVrachev MVrachev force-pushed the invalid-attributes-tests branch from 74b6c0d to 3213d9c Compare June 23, 2021 13:59
@MVrachev MVrachev marked this pull request as ready for review June 23, 2021 14:01
@MVrachev
Collaborator Author

This PR is ready for review after the merge of #1416.

Member

@jku jku left a comment

This seems very fine-grained. There's no benefit to having a separate test for each attribute (when the tests are practically identical), but it makes the patch massive. I would expect that less granularity would mean at least a 50% reduction in commit size.

Also, currently it's hard to see what is being tested: I need to read deep into the test code to find the actual class being tested by an individual test.

I don't see why we would not do the exact same thing we already do for valid-data tests: one invalid-data dataset and one new test per class-under-test. I also think the invalid test+data for a class should be next to the valid test+data for the same class. If the tests need to be split for some reason, I expect it to be more useful to split by class-under-test (so e.g. TestKeySerialization) than by TestSerialization/TestInvalidSerialization: I often want to run tests for a specific class, but I have never wanted to run tests based on whether success or failure is expected.

@MVrachev
Collaborator Author

> This seems very fine-grained. There's no benefit to having a separate test for each attribute (when the tests are practically identical), but it makes the patch massive. I would expect that less granularity would mean at least a 50% reduction in commit size.
>
> Also, currently it's hard to see what is being tested: I need to read deep into the test code to find the actual class being tested by an individual test.
>
> I don't see why we would not do the exact same thing we already do for valid-data tests: one invalid-data dataset and one new test per class-under-test. I also think the invalid test+data for a class should be next to the valid test+data for the same class. If the tests need to be split for some reason, I expect it to be more useful to split by class-under-test (so e.g. TestKeySerialization) than by TestSerialization/TestInvalidSerialization: I often want to run tests for a specific class, but I have never wanted to run tests based on whether success or failure is expected.

I did it more granularly because we have multiple cases per attribute.
I will create a commit making the test cases per class, and then we can
discuss it again.

@MVrachev
Collaborator Author

MVrachev commented Jul 1, 2021

In a discussion, Jussi proposed that I group the invalid and valid tests close to each other in the same test class.
I added a commit doing that.

Member

@jku jku left a comment

I would still really like it if the variable and function names were consistent

invalid_key vs valid_keys
invalid_role vs valid_roles
invalid_metafile_attr vs valid_metafiles
invalid_targetfile_hashes_length vs valid_targetfiles

With some of those I really need to squint to figure out that they are supposed to be tests for the same class

Other comments are just nitpicks I think

class TestSerialization(unittest.TestCase):

invalid_signed: DataSet = {
Member

there are a lot of test cases here that are unlikely to be useful (e.g. 5 invalid spec versions is just overkill), but since the tests are so fast... I guess no harm done?

Collaborator Author

I removed one of the cases (one-digit spec_version); the others seem relevant to the checks we are making.

Member

ok, but do keep in mind that the idea isn't to test the lines of code that we happen to have in there right now: the idea is to have test data that covers as much of the possible inputs as practically useful -- since we can't cover 100% of the possibilities we inevitably end up making decisions on what's important: we test the inputs we think could be interesting.

@MVrachev MVrachev force-pushed the invalid-attributes-tests branch from b0d0780 to 4dcbbc0 Compare July 5, 2021 16:13
@MVrachev
Collaborator Author

MVrachev commented Jul 5, 2021

@jku I squashed my commits and addressed all of your comments (including
the inconsistencies), except for the ones about keyval, which are
addressed in another PR.

Member

@jku jku left a comment

LGTM, left a couple of comments that are up to you

@@ -159,6 +247,20 @@ def test_delegation_serialization(self, test_case_data: str):
self.assertDictEqual(case_dict, delegation.to_dict())


invalid_targetfiles: DataSet = {
"no hashes": '{"length": 1}',
"no length": '{"hashes": {"sha256": 1}}'
Member

that's an invalid hashes value type
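
To make the distinction concrete, here is a hedged sketch of where each of these dataset cases would fail during deserialization. `TargetFile` here is a hypothetical, simplified stand-in written for illustration, not the actual tuf class:

```python
class TargetFile:
    """Hypothetical stand-in for a targets metadata file entry."""

    def __init__(self, length: int, hashes: dict):
        self.length = length
        self.hashes = hashes

    @classmethod
    def from_dict(cls, target_dict: dict) -> "TargetFile":
        hashes = target_dict["hashes"]  # '{"length": 1}' fails here: KeyError
        length = target_dict["length"]  # '{"hashes": ...}' fails here: KeyError
        if not isinstance(length, int) or length < 0:
            raise ValueError("length must be a non-negative integer")
        if not all(isinstance(h, str) for h in hashes.values()):
            # {"sha256": 1} would fail here if length were present: the hash
            # *value* has the wrong type, hence "invalid hashes value type".
            raise ValueError("hash values must be strings")
        return cls(length, hashes)
```

So the second dataset entry actually exercises the missing-length path first; renaming it (or adding a separate case with both fields present) would test the invalid hash value type explicitly.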

Comment on lines 62 to 63
"empty str spec_version": '{"_type": "signed", "spec_version": "", "version": 1, "expires": "2030-01-01T00:00:00Z", \
"meta": {"f.txt": {"version": 1}}}',
Member

Snapshot meta isn't required to contain anything: "meta": {} is enough if you're going for the minimal serialised form

nitpick on the style: I think this would be easier to read if you just barely need two lines (also allows names to be as long as you want):

        "empty str spec_version":
            '{"_type": "signed", "spec_version": "", "version": 1, "expires": "2030-01-01T00:00:00Z", "meta": {}}',

Collaborator Author

Okay, I will do that, but I will add a comment that, per the spec, having a snapshot object like this in a repository is considered invalid.

class TestSerialization(unittest.TestCase):

invalid_signed: DataSet = {
Member

ok, but do keep in mind that the idea isn't to test the lines of code that we happen to have in there right now: the idea is to have test data that covers as much of the possible inputs as practically useful -- since we can't cover 100% of the possibilities we inevitably end up making decisions on what's important: we test the inputs we think could be interesting.

@jku
Member

jku commented Jul 6, 2021

(that last comment is a reply to earlier discussion on spec_version test cases, for some reason github shows it independently here)

@MVrachev MVrachev force-pushed the invalid-attributes-tests branch from 4dcbbc0 to 42d8264 Compare July 6, 2021 14:30
@MVrachev
Collaborator Author

MVrachev commented Jul 6, 2021

@jku I addressed your comments.

MVrachev added 2 commits July 7, 2021 13:59
In PR theupdateframework#1449 we introduced
Key validation and decided to raise ValueError if one of
keyid, scheme, keyval or keytype is not a string.
That seems like a mistake, given that the Python docs state:
"Passing arguments of the wrong type
(e.g. passing a list when an int is expected) should result
in a TypeError, but passing arguments with the wrong value
(e.g. a number outside expected boundaries) should result
in a ValueError."

Signed-off-by: Martin Vrachev <[email protected]>
A while ago we decided that it's best to research each of the individual
attributes one by one and identify what level of validation each needs
given how we use it:
theupdateframework#1366 (comment).

This work is ongoing and there are a couple of commits already merged
for this:
- theupdateframework@6c5d970
- theupdateframework@f20664d
- theupdateframework@41afb1e

We want to be able to test attribute validation against known bad
values.
We do that with the table testing we added using decorators for our
metadata classes defined in the New API:
theupdateframework#1416.
This gives us an easy way to add new cases for each attribute without
depending on external files.

Signed-off-by: Martin Vrachev <[email protected]>
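
The TypeError-vs-ValueError convention quoted in the first commit message above can be sketched as follows. This is a hypothetical, simplified Key constructor written for illustration, not the actual tuf code:

```python
class Key:
    """Hypothetical simplified key: an argument of the wrong *type* raises
    TypeError, while an argument with an unacceptable *value* raises
    ValueError, following the convention in the Python docs."""

    def __init__(self, keyid: str, keytype: str, scheme: str, keyval: dict):
        for name, value in [
            ("keyid", keyid), ("keytype", keytype), ("scheme", scheme)
        ]:
            if not isinstance(value, str):
                # Wrong argument type: TypeError, not ValueError.
                raise TypeError(
                    f"{name} must be a string, got {type(value).__name__}"
                )
        if not isinstance(keyval, dict):
            raise TypeError("keyval must be a dict")
        if not keyid:
            # Right type, unacceptable value: ValueError.
            raise ValueError("keyid must be non-empty")
        self.keyid = keyid
        self.keytype = keytype
        self.scheme = scheme
        self.keyval = keyval
```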
@MVrachev MVrachev force-pushed the invalid-attributes-tests branch from 42d8264 to 9c8aa1d Compare July 7, 2021 11:00
@MVrachev
Collaborator Author

MVrachev commented Jul 7, 2021

Rebased on top of the latest changes in the develop branch and fixed conflicts.

@jku jku merged commit 24fa112 into theupdateframework:develop Jul 7, 2021
@MVrachev MVrachev deleted the invalid-attributes-tests branch July 19, 2021 13:36
Successfully merging this pull request may close these issues.

Metadata API: Tests (de)serialization with invalid arguments