Metadata API: test invalid input values for attributes #1460


Merged
jku merged 2 commits into theupdateframework:develop from invalid-attributes-tests
Jul 7, 2021

Conversation

MVrachev
Collaborator

@MVrachev MVrachev commented Jun 22, 2021

Fixes #1434

Description of the changes being introduced by the pull request:

A while ago we decided that it's best to research each of the individual
attributes one by one and identify what level of validation each needs
given how we use it:
#1366 (comment).

This work is ongoing and a couple of commits have already been merged
for it:
- theupdateframework@6c5d970
- theupdateframework@f20664d
- theupdateframework@41afb1e

We want to be able to test the validation we add for attributes against
known bad values. We do that with the table testing we added using
decorators for our metadata classes defined in the New API:
#1416.
This gives us an easy way to add new cases for each attribute without
depending on external files.
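
To make the pattern concrete, here is a minimal, self-contained sketch of this kind of decorator-driven table testing. The decorator and the toy `Signed` class are simplified stand-ins written for illustration, not the actual tuf implementation:

```python
import json
import unittest
from typing import Callable, Dict

# A dataset maps a human-readable case name to a JSON string.
DataSet = Dict[str, str]


def run_sub_tests_with_dataset(dataset: DataSet):
    """Rerun the decorated test once per dataset entry, using subTest
    so every failing case is reported individually."""
    def decorator(test_func: Callable) -> Callable:
        def wrapper(self):
            for case_name, case_data in dataset.items():
                with self.subTest(case=case_name):
                    test_func(self, case_data)
        return wrapper
    return decorator


class Signed:
    """Toy stand-in for a metadata class with a from_dict constructor."""
    @classmethod
    def from_dict(cls, signed_dict: dict) -> "Signed":
        spec_version = signed_dict["spec_version"]  # missing key -> KeyError
        if not isinstance(spec_version, str):
            raise TypeError("spec_version must be a string")
        if not spec_version:
            raise ValueError("spec_version must be non-empty")
        obj = cls()
        obj.spec_version = spec_version
        return obj


class TestSerialization(unittest.TestCase):
    # Each entry is one invalid-input case; adding a case is one line.
    invalid_signed: DataSet = {
        "no spec_version": '{"version": 1}',
        "empty str spec_version": '{"spec_version": ""}',
        "int spec_version": '{"spec_version": 1}',
    }

    @run_sub_tests_with_dataset(invalid_signed)
    def test_invalid_signed_serialization(self, test_case_data: str):
        case_dict = json.loads(test_case_data)
        with self.assertRaises((KeyError, ValueError, TypeError)):
            Signed.from_dict(case_dict)
```

Running this class under the normal unittest runner executes all three cases as sub-tests of one test method, so new cases can be added to the dataset without touching the test code or any external files.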

It's important to note that I haven't added invalid tests for all
attributes, because the metadata research isn't finished yet.

When new validation is added for an attribute, we should make sure to add
invalid tests in TestInvalidSerialization as well.

Please verify and check that the pull request fulfills the following
requirements:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

@MVrachev MVrachev changed the title Invalid attributes tests Metadata API: test invalid input values for attributes Jun 22, 2021
@MVrachev MVrachev force-pushed the invalid-attributes-tests branch from 74b6c0d to 3213d9c Compare June 23, 2021 13:59
@MVrachev MVrachev marked this pull request as ready for review June 23, 2021 14:01
@MVrachev
Collaborator Author

This PR is ready for review after the merge of #1416.

Member

@jku jku left a comment

This seems very fine-grained. There's no benefit to having a separate test for each attribute (when the tests are practically identical), but it makes the patch massive. I would expect that less granularity would mean at least a 50% reduction in commit size.

Also, currently it's hard to see what is being tested: I need to read deep into the test code to find the actual class being tested by an individual test.

I don't see why we would not do the exact same thing we already do for valid-data tests: one invalid-data dataset and one new test per class-under-test. I also think the invalid test+data for a class should be next to the valid test+data for the same class. If the tests need to be split for some reason, I expect it to be more useful to split by class-under-test (so e.g. TestKeySerialization) than by TestSerialization/TestInvalidSerialization: I often want to run tests for a specific class, but I have never wanted to run tests based on whether success or failure is expected.

@MVrachev
Collaborator Author

> This seems very fine-grained. There's no benefit to having a separate test for each attribute (when the tests are practically identical), but it makes the patch massive. I would expect that less granularity would mean at least a 50% reduction in commit size.
>
> Also, currently it's hard to see what is being tested: I need to read deep into the test code to find the actual class being tested by an individual test.
>
> I don't see why we would not do the exact same thing we already do for valid-data tests: one invalid-data dataset and one new test per class-under-test. I also think the invalid test+data for a class should be next to the valid test+data for the same class. If the tests need to be split for some reason, I expect it to be more useful to split by class-under-test (so e.g. TestKeySerialization) than by TestSerialization/TestInvalidSerialization: I often want to run tests for a specific class, but I have never wanted to run tests based on whether success or failure is expected.

I did it more granularly because we have multiple cases per attribute.
I will create a commit making the test cases per class, and then we can
discuss it again.

@MVrachev
Collaborator Author

MVrachev commented Jul 1, 2021

In a discussion, Jussi proposed that I group the invalid and valid tests close to each other in the same test class.
I added a commit doing that.

Member

@jku jku left a comment

I would still really like it if the variable and function names were consistent

invalid_key vs valid_keys
invalid_role vs valid_roles
invalid_metafile_attr vs valid_metafiles
invalid_targetfile_hashes_length vs valid_targetfiles

With some of those I really need to squint to figure out that they are supposed to be tests for the same class

Other comments are just nitpicks I think

class TestSerialization(unittest.TestCase):

invalid_signed: DataSet = {
Member

there are a lot of test cases here that are unlikely to be useful (e.g. 5 invalid spec versions is just overkill), but since the tests are so fast... I guess no harm done?

Collaborator Author

I removed one of the cases (one-digit spec_version); the others seem relevant to the checks we are making.

Member

ok, but do keep in mind that the idea isn't to test the lines of code that we happen to have in there right now: the idea is to have test data that covers as much of the possible inputs as practically useful -- since we can't cover 100% of the possibilities we inevitably end up making decisions on what's important: we test the inputs we think could be interesting.

@MVrachev MVrachev force-pushed the invalid-attributes-tests branch from b0d0780 to 4dcbbc0 Compare July 5, 2021 16:13
@MVrachev
Collaborator Author

MVrachev commented Jul 5, 2021

@jku I squashed my commits and addressed all of your comments (including
the inconsistencies), except for the ones about keyval, which are
addressed in another PR.

Member

@jku jku left a comment

LGTM, left a couple of comments that are up to you

@@ -159,6 +247,20 @@ def test_delegation_serialization(self, test_case_data: str):
self.assertDictEqual(case_dict, delegation.to_dict())


invalid_targetfiles: DataSet = {
"no hashes": '{"length": 1}',
"no length": '{"hashes": {"sha256": 1}}'
Member

that's an invalid hashes value type
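
To make the distinction concrete, here is a hedged sketch of where each of these dataset cases would fail during deserialization. `TargetFile` here is a hypothetical, simplified stand-in written for illustration, not the actual tuf class:

```python
class TargetFile:
    """Hypothetical stand-in for a targets metadata file entry."""

    def __init__(self, length: int, hashes: dict):
        self.length = length
        self.hashes = hashes

    @classmethod
    def from_dict(cls, target_dict: dict) -> "TargetFile":
        hashes = target_dict["hashes"]  # '{"length": 1}' fails here: KeyError
        length = target_dict["length"]  # '{"hashes": ...}' fails here: KeyError
        if not isinstance(length, int) or length < 0:
            raise ValueError("length must be a non-negative integer")
        if not all(isinstance(h, str) for h in hashes.values()):
            # {"sha256": 1} would fail here if length were present: the hash
            # *value* has the wrong type, hence "invalid hashes value type".
            raise ValueError("hash values must be strings")
        return cls(length, hashes)
```

So the second dataset entry actually exercises the missing-length path first; renaming it (or adding a separate case with both fields present) would test the invalid hash value type explicitly.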

Comment on lines 62 to 63
"empty str spec_version": '{"_type": "signed", "spec_version": "", "version": 1, "expires": "2030-01-01T00:00:00Z", \
"meta": {"f.txt": {"version": 1}}}',
Member

Snapshot meta isn't required to contain anything: "meta": {} is enough if you're going for the minimal serialised form

nitpick on the style: I think this would be easier to read if you just barely need two lines (also allows names to be as long as you want):

        "empty str spec_version":
            '{"_type": "signed", "spec_version": "", "version": 1, "expires": "2030-01-01T00:00:00Z", "meta": {}}',

Collaborator Author

Okay, I will do that, but I will add a comment that, per the spec, having a snapshot object like this in a repository is considered invalid.

class TestSerialization(unittest.TestCase):

invalid_signed: DataSet = {
Member

ok, but do keep in mind that the idea isn't to test the lines of code that we happen to have in there right now: the idea is to have test data that covers as much of the possible inputs as practically useful -- since we can't cover 100% of the possibilities we inevitably end up making decisions on what's important: we test the inputs we think could be interesting.

@jku
Member

jku commented Jul 6, 2021

(that last comment is a reply to earlier discussion on spec_version test cases, for some reason github shows it independently here)

@MVrachev MVrachev force-pushed the invalid-attributes-tests branch from 4dcbbc0 to 42d8264 Compare July 6, 2021 14:30
@MVrachev
Collaborator Author

MVrachev commented Jul 6, 2021

@jku I addressed your comments.

MVrachev added 2 commits July 7, 2021 13:59
In PR theupdateframework#1449 we introduced
Key validation and decided to raise ValueError if one of
keyid, scheme, keyval or keytype is not a string.
That seems like a mistake, given that the Python docs state:
"Passing arguments of the wrong type
(e.g. passing a list when an int is expected) should result
in a TypeError, but passing arguments with the wrong value
(e.g. a number outside expected boundaries) should result
in a ValueError."

Signed-off-by: Martin Vrachev <[email protected]>
A while ago we decided that it's best to research each of the individual
attributes one by one and identify what level of validation each needs
given how we use it:
theupdateframework#1366 (comment).

This work is ongoing and there are a couple of commits already merged
for this:
- theupdateframework@6c5d970
- theupdateframework@f20664d
- theupdateframework@41afb1e

We want to be able to test attribute validation against known bad
values.
We do that with the table testing we added using decorators for our
metadata classes defined in the New API:
theupdateframework#1416.
This gives us an easy way to add new cases for each attribute without
depending on external files.

Signed-off-by: Martin Vrachev <[email protected]>
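
The TypeError-vs-ValueError convention quoted in the first commit message above can be sketched as follows. This is a hypothetical, simplified Key constructor written for illustration, not the actual tuf code:

```python
class Key:
    """Hypothetical simplified key: an argument of the wrong *type* raises
    TypeError, while an argument with an unacceptable *value* raises
    ValueError, following the convention in the Python docs."""

    def __init__(self, keyid: str, keytype: str, scheme: str, keyval: dict):
        for name, value in [
            ("keyid", keyid), ("keytype", keytype), ("scheme", scheme)
        ]:
            if not isinstance(value, str):
                # Wrong argument type: TypeError, not ValueError.
                raise TypeError(
                    f"{name} must be a string, got {type(value).__name__}"
                )
        if not isinstance(keyval, dict):
            raise TypeError("keyval must be a dict")
        if not keyid:
            # Right type, unacceptable value: ValueError.
            raise ValueError("keyid must be non-empty")
        self.keyid = keyid
        self.keytype = keytype
        self.scheme = scheme
        self.keyval = keyval
```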
@MVrachev MVrachev force-pushed the invalid-attributes-tests branch from 42d8264 to 9c8aa1d Compare July 7, 2021 11:00
@MVrachev
Collaborator Author

MVrachev commented Jul 7, 2021

Rebased on top of the latest changes in the develop branch and fixed conflicts.

@jku jku merged commit 24fa112 into theupdateframework:develop Jul 7, 2021
@MVrachev MVrachev deleted the invalid-attributes-tests branch July 19, 2021 13:36
Successfully merging this pull request may close these issues.

Metadata API: Tests (de)serialization with invalid arguments