Skip to content

New metadata API: add MetaFile and TargetFile classes #1329

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
May 19, 2021

Conversation

MVrachev
Copy link
Collaborator

@MVrachev MVrachev commented Mar 31, 2021

This pr is a continuation from pr #1223

This is the final pr planned for #1139,

Description of the changes being introduced by the pull request:

In the top-level metadata classes, there are complex attributes such as
"meta" in Targets and Snapshot, "key" and "roles" in Root etc.
We want to represent those complex attributes with a class to allow
easier verification and support for metadata with unrecognized fields.
For more context read ADR 0004 and ADR 0008 in the docs/adr folder.

Please verify and check that the pull request fulfills the following
requirements
:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

@MVrachev MVrachev changed the title New api classes New metadata API: add MetadataInfo and TargetFile classes Mar 31, 2021
Copy link
Contributor

@sechkova sechkova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the PR description you say that it will:

Rename a few instances of the update method in our classes and use a more verbose name.

but there is only one commit with marked TODOs.

My general comment is about removing hardcoded file extensions and mentions of "json" throughout the code. After the abstract (de)serialization was implemented, TUF supposedly supports other file formats too.

@MVrachev MVrachev force-pushed the new-api-classes branch 2 times, most recently from def194a to 5380720 Compare April 6, 2021 10:34
@MVrachev
Copy link
Collaborator Author

MVrachev commented Apr 6, 2021

I rebased and fixed your two small comments @sechkova about other issues besides the hard-coded file names.

I fixed the problem of hardcoded json strings partially in the places I modified already.
To fully fix this problem we either have to use a more complicate solution or wait and consider other alternatives
which will give us the ability to set metadata extensions as a configuration option.

Created an issue about the hardcoded filename problems: #1333

Copy link
Member

@jku jku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for not looking at this before. I've gone through most, but maybe not quite processed all of it.

Martin knows of this but for others information: the changes to .pop() usage are going to clash with another PR #1345: they need to be coordinated. In general I think this PR could have been several smaller instead -- I get that it means annoying depending PRs... but this way means I feel bad for leaving so many comments 😁

Would be great to see a bit more prose about the design and usability of the specific API classes somewhere: If we go through the trouble of creating objects for everything, we should try to make them more usable than the dictionaries they replace. As examples, see question in code about snapshot_info attribute and ideas like 'should Targets maybe implement abc.Mapping so caller could access targets.signed["target_name"] instead of targets.signed.targets["target_name"]' -- that last one may not be a good idea but it would be good to see some ideas and discussion before we start maintaining more code.

This feels like a lot to ask on a big PR but would be nice to see how the code now copes without broken input in tests (missing fields, extra fields, etc) -- to confirm the benefits that we should be getting from this work.

@jku
Copy link
Member

jku commented Apr 21, 2021

Apologies for not looking at this before. I've gone through most, but maybe not quite processed all of it.

Also grumpy yesterday me sounds like he is unsatisfied with this PR but that's not true so I'll say it now:
PR looks correct to me (no bugs that I can find), good work, thank you

@jku
Copy link
Member

jku commented Apr 21, 2021

Would be great to see a bit more prose about the design and usability of the specific API classes somewhere: If we go through the trouble of creating objects for everything, we should try to make them more usable than the dictionaries they replace

After further discussion, this opinion seems to go against previous decisions to keep the API as close to file format as possible -- so possibly we need to discuss further or implement this as "faithful reproduction of spec" (disregarding my comments) and think about potential usability improvements afterwards...

@MVrachev
Copy link
Collaborator Author

There were a lot of discussions after my last update on this pr.
I would prefer if we wait for #1345 and #1360 to be merged first.
Then, I will update/split this pr.

@MVrachev
Copy link
Collaborator Author

MVrachev commented May 12, 2021

I have rebased and made changes in this pr after "develop" has evolved a lot.

Some of the changes I made:

  • rename MetadataInfo to MetaFile and TargetInfo to TargetFile. Those names seem better in my opinion because they communicate better what the classes represent.
  • added support for unrecognized fields in MetaFile and TargetFile. I didn't add a test, because tests: API: Test unrecognized fields everywhere #1382 will add an automatic test for that for dictionaries and classes.
  • dropped a couple of the planned commits after Make hashes, length and delegations optional + improvements #1367 was merged.
  • dropped a couple more commits, because I realized they were not needed - removing the side effect of destroying the dictionaries when calling from_dict() from the classes (this is needed for ADR 8) and adding a todo comment about the update() functions naming (there is an issue for that New API: Revise "update" methods in tuf/api/metadata.py #1230)
  • added from_dict() for MetaFile and TargetFile
  • added MetaFile and TargetFile specific tests
  • disabled the C0302: Too many lines in module warning for tuf/api/metadata.py.

@jku and @sechkova pr is again ready for a fresh review.

@MVrachev MVrachev requested a review from jku May 12, 2021 13:44
@MVrachev MVrachev marked this pull request as ready for review May 12, 2021 13:44
@MVrachev MVrachev changed the title New metadata API: add MetadataInfo and TargetFile classes New metadata API: add MetadataFile and TargetFile classes May 12, 2021
@MVrachev MVrachev force-pushed the new-api-classes branch 2 times, most recently from 1a4f05d to d13522b Compare May 12, 2021 13:55
@MVrachev MVrachev changed the title New metadata API: add MetadataFile and TargetFile classes New metadata API: add MetaFile and TargetFile classes May 12, 2021
@jku
Copy link
Member

jku commented May 13, 2021

A question on the __eq__ implementations: do we need them? None of the other classes seem to have them -- I assumed you used them in the tests but that does not seem the case (or maybe I just missed it)

MVrachev added a commit to MVrachev/tuf that referenced this pull request May 13, 2021
In the top-level metadata classes, there are complex attributes such as
"meta" in Targets and Snapshot, "key" and "roles" in Root etc.
We want to represent those complex attributes with a class to allow
easier verification and support for metadata with unrecognized fields.
For more context read ADR 0004 and ADR 0008 in the docs/adr folder.

In TargetFile, after a suggestion made by Jussi here:
theupdateframework#1329 (comment)
I realized that "custom" has the same purpose as "unrecognized_fields"
and can replace it.
I removed the "unrecognized_field" attribute and added a property for
unrecognized_fields which actually access "custom".

Signed-off-by: Martin Vrachev <[email protected]>
@MVrachev
Copy link
Collaborator Author

Updated the pr addressing almost all of the comments @jku made with the only exceptions when commenting on the update functions.
There I feel the changes Jussi proposed are out of scope for this pr as I described here #1329 (comment).

Additionally, I added a new commit where I remove the __eq__ methods.

MVrachev added a commit to MVrachev/tuf that referenced this pull request May 17, 2021
In the top-level metadata classes, there are complex attributes such as
"meta" in Targets and Snapshot, "key" and "roles" in Root etc.
We want to represent those complex attributes with a class to allow
easier verification and support for metadata with unrecognized fields.
For more context read ADR 0004 and ADR 0008 in the docs/adr folder.

In TargetFile, after a suggestion made by Jussi here:
theupdateframework#1329 (comment)
I realized that "custom" has the same purpose as "unrecognized_fields"
and can replace it.
I removed the "unrecognized_field" attribute and added a property for
unrecognized_fields which actually access "custom".

Signed-off-by: Martin Vrachev <[email protected]>
@MVrachev MVrachev force-pushed the new-api-classes branch 2 times, most recently from a289347 to bea222c Compare May 17, 2021 10:35
Copy link
Member

@jku jku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything else is a style comment (feel free to disregard) but I think there's a bug in the "custom" handling

@MVrachev MVrachev force-pushed the new-api-classes branch 2 times, most recently from 3a6403b to 9fe0808 Compare May 17, 2021 17:33
@MVrachev
Copy link
Collaborator Author

Addressed all of @jku commets, and additionally I wanted to make sure that each of the commits passes the CI and makes sense.
That's why I also made the following changes:

  1. Merge changes in "Replace usage of dict.copy with copy.deepcopy()" in "Add MetaFile" and "Add TargetFile". Then drop "Replace usage of dict.copy with copy.deepcopy()"
  2. Simplified how we handle "custom" in TargetFile.to_dict()
  3. Squash "New API: represent custom with unrecognized_fields" into "Add TargetFile"
  4. Added a new commit removing two tests testing "update with only version" because they are obsolete.

Copy link
Member

@jku jku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my comment in-source won't show up since it's in a resolved discussion but:

The __eq__() implementation is still not complete: if it is there, it should compare unrecognized_fields instead of custom.

Altogether this looks good, thanks. The docstrings are very verbose but let's trim those later... I'm marking this approved already, but please fix __eq__() still in this PR.

@MVrachev
Copy link
Collaborator Author

I think my comment in-source won't show up since it's in a resolved discussion but:

The __eq__() implementation is still not complete: if it is there, it should compare unrecognized_fields instead of custom.

Altogether this looks good, thanks. The docstrings are very verbose but let's trim those later... I'm marking this approved already, but please fix __eq__() still in this PR.

I thought I had removed the __eq__ methods.
It was a mistake that they are left there.
Removed them and rebased.

@jku
Copy link
Member

jku commented May 19, 2021

I was about to merge but now it has conflicts with a bug fix I merged yesterday. Sorry. Could you rebase?

MVrachev added 6 commits May 19, 2021 13:04
In the top-level metadata classes, there are complex attributes such as
"meta" in Targets and Snapshot, "key" and "roles" in Root etc.
We want to represent those complex attributes with a class to allow
easier verification and support for metadata with unrecognized fields.
For more context read ADR 0004 and ADR 0008 in the docs/adr folder.

Additionally, after adding the MetaFile class, when we create an object
we are now calling from dict twice - one for the main class (Timestamp,
Snapshot) and one for the pacticular complex attribute -
MetaFile.from_dict(). Given that the "from_dict" methods have the
side effect of destroying the given dictionary, we would need to
start using deepcopy() for our tests.

Signed-off-by: Martin Vrachev <[email protected]>
In the top-level metadata classes, there are complex attributes such as
"meta" in Targets and Snapshot, "key" and "roles" in Root etc.
We want to represent those complex attributes with a class to allow
easier verification and support for metadata with unrecognized fields.
For more context read ADR 0004 and ADR 0008 in the docs/adr folder.

As written in the spec "targets" in "targets.json" has defined the
"custom" field serving the same purpose as "unrecognized_fields" in the
implementation.
That's why to conform against the spec and support "custom" and allow
"unrecognized_fields" everywhere where it's not sensitive we can define
custom as property which actually access data stored in
unrecognized_fields.
For context read ADR 8 in tuf/docs/adr.

Additionally, after adding the TargetFile class, when we create a
Targets an object we are now calling from dict twice - one for the main
Targets class and one for each of the complex attributes
TargetFile.from_dict() and Delegations.from_dict().
Given that the "from_dict" methods have the side effect of destroying
the given dictionary, we would need to start using deepcopy()
for our tests.

Signed-off-by: Martin Vrachev <[email protected]>
Disable the "C0302: Too many lines in module" warning which warns for modules
with more 1000 lines, because all of the code here is logically connected
and currently, we are above 1000 lines by a small margin.

Signed-off-by: Martin Vrachev <[email protected]>
Currently, when we call Targets/Snapshot/Timestamp.update() we are
passing all of the necessary values to create MetaFile/Targets File
respectively.
This is not needed, given that one of the reasons we have created
MetaFile and TargetFile is to make the API easier to use.

Signed-off-by: Martin Vrachev <[email protected]>
We have tests which make sure we can use `Timestamp.update()` and
`Snapshot.update()` with MetaFile instance storing only version
(because length and hashes are optional).
Those tests were created to make sure that we are actually supporting
optional hashes and length when we call `update` for those classes, but
after we changed the `update()` signature to accept `MetaFile` instance
the tests are obsolete.
The reason is that length and hashes can be optional because of the
MetaFile implementation, no the update function itself and we have
other tests validating creating a MetaFie instance without hashes and
length.

Signed-off-by: Martin Vrachev <[email protected]>
@MVrachev
Copy link
Collaborator Author

Done.

@jku jku merged commit f935ea3 into theupdateframework:develop May 19, 2021
@MVrachev MVrachev deleted the new-api-classes branch May 19, 2021 12:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants