-
Notifications
You must be signed in to change notification settings - Fork 278
Make length and hashes optional for timestamp and snapshot roles #1031
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make length and hashes optional for timestamp and snapshot roles #1031
Conversation
e6b657d
to
03ea4d1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, thanks, Martin!
03ea4d1
to
6f72866
Compare
Thank you, @trishankatdatadog for your comprehensive review! I decided to rebase the particular commits, address your comments, and make them ready for merge.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, thanks Martin!
I have one major reservation: the FILEINFO_SCHEMA
between timestamp
/snapshot
and targets
metadata should be distinguished...
I will fix my merge conflicts when the tests on the develop branch are fixed and pass. |
I just found out that the tests on the development branch are actually passing, but I had problems when I run Sorry, it was my problem. |
1ae739d
to
e92ef92
Compare
I fixed the merge conflicts. This commit is big and contains the following changes:
The code is ready for another review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your continued diligent work on this Martin. There are two important changes we should make (see below) and some minor stylistic changes that would be good to see (comments in-line).
Because FILEINFO_SCHEMA
was versatile in its use for both target files and metadata files, it included the optional fields used by both the METAFILES and TARGETS objects defined in the specification.
As we are wisely separating duties here, we can remove the fields from our new objects that don't apply. Specifically:
TARGETS_FILEINFO_SCHEMA
doesn't need a version fieldMETADATA_FILEINFO_SCHEMA
doesn't need a custom field
Thanks for all your hard work, guys! Please let me know when it's ready to review, and I'll take another look |
e92ef92
to
186be76
Compare
I think the pr should be in good shape now and ready for another set of reviews. |
I'm happy if @joshuagl is happy |
186be76
to
8dc5168
Compare
I have only edited one comment on which Joshua commented, but I had forgotten to change it. I don't have a clue why the Travis CI build for python 2.7 fails when locally it doesn't... |
Could be some timeout. I've seen this every now and then in the TUF tests. Just restarted the job. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your continued diligence on this PR @MVrachev. I have a few minor nits (indentation related and some clarifications) and suggest a couple of clean-ups that would be great additions to this PR.
tuf/repository_lib.py
Outdated
def generate_snapshot_metadata(metadata_directory, version, expiration_date, | ||
targets_filename, storage_backend, consistent_snapshot=False, | ||
repository_name='default'): | ||
repository_name='default', use_length=True, use_hashes=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
96142d6
to
c9f220f
Compare
I addressed Joshua's comments and add a two commits The pr is ready for yet another set reviews. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for sticking at this Martin. This is looking really solid. I have only one non-trivial concern, which is the change to generate_timestamp_metadata
– I really don't think the snapshot_file_path
argument is necessary. All we really want is the filename, which you get in your changes here by calling os.path.split()
, and that is defined by the constant SNAPSHOT_FILENAME
– let's just keep using the constant as the code does today and drop the argument to generate_timestamp_metadata()
.
tuf/repository_lib.py
Outdated
_, snapshot_filename = os.path.split(snapshot_file_path) | ||
# Retrieve the versioninfo of the Snapshot metadata file. | ||
snapshot_version = get_metadata_versioninfo('snapshot', repository_name) | ||
snapshot_fileinfo[SNAPSHOT_FILENAME] = \ | ||
tuf.formats.make_fileinfo(length, hashes, version=snapshot_version['version']) | ||
snapshot_fileinfo[snapshot_filename] = \ | ||
tuf.formats.make_metadata_fileinfo(snapshot_version['version'], | ||
length, hashes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe this necessary? The function worked before by using the constant SNAPSHOT_FILENAME
why would do we need to change away from that? I don't think there's any need for the snapshot_file_path
argument. AFAICS the one place that wasn't passing in 'snapshot.json' was passing 'path/to/snapshot.json' which your change here will strip down to 'snapshot.json'. Am I misunderstanding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
snapshot_file_path
is needed to calculate the hashes and length of the snapshot file by using get_file_details
from securesyslib: https://github.com/secure-systems-lab/securesystemslib/blob/9b3a78e998412c39418efb29f411591b20cdd236/securesystemslib/util.py#L45
which has the argument filepath
- the absolute path to the file and in our case to the snapshot file.
Here we can't remove the snapshot_file_path as we did with the target_filename in the generate_snapshot_metadata
because in generate_snapshot_metadata
we have the extra metadata_directory
argument from which we can use to create the absolute path needed for get_file_details
.
dbd8242
to
e93dda8
Compare
I had to rebase the second time to remove a commit not made by me which introduced into the changes. |
be3c0af
to
52f2fc6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your perseverance with these changes @MVrachev.
I realised that in generate_snapshot_metadata()
we are computing lengths and hashes for all targets metadata files, regardless of whether we are using lengths and hashes or not. This doesn't seem like a major issue until you consider that the PEP 458 implementation will have over 16k delegations (and therefore over 16k targets metadata files to compute the hashes for).
I anticipate that Warehouse will want to use length and hashes for targets metadata, but we should make sure that if they opt out we aren't doing a lot of unnecessary work. Happy to see that as a patch here or we can file it as an Issue to be implemented in a separate PR before the next release. What do you think?
tests/test_formats.py
Outdated
{'length': 1024, | ||
'hashes': {'sha256': 'A4582BCF323BCEF'}, | ||
'version': 1}), | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: indentation is not right here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure what you mean by "not right".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dictionary was originally indented to align with the opening delimiter on the line above (PEP 8 style). With the new longer variable name the indentation no longer matches the existing implicit coding style, nor does it match the explicit coding style of the project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, now I get it better.
I was confused that you left the comment in a random location where I placed a new line.
52f2fc6
to
d275e3c
Compare
I will add your suggestions, but if you don't mind without the abbreviations of |
Signed-off-by: Martin Vrachev <[email protected]>
f02bd18
to
80818e9
Compare
I updated the last commit with your suggestion on all places where we used flags to control the computation snapshot length and hashes. |
One minor comment: please don't force-push unless absolutely necessary, makes it hard to see new changes... |
Thanks for all of your work on this @MVrachev and the thorough review @trishankatdatadog ! |
As per the specification (v1.0.1) length and hashes fields in timestamp metadata are optional. We have implement this in the older API (see theupdateframework#1031) and we should implement it in the new API. Signed-off-by: Martin Vrachev <[email protected]>
As per the specification (v1.0.1) length and hashes fields in timestamp metadata are optional. We have implement this in the older API (see theupdateframework#1031) and we should implement it in the new API. Signed-off-by: Martin Vrachev <[email protected]>
As per the specification (v1.0.1) length and hashes fields in timestamp metadata are optional. We have implement this in the older API (see theupdateframework#1031) and we should implement it in the new API. Signed-off-by: Martin Vrachev <[email protected]>
As per the specification (v1.0.1) length and hashes fields in timestamp metadata are optional. We have implement this in the older API (see theupdateframework#1031) and we should implement it in the new API. Signed-off-by: Martin Vrachev <[email protected]>
As per the specification (v1.0.1) length and hashes fields in timestamp metadata are optional. We have implement this in the older API (see theupdateframework#1031) and we should implement it in the new API. Signed-off-by: Martin Vrachev <[email protected]>
As per the specification (v1.0.1) length and hashes fields in timestamp metadata are optional. We have implement this in the older API (see theupdateframework#1031) and we should implement it in the new API. Signed-off-by: Martin Vrachev <[email protected]>
As per the specification (v1.0.1) length and hashes fields in timestamp metadata are optional. We have implement this in the older API (see theupdateframework#1031) and we should implement it in the new API. Signed-off-by: Martin Vrachev <[email protected]>
As per the specification (v1.0.1) length and hashes fields in timestamp metadata are optional. We have implement this in the older API (see theupdateframework#1031) and we should implement it in the new API. Signed-off-by: Martin Vrachev <[email protected]>
As per the specification (v1.0.1) length and hashes fields in timestamp metadata are optional. We have implement this in the older API (see theupdateframework#1031) and we should implement it in the new API. Signed-off-by: Martin Vrachev <[email protected]>
As per the specification (v1.0.1) length and hashes fields in timestamp metadata are optional. We have implement this in the older API (see theupdateframework#1031) and we should implement it in the new API. Signed-off-by: Martin Vrachev <[email protected]>
As per the specification (v1.0.1) length and hashes fields in timestamp and snapshot metadata are optional. We have implement this in the older API (see theupdateframework#1031) and we should implement it in the new API. Signed-off-by: Martin Vrachev <[email protected]>
As per the specification (v1.0.1) length and hashes fields in timestamp and snapshot metadata are optional. We have implement this in the older API (see theupdateframework#1031) and we should implement it in the new API. Signed-off-by: Martin Vrachev <[email protected]>
As per the specification (v1.0.1) length and hashes fields in timestamp and snapshot metadata are optional. We have implement this in the older API (see theupdateframework#1031) and we should implement it in the new API. Signed-off-by: Martin Vrachev <[email protected]>
As per the specification (v1.0.1) length and hashes fields in timestamp and snapshot metadata are optional. We have implement this in the older API (see theupdateframework#1031) and we should implement it in the new API. Signed-off-by: Martin Vrachev <[email protected]>
As per the specification (v1.0.1) length and hashes fields in timestamp and snapshot metadata are optional. We have implement this in the older API (see theupdateframework#1031) and we should implement it in the new API. Signed-off-by: Martin Vrachev <[email protected]>
As per the specification (v1.0.1) length and hashes fields in timestamp and snapshot metadata are optional. We have implement this in the older API (see theupdateframework#1031) and we should implement it in the new API. This is a possible breaking change. Signed-off-by: Martin Vrachev <[email protected]>
As per the specification (v1.0.1) length and hashes fields in timestamp and snapshot metadata are optional. We have implement this in the older API (see theupdateframework#1031) and we should implement it in the new API. Signed-off-by: Martin Vrachev <[email protected]>
As per the specification (v1.0.1) length and hashes fields in timestamp and snapshot metadata are optional. We have implement this in the older API (see theupdateframework#1031) and we should implement it in the new API. Signed-off-by: Martin Vrachev <[email protected]>
As per the specification (v1.0.1) length and hashes fields in timestamp and snapshot metadata are optional. We have implement this in the older API (see theupdateframework#1031) and we should implement it in the new API. Signed-off-by: Martin Vrachev <[email protected]>
Please fill in the fields below to submit a pull request. The more information
that is provided, the better.
Fixes issue #: #996
Description of the changes being introduced by the pull request:
As per the tuf specification (v1.0.1) length and hashes fields are optional for
the timestamp.json and snapshot.json.
There could be a discussion around what should be the default.
My opinion is that we should apply the rule
security by default
here and use hashes and length by default.That way if somebody decides to not use them then he/she should make it explicit through passing the use_length=False and use_hashes=False arguments when calling the respective functions.
Please verify and check that the pull request fulfills the following
requirements: