-
Notifications
You must be signed in to change notification settings - Fork 6
Download and validate root TUF data automatically if needed #18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This looks great. Does anyone else have thoughts? |
Maybe add the file size as well as the hash? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything looks functionally correct to me. I have one question about key rotation and one request to rename some variables.
"root": "root.json" | ||
"root": { | ||
"hashes": { | ||
"sha256": "6cf95b77cedc832c980b81560704bd2fb9ee32ec4c1a73395a029b76715705cc" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remember correctly, the TUF spec locates the root key offline (e.g., literally in a physical safe or something) and suggests rotating it roughly annually. Wouldn't this sha hash change when that happens? And in that case, would this "just stop working" one day for our users until they update the hash from the webpage or whatever they originally got it from? cc @davidstrauss @tedbow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd have to review the code again to understand how it's implemented, but what it should do is only use this pinning to bootstrap new clients. The code should probably ship with the version and size for that, too.
When we rotate root, it should have the following effects with this design:
- No special effect on bootstrapped clients, which have already pulled and stored local root data and no longer use the bootstrapping information.
- If someone installs using a download from before the rotation, their client will need to bootstrap to the old root and update to the new one by standard means. This would also be the case if someone installed from a stale tarball that included a full but stale
root.json
. We're required to keep the chain of previous root data published, at least as far back as we expect clients to need to bootstrap.
Here's how my thought process went around working with this root bootstrapping problem:
- We could use a design where we ship the initial root data in a way that both populates the client's repo metadata store and retains its writability, but that's kind of awkward. We usually like to split things into (a) shipped and nearly immutable vs. (b) mutable but initially empty (e.g. /files).
- We could ship
root.json
and copy it into the client repo metadata. It's kind of unwieldly to embed in other JSON, though. We'd probably want to ship it separately and reference it somehow in the repo config. This was the prior design, and my suggestion was to simplify it by going to the next level of abstraction by "pinning" the expected metadata rather than providing it directly. - Since old repo metadata must remain published, we ought to be able to bootstrap with just a hash (and maybe size) for the data. If it can't fetch the current (or fresher) root data, then it probably can't pull anything else from the repo. Having a full root.json doesn't seem to help the client survive any attacks above and beyond providing it the version + hash + size.
Still, I'll run this past people at the Secure Systems Lab.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From Marina Moore at the Secure Systems Lab on TUF Slack:
For the purpose of bootstrapping trust, the secure hash should be functionally equivalent to the root metadata, so that seems like a reasonable space-saving mechanism. It also gives more paranoid users an opportunity to look for external verification that they have the correct root metadata.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am unclear of the motivation for this PR. @phenaproxima in the first comment describes the problem as "we can't easily resolve the path of the starting root metadata,"
but then @davidstrauss in TUF slack mentions "include it in Composer's repository configuration in a compact way" and Marina Moore mentions "reasonable space-saving mechanism".
Is there not an actual problem of resolving? Are the initial root.json files so big we are worried about the space?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just awkward to have the Composer config either (1) reference a file elsewhere on disk or (2) embed the root data into an otherwise sensible JSON file.
Another option could be some sort of naming convention for discovering the root data for a given repo. For example, we could use the FQDN or hash of the base URL. This would allow populating the root on disk without awkwardly referencing it from the JSON.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that would solve the problem. The issue is, if we expect to find the root JSON in the file system, we need to know its path. And as you know, there are two kinds of paths: relative and absolute. Both present problems:
If we used a relative path, then what is it relative to? The current working directory? The plugin's src
directory? The composer.json file itself? Some package's path in the vendor directory? (If it's relative to a package, what's the path of the package? We cannot know this until the package info is loaded, but TUF needs to be fully bootstrapped and updated before we can even begin to figure that out.)
If it's an absolute URL, then how can we guarantee that path be valid in all environments the composer.json file might live in?
Downloading the root JSON is not my favorite idea, but it does get around these questions, because the root file lives at a stable, known location (i.e., a URL on the server). But I'm absolutely open to other ideas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One option that @davidstrauss suggested later in this issue is to use magic naming and a fixed directory structure. For example, if we want people to add TUF protection to the packages.drupal.org
repository, we document that we expect the project to have a directory structure like this:
.pki/
tuf/
packages.drupal.org.json
composer.json
@TravisCarden I think that, to avoid getting too verbose, the best thing to do might just be a comment. What do you think? |
The Secure Systems Lab has suggested we review/collaborate with this PIP issue. |
Some drive-by comments (from python client experience, without any knowledge of the php version):
That said, I don't understand why you are doing this: if you have place to store the hash of a root.json, I don't see why you would not have a place to store the root.json itself? I would definitely choose embedding it in the source code if a local file can't be reliably loaded from disk. |
$fetcher = $repoConfig['tuf']['_fileFetcher'] ?? GuzzleFileFetcher::createFromUri($url); | ||
$this->updater = $repoConfig['tuf']['_updater'] ?? new ComposerCompatibleUpdater($fetcher, [], new FileStorage($repoPath)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know we are already doing this for _fileFetcher
but it seems weird to accept config options that should only ever be used in testing. I don't know enough about how Composer plugins work but it makes me wonder if another plugin would ever have a chance to alter this config to say pass malicious file fetcher or updater in these keys.
It seems like if we don't want _updater
to be used during a non-test run we should do whatever we can to make that impossible.
Looking at the test
return $this->composer->getRepositoryManager()
->createRepository('composer', $config);
Because TufValidatedComposerRepository
is not final could we extend it as TestTufValidatedComposerRepository
to be able to set the fetcher and updater in some way then use \Composer\Repository\RepositoryManager::setRepositoryClass
to make it use our test class. The maybe avoiding having this test only code in our class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I proposed an alternate approach in #22, although I don't know if it's really much of an improvement.
A lot of these problems stem from the fact that ComposerRepository::$config is private, and I need to be able to modify the configuration in the constructor before it's given to the parent class. If that property were protected, this would be a non-issue, and I could move all of this logic into a different method, rather than being constrained by the fact that it's all got to happen in the constructor.
I don't have a strong opinion that we should do this. It was simply a way to allow So, this approach makes Another approach, which I might actually like more, is where |
I kinda like that idea, @davidstrauss. This is a case where I think that convention and magic naming are going to be the clearest option for users. And Drupal could certainly modify the core-composer-scaffold plugin to automatically put its One small question we'd have to figure out is how to handle repository URLs where the path is significant -- |
Should we close this PR or set it to draft since it seems we want to go with #23 instead? |
Closed in favor of #23. |
This is an experiment, implementing @davidstrauss's idea in #7.
The idea is basically: to get around the fact that we can't easily resolve the path of the starting root metadata, we could instead expect consumers of this plugin to have a set of hashes for the root metadata in their composer.json. The plugin could then try to download the root metadata and validate it against those hashes very early, before any other data is downloaded.