I am currently doing some exploration into how clients should handle interrupted, partially successful updates. For example, say we have a client with a local cached copy of valid and unexpired metadata. We start an update process which includes new `timestamp`, `snapshot`, and `targets` metadata. Unfortunately, we download the new `timestamp` and `snapshot` and persist them to disk, but then the device loses power. When power is restored, the network is down. We'd still like to make queries against the TUF targets file, but according to the workflow, we should get an error. We can only recover from this by restoring the network.
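To make the failure mode concrete, here is a minimal sketch of the persist-as-you-go flow described above. The `fetch` and `persist` callables are hypothetical stand-ins for a real client's network and storage layers, not any actual TUF client API:

```python
def naive_update(fetch, persist):
    """Persist each role as soon as it is fetched and verified.

    If power is lost in the marked window, the local metadata set is
    left inconsistent: the new timestamp and snapshot are on disk, but
    the targets file they reference was never downloaded.
    """
    timestamp = fetch("timestamp")
    persist("timestamp", timestamp)

    snapshot = fetch("snapshot")
    persist("snapshot", snapshot)

    # <-- crash window: timestamp + snapshot persisted, targets missing
    targets = fetch("targets")
    persist("targets", targets)
```

On restart with the network down, a client following the workflow strictly has no consistent metadata set to fall back to.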
This is particularly relevant to Fuchsia because of how we have designed our packaging system. We want to treat the TUF targets as the list of executable packages, since this allows us to maintain a cryptographic chain of trust, all the way down to the bootloader, for what can be executed. All our packages are stored in a content-addressed filesystem, and we use the `custom` field in a TUF target to provide the mapping from a human-readable name to a merkle-addressed package blob. When we try to open a package, we first look in TUF to find the merkle, then check whether we've already downloaded that blob. If so, we open that package and serve it to the caller. See this slightly stale doc for more details. Due to this interrupted-update problem, there's a chance a Fuchsia device could be made unusable until we are able to finish updating our metadata.
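The lookup path above can be sketched roughly as follows. The metadata shape (a `custom` field carrying a `merkle` key) and the `blobfs` interface are illustrative assumptions, not Fuchsia's actual API:

```python
def open_package(name, tuf_targets, blobfs):
    """Resolve a human-readable package name to a content-addressed blob.

    `tuf_targets` maps target paths to their signed target metadata;
    the hypothetical `custom.merkle` field holds the merkle root of the
    package blob in the content-addressed filesystem.
    """
    target = tuf_targets.get(name)
    if target is None:
        raise KeyError(f"{name}: not present in TUF targets")

    merkle = target["custom"]["merkle"]
    if not blobfs.has_blob(merkle):
        raise IOError(f"{name}: blob {merkle} not yet downloaded")

    return blobfs.open_blob(merkle)
```

Note that the first step depends entirely on having usable `targets` metadata, which is why an interrupted update can block opening even packages whose blobs are already on disk.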
We have had a few ideas on how to approach this:
- If an update fails, we could still query the latest local `targets` metadata, assuming it was signed with a key that's still trusted by the `root` metadata.
- During the update, delay writing any metadata to disk until all the files have been downloaded and verified. Then write the files in one atomic transaction.
- For consistent snapshot metadata (which is the only mode we plan to support), fetch the `timestamp` metadata, but don't persist it to disk yet. Fetch and write the version-prefixed `snapshot` and `targets` metadata, and any other delegated metadata, to disk. Atomically write the `timestamp` metadata to disk, then clean up any old snapshot/targets/etc. metadata files.
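The third idea can be sketched as a commit routine that writes each file atomically and saves the `timestamp` for last. The directory layout and the `verified` mapping of filenames to bytes are hypothetical; the key property is that version-prefixed snapshot/targets files are inert until a timestamp referencing them lands, so a crash at any point leaves the previous consistent set in place:

```python
import os
import tempfile

def commit_update(metadata_dir, verified):
    """Commit a verified metadata set so a crash never leaves a partial one.

    `verified` maps filenames (e.g. "5.snapshot.json") to verified bytes
    and must include "timestamp.json", which is written last: committing
    the timestamp is the atomic switch to the new metadata set.
    """
    def write_atomic(path, data):
        # Write to a temp file in the same directory, fsync, then
        # rename over the destination (rename is atomic on POSIX).
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.rename(tmp, path)

    files = dict(verified)
    timestamp = files.pop("timestamp.json")
    for name, data in files.items():
        write_atomic(os.path.join(metadata_dir, name), data)

    # A crash before this line leaves only unreferenced, version-prefixed
    # files; the old timestamp still points at the old consistent set.
    write_atomic(os.path.join(metadata_dir, "timestamp.json"), timestamp)
```

A real implementation would also need to fsync the directory after the renames and handle cleanup of leftover temp files, but the ordering is the part that matters for crash consistency.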
I'm not sure whether these ideas would weaken the TUF security model, though. Is there a better way to deal with this, and could we incorporate it into the spec (or a POUF), since I imagine other folks will need a solution for this too?