Proposal
- Add a `Sync(prefix Key)` function to the `Datastore` interface (see the sketch after this list).
- This function will be a no-op when the datastore is in synchronous mode (the default).
- Otherwise, `Sync(prefix)` guarantees that any `Put(prefix + ..., value)` calls that returned before `Sync(prefix)` was called will be observed after `Sync(prefix)` returns, even if the program crashes.
- Insert calls to `Sync` where appropriate (in go-ipfs and go-libp2p).
- When ready, turn off sync writes in go-ipfs's datastore (by default). (We'll have an experimental transition with heavy testing.)
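
A minimal sketch of what the change could look like, assuming go-datastore's string-based `Key` and its usual error-returning method style. Only `Sync(prefix Key)` itself is the proposed addition; the trimmed-down `Datastore` and the `flushBlocks` helper below are illustrative, not part of the actual interface.

```go
// Sketch only: a trimmed-down Datastore with the proposed Sync method.
package datastore

// Key stands in for go-datastore's hierarchical key (e.g. "/blocks/ABC").
type Key string

// Datastore shows only the methods relevant to this proposal.
type Datastore interface {
	Put(key Key, value []byte) error
	Get(key Key) ([]byte, error)

	// Sync guarantees that any Put(prefix + ..., value) call that
	// returned before Sync(prefix) was called is observable (durable)
	// after Sync(prefix) returns, even if the program crashes.
	// In synchronous mode (the default) this is a no-op.
	Sync(prefix Key) error
}

// flushBlocks is a hypothetical caller: after a batch of writes under
// /blocks, one Sync makes them durable instead of syncing on every Put.
func flushBlocks(d Datastore) error {
	return d.Sync(Key("/blocks"))
}
```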
Notes:
- We're not changing the default behavior. Datastores will still write synchronously unless configured not to do so.
- `Put` will either write a value completely or not at all. Even when sync writes are turned off, the datastore will never be left in a corrupt state.
Motivation
Writing to disk synchronously has poor performance and is rarely necessary.
Poor performance: `ipfs add` performance is doubled (on linux/ext4) when badger is used and synchronous writes are turned off.
Rarely necessary:
- The DHT expects some number of nodes to be faulty so losing a few records is usually fine.
- IPFS only guarantees that blocks are persisted when pinned. There's no reason to sync after every write.
  - Note: For now, we'll likely want to explicitly `Sync` after a full `ipfs add`, as most users have GC turned off and expect the data to be persisted anyway. However, doing this once is cheaper than doing it for every write.
- The peerstore definitely doesn't need synchronous writes.
Alternatives
- Create a buffered/batching/async wrapper. This is what go-ipfs currently does but we could do better.
- Use the "autobatching" datastore.
However:
- Buffering/caching isn't easy.
- Unlike buffering inside the OS, they can't (easily) respond to memory pressure.
- Conversely, they force us to eagerly sync/flush at fixed thresholds instead of as-needed; the OS knows when there's enough memory to keep buffering writes (see the sketch after this list).
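
For concreteness, here is a generic sketch of such a buffered wrapper (reusing the types from the earlier sketch; this is not the actual go-ipfs wrapper or the go-datastore autobatch code). It illustrates the drawback above: the wrapper must pick a flush threshold up front, while the OS page cache can keep absorbing writes until real memory pressure forces them out.

```go
// bufferedDatastore buffers Puts in memory and flushes to the child
// datastore once a fixed number of writes is pending.
type bufferedDatastore struct {
	child   Datastore
	pending map[Key][]byte
	maxSize int // fixed, chosen up front; cannot react to memory pressure
}

func (b *bufferedDatastore) Put(key Key, value []byte) error {
	b.pending[key] = value
	if len(b.pending) >= b.maxSize {
		// Forced to flush eagerly at an arbitrary threshold, even if
		// plenty of memory is still available.
		return b.flush()
	}
	return nil
}

func (b *bufferedDatastore) flush() error {
	for k, v := range b.pending {
		if err := b.child.Put(k, v); err != nil {
			return err
		}
	}
	b.pending = make(map[Key][]byte, b.maxSize)
	return nil
}
```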
@Stebalien @whyrusleeping @raulk Seem like a reasonable plan?