IPFS database: pubsub, consistency and persistence #244
Description
This topic has probably been touched in various other notes, but I feel I need a specific place to address some of my concerns related to building an eventually-consistent key-value store on top of IPFS. I'm sure there is a huge body of knowledge out there on this subject, and I would like to gather some of it here. Also, I know some things about how how other databases try to solve this, but I'm more interested in how to do this using the primitives that IPFS already provides.
Partitioning and scaling
Instead of a global store, approaches like OrbitDB split a database into "tables" or "partitions" (I'll just call it "partition" from now on, as table reminds me of relational DBs too much). This model allows a database to scale, and keep all the messages related to that partition only to the nodes that are interested in it.
Assuming that each operation is appended to a log, and that this log is saved onto IPFS. The hash of the latest head of the log is then broadcasted using a pubsub channel that is specific to that partition.
Each node that is interested in a given partition subscribes to updates on that partition, and then it keeps receiving updates on the latest known head. When a node gets an update, it then uses IPFS to retrieve the content of the head (and parents) until it has all the operation log data that is needed to apply to the database. (Conflicts may arise and each node must be able to resolve them in a deterministic way, but I think this may be a discussion to have in the realm of CRDTs and related topics).
Unreliable message delivery
But, as we know, pubsub does not have reliable delivery, which means that messages can be lost. Poorly connected nodes and new nodes need to have a way to query what is the latest head of a given partition.
This can be a problem, which can be circumvented in several ways, and I enumerate a few:
-
If there is enough operations on that partition, these pubsub messages pertaining to a given partition will have enough frequency that all interested nodes will eventually get a message
-
Instead of one broadcast once there is a new operation, if nodes that participate in a partition keep broadcasting the head (at least while they think the network has interest in that partition), all interested nodes will eventually get the latest head.
I think that, for some use cases, scenario 1 may be too weak, since it's bound to database activity.
Scenario 2 provides more consistency and persistence guarantees, but at the expense of network activity, which is, if not carefully designed, increases linearly with the number of nodes. (even though it is limited to the nodes that participate in that partition). Some mechanisms to soften this would be to:
a) broadcast when a new new node is added to the topic
b) back-off the frequency of broadcasts once the activity stops
c) broadcast less frequently as the number of interested nodes increase
Is this a valid concern? If so, what are your thoughts or experience on these?
// pinging @diasdavid @haadcode @gritzko