Skip to content
This repository was archived by the owner on Feb 8, 2023. It is now read-only.
This repository was archived by the owner on Feb 8, 2023. It is now read-only.

Implement databases over IPFS using Persistent Balanced Trees #161

Open
@eladve

Description

@eladve

Here are some notes for implementing a more-scalable version of Orbit-DB using persistent balanced trees. This came out of a discussion with @haadcode . Before implementing it, I'd love to get the community's feedback.

Currently, Orbit-DB operates as a key-value storage (i.e. a Python Dictionary / Go Mapping / Javascript Object) over IPFS. It supports insert, change, delete and lookup. The running time for all change operations is O(1) (by simply adding a node to the changelog and publishing the new head), but the lookup time is O(m), where m is the number of previous operations that already happened in the DB. In principle, the total cost of all lookup times at a particular node can be made to be O(m) by keeping an up-to-date version of the DB locally, and updating it according to the changelog, but this is unpleasant, and requires local storage: it requires us to keep track of everything that's happening in the DB, just to be able to do lookups: this is
unscalable. For a chat application it's actually fine, since the user is presumably interested in anything that gets posted to the channel, but for other applications it won't do.

My alternative suggestion is to implement the DB as a semi-balanced search tree (e.g. AVL, 2-3, Red-Black, etc), indexed by the key, using Path Copying (I explain the details below.) With this approach, lookups will always be sent to the remote IPFS storage, and a client does not have to store anything locally. All operation times will take O(logn) time, where n is the number of keys currently in the DB. This approach will also allow us to perform look-ups in previous versions of the database (just like the current version allows). Also, when we implement this approach, the resulting codebase will provide a good template or proof-of-concept for DBs that support other operations. For example, we can create a database over IPFS that supports prefix searches (e.g. for searching for text in a chat log ) by implementing a suffix tree style data structure, with Path Copying, over IPFS. Also, since the operation times will now be scalable, we won't have the separate orbit-db into "channels": instead, we can keep the entire DB at one place, and use the channel name as part of the search key.

The implementation I suggest will work as follows: The DB will always look like a tree (e.g. AVL Tree): There will always be a "root node", which is the most recently published root on the pubsub system for this channel. Each node will have a key, key_0, a value, val_0, and two IPFS pointers to children (any of which can be null). All keys on the right side will be larger than key_0, and all keys on the left side will be smaller than key_0. For a lookup operation on the most up-to-date version we just start from the pointer to the root node, and we make our way down in binary search fashion until we find the requested key or discover that it does not exist in the tree. Now for updates: if we wish to update the value related to a certain key, we find the node holding this key, call it v. Obviously we can't change v, because we are working in an immutable system. Instead, we create a new copy of v, call it v',which is identical to v, except that the value is changed to the new value. Now we need to update the parent of v, call it u, to point to v' instead of v. But, of course, we can't do that, because IPFS is immutable. So we take u, and make a copy of it, u', where u' is identical to u, except that the pointer to v is replaced by a pointer to v'. We go up like this in the tree, until getting a new copy of the root, r'. Now we take r', we make it the new root, and we publish the IPFS pointer to r' in the pubsub system as the new root. What do we do with the old root r? Well, there is still a pointer to it in the pubsub system, as the old root, so anyone who wishes to access the old version of the tree can still do that by going through r. We should make sure that timestamps and other metadata is published along with r' (and maybe with each vertex), in order to aid people at finding the version they want, in case they want it. For the chat application, and for most other applications, we'll actually only be interested in the newest available root. Timestamps and metadata will also help clients know the freshness of the data they're getting. Summing up, the change operation took O(height) time, and created O(height) new IPFS blocks. In an AVL tree, height=O(logn), so we're good.

How do we implement insert and delete operations? We do them like in the underlying tree (say AVL). We find the node to delete, or the place where the insertion should occur, and we do a classical AVL operation, except that each time we need to modify a node, we instead create a new copy of it, modified in the appropriate places, and we make sure that the parent points to the new copy rather than the old one. One can check that because an AVL operation operated along a root-to-node path anyway, only O(logn) nodes will be copied, and only O(logn) operations will be needed. (This also follows from the classical performance analysis of the Path Copying method.)

For performance sake, we can store a whole chunk of the tree in one IPFS object, instead of just a single node: this might lower overheads.

To know more about this approach to implementing data structures, here are some resources. What I described here basically amounts to an implementation of partially-persistent data structures. While the original motivation behind persistent data structures was to allow access to old versions, that academic field of knowledge can also be re-appropriated to implementing data structures over immutable storage, where we simply are not allowed to delete old data. Research on persistent data structures was originally motivated by functional programming, because all objects in purely functional languages are immutable. This once again shows us how blockchain and decentralized systems are a perfect fit to functional programming: functional programming is great for security and for avoiding bugs, it is great for program verification, it's great for avoiding weird synchronicity bugs in distributed systems, and it naturally makes objects immutable, which means that any data structure written in a functional program is automatically ready for IPFS. For example, any implementation of the data structure I describe above in a functional language will automatically be trivially portable to IPFS; this is not necessarily true for a Go/Python/JS implementation. One more thing: another setting where immutability was needed is flash memory and other kinds of write-only memory (or limited-rewrite memory, like SSD drives and USB drives). This means that the literature on data structures on flash-memory/SSD might contain interesting ideas for implementing various data structures on IPFS.

Here are some myriad links about the topic of Persistent Data Structures:
https://en.wikipedia.org/wiki/Persistent_data_structure
Lectures 1,2,3 at https://courses.csail.mit.edu/6.851/spring14/lectures/
http://www.cs.tau.ac.il/~haimk/papers/persistent-survey.ps
https://www.reddit.com/r/rust/comments/3e000u/survey_persistent_data_structures/
http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf
http://www.cs.tau.ac.il/~haimk/papers/journal2.pdf
https://www.cs.cmu.edu/~sleator/papers/making-data-structures-persistent.pdf
More references can be found here (search for persistence, and expand
the link): http://jeffe.cs.illinois.edu/teaching/datastructures/schedule.html

A final note I don't know how the native CRDT system implemented in IPFS interacts with the data structure I describe. I doubt it will harm correctness, but it might harm efficiency. More thought should be given to this.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions