Description
The filestore is a datastore, but it is only designed to handle a subset of the blocks that can be used in IPFS. The main datastore is therefore still needed, along with some form of support for multiple blockstores or datastores, so that the filestore and the main datastore can coexist. This is a required infrastructure change in order to land #2634.
The following describes how it is currently implemented. Please let me know if you agree with and understand the changes. Once there is a general consensus I can separate out the non-filestore bits to support this infrastructure change so we can work through the implementation details.
Sorry if it is a bit long.
@whyrusleeping please CC anyone else who should be involved.
Overview
There are several ways to support the "filestore". What I believe makes the most sense, and will be the easiest to implement, is to support a "cache" plus any number of additional "aux" datastores with the following semantics:
- When looking up a block the "cache" is tried first; if the block is not found, each "aux" datastore is tried in turn. The order of the "aux" datastores is explicitly set by the user.
- Any operations that modify the datastore only act on the "cache".
- The "aux" datastores are allowed to read-only. When they are not additional specialized API calls will be required for adding or removing data from the "aux" datastores.
- Each of these datastores is given a name and can be accessed via its name from the repo.
- Duplicate data should be avoided when possible, but is not completely disallowed.
These rules imply that the garbage collector should only attempt to remove data from the "cache" and leave the other datastores alone.
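To make these lookup and write semantics concrete, here is a minimal sketch in Go. It uses a simplified stand-in Blockstore interface rather than the real go-ipfs types, so all names here are illustrative only:

```go
package multiblockstore

import "errors"

// Simplified stand-ins for the real types; illustrative only.
type Block struct {
	Key  string
	Data []byte
}

type Blockstore interface {
	Get(key string) (*Block, error)
	Put(*Block) error
	DeleteBlock(key string) error
}

var ErrNotFound = errors.New("blockstore: block not found")

// multiBlockstore tries the "cache" first and then each "aux" store
// in the user-configured order.
type multiBlockstore struct {
	cache Blockstore   // the first mount, e.g. "/blocks"
	aux   []Blockstore // e.g. "/filestore", tried in order
}

// Get checks the cache, then falls through to each aux datastore.
func (m *multiBlockstore) Get(key string) (*Block, error) {
	if b, err := m.cache.Get(key); err == nil {
		return b, nil
	}
	for _, a := range m.aux {
		if b, err := a.Get(key); err == nil {
			return b, nil
		}
	}
	return nil, ErrNotFound
}

// Put and DeleteBlock act on the cache only; aux stores are modified
// through their own specialized interfaces (if they are writable at all).
func (m *multiBlockstore) Put(b *Block) error { return m.cache.Put(b) }

func (m *multiBlockstore) DeleteBlock(key string) error {
	return m.cache.DeleteBlock(key)
}
```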
High level implementation details
The multiplexing can happen at either the datastore or the blockstore level. I originally implemented it at the datastore level, but changed it to the blockstore level to better interact with caching. The filestore itself is still implemented as a datastore (for now).
In the fsrepo, normal blocks are mounted under the `/blocks` prefix (this is unchanged) and the filestore is mounted under the `/filestore` prefix (this is new). The fsrepo has been enhanced to be able to retrieve the underlying datastore based on its prefix. (This is required by the filestore.)
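As a rough illustration of that prefix lookup, here is a hedged sketch; the type and method names (fsrepoMounts, DatastoreFor) are hypothetical and do not reflect the actual fsrepo API:

```go
// Hypothetical sketch of prefix-based lookup; the real fsrepo API may
// differ.  Datastore stands in for the go-datastore interface.
type Datastore interface{ /* Get, Put, ... */ }

type mount struct {
	prefix string    // e.g. "/blocks" or "/filestore"
	dstore Datastore // the datastore serving that prefix
}

type fsrepoMounts struct {
	mounts []mount
}

// DatastoreFor returns the datastore mounted under the given prefix,
// or nil if no such mount exists.
func (r *fsrepoMounts) DatastoreFor(prefix string) Datastore {
	for _, m := range r.mounts {
		if m.prefix == prefix {
			return m.dstore
		}
	}
	return nil
}
```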
The top-level blockstore is now a multi-blockstore that works by checking a pre-configured set of prefixes in turn to find a matching key. Each mount is wrapped in its own blockstore with its own caching semantics. The first mount, `/blocks`, is considered the cache, and all Puts and Deletes go only to the cache. The multi-blockstore interface is as follows:
```go
type MultiBlockstore interface {
	GCBlockstore
	FirstMount() Blockstore          // returns the first mount
	Mounts() []string                // lists the mounts
	Mount(prefix string) Blockstore  // returns a mount by name
	Locate(key key.Key) []LocateInfo // lists all locations of a block
}
```
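As a usage sketch, assuming LocateInfo carries the mount prefix in a Prefix field (its actual definition is not shown here), the interface might be exercised like this:

```go
// Sketch only: LocateInfo.Prefix is an assumed field name.
func printLocations(bs MultiBlockstore, k key.Key) {
	fmt.Println("mounts:", bs.Mounts())
	for _, info := range bs.Locate(k) {
		fmt.Printf("block %s found under mount %s\n", k, info.Prefix)
	}
}
```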
The garbage collector uses FirstMount().AllKeysChan(ctx) to get the list of blocks to try to delete. Caching is currently only done on the first mount.
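In sketch form, the GC candidate loop implied here looks roughly like the following, assuming the standard AllKeysChan and DeleteBlock blockstore methods and a precomputed set of reachable keys:

```go
// Only keys from the first mount (the cache) are deletion candidates;
// the aux mounts are never touched by GC.
func gcSweep(ctx context.Context, bs MultiBlockstore, reachable map[key.Key]bool) error {
	cache := bs.FirstMount()
	keys, err := cache.AllKeysChan(ctx)
	if err != nil {
		return err
	}
	for k := range keys {
		if !reachable[k] {
			if err := cache.DeleteBlock(k); err != nil {
				return err
			}
		}
	}
	return nil
}
```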
As an implementation detail, it is worth noting that files are removed from or added to the filestore directly using a specialized interface that bypasses the normal Blockstore and Filestore interfaces. This was discussed with @whyrusleeping (#2634 (comment)).
Duplicate blocks (that is, blocks found under more than one mount) are not forbidden, as forbidding them would be impractical. The Locate() method can be used to discover under which mounts a block is found; it lists all mounts containing the block, which can be used to help eliminate the duplicates.
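For example, here is a hedged sketch of how Locate() could drive duplicate elimination (again assuming a hypothetical LocateInfo.Prefix field):

```go
// If a block exists both in the cache ("/blocks") and under some other
// mount, the cached copy is the redundant one and can be dropped.
func dedupBlock(bs MultiBlockstore, k key.Key) error {
	locs := bs.Locate(k)
	if len(locs) < 2 {
		return nil // not duplicated, nothing to do
	}
	for _, info := range locs {
		if info.Prefix == "/blocks" {
			return bs.Mount("/blocks").DeleteBlock(k)
		}
	}
	return nil
}
```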
Other uses
The two mounts `/blocks` and `/filestore` are currently hard coded; with some effort this could be made into a more general-purpose mechanism to support multiple blockstores.
One use case I can think of is to have a separate read-only datastore to store permanent content, as an alternative to maintaining a large pin-set, which currently has performance problems. The datastore could even be on a read-only filesystem to prevent any possibility of the data accidentally being deleted, either by user error or a software bug. Some additional design decisions will need to be made for this, so I am not proposing it right now, but merely offering it as a possibility.
Another possibility is to support a cache on a local filesystem and a larger datastore in the cloud.