About this project... #88
Isn't there any company that could offer a couple of servers to pin content? Protocol Labs, for example: they got quite a lot of money for IPFS, so why can't they use a couple of servers to pin free/libre content? Doing that would be perfectly useful to promote IPFS and speed up adoption, especially if the content pinned in IPFS is immediately available via the web as
Note: This is a good example of a discussion that fits in forum software (we're currently trying out discourse) rather than github, since it's more of a discussion and a request for information rather than a feature request or a bug report. I expressed concern about exactly this confusion about 2 weeks ago when we set up archives.ipfs.io. At that time I said:
This ipfs/archives github repository was set up as a catch-all for keeping track of any work that anyone is doing with archives on ipfs. That work has evolved a lot over the past year, which has made the information in this repository scattered and unclear. Until this week, the repository didn't have a "captain" tending to it. I've been named as the temporary captain while we look for a community member to take the role. The big picture goal for datasets on the distributed web is for many people and organizations to pin the data they care about and build their own registries with metadata about those datasets along with hashes to identify the data on the network. Those registries can then serve as points of reference for people and orgs who want to hold & serve copies of the data they care about. In the meantime, https://archives.ipfs.io/ gives a short list of datasets that we know are pinned.
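The "pin the data and publish its hash" workflow described above can be sketched with the standard `ipfs` CLI. This is only an illustration: it assumes a running IPFS daemon, and the dataset path and the CID shown are made-up placeholders, not real content.

```shell
# Add a dataset directory to the local IPFS node.
# -r recurses into the directory; the last line printed
# is the root CID identifying the whole directory.
ipfs add -r ./my-dataset

# Pin a CID so this node commits to keeping the blocks.
# (Content added locally is pinned by default; this is how
# another node holds a copy of data it learned about from
# a registry. The CID below is a placeholder.)
ipfs pin add QmExampleHashOfTheDatasetRoot

# List everything this node has pinned recursively.
ipfs pin ls --type=recursive
```

A registry, in this picture, is just metadata mapping dataset descriptions to CIDs like the one above; anyone holding a CID can fetch the data over the network and re-pin it on their own node.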
IPFS is that huge, distributed repository. The challenges lie in:
Some of the things we're doing to address these challenges:
Protocol Labs is designed as a Research & Development & Deployment lab for Networks. R&D labs are great for developing new technologies and helping the world understand how to make the most of them. They move fast and constantly shake things up. That's not the type of organization that should be storing everyone's content. Also, asking a central authority to hold everyone's data would run against the whole point of decentralization. The work of providing preservation, discovery, and access services for content is the work of communities and institutions. The OAIS Reference Model is a good starting point for thinking about what it means to store, preserve and serve content over the long term. Especially note the "minimum responsibilities" that an OAIS-type archive is expected to meet. Protocol Labs isn't set up to do this, but many institutions are. Libraries are the obvious place to turn, which is why it's so important to form a community around the Distributed Web for Libraries, Archives and Museums. That's not the final solution -- individuals, communities and companies should also pin, annotate & serve the data they care about -- but it's a solid start.
From the neighboring issue (#87), it looks like @protocol1 (and undisclosed participating institutions) is pushing an effort to publish https://data.gov (whose data is currently hosted by SunGard or Akamai) to IPFS so that its content is accessible to all researchers. AFAIK, the existing orgs/private companies/institutions that would have the resources to maintain such libre content are: http://www.alexandria.io/ (@dloa), http://ga4gh.org/ (@ga4gh, for genomics), http://ipdb.foundation/ (@ipdb), http://www.mediachain.io/ (@mediachain), ... [1] (with this, "anybody from Protocol Labs" should have seen this issue thread) it is up to anyone to assume that these agents/orgs are making a perfectly rational decision (for the sake of the public datasets?), after having taken various possible moves into account.
@rht I'm not sure I understand your comments. FYI: the "undisclosed participating institutions" are the libraries at a few large universities. They have the staff, resources and organizational infrastructure to host, curate and preserve high-impact datasets for the long term. We will be adding their names to the sprint issue and other docs over the next few days, once they've had a chance to confirm that they want their names on the project. They approached us because they plan to experiment with using IPFS to host & distribute the content. We allocated a 2-week sprint as time where a handful of us can focus our energy on helping them. This is likely the first of many collaborations that will involve many institutions around the world, from both the public and private sectors.
Closing this issue. We've answered the questions posed and created issue #89 to follow up by improving the docs with info from this issue.
... if I understand correctly, the point of this project is to choose some data to pin and offer as an "archive".