This repository was archived by the owner on Aug 3, 2021. It is now read-only.

ipfs-pack tutorial #8

31 changes: 31 additions & 0 deletions tutorial/README.md
@@ -0,0 +1,31 @@
Tutorial: IPFS Pack
====
These Lessons are tested with ipfs-pac version ???. _Please update this file on github to reflect any other versions that have been tested._
Review comment (Member): Typo on ipfs-pack


## Prerequisites

- You should have some familiarity with the command line.
- You should have Go (golang) installed.
- You **do not need to have ipfs installed already**.

## Learning Objectives
These Lessons will teach you how to
* Understand the use cases that motivated the creation of ipfs-pack
* Understand the core concepts behind ipfs-pack
* Compare and contrast the "copy-on-add" and "nocopy" approaches to adding content to ipfs
* Decide whether IPFS Pack is the right tool for your needs
* Install the ipfs-pack tool
* Initialize a directory on your machine as an IPFS Pack
* Serve the contents of an IPFS Pack over the IPFS network

## Key Concepts
* TODO

# Lessons

1. [Lesson: Understanding IPFS Pack](lessons/understanding-ipfs-pack.md)
2. [Lesson: Install ipfs-pack](lessons/install-ipfs-pack.md)
3. [Lesson: Initialize an IPFS Pack](lessons/initialize-a-pack.md)
4. [Lesson: Serve the Contents of the Pack to the Network](lessons/serve-pack-contents.md)

## Next Steps
91 changes: 91 additions & 0 deletions tutorial/lessons/initialize-a-pack.md
@@ -0,0 +1,91 @@
# Lesson: Initialize an IPFS Pack

## Goals

After doing this Lesson you will be able to

* Initialize a directory on your machine as an IPFS Pack
* Serve the contents of an IPFS Pack over the IPFS network
* Explain the contents of the PackManifest file in the root of your IPFS Pack

## Steps

### Step 1: Create a Pack
In this step you will create a PackManifest file in a directory you choose. The PackManifest represents that directory and all the files and directories within it as a [pack tree](../../spec.md#terms).
The directory where the PackManifest file is generated is referred to as the [pack root](../../spec.md#terms).

First `cd` into the root directory of a dataset that you want to track.
```
$ cd {path to your directory}
```

While in the root directory of the dataset, run `ipfs-pack make`.
```
$ ipfs-pack make
wrote PackManifest
```

Congratulations. You've now built a Pack for your dataset. The command created a `PackManifest` file, which lists the directory's contents and their hashes, and an `.ipfs-pack` directory, which contains the IPFS repo (a.k.a. object store) that represents your dataset as hashed, content-addressed chunks that are ready to be served to the network. Remember, unlike normal IPFS repositories, which actually contain a copy of your data, the `.ipfs-pack` repo _only contains references to the data_. The data is referenced by paths relative to the root of your pack.

### Step 2: Inspect the PackManifest

Use a text editor to look at the contents of the `PackManifest` file.

```
$ vi PackManifest
```

It has a bunch of lines that look like this:
```
zb2rhmyBBjTBw1q9TQQv3YV69Gc2tHccTv7egjNNVXuu8YPpw f0000120001 ./tutorial/lessons/initialize-a-pack.md
```

The format of each line is:
`<content hash> <format string> <relative path>`

The `<content hash>` is the hash of the file located at `<relative path>`. The `<format string>` provides a [future-proofed](https://www.youtube.com/watch?v=soUG72j7kB0) record of how the content hash was generated and how the content was added to the `.ipfs-pack` repo.
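
As a rough illustration of that format, the sketch below splits each manifest line into its three fields with `awk`. It assumes the fields are separated by whitespace, as in the example line above, and that everything after the second field is the path. The function name `manifest_fields` is made up for this example and is not part of ipfs-pack.

```sh
# Hypothetical helper, not part of ipfs-pack: print the three fields of
# each PackManifest line. Assumes whitespace-separated fields.
manifest_fields() {
  awk 'NF >= 3 {
    # Rebuild the path from the remaining fields so that paths
    # containing single spaces survive the split.
    path = $3
    for (i = 4; i <= NF; i++) path = path " " $i
    printf "hash=%s format=%s path=%s\n", $1, $2, path
  }' "$1"
}

# Example use, run from the pack root:
#   manifest_fields PackManifest
```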

### Step 3: Verify the contents of the directory against the PackManifest

At any point you can use the PackManifest to verify that the contents of your pack have not changed. To do this, run `ipfs-pack verify`.

```
$ ipfs-pack verify
Pack verified successfully!
```

If any of the files in your pack have changed, the output will tell you which files have changed and conclude with the message `Pack verify found some corruption.` For example, if my pack contained a file called `styles/site.css` and I modified it after building the pack, I would get a message like:

```
Checksum mismatch on ./styles/site.css. (zb2rhof9xknpBt36jvWRPVADLfsk2zhL7y5dLUSRRNfMuTGnF)
Pack verify found some corruption.
```

For more info about using this command, read the help text on the command line:
```
$ ipfs-pack verify --help
```
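
If you want to act on the result of `ipfs-pack verify` from a script, one option is to pull the file paths out of the mismatch messages. The sketch below assumes the messages look exactly like the `Checksum mismatch on <path>. (<hash>)` example in this step and that they arrive on standard output; the wording may differ between versions. The function name `changed_files` is hypothetical.

```sh
# Hypothetical helper, not part of ipfs-pack: read verify output on
# stdin and print only the paths reported as mismatched.
# Assumes lines of the form: Checksum mismatch on <path>. (<hash>)
changed_files() {
  sed -n 's/^Checksum mismatch on \(.*\)\. (.*)$/\1/p'
}

# Example use, run from the pack root:
#   ipfs-pack verify | changed_files
```

If your version writes these messages to standard error instead, add `2>&1` before the pipe.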

### Step 4: Update the PackManifest

ipfs-pack is not like git, which keeps previous versions of your files when you commit new changes. On the contrary, the current version of ipfs-pack is specifically designed **not to duplicate** any of your content. This is so you can add any amount of data, possibly hundreds of Terabytes, without taking up extra storage. For more info about this, read the lesson on [understanding ipfs-pack](understanding-ipfs-pack.md).

If you change any of the files in your pack, you can update the PackManifest by running `ipfs-pack make` again. This will update the PackManifest to accurately represent the current contents of the pack tree. Note that this will change the root hash of the pack, and if you remove or modify any of the dataset contents, the old information will no longer be available from your ipfs-pack node.
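
To see what an update changed, you could keep a copy of the old PackManifest somewhere outside the pack and compare it with the new one. The sketch below is one way to do that with `awk`. It assumes the `<content hash> <format string> <relative path>` line format described in Step 2, with whitespace-separated fields. The function name `manifest_changes` and the `/tmp` location are made up for this example.

```sh
# Hypothetical helper, not part of ipfs-pack: compare two manifests and
# report paths that were added, removed, or whose hash changed.
#   $1 = old manifest, $2 = new manifest
manifest_changes() {
  awk '
    # The path is everything after the second field.
    function path_of(    p, i) {
      p = $3
      for (i = 4; i <= NF; i++) p = p " " $i
      return p
    }
    NF < 3 { next }
    # First file: remember the hash recorded for each path.
    FILENAME == ARGV[1] { old[path_of()] = $1; next }
    # Second file: compare against what was remembered.
    {
      p = path_of()
      if (!(p in old)) print "added " p
      else if (old[p] != $1) print "changed " p
      seen[p] = 1
    }
    # Anything remembered but never seen again was removed.
    # The order of these lines is not guaranteed.
    END { for (p in old) if (!(p in seen)) print "removed " p }
  ' "$1" "$2"
}

# Example use, run from the pack root:
#   cp PackManifest /tmp/PackManifest.before
#   ipfs-pack make
#   manifest_changes /tmp/PackManifest.before PackManifest
```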

### Step 5: Rebuild the Local IPFS Object Store inside your Pack

When you ran `ipfs-pack make`, it built an IPFS repository in the root of your pack called `.ipfs-pack`. If you want to regenerate that repository, run `ipfs-pack repo regen`:

```
$ ipfs-pack repo regen
```

To see the other commands you can run against the pack repo, run

```
$ ipfs-pack repo --help
```

## Next Steps

Now you're ready to [serve the pack contents over ipfs](serve-pack-contents.md).
41 changes: 41 additions & 0 deletions tutorial/lessons/install-ipfs-pack.md
@@ -0,0 +1,41 @@
Lesson: Install ipfs-pack
=====

## Prerequisites

* Install [go](https://golang.org/dl/)
* Make sure that `$GOPATH/bin` is on your `$PATH`: `export PATH=$PATH:$GOPATH/bin`

You do not need to install ipfs separately. ipfs-pack will handle that for you.

## (SOON) Download the Prebuilt Binaries

_If you see ipfs-pack binaries listed on dist.ipfs.io, then this document is out of date. Please submit a PR or an issue on GitHub to correct it._

Currently there are no prebuilt ipfs-pack binaries. After we have tested the tool more thoroughly we will release prebuilt binaries on dist.ipfs.io. In the meantime, you will have to build from source. (read on...)

## Build from source

### Step 1: Clone the Repository

```sh
git clone git@github.com:ipfs/ipfs-pack
cd ipfs-pack
```

### Step 2: Build the binaries from source
Build ipfs-pack, which includes go-ipfs. This will take a while because it downloads and builds all of go-ipfs.
```
make build
```

This generates a binary called `ipfs-pack`

### Step 3: Add the binary to your PATH

Add the generated binary to your executable PATH. The way to do this depends on your operating system.
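
On Linux or macOS, one possible approach is to copy the binary into a personal `bin` directory and add that directory to `PATH` if it is not there already. The sketch below assumes a POSIX shell. The directory `$HOME/bin` and the function names `on_path` and `install_binary` are illustrative choices, not something ipfs-pack requires.

```sh
# Hypothetical helpers, not part of ipfs-pack.

# Return success if directory $1 is already listed in $PATH.
on_path() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Copy binary $1 into directory $2 and make sure it is executable.
install_binary() {
  mkdir -p "$2" &&
    cp "$1" "$2/" &&
    chmod +x "$2/$(basename "$1")"
}

# Example use, run from the cloned ipfs-pack directory after `make build`:
#   install_binary ./ipfs-pack "$HOME/bin"
#   on_path "$HOME/bin" || export PATH="$PATH:$HOME/bin"
```

To make the `PATH` change permanent, add the `export` line to your shell's startup file.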


## Next Steps

Next, [Initialize a Pack](initialize-a-pack.md)
58 changes: 58 additions & 0 deletions tutorial/lessons/serve-pack-contents.md
@@ -0,0 +1,58 @@
# Lesson: Serve the Contents of the Pack to the Network

## Goals

After doing this Lesson you will be able to

* Serve the contents of an IPFS pack to the IPFS network

## Steps

### Step 1: Start the IPFS node

ipfs-pack lets you serve the contents of your pack directly to the ipfs network. When you created the pack with the `ipfs-pack make` command, it set up everything you need. Now all you need to do is start the node so it can connect to the network and serve the content. To do this, run

```
$ ipfs-pack serve
```

After starting the ipfs node with `ipfs-pack serve`, you will see some info about the node printed on the console. It will look like:

```
verified pack, starting server...
Serving data in this pack...
Peer ID: QmVbXV7mQ5Fs3tYY2Euek5YdkkzcRafUg8qGWvFdgaBMuo
/ip4/127.0.0.1/tcp/58162
/ip4/1.2.3.4/tcp/58162
Pack root is QmRguPt6jHmVMzu1NM8wQmpoymM9UeqDJGXdQyU3GhiPy4
Shared: 0 blocks, 0 B total data uploaded
```

### Step 2: Share the Pack Root Hash

Now other nodes on the network can access the dataset, but they need to know the hashes of the dataset's content. The easiest way to give them access to that info is by giving them the pack root hash. That hash is in the information that was printed on the command line when you ran `ipfs-pack serve`. In the example output above, it's the second to last line, which reads:

```
Pack root is QmRguPt6jHmVMzu1NM8wQmpoymM9UeqDJGXdQyU3GhiPy4
```

If you give that hash, which usually begins with `Qm`, to anyone else they can use it to request the dataset from your node.
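
If you would rather not copy the hash by hand, a small filter can pick it out of the server output. The sketch below assumes the `Pack root is <hash>` line shown in the example output, which may differ between versions. The function name `pack_root` is hypothetical.

```sh
# Hypothetical helper, not part of ipfs-pack: read `ipfs-pack serve`
# output on stdin and print the hash from the "Pack root is ..." line.
pack_root() {
  sed -n 's/^Pack root is \([^[:space:]]*\).*$/\1/p'
}

# Example use, with the server output saved to a file outside the pack:
#   pack_root < /tmp/ipfs-pack-serve.log
```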

### Step 3: Try it out

If you have a copy of regular ipfs (not ipfs-pack) installed, you can confirm that your dataset is available on the network by running a second ipfs node on your machine and using it to read the data. For example, you can list the contents of your pack's root directory with `ipfs ls`.

First, start a second, regular ipfs node. This will let your ipfs node retrieve the data from your pack repo using the ipfs protocol.

```
$ ipfs daemon
```

Now list the contents of your pack's root directory using `ipfs ls`
```
$ ipfs ls YOUR_ROOT_HASH
```
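
Beyond listing the root directory, you may want to read a single file from the pack. The sketch below builds an `/ipfs/...` path from the pack root hash and a relative path as it appears in the PackManifest, which you could then pass to `ipfs cat`. It assumes that files are reachable under the pack root at the same relative paths the manifest lists. The function name `ipfs_path` is made up for this example.

```sh
# Hypothetical helper, not part of ipfs or ipfs-pack: join a pack root
# hash ($1) and a manifest-style relative path ($2) into an IPFS path.
ipfs_path() {
  # Drop a leading "./" as used in PackManifest paths.
  printf '/ipfs/%s/%s\n' "$1" "${2#./}"
}

# Example use, with the second ipfs node running:
#   ipfs cat "$(ipfs_path YOUR_ROOT_HASH ./styles/site.css)"
```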

## Next Steps

You're all set! Go to IRC or the IPFS forums to tell us about the data you're serving with ipfs-pack. For info about how to connect with the community, visit https://github.com/ipfs/community
51 changes: 51 additions & 0 deletions tutorial/lessons/understanding-ipfs-pack.md
@@ -0,0 +1,51 @@
Lesson: Understanding IPFS Pack
===

## Goals

After doing this Lesson you will be able to
* Understand the use cases that motivated the creation of ipfs-pack
* Compare and contrast the "copy-on-add" and "nocopy" approaches to adding content to ipfs
* Decide whether IPFS Pack is the right tool for your needs
* Explain terms from the ipfs-pack Specification, such as "pack tree", "pack root" and "PackManifest"


## Steps

### The Motivation

We designed ipfs-pack to support situations where you have a large dataset that you do not intend to change.

This is not like git, which keeps previous versions of your files when you commit new changes. On the contrary, the current version of ipfs-pack is specifically designed **not to duplicate** any of your content. This is so you can add any amount of data, possibly hundreds of Terabytes, without taking up much extra storage.

### Is ipfs-pack the Right Tool for You?

copy-on-add vs "nocopy"

Over time, ipfs-pack might evolve to gracefully handle situations where the registered content _can_ change.

ipfs-pack is not appropriate for accumulating a version history of a dataset that changes over time. For that you should use regular ipfs, adding the content to your ipfs repository with `ipfs add` whenever it changes. That way, the copy of your data that accumulates in the ipfs repository will contain all versions of the data and will attempt to store those versions in a compact way that minimizes duplication.

### Don't use ipfs-pack for...
* **Don't use for** Sharing files that you intend to change frequently
* **Don't use for** Tracking version histories of content

For those situations, use regular ipfs, not ipfs-pack.

### Good uses for ipfs-pack
* Sharing archival copies of content which you don't intend to change
* Sharing large volumes of data that you can't afford to duplicate

**Example: Serving an Archival Copy** If you have an archival copy of a dataset that you actively plan to preserve in its current structure, you can use ipfs-pack to serve that archival copy directly to the ipfs network. That way, it does double duty. It's both an archival copy and a network-available seed of the dataset.

**Example: Serving a Dataset that's Too Big to Duplicate** If you want to serve a dataset on IPFS but don't have enough storage space available to store a second copy of the data in your IPFS repository, you can use ipfs-pack to serve the dataset directly from its current location.

For both of these examples, if a situation comes up where you have to change or rearrange the dataset, you can always rebuild the pack. Note, however, that this will change the root hash of the pack, and if you remove or modify any of the dataset contents, they will cease to be available through your ipfs-pack node. _Because_ ipfs-pack doesn't copy the data into your IPFS repo, there's no backup copy. With the ipfs-pack approach, the files in your pack are the only copies of the data.

### The ipfs-pack Specification

Read [the ipfs-pack Spec](../../spec.md) to learn more about the design of ipfs-pack. Pay particular attention to the [terms](../../spec.md#terms) section to learn what we mean by "pack tree", "pack root", "pack repo" and "PackManifest".

## Next Steps

Now that you've read about ipfs-pack, either [install it](install-ipfs-pack.md) or proceed to [initialize a pack](initialize-a-pack.md).