perf: concurrent file import #41
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -124,26 +124,26 @@ The input's file paths and directory structure will be preserved in the [`dag-pb | |
- `chunker` (string, defaults to `"fixed"`): the chunking strategy. Supports: | ||
- `fixed` | ||
- `rabin` | ||
- `chunkerOptions` (object, optional): the options for the chunker. Defaults to an object with the following properties: | ||
- `avgChunkSize` (positive integer, defaults to `262144`): the average chunk size (rabin chunker only) | ||
- `minChunkSize` (positive integer): the minimum chunk size (rabin chunker only) | ||
- `maxChunkSize` (positive integer, defaults to `262144`): the maximum chunk size | ||
- `avgChunkSize` (positive integer, defaults to `262144`): the average chunk size (rabin chunker only) | ||
- `minChunkSize` (positive integer): the minimum chunk size (rabin chunker only) | ||
- `maxChunkSize` (positive integer, defaults to `262144`): the maximum chunk size | ||
Just a mistake. Internally it's been flattened for ages so the |
||
- `strategy` (string, defaults to `"balanced"`): the DAG builder strategy name. Supports: | ||
- `flat`: flat list of chunks | ||
- `balanced`: builds a balanced tree | ||
- `trickle`: builds [a trickle tree](https://github.com/ipfs/specs/pull/57#issuecomment-265205384) | ||
- `maxChildrenPerNode` (positive integer, defaults to `174`): the maximum children per node for the `balanced` and `trickle` DAG builder strategies | ||
- `layerRepeat` (positive integer, defaults to 4): (only applicable to the `trickle` DAG builder strategy). The maximum repetition of parent nodes for each layer of the tree. | ||
- `reduceSingleLeafToSelf` (boolean, defaults to `true`): optimization that, when reducing a set containing a single node, reduces it to that node itself. | ||
- `dirBuilder` (object): the options for the directory builder | ||
- `hamt` (object): the options for the HAMT sharded directory builder | ||
- bits (positive integer, defaults to `8`): the number of bits at each bucket of the HAMT | ||
- `hamtHashFn` (async function(string) Buffer): a function that hashes file names to create HAMT shards | ||
- `hamtBucketBits` (positive integer, defaults to `8`): the number of bits at each bucket of the HAMT | ||
- `progress` (function): a function that will be called with the byte length of chunks as a file is added to ipfs. | ||
- `onlyHash` (boolean, defaults to false): Only chunk and hash - do not write to disk | ||
- `hashAlg` (string): multihash hashing algorithm to use | ||
- `cidVersion` (integer, default 0): the CID version to use when storing the data (storage keys are based on the CID, _including_ its version) | ||
- `rawLeaves` (boolean, defaults to false): When a file would span multiple DAGNodes, if this is true the leaf nodes will not be wrapped in `UnixFS` protobufs and will instead contain the raw file bytes | ||
- `leafType` (string, defaults to `'file'`) what type of UnixFS node leaves should be - can be `'file'` or `'raw'` (ignored when `rawLeaves` is `true`) | ||
- `blockWriteConcurrency` (positive integer, defaults to 10) How many blocks to hash and write to the block store concurrently. For small numbers of large files this should be high (e.g. 50). | ||
- `fileImportConcurrency` (number, defaults to 50) How many files to import concurrently. For large numbers of small files this should be high (e.g. 50). | ||
|
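To make the `fixed` chunker and `maxChunkSize` option above concrete, here is a minimal sketch of fixed-size chunking over an async iterable of Buffers. `fixedSizeChunks` is a hypothetical helper written for illustration; it is not the importer's own chunker and may differ from it in detail.

```javascript
// Hypothetical helper, not the importer's own chunker: re-slice an async
// iterable of Buffers into chunks of at most `maxChunkSize` bytes.
async function * fixedSizeChunks (source, maxChunkSize = 262144) {
  let pending = Buffer.alloc(0)

  for await (const buf of source) {
    pending = Buffer.concat([pending, buf])

    // Emit as many full-size chunks as the buffered bytes allow
    while (pending.length >= maxChunkSize) {
      yield pending.slice(0, maxChunkSize)
      pending = pending.slice(maxChunkSize)
    }
  }

  // Whatever is left over becomes the final, shorter chunk
  if (pending.length > 0) {
    yield pending
  }
}
```

Every chunk except possibly the last is exactly `maxChunkSize` bytes long, which is why only that one option applies to this strategy.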
||
[ipld-resolver instance]: https://github.com/ipld/js-ipld-resolver | ||
[UnixFS]: https://github.com/ipfs/specs/tree/master/unixfs | ||
|
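The `blockWriteConcurrency` and `fileImportConcurrency` options described in the README bound how much work is in flight at once. As a rough illustration of that idea, the sketch below runs tasks in batches of a fixed size and yields results in input order. `runInBatches` is a hypothetical name, and this is only an assumption about the general approach, not a copy of the importer's internals or of `it-parallel-batch`.

```javascript
// Hypothetical helper: run up to `concurrency` tasks at a time and yield
// their results in the original order. `tasks` is a sync or async iterable
// of functions that each return a promise.
async function * runInBatches (tasks, concurrency = 10) {
  let batch = []

  for await (const task of tasks) {
    batch.push(task)

    if (batch.length === concurrency) {
      // Start every task in the batch, then wait for all of them
      const results = await Promise.all(batch.map(fn => fn()))
      yield * results
      batch = []
    }
  }

  // Flush the final, possibly smaller, batch
  if (batch.length > 0) {
    const results = await Promise.all(batch.map(fn => fn()))
    yield * results
  }
}
```

A higher limit keeps more hashing and block-store writes overlapping, at the cost of holding more data in memory at the same time.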
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -38,33 +38,34 @@ | |
"homepage": "https://github.com/ipfs/js-ipfs-unixfs-importer#readme", | ||
"devDependencies": { | ||
"aegir": "^20.0.0", | ||
"async-iterator-buffer-stream": "^1.0.0", | ||
"async-iterator-last": "^1.0.0", | ||
"chai": "^4.2.0", | ||
"cids": "~0.7.1", | ||
"deep-extend": "~0.6.0", | ||
"detect-node": "^2.0.4", | ||
"dirty-chai": "^2.0.1", | ||
"ipfs-unixfs-exporter": "^0.39.0", | ||
"ipld": "^0.25.0", | ||
"ipld-in-memory": "^3.0.0", | ||
"it-buffer-stream": "^1.0.0", | ||
"it-last": "^1.0.0", | ||
"multihashes": "~0.4.14", | ||
"nyc": "^14.0.0", | ||
"sinon": "^7.1.0" | ||
}, | ||
"dependencies": { | ||
"async-iterator-all": "^1.0.0", | ||
"async-iterator-batch": "~0.0.1", | ||
"async-iterator-first": "^1.0.0", | ||
It would be good to get PRs to
Sure - I'll do that in the PR that adds support for the new options |
||
"bl": "^4.0.0", | ||
"deep-extend": "~0.6.0", | ||
"err-code": "^2.0.0", | ||
"hamt-sharding": "~0.0.2", | ||
"ipfs-unixfs": "^0.2.0", | ||
"ipld-dag-pb": "^0.18.0", | ||
"it-all": "^1.0.1", | ||
"it-batch": "^1.0.3", | ||
"it-first": "^1.0.1", | ||
"it-parallel-batch": "1.0.2", | ||
"merge-options": "^2.0.0", | ||
"multicodec": "~0.5.1", | ||
"multihashing-async": "^0.8.0", | ||
"rabin-wasm": "~0.0.8", | ||
"superstruct": "^0.8.2" | ||
"rabin-wasm": "~0.0.8" | ||
}, | ||
"contributors": [ | ||
"Alan Shaw <[email protected]>", | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,13 +1,7 @@ | ||
'use strict' | ||
|
||
const batch = require('async-iterator-batch') | ||
const all = require('it-all') | ||
|
||
module.exports = async function * (source, reduce) { | ||
const roots = [] | ||
|
||
for await (const chunk of batch(source, Infinity)) { | ||
roots.push(await reduce(chunk)) | ||
} | ||
|
||
yield roots[0] | ||
yield await reduce(await all(source)) | ||
} |
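For readers following the change to the flat builder, here is a self-contained sketch of the new shape: collect every leaf from the source, then reduce them once into a single root. The `all` function is a hypothetical stand-in for `it-all`, and `exampleReduce` is an invented reducer used only for illustration.

```javascript
// Hypothetical stand-in for `it-all`: collect an async iterable into an array
async function all (source) {
  const items = []
  for await (const item of source) {
    items.push(item)
  }
  return items
}

// Same shape as the builder in the diff: yields one reduced root
async function * flat (source, reduce) {
  yield await reduce(await all(source))
}

// Invented reducer for illustration: wraps the leaves in one parent object
async function exampleReduce (leaves) {
  return {
    children: leaves,
    size: leaves.reduce((total, leaf) => total + leaf.size, 0)
  }
}
```

Iterating `flat(leaves, exampleReduce)` with `for await` produces exactly one value: the parent that holds every leaf.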
Please can we get an accompanying PR to js-ipfs?