
Description
go-libp2p-net.MessageSizeMax puts an upper limit of ~4 MiB on the size of messages on a libp2p protocol stream: https://github.com/libp2p/go-libp2p-net/blob/70a8d93f2d8c33b5c1a5f6cc4d2aea21663a264c/interface.go#L20
That means Bitswap will refuse to transfer blocks that are bigger than (1 << 22) - bitswapMsgHeaderLength, while locally these blocks are usable just fine. In unixfs that limit is fine, because we apply chunking. In ipld-git, however, we can't apply chunking because we must retain the object's original hash. It's quite common to have files larger than 4 MiB in a Git repository, so we should come up with a way forward pretty soon.
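To make the arithmetic concrete, here is a minimal sketch of the size check described above. The names messageSizeMax and fitsInMessage are made up for illustration, and headerLen stands in for bitswapMsgHeaderLength, whose actual value isn't stated here.

```go
package main

// messageSizeMax mirrors the 1 << 22 byte (4 MiB) limit described above.
// It is a local stand-in, not the constant imported from go-libp2p-net.
const messageSizeMax = 1 << 22

// fitsInMessage reports whether a block of blockLen bytes can travel in a
// single message when headerLen bytes are reserved for framing.
// headerLen is a placeholder for bitswapMsgHeaderLength (value assumed).
func fitsInMessage(blockLen, headerLen int) bool {
	return blockLen <= messageSizeMax-headerLen
}
```

With any non-zero header, a Git blob of exactly 4 MiB already fails this check, even though it can be stored and read locally without trouble.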
Here are four options:
- Leave it as is. Very unsatisfactory.
- Make MessageSizeMax configurable. Better, but still far from satisfactory.
- Make Bitswap capable of message fragmentation. The size limit exists mainly to prevent memory exhaustion due to reading big messages and not being able to verify and store them as we go. We could teach Bitswap how to verify and temporarily store fragmented messages. This would end up overly complex though, since these fragments are not IPLD blocks, and thus can't reuse the stuff we already have.
- Introduce some kind of "virtual" blocks, which look similar to our existing chunking data structures, but whose hash is derived from the concatenated contents of their children. This is of course hacky because we can't verify the virtual block until we have fetched all children, but it lets us do the fragmentation from the previous option while reusing IPLD and the repo, and we can verify the children as we go.
Related issues: ipfs/kubo#4473 ipfs/kubo#3155 (and slightly less related ipfs/kubo#4280 ipfs/kubo#4378)