
ipfs ls tries to get the contents inside the folder #4229

@wpfnihao


Version information:

go-ipfs version: 0.4.10-
Repo version: 5
System version: amd64/linux
Golang version: go1.8.3

Type:

Medium

Severity:

Low

Description:

Environment:
One folder containing 100,000 files, each 100KB in size.
Two servers connected by a 1Gb network.
The test folder was added on one server and requested from the other.

I tested the performance of ipfs ls. With --resolve-type=true, which is the default, ipfs ls appears to fetch the contents of the folder just to list the folder information (see Fig. 3 and Fig. 4).

I had a quick look at the go-ipfs code in go-ipfs/core/commands/ls.go, which confirms my guess:

// code snippet
output[i] = LsObject{
	Hash:  paths[i],
	Links: make([]LsLink, len(links)),
}

for j, link := range links {
	t := unixfspb.Data_DataType(-1)

	linkNode, err := link.GetNode(req.Context(), dserv)
	if err == merkledag.ErrNotFound && !resolve {
		// not an error
		linkNode = nil
	} else if err != nil {
		res.SetError(err, cmds.ErrNormal)
		return
	}

After that, I also tested ipfs ls --resolve-type=false and FUSE + native ls (see Fig. 1-2 and Fig. 5-6). The results show that with --resolve-type=false, ipfs only fetches the folder info block, while FUSE + native ls behaves similarly to ipfs ls --resolve-type=true.
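
To make the difference concrete, here is a minimal simulation of the two listing modes. It is a sketch, not go-ipfs code: `link`, `countingStore`, `build`, and `list` are hypothetical names, the store only tracks block sizes, and it assumes each 100KB file (taken as 100,000 bytes) fits in a single block so that fetching a child node means fetching the whole file.

```go
package main

import "fmt"

// link is a hypothetical stand-in for a directory entry.
type link struct {
	Name string
	Hash string
}

// countingStore is a hypothetical block store that records how many
// child blocks and bytes were requested. It stores sizes only, not data.
type countingStore struct {
	sizes   map[string]int64
	fetches int
	bytes   int64
}

// get simulates fetching one block from the remote peer.
func (s *countingStore) get(hash string) {
	s.fetches++
	s.bytes += s.sizes[hash]
}

// build creates a flat directory of nFiles entries, each fileSize bytes.
func build(nFiles int, fileSize int64) (*countingStore, []link) {
	s := &countingStore{sizes: make(map[string]int64, nFiles)}
	links := make([]link, nFiles)
	for i := range links {
		h := fmt.Sprintf("hash-%d", i)
		s.sizes[h] = fileSize
		links[i] = link{Name: fmt.Sprintf("file-%d", i), Hash: h}
	}
	return s, links
}

// list mimics the loop in ls.go: when resolveType is set, every child
// block is fetched only to learn its type; otherwise the names already
// present in the directory block are enough.
func list(s *countingStore, links []link, resolveType bool) []string {
	names := make([]string, 0, len(links))
	for _, l := range links {
		if resolveType {
			s.get(l.Hash)
		}
		names = append(names, l.Name)
	}
	return names
}

func main() {
	for _, resolve := range []bool{true, false} {
		s, links := build(100000, 100000)
		names := list(s, links, resolve)
		fmt.Printf("resolve-type=%v: %d entries, %d child fetches, %d bytes\n",
			resolve, len(names), s.fetches, s.bytes)
	}
}
```

Under these assumptions the resolving mode requests 100,000 child blocks totalling 10,000,000,000 bytes, while the non-resolving mode requests none beyond the directory block itself (not modelled here). If the 1Gb network means 1 Gb/s, that is at least about 80 seconds of transfer at line rate just to list the folder.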

I know that ipfs needs the linked blocks to resolve the data type (in go-ipfs/unixfs/pb/unixfs.pb.go):

type Data_DataType int32

const (
	Data_Raw       Data_DataType = 0
	Data_Directory Data_DataType = 1
	Data_File      Data_DataType = 2
	Data_Metadata  Data_DataType = 3
	Data_Symlink   Data_DataType = 4
	Data_HAMTShard Data_DataType = 5
)

My questions:

  1. In some cases, I only need the folder info and then want to fetch just a few files in the folder. However, ipfs downloads the entire folder, which can be extremely slow.
  2. Do you have any plans to store the data type information along with the hash links, so that dealing with folders is less cumbersome?
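
The idea in question 2 could look roughly like the sketch below. This is purely illustrative and not an existing go-ipfs structure: `TypedLink` and `describe` are hypothetical, `DataType` only mirrors the values of the enum quoted above, and the hashes are placeholders. The point is that a listing with types can be produced from the directory block alone.

```go
package main

import "fmt"

// DataType mirrors the values of the unixfs enum quoted above.
type DataType int32

const (
	Raw DataType = iota
	Directory
	File
	Metadata
	Symlink
	HAMTShard
)

// String returns a readable name for the type, or "Unknown".
func (t DataType) String() string {
	names := [...]string{"Raw", "Directory", "File", "Metadata", "Symlink", "HAMTShard"}
	if t < 0 || int(t) >= len(names) {
		return "Unknown"
	}
	return names[t]
}

// TypedLink is a hypothetical directory link that records the type of
// its target next to the hash, so no child block has to be fetched.
type TypedLink struct {
	Name string
	Hash string
	Type DataType
}

// describe formats a listing using only data stored in the links.
func describe(links []TypedLink) []string {
	out := make([]string, len(links))
	for i, l := range links {
		out[i] = fmt.Sprintf("%s\t%s\t%s", l.Type, l.Hash, l.Name)
	}
	return out
}

func main() {
	links := []TypedLink{
		{Name: "a.txt", Hash: "QmA", Type: File}, // placeholder hashes
		{Name: "sub", Hash: "QmB", Type: Directory},
	}
	for _, line := range describe(links) {
		fmt.Println(line)
	}
}
```

The trade-off in such a design is that the type is duplicated in the parent, which makes each directory entry slightly larger in exchange for avoiding one block fetch per entry when listing.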

Thank you.

Related to #3120

Fig. 1: ipfs ls --resolve-type=false Sender
ipfs-ls-resolve-false-135-to-137-100k-100k/sys_fig_135.jpg

Fig. 2: ipfs ls --resolve-type=false Receiver
ipfs-ls-resolve-false-135-to-137-100k-100k/sys_fig_137.jpg

Fig. 3: ipfs ls --resolve-type=true Sender
ipfs-ls-135-to-137-100k-100k/sy_fig_135.jpg

Fig. 4: ipfs ls --resolve-type=true Receiver
ipfs-ls-135-to-137-100k-100k/sys_fig.jpg

Fig. 5: ls with FUSE Sender
fuse-ls-135-to-137-100k-100k/sys_fig_135.jpg

Fig. 6: ls with FUSE Receiver
fuse-ls-135-to-137-100k-100k/sys_fig.jpg
