get/import: can't import directory #4079

Closed
efiop opened this issue Jun 22, 2020 · 8 comments · Fixed by #4083
Labels: bug (Did we break something?)


@efiop
Contributor

efiop commented Jun 22, 2020

$ dvc get https://github.com/iterative/dvc scripts
ERROR: unexpected error - Could not accommodate requested object type 'tree', got commit

And with -v it hangs on computing hashes 😱

ghost added the triage (Needs to be triaged) label Jun 22, 2020
efiop added the bug (Did we break something?) and p0-critical labels Jun 22, 2020
ghost removed the triage (Needs to be triaged) label Jun 22, 2020
@skshetry
Collaborator

skshetry commented Jun 22, 2020

This has been happening on my machine for a long time. Because it was intermittent, I assumed something was wrong with my setup.

I get different sets of error messages. Some of them are:

SHA b'parent' could not be resolved, git returned: b'parent d3145f92171b593748cc36452455dd099c49e239'
ValueError: Could not accommodate requested object type 'tree', got commit
IndexError: index out of range
ValueError: Failed to parse header: b'40000 .dvc\x00t\xa8Y3vH\xb7\x80\xb5\x97b\x02Y\xdb\\\x8d\xae<+\x1f100644 .gitignore\x00u^R\n'

This looks like an issue with GitPython. I found at least two similar issues: gitpython-developers/GitPython#1016 and gitpython-developers/GitPython#584

@efiop
Contributor Author

efiop commented Jun 22, 2020

I am able to reproduce it if I change the directories in the command, although I get different errors. Very interesting.

@efiop
Contributor Author

efiop commented Jun 22, 2020

OK, I think I found it. There is a bug in GitPython where `cat-file --batch` returns output in an unexpected format. Looking into it...
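
For readers unfamiliar with the format: `git cat-file --batch` writes each object as a header line `<sha> <type> <size>`, followed by the contents and a trailing newline. The sketch below is a simplified stand-in, not GitPython's actual parser; `read_batch_object` and the placeholder SHA are hypothetical. It shows how a reader that starts in the middle of an object (here, inside raw tree data like the bytes quoted above) ends up with a "Failed to parse header" style error.

```python
import io


def read_batch_object(stream):
    """Read one `<sha> <type> <size>\\n<contents>\\n` record from a binary stream.

    Hypothetical helper for illustration only.
    """
    header = stream.readline()
    parts = header.split()
    if len(parts) != 3:
        raise ValueError(f"Failed to parse header: {header!r}")
    hexsha, obj_type, size = parts[0], parts[1], int(parts[2])
    data = stream.read(size)
    stream.read(1)  # consume the newline that follows the contents
    return hexsha, obj_type, data


PLACEHOLDER_SHA = b"a" * 40  # not a real object id

# A well-formed record parses cleanly.
good = io.BytesIO(PLACEHOLDER_SHA + b" blob 6\n" + b"hello\n" + b"\n")
print(read_batch_object(good))

# A reader that is out of sync sees object contents where a header should be.
out_of_sync = io.BytesIO(b"40000 .dvc\x00t\xa8Y3vH\n")
try:
    read_batch_object(out_of_sync)
except ValueError as exc:
    print(exc)
```

Anything that lets two consumers read from the same stream at once can put the parser in the second state, which would be consistent with the varied errors reported above.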

@skshetry
Collaborator

@efiop, I tried running this on 0.94, and it did not fail once. I was able to bisect the problem to 9ead641. Since that commit, save_info is passed a tree that is a RepoTree instance. This felt wrong to me, as it was not the case before that particular commit, but I am too tired to debug right now.

Maybe this is why it has started hitting GitTree (which it should not have? 😕)?

@efiop
Contributor Author

efiop commented Jun 22, 2020

We are computing the checksum in a thread pool, while stream_object_data is not thread safe. Looks like that is the cause.
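
To illustrate the failure mode and one common mitigation, here is a minimal sketch. It is not the fix that landed for this issue, and it does not use GitPython: `SharedStreamReader` is a hypothetical stand-in for any reader backed by a single shared stream, where each read is a seek followed by a read. Holding a lock around that pair keeps worker threads from interleaving.

```python
import hashlib
import io
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedStreamReader:
    """Hypothetical reader: all objects live in one shared stream."""

    def __init__(self, objects):
        self._stream = io.BytesIO()
        self._index = {}
        for name, data in objects.items():
            # Remember where each object starts and how long it is.
            self._index[name] = (self._stream.tell(), len(data))
            self._stream.write(data)
        self._lock = threading.Lock()

    def read(self, name):
        offset, size = self._index[name]
        # seek + read must happen as one unit; without the lock another
        # thread could move the stream position between the two calls.
        with self._lock:
            self._stream.seek(offset)
            return self._stream.read(size)


def md5_of(reader, name):
    # Hashing runs outside the lock, so only the read is serialized.
    return hashlib.md5(reader.read(name)).hexdigest()


objects = {f"file{i}": f"contents {i}".encode() * 100 for i in range(50)}
reader = SharedStreamReader(objects)

with ThreadPoolExecutor(max_workers=8) as pool:
    hashes = dict(zip(objects, pool.map(lambda n: md5_of(reader, n), objects)))

expected = {name: hashlib.md5(data).hexdigest() for name, data in objects.items()}
print(hashes == expected)
```

The alternative is to avoid sharing altogether, for example by doing the reads on a single thread or by giving each worker its own reader; which trade-off is right depends on how expensive the reads are relative to the hashing.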

@efiop
Contributor Author

efiop commented Jun 22, 2020

@skshetry Yeah, that's the one for sure. Nothing inherently wrong with it, we just forgot that some GitPython methods are not thread safe. Looking into solving it somehow... Thanks for bisecting! Have a good rest 🙂

@skshetry
Collaborator

@efiop, okay, but shouldn't it read through the checked-out directory rather than git objects?

@efiop
Contributor Author

efiop commented Jun 22, 2020

@skshetry No, we are trying to move away from checking out git repos and toward doing everything in memory instead. So we are on the right path.

efiop added a commit to efiop/dvc that referenced this issue Jun 22, 2020
GitPython is not threadsafe, which was causing issues when we were
computing hash for a directory.

Fixes iterative#4079

Related to tree generalization iterative#4050
efiop added a commit that referenced this issue Jun 22, 2020
GitPython is not threadsafe, which was causing issues when we were
computing hash for a directory.

Fixes #4079

Related to tree generalization #4050