ZSTD_GENERIC_ERROR when compressing large binary files #7
Comments
When reaching this line, if the test triggers the error code, Now, if that is the right scenario, it means the next step is to understand why the normalization would fail. It could be a very specific corner case that the fuzzer is unable to produce. It may sound like a long stretch, but would it be possible to access the faulty file for debugging? If the problem is the one described above, it means it's not related to the size of the file.
Sorry, I can't give out the faulty file since it contains some sensitive information. One thing to note is that after removing the line and compressing, then decompressing the file, the file is a perfect match for the original.
OK. Also, as a secondary question:
The comment "there are more symbols than the max symbol limit" was purely a guess from what I saw of the code. It's very likely I misread the code and the real issue is something else entirely. If it helps, I can give you some gdb output. For example, here are some variables from gdb when it stops at that line:
OK. "symbol" necessarily exits the loop at maxSymbolValue+1, so this part is correct. What is not correct is "position", which is supposed to end at "0". I've made a small update to FSE in the "dev" branch of FSE.
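For readers following along, here is a minimal sketch of why "position" is expected to end at 0. It is modeled loosely on the symbol-spreading loop of an FSE table builder, not copied from fse.c: the function name `spread_final_position` is hypothetical, and the handling of low-probability symbols is left out for brevity. The assumption is that the cursor advances by an odd step modulo the table size, so it returns to 0 only after exactly as many steps as there are table cells, that is, when the normalized counts sum to the table size.

```c
#include <assert.h>

/* Hypothetical helper (not part of FSE's API): spreads symbols over a table
 * of size (1 << tableLog) according to their normalized counts and returns
 * the final cursor position. Low-probability symbols are ignored here. */
static unsigned spread_final_position(const short *normalizedCounter,
                                      unsigned maxSymbolValue,
                                      unsigned tableLog,
                                      unsigned char *tableSymbol)
{
    const unsigned tableSize = 1u << tableLog;
    const unsigned tableMask = tableSize - 1;
    /* Odd step, so it is coprime with the power-of-two table size and
     * visits every cell exactly once before coming back to 0. */
    const unsigned step = (tableSize >> 1) + (tableSize >> 3) + 3;
    unsigned position = 0;
    unsigned symbol;

    assert(tableLog >= 1 && tableLog <= 12);  /* keep the sketch in a sane range */

    for (symbol = 0; symbol <= maxSymbolValue; symbol++) {
        int occurrence;
        for (occurrence = 0; occurrence < normalizedCounter[symbol]; occurrence++) {
            tableSymbol[position] = (unsigned char)symbol;
            position = (position + step) & tableMask;
        }
    }
    /* symbol is now maxSymbolValue + 1; position should be 0 if the
     * counts summed to tableSize. */
    return position;
}
```

With tableLog = 5 (32 cells) and counts {16, 8, 8}, the function returns 0. If the counts sum to 31 instead, for example {16, 8, 7}, the cursor stops on a non-zero cell, which is the kind of condition a consistency check after the loop would report as an error.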
When using the dev branch of FSE, compression works.
Thanks for the feedback.
Fix integrated into zstd "dev" branch.
Merged into master.
When compressing a large 6 GB binary file, a compression error occurs. After some investigation, it appears to happen when there are more symbols than the max symbol limit. The line is here:
https://github.com/Cyan4973/zstd/blob/master/lib/fse.c#L1458
A simple temporary fix is just deleting this line, but my guess is that this isn't a good solution. Increasing the max symbol limit didn't seem to work, but I'm not that familiar with the code base so I'm sure I missed something.