This repository was archived by the owner on Mar 1, 2024. It is now read-only.
deflate: avoid use of uninitialized variable #4
Closed
Uses SSE2 subtraction with saturation to shift the hash in 16B chunks. Renames the old fill_window implementation to fill_window_c(), and adds a new fill_window_sse() implementation in fill_window_sse.c. Moves UPDATE_HASH into deflate.h and changes the scope of read_buf from local to ZLIB_INTERNAL for sharing between the two implementations. Updates the configure script to check for SSE2 intrinsics and enables this optimization by default on x86. The runtime check for SSE2 support only occurs on 32-bit, as x86_64 requires SSE2. Adds an explicit rule in Makefile.in to build fill_window_sse.c with the -msse2 compiler flag, which is required for SSE2 intrinsics.
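A minimal sketch of the saturating-subtract idea described above, not the code from fill_window_sse.c. The helper name `slide_hash_table` and its signature are hypothetical; it assumes 16-bit hash entries and falls back to a scalar loop when SSE2 is not available at compile time.

```c
#include <stdint.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: subtract wsize from every 16-bit table entry,
 * clamping at zero, so entries that point before the new window start
 * become 0 (the "no match" value). */
static void slide_hash_table(uint16_t *table, size_t n, uint16_t wsize)
{
    size_t i = 0;
#if defined(__SSE2__)
    /* Eight 16-bit entries (16 bytes) per iteration. */
    const __m128i delta = _mm_set1_epi16((short)wsize);
    for (; i + 8 <= n; i += 8) {
        __m128i v = _mm_loadu_si128((const __m128i *)(table + i));
        v = _mm_subs_epu16(v, delta);   /* unsigned subtract with saturation */
        _mm_storeu_si128((__m128i *)(table + i), v);
    }
#endif
    /* Scalar tail, and the whole table when SSE2 is unavailable. */
    for (; i < n; i++)
        table[i] = (uint16_t)(table[i] >= wsize ? table[i] - wsize : 0);
}
```

Saturation replaces the compare-and-branch of the scalar loop: any entry smaller than `wsize` clamps to zero in the same instruction that performs the subtraction.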
For systems supporting SSE4.2, use the crc32 instruction as a fast hash function. Also, provide a better fallback hash. For both new hash functions, we hash 4 bytes, instead of 3, for certain levels. This shortens the hash chains, and also improves the quality of each hash entry.
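As a rough illustration of hashing 4 bytes with the crc32 instruction, the sketch below uses the `_mm_crc32_u32` intrinsic when the compiler targets SSE4.2. The name `hash4`, the 15-bit table size, and the multiplicative fallback are assumptions made for this example; the fallback is a generic stand-in, not the fallback hash from the patch.

```c
#include <stdint.h>
#include <string.h>
#if defined(__SSE4_2__)
#include <nmmintrin.h>
#endif

#define HASH_BITS 15                        /* assumed table size: 2^15 entries */
#define HASH_MASK ((1u << HASH_BITS) - 1u)

/* Hypothetical helper: hash the 4 bytes starting at p into a table index. */
static uint32_t hash4(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);                /* safe unaligned 4-byte load */
#if defined(__SSE4_2__)
    return _mm_crc32_u32(0, v) & HASH_MASK; /* hardware CRC32 as the hash */
#else
    /* Stand-in multiplicative hash; keeps the top HASH_BITS bits. */
    return (v * 2654435761u) >> (32 - HASH_BITS);
#endif
}
```

Because the index depends on four input bytes rather than three, strings that share only a 3-byte prefix tend to land in different chains, which is the chain-shortening effect the commit message describes.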
Rather than copy the input data from strm->next_in into the window and then compute the CRC, this patch combines these two steps into one. It performs an SSE memory copy, while folding the data down in the SSE registers. A final step is added, when we write the gzip trailer, to reduce the 4 SSE registers to 32 bits. Adds some extra padding bytes to the window to allow for SSE partial writes.
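The structure of the combined pass (copy and checksum together, with the final reduction deferred to trailer time) can be sketched in scalar C. This is only a stand-in: it uses a plain bitwise CRC-32 instead of SSE folding, and the names `copy_with_crc`, `crc_finish`, and `CRC_INIT` are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

#define CRC_INIT 0xFFFFFFFFu   /* starting state for the running CRC-32 */

/* Copy len bytes from src to dst and update the running CRC state in the
 * same pass, so the input is read only once. */
static uint32_t copy_with_crc(unsigned char *dst, const unsigned char *src,
                              size_t len, uint32_t state)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char b = src[i];
        dst[i] = b;                         /* the copy */
        state ^= b;                         /* the checksum update */
        for (int k = 0; k < 8; k++)
            state = (state >> 1) ^ (0xEDB88320u & (0u - (state & 1u)));
    }
    return state;
}

/* Final step, done once when the trailer is written. */
static uint32_t crc_finish(uint32_t state)
{
    return state ^ 0xFFFFFFFFu;
}
```

A caller would start from `CRC_INIT`, call `copy_with_crc` once per input chunk, and call `crc_finish` only when emitting the trailer, which mirrors the deferred reduction described above.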
The deflate_quick strategy is designed to provide maximum deflate performance. deflate_quick achieves this through:
- only checking the first hash match
- using a small inline SSE4.2-optimized longest_match
- forcing a window size of 8K, and using a precomputed dist/len table
- forcing the static Huffman tree and emitting codes immediately instead of tallying

This patch changes the scope of flush_pending, bi_windup, and static_ltree to ZLIB_INTERNAL and moves END_BLOCK, send_code, put_short, and send_bits to deflate.h. Updates the configure script to enable this strategy by default on x86. On systems without SSE4.2, the fallback is the deflate_fast strategy.
From: Arjan van de Ven <[email protected]> As the name suggests, the deflate_medium deflate strategy is designed to provide an intermediate strategy between deflate_fast and deflate_slow. After finding two adjacent matches, deflate_medium scans left from the second match in order to determine whether a better match can be formed.
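The "scan left" step might be pictured as follows. This is an illustrative sketch only: `extend_match_left` and its parameters are hypothetical, and the real deflate_medium logic involves additional bookkeeping not shown here.

```c
#include <stddef.h>

/* Hypothetical helper: given a match where the string at win[cur] repeats
 * the earlier string at win[match] (match < cur), count how many bytes
 * immediately to the left of both positions also agree, up to limit. */
static size_t extend_match_left(const unsigned char *win, size_t cur,
                                size_t match, size_t limit)
{
    size_t moved = 0;
    while (moved < limit && match > moved &&
           win[cur - 1 - moved] == win[match - 1 - moved])
        moved++;
    return moved;
}
```

If the returned count is nonzero, the second match could start that many bytes earlier and grow by the same amount, at the cost of shortening whatever was emitted just before it.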
(Note emit_match() doesn't currently use the value at all.)
Hi Nathan, Thank you for your contribution. I'll merge this, but could you first verify that you're releasing this patch under the Zlib license? Thanks.
On 14-05-27 02:48 PM, jtkukunas wrote:
Yes. Released under zlib license (if copyright can apply to 4 characters) :) -Nathan
Heh. The reason that I ask is because the master branch tracks the patchset revisions I send to the zlib-devel mailing list. So regardless of the size of the patch, I needed to double check. Thanks.
Merged into master.
jtkukunas pushed a commit that referenced this pull request, Jun 4, 2014: (Note emit_match() doesn't currently use the value at all.) Fixes #4
jtkukunas pushed a commit that referenced this pull request, Jun 16, 2014: (Note emit_match() doesn't currently use the value at all.) Fixes #4
jtkukunas pushed a commit that referenced this pull request, Jul 26, 2014: (Note emit_match() doesn't currently use the value at all.) Fixes #4
jtkukunas pushed a commit that referenced this pull request, Jun 21, 2018: (Note emit_match() doesn't currently use the value at all.) Fixes #4
busykai pushed a commit to busykai/zlib that referenced this pull request, Jan 26, 2022: (Note emit_match() doesn't currently use the value at all.) Fixes intel#4
busykai pushed a commit to busykai/zlib that referenced this pull request, Jan 28, 2022: (Note emit_match() doesn't currently use the value at all.) Fixes intel#4
jtkukunas pushed a commit that referenced this pull request, Apr 15, 2022: (Note emit_match() doesn't currently use the value at all.) Fixes #4
jtkukunas pushed a commit that referenced this pull request, Apr 25, 2022: (Note emit_match() doesn't currently use the value at all.) Fixes #4
jtkukunas pushed a commit that referenced this pull request, Aug 29, 2022: (Note emit_match() doesn't currently use the value at all.) Fixes #4
busykai pushed a commit to busykai/zlib that referenced this pull request, Nov 9, 2022: (Note emit_match() doesn't currently use the value at all.) Fixes intel#4
busykai pushed a commit that referenced this pull request, Nov 19, 2023: (Note emit_match() doesn't currently use the value at all.) Fixes #4