This repository was archived by the owner on Mar 1, 2024. It is now read-only.

deflate: avoid use of uninitialized variable #4

Closed
wants to merge 6 commits into from


nathankidd
Contributor

(Note emit_match() doesn't currently use the value at all.)

jtkukunas and others added 6 commits January 17, 2014 13:12
Uses SSE2 subtraction with saturation to shift the hash in
16B chunks. Renames the old fill_window implementation to
fill_window_c(), and adds a new fill_window_sse() implementation
in fill_window_sse.c.
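
The saturating-subtraction idea can be sketched as follows. This is a minimal illustration, not the code in fill_window_sse.c: the helper name `slide_hash_entries` and its signature are assumptions made for the example, and a scalar loop is used when SSE2 is not available at compile time. Each 16-byte chunk holds eight 16-bit hash entries; subtracting the window size with unsigned saturation clamps entries that fall out of the window to 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper (not from the patch): slide every 16-bit hash
 * entry down by wsize, clamping at 0 so stale positions become "none". */
static void slide_hash_entries(uint16_t *table, size_t n, uint16_t wsize)
{
    size_t i = 0;

    assert(table != NULL || n == 0);
#if defined(__SSE2__)
    /* Broadcast wsize into all eight 16-bit lanes. */
    const __m128i delta = _mm_set1_epi16((short)wsize);
    for (; i + 8 <= n; i += 8) {
        __m128i v = _mm_loadu_si128((const __m128i *)(table + i));
        /* Unsigned saturating subtract: max(v - wsize, 0) per lane. */
        v = _mm_subs_epu16(v, delta);
        _mm_storeu_si128((__m128i *)(table + i), v);
    }
#endif
    /* Scalar tail, and the whole loop when SSE2 is unavailable. */
    for (; i < n; i++)
        table[i] = (uint16_t)(table[i] >= wsize ? table[i] - wsize : 0);
}
```

Because the saturating subtract already clamps at zero, the vector loop needs no per-entry comparison or branch.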

Moves UPDATE_HASH into deflate.h and changes the scope of
read_buf from local to ZLIB_INTERNAL for sharing between
the two implementations.

Updates the configure script to check for SSE2 intrinsics and enables
this optimization by default on x86. The runtime check for SSE2 support
only occurs on 32-bit, as x86_64 requires SSE2. Adds an explicit
rule in Makefile.in to build fill_window_sse.c with the -msse2 compiler
flag, which is required for SSE2 intrinsics.

For systems supporting SSE4.2, use the crc32 instruction as a fast
hash function. Also, provide a better fallback hash.

For both new hash functions, we hash 4 bytes, instead of 3, for certain
levels. This shortens the hash chains, and also improves the quality
of each hash entry.
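
A 4-byte hash along these lines might look like the sketch below. The function name `hash4`, the masking step, and the multiplicative constant in the fallback path are assumptions made for illustration; the commit message does not say which fallback hash the patch uses. The crc32 path is only compiled when the compiler targets SSE4.2.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE4_2__)
#include <nmmintrin.h>
#endif

/* Hypothetical helper: hash the 4 bytes at p into hash_bits bits. */
static uint32_t hash4(const unsigned char *p, unsigned hash_bits)
{
    uint32_t v;

    assert(hash_bits >= 1 && hash_bits <= 16);
    memcpy(&v, p, sizeof v);            /* unaligned-safe 4-byte load */
#if defined(__SSE4_2__)
    /* The crc32 instruction mixes all 32 input bits in one step. */
    return _mm_crc32_u32(0, v) & ((1u << hash_bits) - 1u);
#else
    /* Assumed fallback: multiplicative hash, keep the top hash_bits bits. */
    return (v * 2654435761u) >> (32 - hash_bits);
#endif
}
```

Hashing 4 bytes instead of 3 means two positions share a chain only when a longer prefix agrees, which is why the chains get shorter.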

Rather than copy the input data from strm->next_in into the window and
then compute the CRC, this patch combines these two steps into one. It
performs an SSE memory copy, while folding the data down in the SSE
registers. A final step is added, when we write the gzip trailer,
to reduce the 4 SSE registers to 32 bits.

Adds some extra padding bytes to the window to allow for SSE partial
writes.
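
The "copy and checksum in one pass" structure can be shown with a scalar stand-in. This sketch does not implement the SSE folding the patch describes; it uses the plain bitwise CRC-32 (the polynomial used by gzip) so that the single-pass shape is visible. The name `copy_with_crc32` is an assumption for the example, and the running-CRC convention mirrors zlib's `crc32()` (pass 0 to start).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy len bytes from src to dst and update the
 * running CRC-32 over the same bytes, touching each byte only once. */
static uint32_t copy_with_crc32(unsigned char *dst, const unsigned char *src,
                                size_t len, uint32_t crc)
{
    assert((dst != NULL && src != NULL) || len == 0);
    crc = ~crc;                          /* undo the final inversion */
    for (size_t i = 0; i < len; i++) {
        unsigned char b = src[i];
        dst[i] = b;                      /* the copy */
        crc ^= b;                        /* the checksum, same pass */
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}
```

The benefit in both versions is that the input is read from memory once rather than twice.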

The deflate_quick strategy is designed to provide maximum
deflate performance.

deflate_quick achieves this through:
    - only checking the first hash match
    - using a small inline SSE4.2-optimized longest_match
    - forcing a window size of 8K, and using a precomputed dist/len
      table
    - forcing the static Huffman tree and emitting codes immediately
      instead of tallying
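
To illustrate why a static tree allows codes to be emitted immediately: with the fixed Huffman code defined in RFC 1951, section 3.2.6, the code for every literal/length symbol is known in advance, so nothing has to be tallied before output. The sketch below computes that code directly; the type `fixed_code` and the function `fixed_litlen_code` are names invented for the example, not identifiers from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* A Huffman code and its length in bits (hypothetical type). */
typedef struct {
    uint16_t code;
    uint8_t  len;
} fixed_code;

/* Return the RFC 1951 fixed-tree code for literal/length symbol sym. */
static fixed_code fixed_litlen_code(unsigned sym)
{
    fixed_code c;

    assert(sym <= 287);
    if (sym <= 143) {            /* 8 bits: 00110000 .. 10111111 */
        c.code = (uint16_t)(0x30 + sym);
        c.len = 8;
    } else if (sym <= 255) {     /* 9 bits: 110010000 .. 111111111 */
        c.code = (uint16_t)(0x190 + (sym - 144));
        c.len = 9;
    } else if (sym <= 279) {     /* 7 bits: 0000000 .. 0010111 */
        c.code = (uint16_t)(sym - 256);
        c.len = 7;
    } else {                     /* 8 bits: 11000000 .. 11000111 */
        c.code = (uint16_t)(0xC0 + (sym - 280));
        c.len = 8;
    }
    return c;
}
```

Note that deflate packs Huffman codes most-significant bit first, so a real encoder would bit-reverse these values (or store them pre-reversed in a table) before writing them to its bit buffer.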

This patch changes the scope of flush_pending, bi_windup, and
static_ltree to ZLIB_INTERNAL and moves END_BLOCK, send_code,
put_short, and send_bits to deflate.h.

Updates the configure script to enable deflate_quick by default for x86.
On systems without SSE4.2, the fallback is the deflate_fast strategy.

From: Arjan van de Ven <[email protected]>

As the name suggests, the deflate_medium deflate strategy is designed
to provide an intermediate strategy between deflate_fast and deflate_slow.
After finding two adjacent matches, deflate_medium scans left from
the second match in order to determine whether a better match can be
formed.
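
The "scan left" step can be sketched as a backward comparison. This is only an illustration of the idea, assuming a flat window buffer; the function `extend_match_left` and its parameters are hypothetical and do not come from the patch. Given the start of the second match (`cur`) and the earlier position it matched against (`match`), it counts how many bytes immediately to the left of both positions also agree.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the bytes directly before cur and before
 * match that are equal, scanning leftwards, up to max_back bytes. */
static unsigned extend_match_left(const unsigned char *window,
                                  size_t cur, size_t match,
                                  unsigned max_back)
{
    unsigned n = 0;

    assert(match < cur);
    while (n < max_back &&
           match > n &&                              /* stay inside window */
           window[cur - n - 1] == window[match - n - 1])
        n++;
    return n;
}
```

If the count is nonzero, the second match could start that many bytes earlier and be that much longer, which is the kind of trade-off between two adjacent matches that the commit message describes.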

(Note emit_match() doesn't currently use the value at all.)
@jtkukunas
Contributor

Hi Nathan,

Thank you for your contribution. I'll merge this, but could you first verify that you're releasing this patch under the Zlib license?

Thanks.

@nathankidd
Contributor Author

On 14-05-27 02:48 PM, jtkukunas wrote:

> but could you first verify that you're releasing this patch under the
> Zlib license?

Yes. Released under zlib license (if copyright can apply to 4 characters) :)

-Nathan

@jtkukunas
Contributor

Heh. The reason that I ask is because the master branch tracks the patchset revisions I send to the zlib-devel mailing list. So regardless of the size of the patch, I needed to double check.

Thanks.

@jtkukunas
Contributor

Merged into master.

@jtkukunas jtkukunas closed this in b6f9a34 May 28, 2014
jtkukunas pushed a commit that referenced this pull request Jun 4, 2014
(Note emit_match() doesn't currently use the value at all.)

Fixes #4
jtkukunas pushed a commit that referenced this pull request Jun 16, 2014
jtkukunas pushed a commit that referenced this pull request Jul 26, 2014
jtkukunas pushed a commit that referenced this pull request Jun 21, 2018
busykai pushed a commit to busykai/zlib that referenced this pull request Jan 26, 2022
busykai pushed a commit to busykai/zlib that referenced this pull request Jan 28, 2022
jtkukunas pushed a commit that referenced this pull request Apr 15, 2022
jtkukunas pushed a commit that referenced this pull request Apr 25, 2022
jtkukunas pushed a commit that referenced this pull request Aug 29, 2022
busykai pushed a commit to busykai/zlib that referenced this pull request Nov 9, 2022
busykai pushed a commit that referenced this pull request Nov 19, 2023