pitrou
Member

@pitrou pitrou commented Oct 7, 2025

Rationale for this change

Fix issues found by OSS-Fuzz when invalid Parquet data is fed to the Parquet reader:

Are these changes tested?

Yes, using the updated fuzz regression files from apache/arrow-testing#115

Are there any user-facing changes?

No.

This PR contains a "Critical Fix". (If the changes fix either (a) a security vulnerability, (b) a bug that caused incorrect or invalid data to be produced, or (c) a bug that causes a crash (even when the API contract is upheld), please provide explanation. If not, you can remove this.)

@pitrou pitrou requested a review from wgtmac as a code owner October 7, 2025 13:57
@pitrou
Member Author

pitrou commented Oct 7, 2025

@github-actions crossbow submit -g cpp

@pitrou
Member Author

pitrou commented Oct 7, 2025

@AntoinePrv Would you like to take a look?

@github-actions github-actions bot added the "awaiting review" label Oct 7, 2025
@pitrou pitrou requested review from mapleFU and adamreeve October 7, 2025 13:58
// There may be remaining null if they are not greedily filled by either decoder calls
check_and_handle_fully_null_remaining();

ARROW_DCHECK(batch.is_done() || exhausted());
Member Author

This check could trigger if the RLE-bit-packed data is invalid (for example a run of invalid size). @AntoinePrv
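
To illustrate the concern, here is a minimal sketch of the general pattern, assuming a much simpler decoder than the one in this PR. `Run`, `DecodeRuns`, and the error strings are hypothetical names, not Arrow's actual API. The idea is that any condition that depends on file contents (such as a run's size) is validated and reported as an error, while a debug assertion is kept only for invariants that follow from the code's own arithmetic.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical run descriptor for illustration; not Arrow's actual type.
struct Run {
  int64_t length;  // number of repeated values claimed by the encoded data
  int32_t value;
};

// Expands runs into `out` until `wanted` values are produced or the runs are
// exhausted. Returns false and sets `error` if the request or a run is invalid.
bool DecodeRuns(const std::vector<Run>& runs, int64_t wanted,
                std::vector<int32_t>* out, std::string* error) {
  if (wanted < 0) {
    *error = "negative value count";
    return false;
  }
  int64_t remaining = wanted;
  for (const Run& run : runs) {
    if (remaining == 0) break;
    // The run length comes from untrusted input: validate it and report an
    // error instead of asserting on it.
    if (run.length <= 0) {
      *error = "invalid run length";
      return false;
    }
    const int64_t n = run.length < remaining ? run.length : remaining;
    out->insert(out->end(), static_cast<size_t>(n), run.value);
    remaining -= n;
  }
  // This invariant depends only on the arithmetic above, not on the input,
  // so a debug assertion is appropriate here.
  assert(remaining >= 0 && remaining <= wanted);
  return true;
}
```

With this shape, a fuzzer-generated file containing a zero-sized run produces an error return in both debug and release builds, rather than an assertion failure in debug builds and unchecked behavior in release builds.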

@pitrou pitrou changed the title from "GH-47740: [C++][Parquet] Fix dangerous behavior when reading invalid Parquet data" to "GH-47740: [C++][Parquet] Fix undefined behavior when reading invalid Parquet data" Oct 7, 2025

github-actions bot commented Oct 7, 2025

Revision: d620685

Submitted crossbow builds: ursacomputing/crossbow @ actions-0059c16459

Task Status
example-cpp-minimal-build-static GitHub Actions
example-cpp-minimal-build-static-system-dependency GitHub Actions
example-cpp-tutorial GitHub Actions
test-build-cpp-fuzz GitHub Actions
test-conda-cpp GitHub Actions
test-conda-cpp-valgrind GitHub Actions
test-cuda-cpp-ubuntu-22.04-cuda-11.7.1 GitHub Actions
test-debian-12-cpp-amd64 GitHub Actions
test-debian-12-cpp-i386 GitHub Actions
test-fedora-42-cpp GitHub Actions
test-ubuntu-22.04-cpp GitHub Actions
test-ubuntu-22.04-cpp-20 GitHub Actions
test-ubuntu-22.04-cpp-bundled GitHub Actions
test-ubuntu-22.04-cpp-emscripten GitHub Actions
test-ubuntu-22.04-cpp-no-threading GitHub Actions
test-ubuntu-24.04-cpp GitHub Actions
test-ubuntu-24.04-cpp-bundled-offline GitHub Actions
test-ubuntu-24.04-cpp-gcc-13-bundled GitHub Actions
test-ubuntu-24.04-cpp-gcc-14 GitHub Actions
test-ubuntu-24.04-cpp-minimal-with-formats GitHub Actions
test-ubuntu-24.04-cpp-thread-sanitizer GitHub Actions

@github-actions github-actions bot added the "awaiting committer review" label and removed the "awaiting review" label Oct 7, 2025
@pitrou
Member Author

pitrou commented Oct 7, 2025

Valgrind failure is unrelated and will be fixed by #47743

Contributor

@WillAyd WillAyd left a comment

lgtm - nice work!

3 participants