PdfWriter.write() in context manager closes stream when it should not #2905

@alexaryn

Description

Calling PdfWriter.write(fileobj) unexpectedly closed fileobj, which caused my program to crash later when it called fileobj.seek(0).

Environment

Which environment were you using when you encountered the problem?

Linux amd64; Python 3.11 and 3.12, but it doesn't matter, as it's a logic bug in the code.

Code

This code will trigger the issue with any PDF input:

from typing import BinaryIO

import pypdf


def select_pdf_pages(input: BinaryIO, out: BinaryIO, page_list: list[int]) -> None:
    input.seek(0)
    with pypdf.PdfReader(input) as pdf_reader:
        with pypdf.PdfWriter() as pdf_writer:
            for page_num in page_list:
                # page_list is 1-based; pdf_reader.pages is 0-based
                pdf_writer.add_page(pdf_reader.pages[page_num - 1])
            pdf_writer.write(out)  # closes `out` as a side effect

After calling this, out is closed. It should not be.

Traceback

n/a

Explanation

The problem arises here:

if self.with_as_usage:

which seems like a special case for when write() is called from __exit__(). However, just because the PdfWriter was used in a with ... as statement doesn't mean that the stream passed into write() is the internal self.fileobj. If it belongs to the caller, it should not be messed with.
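
To make the ownership problem concrete without needing a PDF, here is a minimal toy model of the logic described above. ToyWriter is a hypothetical stand-in written for this illustration, not pypdf's actual code; only the names with_as_usage, fileobj, write() and __exit__() come from the report.

```python
import io
from typing import BinaryIO, Optional


class ToyWriter:
    """Simplified model of the reported logic; not pypdf's real implementation."""

    def __init__(self, fileobj: Optional[BinaryIO] = None) -> None:
        self.fileobj = fileobj  # stream the writer would flush to on exit
        self.with_as_usage = False

    def __enter__(self) -> "ToyWriter":
        self.with_as_usage = True
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        if self.fileobj is not None:
            self.write(self.fileobj)

    def write(self, stream: BinaryIO) -> None:
        stream.write(b"%PDF-toy\n")
        if self.with_as_usage:
            # Modeled bug: closes whatever stream was passed in,
            # even when it belongs to the caller.
            stream.close()


caller_stream = io.BytesIO()
with ToyWriter() as writer:
    writer.write(caller_stream)

try:
    caller_stream.seek(0)
    seek_failed = False
except ValueError:
    # Seeking a closed BytesIO raises ValueError.
    seek_failed = True
```

In this model the flag only records that a with block is active, so it cannot tell the writer's own stream apart from one the caller passed in.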

Potential Solution

Rather than closing in write(), why not close afterward in __exit__()?
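
A rough sketch of that idea, again on a hypothetical toy class rather than pypdf itself: write() never closes anything, and __exit__() closes only the stream the writer was constructed with. Whether the real library should close its internal stream at all is an assumption made here for illustration.

```python
import io
from typing import BinaryIO, Optional


class FixedToyWriter:
    """Toy model of the proposed behavior; not pypdf's real implementation."""

    def __init__(self, fileobj: Optional[BinaryIO] = None) -> None:
        self.fileobj = fileobj  # stream owned by the writer, if any

    def __enter__(self) -> "FixedToyWriter":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        if self.fileobj is not None:
            self.write(self.fileobj)
            # Only the writer's own stream is closed, and only on exit.
            self.fileobj.close()

    def write(self, stream: BinaryIO) -> None:
        # No close() here: the stream may belong to the caller.
        stream.write(b"%PDF-toy\n")


owned_stream = io.BytesIO()
caller_stream = io.BytesIO()
with FixedToyWriter(owned_stream) as writer:
    writer.write(caller_stream)

caller_stream.seek(0)  # still usable after the with block
caller_data = caller_stream.read()
```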

Partial Workaround

write_stream() does not have the problematic logic.
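
As a sketch of how the workaround might be applied, the helper below calls write_stream() instead of write() and then rewinds the output. write_and_rewind, SupportsWriteStream and _FakeWriter are hypothetical names introduced for this example; in real use the writer argument would be a pypdf.PdfWriter.

```python
import io
from typing import BinaryIO, Protocol


class SupportsWriteStream(Protocol):
    """Any object exposing write_stream(); pypdf.PdfWriter is the intended one."""

    def write_stream(self, stream: BinaryIO) -> None: ...


def write_and_rewind(writer: SupportsWriteStream, out: BinaryIO) -> None:
    """Serialize through write_stream() so `out` stays open, then rewind it."""
    writer.write_stream(out)
    out.seek(0)  # would raise ValueError if `out` had been closed


class _FakeWriter:
    """Stand-in used only to exercise the helper without a PDF library."""

    def write_stream(self, stream: BinaryIO) -> None:
        stream.write(b"%PDF-1.4\n")


buf = io.BytesIO()
write_and_rewind(_FakeWriter(), buf)
```

In the select_pdf_pages function above, this would amount to replacing pdf_writer.write(out) with pdf_writer.write_stream(out).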

Metadata

Labels

PdfWriter: The PdfWriter component is affected
is-bug: From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF
