Closed
Labels
PdfWriter: The PdfWriter component is affected
is-bug: From a user's perspective, this is a bug - a violation of the expected behavior with a compliant PDF
Description
Calling PdfWriter.write(fileobj) unexpectedly closed fileobj, causing my program to crash later when it tried to do fileobj.seek(0).
Environment
Which environment were you using when you encountered the problem?
Linux amd64; Python 3.11 and 3.12, but it doesn't matter, as it's a logic bug in the code.
Code
This code will trigger the issue with any PDF input:
from typing import BinaryIO

import pypdf


def select_pdf_pages(input: BinaryIO, out: BinaryIO, page_list: list[int]) -> None:
    input.seek(0)
    with pypdf.PdfReader(input) as pdf_reader:
        with pypdf.PdfWriter() as pdf_writer:
            for page_num in page_list:
                # page_list is 1-based; pdf_reader.pages is 0-based
                pdf_writer.add_page(pdf_reader.pages[page_num - 1])
            pdf_writer.write(out)
After calling this, out is closed. It should not be.
Traceback
n/a
Explanation
The problem arises here, at line 1396 in dd39992:

    if self.with_as_usage:

This seems like a special case for when write() is called from __exit__(). However, just because the PdfWriter was used in a with ... as statement doesn't mean that the stream passed into write() is the internal self.fileobj. If it belongs to the caller, it should not be messed with.
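The failure mode can be illustrated without pypdf by a simplified stand-in. ToyWriter below is hypothetical and is not pypdf's actual code; it only mirrors the logic described above, assuming write() closes whatever stream it receives once the writer has been entered through a with statement.

```python
import io
from typing import BinaryIO, Optional


class ToyWriter:
    """Hypothetical stand-in that mimics the reported logic; not pypdf's code."""

    def __init__(self, fileobj: Optional[BinaryIO] = None) -> None:
        self.fileobj = fileobj      # stream owned by the writer, if any
        self.with_as_usage = False  # set once the writer is used in a `with` block

    def __enter__(self) -> "ToyWriter":
        self.with_as_usage = True
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # On exit, flush to the writer's own stream, if one was supplied.
        if self.fileobj is not None:
            self.write(self.fileobj)

    def write(self, stream: BinaryIO) -> None:
        stream.write(b"%PDF-toy")
        # The problematic special case: this closes *any* stream, including
        # one that belongs to the caller rather than to the writer.
        if self.with_as_usage:
            stream.close()


def demo_bug() -> bool:
    out = io.BytesIO()  # caller-owned stream
    with ToyWriter() as writer:
        writer.write(out)
    # Report whether the caller's stream was closed behind its back.
    return out.closed
```

In this toy model, any later out.seek(0) would raise ValueError because the stream has already been closed, which matches the crash described in the report.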
Potential Solution
Rather than closing in write(), why not close afterward in __exit__()?
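A minimal sketch of that idea, again using a hypothetical stand-in rather than pypdf's real implementation: write() never closes the stream it is handed, and __exit__() closes only the stream the writer itself was constructed with. Whether the writer should close its own fileobj at all is an assumption made here for illustration.

```python
import io
from typing import BinaryIO, Optional


class FixedToyWriter:
    """Hypothetical stand-in showing the proposed behavior; not pypdf's code."""

    def __init__(self, fileobj: Optional[BinaryIO] = None) -> None:
        self.fileobj = fileobj  # stream owned by the writer, if any

    def __enter__(self) -> "FixedToyWriter":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Only the writer's own stream is written and closed here.
        if self.fileobj is not None:
            self.write(self.fileobj)
            self.fileobj.close()

    def write(self, stream: BinaryIO) -> None:
        # Never close here: the stream may belong to the caller.
        stream.write(b"%PDF-toy")


def demo_fix() -> tuple[bool, bool]:
    caller_stream = io.BytesIO()  # owned by the caller
    owned_stream = io.BytesIO()   # handed to the writer at construction
    with FixedToyWriter(owned_stream) as writer:
        writer.write(caller_stream)
    # (caller stream closed?, writer-owned stream closed?)
    return caller_stream.closed, owned_stream.closed
```

With this arrangement the caller's stream remains usable after the with block, while the writer still cleans up the stream it owns.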
Partial Workaround
write_stream() does not have the problematic logic.