Closed
Description
Rationale
In one of my apps, I started replacing a few libraries with MuPDF, and I noticed that MuPDF has Tesseract support (it surfaced as duplicate symbol errors with Leptonica in my app, only on Windows).
- My app uses leptonica-sys with tesseract-sys, and I was able to "resolve" the linkage errors with RUSTFLAGS (no issue on Linux and Mac OS).
- This led me to check whether I can just use Tesseract through MuPDF and compile with crt-static (without additional flags such as -C link-arg=/FORCE:MULTIPLE).
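When experimenting with crt-static, it can help to confirm that the build actually picked it up. The following is only a minimal sketch: crt_linkage is a hypothetical helper name, and it relies on the target_feature = "crt-static" cfg that rustc sets when the C runtime is linked statically.

```rust
/// Reports how the C runtime was linked for this build.
/// `crt_linkage` is a hypothetical helper, not part of any crate mentioned here.
pub fn crt_linkage() -> &'static str {
    // rustc sets this cfg when built with `-C target-feature=+crt-static`
    // (or when the target links the C runtime statically by default).
    if cfg!(target_feature = "crt-static") {
        "static"
    } else {
        "dynamic"
    }
}
```

Printing this value once at startup in a debug build is enough to see whether the flag passed through RUSTFLAGS took effect.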
Goal
I believe that "most" people don't need tesseract-rs or its related sys crates, which contain more advanced features. With mupdf-rs I can drop 5 dependencies in one of my apps (poppler-rs, cairo-rs, lopdf, tesseract-sys, leptonica-sys).
Proposal
I can send a pull request for the following:
- Expose a new method in mupdf-rs (DocumentWriter) to allow OCR via fz_new_pdfocr_writer
  - This allows trivial branches for applications that optionally perform OCR, because it doesn't require a different data structure.
- I suppose that this would need to be guarded by the tesseract feature in the code
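To illustrate the shape of the proposal, here is a rough sketch that uses a mock DocumentWriter and makes no FFI calls. The names with_ocr, open_writer, Backend, and OcrUnavailable are hypothetical and only show how a feature-gated constructor could keep the calling code to a single branch. The real method name and signature would be decided in the pull request.

```rust
/// Which MuPDF constructor a writer would be backed by (illustrative only).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Backend {
    /// Stands in for a regular document writer.
    Plain,
    /// Stands in for a writer created via fz_new_pdfocr_writer.
    PdfOcr,
}

/// Mock writer: the same type is returned whether or not OCR is used.
#[derive(Debug)]
pub struct DocumentWriter {
    pub path: String,
    pub options: String,
    pub backend: Backend,
}

/// Hypothetical error for builds compiled without the `tesseract` feature.
#[derive(Debug, PartialEq, Eq)]
pub struct OcrUnavailable;

impl DocumentWriter {
    pub fn new(path: &str, options: &str) -> Self {
        Self { path: path.into(), options: options.into(), backend: Backend::Plain }
    }

    /// Hypothetical constructor; only compiled when `tesseract` is enabled.
    #[cfg(feature = "tesseract")]
    pub fn with_ocr(path: &str, options: &str) -> Self {
        Self { path: path.into(), options: options.into(), backend: Backend::PdfOcr }
    }
}

#[cfg(feature = "tesseract")]
fn ocr_writer(path: &str, options: &str) -> Result<DocumentWriter, OcrUnavailable> {
    Ok(DocumentWriter::with_ocr(path, options))
}

#[cfg(not(feature = "tesseract"))]
fn ocr_writer(_path: &str, _options: &str) -> Result<DocumentWriter, OcrUnavailable> {
    Err(OcrUnavailable)
}

/// The "trivial branch": callers pick OCR with a flag and get the same type back.
pub fn open_writer(path: &str, options: &str, ocr: bool) -> Result<DocumentWriter, OcrUnavailable> {
    if ocr {
        ocr_writer(path, options)
    } else {
        Ok(DocumentWriter::new(path, options))
    }
}
```

Because both paths return a DocumentWriter, the rest of an application's writing code would not need to know whether OCR was requested.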
Other notes
I did some quick tests under Windows 11 and it works fine. I'll test soon under Linux and Mac OS prior to submitting a pull request.