[Feature]: Feature Request - Use Google Document AI or Vision AI instead of Tesseract #1434

Open
epatels opened this issue Nov 18, 2024 · 2 comments
Labels: enhancement, triage (issue needs triage)

Comments

epatels commented Nov 18, 2024

Describe the proposed feature

Hi,

I know the Tesseract OCR engine is free, but unfortunately it is not very good, especially when performing OCR on Indian languages.

This is where Google Document AI and Google Vision AI excel. I understand there is a cost involved in using these services.

But I am looking for a solution that performs the underlying OCR step using the Google Document AI or Google Vision AI OCR engine. The rest can remain unmodified, with the output being a searchable PDF (in Indian languages).
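
For illustration only, here is a rough sketch of what the OCR call to Google Vision could look like at the REST level. It is not OCRmyPDF code and not taken from any existing plugin. The helper names `build_vision_request` and `extract_text` are hypothetical, and the language hints (`hi`, `gu`) are just example codes for Indian languages. Only the request and response shapes of the public `images:annotate` endpoint are used. Sending the request, authentication, and turning the result into a PDF text layer are left out.

```python
import base64
import json

# Public REST endpoint for Cloud Vision image annotation.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_vision_request(image_bytes: bytes, language_hints: list[str]) -> dict:
    """Build the JSON body for one DOCUMENT_TEXT_DETECTION request.

    Hypothetical helper: the name and signature are assumptions made
    for this sketch.
    """
    return {
        "requests": [
            {
                # The API expects the image as base64-encoded text.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                # Hints are optional; they are shown here for the
                # Indian-language use case.
                "imageContext": {"languageHints": language_hints},
            }
        ]
    }


def extract_text(response: dict) -> str:
    """Return the recognized page text from a response, or "" if absent."""
    first = (response.get("responses") or [{}])[0]
    return first.get("fullTextAnnotation", {}).get("text", "")


if __name__ == "__main__":
    # Placeholder bytes stand in for a rasterized PDF page.
    body = build_vision_request(b"placeholder image bytes", ["hi", "gu"])
    print(json.dumps(body, indent=2))
```

The body would be sent as a POST to `VISION_ENDPOINT` with valid credentials. A real integration would also need the per-word bounding boxes from the response to place the text layer in the PDF, which this sketch does not attempt.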

grantbarrett commented Apr 24, 2025

I have forked and updated kkrell2016’s pre-existing Google Vision OCRmyPDF plugin and am happy to report that it works well. See here: https://github.com/grantbarrett/son-of-ocrmypdf_plugin_GoogleVision. I recognize that using a paid service like Google goes against the spirit of open source, but Google Vision OCR is very, very good, so the compromise is worth it to me.

epatels commented Apr 24, 2025 via email

3 participants