Skip to content

use pyrnn.gz models instead of pyrnn #14

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from
Draft

Conversation

kba
Copy link
Contributor

@kba kba commented May 31, 2020

Companion to cisocrgroup/ocrd_cis#49

pyrnn models are generally distributed gzipped. If loading gzipped models is inefficient, I can also adapt OCR-D/ocrd_all#103 to gunzip models after download.

@bertsky
Copy link
Owner

bertsky commented Jun 2, 2020

Is this really necessary? In recognize, there's get_model (which tries to append .gz), and then in load_object there's ocropus_find_file (which also tries to append .gz). In short, whenever the file can be found with ocrolib, it can also be found without the .gz suffix.

Now, I understand you pushed for cisocrgroup/ocrd_cis#49, which in turn requires the change here. But as I mentioned in my review there, I don't think that was warranted itself.

It does not hurt to use the full file names, though. Maybe this is better regardless of the change in ocrd_cis. With the full file name, the makefile could also have rules to check / download the model files.

@kba kba marked this pull request as draft June 3, 2020 12:34
@kba
Copy link
Contributor Author

kba commented Jun 3, 2020

It should not be necessary but I failed to get ocropy to pick up the models in OCROPUS_DATA, so I tried on different levels to fix it. I need to properly debug this later, the easiest "fix" was to gunzip the models or use explicit .gz extensions. I should have used the Draft PR feature or an issue.

@bertsky
Copy link
Owner

bertsky commented Jun 3, 2020

It should not be necessary but I failed to get ocropy to pick up the models in OCROPUS_DATA, so I tried on different levels to fix it. I need to properly debug this later, the easiest "fix" was to gunzip the models or use explicit .gz extensions.

Here's a summary of exactly where ocrd_cis is currently looking. Hope that helps.

I am still inclined to switch to the .gz filenames in the makefile regardless.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants