
Performance issue: much slower compared to "vanilla" opencv #287

Closed
enricotagliavini opened this issue Jan 14, 2020 · 12 comments

@enricotagliavini

Description of the problem

I notice that many different applications using the OpenCV libraries show a very significant performance difference between the upstream Python bindings and opencv-python. I verified this on Red Hat 7 and Fedora 30 / 31 Linux distributions. Fedora provides opencv python packages (3.4 series), which I also adapted to work on Red Hat Linux (modifying the RPM spec file to disable a couple of components not available in Red Hat).

When running the same code / applications, the slowdown is significant. Depending on the application, it can take from 3 to more than 10 times longer. One of the tested applications is DeepLabCut. Switching only the opencv package, and keeping the rest of the environment exactly the same, made our analysis about three times faster.

I tried to recompile opencv-python locally using the setup.py and gcc-8 on CentOS 7, but it didn't make a big difference (there was a 10-20% improvement, but still very far away from the 300% observed with the upstream build).

Looking at the cmake arguments in the RPM spec file and comparing them to the ones in setup.py, I cannot see anything that could explain the problem. I can see that the Fedora RPM forces the use of OpenMP rather than pthreads (and I understand opencv-python uses pthreads instead), but I am not sure whether this could be the primary cause of the issue. It sounds strange if it is, though. I ran out of ideas and thought about opening this issue.

Expected behaviour

Same or close to same performance as upstream python bindings

Actual behaviour

Not entirely sure, but multiple applications show significant slowdowns when using opencv-python rather than the upstream Python bindings.

Steps to reproduce

  • https://github.com/AlexEMG/DeepLabCut / any 3D reconstruction demo from the opencv example code should also work
  • CentOS 7 / Fedora 30 - 31
  • x86_64
  • opencv-python both 29 and 30 (so opencv 3.4 and 4.2) seem to be affected

Thank you for any help.

@johnthagen
Contributor

I think something related that could help would be documentation on how opencv-python is compiled. Maybe a FAQ entry. What should users expect in terms of performance? I have little experience with OpenCV, but for example, to make the opencv-python wheels universally installable, are certain optimizations disabled? Should users expect CUDA support, SIMD, etc?

Maybe a link to the relevant build files/flags would help?

@enricotagliavini
Author

I didn't check the details, but it doesn't look like generic compiler optimizations are disabled. While compiling opencv-python I see options such as -O2 and -O3 passed to the compiler. The packages I compiled are also generic x86_64 and not tuned to any CPU architecture.

@skvark
Member

skvark commented Mar 21, 2020

You can check the compiler flags as well as other information about the build environment with cv2.getBuildInformation().
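
To make that report easier to scan, here is a minimal sketch that pulls a few lines out of the text returned by cv2.getBuildInformation(). The helper name summarize_build_info and the chosen field labels are assumptions for illustration; the exact labels in the report can differ between OpenCV versions, so adjust the keys to what your build prints.

```python
def summarize_build_info(info, keys=("Parallel framework", "Lapack", "Baseline",
                                     "Dispatched code generation", "C++ Compiler")):
    """Return {label: value} for 'Label:   value' lines whose label is in keys.

    The default labels are assumed to match the report layout; they are not
    guaranteed to exist in every OpenCV version.
    """
    summary = {}
    for line in info.splitlines():
        # Split on the first colon only, so values containing ':' stay intact.
        label, sep, value = line.partition(":")
        if sep and label.strip() in keys:
            summary[label.strip()] = value.strip()
    return summary


if __name__ == "__main__":
    try:
        import cv2
    except ImportError:
        print("cv2 is not installed in this environment")
    else:
        for label, value in summarize_build_info(cv2.getBuildInformation()).items():
            print(f"{label}: {value}")
```

Running this once in each environment (distribution package and opencv-python wheel) gives two short summaries that can be compared by eye.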

About the performance issue: it might be that pthreads vs. OpenMP is the culprit. Hard to say.
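
One rough way to see whether threading matters at all for a given workload is to time it under different thread counts. This is only a sketch: it does not switch between pthreads and OpenMP (that is fixed at build time), it only shows how sensitive the workload is to parallelism. The helper time_with_thread_counts and the GaussianBlur workload are illustrative assumptions, not something taken from the affected applications.

```python
import time


def time_with_thread_counts(workload, thread_counts, set_threads, repeats=3):
    """Return {n_threads: best wall-clock seconds} for workload() per setting."""
    results = {}
    for n in thread_counts:
        set_threads(n)  # e.g. cv2.setNumThreads
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            workload()
            best = min(best, time.perf_counter() - start)
        results[n] = best
    return results


if __name__ == "__main__":
    try:
        import cv2
        import numpy as np
    except ImportError:
        print("cv2 and numpy are needed for the demo workload")
    else:
        # Synthetic image; a real test should use the slow call from your own code.
        img = np.random.randint(0, 256, (2000, 2000, 3), dtype=np.uint8)
        default_threads = cv2.getNumThreads()
        timings = time_with_thread_counts(
            lambda: cv2.GaussianBlur(img, (21, 21), 0),
            [1, default_threads],
            cv2.setNumThreads,
        )
        for n, secs in timings.items():
            print(f"{n} thread(s): {secs:.4f} s")
```

If the single-threaded and default timings are close in both environments, the parallel backend is probably not the main cause.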

Keep in mind that these packages are not optimized for performance. The main target is to provide an easy way to install OpenCV on almost any OS. Some tradeoffs have been made, especially regarding 3rd party dependencies. The current CentOS 5 environment is not the easiest to work with since it's quite old. This means that the maintenance burden becomes overwhelming very fast if I enable every dependency or optimization OpenCV might have support for. Due to this, a locally built OpenCV version will almost always have better performance than the one provided via this repository.

@enricotagliavini
Author

enricotagliavini commented Mar 22, 2020

I understand, but this is getting extreme. My own code runs in a few seconds with the Fedora packages and also on Red Hat 7 with my own build based on the Fedora one. With opencv-python, it had not finished after one full hour.

It might be time to consider dropping older platforms; the slowdown is getting unsustainable, in my humble opinion.

@johnthagen
Contributor

Perhaps manylinux2010 could be the new baseline or alternative? It's based on CentOS 6.

@skvark
Member

skvark commented Mar 22, 2020

I looked into the Fedora configs, and LAPACK is also enabled there, while it's not enabled in opencv-python. There might be other differences as well.
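
A quick way to surface such differences is to save the output of cv2.getBuildInformation() from each environment to a text file and diff the two. The sketch below assumes two such files already exist; the function name diff_build_reports and the default labels are made up for illustration.

```python
import difflib
import sys


def diff_build_reports(report_a, report_b, name_a="distro", name_b="opencv-python"):
    """Return unified-diff lines between two build reports.

    Whitespace runs are collapsed and blank lines dropped so that pure
    alignment differences do not show up as changes.
    """
    def normalize(text):
        return [" ".join(line.split()) for line in text.splitlines() if line.strip()]

    return list(difflib.unified_diff(
        normalize(report_a), normalize(report_b),
        fromfile=name_a, tofile=name_b, lineterm="", n=0,
    ))


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("usage: python diff_reports.py REPORT_A.txt REPORT_B.txt")
    else:
        with open(sys.argv[1]) as fa, open(sys.argv[2]) as fb:
            for line in diff_build_reports(fa.read(), fb.read()):
                print(line)
```

Lines starting with "-" exist only in the first report and lines starting with "+" only in the second, which makes entries such as the LAPACK or parallel framework setting easy to spot.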

And yes, manylinux2010 is the next step.

@johnthagen
Contributor

And for completeness, since at this point CentOS 6 is already EOL, I wanted to also reference manylinux2014.

@skvark
Member

skvark commented Mar 22, 2020

It might be good to skip manylinux2010 and migrate straight to manylinux2014.

@johnthagen
Contributor

johnthagen commented Mar 22, 2020

> It might be good to skip manylinux2010 and migrate straight to manylinux2014.

I would agree.

It's best practice to pip install --upgrade pip in a virtual environment these days, so I think this should be fine, especially since this would lower the maintenance burden and (potentially) improve performance.

@skvark
Member

skvark commented Jul 6, 2020

Could you check how the latest releases perform?

@mshabunin

The default configuration can differ from opencv-python in many ways; the most obvious ones are:

  • image reading - opencv-python perhaps uses the libjpeg-turbo provided with OpenCV, which will be built with vectorization disabled; when you build OpenCV yourself, you will most probably link against the system-provided library, which should be optimized
  • video reading - default OpenCV can use the optimized system FFmpeg and GStreamer for decoding (the latter can have HW-accelerated plugins installed); opencv-python uses its built-in FFmpeg (probably with a safe optimization level)
  • GUI - default OpenCV uses GTK+3 on Linux; opencv-python uses the Qt backend, which has more features
  • some math processing can be done with LAPACK/MKL/etc. - these can be auto-detected when building OpenCV, but are not used in opencv-python
  • opencv-python is built with an older compiler; your system most probably has a newer version (esp. Fedora)

Other features/dependencies in opencv-python are similar to the default build configuration, including CPU optimizations (SSSE3 + all dispatching) and parallel-processing backend (pthreads).

So it is important to localize the performance drop: compare configs, measure execution time at different stages (input, processing, GUI), profile, etc.
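
As a rough sketch of the per-stage measurement suggested above, a small timer can accumulate wall-clock time per named stage. The StageTimer class and the stage names are hypothetical; the sleep calls stand in for the real image reading and processing calls in your own application.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulate wall-clock time per named stage."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Time is recorded even if the stage raises.
            self.totals[name] += time.perf_counter() - start

    def report(self):
        """Return (stage, seconds) pairs, slowest stage first."""
        return sorted(self.totals.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    timer = StageTimer()
    for _ in range(3):
        with timer.stage("input"):
            time.sleep(0.01)   # placeholder for e.g. image or video decoding
        with timer.stage("processing"):
            time.sleep(0.02)   # placeholder for the actual OpenCV calls
    for name, seconds in timer.report():
        print(f"{name}: {seconds:.3f} s")
```

Running the same instrumented script against both OpenCV installs should show which stage accounts for the difference, and that stage is then the one worth profiling in detail.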

@skvark
Member

skvark commented Nov 1, 2020

Closing due to inactivity.

@skvark closed this as completed Nov 1, 2020