What will be the recommended HW setup for realtime face detection? #1004
I haven't used a Jetson myself, so I don't know. But it really comes down to your specific needs. A 1080 Ti is way better, but it obviously costs more. It's up to you to decide. I would prototype both and see if you can fit your application onto the Jetson; if you can, then there you go. But if not, then you need to spend more money on better hardware.
Also, if you want to use dlib from Java you can easily define an interface between Java and C++ using the tooling here: https://github.com/davisking/dlib/tree/master/dlib/java. There is also a bit more discussion of this here: http://blog.dlib.net/2014/10/mitie-v03-released-now-with-java-and-r.html. The newest version of the Java/C++ tooling is in that dlib/java folder.
|
I will do both soon and let you know the results here for comparison. Meanwhile, how can we use all the available cores and CPUs on the server in dlib? (It's only using one core.) I read about OpenBLAS and Intel MKL (paid) and installed both, but I didn't see a significant improvement; still only 1 or 2 cores are busy. How can I check whether the dlib example is using BLAS or Intel MKL? I am using CLion, by the way (macOS). A blog post explaining how to get dlib's full power out of all CPU cores with OpenBLAS etc. would be very useful. Many thanks. |
The output of cmake tells you what it's doing with regards to any BLAS or
GPU usage.
|
Yes, I saw that in the output below; even though it says it found BLAS, it's still using only one core. Maybe a very stupid question, but I couldn't find a clear explanation for using full CPU power. Best... cmake .. -DUSE_AVX_INSTRUCTIONS=1 |
Either the version of BLAS you are using isn't multi-core aware or you
aren't using any part of dlib that benefits from it.
|
I will investigate. I am using the face detection, face landmark, and face recognition parts. Do these benefit from multiple cores? |
It depends on what exactly you mean. There are multiple face detectors in
dlib.
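(For reference, a minimal sketch of the two detectors usually meant here, using the dlib Python bindings; the model file and image path are placeholders, load_rgb_image assumes a reasonably recent dlib, and the CNN detector only uses the GPU if dlib was built with CUDA.)

```python
import dlib

# HOG + linear SVM detector (get_frontal_face_detector): runs on the CPU only.
hog_detector = dlib.get_frontal_face_detector()

# CNN (MMOD) detector: this is the DNN-based one that can run on the GPU,
# provided dlib was compiled with CUDA. The .dat file is the standard
# mmod_human_face_detector model; adjust the path to wherever you keep it.
cnn_detector = dlib.cnn_face_detection_model_v1("mmod_human_face_detector.dat")

# Any RGB numpy array works; "frame.jpg" is just a placeholder image.
img = dlib.load_rgb_image("frame.jpg")

hog_faces = hog_detector(img, 1)   # dlib.rectangles
cnn_faces = cnn_detector(img, 1)   # mmod_rectangles; each has .rect and .confidence

print("HOG detections:", list(hog_faces))
print("CNN detections:", [d.rect for d in cnn_faces])
```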
|
I am using the code below: "Python"
and the encoding compare... When profiling: 56% of the time is consumed by:
Is it caused by Python (not letting dlib benefit from the cores, so that I should switch to bare C++), or is BLAS not multicore aware? I have plenty of cores, 8 of them, and I can only use 1... (MacBook Pro 2017) |
Don't call get_frontal_face_detector() over and over. Call it once.
Anyway, most of this stuff isn't multicore. Only the DNN stuff is. It's
up to you to thread the rest appropriately for your application.
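(A rough sketch of what that looks like in a typical Python/OpenCV capture loop; the model file paths are placeholders, and the comparison against known encodings is left out.)

```python
import cv2
import dlib

# Load models ONCE, outside the frame loop.
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_5_face_landmarks.dat")
encoder = dlib.face_recognition_model_v1("dlib_face_recognition_resnet_model_v1.dat")

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # dlib expects RGB images

    # Only per-frame work happens inside the loop.
    for face in detector(rgb, 0):
        shape = predictor(rgb, face)
        descriptor = encoder.compute_face_descriptor(rgb, shape)
        # ... compare `descriptor` against your known encodings here ...

cap.release()
```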
|
Technical note: if your application can tolerate some latency (like a 5-frame delay), you don't need to perform face detection on every frame. Just detect faces every 5 frames, and interpolate face positions between them. Just saying, in case this might work for your scenario. |
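(A rough sketch of that idea, assuming a single face, a fixed 5-frame interval, and simple linear interpolation of the bounding box corners; the buffering adds roughly N frames of latency.)

```python
import dlib

detector = dlib.get_frontal_face_detector()
N = 5  # run the detector only on every N-th frame

def lerp_rect(a, b, t):
    """Linearly interpolate between two dlib.rectangle objects (0 <= t <= 1)."""
    mix = lambda x, y: int(round(x + (y - x) * t))
    return dlib.rectangle(mix(a.left(), b.left()), mix(a.top(), b.top()),
                          mix(a.right(), b.right()), mix(a.bottom(), b.bottom()))

def tracked_faces(frames):
    """Yield (frame, rect) pairs, detecting only on every N-th frame and
    interpolating the largest face in between. Output lags input by N frames."""
    buffer, prev = [], None
    for frame in frames:
        buffer.append(frame)
        if len(buffer) < N:
            continue
        dets = detector(buffer[-1], 0)
        cur = max(dets, key=lambda r: r.area()) if dets else prev
        for i, f in enumerate(buffer):
            if prev is not None and cur is not None:
                yield f, lerp_rect(prev, cur, (i + 1) / N)
            else:
                yield f, cur  # no interpolation possible yet (or no face found)
        prev, buffer = cur, []
```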
That's a decent idea as well. But the deeper issue is that you shouldn't be calling code inside your processing loop that doesn't need to be there. Case in point, model loading code like get_frontal_face_detector has no business being called more than once, let alone on every frame. |
Another idea came up from my friend: send each frame to a different process (multiprocessing) so we can benefit from the CPU cores.
And we need to keep the frame sequence as it is...
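(A rough sketch of that approach with Python's multiprocessing; pool.imap, unlike imap_unordered, returns results in the order the frames were submitted, which keeps the frame sequence intact. The function names are just illustrative.)

```python
import multiprocessing as mp
import dlib

_detector = None

def _init_worker():
    # Build one detector per worker process, once.
    global _detector
    _detector = dlib.get_frontal_face_detector()

def _detect(frame):
    # Return plain tuples so the result pickles cleanly back to the parent process.
    return [(r.left(), r.top(), r.right(), r.bottom()) for r in _detector(frame, 0)]

def detect_in_order(frames, workers=4):
    """Run face detection over frames across `workers` processes,
    yielding results in the original frame order."""
    with mp.Pool(processes=workers, initializer=_init_worker) as pool:
        yield from pool.imap(_detect, frames, chunksize=1)
```

One thing to watch: shipping full frames between processes has its own serialization cost, so whether this beats a single threaded process depends on the frame size and the per-frame work.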
|
I have installed dlib with AVX_INSTRUCTIONS and CUDA+cuDNN. But a real-time facial detector (5 point) running from a webcam lags about 1~2 sec per frame when OpenCV capture is used in the code. The code should run smoothly at about 30 fps (theoretically) on my GTX 1080 GPU, but I am confused about whether dlib is using the GPU at all. Checking GPU memory at runtime shows only 15 MB of consumption. Any idea what's happening? |
Hi, |
CMake tells you if it's going to use cuda when you install it. I also recently added the |
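(One quick check from Python, assuming a dlib build recent enough to expose these attributes; exact names may vary by version.)

```python
import dlib

# True only if dlib was compiled with DLIB_USE_CUDA; if this is False,
# the DNN models silently fall back to the CPU.
print("Built with CUDA:", dlib.DLIB_USE_CUDA)

# Number of CUDA devices dlib can see (meaningful only when the flag above is True).
print("CUDA devices:", dlib.cuda.get_num_devices())
```

Also worth noting: the HOG detector (get_frontal_face_detector) and the 5-point shape predictor run entirely on the CPU; only the DNN models (the CNN face detector and the face recognition network) use CUDA, which may explain seeing only a tiny amount of GPU memory in use.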
Hi,
This could be helpful for all of us, I assume :)
What will be the best setup (NVIDIA 1080 Ti, NVIDIA Jetson TX2, etc.) from both a performance and a price perspective?
The Jetson is ARM based and I suspect possible problems; a 1080 Ti could be expensive for my demo project.
I need at least 30 fps realtime face detection + recognition (multiple faces in the camera).
You may have an idea based on your experience.
Many thanks for the library. I hope some day there is a solid Java/Scala port :). I started to learn C++ because of the great dlib.
Cheers