
Add Video Capture Support for macOS through AVFoundation/Swift #821


Merged

merged 5 commits into tensorflow:master from the video branch on Mar 3, 2020

Conversation

@yongtang (Member) commented Mar 1, 2020

This PR is part of the effort in resolving #814.

In #814, the feature request is to add video capture support for Linux, likely through Video4Linux. This PR fixes #814.

Due to some limitations, Video4Linux will need a compatible USB camera first.

This PR instead tries to resolve the feature request on macOS first.

On macOS, the built-in camera can be accessed through AVFoundation's Swift API.

This PR uses Swift to access AVCaptureSession and related classes, and exports the functionality as C functions (`cdecl`) so that it can be used by the C++ kernel in tensorflow-io.
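
For illustration only: a `cdecl` export can be consumed from any language with a C FFI. In the hypothetical sketch below, the library name and symbol are made up and do not reflect the actual exports in this PR; the real consumer is the tensorflow-io C++ kernel.

```
import ctypes

# Hypothetical sketch: consuming a cdecl-exported Swift function through a
# C FFI (Python's ctypes here, purely for illustration). The library name
# ("libvideo_capture.dylib") and symbol ("VideoCaptureInit") are made up.
lib = ctypes.CDLL("libvideo_capture.dylib")
lib.VideoCaptureInit.argtypes = [ctypes.c_char_p]
lib.VideoCaptureInit.restype = ctypes.c_void_p

handle = lib.VideoCaptureInit(b"device")
```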

Since macOS's raw video capture format is NV12 (kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange), additional work is needed to convert NV12 into RGB format, so that a whole pipeline can be built up to allow using video capture for tf.keras inference.

This PR does not resolve the NV12 => RGB conversion yet; that will be addressed in separate PRs. A sketch of what the conversion involves is shown below.
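
For reference, NV12 stores a full-resolution Y plane followed by an interleaved, half-resolution UV plane. Below is a minimal NumPy sketch of the conversion (an illustration, not part of this PR), using BT.601 video-range coefficients to match the BiPlanarVideoRange format:

```
import numpy as np

def nv12_to_rgb(raw, height, width):
  """Convert one raw NV12 frame (Y plane + interleaved UV plane) to RGB."""
  data = np.frombuffer(raw, dtype=np.uint8)
  # Full-resolution luma plane comes first.
  y = data[:height * width].reshape(height, width).astype(np.float32)
  # Half-resolution interleaved chroma plane follows; upsample 2x (nearest).
  uv = data[height * width:].reshape(height // 2, width // 2, 2).astype(np.float32)
  u = np.repeat(np.repeat(uv[..., 0], 2, axis=0), 2, axis=1) - 128.0
  v = np.repeat(np.repeat(uv[..., 1], 2, axis=0), 2, axis=1) - 128.0
  # BT.601 video-range YCbCr -> RGB.
  y = 1.164 * (y - 16.0)
  r = y + 1.596 * v
  g = y - 0.392 * u - 0.813 * v
  b = y + 2.017 * u
  return np.clip(np.stack([r, g, b], axis=-1), 0.0, 255.0).astype(np.uint8)
```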

Also, since video capture is technically a continuous stream and is not repeatable, it is not possible to train on video capture across multiple epochs (though see the workaround sketched below).
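
One possible workaround, assuming the usual tf.data caching semantics (not something this PR provides), is to materialize a fixed number of frames first; a cached dataset can then be iterated repeatedly:

```
import tensorflow_io as tfio

# Hypothetical workaround: capture a fixed number of frames once and cache
# them to disk. The first pass over `cached` fills the cache; later passes
# replay it, which makes multi-epoch loops possible.
frames = tfio.experimental.IODataset.stream().from_video_capture(
    "device").take(100)
cached = frames.cache("frames.cache")
for epoch in range(3):
  for frame in cached:
    pass  # training/inference step goes here
```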

Finally, the following is a sample usage which captures video and saves it as an NV12 raw file.

The NV12 raw file can be validated by using ffmpeg to convert it to JPEG. For example, a YUV frame can be converted with:

```
ffmpeg -s 1280x720 -pix_fmt nv12 -i frame_{i}.yuv frame_{i}.jpg
```

Usage:

```
import tensorflow as tf
import tensorflow_io as tfio

dataset = tfio.experimental.IODataset.stream().from_video_capture(
    "device").take(5)

for i, frame in enumerate(dataset):
  print("Frame {}: shape({}) dtype({}) length({})".format(
      i, frame.shape, frame.dtype, tf.strings.length(frame)))
  tf.io.write_file("frame_{}.yuv".format(i), frame)
```

/cc @bhack @ivelin

Signed-off-by: Yong Tang <[email protected]>

@yongtang mentioned this pull request Mar 1, 2020
@yongtang force-pushed the video branch 2 times, most recently from 2a35d14 to 0775c86, March 2, 2020 03:37
yongtang added 4 commits March 2, 2020 16:22
@yongtang (Member, Author) commented Mar 3, 2020

Now video capture on Linux has been added. It is possible to use

```
import tensorflow as tf
import tensorflow_io as tfio

dataset = tfio.experimental.IODataset.stream().from_video_capture(
    "/dev/video0").take(5)

for i, frame in enumerate(dataset):
  print("Frame {}: shape({}) dtype({}) length({})".format(
      i, frame.shape, frame.dtype, tf.strings.length(frame)))
  tf.io.write_file("frame_{}.yuv".format(i), frame)
```

on Linux platforms where video is available.

I tested on a Debian VM with Video4Linux Loopback and it works as expected.

One thing to note is that, by default, I only tested and enabled the yuyv422 format. (Note also that macOS is NV12 and Android is NV21.)

Decoding NV12 and YUYV to RGB will be done in follow-up PRs; a sketch of the YUYV unpacking is below.
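
For reference (an illustration, not part of this PR), YUYV 4:2:2 packs two pixels into four bytes (Y0 U Y1 V); once unpacked, the same BT.601 math as in the NV12 sketch above converts the planes to RGB:

```
import numpy as np

def yuyv422_unpack(raw, height, width):
  """Unpack packed YUYV 4:2:2 (Y0 U Y1 V per pixel pair) into full planes."""
  data = np.frombuffer(raw, dtype=np.uint8).reshape(height, width // 2, 4)
  # Two luma samples per 4-byte group.
  y = data[:, :, [0, 2]].reshape(height, width).astype(np.float32)
  # Chroma is shared by each horizontal pixel pair; duplicate it.
  u = np.repeat(data[:, :, 1], 2, axis=1).astype(np.float32) - 128.0
  v = np.repeat(data[:, :, 3], 2, axis=1).astype(np.float32) - 128.0
  return y, u, v
```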

/cc @bhack

@terrytangyuan (Member) left a comment:

LGTM

@terrytangyuan merged commit 4477e34 into tensorflow:master Mar 3, 2020
@yongtang deleted the video branch March 3, 2020 17:18
i-ony pushed a commit to i-ony/io that referenced this pull request Feb 8, 2021: Add Video Capture Support for macOS through AVFoundation/Swift (tensorflow#821). The squashed commit contains:

* Add Video Capture Support for macOS through AVFoundation/Swift
* Add Video4Linux V2 support on Linux
* Update to use device name in API calls
* Fix typo in Windows
* Fix test typo

Signed-off-by: Yong Tang <[email protected]>
Successfully merging this pull request may close these issues: Video4Linux and Genicam