Skip to content

mvsoom/camera-activitynet-client

Repository files navigation

Transcribing the ActivityNet into descriptions

Installation

I used Python 3.11.4, but other versions probably work fine too.

Download, setup venv with python -m venv .venv, activate the venv, then

pip install -r requirements

NOTE: These requirements are bloated and should be skimmed aggressively.

Usage

This repo transcripts video streams from the ActivityNet database into .description files.

The videos are downloaded in random batches using FiftyOne. Watch out, because running this script for too long canwill get you banned from YouTube.

The camera-server must run on the local net for this client to work, as it queries the server worker at http://localhost:40000.

You can start transcribing ActivityNet into descriptions with

cd [ROOT OF THIS REPO]
./activitynet.sh 2>&1 | tee activitynet.sh.log

Statistics

Here is some pseudo code together with some outputs.

$ cd ./dataset-zoo

# Number of .description files (videos watched)
$ find . -type f -name "*.description" | wc -l
6603

# Number of descriptions (text lines)
$ find . -type f -name "*.description" -exec cat {} \; | wc -l
326815

# Number of words including timestamps
$ find . -type f -name "*.description" -exec cat {} \; | wc -w
25747879

# Number of words excluding timestamps (in the descriptions only)
= 25747879 - 326815 = 25421064 = 25M words = 34M tokens

# Cat a random description file
$ find . -type f -name "*.description" | shuf -n 1 | xargs cat

# Cat a random description file without timestamps
$ find . -type f -name "*.description" | shuf -n 1 | xargs cat | cut -d" " -f2-

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published