
Add flow_to_image() visualization util #5091


Closed
NicolasHug opened this issue Dec 13, 2021 · 5 comments · Fixed by #5134

Comments

@NicolasHug
Member

NicolasHug commented Dec 13, 2021

Now that RAFT is done, it'd be nice to have a conversion util from a 2xHxW flow to a 3xHxW RGB image. I think the implementation of the original RAFT paper follows a standard color map, so we could do something similar (trimmed down to the minimal parts, e.g. no BGR output etc.): https://github.com/princeton-vl/RAFT/blob/master/core/utils/flow_viz.py

Note: this is a visualization util meant to land in torchvision.utils; it is not meant to be a transform nor an op.
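For a rough idea of what such a util could look like, here is a minimal sketch that maps flow direction to hue and flow magnitude to brightness via HSV. The function name mirrors the proposal, but the signature, the simplified color map (not the Baker et al. color wheel that RAFT's flow_viz implements), and all internal names are assumptions, not the final API:

```python
import math

import torch


def flow_to_image(flow: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch: map a (2, H, W) flow tensor to a (3, H, W) uint8 RGB image.

    Direction is encoded as hue and magnitude as brightness; this is a
    simplified color map, not RAFT's Baker et al. color wheel.
    """
    if flow.ndim != 3 or flow.shape[0] != 2:
        raise ValueError("expected a (2, H, W) flow tensor")
    u, v = flow[0], flow[1]
    magnitude = torch.sqrt(u ** 2 + v ** 2)
    angle = torch.atan2(-v, -u)  # in [-pi, pi]

    hue = (angle + math.pi) / (2 * math.pi)           # [0, 1]
    value = magnitude / magnitude.max().clamp(min=1e-8)  # [0, 1]

    # HSV -> RGB with saturation fixed at 1.
    i = (hue * 6).floor().long() % 6
    f = hue * 6 - (hue * 6).floor()
    p = torch.zeros_like(value)
    q = value * (1 - f)
    t = value * f
    lut = [
        torch.stack([value, t, p]),
        torch.stack([q, value, p]),
        torch.stack([p, value, t]),
        torch.stack([p, q, value]),
        torch.stack([t, p, value]),
        torch.stack([value, p, q]),
    ]
    rgb = torch.zeros(3, *value.shape)
    for k in range(6):
        rgb = torch.where(i == k, lut[k], rgb)
    return (rgb * 255).to(torch.uint8)
```

Zero flow comes out black, and per-image normalization by the max magnitude keeps the output in range regardless of flow scale; the real util would need to decide whether that normalization should instead be configurable.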

@oke-aditya
Contributor

I can take this 😄

@NicolasHug
Member Author

Thanks @oke-aditya ! Looking forward to your PR!

@oke-aditya
Contributor

It will take me some time, as I haven't tried RAFT and I'm not experienced with optical flow either.

But I will definitely prototype something before the end of the year.

@NicolasHug
Member Author

I'll be on holidays the last 2 weeks of Dec, so there's absolutely no rush on our side ;)

Just FYI, you don't really need to be familiar with optical flow models or with RAFT for this task, since it's just about converting a flow into an RGB image.

But if you want to play around with RAFT, I've been using this (hacky) script to apply our model to videos:

```python
import os
from argparse import ArgumentParser
import sys
sys.path.append('../RAFT/core')  # This is the original RAFT repo from https://github.com/princeton-vl/RAFT

import cv2
import numpy as np
import torch

from torchvision.prototype.models.optical_flow import raft_large
from utils import flow_viz


def frame_preprocess(frame, device):
    # HxWxC uint8 frame -> 1x3xHxW float tensor on the target device
    frame = torch.from_numpy(frame).permute(2, 0, 1).float()
    frame = frame.unsqueeze(0)
    frame = frame.to(device)
    return frame


def visualize_flow(img, flo, save, counter):
    # permute the channels and change device if necessary
    img = img[0].permute(1, 2, 0).cpu().numpy()
    flo = flo[0].permute(1, 2, 0).cpu().numpy()

    # map flow to an RGB image, then to BGR for OpenCV
    flo = flow_viz.flow_to_image(flo)
    flo = cv2.cvtColor(flo, cv2.COLOR_RGB2BGR)

    # concatenate, save and show images
    img_flo = np.concatenate([img, flo], axis=0)
    if save:
        cv2.imwrite(f"demo_frames/frame_{counter}.jpg", img_flo)
    cv2.imshow("Optical Flow", img_flo / 255.0)
    k = cv2.waitKey(25) & 0xFF
    if k == 27:  # stop on Esc
        return False
    return True


def inference(args):
    model = raft_large(weights="Raft_Large_Weights.C_T_SKHT_K_V2")

    save = args.save
    if save:
        os.makedirs("demo_frames", exist_ok=True)

    device = "cpu"
    model = model.to(device)
    model.eval()

    video_path = args.video

    cap = cv2.VideoCapture(video_path)
    cap.set(cv2.CAP_PROP_POS_FRAMES, int(3 * 30))  # skip the first ~3 seconds at 30 fps
    ret, frame_1 = cap.read()
    if not ret:
        raise RuntimeError(f"Could not read from {video_path}")

    frame_1 = frame_preprocess(frame_1, device)

    counter = 0
    with torch.no_grad():
        while True:
            # read the next frame
            ret, frame_2 = cap.read()
            if not ret:
                break
            frame_2 = frame_preprocess(frame_2, device)
            # RAFT expects inputs in [-1, 1] and returns a list of flow predictions
            flow_preds = model(2 * frame_1 / 255 - 1, 2 * frame_2 / 255 - 1)
            flow_up = flow_preds[-1]
            ret = visualize_flow(frame_1, flow_up, save, counter)
            if not ret:
                break
            frame_1 = frame_2
            counter += 1


def main():
    parser = ArgumentParser()
    parser.add_argument("--model", help="restore checkpoint")
    parser.add_argument("--iters", type=int, default=12)
    parser.add_argument("--video", type=str, default="./videos/car.mp4")
    parser.add_argument("--save", action="store_true", help="save demo frames")
    parser.add_argument("--small", action="store_true", help="use small model")
    parser.add_argument(
        "--mixed_precision", action="store_true", help="use mixed precision"
    )

    args = parser.parse_args()
    inference(args)


if __name__ == "__main__":
    main()
```

(adapted from https://github.com/spmallick/learnopencv/blob/master/Optical-Flow-Estimation-using-Deep-Learning-RAFT/inference.py). The goal of this issue is to implement the flow_viz.flow_to_image() function used in the script.
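Whatever color map is chosen in the end, the first step is the same polar decomposition that RAFT's flow_viz performs. A minimal NumPy illustration (variable names here are my own):

```python
import numpy as np

# A tiny 1x1 "flow field": 3 px to the right and 4 px down.
flow = np.array([[[3.0, 4.0]]])  # shape (H, W, 2), as in RAFT's flow_viz
u, v = flow[..., 0], flow[..., 1]

magnitude = np.sqrt(u ** 2 + v ** 2)  # -> 5.0, drives the color intensity
angle = np.arctan2(-v, -u) / np.pi    # in [-1, 1], drives the hue
```

The magnitude is then normalized by its per-image maximum before being mapped through the color wheel, so the visualization is scale-independent across frames.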

@oke-aditya
Contributor

> I'll be on holidays the last 2 weeks of Dec, so there's absolutely no rush on our side ;)

Happy Holidays Nicolas. :)

I will have a look at the script and try to implement a minimalist version.
I will ping you back in a PR hopefully soon.
