
Flow Snapshot Neurons in Action: Deep Neural Networks Generalize to Biological Motion Perception

Authors: Shuangpeng Han, Ziyu Wang, Mengmi Zhang

This is a PyTorch implementation of the Motion Perceiver (MP) proposed in our paper. Our paper has been accepted at NeurIPS 2024.

   

Project Description

Biological motion perception (BMP) refers to humans' ability to perceive and recognize the actions of living beings solely from their motion patterns, sometimes as minimal as those depicted on point-light displays. While humans excel at these tasks without any prior training, current AI models generalize poorly. To close this research gap, we propose the Motion Perceiver (MP). MP relies solely on patch-level optical flows from video clips as input. During training, it learns prototypical flow snapshots through a competitive binding mechanism and integrates invariant motion representations to predict action labels for the given video. During inference, we evaluate the generalization ability of all AI models and humans on 62,656 video stimuli spanning 24 BMP conditions rendered as point-light displays from neuroscience. Remarkably, MP outperforms all existing AI models, with a maximum improvement of 29% in top-1 action recognition accuracy on these conditions. Moreover, we benchmark all AI models on point-light versions of two standard video datasets from computer vision, where MP also demonstrates superior performance. More interestingly, via psychophysics experiments, we find that MP recognizes biological movements in a way that aligns with human behavioural data.
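To make the mechanism above concrete, here is a minimal, hypothetical PyTorch sketch of competitively binding patch-level flow features to learnable prototype ("flow snapshot") vectors, then pooling the bound representation for action classification. This is only an illustration of the idea described in this section; the class name, dimensions, and details are assumptions and do not reflect the actual MP implementation in this repository.

import torch
import torch.nn as nn
import torch.nn.functional as F

class FlowSnapshotSketch(nn.Module):
    # Illustrative only: binds patch-level optical-flow features to learnable
    # prototype ("flow snapshot") vectors via a softmax competition, pools the
    # resulting slots, and predicts an action label. Not the actual MP code.
    def __init__(self, flow_dim=128, num_prototypes=16, num_classes=60):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, flow_dim))
        self.classifier = nn.Linear(flow_dim, num_classes)

    def forward(self, flow_feats):
        # flow_feats: (batch, num_patches, flow_dim) patch-level flow features
        sim = F.normalize(flow_feats, dim=-1) @ F.normalize(self.prototypes, dim=-1).t()
        assign = sim.softmax(dim=-1)                 # competition over prototypes
        slots = assign.transpose(1, 2) @ flow_feats  # (batch, num_prototypes, flow_dim)
        pooled = slots.mean(dim=1)                   # integrated motion representation
        return self.classifier(pooled)               # action logits

logits = FlowSnapshotSketch()(torch.randn(2, 196, 128))  # toy usage: 2 clips, 196 patches each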



Some sample video stimuli are shown below.

[Sample video stimuli: RGB, J-26P, J-6P, and SP-8P-1LT conditions]

Environment Setup

Our code is based on PyTorch 2.0.0, CUDA 11.2, and Python 3.9.

We recommend using conda for installation:

conda env create -f environment.yml

conda activate MP

cd MotionPerceiver

git clone https://github.com/facebookresearch/pytorchvideo.git

cd pytorchvideo

pip install -e .

cd ..

export PYTHONPATH=.
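After activating the environment and installing pytorchvideo, a quick sanity check (a minimal sketch; the exact versions reported on your machine may differ from those listed above) is:

import torch
import pytorchvideo  # should import cleanly after the editable install above

print("torch:", torch.__version__)                   # expected around 2.0.0
print("CUDA available:", torch.cuda.is_available())  # True if a compatible GPU is visible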

Dataset

The RGB videos in our Biological Motion Perception (BMP) dataset are from the NTU RGB+D 120 dataset; please download it from the following link:

To generate other BMP conditions, you may follow the instructions in DATASET.md to prepare them.

Other datasets used to train and evaluate our model are as follows:

Training & Testing

Our codebase is structured in a manner similar to the SlowFast repository.

We use Fast Forward Computer Vision (FFCV) to expedite data loading during training and testing.
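For reference, a minimal FFCV loading sketch looks roughly like the following. The field names ("video", "label") and the decoder/transform pipelines here are assumptions for illustration only; see the repository's data-loading code for the pipelines actually used with the .beton files.

from ffcv.loader import Loader, OrderOption
from ffcv.fields.decoders import IntDecoder, NDArrayDecoder
from ffcv.transforms import ToTensor

# Hypothetical field names and pipelines for a .beton file like the one below.
loader = Loader(
    ".../FFCV_data/train_RGB.beton",
    batch_size=8,
    num_workers=4,
    order=OrderOption.RANDOM,
    pipelines={
        "video": [NDArrayDecoder(), ToTensor()],
        "label": [IntDecoder(), ToTensor()],
    },
)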

All hyperparameters are listed and explained in the Config.

Remember to modify the following parameters in the YAML file to match your own setup:

  • TRAIN.FFCV.DATAPATH_PREFIX
  • TRAIN.SPLIT
  • VAL.FFCV.DATAPATH_PREFIX
  • VAL.SPLIT
  • TEST.FFCV.DATAPATH_PREFIX
  • TEST.SPLIT
  • OUTPUT_DIR

For example, if your training data is stored in ".../FFCV_data/train_RGB.beton", then TRAIN.FFCV.DATAPATH_PREFIX and TRAIN.SPLIT are ".../FFCV_data" and "train_RGB" respectively.
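In other words (assuming, as the example above suggests, that the loader joins the prefix with the split name plus a .beton extension):

import os

datapath_prefix = ".../FFCV_data"  # TRAIN.FFCV.DATAPATH_PREFIX
split = "train_RGB"                # TRAIN.SPLIT
print(os.path.join(datapath_prefix, split + ".beton"))  # .../FFCV_data/train_RGB.beton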

You can start training the model from scratch by running:

python3 Mainframe/run.py --cfg YAML/BMP.yaml

We use TRAIN.ENABLE and TEST.ENABLE in the YAML file to determine if training or testing should be performed for the current task. If you wish to conduct only testing, set TRAIN.ENABLE to False.

You can test the model by running:

python3 Mainframe/run.py --cfg YAML/BMP.yaml \
  TEST.CHECKPOINT_FILE_PATH path_to_your_checkpoint \
  TRAIN.ENABLE False

If you want to train or test our Enhanced Motion Perceiver (En-MP) model, please remember to set MODEL.MODEL_NAME to En_MP in the YAML file.
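For example, assuming command-line overrides work for this option as they do for the test command above, you could run:

python3 Mainframe/run.py --cfg YAML/BMP.yaml \
  MODEL.MODEL_NAME En_MP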

Our pretrained MP model on the RGB videos of the BMP dataset is available at link. Our pretrained En-MP model on the RGB videos of the BMP dataset is available at link. You can download a checkpoint and set TEST.CHECKPOINT_FILE_PATH to its path for inference.

Human Psychophysics Experiments on Amazon Mechanical Turk

We have conducted a series of Mechanical Turk experiments using the psiTurk platform, which requires JavaScript, HTML, and Python 2.7. Please refer to Put-In-Context and doc for detailed psiTurk instructions.

You can find the code for the human psychophysics experiments in the Psiturk folder. The important files and folders are described in detail below:

  • instructions/instruct-examples.html: instructions for human subjects.
  • instructions/stage.html: HTML configuration for each video stimulus.
  • static/js/task.js: main file for loading stimuli and running the experiment.
  • config.txt: configuration of the human psychophysics experiments.

Data from all participants in our human psychophysics experiments are available at the link, and data from those who passed our dummy trials are accessible through this link.

Citation

If you find our work useful in your research, please use the following BibTeX entry for citation.

@inproceedings{hanflow,
  title={Flow Snapshot Neurons in Action: Deep Neural Networks Generalize to Biological Motion Perception},
  author={Han, Shuangpeng and Wang, Ziyu and Zhang, Mengmi},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year={2024}
}
