This repository provides the data and demo code for Jong-Yun Park et al., "Sound reconstruction from human brain activity via a generative model with brain-like auditory features."
- Raw fMRI data: TBA
- Preprocessed fMRI data and DNN features extracted from the sound clips: figshare
- Trained transformer models: figshare
- Stimulus sound clips: refer to `data/README.md`.
- Clone this `SoundReconstruction` repository to your local machine (a GPU machine is preferred).

```
git clone git@github.com:KamitaniLab/SoundReconstruction.git
```
- Create a conda environment from `specvqgan.yaml`, activate it, and confirm that PyTorch can see a GPU.

```
conda env create --name specvqgan -f specvqgan.yaml
conda activate specvqgan
python -c "import torch; print(torch.cuda.is_available())"
# True
```
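If that check does not print `True`, the following sketch (a hypothetical helper script, not part of the repository) prints a little more detail about the PyTorch build and the visible GPUs, which can help diagnose the environment:

```python
# check_env.py -- hypothetical helper, not part of this repository.
# Run inside the activated "specvqgan" environment to see which GPUs PyTorch can use.
import torch

print("PyTorch version:", torch.__version__)
print("CUDA build:", torch.version.cuda)

if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        print(f"GPU {i}:", torch.cuda.get_device_name(i))
else:
    print("No CUDA device visible; a GPU machine is recommended for the steps below.")
```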
- Clone the `SpecVQGAN` repository next to the `SoundReconstruction` directory. Please use the following fork instead of the original SpecVQGAN repository, because the path of the Transformer configuration file has been rewritten in the fork.

```
git clone git@github.com:KamitaniLab/SpecVQGAN.git
```
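The instruction above places `SpecVQGAN` as a sibling of `SoundReconstruction`. The snippet below is an optional, minimal sketch (the script name `layout_check.py` is hypothetical) that verifies this layout when run from the `SoundReconstruction` root:

```python
# layout_check.py -- hypothetical helper, not part of this repository.
# Verifies that SpecVQGAN was cloned as a sibling of SoundReconstruction.
from pathlib import Path

repo_root = Path.cwd()                       # assumes you run this from SoundReconstruction/
specvqgan = repo_root.parent / "SpecVQGAN"   # sibling directory, per the clone step above

if specvqgan.is_dir():
    print("Found SpecVQGAN at", specvqgan)
else:
    print("SpecVQGAN not found next to", repo_root.name,
          "- re-check the clone location before running the scripts below.")
```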
See `data/README.md`.
We provide scripts that reproduce the main results in the original paper. Please execute the shell scripts in the following order.
- Train feature decoders to predict the VGGishish features.

```
./1_train_batch.sh
```
- Using the decoders trained in step 1, perform the feature predictions (the prediction for the attention-task dataset is performed at the same time).

```
./2_test_batch.sh
```
- Evaluate the prediction accuracy of the predicted features (a conceptual sketch of steps 1-3 follows the notebook reference below).

```
./3_eval_batch.sh
```
Visualize the prediction accuracy with the following notebook, which draws Fig. 3D and Fig. 3E of the original paper.

`feature_decoding/makefigures_featdec_eval.ipynb`
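For orientation, here is a self-contained conceptual sketch of what steps 1-3 do: a regularized linear model is trained to map fMRI patterns to DNN feature values, the trained model predicts features for held-out data, and accuracy is summarized for each feature unit as the correlation between predicted and true values. This is only an illustration with scikit-learn Ridge regression and synthetic data; the actual decoders, regularization, and evaluation metrics are those implemented by the batch scripts and the notebook above.

```python
# decode_sketch.py -- conceptual illustration of steps 1-3 (NOT the repository's pipeline).
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Synthetic stand-ins: 200 training / 50 test samples, 1000 voxels, 128 feature units.
n_train, n_test, n_voxels, n_units = 200, 50, 1000, 128
X_train = rng.standard_normal((n_train, n_voxels))   # fMRI patterns (training)
X_test = rng.standard_normal((n_test, n_voxels))     # fMRI patterns (test)
W_true = rng.standard_normal((n_voxels, n_units))
Y_train = X_train @ W_true + rng.standard_normal((n_train, n_units))  # stand-in DNN features
Y_test = X_test @ W_true + rng.standard_normal((n_test, n_units))

# Step 1 (train): a regularized linear decoder from voxels to all feature units.
decoder = Ridge(alpha=100.0)
decoder.fit(X_train, Y_train)

# Step 2 (test): predict feature values for the held-out fMRI data.
Y_pred = decoder.predict(X_test)

# Step 3 (evaluate): per-unit Pearson correlation between predicted and true features.
def unitwise_corr(a, b):
    a = (a - a.mean(0)) / a.std(0)
    b = (b - b.mean(0)) / b.std(0)
    return (a * b).mean(0)

r = unitwise_corr(Y_pred, Y_test)
print(f"mean correlation over {n_units} units: {r.mean():.3f}")
```

In the repository, the batch scripts handle data loading, subject/ROI configuration, and saving of results; nothing from this sketch is required to run them.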
- Reconstruct sound clips using the predicted features.

```
./4_recon_batch.sh
```
- Evaluate the quality of the reconstructed sounds (a rough illustrative sketch follows the notebook references below).

```
./5_recon_eval_batch.sh
```
Visualize the reconstruction quality with the following notebooks, which draw Fig. 4C and Fig. 8C of the original paper.

- `reconstruction/makefigures_recon_eval.ipynb`
- `reconstruction/makefigures_recon_eval_attention.ipynb`
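As a rough illustration of what "reconstruction quality" can mean at the signal level, the sketch below correlates the log-mel spectrograms of an original and a reconstructed clip. It is not the evaluation performed by `5_recon_eval_batch.sh` or the notebooks above; the file names and sampling rate are hypothetical placeholders, and it assumes `librosa` is available in the environment.

```python
# recon_eval_sketch.py -- rough, stand-alone illustration of reconstruction-quality
# scoring (NOT the metrics used by 5_recon_eval_batch.sh).
import librosa
import numpy as np

SR = 22050  # assumed sampling rate for the comparison

def log_mel(path, sr=SR):
    """Load a clip and return its log-mel spectrogram."""
    y, _ = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=80)
    return librosa.power_to_db(mel, ref=np.max)

orig = log_mel("original_clip.wav")        # hypothetical path
recon = log_mel("reconstructed_clip.wav")  # hypothetical path

# Trim to a common length, then correlate the flattened spectrograms.
n = min(orig.shape[1], recon.shape[1])
r = np.corrcoef(orig[:, :n].ravel(), recon[:, :n].ravel())[0, 1]
print(f"log-mel spectrogram correlation: {r:.3f}")
```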