| title | emoji | colorFrom | colorTo | sdk | sdk_version | python_version | app_file | pinned | license |
|---|---|---|---|---|---|---|---|---|---|
| salad bowl (vampnet) | 🥗 | yellow | green | gradio | 5.23.2 | 3.11 | app.py | false | cc-by-nc-4.0 |
- setting up
- programmatic usage
- launching the web app
- training / fine-tuning
- exporting your model
- unloop
- token telephone
- a note on argbind
- take a look at the pretrained models
- licensing for pretrained models
python 3.9-3.11 works well (for example, using conda):
conda create -n vampnet python=3.9
conda activate vampnet
install VampNet
git clone https://github.com/hugofloresgarcia/vampnet.git
pip install -e ./vampnet
quick start!
import random
import vampnet
import audiotools as at
# load the default vampnet model
interface = vampnet.interface.Interface.default()
# list available finetuned models
finetuned_model_choices = interface.available_models()
print(f"available finetuned models: {finetuned_model_choices}")
# pick a random finetuned model
model_choice = random.choice(finetuned_model_choices)
print(f"choosing model: {model_choice}")
# load a finetuned model
interface.load_finetuned(model_choice)
# load an example audio file
signal = at.AudioSignal("assets/example.wav")
# get the tokens for the audio
codes = interface.encode(signal)
# build a mask for the audio
mask = interface.build_mask(
codes, signal,
periodic_prompt=7,
upper_codebook_mask=3,
)
# generate the output tokens
output_tokens = interface.vamp(
codes, mask, return_mask=False,
temperature=1.0,
typical_filtering=True,
)
# convert them to a signal
output_signal = interface.decode(output_tokens)
# save the output signal
output_signal.write("scratch/output.wav")
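Since the interface calls above are plain Python, it's easy to sketch small experiments around them. For example, the loop below (an illustrative sketch that continues from the quick start above, with arbitrary parameter values, not code from the repo) reuses interface.build_mask, interface.vamp, and interface.decode to render a few variations of the same input at different mask settings.

```python
# continuing from the quick start: sweep the periodic prompt and save
# one output per setting. the values below are arbitrary examples.
for i, prompt in enumerate([5, 7, 13]):
    mask = interface.build_mask(
        codes, signal,
        periodic_prompt=prompt,
        upper_codebook_mask=3,
    )
    output_tokens = interface.vamp(
        codes, mask, return_mask=False,
        temperature=1.0,
        typical_filtering=True,
    )
    interface.decode(output_tokens).write(f"scratch/output_{i}.wav")
```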
You can launch a gradio UI to play with vampnet.
python app.py
To train a model, run the following script:
python scripts/exp/train.py --args.load conf/vampnet.yml --save_path /path/to/checkpoints
for multi-gpu training, use torchrun:
torchrun --nproc_per_node gpu scripts/exp/train.py --args.load conf/vampnet.yml --save_path path/to/ckpt
You can edit conf/vampnet.yml to change the dataset paths or any training hyperparameters. For coarse2fine models, you can use conf/c2f.yml as a starting configuration. See python scripts/exp/train.py -h for a list of options.
To debug training, it's easiest to run with 1 gpu and 0 workers:
CUDA_VISIBLE_DEVICES=0 python -m pdb scripts/exp/train.py --args.load conf/vampnet.yml --save_path /path/to/checkpoints --num_workers 0
To fine-tune a model, use the script in scripts/exp/fine_tune.py.
for an audio folder:
python scripts/exp/fine_tune.py /path/to/audio/folder <fine_tune_name>
for multiple files:
python scripts/exp/fine_tune.py "/path/to/audio1.mp3 /path/to/audio2/ /path/to/audio3.wav" <fine_tune_name>
This creates configuration files for a fine-tuning training job. The save_paths will be set to runs/<fine_tune_name>/coarse and runs/<fine_tune_name>/c2f.
launch the coarse job:
python scripts/exp/train.py --args.load conf/generated/<fine_tune_name>/coarse.yml
this will save the coarse model to runs/<fine_tune_name>/coarse/ckpt/best/.
launch the c2f job:
python scripts/exp/train.py --args.load conf/generated/<fine_tune_name>/c2f.yml
To resume from a checkpoint, use the --resume flag and set --save_path to point to the checkpoint you want to resume from.
python scripts/exp/train.py --args.load conf/generated/steve/coarse.yml --save_path runs/steve/coarse --resume
Once your model has been fine-tuned, you can export it to HuggingFace. You will need to do this in order to use your model in app.py.
NOTE: In order to export, you will need a huggingface account.
Now, log in to huggingface using the command line:
huggingface-cli login
replace the contents of the file named ./DEFAULT_HF_MODEL_REPO with your <HUGGINGFACE_USERNAME>/vampnet. A model repo will be automatically created for you by export.py. The default is hugggof/vampnet.
for example, if my username is hugggof, I would run the following command:
echo 'hugggof/vampnet' > ./DEFAULT_HF_MODEL_REPO
Now, run the following command to export your model (replace <your_finetuned_model_name> with the name of your model):
python scripts/exp/export.py --name <your_finetuned_model_name> --model latest
Once that's done, your model should appear on the list of available models in the gradio interface.
Simply run python app.py and select your model from the dropdown list.
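If you'd rather check the export from Python instead of the dropdown, the same interface calls from the quick start can be reused. The sketch below is only illustrative: it assumes your exported model shows up under the name you gave it in interface.available_models(), presumably the same list the gradio dropdown is populated from.

```python
import vampnet

# list the finetuned models the interface can see
interface = vampnet.interface.Interface.default()
models = interface.available_models()
print(models)

# "<your_finetuned_model_name>" is the same placeholder used above
if "<your_finetuned_model_name>" in models:
    interface.load_finetuned("<your_finetuned_model_name>")
```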
Make sure you have Max installed on your laptop!
NOTE: To run unloop (with a GPU-powered server), you will need to install the vampnet repo on both your local machine and your GPU server.
First, on your GPU server, run the gradio server:
python app.py --args.load conf/interface.yml --Interface.device cuda
This will run a vampnet gradio API on your GPU server. Copy the address. It will be something like https://127.0.0.1:7860/.
IMPORTANT: Make sure that this gradio port (by default 7860) is forwarded to your local machine, where you have Max installed.
Now, on your local machine, run the unloop gradio client.
cd unloop
pip install -r requirements.txt
python client.py --vampnet_url https://127.0.0.1:7860/ # replace with your gradio server address
This will start a gradio client that connects to the gradio server running on your GPU server.
Now, open the unloop Max patch. It's located at unloop/max/unloop.maxpat.
In the tape controls, check the heartbeat (<3) to make sure the connection to the local gradio client is working.
have fun!
Instructions forthcoming, but the sauce is in token_telephone/tt.py
This repository relies on argbind to manage CLIs and config files.
Config files are stored in the conf/ folder.
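If you haven't used argbind before, the rough idea is that a function decorated with argbind.bind gets its keyword arguments exposed as command-line flags and as keys in a YAML config (which is where flags like --args.load and the files in conf/ come from). The toy script below is a minimal sketch of that pattern based on argbind's own documentation; it is not code from this repo.

```python
import argbind

@argbind.bind()
def train(lr: float = 1e-4, batch_size: int = 32):
    # argbind fills these keyword arguments from the CLI or a YAML config
    print(f"training with lr={lr}, batch_size={batch_size}")

if __name__ == "__main__":
    args = argbind.parse_args()   # handles flags like --train.lr and --args.load
    with argbind.scope(args):
        train()
```

Roughly, running the script with --train.lr 0.001 (or pointing --args.load at a YAML file containing train.lr: 0.001) overrides the default value.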
All the pretrained models (trained by hugo) are stored here: https://huggingface.co/hugggof/vampnet
The weights for the models are licensed CC BY-NC-SA 4.0. Likewise, any VampNet models fine-tuned on the pretrained models are also licensed CC BY-NC-SA 4.0.
Download the pretrained models from this link. Then, extract the models to the models/ folder.