[ICML 2025 Tokenization Workshop] HH-Codec: High Compression High-fidelity Discrete Neural Codec for Spoken Language Modeling


🎉 Discrete Neural Codec With 24 Tokens Per Second (24KHZ) for Spoken Language Modeling!
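As a quick back-of-envelope check of the headline rate: 24 kHz audio reduced to 24 tokens per second means each discrete token summarizes 1000 raw audio samples. The numbers come straight from the tagline above; the snippet is only an illustration of the compression ratio.

```python
sample_rate = 24_000      # 24 kHz input audio (from the tagline)
tokens_per_second = 24    # discrete tokens emitted per second

# Each token covers this many raw audio samples:
samples_per_token = sample_rate // tokens_per_second
print(samples_per_token)  # 1000
```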


Installation

To install HHCodec, follow these steps:

conda create -n hhcodec python=3.10 # Python >= 3.10 is required because BigVGAN depends on it
conda activate hhcodec
git clone https://github.com/rongkunxue/HH-Codec.git
cd HH-Codec 
pip install -e .

# Optional: install these only if you want to evaluate with UTMOS
pip install pip==24.0
pip install fairseq

Train

Step 1: Prepare the Training Dataset

Ensure your dataset is preprocessed by following the instructions in the dataset directory.

Step 2: Modify Configuration Files

Before starting training, update the configuration settings:

# Open and modify the following file "configs/train.yaml"
# Adjust parameters such as:
# - log settings
# - train_path
# - save_dir
# - device (e.g., CPU/GPU)
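As a rough illustration, the fields listed above might look like the sketch below. The key names and values are assumptions for illustration only; consult the shipped configs/train.yaml for the actual schema.

```yaml
# configs/train.yaml -- illustrative values; exact key names may differ
log:
  project: hhcodec
  use_wandb: true
train_path: /path/to/preprocessed/dataset
save_dir: ./checkpoints
device: cuda:0   # or "cpu"
```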

Step 3: Start Training

Once the dataset is prepared and the configuration is set, launch the training process:

# We expect to finalize and open-source the training code within two weeks.

Acknowledgement

The HHCodec codebase is adapted from the following repositories:

A huge thanks to the authors of these projects for their outstanding contributions! 🎉
