An Orchestration Learning Framework for Ultrasound Imaging: Prompt-Guided Hyper-Perception and Attention-Matching Downstream Synchronization
This repository provides the official PyTorch implementation for our work published in Medical Image Analysis, 2025. The framework introduces:
- Prompt-Guided Hyper-Perception for incorporating prior domain knowledge via learnable prompts.
- Attention-Matching Downstream Synchronization to seamlessly transfer knowledge across segmentation and classification tasks.
- Support for diverse ultrasound datasets with both segmentation and classification annotations (the $M^2$-US dataset).
- Distributed training and inference pipelines based on the Swin Transformer backbone.
For more details, please refer to the paper (temporary free link, expires on July 17, 2025) and the Project Page.
An orchestration learning framework for ultrasound imaging: Prompt-guided hyper-perception and attention-matching Downstream Synchronization
Zehui Lin, Shuo Li, Shanshan Wang, Zhifan Gao, Yue Sun, Chan-Tong Lam, Xindi Hu, Xin Yang, Dong Ni, and Tao Tan. Medical Image Analysis, 2025.
- Clone the repository
git clone https://github.com/Zehui-Lin/PerceptGuide
cd PerceptGuide
- Create a conda environment
conda create -n PerceptGuide python=3.10
conda activate PerceptGuide
- Install the dependencies
pip install -r requirements.txt
Organize your data directory with classification and segmentation sub-folders. Each sub-folder should contain a config.yaml file and train/val/test lists:
data
├── classification
│ └── DatasetA
│ ├── 0
│ ├── 1
│ ├── config.yaml
│ ├── train.txt
│ ├── val.txt
│ └── test.txt
└── segmentation
└── DatasetB
├── imgs
├── masks
├── config.yaml
├── train.txt
├── val.txt
└── test.txt
Use the examples provided in the codebase as a reference when preparing new datasets.
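As a quick sanity check before training, a small helper like the one below can confirm that every dataset folder contains the expected files. This is a minimal sketch based only on the layout shown above; check_data.py is not part of the repository.

```python
# check_data.py -- illustrative helper, not part of the repository.
# Walks the data/ tree shown above and reports missing config or split files.
import os

REQUIRED = ["config.yaml", "train.txt", "val.txt", "test.txt"]

def check_datasets(root="data"):
    for task in ("classification", "segmentation"):
        task_dir = os.path.join(root, task)
        if not os.path.isdir(task_dir):
            print(f"[missing] {task_dir}")
            continue
        for dataset in sorted(os.listdir(task_dir)):
            dataset_dir = os.path.join(task_dir, dataset)
            if not os.path.isdir(dataset_dir):
                continue
            missing = [f for f in REQUIRED
                       if not os.path.isfile(os.path.join(dataset_dir, f))]
            status = "ok" if not missing else "missing: " + ", ".join(missing)
            print(f"{task}/{dataset}: {status}")

if __name__ == "__main__":
    check_datasets()
```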
The repository bundles several ultrasound datasets; their licenses and redistribution conditions are listed below. Preprocessed versions of the datasets that permit redistribution can be downloaded from here.
| Dataset | License | Redistribution | Access |
|---|---|---|---|
| Appendix | CC BY-NC 4.0 | Included in repo | link |
| BUS-BRA | CC BY 4.0 | Included in repo | link |
| BUSIS | CC BY 4.0 | Included in repo | link |
| UDIAT | Private License | Not redistributable | link |
| CCAU | CC BY 4.0 | Included in repo | link |
| CUBS | CC BY 4.0 | Included in repo | link |
| DDTI | Unspecified License | License unclear | link |
| TN3K | Unspecified License | License unclear | link |
| EchoNet-Dynamic | Private License | Not redistributable | link |
| Fatty-Liver | CC BY 4.0 | Included in repo | link |
| Fetal_HC | CC BY 4.0 | Included in repo | link |
| MMOTU | CC BY 4.0 | Included in repo | link |
| kidneyUS | CC BY-NC-SA | Included in repo | link |
| BUSI | CC0 Public Domain | Included in repo | link |
| HMC-QU | CC BY 4.0 | Included in repo | link |
| TG3K | Unspecified License | License unclear | link |
Notes
- Private-license datasets (UDIAT, EchoNet-Dynamic) cannot be redistributed here; please request access through the provided links.
- Unspecified/unclear-license datasets (TN3K, TG3K, DDTI) may have redistribution restrictions. Download them directly from the source or contact the data owners for permission.
We employ torch.distributed for multi-GPU training (single GPU is also supported):
python -m torch.distributed.launch --nproc_per_node=1 --master_port=1234 omni_train.py --output_dir exp_out/trial_1 --prompt
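To scale to multiple GPUs, increase --nproc_per_node; for example, a four-GPU training run (assuming four visible GPUs on a single node) would be:

python -m torch.distributed.launch --nproc_per_node=4 --master_port=1234 omni_train.py --output_dir exp_out/trial_1 --prompt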
For evaluation, run:
python -m torch.distributed.launch --nproc_per_node=1 --master_port=1234 omni_test.py --output_dir exp_out/trial_1 --prompt
Download the Swin Transformer backbone and place it in pretrained_ckpt/:
The folder structure should look like:
pretrained_ckpt
└── swin_tiny_patch4_window7_224.pth
You can download the pre-trained checkpoints from the release pages.
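To verify the backbone checkpoint before training, a short sketch like the following can load the file and list a few parameter names. This assumes PyTorch is installed; the "model" key is how the official Swin Transformer releases typically nest their weights, so the sketch falls back to the raw checkpoint if that key is absent.

```python
# Quick sanity check for the Swin backbone checkpoint (illustrative sketch).
import torch

ckpt = torch.load("pretrained_ckpt/swin_tiny_patch4_window7_224.pth",
                  map_location="cpu")
# Official Swin releases typically store weights under a "model" key;
# otherwise treat the checkpoint itself as the state dict.
state_dict = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt
print(f"{len(state_dict)} tensors, e.g.:")
for name in list(state_dict)[:5]:
    print(" ", name, tuple(state_dict[name].shape))
```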
If you find this project helpful, please consider citing:
@article{lin2025orchestration,
title={An orchestration learning framework for ultrasound imaging: Prompt-guided hyper-perception and attention-matching Downstream Synchronization},
author={Lin, Zehui and Li, Shuo and Wang, Shanshan and Gao, Zhifan and Sun, Yue and Lam, Chan-Tong and Hu, Xindi and Yang, Xin and Ni, Dong and Tan, Tao},
journal={Medical Image Analysis},
pages={103639},
year={2025},
publisher={Elsevier}
}
This repository is built upon the Swin-Unet codebase. We thank the authors for making their work publicly available.