@xiaoshenlong17 commented Sep 1, 2025

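Satellite Image Time Series Segmentation: U-TAE

Overview

This example implements the U-TAE (U-Net with Temporal Attention Encoder) model for semantic and panoptic segmentation on the PASTIS dataset.
The model was first published at ICCV 2021 (Panoptic Segmentation of Satellite Image Time Series with Convolutional Temporal Attention Networks).
We port the original PyTorch implementation to PaddlePaddle and provide complete training and testing scripts.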

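Dataset

  • PASTIS: a dataset designed for semantic segmentation of remote-sensing image time series, containing multi-temporal Sentinel-2 imagery and the corresponding parcel annotations.
  • Task types:
    • Semantic Segmentation
    • Panoptic Segmentation

The dataset can be downloaded from the official PASTIS website.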

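Model Architecture

  • U-TAE Backbone: models the temporal features of satellite image series with convolutions and a temporal attention mechanism.
  • ConvLSTM & LTAE: used for temporal feature encoding and decoding.
  • Panoptic Head: performs panoptic segmentation on top of the semantic segmentation output.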
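
As a rough illustration of the temporal attention idea in the list above, the following NumPy sketch collapses a sequence of per-date feature maps into a single map by attending over the time axis at every pixel. It is a simplified single-head version: the function name, the random projection weights, and the single shared query vector are assumptions made for this example. It is not the L-TAE code from this PR, which also involves multiple heads and positional encoding.

```python
import numpy as np


def temporal_attention_pool(x, w_key, query, pad_mask=None):
    """Collapse the time axis of x with softmax attention (illustrative only).

    x:        (T, C, H, W) feature maps, one per acquisition date.
    w_key:    (C, d_k) projection from features to keys (hypothetical weights).
    query:    (d_k,) single query vector shared by all pixels.
    pad_mask: optional (T,) bool array, True where a date is padding.
    Returns (out, attn) with shapes (C, H, W) and (T, H, W).
    """
    t, c, h, w = x.shape
    d_k = w_key.shape[1]
    # Treat every pixel as an independent sequence: (H*W, T, C).
    seq = x.transpose(2, 3, 0, 1).reshape(h * w, t, c)
    keys = seq @ w_key                      # (H*W, T, d_k)
    scores = keys @ query / np.sqrt(d_k)    # (H*W, T)
    if pad_mask is not None:
        # Padded dates get a very low score so their weight becomes ~0.
        scores = np.where(pad_mask[None, :], -1e9, scores)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)        # softmax over time
    pooled = np.einsum("pt,ptc->pc", attn, seq)          # weighted sum over T
    out = pooled.reshape(h, w, c).transpose(2, 0, 1)
    attn_maps = attn.reshape(h, w, t).transpose(2, 0, 1)
    return out, attn_maps


rng = np.random.default_rng(0)
x = rng.normal(size=(6, 8, 4, 4))   # T=6 dates, C=8 channels, 4x4 pixels
w_key = rng.normal(size=(8, 4))
query = rng.normal(size=4)
pad_mask = np.array([False, False, False, False, True, True])
out, attn = temporal_attention_pool(x, w_key, query, pad_mask)
print(out.shape, attn.shape)
```

The attention weights sum to one over the time axis at each pixel, and padded dates contribute nothing to the pooled feature map.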

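The port includes the following core components:

  • src/backbones/: network backbones (ConvLSTM, LTAE, Positional Encoding, UTAE)
  • src/learning/: training utilities (loss, mIoU, weight initialization)
  • src/panoptic/: panoptic segmentation module (PaPs, loss, FocalLoss, metrics, utils)
  • Training/testing scripts: train_semantic.py, test_semantic.py, train_panoptic.py, test_panoptic.py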

Usage

Semantic segmentation

Training:

python train_semantic.py --config configs/utae_semantic.yaml

Testing:

python test_semantic.py --config configs/utae_semantic.yaml --weights output/utae_semantic/best_model.pdparams

Panoptic segmentation

Training:

python train_panoptic.py --config configs/utae_panoptic.yaml

Testing:

python test_panoptic.py --config configs/utae_panoptic.yaml --weights output/utae_panoptic/best_model.pdparams

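Results

On the PASTIS dataset, this example reproduces the following performance (PaddlePaddle implementation):

  • SQ (Segmentation Quality): 83.8
  • RQ (Recognition Quality): 58.9
  • PQ (Panoptic Quality): 49.7

Performance is close to that of the original PyTorch implementation, which indicates that the port is valid.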
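
For readers unfamiliar with these metrics, the sketch below shows how SQ, RQ, and PQ relate for a single class under the standard panoptic quality definition: SQ is the mean IoU of matched segments, RQ is an F1-style detection score, and PQ is their product. The function name and the toy numbers are hypothetical and unrelated to the results above. Reported scores are typically averaged over classes, so an averaged PQ need not equal the product of the averaged SQ and RQ exactly.

```python
def panoptic_quality(matched_ious, num_false_positives, num_false_negatives):
    """Return (SQ, RQ, PQ) for one class (illustrative helper, not the PR's code).

    matched_ious: IoU of each matched (prediction, ground truth) pair.
                  The caller is assumed to have kept only pairs with IoU > 0.5.
    """
    tp = len(matched_ious)
    denominator = tp + 0.5 * num_false_positives + 0.5 * num_false_negatives
    if denominator == 0:
        # Class absent from both prediction and ground truth.
        return 0.0, 0.0, 0.0
    sq = sum(matched_ious) / tp if tp else 0.0  # mean IoU over matches
    rq = tp / denominator                       # F1-style recognition score
    return sq, rq, sq * rq


# Toy example: 3 matched segments, 1 spurious prediction, 2 missed parcels.
sq, rq, pq = panoptic_quality([0.9, 0.8, 0.7], num_false_positives=1, num_false_negatives=2)
print(f"SQ={sq:.3f} RQ={rq:.3f} PQ={pq:.3f}")
```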

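File Structure

UTAE
 ├── src
 │   ├── backbones/              # Model backbones
 │   ├── learning/               # Training utilities
 │   ├── panoptic/               # Panoptic segmentation module
 │   ├── dataset.py              # Data loading
 │   ├── model_utils.py          # Model helper functions
 │   └── utils.py                # General utilities
 ├── train_semantic.py           # Semantic segmentation training
 ├── test_semantic.py            # Semantic segmentation testing
 ├── train_panoptic.py           # Panoptic segmentation training
 └── test_panoptic.py            # Panoptic segmentation testing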

paddle-bot bot commented Sep 1, 2025

Thanks for your contribution!

Collaborator

@HydrogenSulfate left a comment

There are a few issues here; please fix them.

Comment on lines 22 to 37
"""
Lightweight Temporal Attention Encoder (L-TAE) for image time series.
Attention-based sequence encoding that maps a sequence of images to a single feature map.
A shared L-TAE is applied to all pixel positions of the image sequence.
Args:
    in_channels (int): Number of channels of the input embeddings.
    n_head (int): Number of attention heads.
    d_k (int): Dimension of the key and query vectors.
    mlp (List[int]): Widths of the layers of the MLP that processes the concatenated outputs of the attention heads.
    dropout (float): Dropout rate.
    d_model (int, optional): If specified, the input tensors will first be processed by a fully connected layer
        to project them into a feature space of dimension d_model.
    T (int): Period to use for the positional encoding.
    return_att (bool): If True, the module returns the attention masks along with the embeddings (default False).
    positional_encoding (bool): If False, no positional encoding is used (default True).
"""
Collaborator

Please move the class docstring to directly below the class declaration, above __init__.
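
A minimal sketch of the requested layout, using a plain Python class as a stand-in so that it runs without PaddlePaddle. The class name `LTAE2d`, the shortened argument list, and the default values are assumptions for illustration; the real class would presumably subclass a Paddle layer and keep the full docstring.

```python
class LTAE2d:
    """
    Lightweight Temporal Attention Encoder (L-TAE) for image time series.

    Args:
        in_channels (int): Number of channels of the input embeddings.
        n_head (int): Number of attention heads.
        d_k (int): Dimension of the key and query vectors.
    """

    def __init__(self, in_channels=128, n_head=16, d_k=4):
        # No docstring here: the class docstring above already documents the arguments.
        self.in_channels = in_channels
        self.n_head = n_head
        self.d_k = d_k


# Because the string is the first statement in the class body, Python stores it
# as LTAE2d.__doc__, so help(LTAE2d) and documentation tools can find it.
print(LTAE2d.__doc__.strip().splitlines()[0])
```

A string literal placed anywhere else, for example above the `class` line or after other statements in `__init__`, is evaluated and discarded and never becomes `__doc__`.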

This example is based on UTAE (U-TAE) and is wrapped with PaddleScience as follows:
```
--8<--
examples/utae/src/model.py:1:80
```
Collaborator

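Paths like this one do not match the paths uploaded in this PR, so the snippets cannot be displayed in the docs. Please check the PR's documentation preview link and fix them.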

Image

Comment on lines 82 to 127
### Semantic segmentation
- Training:
```bash
python train_semantic.py \
--dataset_folder "/path/to/PASTIS" \
--epochs 100 \
--batch_size 2 \
--num_workers 0 \
--display_step 10
```

- Testing:
```bash
wget -nc -O pretrained/utae_semantic.pdparams https://paddle-org.bj.bcebos.com/paddlescience/models/utae/semantic.pdparams
python test_semantic.py \
--weight_file pretrained/utae_semantic.pdparams \
--dataset_folder "/path/to/PASTIS" \
--device gpu --num_workers 0

```

### Panoptic segmentation
- Training:
```bash
python train_panoptic.py \
--dataset_folder "/path/to/PASTIS" \
--epochs 100 \
--batch_size 2 \
--num_workers 0 \
--warmup 5 \
--l_shape 1 \
--display_step 10
```

- Testing:
```bash
wget -O pretrained/utae_panoptic.pdparams \
https://paddle-org.bj.bcebos.com/paddlescience/models/utae/panoptic.pdparams
python test_panoptic.py \
--weight_file ./pretrained/utae_panoptic.pdparams \
--dataset_folder "/path/to/PASTIS" \
--batch_size 2 \
--num_workers 0 \
--device gpu
```

Collaborator

Comment on lines 128 to 133
## Results
On the PASTIS dataset, this example reproduces the following performance (PaddlePaddle implementation):
- **SQ (Segmentation Quality)**: 83.8
- **RQ (Recognition Quality)**: 58.9
- **PQ (Panoptic Quality)**: 49.7
![rusult](https://paddle-org.bj.bcebos.com/paddlescience/docs/utae/rusult.png)
Collaborator

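Likewise, I suggest placing the experimental results and pretrained weights at the beginning of the document; for reference, see: https://paddlescience-docs.readthedocs.io/zh-cn/latest/zh/examples/shock_wave/?h=%E6%BF%80%E6%B3%A2#1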

Comment on lines 11 to 26
"""
Initialize ConvLSTM cell.

Parameters
----------
input_size: (int, int)
    Height and width of input tensor as (height, width).
input_dim: int
    Number of channels of input tensor.
hidden_dim: int
    Number of channels of hidden state.
kernel_size: (int, int)
    Size of the convolutional kernel.
bias: bool
    Whether or not to add the bias.
"""
Collaborator

Please move the __init__-related content to directly below the class name.

Comment on lines 131 to 135
On the PASTIS dataset, this example reproduces the following performance (PaddlePaddle implementation):

- **SQ (Segmentation Quality)**: 83.8
- **RQ (Recognition Quality)**: 58.9
- **PQ (Panoptic Quality)**: 49.7
Contributor

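On the PASTIS dataset, the visualizations of the panoptic and semantic segmentation predictions reproduced by this example are shown in the figure: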

Comment on lines 145 to 149
- Source code implementation: [python test_semantic.py \
--weight_file /home/aistudio/1/results/Fold_1/model_epoch_8_miou_0.477.pdparams \
--dataset_folder "/home/aistudio/PASTIS" \
--device gpu
--num_workers 0](https://github.com/VSainteuf/utae-paps)
Contributor
