Description
Trained ONNX models that use visual observations perform significantly worse when run in Unity than during Python inference, where they work correctly. The agent exhibits noisy behavior and makes incorrect decisions that do not occur during training or Python inference. When the same data is provided as a Vector observation instead of a Visual one, both training and Unity inference work correctly.
To Reproduce
Steps to reproduce the behavior:
1. Create a custom visual sensor (50x50 pixels, 3 channels) using ObservationSpec.Visual().
2. Train with mlagents-learn config.yaml --run-id=test_run.
3. Test with Python inference via mlagents-learn config.yaml --run-id=test_run --resume --inference; the agent behaves correctly.
4. Load the generated .onnx file in Unity using the standard ML-Agents package; agent behavior becomes noisy and decisions are incorrect.
5. Change the same sensor to use ObservationSpec.Vector() instead and retrain; both Python inference and Unity inference work correctly.
Console logs / stack traces
No error messages or stack traces appear in the Unity console. The model loads successfully but produces degraded results.
Code snippets
Key parts of the custom visual sensor implementation:

public ObservationSpec GetObservationSpec() =>
    ObservationSpec.Visual(channels, height, width, ObservationType.Default);

public int Write(ObservationWriter writer)
{
    for (int y = 0; y < height; ++y)
    {
        for (int x = 0; x < width; ++x)
        {
            ref var pixel = ref data[y * width + x];
            // One channel per entity type, written as writer[channel, y, x]
            writer[0, y, x] = pixel.tower;
            writer[1, y, x] = pixel.monster;
            writer[2, y, x] = pixel.coin;
        }
    }
    return width * height * channels;
}
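If the exporter and the Unity inference backend disagree about tensor layout (channel-first CHW vs channel-last HWC), every observation would be silently scrambled rather than raising an error, which would match the symptom of a model that loads fine but acts noisily. A minimal NumPy sketch of how the same buffer diverges under the two interpretations (the 50x50x3 shape is taken from this report; the layout mismatch itself is a hypothesis, not a confirmed cause):

```python
import numpy as np

# Hypothetical check: one 50x50x3 observation buffer read as
# channel-first (CHW) vs channel-last (HWC). Element values agree,
# but the flat memory order an inference engine consumes does not.
h, w, c = 50, 50, 3
chw = np.arange(h * w * c, dtype=np.float32).reshape(c, h, w)  # [ch, y, x] order
hwc = chw.transpose(1, 2, 0)                                   # [y, x, ch] order

# Element-wise the two views agree...
assert chw[2, 5, 7] == hwc[5, 7, 2]
# ...but flattened, as a runtime would read the raw buffer, they differ.
assert not np.array_equal(chw.ravel(), hwc.ravel())
```

If Unity's inference backend expects one layout while the sensor's Write() fills the other, the network sees permuted pixels, which degrades actions without producing any console error.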
Training configuration:
behaviors:
  UniversalBot:
    trainer_type: ppo
    hyperparameters:
      batch_size: 2048
      buffer_size: 40960
      learning_rate: 1.0e-3
      learning_rate_schedule: linear
      beta: 0.01
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 2
    network_settings:
      normalize: true
      hidden_units: 128
      num_layers: 1
      vis_encode_type: simple
    reward_signals:
      extrinsic:
        strength: 1.0
        gamma: 0.97
      rnd:
        strength: 0.005
        gamma: 0.99
        encoding_size: 32
        learning_rate: 1e-4
    max_steps: 4000000
    time_horizon: 128
    summary_freq: 10000
    keep_checkpoints: 1000
    checkpoint_interval: 500000
    threaded: false
Environment (please complete the following information):
Unity Version: 6000.0.43f1
OS + version: Windows 11
ML-Agents version: Both develop branch and release 22 (no difference observed)
Torch version: 2.2.2+cu121
Environment: Custom environment with simple visual sensor (50x50x3)
Additional Information:
Tested with both vis_encode_type: simple and nature_cnn - no difference
Different resolutions tested - issue persists
Training duration doesn't affect the issue
The issue only occurs when observations are marked as Visual; Vector observations work correctly
Python inference with --inference flag works perfectly, suggesting the issue is specific to Unity's ONNX execution
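One way to narrow this down further is to feed an identical, deterministic probe observation through both runtimes and diff the raw network outputs. A sketch that builds such a probe in NumPy (the gradient pattern, model path, and input name below are illustrative assumptions, not details from this report; the exported model could be run on the probe with onnxruntime, and the same pattern written from the C# sensor in Unity):

```python
import numpy as np

def make_probe(h=50, w=50):
    """Deterministic HWC test observation: channel 0 is a vertical
    gradient, channel 1 a horizontal gradient, channel 2 their product.
    The same pattern is easy to reproduce in the C# sensor's Write()."""
    ys = np.arange(h, dtype=np.float32)[:, None] / (h - 1)  # (h, 1)
    xs = np.arange(w, dtype=np.float32)[None, :] / (w - 1)  # (1, w)
    return np.stack(
        [ys + np.zeros_like(xs), np.zeros_like(ys) + xs, ys * xs],
        axis=-1,
    )

probe = make_probe()
# Hypothetical comparison step (path and input handling are assumptions):
#   import onnxruntime as ort
#   sess = ort.InferenceSession("results/test_run/UniversalBot.onnx")
#   out = sess.run(None, {sess.get_inputs()[0].name: probe[None, ...]})
# Then log the network output from Unity for the identical probe and diff.
```

Matching Python outputs against mismatched Unity outputs on the same probe would confirm the discrepancy lies in Unity's ONNX execution path rather than in the exported model.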