Commit de3e94f

yiliu30 and thuang6 authored
Add doc for autotune (#1825)
Signed-off-by: yiliu30 <[email protected]> Co-authored-by: Huang, Tai <[email protected]>
1 parent 5b99bd3 commit de3e94f

File tree

1 file changed: +90 -0 lines changed


docs/3x/autotune.md

Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
AutoTune
========================================

1. [Overview](#overview)
2. [How it Works](#how-it-works)
3. [Working with Autotune](#working-with-autotune) \
    3.1 [Working with PyTorch Model](#working-with-pytorch-model) \
    3.2 [Working with TensorFlow Model](#working-with-tensorflow-model)

## Overview

Intel® Neural Compressor aims to help users quickly deploy low-precision models by leveraging popular compression techniques, such as post-training quantization and weight-only quantization algorithms. Although a variety of these algorithms is available, finding the appropriate configuration for a model can be difficult and time-consuming. To address this, we built the `autotune` module on top of the 2.x accuracy-aware tuning [strategy](./tuning_strategies.md); it identifies the algorithm configuration that achieves optimal performance while meeting the given accuracy criteria. The module allows users to easily apply predefined tuning recipes and to customize the tuning space as needed.

## How it Works

The autotune module constructs the tuning space from a predefined tuning set or from the user's own tuning set. It iterates over the tuning space, applies each configuration to the given float model, and records and compares the evaluation result against the baseline. The tuning process stops once the exit policy is met.
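
Conceptually, this flow is a simple trial loop. The sketch below is illustrative only and is not the Intel® Neural Compressor implementation: `apply_config` is a hypothetical callable standing in for the framework-specific quantization step, and the relative-loss check is just one way to express the accuracy goal described later in this document.

```python
# Conceptual sketch of the accuracy-aware tuning loop (illustrative only, not
# the Intel® Neural Compressor implementation). `apply_config` stands in for
# the framework-specific quantization step and is supplied by the caller.
def autotune_sketch(float_model, tuning_space, apply_config, eval_fn,
                    tolerable_loss=0.01, max_trials=100):
    baseline = eval_fn(float_model)                 # record the baseline result
    best_model, best_score = None, float("-inf")
    for trial, config in enumerate(tuning_space, start=1):
        candidate = apply_config(float_model, config)   # try one configuration
        score = eval_fn(candidate)                      # evaluate the candidate
        if score > best_score:
            best_model, best_score = candidate, score
        # Exit policy: stop when the accuracy goal is met (shown here as a
        # relative loss versus the baseline) or the trial budget is exhausted.
        if (baseline - score) / baseline <= tolerable_loss or trial >= max_trials:
            break
    return best_model
```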
## Working with Autotune

The `autotune` API is used across all frameworks supported by Intel® Neural Compressor. It accepts three primary arguments: `model`, `tune_config`, and `eval_fn`.

The `TuningConfig` class defines the tuning process, including the tuning space, the tuning order, and the exit policy.

- Define the tuning space

Users can define the tuning space by setting `config_set` to a single algorithm configuration or to a set of configurations.

```python
# Configs shown for the PyTorch backend.
from neural_compressor.torch.quantization import GPTQConfig, RTNConfig, get_woq_tuning_config

# Use the default tuning space
config_set = get_woq_tuning_config()

# Customize the tuning space with one algorithm's configurations
config_set = RTNConfig(use_sym=False, group_size=[32, 64])

# Customize the tuning space with two algorithms' configurations
config_set = [RTNConfig(use_sym=False, group_size=32), GPTQConfig(group_size=128, use_sym=False)]
```

- Define the tuning order

The tuning order determines how the process traverses the tuning space and samples configurations. Users can customize it by configuring the `sampler` (see the sketch after this list). Currently, we provide the `default_sampler`, which samples configurations sequentially in a fixed order.

- Define the exit policy

The exit policy has two components: the accuracy goal (`tolerable_loss`) and the maximum number of trials (`max_trials`). The tuning process stops as soon as either condition is met.
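
As a minimal sketch of how these three pieces come together, the snippet below builds a `TuningConfig` that sets the tuning space, the tuning order, and the exit policy in one place. It assumes that `default_sampler` is importable from `neural_compressor.common.base_tuning` and that `TuningConfig` accepts it through a `sampler` argument, as the descriptions above suggest; check both against your installed version.

```python
from neural_compressor.common.base_tuning import default_sampler  # assumed import path
from neural_compressor.torch.quantization import RTNConfig, TuningConfig

custom_tune_config = TuningConfig(
    config_set=RTNConfig(use_sym=[False, True], group_size=[32, 64]),  # tuning space
    sampler=default_sampler,  # tuning order: sequential, fixed order (assumed keyword)
    tolerable_loss=0.1,       # exit policy: accuracy goal
    max_trials=5,             # exit policy: trial budget
)
```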
### Working with PyTorch Model
The example below demonstrates how to autotune a PyTorch model over four `RTNConfig` configurations (two `use_sym` values × two `group_size` values).

```python
from neural_compressor.torch.quantization import RTNConfig, TuningConfig, autotune


def eval_fn(model) -> float:
    # Return the evaluation metric (e.g., accuracy) for the given model.
    return ...


tune_config = TuningConfig(
    config_set=RTNConfig(use_sym=[False, True], group_size=[32, 128]),
    tolerable_loss=0.2,
    max_trials=10,
)
# `model` is the user's float PyTorch model.
q_model = autotune(model, tune_config=tune_config, eval_fn=eval_fn)
```
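
`eval_fn` only needs to return a single float score for the model it receives. The sketch below shows one possible implementation; `val_dataloader` is a hypothetical user-provided dataloader and is not part of the original example.

```python
import torch


def eval_fn(model) -> float:
    # Hypothetical accuracy evaluation over a user-provided validation dataloader.
    correct, total = 0, 0
    model.eval()
    with torch.no_grad():
        for inputs, labels in val_dataloader:  # placeholder dataloader
            preds = model(inputs).argmax(dim=-1)
            correct += (preds == labels).sum().item()
            total += labels.numel()
    return correct / total
```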

### Working with TensorFlow Model

The example below demonstrates how to autotune a TensorFlow model over two `StaticQuantConfig` configurations.

```python
from neural_compressor.common.base_tuning import TuningConfig  # assumed import path for TuningConfig
from neural_compressor.tensorflow.quantization import StaticQuantConfig, autotune

calib_dataloader = MyDataloader(...)  # user-defined calibration dataloader
custom_tune_config = TuningConfig(
    config_set=[
        StaticQuantConfig(weight_sym=True, act_sym=True),
        StaticQuantConfig(weight_sym=False, act_sym=False),
    ]
)


def eval_fn(model) -> float:
    # Return the evaluation metric (e.g., accuracy) for the given model.
    return ...


best_model = autotune(
    model="baseline_model", tune_config=custom_tune_config, eval_fn=eval_fn, calib_dataloader=calib_dataloader
)
```
