Bump version for float8 dynamic quant and weight only quant configs
Summary:
This PR changes the default VERSION for Float8DynamicActivationFloat8WeightConfig and Float8WeightOnlyConfig from 1 to 2,
and deprecates the VERSION 1 config and VERSION 1 quantized models; more details in #2649.
It also extends the current config serialization to work with multiple config versions.
Test Plan:
Tested by serializing a model with a VERSION 1 config, loading it back, and checking that the deprecation warnings are properly printed:
```
python test/integration/test_loading_deprecated_checkpoint.py
```
Reviewers:
Subscribers:
Tasks:
Tags:
f"Stored version is not the same as current default version of the config: {stored_version=}, {current_version=}, please check the deprecation warning"
239
+
)
248
240
249
241
# Handle the case where obj_data is not a dictionary
`torchao/dtypes/floatx/float8_layout.py` (4 additions, 0 deletions):
```diff
@@ -3,6 +3,7 @@
 #
 # This source code is licensed under the BSD 3-Clause license found in the
 # LICENSE file in the root directory of this source tree.
+import warnings
 from dataclasses import dataclass
 from typing import Any, Dict, List, Optional, Tuple, Union

@@ -109,6 +110,9 @@ def __init__(
         transposed: bool,
         _layout: Layout,
     ):
+        warnings.warn(
+            "Models quantized with VERSION 1 of Float8DynamicActivationFloat8WeightConfig is deprecated and will no longer be supported in March 2026 (9 months), please upgrade torchao and quantize again, or download a newer torchao checkpoint, see https://github.com/pytorch/ao/issues/2649 for more details"
+        )
```
`torchao/quantization/quant_api.py` (11 additions, 4 deletions):
```diff
@@ -1489,15 +1489,15 @@ class Float8WeightOnlyConfig(AOBaseConfig):
     Args:
         weight_dtype (torch.dtype): The target data type for weight quantization. Default is torch.float8_e4m3fn.
         set_inductor_config (bool): if True, adjusts `torchinductor` settings to recommended values.
-        VERSION (int): the version of the config, version 1 is using AffineQuantizedTensor that we plan to deprecate/split, version 2 is using Float8Tensor
+        VERSION (int): the version of the config, version 1 is using AffineQuantizedTensor that we plan to deprecate/split, version 2 is using Float8Tensor (default)

     Note:
         The actual matmul will be computed in original precision of the weight tensor.
     """

     weight_dtype: torch.dtype = e4m3_dtype
     set_inductor_config: bool = True
-    VERSION: int = 1
+    VERSION: int = 2

     # for BC
@@ -1506,6 +1506,9 @@ class Float8WeightOnlyConfig(AOBaseConfig):
+            "VERSION 1 of Float8WeightOnlyConfig is deprecated and will no longer be supported in March 2026 (9 months), please use VERSION 2, see https://github.com/pytorch/ao/issues/2649 for more details"
```
```diff
@@ -1629,7 +1632,7 @@ class Float8DynamicActivationFloat8WeightConfig(AOBaseConfig):
         activation_value_ub (Optional[float]): the upper bound for activation value for calculating scale
         kernel_preference (KernelPreference): kernel preference for ops like matmul, grouped matmul etc., by default (KernelPreference.AUTO) it will be chosen for the user based on hardware or other information, this only needs to be set in weight
         set_inductor_config (bool): if True, adjusts `torchinductor` settings to recommended values.
-        VERSION (int): the version of the config, version 1 is using AffineQuantizedTensor that we plan to deprecate/split, version 2 is using Float8Tensor
+        VERSION (int): the version of the config, version 1 is using AffineQuantizedTensor that we plan to deprecate/split, version 2 is using Float8Tensor (default)
     """

@@ -1641,7 +1644,7 @@ class Float8DynamicActivationFloat8WeightConfig(AOBaseConfig):
+            "VERSION 1 of Float8DynamicActivationFloat8WeightConfig is deprecated and will no longer be supported in March 2026 (9 months), please use VERSION 2, see https://github.com/pytorch/ao/issues/2649 for more details"
```