Add sparsity to benchmarking #1917
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1917
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure as of commit ef4cf36 with merge base 09c2760.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Thanks for working on this! Just a couple of questions, but otherwise looks good.
```diff
@@ -44,11 +45,33 @@ def run(config: BenchmarkConfig) -> BenchmarkResult:

     # Use quantize_ to apply each quantization function to the model
     m_copy = deepcopy(base_model).eval().to(config.device)
-    quantization_config = string_to_config(
-        config.quantization, high_precision_dtype=config.high_precision_dtype
+    aoBaseConfig = string_to_config(
```
probably snake_case is better here?
```diff
@@ -1,9 +1,13 @@
 # Sample configuration for inference benchmarks
 benchmark_mode: "inference"
 quantization_config_recipe_names:
-  - "baseline"
+  # - "baseline" Will always run a baseline instance
```
should this be commented out?
We're running the baseline case by default for any benchmarking param. The reason I listed it here as a comment is that I wanted to let users know it will always run. Maybe I can simply add it to the README, and write the comment like:
# Will run a baseline inference for the model by default, without quantization, for comparison
nit: yeah, I think that's better. I would just make it clear that it's not some commented-out code.
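For illustration, the clarified comment might read like this in the sample YAML. This is only a sketch of the wording being discussed; the recipe names mirror the diff in this thread:

```yaml
# Sample configuration for inference benchmarks
benchmark_mode: "inference"
quantization_config_recipe_names:
  # NOTE: a baseline (unquantized) run is always executed by default for
  # comparison, so "baseline" does not need to be listed here.
  - "int4wo-128"
  - "marlin"
```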
```yaml
  - "int4wo-128"
  - "marlin"
sparsity_config_recipe_names:
  # - "none" Will always run an instance without sparsity
```
same here
Same as above
```python
# Mock string_to_config to return valid configs
from torchao.quantization import Int4WeightOnlyConfig
from torchao.sparsity.sparse_api import (
    BlockSparseWeightConfig,
```
I don't think we need BlockSparseWeightConfig here; it should be semi-structured sparsity, no?
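As a sketch of what the reviewer is suggesting, the mocked helper could hand back a semi-structured (2:4) config instead of a block-sparse one. The function and config classes below are local stand-ins inferred from the diff, not torchao's actual API:

```python
class Int4WeightOnlyConfig:
    """Stand-in for torchao.quantization's int4 weight-only config."""

class SemiSparseWeightConfig:
    """Stand-in for a semi-structured (2:4) sparsity config."""

def fake_string_to_config(quantization, sparsity=None, **kwargs):
    """Map recipe names to config objects, roughly as the benchmark
    helper might; the signature here is an assumption."""
    if sparsity == "semi-sparse":
        return SemiSparseWeightConfig()
    if quantization and quantization.startswith("int4wo"):
        return Int4WeightOnlyConfig()
    return None

# A test could patch this in with unittest.mock.patch(...) in place of
# the real string_to_config, then assert on the returned config type:
cfg = fake_string_to_config("int4wo-128", sparsity="semi-sparse")
print(type(cfg).__name__)  # SemiSparseWeightConfig
```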
```python
        self.assertIsInstance(result, BenchmarkResult)
        self.assertTrue(hasattr(result, "model_inference_time_in_ms"))

        # Test with block sparsity
```
Oh, I see. Can we split this into two tests then: one for int4 + 2:4 marlin, and one for block sparsity?
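A minimal sketch of the proposed split, with stand-in stubs for the runner and result type so it is self-contained; the real tests would build a BenchmarkConfig and call the PR's run helper:

```python
import unittest

class FakeResult:
    """Stand-in for BenchmarkResult; only the attribute the tests check."""
    model_inference_time_in_ms = 1.0

def run(config):
    """Stand-in for the benchmark runner under test."""
    return FakeResult()

class TestSparsityBenchmarks(unittest.TestCase):
    def test_int4_semi_sparse_marlin(self):
        # int4 weight-only quantization combined with 2:4 (marlin) sparsity
        result = run({"quantization": "int4wo-128", "sparsity": "semi-sparse"})
        self.assertTrue(hasattr(result, "model_inference_time_in_ms"))

    def test_block_sparsity(self):
        # block sparsity on its own, without quantization
        result = run({"quantization": None, "sparsity": "block"})
        self.assertTrue(hasattr(result, "model_inference_time_in_ms"))
```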
Let me try that
Done
Force-pushed from 6e00835 to d71baa3.
This reverts commit d71baa3.
```diff
@@ -139,6 +191,7 @@ def test_generate_results_csv(self):
             BenchmarkResult(
                 BenchmarkConfig(
                     quantization="int8wo",
+                    sparsity="None",
```
super nit: why the string "None" and not just None here?
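To illustrate the nit: the string "None" is a truthy value, while the None singleton is falsy, so downstream checks can behave differently. The dataclass below is a stand-in for illustration, not the PR's actual BenchmarkConfig:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class BenchmarkConfig:
    quantization: str
    sparsity: Optional[str] = None  # None (the singleton) means "no sparsity"

cfg_none = BenchmarkConfig(quantization="int8wo", sparsity=None)
cfg_str = BenchmarkConfig(quantization="int8wo", sparsity="None")

print(bool(cfg_none.sparsity))  # False: the singleton is falsy
print(bool(cfg_str.sparsity))   # True: the string "None" is truthy
```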
Add sparsity support for benchmarking. The following support has been added: