@zasdfgbnm (Collaborator) commented Sep 30, 2022

Current perf on RTX 3090:

$ CUDA_VISIBLE_DEVICES=1 ./build/bin/nvfuser_bench --benchmark_filter='.*Matmul.*Legacy/2048/3456/4096.*'
---------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                                       Time             CPU   Iterations
---------------------------------------------------------------------------------------------------------------------------------
Nvfuser_Matmul_4warp3stage/no_quant_nvfuser_4warp_TT_Legacy/2048/3456/4096/manual_time        849 us         1032 us          739
Nvfuser_Matmul_4warp3stage/no_quant_nvfuser_4warp_TN_Legacy/2048/3456/4096/manual_time        871 us         1052 us          738
Nvfuser_Matmul_4warp3stage/no_quant_nvfuser_4warp_NT_Legacy/2048/3456/4096/manual_time        867 us         1049 us          737
Nvfuser_Matmul_4warp4stage/no_quant_nvfuser_4warp_TT_Legacy/2048/3456/4096/manual_time        872 us         1053 us          726
Nvfuser_Matmul_4warp4stage/no_quant_nvfuser_4warp_TN_Legacy/2048/3456/4096/manual_time        863 us         1044 us          726
Nvfuser_Matmul_4warp4stage/no_quant_nvfuser_4warp_NT_Legacy/2048/3456/4096/manual_time        874 us         1055 us          728
Nvfuser_Matmul_8warp3stage/no_quant_nvfuser_8warp_TT_Legacy/2048/3456/4096/manual_time        839 us         1020 us          723
Nvfuser_Matmul_8warp3stage/no_quant_nvfuser_8warp_TN_Legacy/2048/3456/4096/manual_time        881 us         1063 us          723
Nvfuser_Matmul_8warp3stage/no_quant_nvfuser_8warp_NT_Legacy/2048/3456/4096/manual_time        842 us         1023 us          723
Nvfuser_Matmul_8warp4stage/no_quant_nvfuser_8warp_TT_Legacy/2048/3456/4096/manual_time        873 us         1054 us          723
Nvfuser_Matmul_8warp4stage/no_quant_nvfuser_8warp_TN_Legacy/2048/3456/4096/manual_time        870 us         1052 us          723
Nvfuser_Matmul_8warp4stage/no_quant_nvfuser_8warp_NT_Legacy/2048/3456/4096/manual_time        841 us         1022 us          722
EagerModeMatmul/no_quant_eagermode_TT_Legacy/2048/3456/4096/manual_time                       893 us          958 us          796
EagerModeMatmul/no_quant_eagermode_TN_Legacy/2048/3456/4096/manual_time                       916 us          985 us          738
EagerModeMatmul/no_quant_eagermode_NT_Legacy/2048/3456/4096/manual_time                       846 us          916 us          838
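
For a rough sense of throughput, the times above can be converted to TFLOP/s. This is only a sketch: it assumes that the benchmark arguments 2048/3456/4096 are the three GEMM dimensions (their order does not affect the FLOP count) and that one matmul costs 2*M*N*K floating-point operations. The helper name `matmul_tflops` and the dictionary keys are made up for illustration and are not part of nvfuser_bench.

```python
# Assumption: 2048/3456/4096 are the GEMM dimensions M, N, K in some order.
# The FLOP count 2*M*N*K is the same whichever order they are in.
DIMS = (2048, 3456, 4096)


def matmul_tflops(time_us, dims=DIMS):
    """Convert a matmul wall time in microseconds to TFLOP/s."""
    m, n, k = dims
    flops = 2 * m * n * k  # one multiply and one add per inner-product term
    seconds = time_us * 1e-6
    return flops / seconds / 1e12


# manual_time values (microseconds) copied from the benchmark table above.
timings_us = {
    "nvfuser_8warp3stage_TT": 839,
    "nvfuser_4warp3stage_TT": 849,
    "eagermode_TT": 893,
    "eagermode_NT": 846,
}

for name, t in timings_us.items():
    print(f"{name}: {matmul_tflops(t):.1f} TFLOP/s")
```

Because every row uses the same problem size, ranking by TFLOP/s is the same as ranking by time; the conversion only makes it easier to compare against a hardware peak figure.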

shmsong and others added 30 commits July 11, 2022 22:15
@zasdfgbnm changed the base branch from devel to rebase-tracking-matmul March 20, 2023 17:14
@zasdfgbnm merged commit 86b103b into rebase-tracking-matmul Mar 20, 2023
@zasdfgbnm deleted the tracking-matmul branch March 20, 2023 17:14

5 participants