forked from pytorch/pytorch
Transpose scheduler, step 1 #1854
Merged
Changes from all commits (64 commits)
bb3940f
Move MaxProducerPosUpdater into InlinePropagator::tearDown
zasdfgbnm 644b81d
cleanup
zasdfgbnm d552315
add null scheduler and matching for "statically null"
shmsong 0ea589f
Merge branch 'devel' of github.com:csarofeen/pytorch into ip-td
zasdfgbnm 07fe5b7
Merge branch 'ip-td' into transpose-schedule
zasdfgbnm 09a9843
save
zasdfgbnm 887d9fa
draft that compiles
zasdfgbnm 225a63e
save
zasdfgbnm 9f533e9
save
zasdfgbnm 6c52fe7
More test
zasdfgbnm e5b513a
cleanup tests
zasdfgbnm f993007
save
zasdfgbnm 96bdcb5
fix
zasdfgbnm 0dc891c
Merge branch 'devel' of github.com:csarofeen/pytorch into transpose-s…
zasdfgbnm 858d6ce
more test
zasdfgbnm 7df6e2f
new
zasdfgbnm 9b0f02b
fix
zasdfgbnm 419d4be
Merge branch 'devel' of github.com:csarofeen/pytorch into transpose-s…
zasdfgbnm 0c9cdd4
fix
zasdfgbnm 7d28da1
save
zasdfgbnm 498ab05
lint
zasdfgbnm b1ed2a7
cleanup
zasdfgbnm a6e2215
writings
zasdfgbnm 3d94ecc
save
zasdfgbnm 536817a
save
zasdfgbnm eb97cbd
save
zasdfgbnm 000840c
Merge branch 'devel' of github.com:csarofeen/pytorch into transpose-s…
zasdfgbnm 275fa94
fix conflicts
zasdfgbnm 51346fe
cleanups
zasdfgbnm 48ecc4b
save
zasdfgbnm 3ce7360
// TODO: support symbolic tile size
zasdfgbnm 9bbf0cf
save
zasdfgbnm 52e4d63
fix
zasdfgbnm 595b5bf
fix
zasdfgbnm 4fc2223
inline-propagator most inlined
zasdfgbnm 2e8ce81
cleanup
zasdfgbnm ccfe5db
Merge branch 'devel' of github.com:csarofeen/pytorch into transpose-s…
zasdfgbnm 276f9d5
save
zasdfgbnm ff54dc5
save
zasdfgbnm 7d91989
add cache
zasdfgbnm 84f25f4
reject trivial reduction and view in canScheduleCompileTime
zasdfgbnm 47d516a
reorder all_heuristics
zasdfgbnm aa2752d
pushing some failing tests
zasdfgbnm d0a026b
fix reference tensor finding
zasdfgbnm 9205565
make broadcasting test work
zasdfgbnm cacd1f7
cleanup
zasdfgbnm fc65d23
Merge branch 'devel' of github.com:csarofeen/pytorch into transpose-s…
zasdfgbnm cd01c0f
revert
zasdfgbnm 9e5d394
clean
zasdfgbnm 821d027
enable view without testing
zasdfgbnm 1b425d2
merge all dims
zasdfgbnm eab982d
disable FusionScheduleTransposeBroadcast_CUDA
zasdfgbnm 9dd5aac
cleanup & simplify things
zasdfgbnm d7f0ea4
Merge branch 'devel' of github.com:csarofeen/pytorch into transpose-s…
zasdfgbnm c19bbf1
Merge branch 'transpose-schedule' of github.com:csarofeen/pytorch int…
zasdfgbnm a5861a0
skip FusionScheduleTransposeBroadcast_CUDA
zasdfgbnm 65598b2
war for transpose split support
shmsong 9bea3bf
fix
zasdfgbnm 55c8298
FusionScheduleTransposeComplexDAG1_CUDA
zasdfgbnm 4dfd569
Merge branch 'devel' of github.com:csarofeen/pytorch into transpose-s…
zasdfgbnm 6881cc6
manual test
zasdfgbnm 4539019
save
zasdfgbnm 9545036
save
zasdfgbnm 6c369cf
save
zasdfgbnm
@@ -9,6 +9,7 @@
#include <torch/csrc/jit/codegen/cuda/scheduler/debug_utils.h>
#include <torch/csrc/jit/codegen/cuda/scheduler/pointwise.h>
#include <torch/csrc/jit/codegen/cuda/scheduler/registry.h>
#include <torch/csrc/jit/codegen/cuda/scheduler/transpose.h>
#include <torch/csrc/jit/codegen/cuda/scheduler/utils.h>

#include <limits>

@@ -1244,10 +1245,75 @@ class PersistentKernelScheduler : public SchedulerEntry {
  }
};

class TransposeScheduler : public SchedulerEntry {
 public:
  explicit TransposeScheduler(
      Fusion* fusion,
      SchedulerRuntimeInfo& runtime_info,
      HeuristicSummary* data_cache = nullptr)
      : SchedulerEntry(ScheduleHeuristic::Transpose) {
    computeHeuristics(fusion, runtime_info, data_cache);
  }

  static bool canScheduleCompileTime(Fusion* fusion) {
    // Not enabling this yet. Needs more validation.
    return false;
#if 0
    if (!hasAtLeastTwoValidGroups(fusion)) {
      scheduler_debug_utils::canScheduleRejectReason(
          ScheduleHeuristic::Transpose,
          "cannot find two mismatching inner most dimensions");
      return false;
    }

    // TODO: add support for trivial reduction
    auto reduction_ops =
        ir_utils::getReductionOps(fusion, false /* ignore_trivial */);

    if (!reduction_ops.empty()) {
      scheduler_debug_utils::canScheduleRejectReason(
          ScheduleHeuristic::Transpose, "no support for reduction ops");
      return false;
    }

    if (hasNonUniqueBcast(fusion)) {
      scheduler_debug_utils::canScheduleRejectReason(
          ScheduleHeuristic::Transpose,
          "Broadcasting dimension might be broadcasting to multiple sizes.");
      return false;
    }

    return true;
#endif
  }

  static bool canScheduleRunTime(
      Fusion* fusion,
      SchedulerRuntimeInfo& runtime_info,
      HeuristicSummary* data_cache = nullptr) {
    return true;
  }

Review comment on the unconditional return true;: Might want to evaluate a bit in some size-dependent scenarios. Maybe when the inner dimensions are small. But this function is hot path so the simpler the better.

  void schedule(Fusion* fusion) override {
    FUSER_PERF_SCOPE("Schedule Transpose Fusion");
    scheduleTranspose(fusion, transposeParams());
  }

 private:
  void computeHeuristics(
      Fusion* fusion,
      SchedulerRuntimeInfo& runtime_info,
      HeuristicSummary* data_cache = nullptr) {
    params_ = getTransposeHeuristics(fusion, runtime_info, data_cache);
    TORCH_INTERNAL_ASSERT(params_ != nullptr);
  }
};
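Not part of the diff: just a rough usage sketch of how the new entry's pieces fit together when wired by hand. It assumes a Fusion and SchedulerRuntimeInfo already exist and uses only the members shown above; the registry hunks further down do this dispatch in the real flow.

// Sketch only. Note that canScheduleCompileTime still returns false in this
// PR, so the transpose path is not actually taken yet.
void tryScheduleTranspose(Fusion* fusion, SchedulerRuntimeInfo& runtime_info) {
  if (!TransposeScheduler::canScheduleCompileTime(fusion) ||
      !TransposeScheduler::canScheduleRunTime(fusion, runtime_info)) {
    return; // fall back to another heuristic
  }
  TransposeScheduler entry(fusion, runtime_info); // runs getTransposeHeuristics
  entry.schedule(fusion); // applies scheduleTranspose with the computed params
}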

// Schedule Table
const std::vector<ScheduleHeuristic>& all_heuristics() {
  static const std::vector<ScheduleHeuristic> hlist = {
      ScheduleHeuristic::Reduction,
      ScheduleHeuristic::Transpose,
      ScheduleHeuristic::PointWise,
      ScheduleHeuristic::Persistent};
  return hlist;
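The list order matters if the registry picks the first heuristic whose canSchedule check passes; placing Transpose ahead of PointWise then lets transpose-heavy fusions be claimed by the new scheduler before the pointwise one. A minimal, self-contained sketch of that first-match idea follows; the enum and function are stand-ins for the sketch, not the registry's actual API.

#include <optional>
#include <vector>

enum class ScheduleHeuristic { Reduction, Transpose, PointWise, Persistent };

// Returns the first heuristic that accepts the fusion, so earlier entries in
// the priority list win over later ones.
template <typename CanScheduleFn>
std::optional<ScheduleHeuristic> pickFirstMatch(
    const std::vector<ScheduleHeuristic>& ordered,
    CanScheduleFn can_schedule) {
  for (ScheduleHeuristic heuristic : ordered) {
    if (can_schedule(heuristic)) {
      return heuristic;
    }
  }
  return std::nullopt; // no scheduler accepted the fusion
}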
@@ -1294,6 +1360,9 @@ bool SchedulerEntry::canSchedule(
    case ScheduleHeuristic::Persistent:
      return checkCanSchedule<PersistentKernelScheduler>(
          fusion, runtime_info, data_cache);
    case ScheduleHeuristic::Transpose:
      return checkCanSchedule<TransposeScheduler>(
          fusion, runtime_info, data_cache);
    default:
      TORCH_INTERNAL_ASSERT(false, "unreachable");
      return false;

@@ -1320,6 +1389,10 @@ std::unique_ptr<SchedulerEntry> SchedulerEntry::makeEntry(
      scheduler_entry = std::make_unique<PersistentKernelScheduler>(
          fusion, runtime_info, data_cache);
      break;
    case ScheduleHeuristic::Transpose:
      scheduler_entry = std::make_unique<TransposeScheduler>(
          fusion, runtime_info, data_cache);
      break;
    default:
      TORCH_INTERNAL_ASSERT(false, "unreachable");
  }

@@ -1353,6 +1426,8 @@ std::string toString(ScheduleHeuristic sh) {
      return "reduction";
    case ScheduleHeuristic::Persistent:
      return "persistent";
    case ScheduleHeuristic::Transpose:
      return "transpose";
    default:
      TORCH_INTERNAL_ASSERT(false, "undefined schedule");
  }

@@ -1405,6 +1480,10 @@ HeuristicSummary::HeuristicSummary(
      getPersistentHeuristics(fusion, runtime_info, this);
      PersistentKernelScheduler::canScheduleRunTime(fusion, runtime_info, this);
      break;
    case ScheduleHeuristic::Transpose:
      getTransposeHeuristics(fusion, runtime_info, this);
      TransposeScheduler::canScheduleRunTime(fusion, runtime_info, this);
      break;
    default:
      TORCH_INTERNAL_ASSERT(false, "unknown heuristic");
  }

@@ -1451,6 +1530,11 @@ void HeuristicSummary::validate() const {
          entry_type_map_.count(EntryType::SCOPE_PERSISTENT_FACTOR_INFO));
      break;
    }
    case ScheduleHeuristic::Transpose: {
      TORCH_INTERNAL_ASSERT(entry_type_map_.count(
          EntryType::INPUTS_AND_OUTPUTS_INNER_DIM_GROUPS));
      break;
    }
    default:
      TORCH_INTERNAL_ASSERT(false, "unknown heuristic");
  }

@@ -1490,6 +1574,8 @@ template class HeuristicSummaryEntry<HeuristicCompileTime::DomainMap>;
template class HeuristicSummaryEntry<HeuristicCompileTime::ReferenceTensors>;
template class HeuristicSummaryEntry<
    HeuristicCompileTime::VectorizableInputsAndOutputs>;
template class HeuristicSummaryEntry<
    HeuristicCompileTime::InputsOutputsInnerDimGroups>;
template class HeuristicSummaryEntry<
    HeuristicCompileTime::UnrollableInputsAndOutputs>;
template class HeuristicSummaryEntry<HeuristicCompileTime::ReductionTVs>;