-
Notifications
You must be signed in to change notification settings - Fork 699
[Partitioner] Partitioner refactoring - final #3409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, couple of comments
@@ -161,6 +164,11 @@ class Partitioner final : public PartitionerBase { | |||
/// for function optimization. \returns the partition result or an error. | |||
llvm::Expected<DAGListTy> heterogeneousPartition(CompilationContext &cctx); | |||
|
|||
/// Performs a load balancing optimization pass to optimize for load /// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add more details about "load balance", what exactly does that mean in the context of partitioner.
lib/Partitioner/Partitioner.cpp
Outdated
llvm::Expected<DAGListTy> | ||
Partitioner::loadBalancedPartition(CompilationContext &cctx, | ||
size_t numDevices) { | ||
DCHECK(module_->getFunctions().size() == 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as talked offline, similarly to the rest of the cases when we return error, we should return error here as well instead of debug level checking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM the refactor looks great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@beicy has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
/// workloads of each accelerator/device in addition to respecting memory | ||
/// constraints. | ||
llvm::Expected<DAGListTy> loadBalancedPartition(CompilationContext &cctx, | ||
size_t numDevices = 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you document numDevices. It seems that it gets overwritten under certain conditions inside the LBPartition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
lib/Partitioner/Partitioner.cpp
Outdated
RETURN_ERR_IF_NOT( | ||
module_->getFunctions().size() == 1, | ||
"Invalid number of functions in a module. For load-balanced partition " | ||
"flow, the module can only contain 1 function"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you add actual number of function in a module to the error message as well.
"applied. Call heterogeneous partition instead."; | ||
return heterogeneousPartition(cctx); | ||
} | ||
F_ = selectRepFunc(module_, memSize_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: since it's guaranteed to be a single function now, should we just get function from the module directly and remove selectRepFunc call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"since it's guaranteed to be a single function now"
-- In fact, we still allow multi-functions in a module as long as none of the function needs partition. That is, for each function, we just create DAG for each function.
"should we just get function from the module directly and remove selectRepFunc call?"
-- For heterogeneous partition, I think selectRepFunc call should be kept due to the above reason. Also, this function updates function's actual memory usage. So I'd like to keep this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@beicy is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary:
This is the last main parts of Partitioner refactoring:
Documentation:
[Optional Fixes #issue]
#2298
Test Plan:
added unnitest, ninja test, ./tests/images/run.sh
Please see a detailed explanation of how to fill out the fields in the relevant sections in PULL_REQUEST.md.