Skip to content

[Partitioner] Partitioner refactoring - final #3409

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

beicy
Copy link
Contributor

@beicy beicy commented Aug 12, 2019

Summary:
This is the last main parts of Partitioner refactoring:

  1. Rewrote the load-balanced partition as a separate partition flow instead of an optimization pass since this one only uses the number of partitions generated by auto partition and overwrites the partition results.
  2. Added unittest for load-balanced partition.
  3. Abstract some common code as functions.

Documentation:

[Optional Fixes #issue]
#2298
Test Plan:
added unnitest, ninja test, ./tests/images/run.sh

Please see a detailed explanation of how to fill out the fields in the relevant sections in PULL_REQUEST.md.

Copy link
Contributor

@rdzhabarov rdzhabarov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, couple of comments

@@ -161,6 +164,11 @@ class Partitioner final : public PartitionerBase {
/// for function optimization. \returns the partition result or an error.
llvm::Expected<DAGListTy> heterogeneousPartition(CompilationContext &cctx);

/// Performs a load balancing optimization pass to optimize for load ///
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix comment

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add more details about "load balance", what exactly does that mean in the context of partitioner.

llvm::Expected<DAGListTy>
Partitioner::loadBalancedPartition(CompilationContext &cctx,
size_t numDevices) {
DCHECK(module_->getFunctions().size() == 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as talked offline, similarly to the rest of the cases when we return error, we should return error here as well instead of debug level checking.

Copy link
Contributor

@gcatron gcatron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM the refactor looks great!

Copy link

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@beicy has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

/// workloads of each accelerator/device in addition to respecting memory
/// constraints.
llvm::Expected<DAGListTy> loadBalancedPartition(CompilationContext &cctx,
size_t numDevices = 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you document numDevices. It seems that it gets overwritten under certain conditions inside the LBPartition.

Copy link
Contributor

@rdzhabarov rdzhabarov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

RETURN_ERR_IF_NOT(
module_->getFunctions().size() == 1,
"Invalid number of functions in a module. For load-balanced partition "
"flow, the module can only contain 1 function");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you add actual number of function in a module to the error message as well.

"applied. Call heterogeneous partition instead.";
return heterogeneousPartition(cctx);
}
F_ = selectRepFunc(module_, memSize_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: since it's guaranteed to be a single function now, should we just get function from the module directly and remove selectRepFunc call?

Copy link
Contributor Author

@beicy beicy Aug 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"since it's guaranteed to be a single function now"
-- In fact, we still allow multi-functions in a module as long as none of the function needs partition. That is, for each function, we just create DAG for each function.

"should we just get function from the module directly and remove selectRepFunc call?"
-- For heterogeneous partition, I think selectRepFunc call should be kept due to the above reason. Also, this function updates function's actual memory usage. So I'd like to keep this function.

Copy link

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@beicy is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link

@beicy merged this pull request in ae4114f.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants