[RFC] Added unpaddedSize_ to Tensor #2970
Conversation
I had a pretty evil thought: do we ever check that the … I'm not sure that's a good idea; I'm just thinking about ways to avoid cluttering the main Tensor class.
Here's a different thought that feels safer than binding wrong-sized tensors: we already create an ExecutionContext that gets passed all the way down to the execute method where we need the size; maybe we should keep a parallel map of physical sizes in that object.
If we go with the design in this PR (passing the unpadded size in the Tensor), I'd like to make it a bit safer than simply adding a getter/setter; some suggestions are inline below.
uint64_t Tensor::getUnpaddedSize() {
  if (unpaddedSize_) {
    return unpaddedSize_;
If I'm right that virtually-padded tensors are only valid if unowned, assert that the tensor is unowned here.
Hmm, maybe we add it to PlaceholderBindings. I'm not sure I like that better than metadata directly on the Tensor, but that would work too. I guess the question there is: do we ever create a padded Tensor so far upstream that it'd be a pain to plumb the padding factor down to context creation?
OK, after sleeping on it I'm on board with the add-it-to-Tensor approach :-)
include/glow/Base/Tensor.h
size_t getUnpaddedSizeInBytes();

/// Set the size of unpaddedSize_ to \p size
void setUnpaddedSize(size_t size) { unpaddedSize_ = size; }
What do you think about killing the setter in favor of a new flavor of unowned constructor? I always like to avoid exposing data members that can be arbitrarily twiddled.
I'm comfortable approving this, especially to unblock work on top of it.
Would it make sense to add this to Type instead of Tensor?
CC @opti-mix who is looking into generalizing the padding support in the type system.
I've implemented some support for providing custom strides/alignments for dimensions of tensors on my private branch (related to #2686). E.g. you can say that the stride for a given dimension is a multiple of 64. If you only have 3 elements as the actual size of the dimension, the remaining 61 elements would be just "padding" elements filled with some garbage data. I've been planning to upstream it, though not necessarily in the coming weeks. On this branch, I've added a new … If you think that the ability to specify strides on a per-dimension basis would help with your issue, we could upstream it earlier than I initially planned. But I'm under the impression that you are trying to do something different here.
To provide context here, the problem we're trying to solve is that PCIe traffic sending inputs to devices is a bottleneck when we're using SparseLengthsSum operators. The problem is that we're forced to provide a maximally-sized "indices" input, when in fact we often use fewer than the maximum number of indices. (E.g., we might compile for a sequence length of 1000, but frequently use only 100.) So we end up transferring a bunch of zeros (or garbage) over PCIe, despite knowing (at runtime) the correct amount of data to transfer. I think this feature is probably orthogonal to strided tensors; it's more like cheapo support for dynamic shapes ;-).
That could be reasonable... although I'm not sure whether it's really a type-system property. I've been thinking of it more as a storage property of the Tensor, like the type is still Int64[2000] but it's "virtually padded" as needed rather than being physically padded to the right size.
I'd suggest not merging this PR yet. I've got some ideas I'd like to discuss next week.
Initially I preferred the Tensor approach, but I'm leaning towards adding it to Type now. What does it mean to, e.g., MatMul two tensors with the same Type but different paddings?
You could actually use this for matmul to express padding in the batch dimension. There are maybe two concepts here: (1) can this dimension be padded, which could go in …
lgtm, thanks!
@gcatron has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary:
This PR adds a new field to Tensor: unpaddedSize_.
Since we have a static batch size, we copy the entire tensor to the device regardless of the number of inputs. This change would allow a DeviceManager to copy only the inputs, if it supports that.
This means adding additional metadata to Tensor, which seemed cleaner than passing side-channel data.
Documentation: NA
Test Plan: ninja test