[RFC] Added unpaddedSize_ to Tensor #2970

Closed · wants to merge 1 commit

Conversation

@gcatron commented May 24, 2019

Summary:
This PR adds a new field to Tensor: unpaddedSize_.
Since we have a static batch size, we copy the entire tensor to the device regardless of the number of valid inputs. This field would allow a DeviceManager to copy only the inputs, if it supports that.
This means adding additional metadata to Tensor, which seemed cleaner than passing side-channel data.

Documentation: NA

Test Plan: ninja test
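For illustration, a minimal sketch of how a DeviceManager might use the new field when staging inputs (copyToDevice() and devicePtr_ are hypothetical names invented for this example; getUnsafePtr() is the existing raw-data accessor, and the getter is assumed to report a byte count):

  // Sketch only: copy just the valid prefix of a padded input tensor.
  void stageInput(const glow::Tensor &input) {
    // getUnpaddedSize() falls back to the full size for unpadded tensors.
    uint64_t bytes = input.getUnpaddedSize();
    copyToDevice(devicePtr_, input.getUnsafePtr(), bytes);
  }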

@bertmaher

I had a pretty evil thought: do we ever check that the Tensor and bound Placeholder in PlaceholderBindings actually have the same type? If we relax that restriction, we could simply bind smaller tensors.

I'm not sure that's a good idea, I'm just thinking about ways to avoid cluttering the main Tensor class.

@bertmaher

Here's a different thought that feels safer than binding wrong-sized tensors... we already create an ExecutionContext that gets passed all the way down to the execute method where we need the size; maybe we should keep a parallel map of physical sizes in that object.
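For reference, a rough sketch of that alternative (physicalSizes_ and its accessors are invented names for illustration; ExecutionContext and Placeholder are the existing classes):

  #include <unordered_map>

  // Sketch only: record each input's physical (unpadded) size beside the
  // bindings instead of storing it on the Tensor.
  class ExecutionContext {
    std::unordered_map<const glow::Placeholder *, size_t> physicalSizes_;

  public:
    void setPhysicalSize(const glow::Placeholder *PH, size_t bytes) {
      physicalSizes_[PH] = bytes;
    }
    /// Returns 0 if no physical size was recorded for \p PH.
    size_t getPhysicalSize(const glow::Placeholder *PH) const {
      auto it = physicalSizes_.find(PH);
      return it == physicalSizes_.end() ? 0 : it->second;
    }
  };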

@bertmaher commented May 24, 2019

If we go with the design in this PR (passing the unpadded size in the Tensor) I'd like to make it a bit safer than simply adding a getter/setter. Some suggestions (sketched after the list):

  1. This optimization only makes sense for unowned tensors; add a constructor for this case:
     Tensor(void *data, TypeRef ty, size_t unpaddedSize);
  2. Only provide a getUnpaddedSize(), and have it return size() if the tensor is not padded. This is what you have.
  3. I guess I'm nervous about setting unpaddedSize=0 to mean "the tensor is not padded". I'd prefer to always set unpaddedSize, but I guess this is a bummer because you have to do some multiplications every time you allocate a Tensor.
  4. Maybe call the convenience method getUnpaddedSizeInBytes() to follow convention elsewhere.
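Concretely, the shape might look something like this (a rough sketch of the suggestions above, not final code; getSizeInBytes() is the existing accessor):

  class Tensor {
    size_t unpaddedSize_{0}; // 0 means "not padded" (the caveat in point 3)

  public:
    /// Unowned tensor over \p data, of which only \p unpaddedSize bytes are valid.
    Tensor(void *data, TypeRef ty, size_t unpaddedSize);

    /// Valid payload in bytes; the full size when the tensor is not padded.
    size_t getUnpaddedSizeInBytes() const {
      return unpaddedSize_ ? unpaddedSize_ : getSizeInBytes();
    }

    size_t getSizeInBytes() const; // existing
  };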


uint64_t Tensor::getUnpaddedSize() {
  if (unpaddedSize_) {
    return unpaddedSize_;
  }
  return size();
}

If I'm right that virtually-padded tensors are only valid if unowned, assert that the tensor is unowned here.

@gcatron gcatron force-pushed the add_tensor_unpadded_size branch from 249422d to c0634f6 Compare May 24, 2019 15:23
@gcatron commented May 24, 2019

> Here's a different thought that feels safer than binding wrong-sized tensors... we already create an ExecutionContext that gets passed all the way down to the execute method where we need the size; maybe we should keep a parallel map of physical sizes in that object.

Hmm, maybe we add it to PlaceholderBindings. I'm not sure I like that better than metadata directly on the Tensor, but that would work too... I guess the question there is: do we ever create a padded Tensor so far upstream that it'd be a pain to plumb the padding factor down to context creation?

@gcatron gcatron force-pushed the add_tensor_unpadded_size branch from c0634f6 to 6560ec5 Compare May 24, 2019 16:55
@bertmaher

OK, after sleeping on it I'm on board with the add-it-to-Tensor approach :-)

size_t getUnpaddedSizeInBytes();

/// Set unpaddedSize_ to \p size.
void setUnpaddedSize(size_t size) { unpaddedSize_ = size; }

What do you think about killing the setter in favor of a new flavor of unowned constructor? I always like to avoid exposing data members that can be arbitrarily twiddled.

@bertmaher left a comment:

I'm comfortable approving this, especially to unblock work on top of it.

@jfix71 commented May 24, 2019

Would it make sense to add this to Type instead of Tensor?

@nadavrot

CC @opti-mix who is looking into generalizing the padding support in the type system.

@opti-mix

> Would it make sense to add this to Type instead of Tensor?

I've implemented some support for providing custom strides/alignments for dimensions of tensors on my private branch (related to #2686). E.g., you can say that the stride for a given dimension is a multiple of 64. If you only have 3 elements as the actual size of the dimension, the remaining 61 elements would be just "padding" elements filled with some garbage data.
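To make that concrete, a small sketch of the resulting address arithmetic (layout assumed to match the example above):

  // Illustration only: logical dims {N, 3} with the inner stride rounded up
  // to 64 give a physical row of 64 elements; elements [3, 64) are padding.
  size_t elementOffset(size_t i, size_t j) {
    const size_t stride = 64; // padded stride of the inner dimension
    return i * stride + j;    // multiply by the element size to get bytes
  }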

I've been planning to upstream it, though not necessarily in the coming weeks. On this branch, I've added a new strides_ member to Type and updated related places affected by this change.

If you think that the ability to specify strides on a per-dimension basis would help with your issue, we could upstream it earlier than I initially planned.

But I'm under the impression that you are trying to do something different here.

@bertmaher

To provide context here, the problem we're trying to solve is that PCIe traffic sending inputs to devices is a bottleneck when we're using SparseLengthsSum operators. Specifically, we're forced to provide a maximally-sized "indices" input, when in fact we often use fewer than the maximum number of indices. (E.g., we might compile to a sequence length of 1000, but frequently use only 100.) So we end up transferring a bunch of zeros (or garbage) over PCIe, despite knowing (at runtime) the correct amount of data to transfer.
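To put numbers on that: assuming 8-byte (Int64) indices, the padded transfer moves 1000 × 8 = 8000 bytes while the actual payload is only 100 × 8 = 800 bytes, so copying just the unpadded prefix would cut that input's PCIe traffic by 10x.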

I think this feature is probably orthogonal to strided tensors; it's more like cheapo support for dynamic shapes ;-).

@bertmaher

> Would it make sense to add this to Type instead of Tensor?

That could be reasonable... although I'm not sure whether it's really a type-system property. I've been thinking of it more as a storage property of the Tensor, like the type is still Int64[2000] but it's "virtually padded" as needed rather than being physically padded to the right size.

@opti-mix

I'd suggest not merging this PR yet. I've got some ideas I'd like to discuss next week.

@nickgg commented May 29, 2019

Initially I preferred the Tensor approach, but I'm leaning towards adding it to Type now - what does it mean to e.g. MatMul two tensors with the same Type but different paddings?

@bertmaher

You could actually use this for matmul to express padding in the batch dimension. There are maybe two concepts here: (1) can this dimension be padded, which could go in Type, and (2) how much padding is there actually, which probably has to go on Tensor (or some other runtime structure).
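Sketching that split (names hypothetical):

  // (1) Static property; could live on Type:
  bool isPaddable(unsigned dim) const;    // may this dimension be padded at all?
  // (2) Dynamic property; lives on Tensor (or another runtime structure):
  size_t getUnpaddedSizeInBytes() const;  // how many bytes are valid right now?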

@gcatron gcatron force-pushed the add_tensor_unpadded_size branch 4 times, most recently from cc08fa6 to 7ed15f2 Compare May 30, 2019 23:59
(commit message, truncated) …y the inputs and not padding in the case of padded inputs
@gcatron gcatron force-pushed the add_tensor_unpadded_size branch from 7ed15f2 to 72bc00d Compare May 31, 2019 00:18
@bertmaher left a comment:
lgtm, thanks!

@facebook-github-bot left a comment:

@gcatron has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot

@gcatron merged this pull request in ee5e2fd.
