Should tensordot broadcast the contracted dimensions? #294
CC @lezcano, do you have any thoughts on this?
In today's call, we decided to align with NumPy's behavior, as advocated by @leofang, @oleksandr-pavlyk, and @rgommers, given PyTorch's relatively recent addition of tensordot. @IvanYashchuk did mention elsewhere that PyTorch's behavior can match the einsum behavior if we express the tensordot operation as an einsum.
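The comment above appears to be truncated, but it presumably refers to expressing a tensordot contraction as an einsum. A minimal NumPy sketch of that equivalence (the shapes and axes here are illustrative, not from the original comment):

```python
import numpy as np

x1 = np.arange(12).reshape(3, 4)
x2 = np.arange(15).reshape(3, 5)

# Contracting the first dimension of each array with tensordot...
via_tensordot = np.tensordot(x1, x2, axes=[(0,), (0,)])

# ...is the same contraction written as an einsum: the shared
# index 'a' is summed over, and the free indices 'b' and 'c' remain.
via_einsum = np.einsum('ab,ac->bc', x1, x2)

print(via_tensordot.shape)                        # (4, 5)
print(np.array_equal(via_tensordot, via_einsum))  # True
```

Framing tensordot this way makes the broadcasting question concrete: it reduces to asking what einsum should do when the sizes of a summed-over index disagree.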
A summary of additional considerations was also given for why the NumPy behavior is preferred.
Should tensordot broadcast the contracted dimensions? For example, say we contract the first dimensions of two arrays with shapes (3, 3) and (1, 3). The dimensions 3 and 1 do not match, but if we broadcast the arrays together first, they both become shape (3, 3), after which they do match.
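The original code snippet seems to have been lost in extraction; here is a sketch consistent with the shapes described (the exact shapes (3, 3) and (1, 3) are an assumption):

```python
import numpy as np

x1 = np.ones((3, 3))
x2 = np.ones((1, 3))  # first dimension is 1, vs. 3 in x1

# Contracting the first dimensions: sizes 3 and 1 do not match,
# so NumPy raises a ValueError rather than broadcasting.
try:
    np.tensordot(x1, x2, axes=[(0,), (0,)])
except ValueError as exc:
    print("tensordot refused:", exc)

# Broadcasting the arrays together first makes both (3, 3),
# after which the contracted dimensions do match.
b1, b2 = np.broadcast_arrays(x1, x2)
print(b1.shape, b2.shape)  # (3, 3) (3, 3)
print(np.tensordot(b1, b2, axes=[(0,), (0,)]).shape)  # (3, 3)
```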
The spec is a little unclear about this (https://data-apis.org/array-api/latest/API_specification/linear_algebra_functions.html#tensordot-x1-x2-axes-2). It says `x2` must be compatible with `x1` by broadcasting, which seems to imply unconditional broadcasting. But it also says "Each axis (dimension) `x1_axes[i]` for `x1` must have the same size as the respective axis (dimension) `x2_axes[i]` for `x2`."
NumPy disallows broadcasting in the contracted dimensions (though it does broadcast non-contracted dimensions).
PyTorch broadcasts all dimensions, including contracted ones (note that PyTorch still calls its axes argument `dims`). Note that in either case, the resulting array shape is based on the non-broadcasted input shapes, so it's not as simple as wrapping the call with `broadcast_arrays`.
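To illustrate that last point, a hedged sketch with invented shapes: with inputs of shapes (3, 1) and (1, 4), broadcasting up front changes the free (non-contracted) dimensions, and therefore the result shape, relative to what this issue describes for PyTorch:

```python
import numpy as np

x1 = np.ones((3, 1))
x2 = np.ones((1, 4))

# Per this issue, PyTorch would broadcast the contracted first
# dimensions (3 vs. 1) but take the free dimensions from the
# *original* shapes, i.e. (1,) and (4,), giving result shape (1, 4):
#   torch.tensordot(torch.ones(3, 1), torch.ones(1, 4),
#                   dims=([0], [0]))   # shape (1, 4), not run here

# Wrapping the call with broadcast_arrays instead makes both
# inputs (3, 4), so the free dimensions become (4,) and (4,) and
# the result shape is (4, 4) -- a different shape entirely.
b1, b2 = np.broadcast_arrays(x1, x2)
print(b1.shape, b2.shape)  # (3, 4) (3, 4)
print(np.tensordot(b1, b2, axes=[(0,), (0,)]).shape)  # (4, 4)
```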