[ready] Introduce chain_matmul #12380


Closed
wants to merge 15 commits

Conversation

@vishwakftw (Contributor) commented Oct 5, 2018

  • This was one of the few functions from NumPy's linalg module still missing in PyTorch
  • multi_mm is particularly useful for DL research, e.g. for quick analysis of
    deep linear networks
  • Added tests and docstring

@vishwakftw (Contributor, Author) commented:

One benchmark on the CPU (taken from an exercise in CLRS):

In [14]: a1 = torch.randn(30, 35)

In [15]: a2 = torch.randn(35, 15)

In [16]: a3 = torch.randn(15, 5)

In [17]: a4 = torch.randn(5, 10)

In [18]: a5 = torch.randn(10, 20)

In [19]: a6 = torch.randn(20, 25)

In [20]: %%timeit
    ...: torch.einsum('pq,qr,rs,st,tu,uv->pv', [a1, a2, a3, a4, a5, a6])
    ...: 
262 µs ± 4.29 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [21]: %%timeit
    ...: torch.multi_mm(a1, a2, a3, a4, a5, a6)
    ...: 
23.6 µs ± 225 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

@zou3519 (Contributor) commented Oct 5, 2018

I'm not sure einsum is the best thing to compare this to. Could you do a direct comparison with m1 @ m2 @ m3 @ ... @ m6?

@vishwakftw (Contributor, Author) commented:

In [13]: a1 = torch.randn(300, 350).double()

In [14]: a2 = torch.randn(350, 150).double()

In [15]: a3 = torch.randn(150, 50).double()

In [16]: a4 = torch.randn(50, 10).double()

In [17]: a5 = torch.randn(10, 200).double()

In [18]: a6 = torch.randn(200, 25).double()

In [19]: %%timeit
    ...: torch.multi_mm(a1, a2, a3, a4, a5, a6)
    ...: 
178 µs ± 6.5 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [20]: %%timeit
    ...: a1 @ a2 @ a3 @ a4 @ a5 @ a6
    ...: 
767 µs ± 4.25 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

@vishwakftw vishwakftw changed the title [WIP] Introduce multi_mm [ready] Introduce multi_mm Oct 6, 2018
@vishwakftw (Contributor, Author) commented:

@zou3519 This is ready for review.

@apaszke (Contributor) commented Oct 7, 2018

Eh, is there any chance to revert the reshuffling? It makes the diff unnecessarily large. Those changes are completely meaningless unless they are enforced by the CI, because I'm sure the order will get messed up again in a week or two.



Args:
    matrices (list of Tensors): list of 2-D tensors whose product is to be determined.

r"""Returns the matrix product of the :math:`N` 2-D tensors. This product is efficiently computed
using the matrix chain order algorithm which selects the order in which incurs the lowest cost in terms
of arithmetic operations (`[CLRS]`_). Note that :math:`N` needs to be greater than or equal to 2; if equal to 2
then a trivial matrix-matrix product is returned.
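
To see why the multiplication order matters, consider the standard CLRS example (illustrative; not part of the PR): for :math:`A` of size :math:`10 \times 100`, :math:`B` of size :math:`100 \times 5` and :math:`C` of size :math:`5 \times 50`, computing :math:`(AB)C` takes :math:`10 \cdot 100 \cdot 5 + 10 \cdot 5 \cdot 50 = 7500` scalar multiplications, whereas :math:`A(BC)` takes :math:`100 \cdot 5 \cdot 50 + 10 \cdot 100 \cdot 50 = 75000`. The chain order algorithm picks the cheaper parenthesization.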

@fmassa (Member) commented Oct 7, 2018

I think this is nice! Question: can't those cost function optimizations be used in einsum as well?

@vishwakftw (Contributor, Author) commented:

@fmassa I went through the einsum implementation, and it seems to use a lot of permutations for preprocessing the operands. With the current implementation, I am not sure if these optimizations can be leveraged.

@apaszke (Contributor) commented Oct 8, 2018

BTW, is multi_mm really what NumPy calls this? The term I've always heard is "matrix chain multiplication", so mm_chain seems like a better one.

@vishwakftw (Contributor, Author) commented:

NumPy calls it multi_dot.

@apaszke (Contributor) commented Oct 8, 2018

Well, since we're not calling it that anyway, why not drop the multi_ prefix, which doesn't fit all that well? Tbh, when I first read the title of this PR I was expecting something like a batched mm, but possibly for matrices of mismatched sizes (e.g. passed in as lists). chain_mm or mm_chain seem nice. Finally, it might be best to make it chain_matmul instead of limiting it to 2D (although I guess that might complicate the implementation a bit).

@vishwakftw (Contributor, Author) commented:

I am sorry to disappoint with the name.

Regarding the name of the function, I'll name it chain_mm (mm_chain looks like an intrinsic function sans the prefix _).

An extension to matmul should be feasible, which I'll look at soon.

@vishwakftw vishwakftw changed the title [ready] Introduce multi_mm [ready] Introduce chain_mm Oct 8, 2018
@apaszke (Contributor) commented Oct 8, 2018

Can we just make it chain_matmul, and assert that all elements are matrices? We can relax the constraint in the future.

Also, don't stress out about the name! That's what NumPy calls it, so it was a very reasonable choice too.

@vishwakftw vishwakftw changed the title [ready] Introduce chain_mm [ready] Introduce chain_matmul Oct 8, 2018

Tensor chain_matmul(TensorList matrices) {
AT_CHECK(matrices.size() >= 2, "Expecting at least 2 matrices");
checkAllSameDim(matrices, 2);

def chain_matmul(*matrices):
r"""Returns the matrix product of the :math:`N` 2-D tensors. This product is efficiently computed
using the matrix chain order algorithm which selects the order in which incurs the lowest cost in terms
of arithmetic operations (`[CLRS]`_). Note that since is a function to compute the product, :math:`N`

r"""Returns the matrix product of the :math:`N` 2-D tensors. This product is efficiently computed
using the matrix chain order algorithm which selects the order in which incurs the lowest cost in terms
of arithmetic operations (`[CLRS]`_). Note that since is a function to compute the product, :math:`N`
needs to be greater than or equal to 2; if equal to 2 then a trivial matrix-matrix product is returned.


.. _`[CLRS]`: https://mitpress.mit.edu/books/introduction-algorithms-third-edition
"""
if len(matrices) == 1 and isinstance(matrices[0], (list, tuple)):
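
For reference, a short usage sketch of the function as merged (the shapes here are arbitrary, as long as the inner dimensions of adjacent matrices match):

import torch

a = torch.randn(3, 4)
b = torch.randn(4, 5)
c = torch.randn(5, 6)

# Equivalent to a @ b @ c, but evaluated in the cheapest parenthesization
out = torch.chain_matmul(a, b, c)
print(out.shape)  # torch.Size([3, 6])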

@ssnl (Collaborator) left a comment:

Algorithm looks correct to me. :)

} else {

// Following the algorithm in Chapter 15.2 : Introduction to Algorithms, Cormen et al.
// Minor modifications have been made to accommodate zero-indexing

}

// Cost matrix
std::vector<std::vector<double>> m(n, std::vector<double>(n, 0));

@@ -586,5 +586,81 @@ Tensor &nuclear_norm_out(Tensor& result, const Tensor& self, bool keepdim) {
return at::sum_out(result, std::get<1>(at::svd(self)), 0, keepdim);
}

Tensor _chain_matmul_general(TensorList matrices, std::vector<std::vector<int64_t>>& order, int64_t i, int64_t j) {
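
In spirit, this recursion does the following (a minimal Python sketch rather than the PR's C++ code, assuming a split table s where s[i][j] records the optimal split point k for the subchain A_i ... A_j, as produced by the standard matrix chain order DP; see the sketch further below):

def chain_product(matrices, s, i, j):
    # Multiply A_i ... A_j, recursing on the optimal split point
    if i == j:
        return matrices[i]
    k = s[i][j]
    return chain_product(matrices, s, i, k) @ chain_product(matrices, s, k + 1, j)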

@ssnl (Collaborator) commented Oct 10, 2018

@vishwakftw Our CircleCI build has a problem that can be fixed by `git pull --rebase upstream master`. Could you do that and then force-push? Thanks!

@vishwakftw (Contributor, Author) commented:

@ssnl is this good to go?

@ssnl (Collaborator) left a comment:

LGTM! Thanks!

// parenthesizing matrices A_{i} to A_{j}. By this definition m[i, i] = 0 for all i
// m[i, j] is filled using the substructure property of the algorithm, meaning:
// m[i, j] = min_{i <= k < j} m[i, k] + m[k + 1, j] + p_{i-1}p_{k}p_{j}
std::vector<std::vector<int64_t>> m(n, std::vector<int64_t>(n, 0));

@vishwakftw (Contributor, Author) commented:

@pytorchbot retest this please

@ssnl (Collaborator) commented Oct 10, 2018

Sorry, pytorchbot doesn't work with CircleCI, unfortunately. Could you rebase and push again? Thanks!

- This was the only function still left out from NumPy's linalg module
- `multi_mm` is particularly useful for DL research, e.g. for quick analysis of
  deep linear networks

To do:
- Add tests
N.B.: I took the opportunity to shuffle some of the functions into alphabetical order
@zou3519 (Contributor) commented Oct 11, 2018

I think our Windows builds are broken, so you can ignore those. The ASAN failure seems to be real, though.

@vishwakftw (Contributor, Author) commented:

@zou3519 I have fixed it.

@facebook-github-bot (Contributor) left a comment:

SsnL is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@vishwakftw vishwakftw deleted the multi_dot branch October 12, 2018 11:03
zdevito pushed a commit to zdevito/ATen that referenced this pull request Oct 12, 2018
Summary:
- This was one of the few functions from NumPy's `linalg` module still missing in PyTorch
- `multi_mm` is particularly useful for DL research, e.g. for quick analysis of
  deep linear networks
- Added tests and docstring
Pull Request resolved: pytorch/pytorch#12380

Differential Revision: D10357136

Pulled By: SsnL

fbshipit-source-id: 52b44fa18d6409bdeb76cbbb164fe4e88224458e