m.t() * m #445

Closed
nilgoyette opened this issue May 1, 2018 · 12 comments

@nilgoyette
Collaborator

nilgoyette commented May 1, 2018

Is there an optimization available for multiplying a matrix by its transpose? I'm trying to optimize a program whose slowest part is m.t() * m (a somewhat big matrix in the innermost loop). From what I've read, this product always gives a symmetric matrix, which would take n(n+1)/2 operations instead of n^2. The lapack function dsyrk is supposed to handle that. I don't know if it would actually help, but I'm curious to test.

Also, is there a way to know if I'm using lapack? I didn't enable any special feature for ndarray in my Cargo.toml. A perf profile showed 29.26% in _ZN14matrixmultiply4gemm13masked_kernel, so I think I'm using it because gemm is a lapack name. But is there a simpler way to check?

EDIT: Oh, sorry, I meant BLAS everywhere in my text. I wasn't aware of the difference :)
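The symmetry claim above can be sanity-checked with a small std-only sketch (helper name invented here, not ndarray code): each entry (AᵀA)[i][j] is the dot product of columns i and j of A, so (AᵀA)[i][j] == (AᵀA)[j][i] and only the n(n+1)/2 entries on and above the diagonal are distinct.

```rust
// Toy std-only check: AᵀA is always symmetric, because (AᵀA)[i][j]
// is the dot product of columns i and j of A, and dot products commute.

fn matmul_at_a(a: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let rows = a.len();
    let n = a[0].len();
    let mut c = vec![vec![0.0; n]; n];
    for i in 0..n {
        for j in 0..n {
            for r in 0..rows {
                c[i][j] += a[r][i] * a[r][j];
            }
        }
    }
    c
}

fn main() {
    // A is 2x3, so AᵀA is 3x3.
    let a = vec![vec![1.0, 2.0, 3.0], vec![4.0, 5.0, 6.0]];
    let c = matmul_at_a(&a);
    for i in 0..3 {
        for j in 0..3 {
            assert_eq!(c[i][j], c[j][i]); // symmetric
        }
    }
    // Only n(n+1)/2 = 6 of the 9 entries are distinct.
    println!("{:?}", c);
}
```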

@vbarrielle
Contributor

That symbol is probably from the pure Rust implementation of matrix multiplication (the matrixmultiply crate). ndarray can be configured to use a custom BLAS through the blas feature, but I don't know if there's any LAPACK integration available...

@rcarson3
Contributor

rcarson3 commented May 3, 2018

Going off @vbarrielle's earlier comment: if you want to use an underlying BLAS, you'd want to follow what's listed in the README, which I've pasted below:

How to enable blas integration. Depend on blas-src directly to pick a blas
provider. Depend on the same blas-src version as ndarray does, for the
selection to work. A proposed configuration using system openblas is shown
below. Note that only end-user projects (not libraries) should select
provider:

[dependencies]
ndarray = { version = "0.11.0", features = ["blas"] }
blas-src = { version = "0.1.2", default-features = false, features = ["openblas"] }
openblas-src = { version = "0.5.6", default-features = false, features = ["cblas", "system"] }

So, if you want to try out the dsyrk method, I'd imagine you could do something similar to what's done in ndarray for gemm, where it links to the underlying dgemm and sgemm BLAS calls. I mention this because ndarray does not appear to offer an optimized routine for this specific case; it only links to the gemm functions.
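For reference, here is a std-only sketch of what the dsyrk routine computes in the trans='T' case: C := alpha · AᵀA + beta · C, touching only one triangle of C since the result is known to be symmetric. This is an illustration of the semantics, not ndarray's or BLAS's actual code.

```rust
// Std-only illustration of dsyrk semantics (trans='T', uplo='U'):
// C := alpha * AᵀA + beta * C, writing only the upper triangle of C.

fn syrk_upper(alpha: f64, a: &[Vec<f64>], beta: f64, c: &mut [Vec<f64>]) {
    let k = a.len();    // rows of A
    let n = a[0].len(); // columns of A; C is n x n
    for i in 0..n {
        for j in i..n { // upper triangle only: j >= i
            let mut dot = 0.0;
            for r in 0..k {
                dot += a[r][i] * a[r][j];
            }
            c[i][j] = alpha * dot + beta * c[i][j];
        }
    }
}

fn main() {
    let a = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    let mut c = vec![vec![0.0; 2]; 2];
    syrk_upper(1.0, &a, 0.0, &mut c);
    // Upper triangle of AᵀA; the caller mirrors it if a full matrix is needed.
    println!("{:?}", c);
}
```

A real binding would of course call the Fortran/CBLAS routine instead of this loop; the point is that only n(n+1)/2 entries are computed.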

@nilgoyette
Collaborator Author

@vbarrielle and @rcarson3, thank you for your answers. I'll test what you suggest and contribute if the results are interesting. Maybe having a dot_transpose method would be a plus for ndarray?

@jturner314
Member

I think it would be useful to incorporate the *syrk functions into ndarray for 2-D arrays. I'd probably call the method dot_with_t.

On second thought, maybe it would make more sense to create a separate crate for providing ndarray‑based wrappers for the less common BLAS functions (similar to how ndarray-linalg and linxal provide wrappers for LAPACK functions)?

@rcarson3
Contributor

rcarson3 commented May 3, 2018

I think the separate-crate idea makes a lot of sense. I'm sure there are many BLAS calls that aren't in ndarray that people would like to use, so nice wrappers around them would be valuable. It would also allow for better, more descriptive names for a variety of BLAS calls.

Edit: I'd also like to mention that I'd be up for helping work on this new crate. However, my current work load will probably limit my ability to contribute to it as much as I'd like for the next few months.

@nilgoyette
Collaborator Author

I tested and the BLAS results are interesting! I made 4 simple benchmarks with 100x50 identity matrices. Results are in ns. As you can see, the speedup is around 2.8x .

                f32   f64
standard dot    24k   41k
transpose_dot    9k   14k

I also tested a naive Rust implementation, but it's a disaster! I wasn't aware that dot calls the matrixmultiply crate. I learned that this crate is much more optimized than anything I could write in a day :)

I'm not finished with this task because I don't know if you want a new crate, and I'm unsure how to manage the A'A vs. AA' option (an enum, or two different methods?). I can push what I did even if we don't have a consensus yet, if there's any interest.
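To make the enum-vs-methods question concrete, here is a hypothetical API sketch (all names invented here, not ndarray's) with a single entry point that dispatches on which product is wanted, backed by a naive std-only multiply:

```rust
// Hypothetical API sketch for the "AᵀA or AAᵀ" question: one entry
// point taking an enum. Names are invented for illustration only.

#[derive(Clone, Copy)]
enum GramSide {
    /// Compute Aᵀ * A (n x n for an m x n input).
    TransposeTimesSelf,
    /// Compute A * Aᵀ (m x m for an m x n input).
    SelfTimesTranspose,
}

fn gram(a: &[Vec<f64>], side: GramSide) -> Vec<Vec<f64>> {
    let (m, n) = (a.len(), a[0].len());
    match side {
        GramSide::TransposeTimesSelf => {
            let mut c = vec![vec![0.0; n]; n];
            for i in 0..n {
                for j in 0..n {
                    for r in 0..m {
                        c[i][j] += a[r][i] * a[r][j];
                    }
                }
            }
            c
        }
        GramSide::SelfTimesTranspose => {
            let mut c = vec![vec![0.0; m]; m];
            for i in 0..m {
                for j in 0..m {
                    for k in 0..n {
                        c[i][j] += a[i][k] * a[j][k];
                    }
                }
            }
            c
        }
    }
}

fn main() {
    let a = vec![vec![1.0, 2.0, 3.0], vec![4.0, 5.0, 6.0]]; // 2x3
    assert_eq!(gram(&a, GramSide::TransposeTimesSelf).len(), 3);  // 3x3
    assert_eq!(gram(&a, GramSide::SelfTimesTranspose).len(), 2); // 2x2
}
```

Two separate methods would work just as well; the enum merely keeps the shared dispatch to syrk in one place.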

@termoshtt
Member

It looks great :)
ndarray-linalg could include such BLAS-based functions, since it already depends on BLAS implicitly.

FYI, A'A can be seen as the square of the "absolute value" of a matrix, |A|^2:
https://math.stackexchange.com/questions/2114687/absolute-value-of-a-matrix

@nilgoyette
Collaborator Author

@termoshtt Yes, maybe this stuff should be in your crate. Or not. It's not for me to decide! The owner(s) of this crate should state their opinion.

Also, I know this is not the right place for a math course, but could you please explain a bit more? As a programmer, "absolute value of a matrix" sounds like calling abs() on every element, and a toy test quickly shows that A'A != |A|^2 under that reading, except maybe for zero or identity matrices.
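The two readings can be told apart with a small std-only example. In the linked sense, |A| is the matrix square root (AᵀA)^(1/2) (a matrix function), so |A|^2 = AᵀA holds by definition; under the elementwise-abs reading it generally fails, e.g. for A = [[1,1],[0,1]], where abs(A) == A elementwise:

```rust
// Toy std-only check: with A = [[1,1],[0,1]], elementwise abs(A) == A,
// so the elementwise reading of |A|^2 is A*A = [[1,2],[0,1]], which
// differs from AᵀA = [[1,1],[1,2]]. In the linked thread's sense,
// |A| = (AᵀA)^(1/2) is a matrix square root, making |A|^2 = AᵀA by definition.

fn mul(x: &[[f64; 2]; 2], y: &[[f64; 2]; 2]) -> [[f64; 2]; 2] {
    let mut c = [[0.0; 2]; 2];
    for i in 0..2 {
        for j in 0..2 {
            for k in 0..2 {
                c[i][j] += x[i][k] * y[k][j];
            }
        }
    }
    c
}

fn main() {
    let a = [[1.0, 1.0], [0.0, 1.0]];
    let at = [[1.0, 0.0], [1.0, 1.0]]; // transpose of a
    let at_a = mul(&at, &a);  // AᵀA = [[1,1],[1,2]]
    let abs_sq = mul(&a, &a); // elementwise |A| == A here, then squared
    assert_ne!(at_a, abs_sq); // the two readings disagree
    println!("AᵀA = {:?}, abs(A)^2 = {:?}", at_a, abs_sq);
}
```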

@jturner314
Member

I'd prefer to place less-commonly-used BLAS wrappers such as this function in a crate outside of ndarray. If enough BLAS wrappers get implemented in the future, it would make sense to split off a separate ndarray-blas crate, but until then, ndarray-linalg sounds like a good place to me.

@nilgoyette
Collaborator Author

Closing this issue. As explained, such a feature would be in another crate of the ndarray ecosystem.

@benkay86
Contributor

benkay86 commented Nov 9, 2023

@nilgoyette, I'm running into the same issue: I'm doing a lot of m.t() * m in a tight loop and would like to take advantage of dsyrk to make it faster. Would you mind sharing the code you used in your benchmark? Have you thought about submitting a PR to ndarray-linalg?

@nilgoyette
Collaborator Author

Sorry @benkay86, I changed computers twice at work in the last 6 years and this particular folder has been lost.

IIRC, it wasn't terribly complex to code. I think I simply copied the "dot" code that calls gemm, replacing it with a call to syrk.
