Same implementation for Sparse Multiplication for aligned and unaligned arrays #1274

Merged

Anipik merged 5 commits into dotnet:master from Anipik:sparse2

Oct 24, 2018

Contributor

Anipik commented Oct 16, 2018

Working towards #1018

Anipik requested review from eerhardt and tannergooding

October 16, 2018 19:40

Contributor Author

Anipik commented Oct 17, 2018

@tannergooding @eerhardt can you please take a look at this?

eerhardt reviewed

test/Microsoft.ML.CpuMath.PerformanceTests/SsePerformanceTests.cs Outdated

eerhardt reviewed

src/Microsoft.ML.CpuMath/CpuMathUtils.netcoreapp.cs

eerhardt reviewed

src/Microsoft.ML.CpuMath/Thunk.cs Outdated

eerhardt reviewed

src/Microsoft.ML.CpuMath/SseIntrinsics.cs Outdated

eerhardt reviewed

src/Microsoft.ML.CpuMath/SseIntrinsics.cs

eerhardt reviewed

src/Microsoft.ML.CpuMath/SseIntrinsics.cs Outdated

eerhardt reviewed

src/Microsoft.ML.CpuMath/SseIntrinsics.cs

eerhardt reviewed

src/Microsoft.ML.CpuMath/SseIntrinsics.cs Outdated

tannergooding reviewed

src/Microsoft.ML.CpuMath/SseIntrinsics.cs Outdated

+                                          while (ppos < pposEnd)
+                                          {
+                                              int col = *ppos;
+                                              Vector128<float> x1 = Sse.SetVector128(pm3[col], pm2[col], pm1[col], pm0[col]);

Member

tannergooding Oct 17, 2018

note to self: I want to check the codegen of this and ensure that it is being emitted "optimally" (two loads and three unpack with two folded loads; rather than as four loads and three unpack).
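The loop under review gathers one column entry from each of four consecutive rows into a single vector, then multiplies by the matching source element. Below is a minimal, self-contained sketch of that pattern, not the PR's actual code. It assumes a dense row-major `float[]` matrix and a dense source vector whose nonzero positions are listed in `positions`; the class name `SparseMatMulSketch` and the method `Multiply` are hypothetical. It also uses `Vector128.Create`, which takes elements lowest-first, in place of the preview-era `Sse.SetVector128` shown in the diff, which takes them highest-first.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

// Hypothetical sketch: dst[r] = sum over listed positions p of mat[r, p] * src[p].
public static class SparseMatMulSketch
{
    public static void Multiply(float[] mat, int[] positions, float[] src, float[] dst, int ccol)
    {
        int crow = dst.Length;
        int row = 0;

        if (Sse.IsSupported)
        {
            // Process four rows at a time; one vector lane per row.
            for (; row + 4 <= crow; row += 4)
            {
                int r0 = row * ccol;
                int r1 = r0 + ccol;
                int r2 = r1 + ccol;
                int r3 = r2 + ccol;
                Vector128<float> sum = Vector128<float>.Zero;

                foreach (int col in positions)
                {
                    // Gather column `col` from the four rows (lowest lane first).
                    Vector128<float> x1 = Vector128.Create(
                        mat[r0 + col], mat[r1 + col], mat[r2 + col], mat[r3 + col]);
                    // Broadcast the source element to all four lanes.
                    Vector128<float> x2 = Vector128.Create(src[col]);
                    sum = Sse.Add(sum, Sse.Multiply(x1, x2));
                }

                for (int k = 0; k < 4; k++)
                {
                    dst[row + k] = sum.GetElement(k);
                }
            }
        }

        // Scalar path: remaining rows, or all rows when SSE is unavailable.
        for (; row < crow; row++)
        {
            int r = row * ccol;
            float acc = 0f;
            foreach (int col in positions)
            {
                acc += mat[r + col] * src[col];
            }
            dst[row] = acc;
        }
    }
}
```

Because the same loop reads through plain indexing, it makes no alignment assumption about the buffers, which is the point of having one implementation for aligned and unaligned arrays. Whether the JIT emits the gather as two loads plus folded unpacks, as hoped for in the note above, would still need to be checked in the generated assembly.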

eerhardt reviewed

src/Native/CpuMathNative/CMakeLists.txt Outdated

eerhardt reviewed

test/Microsoft.ML.CpuMath.UnitTests.netcoreapp/UnitTests.cs Outdated

Member

eerhardt commented Oct 18, 2018

#endif

It looks like this whole block can be removed as well. It is no longer used.

Refers to: src/Native/CpuMathNative/Sse.cpp:79 in db669af.

eerhardt previously approved these changes

Member

eerhardt left a comment

eerhardt dismissed their stale review

October 18, 2018 17:24

Need to fix the CMake file

eerhardt approved these changes

Member

eerhardt left a comment

Contributor Author

Anipik commented Oct 18, 2018

@tannergooding did you get a chance to look at the codegen for this?

Member

tannergooding commented Oct 18, 2018

No, not yet.

tannergooding reviewed

src/Microsoft.ML.CpuMath/Avx.cs

tannergooding reviewed

src/Microsoft.ML.CpuMath/AvxIntrinsics.cs Outdated

tannergooding reviewed

src/Microsoft.ML.CpuMath/AvxIntrinsics.cs

+                                      {
+                                          int col1 = *ppos;
+                                          int col2 = col1 + 4 * ccol;
+                                          Vector256<float> x1 = Avx.SetVector256(pm3[col2], pm2[col2], pm1[col2], pm0[col2],

Member

tannergooding Oct 22, 2018

Don't we have a helper method for this?

Contributor Author

Anipik Oct 22, 2018

No, it's different. In the helper we have, the indices are contiguous:
return Avx.SetVector256(src[idx[7]], src[idx[6]], src[idx[5]], src[idx[4]], src[idx[3]], src[idx[2]], src[idx[1]], src[idx[0]]);
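To make the distinction concrete, here is a small sketch contrasting the two gather patterns. It is an illustration under stated assumptions, not code from the PR: the names `GatherSketch`, `GatherByIndices`, and `GatherColumn` are hypothetical, the matrix is assumed to be dense and row-major, and `Vector256.Create` (lowest element first) stands in for the preview-era `Avx.SetVector256` (highest element first). The second method assumes the truncated diff line fills its remaining lanes from the same column of the other four rows.

```csharp
using System;
using System.Runtime.Intrinsics;

public static class GatherSketch
{
    // Pattern of the existing helper: eight consecutive entries of an index
    // array select eight elements of a single source array.
    public static Vector256<float> GatherByIndices(float[] src, int[] idx, int start)
    {
        return Vector256.Create(
            src[idx[start]], src[idx[start + 1]], src[idx[start + 2]], src[idx[start + 3]],
            src[idx[start + 4]], src[idx[start + 5]], src[idx[start + 6]], src[idx[start + 7]]);
    }

    // Pattern in the sparse multiply: a single column index selects the same
    // column from eight consecutive rows, so the elements are ccol apart.
    public static Vector256<float> GatherColumn(float[] mat, int firstRow, int col, int ccol)
    {
        int b = firstRow * ccol + col;
        return Vector256.Create(
            mat[b], mat[b + ccol], mat[b + 2 * ccol], mat[b + 3 * ccol],
            mat[b + 4 * ccol], mat[b + 5 * ccol], mat[b + 6 * ccol], mat[b + 7 * ccol]);
    }
}
```

The index-array helper could express the column gather only if an eight-entry index array were built for every column, which is extra work inside the inner loop; that is one plausible reason for keeping the two code paths separate.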

Anipik added 5 commits

October 22, 2018 13:04

sparse vector corrected

6d6897f

Removind Dead Code, correcting names, adding assert checks to correct place, span overloads and function for common code

6fff9a6

fixing build on unix

e4478ad

cmake file corrected, if def removed from sse.cpp and unitest name modified

924ac0e

Performance test corrected, resolved merge conflicts, fma supported added

96ca8a6

Contributor Author

Anipik commented Oct 22, 2018

Before

Method	Avx	Sse	Native
MatMulPX (S = 10%)	95.28 us	114.00 us	91.8 us
MatMulPX (S = 20%)	146.2 us	181.2 us	131.3 us

After

Method	Avx	Sse	Native
MatMulPX (S = 10%)	96.46 us	106.3 us	92.8 us
MatMulPX (S = 20%)	140 us	171.7 us	131.1 us

As the matrix becomes denser, the new algorithm becomes faster.

cc @danmosemsft @eerhardt @tannergooding

Contributor Author

Anipik commented Oct 23, 2018

@tannergooding I have resolved the conflicts and addressed the feedback.

Anipik mentioned this pull request

Remove AlignedArray and Aligned Matrix from src and tests #1028

Closed

Contributor Author

Anipik commented Oct 23, 2018

I have restarted the queue, and the new build passed successfully.

tannergooding reviewed

src/Microsoft.ML.CpuMath/AvxIntrinsics.cs

tannergooding approved these changes

Member

tannergooding left a comment

Overall LGTM.

Anipik merged commit 263a67b into dotnet:master

Anipik deleted the sparse2 branch

October 24, 2018 21:14

artidoro mentioned this pull request

substitute AlignedArray with a regular array #1018

Closed

Anipik mentioned this pull request

Reverting dead unallignedCode paths #1845

Merged

ghost locked as resolved and limited conversation to collaborators
