SwapnilGaikwad
Contributor

Contribute towards #84510

// ST3 (multiple structures)
public static unsafe void StoreVector128x3(byte*   address, (Vector128<byte>   Value1, Vector128<byte>   Value2, Vector128<byte>   Value3) value);
public static unsafe void StoreVector128x3(sbyte*  address, (Vector128<sbyte>  Value1, Vector128<sbyte>  Value2, Vector128<sbyte>  Value3) value);
public static unsafe void StoreVector128x3(short*  address, (Vector128<short>  Value1, Vector128<short>  Value2, Vector128<short>  Value3) value);
public static unsafe void StoreVector128x3(ushort* address, (Vector128<ushort> Value1, Vector128<ushort> Value2, Vector128<ushort> Value3) value);
public static unsafe void StoreVector128x3(int*    address, (Vector128<int>    Value1, Vector128<int>    Value2, Vector128<int>    Value3) value);
public static unsafe void StoreVector128x3(uint*   address, (Vector128<uint>   Value1, Vector128<uint>   Value2, Vector128<uint>   Value3) value);
public static unsafe void StoreVector128x3(long*   address, (Vector128<long>   Value1, Vector128<long>   Value2, Vector128<long>   Value3) value);
public static unsafe void StoreVector128x3(ulong*  address, (Vector128<ulong>  Value1, Vector128<ulong>  Value2, Vector128<ulong>  Value3) value);
public static unsafe void StoreVector128x3(float*  address, (Vector128<float>  Value1, Vector128<float>  Value2, Vector128<float>  Value3) value);
public static unsafe void StoreVector128x3(double* address, (Vector128<double> Value1, Vector128<double> Value2, Vector128<double> Value3) value);

public static unsafe void StoreVector64x3(byte*   address, (Vector64<byte>   Value1, Vector64<byte>   Value2, Vector64<byte>   Value3) value);
public static unsafe void StoreVector64x3(sbyte*  address, (Vector64<sbyte>  Value1, Vector64<sbyte>  Value2, Vector64<sbyte>  Value3) value);
public static unsafe void StoreVector64x3(short*  address, (Vector64<short>  Value1, Vector64<short>  Value2, Vector64<short>  Value3) value);
public static unsafe void StoreVector64x3(ushort* address, (Vector64<ushort> Value1, Vector64<ushort> Value2, Vector64<ushort> Value3) value);
public static unsafe void StoreVector64x3(int*    address, (Vector64<int>    Value1, Vector64<int>    Value2, Vector64<int>    Value3) value);
public static unsafe void StoreVector64x3(uint*   address, (Vector64<uint>   Value1, Vector64<uint>   Value2, Vector64<uint>   Value3) value);
public static unsafe void StoreVector64x3(float*  address, (Vector64<float>  Value1, Vector64<float>  Value2, Vector64<float>  Value3) value);

// ST4 (multiple structures)
public static unsafe void StoreVector128x4(byte*   address, (Vector128<byte>   Value1, Vector128<byte>   Value2, Vector128<byte>   Value3, Vector128<byte>   Value4) value);
public static unsafe void StoreVector128x4(sbyte*  address, (Vector128<sbyte>  Value1, Vector128<sbyte>  Value2, Vector128<sbyte>  Value3, Vector128<sbyte>  Value4) value);
public static unsafe void StoreVector128x4(short*  address, (Vector128<short>  Value1, Vector128<short>  Value2, Vector128<short>  Value3, Vector128<short>  Value4) value);
public static unsafe void StoreVector128x4(ushort* address, (Vector128<ushort> Value1, Vector128<ushort> Value2, Vector128<ushort> Value3, Vector128<ushort> Value4) value);
public static unsafe void StoreVector128x4(int*    address, (Vector128<int>    Value1, Vector128<int>    Value2, Vector128<int>    Value3, Vector128<int>    Value4) value);
public static unsafe void StoreVector128x4(uint*   address, (Vector128<uint>   Value1, Vector128<uint>   Value2, Vector128<uint>   Value3, Vector128<uint>   Value4) value);
public static unsafe void StoreVector128x4(long*   address, (Vector128<long>   Value1, Vector128<long>   Value2, Vector128<long>   Value3, Vector128<long>   Value4) value);
public static unsafe void StoreVector128x4(ulong*  address, (Vector128<ulong>  Value1, Vector128<ulong>  Value2, Vector128<ulong>  Value3, Vector128<ulong>  Value4) value);
public static unsafe void StoreVector128x4(float*  address, (Vector128<float>  Value1, Vector128<float>  Value2, Vector128<float>  Value3, Vector128<float>  Value4) value);
public static unsafe void StoreVector128x4(double* address, (Vector128<double> Value1, Vector128<double> Value2, Vector128<double> Value3, Vector128<double> Value4) value);

public static unsafe void StoreVector64x4(byte*   address, (Vector64<byte>   Value1, Vector64<byte>   Value2, Vector64<byte>   Value3, Vector64<byte>   Value4) value);
public static unsafe void StoreVector64x4(sbyte*  address, (Vector64<sbyte>  Value1, Vector64<sbyte>  Value2, Vector64<sbyte>  Value3, Vector64<sbyte>  Value4) value);
public static unsafe void StoreVector64x4(short*  address, (Vector64<short>  Value1, Vector64<short>  Value2, Vector64<short>  Value3, Vector64<short>  Value4) value);
public static unsafe void StoreVector64x4(ushort* address, (Vector64<ushort> Value1, Vector64<ushort> Value2, Vector64<ushort> Value3, Vector64<ushort> Value4) value);
public static unsafe void StoreVector64x4(int*    address, (Vector64<int>    Value1, Vector64<int>    Value2, Vector64<int>    Value3, Vector64<int>    Value4) value);
public static unsafe void StoreVector64x4(uint*   address, (Vector64<uint>   Value1, Vector64<uint>   Value2, Vector64<uint>   Value3, Vector64<uint>   Value4) value);
public static unsafe void StoreVector64x4(float*  address, (Vector64<float>  Value1, Vector64<float>  Value2, Vector64<float>  Value3, Vector64<float>  Value4) value);
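
For readers unfamiliar with ST3/ST4, the following is a minimal scalar sketch of the memory layout an interleaved multi-structure store is expected to produce. It does not call the intrinsics proposed above; `InterleavedStoreModel` and `StoreInterleaved` are hypothetical names used only for illustration, and plain arrays stand in for the vector registers.

```csharp
using System;

// Scalar reference model (hypothetical helper, not part of this PR) of the
// memory layout produced by an ST3/ST4-style interleaved store.
public static class InterleavedStoreModel
{
    // Writes lane i of every register side by side:
    // r0[0], r1[0], r2[0], r0[1], r1[1], r2[1], ...
    public static void StoreInterleaved<T>(Span<T> destination, params T[][] registers)
    {
        int count = registers.Length;     // 3 models ST3, 4 models ST4
        int lanes = registers[0].Length;  // e.g. 16 for a Vector128<byte>

        for (int lane = 0; lane < lanes; lane++)
        {
            for (int r = 0; r < count; r++)
            {
                destination[lane * count + r] = registers[r][lane];
            }
        }
    }

    public static void Main()
    {
        // Three "registers" holding separate colour planes (4 lanes each for brevity).
        byte[] red   = { 1, 2, 3, 4 };
        byte[] green = { 10, 20, 30, 40 };
        byte[] blue  = { 100, 110, 120, 130 };

        byte[] pixels = new byte[12];
        StoreInterleaved<byte>(pixels, red, green, blue);

        // prints "1, 10, 100, 2, 20, 110, 3, 30, 120, 4, 40, 130"
        Console.WriteLine(string.Join(", ", pixels));
    }
}
```

Passing four arrays instead of three models the ST4 case in the same way.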

@ghost ghost added area-System.Runtime.Intrinsics, new-api-needs-documentation, community-contribution labels Oct 28, 2023
@ghost

ghost commented Oct 28, 2023

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@ghost

ghost commented Oct 28, 2023

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: SwapnilGaikwad
Assignees: -
Labels:

area-System.Runtime.Intrinsics, new-api-needs-documentation, community-contribution

Milestone: -

@SwapnilGaikwad
Contributor Author

Hi @kunalspathak, would you prefer an overloaded name for the interleaved multi-structure stores, like we did for StoreSelectedScalar? If so, what should we call such methods? Store or StoreVector may not convey the interleaving nature of the underlying store instructions. 🤔

@kunalspathak
Contributor

for the interleaved multi-structure stores

Sorry, but could you confirm which APIs you are asking about?

@SwapnilGaikwad
Contributor Author

for the interleaved multi-structure stores

Sorry, but could you confirm which APIs you are asking about?

The StoreVectorNx2, StoreVectorNx3, and StoreVectorNx4 APIs, which result in ST2, ST3, and ST4 instructions respectively.

@SwapnilGaikwad
Contributor Author

SwapnilGaikwad commented Oct 30, 2023

Hi @kunalspathak,

Could you please confirm if my understanding is correct?

  • LoadVector(N)x(M) is mapped to LD(M). E.g., LoadVector128x3 to LD3

  • StoreVector(N)x(M) should map to ST(M). E.g., StoreVector128x3 to ST3

  • StoreVector(N)x(M)AndZip should map to ST1 with M registers.

If this is correct, then aren't these names a little confusing?
LD2/3/4 unzip (de-interleave) the data while loading it into the registers, whereas LD1 loads 2/3/4 vectors from consecutive memory without rearranging them.
Thus, wouldn't it make more sense for LoadVector(N)x(M) to emit LD1 with M input values and vice-versa for LoadVector(N)x(M)AndZip?
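
The difference between the two load flavours discussed above can be illustrated with a small scalar sketch. It is only a model of the expected register contents, not the actual intrinsics; `MultiRegisterLoadModel`, `LoadConsecutive`, and `LoadAndUnzip` are hypothetical names, and arrays stand in for vector registers.

```csharp
using System;
using System.Linq;

// Scalar model (hypothetical helpers, not the real intrinsics) contrasting an
// LD1-style multi-register load with an LD2/LD3/LD4-style de-interleaving load.
public static class MultiRegisterLoadModel
{
    // LD1 with `count` registers: register r receives the r-th consecutive
    // block of `lanes` elements.
    public static T[][] LoadConsecutive<T>(ReadOnlySpan<T> source, int count, int lanes)
    {
        var registers = new T[count][];
        for (int r = 0; r < count; r++)
        {
            registers[r] = source.Slice(r * lanes, lanes).ToArray();
        }
        return registers;
    }

    // LD(count): register r receives every count-th element, starting at offset r.
    public static T[][] LoadAndUnzip<T>(ReadOnlySpan<T> source, int count, int lanes)
    {
        var registers = new T[count][];
        for (int r = 0; r < count; r++)
        {
            registers[r] = new T[lanes];
            for (int lane = 0; lane < lanes; lane++)
            {
                registers[r][lane] = source[lane * count + r];
            }
        }
        return registers;
    }

    private static string Format(int[][] registers) =>
        string.Join(" ", registers.Select(r => "[" + string.Join(", ", r) + "]"));

    public static void Main()
    {
        int[] memory = { 0, 1, 2, 3, 4, 5 };

        // prints "consecutive: [0, 1] [2, 3] [4, 5]"
        Console.WriteLine("consecutive: " + Format(LoadConsecutive<int>(memory, count: 3, lanes: 2)));

        // prints "unzipped:    [0, 3] [1, 4] [2, 5]"
        Console.WriteLine("unzipped:    " + Format(LoadAndUnzip<int>(memory, count: 3, lanes: 2)));
    }
}
```

The store side mirrors this: an ST1-style store writes the registers back to back, while ST2/ST3/ST4 interleave their lanes.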

Contributor

@kunalspathak kunalspathak left a comment

Changes look good; some minor fixes are needed.

@ghost ghost added the needs-author-action label Oct 31, 2023
@kunalspathak
Contributor

Thus, wouldn't it make more sense for LoadVector(N)x(M) to emit LD1 with M input values and vice-versa for LoadVector(N)x(M)AndZip?

Looking carefully, yes, you are right. I swapped the two. I will send a PR to fix it. Thanks for spotting it.

@SwapnilGaikwad
Contributor Author

Thus, wouldn't it make more sense for LoadVector(N)x(M) to emit LD1 with M input values and vice-versa for LoadVector(N)x(M)AndZip?

Looking carefully, yes, you are right. I swapped the two. I will send a PR to fix it. Thanks for spotting it.

Cool, I'll update this PR accordingly 👍

@ghost ghost removed the needs-author-action label Nov 1, 2023
@SwapnilGaikwad
Contributor Author

After merging #93223, I'll rebase/merge this PR and then mark it ready for review.

@SwapnilGaikwad
Contributor Author

After merging #93223, I'll rebase/merge this PR and then mark it ready for review.

As #93223 is taking longer than expected, I'll mark this PR ready for review. If this one can progress more quickly, we can merge it first, and I'll rebase/merge these changes wherever necessary.

Contributor

@kunalspathak kunalspathak left a comment

LGTM. Just a request to update the documentation for the LoadVector equivalent.

Contributor

@kunalspathak kunalspathak left a comment

LGTM

@kunalspathak kunalspathak merged commit e138ff1 into dotnet:main Nov 9, 2023
@SwapnilGaikwad SwapnilGaikwad deleted the github-st3-st4 branch November 9, 2023 20:07
@github-actions github-actions bot locked and limited conversation to collaborators Dec 10, 2023