encoding_rs aarch64 priority wishlist #314

Closed
@hsivonen

In the SIMD meeting yesterday, @gnzlbg encouraged me to file an issue suggesting what aarch64 intrinsics to do first in the interest of letting encoding_rs's SIMD functionality migrate to stable Rust.

Assuming that LLVM portable shuffles become available on stable together with the rest of portable 128-bit SIMD, my aarch64 wishlist has just two items:

extern "platform-intrinsic" {
    fn aarch64_vmaxvq_u8(x: u8x16) -> u8;
    fn aarch64_vmaxvq_u16(x: u16x8) -> u16;
}
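
For readers unfamiliar with these two intrinsics, here is a minimal scalar sketch of what they are expected to compute: the unsigned maximum across all lanes of the vector. Plain arrays stand in for `u8x16` and `u16x8`, and the names `max_across_u8`, `max_across_u16`, `is_ascii_block`, and `is_basic_latin_block` are hypothetical. The ASCII and Basic Latin checks are one plausible use that is assumed here, not something stated in this issue.

```rust
/// Scalar model of a horizontal unsigned maximum over 16 byte lanes,
/// i.e. the value `aarch64_vmaxvq_u8` is expected to return.
/// A `[u8; 16]` stands in for `u8x16` (assumption for illustration).
fn max_across_u8(lanes: [u8; 16]) -> u8 {
    lanes.iter().copied().max().unwrap_or(0)
}

/// Same model for eight 16-bit lanes (`aarch64_vmaxvq_u16`).
fn max_across_u16(lanes: [u16; 8]) -> u16 {
    lanes.iter().copied().max().unwrap_or(0)
}

/// Hypothetical helper: every byte is ASCII exactly when the
/// largest byte in the block is below 0x80.
fn is_ascii_block(lanes: [u8; 16]) -> bool {
    max_across_u8(lanes) < 0x80
}

/// Hypothetical helper: every UTF-16 code unit is in the ASCII range
/// exactly when the largest code unit in the block is below 0x80.
fn is_basic_latin_block(lanes: [u16; 8]) -> bool {
    max_across_u16(lanes) < 0x80
}
```

The point of the sketch is that a single across-lanes maximum answers an "are all lanes below a threshold?" question with one scalar comparison, without inspecting lanes individually.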

If we don't get portable shuffles on stable for a while, then my wishlist has two additional items:

  • The intrinsic(s) that enable expanding a single u8x16 into two u16x8s such that each u8 lane is zero-extended to a u16 lane. I believe this would be zipping the u8x16 with an all-zero vector.
  • The intrinsic(s) that take two u16x8s and, when the high half of each lane is zero, produce a u8x16 consisting of the low half of each lane. (If the high halves aren't zero, I don't care what garbage ends up in the resulting u8x16 as long as it doesn't trap. For example, on SSE2, I use x86_mm_packus_epi16, which saturates each signed 16-bit lane to an unsigned 8-bit value instead of just discarding the high halves.) I believe on aarch64 this would be some kind of unzip operation with the other output vector (the one with the high halves) ignored afterwards.
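
The two operations described in the list above can be modelled in scalar Rust as follows. This is only a sketch under stated assumptions: arrays stand in for the SIMD vector types, lanes are assumed to be laid out little-endian in the register, and the function names `widen_u8x16` and `narrow_u16x8_pair` are hypothetical. It does not name or call any actual aarch64 intrinsic.

```rust
/// Model of zipping `x` with an all-zero vector, byte by byte, and then
/// viewing the 32 resulting bytes as two vectors of eight u16 lanes.
/// Assumes little-endian lane layout, so each u8 ends up zero-extended.
fn widen_u8x16(x: [u8; 16]) -> ([u16; 8], [u16; 8]) {
    let mut zipped = [0u8; 32];
    for (i, &b) in x.iter().enumerate() {
        zipped[2 * i] = b; // even bytes come from `x`; odd bytes stay zero
    }
    let mut lo = [0u16; 8];
    let mut hi = [0u16; 8];
    for i in 0..8 {
        lo[i] = u16::from_le_bytes([zipped[2 * i], zipped[2 * i + 1]]);
        hi[i] = u16::from_le_bytes([zipped[16 + 2 * i], zipped[16 + 2 * i + 1]]);
    }
    (lo, hi)
}

/// Model of the unzip-style narrowing: keep the low byte of every u16
/// lane and ignore the high byte. Non-zero high bytes are silently
/// dropped here, which is one acceptable kind of "garbage".
fn narrow_u16x8_pair(lo: [u16; 8], hi: [u16; 8]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for i in 0..8 {
        out[i] = lo[i].to_le_bytes()[0];
        out[8 + i] = hi[i].to_le_bytes()[0];
    }
    out
}
```

Under these assumptions, narrowing the output of `widen_u8x16` returns the original sixteen bytes, which is the round trip the two wishlist items are meant to support.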
