Do The SIMD Shuffle #11

Lokathor · 2020-09-28T19:33:04Z

A "shuffle", in SIMD terms, takes a SIMD vector (or possibly two vectors) and a pattern of source lane indexes (usually as an immediate), and then produces a new SIMD vector where the output is the source lane values in the pattern given.

Example (pseudo-code):

let a = simd(5, 6, 7, 8);
assert_eq!(shuffle(a, [2, 2, 1, 0]), simd(7, 7, 6, 5));

Shuffles are very important for particular SIMD tasks, but the requirement that the input be a compile time constant complicates the API:

Rust doesn't have a stable way for users to express this, so they never expect it.
Rather than just having a const-required arg (which totally works) a lot of people have said "you should use const generics for that!" (which actually doesn't work as well on current rust).

Still, min_const_generics is aimed to be stable by the end of the year and most likely that'll be enough to do shuffle basics on stable.

The text was updated successfully, but these errors were encountered:

workingjubilee · 2020-09-28T22:47:30Z

As I mentioned before, I'm OK with this particularly API being unstable a little longer if that's what it takes, rather than having two ways to do such a basic thing. I recognize the importance of the shuffle API, and also that's exactly why I would want it to be transparently recognizable even if that means it comes a little later. But also, even a delayed schedule will probably not hold this up forever after our initial stabilizations.

Lokathor · 2020-09-28T23:46:29Z

Yeah, a person can do plenty of useful work without shuffles.

It's "necessary eventually", but not "required immediately".

programmerjake · 2020-09-30T02:25:09Z

Note that some ISAs also support non-const shuffles. Some of those ISAs require the vector to be in memory (x86 -- VPGATHERDD), others also support non-const shuffles just in registers (RISC-V V extension; Libre-SOC's SimpleV; probably more).

workingjubilee · 2020-10-06T07:12:51Z

Dynamic shuffles will probably have to be part of the final API, but we probably will want to provide an additional function rather than attempting to do implicit const folding on a single shuffle function, so that our API obeys the principle of least surprise: if you use a const shuffle, you get a const shuffle. If you use a dynamic shuffle, you usually get a dynamic shuffle (and sometimes you get a treat when it is possible to const-fold it despite what you thought and your code runs faster than you expected).

aldanor · 2020-12-25T00:21:54Z

For some problems, dynamic shuffles are absolutely crucial and hard to replace, so it would be definitely nice to have them rather sooner than later. One common use case to mention - is when in the shuffle the argument is compile-time const but the index is runtime, so it acts like a lookup table of sorts (but that will always be a dynamic shuffle from simd perspective).

Having two separate shuffle apis (const and dynamic) is definitely a good idea; if your algorithm relies on the shuffle being const, why not spell it out explicitly. It would also be nice if the two shuffle APIs looked closer to each other than they do in packed_simd so it would be more discoverable and easier to grok.

calebzulawski · 2020-12-25T02:28:01Z

One possibility we discussed is using const generics for the const shuffles (hinted at in some of the comments above). I'm not sure how close that would look to the dynamic API but I imagine it would be better than packed_simd's macros.

workingjubilee · 2021-02-03T21:13:23Z

@aldanor I am curious what you mean by "closer"? Do you mean using a more similar name? I was under the impression that choosing shuffle_dyn as a name was fine. If you mean the fact that one's a macro, yeah, as Caleb said, that's definitely something we're going to try to avoid in the public API. Ideally both should be functions that are implemented on the same struct or trait, and it will be very obvious which you are using because one will have its argument in spikes (<>) and the other will have it as a regular function parameter.

For this case, it's looking like const-generic shuffles will be unpretty (especially at first) but viable.

workingjubilee · 2021-02-16T18:15:31Z

An initial implementation was landed in #62 but we are considering other possible APIs in addition to that one, or future modifications of that one, including ones that are both more and less flexible.

calebzulawski · 2021-06-28T02:37:44Z

I looked into dynamic shuffles a little bit more and found WebAssembly/simd#68 which sums up architecture support pretty well, the bottom line is that many (most?) architectures support some sort of dynamic byte shuffle so there isn't any good reason for us not to support it. The only limiting factor is LLVM not exposing a dynamic shuffle instruction.

Wasm SIMD does support dynamic shuffles, so LLVM is generating them somehow, it's just a matter of exposing that as an instruction.

andy-thomason · 2021-06-28T10:43:50Z

This is great. I'm used to dynamic shuffles from the PowerPC/SPU architecture (vperm). They are very powerful tools, especially
for doing parsing with SIMD used for character classification. They are as close as we get to a table lookup in SIMD.

programmerjake · 2021-06-28T18:01:00Z

They are as close as we get to a table lookup in SIMD.

See Gather -- an actual SIMD table lookup.

andy-thomason · 2021-06-28T19:39:21Z

Sadly, gather is too slow to use for this purpose.

andy-thomason · 2021-06-28T19:41:18Z

https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=gather&expand=2983

Performance
Architecture Latency Throughput (CPI)
Broadwell ~19 ~6
Haswell ~20 ~10

calebzulawski · 2021-06-28T23:09:55Z

Reading more into wasm, I think dynamic shuffles were drafted but didn't have much momentum. So I don't think it ever made it into a spec and there isn't any compiler support for generic dynamic shuffles.

workingjubilee · 2021-06-29T20:02:14Z

Haswell Gather was terribly naive, essentially a stub instruction that allowed saving code size but not speed. It has improved in speed significantly on x86 since even Broadwell, so it's now reasonable to use on Lake and Zen architectures. But I of course wouldn't recommend it when a load and shuffle would do, however.

hkratz · 2021-09-22T08:11:20Z

Reading more into wasm, I think dynamic shuffles were drafted but didn't have much momentum. So I don't think it ever made it into a spec and there isn't any compiler support for generic dynamic shuffles.

@calebzulawski Byte-wise "swizzle" is supported, see https://doc.rust-lang.org/core/arch/wasm32/fn.i8x16_swizzle.html

Ygg01 · 2024-07-05T22:56:31Z

Chiming in, that dynamic swizzle is definitely still missing from portable simd.

calebzulawski mentioned this issue Oct 1, 2020

Implement SIMD-specific functions #16

Open

4 tasks

hkratz mentioned this issue May 3, 2021

Experimental stdsimd implementation rusticstuff/simdutf8#35

Closed

programmerjake mentioned this issue Jan 17, 2022

Support variable-index swizzles #226

Closed

saik0 mentioned this issue Jan 21, 2022

Will simd be introduced？ RoaringBitmap/roaring-rs#60

Closed

saik0 mentioned this issue Feb 11, 2022

Remove static jump table from swizzle_to_front RoaringBitmap/roaring-rs#207

Open

Pr0methean mentioned this issue Jan 10, 2024

Feature request: statically-linked version dzamlo/rust-testu01#2

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Do The SIMD Shuffle #11

Do The SIMD Shuffle #11

Lokathor commented Sep 28, 2020

workingjubilee commented Sep 28, 2020 •

edited

Loading

Lokathor commented Sep 28, 2020 •

edited

Loading

programmerjake commented Sep 30, 2020

workingjubilee commented Oct 6, 2020

aldanor commented Dec 25, 2020

calebzulawski commented Dec 25, 2020 •

edited

Loading

workingjubilee commented Feb 3, 2021 •

edited

Loading

workingjubilee commented Feb 16, 2021

calebzulawski commented Jun 28, 2021

andy-thomason commented Jun 28, 2021

programmerjake commented Jun 28, 2021

andy-thomason commented Jun 28, 2021

andy-thomason commented Jun 28, 2021

calebzulawski commented Jun 28, 2021

workingjubilee commented Jun 29, 2021

hkratz commented Sep 22, 2021

Ygg01 commented Jul 5, 2024

Do The SIMD Shuffle #11

Do The SIMD Shuffle #11

Comments

Lokathor commented Sep 28, 2020

workingjubilee commented Sep 28, 2020 • edited Loading

Lokathor commented Sep 28, 2020 • edited Loading

programmerjake commented Sep 30, 2020

workingjubilee commented Oct 6, 2020

aldanor commented Dec 25, 2020

calebzulawski commented Dec 25, 2020 • edited Loading

workingjubilee commented Feb 3, 2021 • edited Loading

workingjubilee commented Feb 16, 2021

calebzulawski commented Jun 28, 2021

andy-thomason commented Jun 28, 2021

programmerjake commented Jun 28, 2021

andy-thomason commented Jun 28, 2021

andy-thomason commented Jun 28, 2021

calebzulawski commented Jun 28, 2021

workingjubilee commented Jun 29, 2021

hkratz commented Sep 22, 2021

Ygg01 commented Jul 5, 2024

workingjubilee commented Sep 28, 2020 •

edited

Loading

Lokathor commented Sep 28, 2020 •

edited

Loading

calebzulawski commented Dec 25, 2020 •

edited

Loading

workingjubilee commented Feb 3, 2021 •

edited

Loading