If the shift amount is a constant, then it should be possible to implement these as const functions that output the corresponding array to pass into shuffle's const parameter.
As an example, it's possible to implement alignr (x86 doesn't have an alignl) with
```rust
const fn alignr<const LANES: usize>(shift: usize) -> [u32; LANES] {
    let mut indices = [0; LANES];
    let mut block = 0;
    while block < LANES {
        let mut idx = 0;
        while idx < 16 && idx < LANES {
            // x86 chunks its vectors into 16 byte chunks for alignr
            let offset = if shift + idx >= 16 {
                LANES + (shift + idx) % 16
            } else {
                shift + idx
            };
            indices[idx + block] = (offset + block) as u32;
            idx += 1;
        }
        block += 16;
    }
    indices
}
```
After some testing, LLVM does appear to recognize the resulting array as an alignr and simplify it down.
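The same pattern should extend to the other constant-amount cases. The following is only a sketch: `rotate_left_indices`, `shift_left_indices` and `shuffle_model` are hypothetical names, the direction convention (output lane `i` reads input lane `i + shift`) is an assumption, and `shuffle_model` is a scalar stand-in for the real shuffle, assuming that indices `0..LANES` select from the first input and `LANES..2*LANES` from the second.

```rust
/// Indices for a lane rotate: output lane i reads input lane (i + shift) % LANES.
/// (Hypothetical helper; the direction convention is an assumption.)
const fn rotate_left_indices<const LANES: usize>(shift: usize) -> [u32; LANES] {
    let mut indices = [0u32; LANES];
    let mut i = 0;
    while i < LANES {
        indices[i] = ((i + shift) % LANES) as u32;
        i += 1;
    }
    indices
}

/// Indices for a lane shift: like the rotate, but lanes that would wrap around
/// read lane 0 of the second shuffle input, which is meant to be all zeros.
const fn shift_left_indices<const LANES: usize>(shift: usize) -> [u32; LANES] {
    let mut indices = [0u32; LANES];
    let mut i = 0;
    while i < LANES {
        indices[i] = if i + shift < LANES {
            (i + shift) as u32
        } else {
            LANES as u32
        };
        i += 1;
    }
    indices
}

/// Scalar model of a two-input shuffle (not the real API):
/// indices 0..LANES read `a`, indices LANES..2*LANES read `b`.
fn shuffle_model<const LANES: usize>(
    a: [i32; LANES],
    b: [i32; LANES],
    idx: [u32; LANES],
) -> [i32; LANES] {
    let mut out = [0; LANES];
    for (o, &i) in out.iter_mut().zip(idx.iter()) {
        let i = i as usize;
        *o = if i < LANES { a[i] } else { b[i - LANES] };
    }
    out
}

// Evaluated at compile time, as a const parameter would require.
const ROTATE_BY_3: [u32; 8] = rotate_left_indices::<8>(3);
const SHIFT_BY_3: [u32; 8] = shift_left_indices::<8>(3);
```

Passing an all-zero vector as the second input turns the shift indices into "insert 0s" behaviour without any extra masking.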
I wonder if there's any point adding shift and align? They can both be implemented manually, or they can be implemented via rotate and select, which I would think the compiler could optimize. They're not the most obvious functions, so it might actually be clearer manually implementing them than having to reference the docs.
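For reference, here is roughly what the rotate-and-select route looks like, modelled on plain arrays rather than real SIMD types. All function names are hypothetical, the rotate direction (output lane `i` reads input lane `i + n`) is an assumption, and `align` here means "lane `i` reads element `i + n` of `a` followed by `b`", for `n <= N`.

```rust
use std::array;

/// Scalar model of a lane rotate: output lane i reads input lane (i + n) % N.
fn rotate_left<const N: usize>(mut a: [i32; N], n: usize) -> [i32; N] {
    if N > 0 {
        a.rotate_left(n % N);
    }
    a
}

/// Scalar model of a lane select: take `t` where the mask is set, else `f`.
fn select<const N: usize>(mask: [bool; N], t: [i32; N], f: [i32; N]) -> [i32; N] {
    array::from_fn(|i| if mask[i] { t[i] } else { f[i] })
}

/// Mask that is true for lanes whose source lane i + n is still inside the vector.
fn in_range_mask<const N: usize>(n: usize) -> [bool; N] {
    array::from_fn(|i| i + n < N)
}

/// Shift: rotate, then replace the lanes that wrapped around with zeros.
fn shift_left<const N: usize>(a: [i32; N], n: usize) -> [i32; N] {
    select(in_range_mask(n), rotate_left(a, n), [0; N])
}

/// Align (for n <= N): rotate both inputs by the same amount, then take the
/// lanes that did not wrap from `a` and the lanes that did wrap from `b`.
fn align<const N: usize>(a: [i32; N], b: [i32; N], n: usize) -> [i32; N] {
    select(in_range_mask(n), rotate_left(a, n), rotate_left(b, n))
}
```

With `a = [1, 2, 3, 4]` and `b = [5, 6, 7, 8]`, `shift_left(a, 1)` gives `[2, 3, 4, 0]` and `align(a, b, 1)` gives `[2, 3, 4, 5]`. Both are short, but the mask construction is easy to get backwards, which may be an argument either way.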
The shuffle API in #62 can handle nearly any hardware shuffle, but it's a bit clumsy to use (and the API requires full const generics).
A few common cases of shuffles should be provided as a simpler API:
- `reverse`
- `rotate`
- `shift` (like `rotate` but insert 0s)
- `align` (or maybe `rotate2`?) see "Are the alignr/alignl simd functions planned?" #78
- `interleave`/`deinterleave`
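As a rough illustration of how little is needed for the cases that take no amount at all, the index arrays for `reverse`, `interleave` and `deinterleave` could be generated as below. This is a sketch under assumptions: the helper names and the `high`/`odd` flags are hypothetical, `LANES` is assumed to be even, and indices `LANES..2*LANES` are assumed to select from the second shuffle input.

```rust
/// Indices for reversing the lanes of a single vector.
const fn reverse_indices<const LANES: usize>() -> [u32; LANES] {
    let mut indices = [0u32; LANES];
    let mut i = 0;
    while i < LANES {
        indices[i] = (LANES - 1 - i) as u32;
        i += 1;
    }
    indices
}

/// Indices for one half of interleaving `a` and `b`.
/// `high == false` yields [a0, b0, a1, b1, ...]; `high == true` yields the
/// same pattern starting from lane LANES / 2 of each input.
const fn interleave_indices<const LANES: usize>(high: bool) -> [u32; LANES] {
    let base = if high { LANES / 2 } else { 0 };
    let mut indices = [0u32; LANES];
    let mut i = 0;
    while i < LANES {
        // Even output lanes read `a`; odd output lanes read `b` (offset by LANES).
        indices[i] = (base + i / 2 + (i % 2) * LANES) as u32;
        i += 1;
    }
    indices
}

/// Indices for one half of deinterleaving `a` followed by `b`.
/// `odd == false` picks elements 0, 2, 4, ...; `odd == true` picks 1, 3, 5, ...
const fn deinterleave_indices<const LANES: usize>(odd: bool) -> [u32; LANES] {
    let mut indices = [0u32; LANES];
    let mut i = 0;
    while i < LANES {
        indices[i] = (2 * i + odd as usize) as u32;
        i += 1;
    }
    indices
}
```

Each of `interleave` and `deinterleave` needs two shuffles to produce both output halves, which is one reason a dedicated API might be friendlier than writing the index arrays by hand.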