You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
For example, the BLSR x86_64 instruction is equivalent to x & (x - 1), and can easily be written inline or put in a function at the discretion of the programmer. We would never need a @blsr builtin because the fundamental operation of resetting the lowest set bit is already in the language.
Some of the other operations are newer fundamental operations and are builtin to Zig already, through @popCount, @ctz and @clz. I would also consider @bitReverse to be a fundamental operation, even though for some reason x86_64 lacks an instruction for it.
pdep and pext on the other hand, are not in the language, (relatively) not trivial to implement manually, and to my knowledge will only appear in the output binary as PDEP/PEXT by using inline assembly. In my view pdep and pext are now fundamental bitwise operations, just like the aforementioned operations.
Since they were first proposed by Yedidya Hilewitz and Ruby B. Lee [1, 2] they have seen a myriad of uses. Perhaps one of the biggest uses that I am aware of is in implementing the select function of rank/select data structures, which are the fundamental operations of succinct data structures and certain other bitwise structures like the Counting Quotient Filter (used for Approximate Membership Queries). Here is an article on how to implement select using pdep.
I have personally used pdep recently to check that delimiting characters are in the proper positions with SIMD. Specifically, I had a file that was of the format word\t123<digits>123\nword2\t456<digits2>456\n, etc. I moved 64 byte chunks into a vector, got a bitmap for where the '\t' and '\n' characters are, or'ed them together, and used pdep to extract every other set bit, which I can then check against the newline or tab mask to make sure the file alternates between tabs and newlines as delimiters. To grab every other set bit, one can do:
constvec: @Vector(VEC_SIZE, u8) =buf[0..VEC_SIZE].*;
// Note: we cannot just match digit characters because those characters can also appear in the string section.consttab_mask=@bitCast(std.meta.Int(.unsigned, VEC_SIZE), vec==@splat(VEC_SIZE, @as(u8, '\t')));
constnewline_mask=@bitCast(std.meta.Int(.unsigned, VEC_SIZE), vec==@splat(VEC_SIZE, @as(u8, '\n')));
constnewline_tab_mask=tab_mask|newline_mask;
// Validate that the file alternates between tab and newlines.if (maskEvenBits(newline_tab_mask) !=if (state==0) tab_maskelsenewline_mask)
returnerror.InvalidFile;
// Note: state is flipped depending on whether the 64-byte chunk is supposed to be matching strings or digits at the start of the next loop
Conveniently, I have heard that LLVM supports pdep and pext as fundamental operations. If Zig becomes the new C for the next few decades, supporting operations like pdep and pext is important for bit twiddlers to choose love Zig. It is also important from the language's perspective to define the set of fundamental bit manipulations, which should be supported on new ISA's.
I'm interested in having a go at implementing this.
The case for this as a language builtin is strong, with the clear utility of the instructions well set-out, but also with the fact that RISC-V may gain similar analogous instructions (bext/bdep) in the Bitmanip extensions.
However, I think though that perhaps a more descriptive name which fits with the style guide would be preferable, instead of @pext and @pdep. One possible idea would be @extractBits?
andrewrk
changed the title
Add @pext and @pdep builtins
Add @pext and @pdep builtins
Apr 6, 2023
andrewrk
added
proposal
This issue suggests modifications. If it also has the "accepted" label then it is planned.
accepted
This proposal is planned.
labels
Apr 6, 2023
Of the all the bit manipulation instructions available on a platform like x86_64, most of them are an optimization over fundamental operations.
https://en.wikipedia.org/wiki/X86_Bit_manipulation_instruction_set
For example, the BLSR x86_64 instruction is equivalent to
x & (x - 1)
, and can easily be written inline or put in a function at the discretion of the programmer. We would never need a@blsr
builtin because the fundamental operation of resetting the lowest set bit is already in the language.Some of the other operations are newer fundamental operations and are builtin to Zig already, through
@popCount
,@ctz
and@clz
. I would also consider@bitReverse
to be a fundamental operation, even though for some reason x86_64 lacks an instruction for it.pdep
andpext
on the other hand, are not in the language, (relatively) not trivial to implement manually, and to my knowledge will only appear in the output binary as PDEP/PEXT by using inline assembly. In my viewpdep
andpext
are now fundamental bitwise operations, just like the aforementioned operations.Since they were first proposed by Yedidya Hilewitz and Ruby B. Lee [1, 2] they have seen a myriad of uses. Perhaps one of the biggest uses that I am aware of is in implementing the
select
function ofrank
/select
data structures, which are the fundamental operations of succinct data structures and certain other bitwise structures like the Counting Quotient Filter (used for Approximate Membership Queries). Here is an article on how to implementselect
usingpdep
.They are also used to implement bitboards, which are important for chess programming or sudoku solvers: https://news.ycombinator.com/item?id=19137798
More praise of
pdep
andpext
: https://news.ycombinator.com/item?id=20205743I have personally used
pdep
recently to check that delimiting characters are in the proper positions with SIMD. Specifically, I had a file that was of the formatword\t123<digits>123\nword2\t456<digits2>456\n
, etc. I moved 64 byte chunks into a vector, got a bitmap for where the '\t' and '\n' characters are, or'ed them together, and usedpdep
to extract every other set bit, which I can then check against the newline or tab mask to make sure the file alternates between tabs and newlines as delimiters. To grab every other set bit, one can do:Here is an excerpt of my code:
Conveniently, I have heard that LLVM supports
pdep
andpext
as fundamental operations. If Zig becomes the new C for the next few decades, supporting operations likepdep
andpext
is important for bit twiddlers tochooselove Zig. It is also important from the language's perspective to define the set of fundamental bit manipulations, which should be supported on new ISA's.Also important: #9631
The text was updated successfully, but these errors were encountered: