Skip to content

Add @pext and @pdep builtins #14995

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Validark opened this issue Mar 18, 2023 · 1 comment · May be fixed by #23474
Open

Add @pext and @pdep builtins #14995

Validark opened this issue Mar 18, 2023 · 1 comment · May be fixed by #23474
Labels
accepted This proposal is planned. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@Validark
Copy link
Contributor

Validark commented Mar 18, 2023

Of the all the bit manipulation instructions available on a platform like x86_64, most of them are an optimization over fundamental operations.

https://en.wikipedia.org/wiki/X86_Bit_manipulation_instruction_set

For example, the BLSR x86_64 instruction is equivalent to x & (x - 1), and can easily be written inline or put in a function at the discretion of the programmer. We would never need a @blsr builtin because the fundamental operation of resetting the lowest set bit is already in the language.

Some of the other operations are newer fundamental operations and are builtin to Zig already, through @popCount, @ctz and @clz. I would also consider @bitReverse to be a fundamental operation, even though for some reason x86_64 lacks an instruction for it.

pdep and pext on the other hand, are not in the language, (relatively) not trivial to implement manually, and to my knowledge will only appear in the output binary as PDEP/PEXT by using inline assembly. In my view pdep and pext are now fundamental bitwise operations, just like the aforementioned operations.

Since they were first proposed by Yedidya Hilewitz and Ruby B. Lee [1, 2] they have seen a myriad of uses. Perhaps one of the biggest uses that I am aware of is in implementing the select function of rank/select data structures, which are the fundamental operations of succinct data structures and certain other bitwise structures like the Counting Quotient Filter (used for Approximate Membership Queries). Here is an article on how to implement select using pdep.

They are also used to implement bitboards, which are important for chess programming or sudoku solvers: https://news.ycombinator.com/item?id=19137798
More praise of pdep and pext: https://news.ycombinator.com/item?id=20205743

I have personally used pdep recently to check that delimiting characters are in the proper positions with SIMD. Specifically, I had a file that was of the format word\t123<digits>123\nword2\t456<digits2>456\n, etc. I moved 64 byte chunks into a vector, got a bitmap for where the '\t' and '\n' characters are, or'ed them together, and used pdep to extract every other set bit, which I can then check against the newline or tab mask to make sure the file alternates between tabs and newlines as delimiters. To grab every other set bit, one can do:

inline fn maskEvenBits(bitstring: u64) u64 {
    return pdep(0xaaaaaaaaaaaaaaaa, bitstring);
}

Here is an excerpt of my code:

const vec: @Vector(VEC_SIZE, u8) = buf[0..VEC_SIZE].*;

// Note: we cannot just match digit characters because those characters can also appear in the string section.
const tab_mask = @bitCast(std.meta.Int(.unsigned, VEC_SIZE), vec == @splat(VEC_SIZE, @as(u8, '\t')));
const newline_mask = @bitCast(std.meta.Int(.unsigned, VEC_SIZE), vec == @splat(VEC_SIZE, @as(u8, '\n')));
const newline_tab_mask = tab_mask | newline_mask;

// Validate that the file alternates between tab and newlines.
if (maskEvenBits(newline_tab_mask) != if (state == 0) tab_mask else newline_mask)
    return error.InvalidFile;

// Note: state is flipped depending on whether the 64-byte chunk is supposed to be matching strings or digits at the start of the next loop

Conveniently, I have heard that LLVM supports pdep and pext as fundamental operations. If Zig becomes the new C for the next few decades, supporting operations like pdep and pext is important for bit twiddlers to choose love Zig. It is also important from the language's perspective to define the set of fundamental bit manipulations, which should be supported on new ISA's.

Also important: #9631

@ominitay
Copy link
Contributor

ominitay commented Apr 6, 2023

I'm interested in having a go at implementing this.

The case for this as a language builtin is strong, with the clear utility of the instructions well set-out, but also with the fact that RISC-V may gain similar analogous instructions (bext/bdep) in the Bitmanip extensions.

However, I think though that perhaps a more descriptive name which fits with the style guide would be preferable, instead of @pext and @pdep. One possible idea would be @extractBits?

@andrewrk andrewrk added this to the 0.12.0 milestone Apr 6, 2023
@andrewrk andrewrk changed the title Add @pext and @pdep builtins Add @pext and @pdep builtins Apr 6, 2023
@andrewrk andrewrk added proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. accepted This proposal is planned. labels Apr 6, 2023
@andrewrk andrewrk modified the milestones: 0.13.0, 0.12.0 Jun 29, 2023
@andrewrk andrewrk modified the milestones: 0.14.0, 0.15.0 Feb 9, 2025
@bnuuydev bnuuydev linked a pull request Apr 5, 2025 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted This proposal is planned. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Projects
None yet
3 participants