Implement the WavePrefixSum HLSL Function #99172

@farzonl

Description

  • Implement WavePrefixSum clang builtin
  • Link WavePrefixSum clang builtin with hlsl_intrinsics.h
  • Add sema checks for WavePrefixSum to CheckHLSLBuiltinFunctionCall in SemaChecking.cpp
  • Add codegen for WavePrefixSum to EmitHLSLBuiltinExpr in CGBuiltin.cpp
  • Add codegen tests to clang/test/CodeGenHLSL/builtins/WavePrefixSum.hlsl
  • Add sema tests to clang/test/SemaHLSL/BuiltIns/WavePrefixSum-errors.hlsl
  • Create the int_dx_WavePrefixSum intrinsic in IntrinsicsDirectX.td
  • Create the DXILOpMapping of int_dx_WavePrefixSum to 121 in DXIL.td
  • Create the WavePrefixSum.ll and WavePrefixSum_errors.ll tests in llvm/test/CodeGen/DirectX/
  • Create the int_spv_WavePrefixSum intrinsic in IntrinsicsSPIRV.td
  • In SPIRVInstructionSelector.cpp create the WavePrefixSum lowering and map it to int_spv_WavePrefixSum in SPIRVInstructionSelector::selectIntrinsic
  • Create SPIR-V backend test case in llvm/test/CodeGen/SPIRV/hlsl-intrinsics/WavePrefixSum.ll

DirectX

| DXIL Opcode | DXIL OpName | Shader Model | Shader Stages |
| --- | --- | --- | --- |
| 121 | WavePrefixOp | 6.0 | ('library', 'compute', 'amplification', 'mesh', 'pixel', 'vertex', 'hull', 'domain', 'geometry', 'raygeneration', 'intersection', 'anyhit', 'closesthit', 'miss', 'callable', 'node') |

SPIR-V

OpGroupNonUniformFAdd:

Description:

A floating point add group operation of all Value
operands contributed by active invocations in the
group.

Result Type must be a scalar or vector of floating-point type.

Execution is a Scope that identifies the group of
invocations affected by this command. It must be Subgroup.

The identity I for Operation is 0. If Operation is
ClusteredReduce, ClusterSize must be present.

The type of Value must be the same as Result Type. The method used
to perform the group operation on the contributed Value(s) from active
invocations is implementation defined.

ClusterSize is the size of cluster to use. ClusterSize must be a
scalar of integer type, whose Signedness operand is 0.
ClusterSize must come from a constant instruction. Behavior is undefined unless
ClusterSize is at least 1 and a power of 2. If ClusterSize is
greater than the size of the group, executing this instruction
results in undefined behavior.

Capability:
GroupNonUniformArithmetic, GroupNonUniformClustered,
GroupNonUniformPartitionedNV

Missing before version 1.3.

| Word Count | Opcode | Results | Operands |
| --- | --- | --- | --- |
| 6 + variable | 350 | <id> Result Type, Result <id> | Scope <id> Execution, Group Operation Operation, <id> Value, Optional <id> ClusterSize |
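The spec text above leaves the choice of Group Operation open. As a rough CPU-side illustration, the sketch below models how a floating-point add behaves under Reduce, InclusiveScan, and ExclusiveScan, with identity 0 as stated above. The enum and function names are hypothetical and are not taken from any SPIR-V header or LLVM API, and all invocations are assumed active. A prefix sum most naturally corresponds to ExclusiveScan; that mapping is an assumption here, since the issue does not name the Group Operation. Note also that OpGroupNonUniformFAdd covers only floating-point types; the integer overloads would presumably need OpGroupNonUniformIAdd.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Local model of three SPIR-V Group Operation kinds (hypothetical enum,
// not the one defined in the SPIR-V headers).
enum class GroupOp { Reduce, InclusiveScan, ExclusiveScan };

// Models a floating-point add group operation on the CPU.
// values[i] is the Value contributed by invocation i; every invocation is
// assumed to be active. The summation order is fixed here (low to high
// index), whereas a real implementation may use any order.
std::vector<float> groupFAddModel(GroupOp op, const std::vector<float> &values) {
  const float identity = 0.0f; // identity I for the add operation
  std::vector<float> result(values.size(), identity);
  float running = identity;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (op == GroupOp::ExclusiveScan)
      result[i] = running; // sum of invocations before i only
    running += values[i];
    if (op == GroupOp::InclusiveScan)
      result[i] = running; // sum of invocations up to and including i
  }
  if (op == GroupOp::Reduce) {
    for (float &r : result)
      r = running; // every invocation receives the total
  }
  assert(result.size() == values.size());
  return result;
}
```

Calling `groupFAddModel(GroupOp::ExclusiveScan, {1, 2, 3, 4})` gives the first invocation the identity value, which matches the HLSL remark that the lowest active lane receives 0.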

Test Case(s)

Example 1

//dxc WavePrefixSum_test.hlsl -T lib_6_8 -enable-16bit-types -O0

export float4 fn(float4 p1) {
    return WavePrefixSum(p1);
}

Example 2

//dxc WavePrefixSum_1_test.hlsl -T lib_6_8 -enable-16bit-types -O0

export uint4 fn(uint4 p1) {
    return WavePrefixSum(p1);
}

Example 3

//dxc WavePrefixSum_2_test.hlsl -T lib_6_8 -enable-16bit-types -O0

export int4 fn(int4 p1) {
    return WavePrefixSum(p1);
}

HLSL:

Returns the sum of all of the values in the active lanes with smaller indices than this one.

Syntax

<type> WavePrefixSum(
   <type> value
);

Parameters

value

The value to sum up.

Return value

The sum of the values from the active lanes with smaller indices than the current lane.

Remarks

The order of operations within this routine is not guaranteed, so the [precise] flag is effectively ignored inside it.
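Evaluation order matters for floating-point inputs because float addition is not associative. The small C++ sketch below (the helper names are hypothetical, chosen only for illustration) adds the same three values in two different orders and gets two different results, so codegen tests for float overloads should probably not rely on one exact summation order.

```cpp
#include <cassert>

// Adds three floats grouping from the left: (a + b) + c.
float sumLeftToRight(float a, float b, float c) { return (a + b) + c; }

// Adds the same three floats grouping from the right: a + (b + c).
float sumRightToLeft(float a, float b, float c) { return a + (b + c); }
```

With `a = 1e20f`, `b = -1e20f`, and `c = 1.0f`, the left grouping cancels the large terms first and keeps the 1, while the right grouping absorbs the 1 into `-1e20f` and loses it.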

A postfix sum can be computed by adding the prefix sum to the current lane's value.
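The postfix relationship can be sketched on the CPU as follows. This is a minimal model, assuming that "postfix sum" means the inclusive sum (all earlier active lanes plus the current lane) and that every lane is active. `postfixFromPrefix` is a hypothetical helper, not an HLSL or LLVM function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// values[i] is lane i's input and prefix[i] is the value WavePrefixSum
// would return on lane i. Returns, per lane, prefix[i] + values[i].
std::vector<uint32_t> postfixFromPrefix(const std::vector<uint32_t> &values,
                                        const std::vector<uint32_t> &prefix) {
  assert(values.size() == prefix.size());
  std::vector<uint32_t> postfix(values.size());
  for (std::size_t lane = 0; lane < values.size(); ++lane)
    postfix[lane] = prefix[lane] + values[lane]; // add the lane's own value
  return postfix;
}
```

In a shader the same step is a single expression per lane: the prefix result plus the value passed to WavePrefixSum.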

Note that the active lane with the lowest index will always receive a 0 for its prefix sum.

This function is supported from shader model 6.0 in all shader stages.

Examples

uint numToSum = 2;
uint prefixSum = WavePrefixSum( numToSum );

On a machine with a wave size of 8, and all lanes active except lanes 0 and 4, the following values would be returned from WavePrefixSum.

| lane index | status | prefixSum |
| --- | --- | --- |
| 0 | inactive | n/a |
| 1 | active | = 0 |
| 2 | active | = 0+2 |
| 3 | active | = 0+2+2 |
| 4 | inactive | n/a |
| 5 | active | = 0+2+2+2 |
| 6 | active | = 0+2+2+2+2 |
| 7 | active | = 0+2+2+2+2+2 |
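The lane table can be reproduced with a small CPU reference model, which may also help when deriving expected values for the codegen tests. This is only a sketch: `LaneResult` and `wavePrefixSumModel` are hypothetical names, not part of clang, LLVM, or DXC, and the model sums in ascending lane order, which real hardware is not required to do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Result for one lane. prefixSum is meaningful only when active is true,
// mirroring the "n/a" rows in the table.
struct LaneResult {
  bool active;
  uint32_t prefixSum;
};

// CPU model of WavePrefixSum for uint inputs.
// values[i] is lane i's input; active[i] says whether lane i participates.
std::vector<LaneResult> wavePrefixSumModel(const std::vector<uint32_t> &values,
                                           const std::vector<bool> &active) {
  assert(values.size() == active.size());
  std::vector<LaneResult> result(values.size(), LaneResult{false, 0});
  uint32_t running = 0; // sum over active lanes with smaller indices
  for (std::size_t lane = 0; lane < values.size(); ++lane) {
    if (!active[lane])
      continue; // inactive lanes neither contribute nor receive a result
    result[lane] = LaneResult{true, running};
    running += values[lane];
  }
  return result;
}
```

Feeding it eight lanes that all hold 2, with lanes 0 and 4 inactive, yields 0, 2, 4, 6, 8, and 10 on lanes 1, 2, 3, 5, 6, and 7, which are the sums written out in the table.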

See also

Overview of Shader Model 6

Shader Model 6

        Implement the `WavePrefixSum` HLSL Function · Issue #99172 · llvm/llvm-project