AMDGPU Builtins¶
This document describes the AMDGPU target-specific builtins available in Clang. Most of these builtins provide direct access to AMDGPU hardware instructions and intrinsics.
All AMDGPU builtins use the __builtin_amdgcn_ prefix (or __builtin_r600_
for R600 targets). Some arguments must be compile-time constant expressions;
this is noted in the descriptions where applicable.
Warning
These builtins, including their names, arguments, and target requirements, are all subject to change without warning across LLVM releases.
Note
This document is a work in progress. Not all builtins are fully documented yet. The initial descriptions were generated with AI assistance, cross-referencing the following sources:
clang/include/clang/Basic/BuiltinsAMDGPU.td (builtin definitions)
llvm/include/llvm/IR/IntrinsicsAMDGPU.td (intrinsic definitions)
clang/lib/Sema/SemaAMDGPU.cpp (argument validation and constraints)
clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp (lowering logic)
GPUOpen Machine-Readable ISA (ISA documents)
ABI / Special Register Builtins¶
These builtins provide access to kernel dispatch metadata, work-item and workgroup identification, and other ABI-level information. They are available on all AMDGCN targets unless otherwise noted.
__builtin_amdgcn_cluster_id_x¶
Prototype:
unsigned int __builtin_amdgcn_cluster_id_x()
Target Features: gfx1250-insts
Returns the cluster ID in the X dimension.
__builtin_amdgcn_cluster_id_y¶
Prototype:
unsigned int __builtin_amdgcn_cluster_id_y()
Target Features: gfx1250-insts
Returns the cluster ID in the Y dimension.
__builtin_amdgcn_cluster_id_z¶
Prototype:
unsigned int __builtin_amdgcn_cluster_id_z()
Target Features: gfx1250-insts
Returns the cluster ID in the Z dimension.
__builtin_amdgcn_cluster_workgroup_flat_id¶
Prototype:
unsigned int __builtin_amdgcn_cluster_workgroup_flat_id()
Target Features: gfx1250-insts
Returns the flat (linearized) workgroup ID within the cluster.
__builtin_amdgcn_cluster_workgroup_id_x¶
Prototype:
unsigned int __builtin_amdgcn_cluster_workgroup_id_x()
Target Features: gfx1250-insts
Returns the workgroup ID within the cluster in the X dimension.
__builtin_amdgcn_cluster_workgroup_id_y¶
Prototype:
unsigned int __builtin_amdgcn_cluster_workgroup_id_y()
Target Features: gfx1250-insts
Returns the workgroup ID within the cluster in the Y dimension.
__builtin_amdgcn_cluster_workgroup_id_z¶
Prototype:
unsigned int __builtin_amdgcn_cluster_workgroup_id_z()
Target Features: gfx1250-insts
Returns the workgroup ID within the cluster in the Z dimension.
__builtin_amdgcn_cluster_workgroup_max_flat_id¶
Prototype:
unsigned int __builtin_amdgcn_cluster_workgroup_max_flat_id()
Target Features: gfx1250-insts
Returns the maximum flat (linearized) workgroup ID within the cluster.
__builtin_amdgcn_cluster_workgroup_max_id_x¶
Prototype:
unsigned int __builtin_amdgcn_cluster_workgroup_max_id_x()
Target Features: gfx1250-insts
Returns the maximum workgroup ID within the cluster in the X dimension.
__builtin_amdgcn_cluster_workgroup_max_id_y¶
Prototype:
unsigned int __builtin_amdgcn_cluster_workgroup_max_id_y()
Target Features: gfx1250-insts
Returns the maximum workgroup ID within the cluster in the Y dimension.
__builtin_amdgcn_cluster_workgroup_max_id_z¶
Prototype:
unsigned int __builtin_amdgcn_cluster_workgroup_max_id_z()
Target Features: gfx1250-insts
Returns the maximum workgroup ID within the cluster in the Z dimension.
__builtin_amdgcn_dispatch_ptr¶
Prototype:
void address_space<4> * __builtin_amdgcn_dispatch_ptr()
Returns a read-only pointer to the dispatch packet, which contains workgroup size, grid size, and other dispatch parameters.
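As a rough illustration of what "workgroup size, grid size, and other dispatch parameters" means in practice, the sketch below reads a few fields out of a dispatch packet by byte offset. The offsets follow the `hsa_kernel_dispatch_packet_t` layout from the HSA runtime specification, which is an assumption brought in from outside this document; the helper names (`pkt_read_u16`, `pkt_read_u32`, the `PKT_OFF_*` constants) are hypothetical. It is written as host-side C over a plain byte buffer so it can be tried without a GPU; on a device the pointer would come from `__builtin_amdgcn_dispatch_ptr()` and live in address space 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte offsets of selected fields in the HSA kernel dispatch packet.
   ASSUMPTION: taken from the HSA runtime specification, not from this
   document; check them against your runtime headers. */
enum {
    PKT_OFF_WORKGROUP_SIZE_X = 4,  /* uint16_t */
    PKT_OFF_WORKGROUP_SIZE_Y = 6,  /* uint16_t */
    PKT_OFF_WORKGROUP_SIZE_Z = 8,  /* uint16_t */
    PKT_OFF_GRID_SIZE_X      = 12, /* uint32_t */
    PKT_OFF_GRID_SIZE_Y      = 16, /* uint32_t */
    PKT_OFF_GRID_SIZE_Z      = 20  /* uint32_t */
};

/* Read a 16-bit field at the given byte offset (hypothetical helper). */
static inline uint16_t pkt_read_u16(const void *pkt, size_t off) {
    uint16_t v;
    assert(pkt != NULL);
    memcpy(&v, (const unsigned char *)pkt + off, sizeof v);
    return v;
}

/* Read a 32-bit field at the given byte offset (hypothetical helper). */
static inline uint32_t pkt_read_u32(const void *pkt, size_t off) {
    uint32_t v;
    assert(pkt != NULL);
    memcpy(&v, (const unsigned char *)pkt + off, sizeof v);
    return v;
}
```

In most code the dedicated `__builtin_amdgcn_workgroup_size_*` and `__builtin_amdgcn_grid_size_*` builtins are the simpler way to get these values; reading the packet directly is mainly useful for fields that have no builtin of their own.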
__builtin_amdgcn_grid_size_x¶
Prototype:
unsigned int __builtin_amdgcn_grid_size_x()
Returns the total grid size in the X dimension.
__builtin_amdgcn_grid_size_y¶
Prototype:
unsigned int __builtin_amdgcn_grid_size_y()
Returns the total grid size in the Y dimension.
__builtin_amdgcn_grid_size_z¶
Prototype:
unsigned int __builtin_amdgcn_grid_size_z()
Returns the total grid size in the Z dimension.
__builtin_amdgcn_implicitarg_ptr¶
Prototype:
void address_space<4> * __builtin_amdgcn_implicitarg_ptr()
Returns a pointer to the implicit arguments appended after explicit kernel arguments. Layout depends on the code object version.
__builtin_amdgcn_kernarg_segment_ptr¶
Prototype:
void address_space<4> * __builtin_amdgcn_kernarg_segment_ptr()
Returns a pointer to the beginning of the kernel argument segment.
__builtin_amdgcn_queue_ptr¶
Prototype:
void address_space<4> * __builtin_amdgcn_queue_ptr()
Returns a pointer to the queue_t object for the queue executing the
current kernel.
__builtin_amdgcn_workgroup_id_x¶
Prototype:
unsigned int __builtin_amdgcn_workgroup_id_x()
Returns the workgroup ID in the X dimension.
__builtin_amdgcn_workgroup_id_y¶
Prototype:
unsigned int __builtin_amdgcn_workgroup_id_y()
Returns the workgroup ID in the Y dimension.
__builtin_amdgcn_workgroup_id_z¶
Prototype:
unsigned int __builtin_amdgcn_workgroup_id_z()
Returns the workgroup ID in the Z dimension.
__builtin_amdgcn_workgroup_size_x¶
Prototype:
unsigned short __builtin_amdgcn_workgroup_size_x()
Returns the workgroup size in the X dimension.
__builtin_amdgcn_workgroup_size_y¶
Prototype:
unsigned short __builtin_amdgcn_workgroup_size_y()
Returns the workgroup size in the Y dimension.
__builtin_amdgcn_workgroup_size_z¶
Prototype:
unsigned short __builtin_amdgcn_workgroup_size_z()
Returns the workgroup size in the Z dimension.
__builtin_amdgcn_workitem_id_x¶
Prototype:
unsigned int __builtin_amdgcn_workitem_id_x()
Returns the work-item (thread) ID within the workgroup in the X dimension.
__builtin_amdgcn_workitem_id_y¶
Prototype:
unsigned int __builtin_amdgcn_workitem_id_y()
Returns the work-item (thread) ID within the workgroup in the Y dimension.
__builtin_amdgcn_workitem_id_z¶
Prototype:
unsigned int __builtin_amdgcn_workitem_id_z()
Returns the work-item (thread) ID within the workgroup in the Z dimension.
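The workgroup ID, workgroup size, and work-item ID builtins are commonly combined into a global index. The following is a minimal sketch of that arithmetic for the X dimension, assuming every workgroup in the dispatch has the same size. The function names `global_id_1d` and `global_id_x` are hypothetical; the device wrapper is only compiled when Clang defines `__AMDGCN__`, so the pure helper can also be exercised on a host.

```c
#include <assert.h>
#include <stdint.h>

/* Pure arithmetic: global index = workgroup ID * workgroup size + local ID.
   ASSUMPTION: uniform workgroup size across the dispatch. */
static inline uint32_t global_id_1d(uint32_t wg_id, uint32_t wg_size,
                                    uint32_t wi_id) {
    assert(wi_id < wg_size);
    return wg_id * wg_size + wi_id;
}

#if defined(__AMDGCN__)
/* Device-only wrapper around the builtins described above. */
static inline uint32_t global_id_x(void) {
    return global_id_1d(__builtin_amdgcn_workgroup_id_x(),
                        __builtin_amdgcn_workgroup_size_x(),
                        __builtin_amdgcn_workitem_id_x());
}
#endif
```

A kernel would typically compare the result against `__builtin_amdgcn_grid_size_x()` before touching memory, since the grid size need not be a multiple of the workgroup size.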
WMMA (Wave Matrix Multiply-Accumulate) Builtins¶
WMMA builtins perform wave-cooperative matrix multiply-accumulate of the form
D = A * B + C. The work is distributed across all lanes in the wavefront.
The _w32 suffix indicates a wave32 variant and the _w64 suffix a wave64 variant.
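For reference, the mathematical operation D = A * B + C can be written as a plain scalar loop. The sketch below is a host-side model for the 16x16x16 shape in f32 and says nothing about how elements are distributed across lanes or registers; the name `wmma_reference` and the row-major flat layout are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

enum { WMMA_M = 16, WMMA_N = 16, WMMA_K = 16 };

/* Scalar reference for D = A * B + C.
   a: M x K, b: K x N, c and d: M x N, all row-major flat arrays
   (ASSUMPTION: this layout is for the model only, not the hardware). */
static void wmma_reference(const float *a, const float *b,
                           const float *c, float *d) {
    assert(a != NULL && b != NULL && c != NULL && d != NULL);
    for (int i = 0; i < WMMA_M; ++i) {
        for (int j = 0; j < WMMA_N; ++j) {
            float acc = c[i * WMMA_N + j];
            for (int k = 0; k < WMMA_K; ++k)
                acc += a[i * WMMA_K + k] * b[k * WMMA_N + j];
            d[i * WMMA_N + j] = acc;
        }
    }
}
```

A model like this can serve as a correctness oracle when testing kernels that use the builtins, keeping in mind that the hardware may round intermediate results differently.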
__builtin_amdgcn_wmma_bf16_16x16x16_bf16_tied_w32¶
Prototype:
_ExtVector<16, short> __builtin_amdgcn_wmma_bf16_16x16x16_bf16_tied_w32(_ExtVector<16, short> a, _ExtVector<16, short> b, _ExtVector<16, short> c, _Constant bool opsel)
Target Features: wmma-256b-insts,wavefrontsize32
Same as the non-tied f16/bf16 WMMA variant, but the destination register is tied to the input accumulator: the unselected 16-bit half is preserved from the input rather than being undefined.
On GFX11, the A and B input matrices require replication (full 256-bit operands).
__builtin_amdgcn_wmma_bf16_16x16x16_bf16_tied_w64¶
Prototype:
_ExtVector<8, short> __builtin_amdgcn_wmma_bf16_16x16x16_bf16_tied_w64(_ExtVector<16, short> a, _ExtVector<16, short> b, _ExtVector<8, short> c, _Constant bool opsel)
Target Features: wmma-256b-insts,wavefrontsize64
Same as the non-tied f16/bf16 WMMA variant, but the destination register is tied to the input accumulator: the unselected 16-bit half is preserved from the input rather than being undefined.
On GFX11, the A and B input matrices require replication (full 256-bit operands).
__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w32¶
Prototype:
_ExtVector<16, short> __builtin_amdgcn_wmma_bf16_16x16x16_bf16_w32(_ExtVector<16, short> a, _ExtVector<16, short> b, _ExtVector<16, short> c, _Constant bool opsel)
Target Features: wmma-256b-insts,wavefrontsize32
Multiplies a 16x16 matrix A by a 16x16 matrix B and adds the
accumulator C, where all operands and the result are f16 or bf16.
The opsel argument selects which 16-bit half of the accumulator
registers to read from and write to. The content of the other 16-bit
half is undefined.
On GFX11, the A and B input matrices require replication (full 256-bit operands).
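The effect of opsel on a single 32-bit accumulator register can be modelled on the host as below. This is a sketch under two assumptions: that opsel = false selects the low half (consistent with the GFX12 entries, which describe the low half as the implicit opsel = false case), and that the helper names `acc_read_half` and `acc_write_half_tied` are hypothetical. The write helper models the tied variants, which preserve the unselected half; for the non-tied variants that half is undefined, so no model is given.

```c
#include <assert.h>
#include <stdint.h>

/* Read the 16-bit half of a 32-bit accumulator register chosen by opsel.
   ASSUMPTION: opsel == 0 is the low half, opsel == 1 the high half. */
static inline uint16_t acc_read_half(uint32_t reg, int opsel) {
    assert(opsel == 0 || opsel == 1);
    return (uint16_t)(reg >> (opsel ? 16 : 0));
}

/* Write one half and keep the other unchanged (tied-variant behaviour). */
static inline uint32_t acc_write_half_tied(uint32_t reg, uint16_t value,
                                           int opsel) {
    assert(opsel == 0 || opsel == 1);
    unsigned shift = opsel ? 16u : 0u;
    uint32_t mask = (uint32_t)0xFFFFu << shift;
    return (reg & ~mask) | ((uint32_t)value << shift);
}
```

With the tied variants, two WMMA results can share one set of accumulator registers by issuing one call with opsel = false and another with opsel = true.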
__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w32_gfx12¶
Prototype:
_ExtVector<8, short> __builtin_amdgcn_wmma_bf16_16x16x16_bf16_w32_gfx12(_ExtVector<8, short> a, _ExtVector<8, short> b, _ExtVector<8, short> c)
Target Features: wmma-128b-insts,wavefrontsize32
Multiplies a 16x16 matrix A by a 16x16 matrix B and adds the
accumulator C, where all operands and the result are f16 or bf16.
Returns the result matrix.
On GFX12, the output always uses the low 16-bit half of the accumulator
registers (the opsel bit is implicitly false). In these builtins,
no explicit A/B matrix replication is required in the arguments.
__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w64¶
Prototype:
_ExtVector<8, short> __builtin_amdgcn_wmma_bf16_16x16x16_bf16_w64(_ExtVector<16, short> a, _ExtVector<16, short> b, _ExtVector<8, short> c, _Constant bool opsel)
Target Features: wmma-256b-insts,wavefrontsize64
Multiplies a 16x16 matrix A by a 16x16 matrix B and adds the
accumulator C, where all operands and the result are f16 or bf16.
The opsel argument selects which 16-bit half of the accumulator
registers to read from and write to. The content of the other 16-bit
half is undefined.
On GFX11, the A and B input matrices require replication (full 256-bit operands).
__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w64_gfx12¶
Prototype:
_ExtVector<4, short> __builtin_amdgcn_wmma_bf16_16x16x16_bf16_w64_gfx12(_ExtVector<4, short> a, _ExtVector<4, short> b, _ExtVector<4, short> c)
Target Features: wmma-128b-insts,wavefrontsize64
Multiplies a 16x16 matrix A by a 16x16 matrix B and adds the
accumulator C, where all operands and the result are f16 or bf16.
Returns the result matrix.
On GFX12, the output always uses the low 16-bit half of the accumulator
registers (the opsel bit is implicitly false). In these builtins,
no explicit A/B matrix replication is required in the arguments.
__builtin_amdgcn_wmma_bf16_16x16x32_bf16¶
Prototype:
_ExtVector<8, __bf16> __builtin_amdgcn_wmma_bf16_16x16x32_bf16(_Constant bool a_neg, _ExtVector<16, __bf16> a, _Constant bool b_neg, _ExtVector<16, __bf16> b, _Constant short c_mod, _ExtVector<8, __bf16> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with f16 or bf16 inputs and matching
f16 or bf16 output, using a 32-deep K dimension. Multiplies a 16x32
matrix A by a 32x16 matrix B and adds the 16x16 accumulator C.
a_neg/b_neg: if true, negate the corresponding input matrix.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
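The modifier arguments can be read as per-element sign operations. The sketch below models them on the host for a single f32 element. It assumes that the c_mod encoding is a two-bit field (bit 0 = neg, bit 1 = abs), which matches the four listed values, and that the modifiers act on the sign bit only; both points are assumptions, and the names `apply_c_mod` and `apply_neg` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Apply the accumulator modifier to one element.
   0 = none, 1 = neg, 2 = abs, 3 = neg(abs).
   ASSUMPTION: modelled as sign-bit operations, abs applied before neg. */
static inline float apply_c_mod(float c, int c_mod) {
    uint32_t bits;
    assert(c_mod >= 0 && c_mod <= 3);
    memcpy(&bits, &c, sizeof bits);
    if (c_mod & 2) bits &= 0x7FFFFFFFu; /* abs: clear the sign bit */
    if (c_mod & 1) bits ^= 0x80000000u; /* neg: flip the sign bit */
    memcpy(&c, &bits, sizeof c);
    return c;
}

/* Model of a_neg / b_neg for one input element. */
static inline float apply_neg(float x, int neg) {
    return neg ? -x : x;
}
```

Because these arguments must be compile-time constants in the builtins, a caller that needs several modifier combinations has to emit a separate call for each one rather than pass a runtime value.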
__builtin_amdgcn_wmma_bf16f32_16x16x32_bf16¶
Prototype:
_ExtVector<8, __bf16> __builtin_amdgcn_wmma_bf16f32_16x16x32_bf16(_Constant bool a_neg, _ExtVector<16, __bf16> a, _Constant bool b_neg, _ExtVector<16, __bf16> b, _Constant short c_mod, _ExtVector<8, float> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with bf16 inputs, f32 accumulator, and bf16
output. Uses a 32-deep K dimension. Multiplies a 16x32 matrix A by a
32x16 matrix B, adds the 16x16 f32 accumulator C, and converts the
result to bf16.
a_neg/b_neg: if true, negate the corresponding input matrix.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
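To make the final "converts the result to bf16" step concrete, here is a host-side sketch of an f32 to bf16 conversion. bf16 keeps the upper 16 bits of the IEEE-754 binary32 encoding; the rounding mode shown (round to nearest, ties to even) is an assumption, since this document does not state which rounding the instruction uses. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Convert f32 to a bf16 bit pattern.
   ASSUMPTION: round to nearest, ties to even; the hardware rounding mode
   is not specified in this document. */
static inline uint16_t float_to_bf16_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u)   /* NaN: keep it a quiet NaN */
        return (uint16_t)((bits >> 16) | 0x0040u);
    bits += 0x7FFFu + ((bits >> 16) & 1u);    /* rounding bias */
    return (uint16_t)(bits >> 16);
}

/* Widen a bf16 bit pattern back to f32 (exact). */
static inline float bf16_bits_to_float(uint16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

The same bit patterns are what the older 16x16x16 bf16 builtins carry in their `short` vector elements, so helpers of this kind are also handy for preparing test inputs on the host.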
__builtin_amdgcn_wmma_f16_16x16x128_bf8_bf8¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_wmma_f16_16x16x128_bf8_bf8(_ExtVector<16, int> a, _ExtVector<16, int> b, _Constant short c_mod, _ExtVector<8, _Float16> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with FP8 or BF8 inputs and f32 or f16
output, using a 128-deep K dimension. Multiplies a 16x128 matrix A
by a 128x16 matrix B and adds the 16x16 accumulator C.
A and B contain packed 8-bit floats passed as integer vectors.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_f16_16x16x128_bf8_fp8¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_wmma_f16_16x16x128_bf8_fp8(_ExtVector<16, int> a, _ExtVector<16, int> b, _Constant short c_mod, _ExtVector<8, _Float16> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with FP8 or BF8 inputs and f32 or f16
output, using a 128-deep K dimension. Multiplies a 16x128 matrix A
by a 128x16 matrix B and adds the 16x16 accumulator C.
A and B contain packed 8-bit floats passed as integer vectors.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_f16_16x16x128_fp8_bf8¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_wmma_f16_16x16x128_fp8_bf8(_ExtVector<16, int> a, _ExtVector<16, int> b, _Constant short c_mod, _ExtVector<8, _Float16> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with FP8 or BF8 inputs and f32 or f16
output, using a 128-deep K dimension. Multiplies a 16x128 matrix A
by a 128x16 matrix B and adds the 16x16 accumulator C.
A and B contain packed 8-bit floats passed as integer vectors.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_f16_16x16x128_fp8_fp8¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_wmma_f16_16x16x128_fp8_fp8(_ExtVector<16, int> a, _ExtVector<16, int> b, _Constant short c_mod, _ExtVector<8, _Float16> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with FP8 or BF8 inputs and f32 or f16
output, using a 128-deep K dimension. Multiplies a 16x128 matrix A
by a 128x16 matrix B and adds the 16x16 accumulator C.
A and B contain packed 8-bit floats passed as integer vectors.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_f16_16x16x16_f16_tied_w32¶
Prototype:
_ExtVector<16, _Float16> __builtin_amdgcn_wmma_f16_16x16x16_f16_tied_w32(_ExtVector<16, _Float16> a, _ExtVector<16, _Float16> b, _ExtVector<16, _Float16> c, _Constant bool opsel)
Target Features: wmma-256b-insts,wavefrontsize32
Same as the non-tied f16/bf16 WMMA variant, but the destination register is tied to the input accumulator: the unselected 16-bit half is preserved from the input rather than being undefined.
On GFX11, the A and B input matrices require replication (full 256-bit operands).
__builtin_amdgcn_wmma_f16_16x16x16_f16_tied_w64¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_wmma_f16_16x16x16_f16_tied_w64(_ExtVector<16, _Float16> a, _ExtVector<16, _Float16> b, _ExtVector<8, _Float16> c, _Constant bool opsel)
Target Features: wmma-256b-insts,wavefrontsize64
Same as the non-tied f16/bf16 WMMA variant, but the destination register is tied to the input accumulator: the unselected 16-bit half is preserved from the input rather than being undefined.
On GFX11, the A and B input matrices require replication (full 256-bit operands).
__builtin_amdgcn_wmma_f16_16x16x16_f16_w32¶
Prototype:
_ExtVector<16, _Float16> __builtin_amdgcn_wmma_f16_16x16x16_f16_w32(_ExtVector<16, _Float16> a, _ExtVector<16, _Float16> b, _ExtVector<16, _Float16> c, _Constant bool opsel)
Target Features: wmma-256b-insts,wavefrontsize32
Multiplies a 16x16 matrix A by a 16x16 matrix B and adds the
accumulator C, where all operands and the result are f16 or bf16.
The opsel argument selects which 16-bit half of the accumulator
registers to read from and write to. The content of the other 16-bit
half is undefined.
On GFX11, the A and B input matrices require replication (full 256-bit operands).
__builtin_amdgcn_wmma_f16_16x16x16_f16_w32_gfx12¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_wmma_f16_16x16x16_f16_w32_gfx12(_ExtVector<8, _Float16> a, _ExtVector<8, _Float16> b, _ExtVector<8, _Float16> c)
Target Features: wmma-128b-insts,wavefrontsize32
Multiplies a 16x16 matrix A by a 16x16 matrix B and adds the
accumulator C, where all operands and the result are f16 or bf16.
Returns the result matrix.
On GFX12, the output always uses the low 16-bit half of the accumulator
registers (the opsel bit is implicitly false). In these builtins,
no explicit A/B matrix replication is required in the arguments.
__builtin_amdgcn_wmma_f16_16x16x16_f16_w64¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_wmma_f16_16x16x16_f16_w64(_ExtVector<16, _Float16> a, _ExtVector<16, _Float16> b, _ExtVector<8, _Float16> c, _Constant bool opsel)
Target Features: wmma-256b-insts,wavefrontsize64
Multiplies a 16x16 matrix A by a 16x16 matrix B and adds the
accumulator C, where all operands and the result are f16 or bf16.
The opsel argument selects which 16-bit half of the accumulator
registers to read from and write to. The content of the other 16-bit
half is undefined.
On GFX11, the A and B input matrices require replication (full 256-bit operands).
__builtin_amdgcn_wmma_f16_16x16x16_f16_w64_gfx12¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_wmma_f16_16x16x16_f16_w64_gfx12(_ExtVector<4, _Float16> a, _ExtVector<4, _Float16> b, _ExtVector<4, _Float16> c)
Target Features: wmma-128b-insts,wavefrontsize64
Multiplies a 16x16 matrix A by a 16x16 matrix B and adds the
accumulator C, where all operands and the result are f16 or bf16.
Returns the result matrix.
On GFX12, the output always uses the low 16-bit half of the accumulator
registers (the opsel bit is implicitly false). In these builtins,
no explicit A/B matrix replication is required in the arguments.
__builtin_amdgcn_wmma_f16_16x16x32_f16¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_wmma_f16_16x16x32_f16(_Constant bool a_neg, _ExtVector<16, _Float16> a, _Constant bool b_neg, _ExtVector<16, _Float16> b, _Constant short c_mod, _ExtVector<8, _Float16> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with f16 or bf16 inputs and matching
f16 or bf16 output, using a 32-deep K dimension. Multiplies a 16x32
matrix A by a 32x16 matrix B and adds the 16x16 accumulator C.
a_neg/b_neg: if true, negate the corresponding input matrix.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_f16_16x16x64_bf8_bf8¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_wmma_f16_16x16x64_bf8_bf8(_ExtVector<8, int> a, _ExtVector<8, int> b, _Constant short c_mod, _ExtVector<8, _Float16> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with FP8 or BF8 inputs and f32 or f16
output, using a 64-deep K dimension. Multiplies a 16x64 matrix A by a
64x16 matrix B and adds the 16x16 accumulator C.
A and B contain packed 8-bit floats passed as integer vectors.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_f16_16x16x64_bf8_fp8¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_wmma_f16_16x16x64_bf8_fp8(_ExtVector<8, int> a, _ExtVector<8, int> b, _Constant short c_mod, _ExtVector<8, _Float16> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with FP8 or BF8 inputs and f32 or f16
output, using a 64-deep K dimension. Multiplies a 16x64 matrix A by a
64x16 matrix B and adds the 16x16 accumulator C.
A and B contain packed 8-bit floats passed as integer vectors.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_f16_16x16x64_fp8_bf8¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_wmma_f16_16x16x64_fp8_bf8(_ExtVector<8, int> a, _ExtVector<8, int> b, _Constant short c_mod, _ExtVector<8, _Float16> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with FP8 or BF8 inputs and f32 or f16
output, using a 64-deep K dimension. Multiplies a 16x64 matrix A by a
64x16 matrix B and adds the 16x16 accumulator C.
A and B contain packed 8-bit floats passed as integer vectors.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_f16_16x16x64_fp8_fp8¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_wmma_f16_16x16x64_fp8_fp8(_ExtVector<8, int> a, _ExtVector<8, int> b, _Constant short c_mod, _ExtVector<8, _Float16> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with FP8 or BF8 inputs and f32 or f16
output, using a 64-deep K dimension. Multiplies a 16x64 matrix A by a
64x16 matrix B and adds the 16x16 accumulator C.
A and B contain packed 8-bit floats passed as integer vectors.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_f32_16x16x128_bf8_bf8¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x128_bf8_bf8(_ExtVector<16, int> a, _ExtVector<16, int> b, _Constant short c_mod, _ExtVector<8, float> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with FP8 or BF8 inputs and f32 or f16
output, using a 128-deep K dimension. Multiplies a 16x128 matrix A
by a 128x16 matrix B and adds the 16x16 accumulator C.
A and B contain packed 8-bit floats passed as integer vectors.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_f32_16x16x128_bf8_fp8¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x128_bf8_fp8(_ExtVector<16, int> a, _ExtVector<16, int> b, _Constant short c_mod, _ExtVector<8, float> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with FP8 or BF8 inputs and f32 or f16
output, using a 128-deep K dimension. Multiplies a 16x128 matrix A
by a 128x16 matrix B and adds the 16x16 accumulator C.
A and B contain packed 8-bit floats passed as integer vectors.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_f32_16x16x128_f8f6f4¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x128_f8f6f4(_Constant int matrix_a_fmt, _ExtVector<16, int> a, _Constant int matrix_b_fmt, _ExtVector<16, int> b, _Constant short c_mod, _ExtVector<8, float> c)
Target Features: gfx1250-insts,wavefrontsize32
Mixed-format wave matrix multiply-accumulate with a 128-deep K dimension.
Multiplies a 16x128 matrix A by a 128x16 matrix B and adds the
16x16 f32 accumulator C. The input element types are selected
independently per operand.
matrix_a_fmt/matrix_b_fmt: format selectors for A and B (FP8, BF8, FP6, BF6, or FP4). Must be compile-time constants.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
__builtin_amdgcn_wmma_f32_16x16x128_fp8_bf8¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x128_fp8_bf8(_ExtVector<16, int> a, _ExtVector<16, int> b, _Constant short c_mod, _ExtVector<8, float> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with FP8 or BF8 inputs and f32 or f16
output, using a 128-deep K dimension. Multiplies a 16x128 matrix A
by a 128x16 matrix B and adds the 16x16 accumulator C.
A and B contain packed 8-bit floats passed as integer vectors.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_f32_16x16x128_fp8_fp8¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x128_fp8_fp8(_ExtVector<16, int> a, _ExtVector<16, int> b, _Constant short c_mod, _ExtVector<8, float> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with FP8 or BF8 inputs and f32 or f16
output, using a 128-deep K dimension. Multiplies a 16x128 matrix A
by a 128x16 matrix B and adds the 16x16 accumulator C.
A and B contain packed 8-bit floats passed as integer vectors.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_f32_16x16x16_bf16_w32¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x16_bf16_w32(_ExtVector<16, short> a, _ExtVector<16, short> b, _ExtVector<8, float> c)
Target Features: wmma-256b-insts,wavefrontsize32
Multiplies a 16x16 matrix A by a 16x16 matrix B (both f16 or bf16)
and adds the 16x16 f32 accumulator C using fused multiply-add. Returns
the f32 result matrix.
On GFX11, the A and B input matrices require replication (full 256-bit operands): 2 copies for wave32, 4 copies for wave64. The total VGPR footprint is the same for both wave sizes.
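The replication factors can be checked against the vector lengths in the prototypes with a little arithmetic. The sketch below computes how many A (or B) elements each lane holds, given the matrix shape, the wave size, and the number of copies. The function name `ab_elems_per_lane` is hypothetical, and the model assumes elements are spread evenly across lanes.

```c
#include <assert.h>

/* Elements of an input matrix held per lane.
   rows x k is the matrix shape, wave_size the number of lanes, and copies
   the replication factor (1 when no replication is required).
   ASSUMPTION: elements are distributed evenly across lanes. */
static inline unsigned ab_elems_per_lane(unsigned rows, unsigned k,
                                         unsigned wave_size,
                                         unsigned copies) {
    assert(wave_size != 0 && (rows * k) % wave_size == 0);
    return rows * k / wave_size * copies;
}
```

For a 16x16 input this gives 16 elements per lane for both GFX11 cases (2 copies in wave32, 4 copies in wave64), matching the 16-element A and B vectors in the prototypes above and the statement that the VGPR footprint is the same for both wave sizes.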
__builtin_amdgcn_wmma_f32_16x16x16_bf16_w32_gfx12¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x16_bf16_w32_gfx12(_ExtVector<8, short> a, _ExtVector<8, short> b, _ExtVector<8, float> c)
Target Features: wmma-128b-insts,wavefrontsize32
Multiplies a 16x16 matrix A by a 16x16 matrix B (both f16 or bf16)
and adds the 16x16 f32 accumulator C using fused multiply-add. Returns
the f32 result matrix.
Unlike the GFX11 variants, these GFX12 builtins do not require A/B matrix replication in their input arguments, so A/B vectors are smaller (128-bit for wave32 and 64-bit for wave64).
__builtin_amdgcn_wmma_f32_16x16x16_bf16_w64¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_wmma_f32_16x16x16_bf16_w64(_ExtVector<16, short> a, _ExtVector<16, short> b, _ExtVector<4, float> c)
Target Features: wmma-256b-insts,wavefrontsize64
Multiplies a 16x16 matrix A by a 16x16 matrix B (both f16 or bf16)
and adds the 16x16 f32 accumulator C using fused multiply-add. Returns
the f32 result matrix.
On GFX11, the A and B input matrices require replication (full 256-bit operands): 2 copies for wave32, 4 copies for wave64. The total VGPR footprint is the same for both wave sizes.
__builtin_amdgcn_wmma_f32_16x16x16_bf16_w64_gfx12¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_wmma_f32_16x16x16_bf16_w64_gfx12(_ExtVector<4, short> a, _ExtVector<4, short> b, _ExtVector<4, float> c)
Target Features: wmma-128b-insts,wavefrontsize64
Multiplies a 16x16 matrix A by a 16x16 matrix B (both f16 or bf16)
and adds the 16x16 f32 accumulator C using fused multiply-add. Returns
the f32 result matrix.
Unlike the GFX11 variants, these GFX12 builtins do not require A/B matrix replication in their input arguments, so A/B vectors are smaller (128-bit for wave32 and 64-bit for wave64).
__builtin_amdgcn_wmma_f32_16x16x16_bf8_bf8_w32_gfx12¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x16_bf8_bf8_w32_gfx12(_ExtVector<2, int> a, _ExtVector<2, int> b, _ExtVector<8, float> c)
Target Features: wmma-128b-insts,wavefrontsize32
WMMA with 8-bit floating-point inputs (FP8 or BF8) and f32 result.
Multiplies a 16x16 matrix A by a 16x16 matrix B and adds the f32
accumulator C.
A and B contain packed 8-bit floats (4 values per 32-bit word). Since FP8 and BF8 are not first-class LLVM types, the inputs are passed as integer vectors.
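Packing four 8-bit values into one 32-bit word can be sketched as follows. The byte order within the word (element 0 in the least significant byte) is an assumption, as is the mapping of matrix elements to words and lanes, neither of which is specified here; the helper names `pack4_u8` and `unpack_u8` are hypothetical. The inputs are raw FP8 or BF8 bit patterns, not converted values.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four raw 8-bit patterns into one 32-bit word.
   ASSUMPTION: e0 occupies the least significant byte. */
static inline uint32_t pack4_u8(uint8_t e0, uint8_t e1,
                                uint8_t e2, uint8_t e3) {
    return (uint32_t)e0 | ((uint32_t)e1 << 8) |
           ((uint32_t)e2 << 16) | ((uint32_t)e3 << 24);
}

/* Extract element i (0..3) from a packed word. */
static inline uint8_t unpack_u8(uint32_t word, unsigned i) {
    assert(i < 4);
    return (uint8_t)(word >> (8u * i));
}
```

With this packing, the two-element integer vectors taken by the wave32 variant carry eight 8-bit values per lane for each of A and B.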
__builtin_amdgcn_wmma_f32_16x16x16_bf8_bf8_w64_gfx12¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_wmma_f32_16x16x16_bf8_bf8_w64_gfx12(int a, int b, _ExtVector<4, float> c)
Target Features: wmma-128b-insts,wavefrontsize64
WMMA with 8-bit floating-point inputs (FP8 or BF8) and f32 result.
Multiplies a 16x16 matrix A by a 16x16 matrix B and adds the f32
accumulator C.
A and B contain packed 8-bit floats (4 values per 32-bit word). Since FP8 and BF8 are not first-class LLVM types, the inputs are passed as integers; in this wave64 variant, A and B are each a single int.
__builtin_amdgcn_wmma_f32_16x16x16_bf8_fp8_w32_gfx12¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x16_bf8_fp8_w32_gfx12(_ExtVector<2, int> a, _ExtVector<2, int> b, _ExtVector<8, float> c)
Target Features: wmma-128b-insts,wavefrontsize32
WMMA with 8-bit floating-point inputs (FP8 or BF8) and f32 result.
Multiplies a 16x16 matrix A by a 16x16 matrix B and adds the f32
accumulator C.
A and B contain packed 8-bit floats (4 values per 32-bit word). Since FP8 and BF8 are not first-class LLVM types, the inputs are passed as integer vectors.
__builtin_amdgcn_wmma_f32_16x16x16_bf8_fp8_w64_gfx12¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_wmma_f32_16x16x16_bf8_fp8_w64_gfx12(int a, int b, _ExtVector<4, float> c)
Target Features: wmma-128b-insts,wavefrontsize64
WMMA with 8-bit floating-point inputs (FP8 or BF8) and f32 result.
Multiplies a 16x16 matrix A by a 16x16 matrix B and adds the f32
accumulator C.
A and B contain packed 8-bit floats (4 values per 32-bit word). Since FP8 and BF8 are not first-class LLVM types, the inputs are passed as integers; in this wave64 variant, A and B are each a single int.
__builtin_amdgcn_wmma_f32_16x16x16_f16_w32¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x16_f16_w32(_ExtVector<16, _Float16> a, _ExtVector<16, _Float16> b, _ExtVector<8, float> c)
Target Features: wmma-256b-insts,wavefrontsize32
Multiplies a 16x16 matrix A by a 16x16 matrix B (both f16 or bf16)
and adds the 16x16 f32 accumulator C using fused multiply-add. Returns
the f32 result matrix.
On GFX11, the A and B input matrices require replication (full 256-bit operands): 2 copies for wave32, 4 copies for wave64. The total VGPR footprint is the same for both wave sizes.
__builtin_amdgcn_wmma_f32_16x16x16_f16_w32_gfx12¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x16_f16_w32_gfx12(_ExtVector<8, _Float16> a, _ExtVector<8, _Float16> b, _ExtVector<8, float> c)
Target Features: wmma-128b-insts,wavefrontsize32
Multiplies a 16x16 matrix A by a 16x16 matrix B (both f16 or bf16)
and adds the 16x16 f32 accumulator C using fused multiply-add. Returns
the f32 result matrix.
Unlike the GFX11 variants, these GFX12 builtins do not require A/B matrix replication in their input arguments, so A/B vectors are smaller (128-bit for wave32 and 64-bit for wave64).
__builtin_amdgcn_wmma_f32_16x16x16_f16_w64¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_wmma_f32_16x16x16_f16_w64(_ExtVector<16, _Float16> a, _ExtVector<16, _Float16> b, _ExtVector<4, float> c)
Target Features: wmma-256b-insts,wavefrontsize64
Multiplies a 16x16 matrix A by a 16x16 matrix B (both f16)
and adds the 16x16 f32 accumulator C using fused multiply-add. Returns
the f32 result matrix.
On GFX11, the A and B input matrices require replication (full 256-bit operands): 2 copies for wave32, 4 copies for wave64. The total VGPR footprint is the same for both wave sizes.
__builtin_amdgcn_wmma_f32_16x16x16_f16_w64_gfx12¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_wmma_f32_16x16x16_f16_w64_gfx12(_ExtVector<4, _Float16> a, _ExtVector<4, _Float16> b, _ExtVector<4, float> c)
Target Features: wmma-128b-insts,wavefrontsize64
Multiplies a 16x16 matrix A by a 16x16 matrix B (both f16)
and adds the 16x16 f32 accumulator C using fused multiply-add. Returns
the f32 result matrix.
Unlike the GFX11 variants, these GFX12 builtins do not require A/B matrix replication in their input arguments, so A/B vectors are smaller (128-bit for wave32 and 64-bit for wave64).
__builtin_amdgcn_wmma_f32_16x16x16_fp8_bf8_w32_gfx12¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x16_fp8_bf8_w32_gfx12(_ExtVector<2, int> a, _ExtVector<2, int> b, _ExtVector<8, float> c)
Target Features: wmma-128b-insts,wavefrontsize32
WMMA with 8-bit floating-point inputs (FP8 or BF8) and f32 result.
Multiplies a 16x16 matrix A by a 16x16 matrix B and adds the f32
accumulator C.
A and B contain packed 8-bit floats (4 values per 32-bit word). Since FP8 and BF8 are not first-class LLVM types, the inputs are passed as integer vectors.
__builtin_amdgcn_wmma_f32_16x16x16_fp8_bf8_w64_gfx12¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_wmma_f32_16x16x16_fp8_bf8_w64_gfx12(int a, int b, _ExtVector<4, float> c)
Target Features: wmma-128b-insts,wavefrontsize64
WMMA with 8-bit floating-point inputs (FP8 or BF8) and f32 result.
Multiplies a 16x16 matrix A by a 16x16 matrix B and adds the f32
accumulator C.
A and B contain packed 8-bit floats (4 values per 32-bit word). Since FP8 and BF8 are not first-class LLVM types, the inputs are passed as integers; in this wave64 variant each operand is a single 32-bit int.
__builtin_amdgcn_wmma_f32_16x16x16_fp8_fp8_w32_gfx12¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x16_fp8_fp8_w32_gfx12(_ExtVector<2, int> a, _ExtVector<2, int> b, _ExtVector<8, float> c)
Target Features: wmma-128b-insts,wavefrontsize32
WMMA with 8-bit floating-point inputs (FP8 or BF8) and f32 result.
Multiplies a 16x16 matrix A by a 16x16 matrix B and adds the f32
accumulator C.
A and B contain packed 8-bit floats (4 values per 32-bit word). Since FP8 and BF8 are not first-class LLVM types, the inputs are passed as integer vectors.
__builtin_amdgcn_wmma_f32_16x16x16_fp8_fp8_w64_gfx12¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_wmma_f32_16x16x16_fp8_fp8_w64_gfx12(int a, int b, _ExtVector<4, float> c)
Target Features: wmma-128b-insts,wavefrontsize64
WMMA with 8-bit floating-point inputs (FP8 or BF8) and f32 result.
Multiplies a 16x16 matrix A by a 16x16 matrix B and adds the f32
accumulator C.
A and B contain packed 8-bit floats (4 values per 32-bit word). Since FP8 and BF8 are not first-class LLVM types, the inputs are passed as integers; in this wave64 variant each operand is a single 32-bit int.
__builtin_amdgcn_wmma_f32_16x16x32_bf16¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x32_bf16(_Constant bool a_neg, _ExtVector<16, __bf16> a, _Constant bool b_neg, _ExtVector<16, __bf16> b, _Constant short c_mod, _ExtVector<8, float> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with bf16 inputs and f32 output,
using a 32-deep K dimension. Multiplies a 16x32 matrix A by a 32x16
matrix B and adds the 16x16 f32 accumulator C.
a_neg/b_neg: if true, negate the corresponding input matrix.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
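The c_mod encoding can be modeled per element on the host. This is a minimal sketch of the four documented values only; the function name `apply_c_mod` is hypothetical, and the model does not attempt to reproduce hardware handling of signed zero or NaN.

```c
#include <assert.h>

/* Apply the accumulator modifier to one element of C.
 * Encoding from the text: 0 = none, 1 = neg, 2 = abs, 3 = neg(abs). */
static float apply_c_mod(float c, int c_mod) {
    assert(c_mod >= 0 && c_mod <= 3);
    float v = c;
    if (c_mod & 2)
        v = (v < 0.0f) ? -v : v; /* abs */
    if (c_mod & 1)
        v = -v;                  /* neg, applied after abs */
    return v;
}
```

Treating the value as two bits (abs, then neg) reproduces all four documented cases, including 3 = neg(abs).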
__builtin_amdgcn_wmma_f32_16x16x32_f16¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x32_f16(_Constant bool a_neg, _ExtVector<16, _Float16> a, _Constant bool b_neg, _ExtVector<16, _Float16> b, _Constant short c_mod, _ExtVector<8, float> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with f16 inputs and f32 output,
using a 32-deep K dimension. Multiplies a 16x32 matrix A by a 32x16
matrix B and adds the 16x16 f32 accumulator C.
a_neg/b_neg: if true, negate the corresponding input matrix.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_f32_16x16x4_f32¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x4_f32(_Constant bool a_neg, _ExtVector<2, float> a, _Constant bool b_neg, _ExtVector<2, float> b, _Constant short c_mod, _ExtVector<8, float> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with f32 inputs and f32 output, using a
4-deep K dimension. Multiplies a 16x4 matrix A by a 4x16 matrix B
and adds the 16x16 f32 accumulator C.
a_neg/b_neg: if true, negate the corresponding input matrix.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
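As a point of comparison when testing, the 16x16x4 multiply-accumulate can be written as a plain host-side loop. The sketch below assumes row-major arrays and says nothing about how the hardware distributes elements across lanes or registers; the name `ref_mma_16x16x4` is hypothetical, modifiers are not modeled, and rounding may differ from the hardware result.

```c
#include <assert.h>
#include <stddef.h>

enum { WMMA_M = 16, WMMA_N = 16, WMMA_K = 4 };

/* Host-side reference: d = a * b + c for row-major arrays.
 * a is M x K, b is K x N, c and d are M x N. */
static void ref_mma_16x16x4(const float *a, const float *b,
                            const float *c, float *d) {
    assert(a != NULL && b != NULL && c != NULL && d != NULL);
    for (int i = 0; i < WMMA_M; ++i) {
        for (int j = 0; j < WMMA_N; ++j) {
            float acc = c[i * WMMA_N + j];
            for (int k = 0; k < WMMA_K; ++k)
                acc += a[i * WMMA_K + k] * b[k * WMMA_N + j];
            d[i * WMMA_N + j] = acc;
        }
    }
}
```

Changing the three constants gives the corresponding reference for the other shapes, for example K = 32 for the 16x16x32 variants.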
__builtin_amdgcn_wmma_f32_16x16x64_bf8_bf8¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x64_bf8_bf8(_ExtVector<8, int> a, _ExtVector<8, int> b, _Constant short c_mod, _ExtVector<8, float> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with FP8 or BF8 inputs and f32
output, using a 64-deep K dimension. Multiplies a 16x64 matrix A by a
64x16 matrix B and adds the 16x16 f32 accumulator C.
A and B contain packed 8-bit floats passed as integer vectors.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_f32_16x16x64_bf8_fp8¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x64_bf8_fp8(_ExtVector<8, int> a, _ExtVector<8, int> b, _Constant short c_mod, _ExtVector<8, float> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with FP8 or BF8 inputs and f32
output, using a 64-deep K dimension. Multiplies a 16x64 matrix A by a
64x16 matrix B and adds the 16x16 f32 accumulator C.
A and B contain packed 8-bit floats passed as integer vectors.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_f32_16x16x64_fp8_bf8¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x64_fp8_bf8(_ExtVector<8, int> a, _ExtVector<8, int> b, _Constant short c_mod, _ExtVector<8, float> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with FP8 or BF8 inputs and f32
output, using a 64-deep K dimension. Multiplies a 16x64 matrix A by a
64x16 matrix B and adds the 16x16 f32 accumulator C.
A and B contain packed 8-bit floats passed as integer vectors.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_f32_16x16x64_fp8_fp8¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_f32_16x16x64_fp8_fp8(_ExtVector<8, int> a, _ExtVector<8, int> b, _Constant short c_mod, _ExtVector<8, float> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Wave matrix multiply-accumulate with FP8 or BF8 inputs and f32
output, using a 64-deep K dimension. Multiplies a 16x64 matrix A by a
64x16 matrix B and adds the 16x16 f32 accumulator C.
A and B contain packed 8-bit floats passed as integer vectors.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_f32_32x16x128_f4¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_wmma_f32_32x16x128_f4(_ExtVector<16, int> a, _ExtVector<8, int> b, _Constant short c_mod, _ExtVector<16, float> c)
Target Features: gfx1250-insts,wavefrontsize32
FP4 wave matrix multiply-accumulate with a 128-deep K dimension and a
32x16 output. Multiplies a 32x128 matrix A by a 128x16 matrix B
and adds the 32x16 f32 accumulator C.
c_mod: accumulator modifier (0 = none, 1 = neg, 2 = abs, 3 = neg(abs)).
__builtin_amdgcn_wmma_i32_16x16x16_iu4_w32¶
Prototype:
_ExtVector<8, int> __builtin_amdgcn_wmma_i32_16x16x16_iu4_w32(_Constant bool a_sign, _ExtVector<2, int> a, _Constant bool b_sign, _ExtVector<2, int> b, _ExtVector<8, int> c, _Constant bool clamp)
Target Features: wmma-256b-insts,wavefrontsize32
Integer wave matrix multiply-accumulate on packed 4-bit integer
inputs. Multiplies a 16x16 matrix A by a 16x16 matrix B and adds
the i32 accumulator C.
a_sign/b_sign: if true, the corresponding operand is treated as signed; if false, as unsigned.
clamp: if true, the result saturates instead of wrapping on overflow.
On GFX11, the A and B input matrices require replication: 2 copies for wave32 and 4 copies for wave64. In the builtin API, IU8 A/B operands are 128-bit and IU4 A/B operands are 64-bit.
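The effect of a_sign/b_sign on a packed 4-bit element can be shown with a small host-side helper. This is a sketch under stated assumptions: the name `unpack_iu4` is hypothetical, and placing element 0 in the least-significant nibble is an assumption about packing order, not something this document specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Read 4-bit element idx (0..7) from a packed 32-bit word.
 * Assumption: element 0 is the least-significant nibble. */
static int32_t unpack_iu4(uint32_t word, unsigned idx, bool is_signed) {
    assert(idx < 8);
    uint32_t nibble = (word >> (4u * idx)) & 0xFu;
    if (is_signed && (nibble & 0x8u))
        return (int32_t)nibble - 16; /* sign-extend: 0x8..0xF -> -8..-1 */
    return (int32_t)nibble;
}
```

The same bit pattern 0xF therefore reads as -1 when the operand is treated as signed and as 15 when it is treated as unsigned.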
__builtin_amdgcn_wmma_i32_16x16x16_iu4_w32_gfx12¶
Prototype:
_ExtVector<8, int> __builtin_amdgcn_wmma_i32_16x16x16_iu4_w32_gfx12(_Constant bool a_sign, int a, _Constant bool b_sign, int b, _ExtVector<8, int> c, _Constant bool clamp)
Target Features: wmma-128b-insts,wavefrontsize32
Integer wave matrix multiply-accumulate on packed 4-bit integer
inputs. Multiplies a 16x16 matrix A by a 16x16 matrix B and adds
the i32 accumulator C.
a_sign/b_sign: if true, the corresponding operand is treated as signed; if false, as unsigned.
clamp: if true, the result saturates instead of wrapping on overflow.
In these builtins, no explicit A/B matrix replication is required in the arguments.
__builtin_amdgcn_wmma_i32_16x16x16_iu4_w64¶
Prototype:
_ExtVector<4, int> __builtin_amdgcn_wmma_i32_16x16x16_iu4_w64(_Constant bool a_sign, _ExtVector<2, int> a, _Constant bool b_sign, _ExtVector<2, int> b, _ExtVector<4, int> c, _Constant bool clamp)
Target Features: wmma-256b-insts,wavefrontsize64
Integer wave matrix multiply-accumulate on packed 4-bit integer
inputs. Multiplies a 16x16 matrix A by a 16x16 matrix B and adds
the i32 accumulator C.
a_sign/b_sign: if true, the corresponding operand is treated as signed; if false, as unsigned.
clamp: if true, the result saturates instead of wrapping on overflow.
On GFX11, the A and B input matrices require replication: 2 copies for wave32 and 4 copies for wave64. In the builtin API, IU8 A/B operands are 128-bit and IU4 A/B operands are 64-bit.
__builtin_amdgcn_wmma_i32_16x16x16_iu4_w64_gfx12¶
Prototype:
_ExtVector<4, int> __builtin_amdgcn_wmma_i32_16x16x16_iu4_w64_gfx12(_Constant bool a_sign, int a, _Constant bool b_sign, int b, _ExtVector<4, int> c, _Constant bool clamp)
Target Features: wmma-128b-insts,wavefrontsize64
Integer wave matrix multiply-accumulate on packed 4-bit integer
inputs. Multiplies a 16x16 matrix A by a 16x16 matrix B and adds
the i32 accumulator C.
a_sign/b_sign: if true, the corresponding operand is treated as signed; if false, as unsigned.
clamp: if true, the result saturates instead of wrapping on overflow.
In these builtins, no explicit A/B matrix replication is required in the arguments.
__builtin_amdgcn_wmma_i32_16x16x16_iu8_w32¶
Prototype:
_ExtVector<8, int> __builtin_amdgcn_wmma_i32_16x16x16_iu8_w32(_Constant bool a_sign, _ExtVector<4, int> a, _Constant bool b_sign, _ExtVector<4, int> b, _ExtVector<8, int> c, _Constant bool clamp)
Target Features: wmma-256b-insts,wavefrontsize32
Integer wave matrix multiply-accumulate on packed 8-bit integer
inputs. Multiplies a 16x16 matrix A by a 16x16 matrix B and adds
the i32 accumulator C.
a_sign/b_sign: if true, the corresponding operand is treated as signed; if false, as unsigned.
clamp: if true, the result saturates instead of wrapping on overflow.
On GFX11, the A and B input matrices require replication: 2 copies for wave32 and 4 copies for wave64. In the builtin API, IU8 A/B operands are 128-bit and IU4 A/B operands are 64-bit.
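The difference between saturating and wrapping accumulation can be sketched on the host as follows. The function name `accumulate_i32` is hypothetical, and the model applies saturation to the final sum only, which is an assumption; the hardware may clamp at a different point in the computation. The dot-product term is assumed to fit comfortably in 64 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Add a dot-product term to a 32-bit accumulator.
 * clamp = true: saturate to the int32_t range.
 * clamp = false: wrap modulo 2^32. */
static int32_t accumulate_i32(int32_t c, int64_t dot, bool clamp) {
    int64_t wide = (int64_t)c + dot;
    if (clamp) {
        if (wide > INT32_MAX) return INT32_MAX;
        if (wide < INT32_MIN) return INT32_MIN;
        return (int32_t)wide;
    }
    /* Wrap without relying on implementation-defined narrowing. */
    uint32_t w = (uint32_t)wide;
    return (w <= (uint32_t)INT32_MAX)
               ? (int32_t)w
               : (int32_t)(w - 0x80000000u) + INT32_MIN;
}
```

With clamp set, INT32_MAX plus 1 stays at INT32_MAX; without it, the same sum wraps around to INT32_MIN.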
__builtin_amdgcn_wmma_i32_16x16x16_iu8_w32_gfx12¶
Prototype:
_ExtVector<8, int> __builtin_amdgcn_wmma_i32_16x16x16_iu8_w32_gfx12(_Constant bool a_sign, _ExtVector<2, int> a, _Constant bool b_sign, _ExtVector<2, int> b, _ExtVector<8, int> c, _Constant bool clamp)
Target Features: wmma-128b-insts,wavefrontsize32
Integer wave matrix multiply-accumulate on packed 8-bit integer
inputs. Multiplies a 16x16 matrix A by a 16x16 matrix B and adds
the i32 accumulator C.
a_sign/b_sign: if true, the corresponding operand is treated as signed; if false, as unsigned.
clamp: if true, the result saturates instead of wrapping on overflow.
In these builtins, no explicit A/B matrix replication is required in the arguments.
__builtin_amdgcn_wmma_i32_16x16x16_iu8_w64¶
Prototype:
_ExtVector<4, int> __builtin_amdgcn_wmma_i32_16x16x16_iu8_w64(_Constant bool a_sign, _ExtVector<4, int> a, _Constant bool b_sign, _ExtVector<4, int> b, _ExtVector<4, int> c, _Constant bool clamp)
Target Features: wmma-256b-insts,wavefrontsize64
Integer wave matrix multiply-accumulate on packed 8-bit integer
inputs. Multiplies a 16x16 matrix A by a 16x16 matrix B and adds
the i32 accumulator C.
a_sign/b_sign: if true, the corresponding operand is treated as signed; if false, as unsigned.
clamp: if true, the result saturates instead of wrapping on overflow.
On GFX11, the A and B input matrices require replication: 2 copies for wave32 and 4 copies for wave64. In the builtin API, IU8 A/B operands are 128-bit and IU4 A/B operands are 64-bit.
__builtin_amdgcn_wmma_i32_16x16x16_iu8_w64_gfx12¶
Prototype:
_ExtVector<4, int> __builtin_amdgcn_wmma_i32_16x16x16_iu8_w64_gfx12(_Constant bool a_sign, int a, _Constant bool b_sign, int b, _ExtVector<4, int> c, _Constant bool clamp)
Target Features: wmma-128b-insts,wavefrontsize64
Integer wave matrix multiply-accumulate on packed 8-bit integer
inputs. Multiplies a 16x16 matrix A by a 16x16 matrix B and adds
the i32 accumulator C.
a_sign/b_sign: if true, the corresponding operand is treated as signed; if false, as unsigned.
clamp: if true, the result saturates instead of wrapping on overflow.
In these builtins, no explicit A/B matrix replication is required in the arguments.
__builtin_amdgcn_wmma_i32_16x16x32_iu4_w32_gfx12¶
Prototype:
_ExtVector<8, int> __builtin_amdgcn_wmma_i32_16x16x32_iu4_w32_gfx12(_Constant bool a_sign, _ExtVector<2, int> a, _Constant bool b_sign, _ExtVector<2, int> b, _ExtVector<8, int> c, _Constant bool clamp)
Target Features: wmma-128b-insts,wavefrontsize32
Integer wave matrix multiply-accumulate on packed 4-bit integer inputs with
a 32-deep K dimension (16x16x32). Multiplies a 16x32 matrix A by a
32x16 matrix B and adds the i32 accumulator C.
a_sign/b_sign: if true, the corresponding operand is treated as signed; if false, as unsigned.
clamp: if true, the result saturates instead of wrapping on overflow.
__builtin_amdgcn_wmma_i32_16x16x32_iu4_w64_gfx12¶
Prototype:
_ExtVector<4, int> __builtin_amdgcn_wmma_i32_16x16x32_iu4_w64_gfx12(_Constant bool a_sign, int a, _Constant bool b_sign, int b, _ExtVector<4, int> c, _Constant bool clamp)
Target Features: wmma-128b-insts,wavefrontsize64
Integer wave matrix multiply-accumulate on packed 4-bit integer inputs with
a 32-deep K dimension (16x16x32). Multiplies a 16x32 matrix A by a
32x16 matrix B and adds the i32 accumulator C.
a_sign/b_sign: if true, the corresponding operand is treated as signed; if false, as unsigned.
clamp: if true, the result saturates instead of wrapping on overflow.
__builtin_amdgcn_wmma_i32_16x16x64_iu8¶
Prototype:
_ExtVector<8, int> __builtin_amdgcn_wmma_i32_16x16x64_iu8(_Constant bool a_sign, _ExtVector<8, int> a, _Constant bool b_sign, _ExtVector<8, int> b, _ExtVector<8, int> c, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse, ...)
Target Features: gfx1250-insts,wavefrontsize32
Integer wave matrix multiply-accumulate on packed 8-bit integer inputs
with a 64-deep K dimension. Multiplies a 16x64 matrix A by a 64x16
matrix B and adds the i32 accumulator C.
a_sign/b_sign: if true, the corresponding operand is treated as signed; if false, as unsigned.
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
clamp (optional): an optional 8th argument. When present, it must be a compile-time constant integer expression convertible to bool. If true, the result saturates instead of wrapping on overflow. If omitted, defaults to false.
__builtin_amdgcn_wmma_scale16_f32_16x16x128_f8f6f4¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_scale16_f32_16x16x128_f8f6f4(_Constant int matrix_a_fmt, _ExtVector<16, int> a, _Constant int matrix_b_fmt, _ExtVector<16, int> b, _Constant short c_mod, _ExtVector<8, float> c, _Constant int matrix_a_scale, _Constant int matrix_a_scale_fmt, long int matrix_a_scale_exp, _Constant int matrix_b_scale, _Constant int matrix_b_scale_fmt, long int matrix_b_scale_exp, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Scaled wave matrix multiply-accumulate with 64-bit (16-bit per element) scale exponents, as opposed to the 32-bit exponents used by the non-16 scale variant.
For ..._f8f6f4 variants, matrix_a_fmt/matrix_b_fmt are format selectors for A and B. They must be compile-time constants.
c_mod: accumulator modifier.
matrix_a_scale/matrix_b_scale: scale factor selectors. Must be compile-time constants.
matrix_a_scale_fmt/matrix_b_scale_fmt: scale format selectors. Must be compile-time constants.
matrix_a_scale_exp/matrix_b_scale_exp: 64-bit scale exponents.
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_scale16_f32_32x16x128_f4¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_wmma_scale16_f32_32x16x128_f4(_ExtVector<16, int> a, _ExtVector<8, int> b, _Constant short c_mod, _ExtVector<16, float> c, _Constant int matrix_a_scale, _Constant int matrix_a_scale_fmt, long int matrix_a_scale_exp, _Constant int matrix_b_scale, _Constant int matrix_b_scale_fmt, long int matrix_b_scale_exp, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Scaled wave matrix multiply-accumulate with 64-bit (16-bit per element) scale exponents, as opposed to the 32-bit exponents used by the non-16 scale variant.
For ..._f8f6f4 variants, matrix_a_fmt/matrix_b_fmt are format selectors for A and B. They must be compile-time constants.
c_mod: accumulator modifier.
matrix_a_scale/matrix_b_scale: scale factor selectors. Must be compile-time constants.
matrix_a_scale_fmt/matrix_b_scale_fmt: scale format selectors. Must be compile-time constants.
matrix_a_scale_exp/matrix_b_scale_exp: 64-bit scale exponents.
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_scale_f32_16x16x128_f8f6f4¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_wmma_scale_f32_16x16x128_f8f6f4(_Constant int matrix_a_fmt, _ExtVector<16, int> a, _Constant int matrix_b_fmt, _ExtVector<16, int> b, _Constant short c_mod, _ExtVector<8, float> c, _Constant int matrix_a_scale, _Constant int matrix_a_scale_fmt, int matrix_a_scale_exp, _Constant int matrix_b_scale, _Constant int matrix_b_scale_fmt, int matrix_b_scale_exp, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Scaled wave matrix multiply-accumulate. Extends the f8f6f4 or f4
WMMA with per-operand scale factors applied during the computation.
For ..._f8f6f4 variants, matrix_a_fmt/matrix_b_fmt are format selectors for A and B. They must be compile-time constants.
c_mod: accumulator modifier.
matrix_a_scale/matrix_b_scale: scale factor selectors. Must be compile-time constants.
matrix_a_scale_fmt/matrix_b_scale_fmt: scale format selectors. Must be compile-time constants.
matrix_a_scale_exp/matrix_b_scale_exp: 32-bit scale exponents.
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
__builtin_amdgcn_wmma_scale_f32_32x16x128_f4¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_wmma_scale_f32_32x16x128_f4(_ExtVector<16, int> a, _ExtVector<8, int> b, _Constant short c_mod, _ExtVector<16, float> c, _Constant int matrix_a_scale, _Constant int matrix_a_scale_fmt, int matrix_a_scale_exp, _Constant int matrix_b_scale, _Constant int matrix_b_scale_fmt, int matrix_b_scale_exp, _Constant bool matrix_a_reuse, _Constant bool matrix_b_reuse)
Target Features: gfx1250-insts,wavefrontsize32
Scaled wave matrix multiply-accumulate. Extends the f8f6f4 or f4
WMMA with per-operand scale factors applied during the computation.
For ..._f8f6f4 variants, matrix_a_fmt/matrix_b_fmt are format selectors for A and B. They must be compile-time constants.
c_mod: accumulator modifier.
matrix_a_scale/matrix_b_scale: scale factor selectors. Must be compile-time constants.
matrix_a_scale_fmt/matrix_b_scale_fmt: scale format selectors. Must be compile-time constants.
matrix_a_scale_exp/matrix_b_scale_exp: 32-bit scale exponents.
matrix_a_reuse/matrix_b_reuse: hints to the hardware that matrix A or B data can be reused from a previous WMMA instruction.
Undocumented¶
This section lists builtins which are recognized by Clang, but which are currently missing documentation.
__builtin_amdgcn_add_max_i32¶
Prototype:
int __builtin_amdgcn_add_max_i32(int, int, int, _Constant bool)
Target Features: add-min-max-insts
No documentation.
__builtin_amdgcn_add_max_u32¶
Prototype:
unsigned int __builtin_amdgcn_add_max_u32(unsigned int, unsigned int, unsigned int, _Constant bool)
Target Features: add-min-max-insts
No documentation.
__builtin_amdgcn_add_min_i32¶
Prototype:
int __builtin_amdgcn_add_min_i32(int, int, int, _Constant bool)
Target Features: add-min-max-insts
No documentation.
__builtin_amdgcn_add_min_u32¶
Prototype:
unsigned int __builtin_amdgcn_add_min_u32(unsigned int, unsigned int, unsigned int, _Constant bool)
Target Features: add-min-max-insts
No documentation.
__builtin_amdgcn_alignbit¶
Prototype:
unsigned int __builtin_amdgcn_alignbit(unsigned int, unsigned int, unsigned int)
No documentation.
__builtin_amdgcn_alignbyte¶
Prototype:
unsigned int __builtin_amdgcn_alignbyte(unsigned int, unsigned int, unsigned int)
No documentation.
__builtin_amdgcn_ashr_pk_i8_i32¶
Prototype:
unsigned short __builtin_amdgcn_ashr_pk_i8_i32(unsigned int, unsigned int, unsigned int)
Target Features: ashr-pk-insts
No documentation.
__builtin_amdgcn_ashr_pk_u8_i32¶
Prototype:
unsigned short __builtin_amdgcn_ashr_pk_u8_i32(unsigned int, unsigned int, unsigned int)
Target Features: ashr-pk-insts
No documentation.
__builtin_amdgcn_asyncmark¶
Prototype:
void __builtin_amdgcn_asyncmark()
Target Features: vmem-to-lds-load-insts
No documentation.
__builtin_amdgcn_atomic_dec32¶
Prototype:
uint32_t __builtin_amdgcn_atomic_dec32(uint32_t volatile *, uint32_t, unsigned int, char const *)
No documentation.
__builtin_amdgcn_atomic_dec64¶
Prototype:
uint64_t __builtin_amdgcn_atomic_dec64(uint64_t volatile *, uint64_t, unsigned int, char const *)
No documentation.
__builtin_amdgcn_atomic_inc32¶
Prototype:
uint32_t __builtin_amdgcn_atomic_inc32(uint32_t volatile *, uint32_t, unsigned int, char const *)
No documentation.
__builtin_amdgcn_atomic_inc64¶
Prototype:
uint64_t __builtin_amdgcn_atomic_inc64(uint64_t volatile *, uint64_t, unsigned int, char const *)
No documentation.
__builtin_amdgcn_ballot_w32¶
Prototype:
uint32_t __builtin_amdgcn_ballot_w32(bool)
Target Features: wavefrontsize32
No documentation.
__builtin_amdgcn_ballot_w64¶
Prototype:
uint64_t __builtin_amdgcn_ballot_w64(bool)
No documentation.
__builtin_amdgcn_bitop3_b16¶
Prototype:
short __builtin_amdgcn_bitop3_b16(short, short, short, _Constant unsigned int)
Target Features: bitop3-insts
No documentation.
__builtin_amdgcn_bitop3_b32¶
Prototype:
int __builtin_amdgcn_bitop3_b32(int, int, int, _Constant unsigned int)
Target Features: bitop3-insts
No documentation.
__builtin_amdgcn_buffer_wbinvl1¶
Prototype:
void __builtin_amdgcn_buffer_wbinvl1()
No documentation.
__builtin_amdgcn_buffer_wbinvl1_vol¶
Prototype:
void __builtin_amdgcn_buffer_wbinvl1_vol()
Target Features: ci-insts
No documentation.
__builtin_amdgcn_class¶
Prototype:
bool __builtin_amdgcn_class(double, int)
No documentation.
__builtin_amdgcn_classf¶
Prototype:
bool __builtin_amdgcn_classf(float, int)
No documentation.
__builtin_amdgcn_classh¶
Prototype:
bool __builtin_amdgcn_classh(_Float16, int)
Target Features: 16-bit-insts
No documentation.
__builtin_amdgcn_cluster_load_async_to_lds_b128¶
Prototype:
void __builtin_amdgcn_cluster_load_async_to_lds_b128(_ExtVector<4, int> address_space<1> *, _ExtVector<4, int> address_space<3> *, _Constant int, _Constant int, int)
Target Features: mcast-load-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_cluster_load_async_to_lds_b32¶
Prototype:
void __builtin_amdgcn_cluster_load_async_to_lds_b32(int address_space<1> *, int address_space<3> *, _Constant int, _Constant int, int)
Target Features: mcast-load-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_cluster_load_async_to_lds_b64¶
Prototype:
void __builtin_amdgcn_cluster_load_async_to_lds_b64(_ExtVector<2, int> address_space<1> *, _ExtVector<2, int> address_space<3> *, _Constant int, _Constant int, int)
Target Features: mcast-load-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_cluster_load_async_to_lds_b8¶
Prototype:
void __builtin_amdgcn_cluster_load_async_to_lds_b8(char address_space<1> *, char address_space<3> *, _Constant int, _Constant int, int)
Target Features: mcast-load-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_cluster_load_b128¶
Prototype:
_ExtVector<4, int> __builtin_amdgcn_cluster_load_b128(_ExtVector<4, int> address_space<1> *, _Constant int, int)
Target Features: mcast-load-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_cluster_load_b32¶
Prototype:
int __builtin_amdgcn_cluster_load_b32(int address_space<1> *, _Constant int, int)
Target Features: mcast-load-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_cluster_load_b64¶
Prototype:
_ExtVector<2, int> __builtin_amdgcn_cluster_load_b64(_ExtVector<2, int> address_space<1> *, _Constant int, int)
Target Features: mcast-load-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_cooperative_atomic_load_16x8B¶
Prototype:
_ExtVector<2, int> __builtin_amdgcn_cooperative_atomic_load_16x8B(_ExtVector<2, int> *, _Constant int, char const *)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_cooperative_atomic_load_32x4B¶
Prototype:
int __builtin_amdgcn_cooperative_atomic_load_32x4B(int *, _Constant int, char const *)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_cooperative_atomic_load_8x16B¶
Prototype:
_ExtVector<4, int> __builtin_amdgcn_cooperative_atomic_load_8x16B(_ExtVector<4, int> *, _Constant int, char const *)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_cooperative_atomic_store_16x8B¶
Prototype:
void __builtin_amdgcn_cooperative_atomic_store_16x8B(_ExtVector<2, int> *, _ExtVector<2, int>, _Constant int, char const *)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_cooperative_atomic_store_32x4B¶
Prototype:
void __builtin_amdgcn_cooperative_atomic_store_32x4B(int *, int, _Constant int, char const *)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_cooperative_atomic_store_8x16B¶
Prototype:
void __builtin_amdgcn_cooperative_atomic_store_8x16B(_ExtVector<4, int> *, _ExtVector<4, int>, _Constant int, char const *)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_cos_bf16¶
Prototype:
__bf16 __builtin_amdgcn_cos_bf16(__bf16)
Target Features: bf16-trans-insts
No documentation.
__builtin_amdgcn_cosf¶
Prototype:
float __builtin_amdgcn_cosf(float)
No documentation.
__builtin_amdgcn_cosh¶
Prototype:
_Float16 __builtin_amdgcn_cosh(_Float16)
Target Features: 16-bit-insts
No documentation.
__builtin_amdgcn_cubeid¶
Prototype:
float __builtin_amdgcn_cubeid(float, float, float)
Target Features: cube-insts
No documentation.
__builtin_amdgcn_cubema¶
Prototype:
float __builtin_amdgcn_cubema(float, float, float)
Target Features: cube-insts
No documentation.
__builtin_amdgcn_cubesc¶
Prototype:
float __builtin_amdgcn_cubesc(float, float, float)
Target Features: cube-insts
No documentation.
__builtin_amdgcn_cubetc¶
Prototype:
float __builtin_amdgcn_cubetc(float, float, float)
Target Features: cube-insts
No documentation.
__builtin_amdgcn_cvt_f16_bf8¶
Prototype:
_Float16 __builtin_amdgcn_cvt_f16_bf8(int, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_f16_fp8¶
Prototype:
_Float16 __builtin_amdgcn_cvt_f16_fp8(int, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_f32_bf8¶
Prototype:
float __builtin_amdgcn_cvt_f32_bf8(int, _Constant int)
Target Features: fp8-conversion-insts
No documentation.
__builtin_amdgcn_cvt_f32_fp8¶
Prototype:
float __builtin_amdgcn_cvt_f32_fp8(int, _Constant int)
Target Features: fp8-conversion-insts
No documentation.
__builtin_amdgcn_cvt_f32_fp8_e5m3¶
Prototype:
float __builtin_amdgcn_cvt_f32_fp8_e5m3(int, _Constant int)
Target Features: fp8e5m3-insts
No documentation.
__builtin_amdgcn_cvt_off_f32_i4¶
Prototype:
float __builtin_amdgcn_cvt_off_f32_i4(int)
No documentation.
__builtin_amdgcn_cvt_pk_bf8_f16¶
Prototype:
short __builtin_amdgcn_cvt_pk_bf8_f16(_ExtVector<2, _Float16>)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_pk_bf8_f32¶
Prototype:
int __builtin_amdgcn_cvt_pk_bf8_f32(float, float, int, _Constant bool)
Target Features: fp8-conversion-insts
No documentation.
__builtin_amdgcn_cvt_pk_f16_bf8¶
Prototype:
_ExtVector<2, _Float16> __builtin_amdgcn_cvt_pk_f16_bf8(short)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_pk_f16_fp8¶
Prototype:
_ExtVector<2, _Float16> __builtin_amdgcn_cvt_pk_f16_fp8(short)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_pk_f32_bf8¶
Prototype:
_ExtVector<2, float> __builtin_amdgcn_cvt_pk_f32_bf8(int, _Constant bool)
Target Features: fp8-conversion-insts
No documentation.
__builtin_amdgcn_cvt_pk_f32_fp8¶
Prototype:
_ExtVector<2, float> __builtin_amdgcn_cvt_pk_f32_fp8(int, _Constant bool)
Target Features: fp8-conversion-insts
No documentation.
__builtin_amdgcn_cvt_pk_fp8_f16¶
Prototype:
short __builtin_amdgcn_cvt_pk_fp8_f16(_ExtVector<2, _Float16>)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_pk_fp8_f32¶
Prototype:
int __builtin_amdgcn_cvt_pk_fp8_f32(float, float, int, _Constant bool)
Target Features: fp8-conversion-insts
No documentation.
__builtin_amdgcn_cvt_pk_fp8_f32_e5m3¶
Prototype:
int __builtin_amdgcn_cvt_pk_fp8_f32_e5m3(float, float, int, _Constant bool)
Target Features: fp8e5m3-insts
No documentation.
__builtin_amdgcn_cvt_pk_i16¶
Prototype:
_ExtVector<2, short> __builtin_amdgcn_cvt_pk_i16(int, int)
No documentation.
__builtin_amdgcn_cvt_pk_u16¶
Prototype:
_ExtVector<2, unsigned short> __builtin_amdgcn_cvt_pk_u16(unsigned int, unsigned int)
Converts two 32-bit unsigned integers to 16-bit unsigned integers and returns them packed in a two-element vector.
__builtin_amdgcn_cvt_pk_u8_f32¶
Prototype:
unsigned int __builtin_amdgcn_cvt_pk_u8_f32(float, unsigned int, unsigned int)
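Converts the float argument to an unsigned 8-bit integer and inserts it into the byte of the third argument selected by the low two bits of the second argument. The remaining bytes of the third argument are returned unchanged.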
__builtin_amdgcn_cvt_pknorm_i16¶
Prototype:
_ExtVector<2, short> __builtin_amdgcn_cvt_pknorm_i16(float, float)
Target Features: cvt-pknorm-vop2-insts
Converts two float values to signed normalized 16-bit integers and returns them packed in a two-element vector.
__builtin_amdgcn_cvt_pknorm_u16¶
Prototype:
_ExtVector<2, unsigned short> __builtin_amdgcn_cvt_pknorm_u16(float, float)
Target Features: cvt-pknorm-vop2-insts
Converts two float values to unsigned normalized 16-bit integers and returns them packed in a two-element vector.
__builtin_amdgcn_cvt_pkrtz¶
Prototype:
_ExtVector<2, __fp16> __builtin_amdgcn_cvt_pkrtz(float, float)
Converts two float values to half precision using round-toward-zero and returns them packed in a two-element vector.
__builtin_amdgcn_cvt_scale_pk16_bf16_bf6¶
Prototype:
_ExtVector<16, __bf16> __builtin_amdgcn_cvt_scale_pk16_bf16_bf6(_ExtVector<3, unsigned int>, unsigned int, _Constant unsigned int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scale_pk16_bf16_fp6¶
Prototype:
_ExtVector<16, __bf16> __builtin_amdgcn_cvt_scale_pk16_bf16_fp6(_ExtVector<3, unsigned int>, unsigned int, _Constant unsigned int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scale_pk16_f16_bf6¶
Prototype:
_ExtVector<16, _Float16> __builtin_amdgcn_cvt_scale_pk16_f16_bf6(_ExtVector<3, unsigned int>, unsigned int, _Constant unsigned int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scale_pk16_f16_fp6¶
Prototype:
_ExtVector<16, _Float16> __builtin_amdgcn_cvt_scale_pk16_f16_fp6(_ExtVector<3, unsigned int>, unsigned int, _Constant unsigned int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scale_pk16_f32_bf6¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_cvt_scale_pk16_f32_bf6(_ExtVector<3, unsigned int>, unsigned int, _Constant unsigned int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scale_pk16_f32_fp6¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_cvt_scale_pk16_f32_fp6(_ExtVector<3, unsigned int>, unsigned int, _Constant unsigned int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scale_pk8_bf16_bf8¶
Prototype:
_ExtVector<8, __bf16> __builtin_amdgcn_cvt_scale_pk8_bf16_bf8(_ExtVector<2, unsigned int>, unsigned int, _Constant unsigned int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scale_pk8_bf16_fp4¶
Prototype:
_ExtVector<8, __bf16> __builtin_amdgcn_cvt_scale_pk8_bf16_fp4(unsigned int, unsigned int, _Constant unsigned int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scale_pk8_bf16_fp8¶
Prototype:
_ExtVector<8, __bf16> __builtin_amdgcn_cvt_scale_pk8_bf16_fp8(_ExtVector<2, unsigned int>, unsigned int, _Constant unsigned int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scale_pk8_f16_bf8¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_cvt_scale_pk8_f16_bf8(_ExtVector<2, unsigned int>, unsigned int, _Constant unsigned int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scale_pk8_f16_fp4¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_cvt_scale_pk8_f16_fp4(unsigned int, unsigned int, _Constant unsigned int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scale_pk8_f16_fp8¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_cvt_scale_pk8_f16_fp8(_ExtVector<2, unsigned int>, unsigned int, _Constant unsigned int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scale_pk8_f32_bf8¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_cvt_scale_pk8_f32_bf8(_ExtVector<2, unsigned int>, unsigned int, _Constant unsigned int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scale_pk8_f32_fp4¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_cvt_scale_pk8_f32_fp4(unsigned int, unsigned int, _Constant unsigned int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scale_pk8_f32_fp8¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_cvt_scale_pk8_f32_fp8(_ExtVector<2, unsigned int>, unsigned int, _Constant unsigned int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_2xpk16_bf6_f32¶
Prototype:
_ExtVector<6, unsigned int> __builtin_amdgcn_cvt_scalef32_2xpk16_bf6_f32(_ExtVector<16, float>, _ExtVector<16, float>, float)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_2xpk16_fp6_f32¶
Prototype:
_ExtVector<6, unsigned int> __builtin_amdgcn_cvt_scalef32_2xpk16_fp6_f32(_ExtVector<16, float>, _ExtVector<16, float>, float)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_f16_bf8¶
Prototype:
_ExtVector<2, _Float16> __builtin_amdgcn_cvt_scalef32_f16_bf8(_ExtVector<2, _Float16>, int, float, _Constant int, _Constant bool)
Target Features: bf8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_f16_fp8¶
Prototype:
_ExtVector<2, _Float16> __builtin_amdgcn_cvt_scalef32_f16_fp8(_ExtVector<2, _Float16>, int, float, _Constant int, _Constant bool)
Target Features: fp8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_f32_bf8¶
Prototype:
float __builtin_amdgcn_cvt_scalef32_f32_bf8(int, float, _Constant int)
Target Features: bf8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_f32_fp8¶
Prototype:
float __builtin_amdgcn_cvt_scalef32_f32_fp8(int, float, _Constant int)
Target Features: fp8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk16_bf6_bf16¶
Prototype:
_ExtVector<3, unsigned int> __builtin_amdgcn_cvt_scalef32_pk16_bf6_bf16(_ExtVector<16, __bf16>, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk16_bf6_f16¶
Prototype:
_ExtVector<3, unsigned int> __builtin_amdgcn_cvt_scalef32_pk16_bf6_f16(_ExtVector<16, _Float16>, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk16_bf6_f32¶
Prototype:
_ExtVector<3, unsigned int> __builtin_amdgcn_cvt_scalef32_pk16_bf6_f32(_ExtVector<16, float>, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk16_fp6_bf16¶
Prototype:
_ExtVector<3, unsigned int> __builtin_amdgcn_cvt_scalef32_pk16_fp6_bf16(_ExtVector<16, __bf16>, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk16_fp6_f16¶
Prototype:
_ExtVector<3, unsigned int> __builtin_amdgcn_cvt_scalef32_pk16_fp6_f16(_ExtVector<16, _Float16>, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk16_fp6_f32¶
Prototype:
_ExtVector<3, unsigned int> __builtin_amdgcn_cvt_scalef32_pk16_fp6_f32(_ExtVector<16, float>, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk32_bf16_bf6¶
Prototype:
_ExtVector<32, __bf16> __builtin_amdgcn_cvt_scalef32_pk32_bf16_bf6(_ExtVector<6, unsigned int>, float)
Target Features: fp6bf6-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk32_bf16_fp6¶
Prototype:
_ExtVector<32, __bf16> __builtin_amdgcn_cvt_scalef32_pk32_bf16_fp6(_ExtVector<6, unsigned int>, float)
Target Features: fp6bf6-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk32_bf6_bf16¶
Prototype:
_ExtVector<6, unsigned int> __builtin_amdgcn_cvt_scalef32_pk32_bf6_bf16(_ExtVector<32, __bf16>, float)
Target Features: f16bf16-to-fp6bf6-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk32_bf6_f16¶
Prototype:
_ExtVector<6, unsigned int> __builtin_amdgcn_cvt_scalef32_pk32_bf6_f16(_ExtVector<32, _Float16>, float)
Target Features: f16bf16-to-fp6bf6-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk32_f16_bf6¶
Prototype:
_ExtVector<32, _Float16> __builtin_amdgcn_cvt_scalef32_pk32_f16_bf6(_ExtVector<6, unsigned int>, float)
Target Features: fp6bf6-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk32_f16_fp6¶
Prototype:
_ExtVector<32, _Float16> __builtin_amdgcn_cvt_scalef32_pk32_f16_fp6(_ExtVector<6, unsigned int>, float)
Target Features: fp6bf6-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk32_f32_bf6¶
Prototype:
_ExtVector<32, float> __builtin_amdgcn_cvt_scalef32_pk32_f32_bf6(_ExtVector<6, unsigned int>, float)
Target Features: fp6bf6-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk32_f32_fp6¶
Prototype:
_ExtVector<32, float> __builtin_amdgcn_cvt_scalef32_pk32_f32_fp6(_ExtVector<6, unsigned int>, float)
Target Features: fp6bf6-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk32_fp6_bf16¶
Prototype:
_ExtVector<6, unsigned int> __builtin_amdgcn_cvt_scalef32_pk32_fp6_bf16(_ExtVector<32, __bf16>, float)
Target Features: f16bf16-to-fp6bf6-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk32_fp6_f16¶
Prototype:
_ExtVector<6, unsigned int> __builtin_amdgcn_cvt_scalef32_pk32_fp6_f16(_ExtVector<32, _Float16>, float)
Target Features: f16bf16-to-fp6bf6-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk8_bf8_bf16¶
Prototype:
_ExtVector<2, unsigned int> __builtin_amdgcn_cvt_scalef32_pk8_bf8_bf16(_ExtVector<8, __bf16>, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk8_bf8_f16¶
Prototype:
_ExtVector<2, unsigned int> __builtin_amdgcn_cvt_scalef32_pk8_bf8_f16(_ExtVector<8, _Float16>, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk8_bf8_f32¶
Prototype:
_ExtVector<2, unsigned int> __builtin_amdgcn_cvt_scalef32_pk8_bf8_f32(_ExtVector<8, float>, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk8_fp4_bf16¶
Prototype:
unsigned int __builtin_amdgcn_cvt_scalef32_pk8_fp4_bf16(_ExtVector<8, __bf16>, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk8_fp4_f16¶
Prototype:
unsigned int __builtin_amdgcn_cvt_scalef32_pk8_fp4_f16(_ExtVector<8, _Float16>, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk8_fp4_f32¶
Prototype:
unsigned int __builtin_amdgcn_cvt_scalef32_pk8_fp4_f32(_ExtVector<8, float>, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk8_fp8_bf16¶
Prototype:
_ExtVector<2, unsigned int> __builtin_amdgcn_cvt_scalef32_pk8_fp8_bf16(_ExtVector<8, __bf16>, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk8_fp8_f16¶
Prototype:
_ExtVector<2, unsigned int> __builtin_amdgcn_cvt_scalef32_pk8_fp8_f16(_ExtVector<8, _Float16>, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk8_fp8_f32¶
Prototype:
_ExtVector<2, unsigned int> __builtin_amdgcn_cvt_scalef32_pk8_fp8_f32(_ExtVector<8, float>, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk_bf16_bf8¶
Prototype:
_ExtVector<2, __bf16> __builtin_amdgcn_cvt_scalef32_pk_bf16_bf8(unsigned int, float, _Constant bool)
Target Features: bf8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk_bf16_fp4¶
Prototype:
_ExtVector<2, __bf16> __builtin_amdgcn_cvt_scalef32_pk_bf16_fp4(unsigned int, float, _Constant int)
Target Features: fp4-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk_bf16_fp8¶
Prototype:
_ExtVector<2, __bf16> __builtin_amdgcn_cvt_scalef32_pk_bf16_fp8(unsigned int, float, _Constant bool)
Target Features: fp8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk_bf8_bf16¶
Prototype:
_ExtVector<2, short> __builtin_amdgcn_cvt_scalef32_pk_bf8_bf16(_ExtVector<2, short>, _ExtVector<2, __bf16>, float, _Constant bool)
Target Features: bf8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk_bf8_f16¶
Prototype:
_ExtVector<2, short> __builtin_amdgcn_cvt_scalef32_pk_bf8_f16(_ExtVector<2, short>, _ExtVector<2, _Float16>, float, _Constant bool)
Target Features: bf8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk_bf8_f32¶
Prototype:
_ExtVector<2, short> __builtin_amdgcn_cvt_scalef32_pk_bf8_f32(_ExtVector<2, short>, float, float, float, _Constant bool)
Target Features: bf8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk_f16_bf8¶
Prototype:
_ExtVector<2, _Float16> __builtin_amdgcn_cvt_scalef32_pk_f16_bf8(unsigned int, float, _Constant bool)
Target Features: bf8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk_f16_fp4¶
Prototype:
_ExtVector<2, _Float16> __builtin_amdgcn_cvt_scalef32_pk_f16_fp4(unsigned int, float, _Constant int)
Target Features: fp4-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk_f16_fp8¶
Prototype:
_ExtVector<2, _Float16> __builtin_amdgcn_cvt_scalef32_pk_f16_fp8(unsigned int, float, _Constant bool)
Target Features: fp8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk_f32_bf8¶
Prototype:
_ExtVector<2, float> __builtin_amdgcn_cvt_scalef32_pk_f32_bf8(unsigned int, float, _Constant bool)
Target Features: bf8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk_f32_fp4¶
Prototype:
_ExtVector<2, float> __builtin_amdgcn_cvt_scalef32_pk_f32_fp4(unsigned int, float, _Constant int)
Target Features: fp4-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk_f32_fp8¶
Prototype:
_ExtVector<2, float> __builtin_amdgcn_cvt_scalef32_pk_f32_fp8(unsigned int, float, _Constant bool)
Target Features: fp8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk_fp4_bf16¶
Prototype:
unsigned int __builtin_amdgcn_cvt_scalef32_pk_fp4_bf16(unsigned int, _ExtVector<2, __bf16>, float, _Constant int)
Target Features: fp4-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk_fp4_f16¶
Prototype:
unsigned int __builtin_amdgcn_cvt_scalef32_pk_fp4_f16(unsigned int, _ExtVector<2, _Float16>, float, _Constant int)
Target Features: fp4-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk_fp4_f32¶
Prototype:
unsigned int __builtin_amdgcn_cvt_scalef32_pk_fp4_f32(unsigned int, float, float, float, _Constant int)
Target Features: fp4-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk_fp8_bf16¶
Prototype:
_ExtVector<2, short> __builtin_amdgcn_cvt_scalef32_pk_fp8_bf16(_ExtVector<2, short>, _ExtVector<2, __bf16>, float, _Constant bool)
Target Features: fp8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk_fp8_f16¶
Prototype:
_ExtVector<2, short> __builtin_amdgcn_cvt_scalef32_pk_fp8_f16(_ExtVector<2, short>, _ExtVector<2, _Float16>, float, _Constant bool)
Target Features: fp8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_pk_fp8_f32¶
Prototype:
_ExtVector<2, short> __builtin_amdgcn_cvt_scalef32_pk_fp8_f32(_ExtVector<2, short>, float, float, float, _Constant bool)
Target Features: fp8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_bf8_bf16¶
Prototype:
int __builtin_amdgcn_cvt_scalef32_sr_bf8_bf16(int, __bf16, unsigned int, float, _Constant int)
Target Features: bf8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_bf8_f16¶
Prototype:
int __builtin_amdgcn_cvt_scalef32_sr_bf8_f16(int, _Float16, unsigned int, float, _Constant int)
Target Features: bf8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_bf8_f32¶
Prototype:
int __builtin_amdgcn_cvt_scalef32_sr_bf8_f32(int, float, unsigned int, float, _Constant int)
Target Features: bf8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_fp8_bf16¶
Prototype:
int __builtin_amdgcn_cvt_scalef32_sr_fp8_bf16(int, __bf16, unsigned int, float, _Constant int)
Target Features: fp8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_fp8_f16¶
Prototype:
int __builtin_amdgcn_cvt_scalef32_sr_fp8_f16(int, _Float16, unsigned int, float, _Constant int)
Target Features: fp8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_fp8_f32¶
Prototype:
int __builtin_amdgcn_cvt_scalef32_sr_fp8_f32(int, float, unsigned int, float, _Constant int)
Target Features: fp8-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk16_bf6_bf16¶
Prototype:
_ExtVector<3, unsigned int> __builtin_amdgcn_cvt_scalef32_sr_pk16_bf6_bf16(_ExtVector<16, __bf16>, unsigned int, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk16_bf6_f16¶
Prototype:
_ExtVector<3, unsigned int> __builtin_amdgcn_cvt_scalef32_sr_pk16_bf6_f16(_ExtVector<16, _Float16>, unsigned int, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk16_bf6_f32¶
Prototype:
_ExtVector<3, unsigned int> __builtin_amdgcn_cvt_scalef32_sr_pk16_bf6_f32(_ExtVector<16, float>, unsigned int, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk16_fp6_bf16¶
Prototype:
_ExtVector<3, unsigned int> __builtin_amdgcn_cvt_scalef32_sr_pk16_fp6_bf16(_ExtVector<16, __bf16>, unsigned int, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk16_fp6_f16¶
Prototype:
_ExtVector<3, unsigned int> __builtin_amdgcn_cvt_scalef32_sr_pk16_fp6_f16(_ExtVector<16, _Float16>, unsigned int, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk16_fp6_f32¶
Prototype:
_ExtVector<3, unsigned int> __builtin_amdgcn_cvt_scalef32_sr_pk16_fp6_f32(_ExtVector<16, float>, unsigned int, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk32_bf6_bf16¶
Prototype:
_ExtVector<6, unsigned int> __builtin_amdgcn_cvt_scalef32_sr_pk32_bf6_bf16(_ExtVector<32, __bf16>, unsigned int, float)
Target Features: f16bf16-to-fp6bf6-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk32_bf6_f16¶
Prototype:
_ExtVector<6, unsigned int> __builtin_amdgcn_cvt_scalef32_sr_pk32_bf6_f16(_ExtVector<32, _Float16>, unsigned int, float)
Target Features: f16bf16-to-fp6bf6-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk32_bf6_f32¶
Prototype:
_ExtVector<6, unsigned int> __builtin_amdgcn_cvt_scalef32_sr_pk32_bf6_f32(_ExtVector<32, float>, unsigned int, float)
Target Features: f16bf16-to-fp6bf6-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk32_fp6_bf16¶
Prototype:
_ExtVector<6, unsigned int> __builtin_amdgcn_cvt_scalef32_sr_pk32_fp6_bf16(_ExtVector<32, __bf16>, unsigned int, float)
Target Features: f16bf16-to-fp6bf6-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk32_fp6_f16¶
Prototype:
_ExtVector<6, unsigned int> __builtin_amdgcn_cvt_scalef32_sr_pk32_fp6_f16(_ExtVector<32, _Float16>, unsigned int, float)
Target Features: f16bf16-to-fp6bf6-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk32_fp6_f32¶
Prototype:
_ExtVector<6, unsigned int> __builtin_amdgcn_cvt_scalef32_sr_pk32_fp6_f32(_ExtVector<32, float>, unsigned int, float)
Target Features: f16bf16-to-fp6bf6-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk8_bf8_bf16¶
Prototype:
_ExtVector<2, unsigned int> __builtin_amdgcn_cvt_scalef32_sr_pk8_bf8_bf16(_ExtVector<8, __bf16>, unsigned int, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk8_bf8_f16¶
Prototype:
_ExtVector<2, unsigned int> __builtin_amdgcn_cvt_scalef32_sr_pk8_bf8_f16(_ExtVector<8, _Float16>, unsigned int, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk8_bf8_f32¶
Prototype:
_ExtVector<2, unsigned int> __builtin_amdgcn_cvt_scalef32_sr_pk8_bf8_f32(_ExtVector<8, float>, unsigned int, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk8_fp4_bf16¶
Prototype:
unsigned int __builtin_amdgcn_cvt_scalef32_sr_pk8_fp4_bf16(_ExtVector<8, __bf16>, unsigned int, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk8_fp4_f16¶
Prototype:
unsigned int __builtin_amdgcn_cvt_scalef32_sr_pk8_fp4_f16(_ExtVector<8, _Float16>, unsigned int, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk8_fp4_f32¶
Prototype:
unsigned int __builtin_amdgcn_cvt_scalef32_sr_pk8_fp4_f32(_ExtVector<8, float>, unsigned int, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk8_fp8_bf16¶
Prototype:
_ExtVector<2, unsigned int> __builtin_amdgcn_cvt_scalef32_sr_pk8_fp8_bf16(_ExtVector<8, __bf16>, unsigned int, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk8_fp8_f16¶
Prototype:
_ExtVector<2, unsigned int> __builtin_amdgcn_cvt_scalef32_sr_pk8_fp8_f16(_ExtVector<8, _Float16>, unsigned int, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk8_fp8_f32¶
Prototype:
_ExtVector<2, unsigned int> __builtin_amdgcn_cvt_scalef32_sr_pk8_fp8_f32(_ExtVector<8, float>, unsigned int, float)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk_fp4_bf16¶
Prototype:
unsigned int __builtin_amdgcn_cvt_scalef32_sr_pk_fp4_bf16(unsigned int, _ExtVector<2, __bf16>, unsigned int, float, _Constant int)
Target Features: fp4-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk_fp4_f16¶
Prototype:
unsigned int __builtin_amdgcn_cvt_scalef32_sr_pk_fp4_f16(unsigned int, _ExtVector<2, _Float16>, unsigned int, float, _Constant int)
Target Features: fp4-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_scalef32_sr_pk_fp4_f32¶
Prototype:
unsigned int __builtin_amdgcn_cvt_scalef32_sr_pk_fp4_f32(unsigned int, _ExtVector<2, float>, unsigned int, float, _Constant int)
Target Features: fp4-cvt-scale-insts
No documentation.
__builtin_amdgcn_cvt_sr_bf16_f32¶
Prototype:
_ExtVector<2, __bf16> __builtin_amdgcn_cvt_sr_bf16_f32(_ExtVector<2, __bf16>, float, unsigned int, _Constant bool)
Target Features: f32-to-f16bf16-cvt-sr-insts
No documentation.
__builtin_amdgcn_cvt_sr_bf8_f16¶
Prototype:
int __builtin_amdgcn_cvt_sr_bf8_f16(_Float16, int, unsigned int, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_sr_bf8_f32¶
Prototype:
int __builtin_amdgcn_cvt_sr_bf8_f32(float, int, int, _Constant int)
Target Features: fp8-conversion-insts
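Converts a float to BF8 using stochastic rounding. The second argument supplies the random bits used for rounding, the third supplies the existing 32-bit value to update, and the constant selects which of its four bytes receives the result.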
__builtin_amdgcn_cvt_sr_f16_f32¶
Prototype:
_ExtVector<2, _Float16> __builtin_amdgcn_cvt_sr_f16_f32(_ExtVector<2, _Float16>, float, unsigned int, _Constant bool)
Target Features: f32-to-f16bf16-cvt-sr-insts
No documentation.
__builtin_amdgcn_cvt_sr_fp8_f16¶
Prototype:
int __builtin_amdgcn_cvt_sr_fp8_f16(_Float16, int, unsigned int, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_cvt_sr_fp8_f32¶
Prototype:
int __builtin_amdgcn_cvt_sr_fp8_f32(float, int, int, _Constant int)
Target Features: fp8-conversion-insts
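Converts a float to FP8 using stochastic rounding. The second argument supplies the random bits used for rounding, the third supplies the existing 32-bit value to update, and the constant selects which of its four bytes receives the result.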
__builtin_amdgcn_cvt_sr_fp8_f32_e5m3¶
Prototype:
int __builtin_amdgcn_cvt_sr_fp8_f32_e5m3(float, int, int, _Constant int)
Target Features: fp8e5m3-insts
No documentation.
__builtin_amdgcn_cvt_sr_pk_bf16_f32¶
Prototype:
_ExtVector<2, __bf16> __builtin_amdgcn_cvt_sr_pk_bf16_f32(float, float, int)
Target Features: bf16-cvt-insts
No documentation.
__builtin_amdgcn_cvt_sr_pk_f16_f32¶
Prototype:
_ExtVector<2, _Float16> __builtin_amdgcn_cvt_sr_pk_f16_f32(float, float, int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_div_fixup¶
Prototype:
double __builtin_amdgcn_div_fixup(double, double, double)
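Final step of the hardware division sequence (v_div_fixup) for double. The arguments are the computed quotient, the denominator, and the numerator; the result is the quotient corrected for special cases such as NaN, infinity, and zero operands.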
__builtin_amdgcn_div_fixupf¶
Prototype:
float __builtin_amdgcn_div_fixupf(float, float, float)
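Final step of the hardware division sequence (v_div_fixup) for float. The arguments are the computed quotient, the denominator, and the numerator; the result is the quotient corrected for special cases such as NaN, infinity, and zero operands.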
__builtin_amdgcn_div_fixuph¶
Prototype:
_Float16 __builtin_amdgcn_div_fixuph(_Float16, _Float16, _Float16)
Target Features: 16-bit-insts
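Final step of the hardware division sequence (v_div_fixup) for half precision. The arguments are the computed quotient, the denominator, and the numerator; the result is the quotient corrected for special cases such as NaN, infinity, and zero operands.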
__builtin_amdgcn_div_fmas¶
Prototype:
double __builtin_amdgcn_div_fmas(double, double, double, bool)
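Fused multiply-add step of the hardware division sequence (v_div_fmas) for double. Computes the first argument times the second plus the third, and applies additional scaling to the result when the bool argument is true. The bool is normally the flag produced by the matching div_scale builtin.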
__builtin_amdgcn_div_fmasf¶
Prototype:
float __builtin_amdgcn_div_fmasf(float, float, float, bool)
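Fused multiply-add step of the hardware division sequence (v_div_fmas) for float. Computes the first argument times the second plus the third, and applies additional scaling to the result when the bool argument is true. The bool is normally the flag produced by the matching div_scale builtin.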
__builtin_amdgcn_div_scale¶
Prototype:
double __builtin_amdgcn_div_scale(double, double, bool, bool *)
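Scaling step of the hardware division sequence (v_div_scale) for double. The first argument is the numerator and the second the denominator; the bool selects which of the two is returned in scaled form (true for the numerator, false for the denominator). The flag produced by the instruction is stored through the pointer argument.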
__builtin_amdgcn_div_scalef¶
Prototype:
float __builtin_amdgcn_div_scalef(float, float, bool, bool *)
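Scaling step of the hardware division sequence (v_div_scale) for float. The first argument is the numerator and the second the denominator; the bool selects which of the two is returned in scaled form (true for the numerator, false for the denominator). The flag produced by the instruction is stored through the pointer argument.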
__builtin_amdgcn_dot4_f32_bf8_bf8¶
Prototype:
float __builtin_amdgcn_dot4_f32_bf8_bf8(unsigned int, unsigned int, float)
Target Features: dot11-insts
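Treats each of the first two arguments as four packed BF8 values, computes their four-element dot product, and adds the float accumulator given as the third argument.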
__builtin_amdgcn_dot4_f32_bf8_fp8¶
Prototype:
float __builtin_amdgcn_dot4_f32_bf8_fp8(unsigned int, unsigned int, float)
Target Features: dot11-insts
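Treats the first argument as four packed BF8 values and the second as four packed FP8 values, computes their four-element dot product, and adds the float accumulator given as the third argument.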
__builtin_amdgcn_dot4_f32_fp8_bf8¶
Prototype:
float __builtin_amdgcn_dot4_f32_fp8_bf8(unsigned int, unsigned int, float)
Target Features: dot11-insts
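Treats the first argument as four packed FP8 values and the second as four packed BF8 values, computes their four-element dot product, and adds the float accumulator given as the third argument.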
__builtin_amdgcn_dot4_f32_fp8_fp8¶
Prototype:
float __builtin_amdgcn_dot4_f32_fp8_fp8(unsigned int, unsigned int, float)
Target Features: dot11-insts
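Treats each of the first two arguments as four packed FP8 values, computes their four-element dot product, and adds the float accumulator given as the third argument.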
__builtin_amdgcn_ds_append¶
Prototype:
int __builtin_amdgcn_ds_append(int address_space<3> *)
No documentation.
__builtin_amdgcn_ds_atomic_async_barrier_arrive_b64¶
Prototype:
void __builtin_amdgcn_ds_atomic_async_barrier_arrive_b64(long int address_space<3> *)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_ds_atomic_barrier_arrive_rtn_b64¶
Prototype:
long int __builtin_amdgcn_ds_atomic_barrier_arrive_rtn_b64(long int address_space<3> *, long int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_ds_atomic_fadd_f32¶
Prototype:
float __builtin_amdgcn_ds_atomic_fadd_f32(float address_space<3> *, float)
Target Features: gfx8-insts
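Atomically adds the second argument to the float stored at the given LDS (address space 3) address and returns the value previously stored there.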
__builtin_amdgcn_ds_atomic_fadd_f64¶
Prototype:
double __builtin_amdgcn_ds_atomic_fadd_f64(double address_space<3> *, double)
Target Features: gfx90a-insts
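Atomically adds the second argument to the double stored at the given LDS (address space 3) address and returns the value previously stored there.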
__builtin_amdgcn_ds_atomic_fadd_v2bf16¶
Prototype:
_ExtVector<2, short> __builtin_amdgcn_ds_atomic_fadd_v2bf16(_ExtVector<2, short> address_space<3> *, _ExtVector<2, short>)
Target Features: atomic-ds-pk-add-16-insts
No documentation.
__builtin_amdgcn_ds_atomic_fadd_v2f16¶
Prototype:
_ExtVector<2, _Float16> __builtin_amdgcn_ds_atomic_fadd_v2f16(_ExtVector<2, _Float16> address_space<3> *, _ExtVector<2, _Float16>)
Target Features: atomic-ds-pk-add-16-insts
No documentation.
__builtin_amdgcn_ds_bpermute¶
Prototype:
int __builtin_amdgcn_ds_bpermute(int, int)
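Backward permute (ds_bpermute_b32). Each lane returns the value of the second argument held by the lane that the first argument selects. The first argument is a byte offset, so the source lane index is that value divided by 4. No LDS memory is read or written.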
__builtin_amdgcn_ds_bpermute_fi_b32¶
Prototype:
int __builtin_amdgcn_ds_bpermute_fi_b32(int, int)
Target Features: gfx12-insts
No documentation.
__builtin_amdgcn_ds_bvh_stack_push4_pop1_rtn¶
Prototype:
_ExtVector<2, unsigned int> __builtin_amdgcn_ds_bvh_stack_push4_pop1_rtn(unsigned int, unsigned int, _ExtVector<4, unsigned int>, _Constant int)
Target Features: gfx11-insts
No documentation.
__builtin_amdgcn_ds_bvh_stack_push8_pop1_rtn¶
Prototype:
_ExtVector<2, unsigned int> __builtin_amdgcn_ds_bvh_stack_push8_pop1_rtn(unsigned int, unsigned int, _ExtVector<8, unsigned int>, _Constant int)
Target Features: gfx12-insts
No documentation.
__builtin_amdgcn_ds_bvh_stack_push8_pop2_rtn¶
Prototype:
_ExtVector<2, uint64_t> __builtin_amdgcn_ds_bvh_stack_push8_pop2_rtn(unsigned int, unsigned int, _ExtVector<8, unsigned int>, _Constant int)
Target Features: gfx12-insts
No documentation.
__builtin_amdgcn_ds_bvh_stack_rtn¶
Prototype:
_ExtVector<2, unsigned int> __builtin_amdgcn_ds_bvh_stack_rtn(unsigned int, unsigned int, _ExtVector<4, unsigned int>, _Constant int)
Target Features: gfx11-insts
No documentation.
__builtin_amdgcn_ds_consume¶
Prototype:
int __builtin_amdgcn_ds_consume(int address_space<3> *)
No documentation.
__builtin_amdgcn_ds_faddf¶
Prototype:
float __builtin_amdgcn_ds_faddf(float address_space<3> *, float, _Constant int, _Constant int, _Constant bool)
No documentation.
__builtin_amdgcn_ds_fmaxf¶
Prototype:
float __builtin_amdgcn_ds_fmaxf(float address_space<3> *, float, _Constant int, _Constant int, _Constant bool)
No documentation.
__builtin_amdgcn_ds_fminf¶
Prototype:
float __builtin_amdgcn_ds_fminf(float address_space<3> *, float, _Constant int, _Constant int, _Constant bool)
No documentation.
__builtin_amdgcn_ds_gws_barrier¶
Prototype:
void __builtin_amdgcn_ds_gws_barrier(unsigned int, unsigned int)
Target Features: gws
No documentation.
__builtin_amdgcn_ds_gws_init¶
Prototype:
void __builtin_amdgcn_ds_gws_init(unsigned int, unsigned int)
Target Features: gws
No documentation.
__builtin_amdgcn_ds_gws_sema_br¶
Prototype:
void __builtin_amdgcn_ds_gws_sema_br(unsigned int, unsigned int)
Target Features: gws
No documentation.
__builtin_amdgcn_ds_gws_sema_p¶
Prototype:
void __builtin_amdgcn_ds_gws_sema_p(unsigned int)
Target Features: gws
No documentation.
__builtin_amdgcn_ds_gws_sema_release_all¶
Prototype:
void __builtin_amdgcn_ds_gws_sema_release_all(unsigned int)
Target Features: ci-insts
No documentation.
__builtin_amdgcn_ds_gws_sema_v¶
Prototype:
void __builtin_amdgcn_ds_gws_sema_v(unsigned int)
Target Features: gws
No documentation.
__builtin_amdgcn_ds_load_tr16_b128_v8bf16¶
Prototype:
_ExtVector<8, __bf16> __builtin_amdgcn_ds_load_tr16_b128_v8bf16(_ExtVector<8, __bf16> address_space<3> *)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_ds_load_tr16_b128_v8f16¶
Prototype:
_ExtVector<8, __fp16> __builtin_amdgcn_ds_load_tr16_b128_v8f16(_ExtVector<8, __fp16> address_space<3> *)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_ds_load_tr16_b128_v8i16¶
Prototype:
_ExtVector<8, short> __builtin_amdgcn_ds_load_tr16_b128_v8i16(_ExtVector<8, short> address_space<3> *)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_ds_load_tr4_b64_v2i32¶
Prototype:
_ExtVector<2, int> __builtin_amdgcn_ds_load_tr4_b64_v2i32(_ExtVector<2, int> address_space<3> *)
Target Features: transpose-load-f4f6-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_ds_load_tr6_b96_v3i32¶
Prototype:
_ExtVector<3, int> __builtin_amdgcn_ds_load_tr6_b96_v3i32(_ExtVector<3, int> address_space<3> *)
Target Features: transpose-load-f4f6-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_ds_load_tr8_b64_v2i32¶
Prototype:
_ExtVector<2, int> __builtin_amdgcn_ds_load_tr8_b64_v2i32(_ExtVector<2, int> address_space<3> *)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_ds_permute¶
Prototype:
int __builtin_amdgcn_ds_permute(int, int)
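Forward permute (ds_permute_b32). Each lane sends the value of the second argument to the lane that the first argument selects, and the builtin returns the value the calling lane received. The first argument is a byte offset, so the destination lane index is that value divided by 4. No LDS memory is read or written.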
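The difference between the backward permute (__builtin_amdgcn_ds_bpermute, a "pull") and the forward permute (__builtin_amdgcn_ds_permute, a "push") can be sketched with a small host-side model. This is not device code and does not call the builtins. The names Wave, kWaveSize, bpermute_model, permute_model and the two demo helpers are hypothetical, and the model assumes a 64-lane wave with all lanes active, lane indices that wrap modulo the wave size, and zero in lanes that receive nothing.

```cpp
// Host-side model only; it illustrates the data movement, not the builtins.
constexpr int kWaveSize = 64;  // assumption: wave64

struct Wave {
    int lane[kWaveSize];  // one 32-bit value per lane
};

// Backward permute ("pull"): lane i reads src from the lane named by addr[i].
// addr holds byte offsets, so the lane index is addr / 4.
constexpr Wave bpermute_model(const Wave& addr, const Wave& src) {
    Wave dst{};
    for (int i = 0; i < kWaveSize; ++i) {
        // Wrapping modulo the wave size is an assumption of this model.
        unsigned from = (static_cast<unsigned>(addr.lane[i]) / 4u) % kWaveSize;
        dst.lane[i] = src.lane[from];
    }
    return dst;
}

// Forward permute ("push"): lane i writes its src to the lane named by addr[i].
constexpr Wave permute_model(const Wave& addr, const Wave& src) {
    Wave dst{};  // lanes that receive nothing stay 0 in this model
    for (int i = 0; i < kWaveSize; ++i) {
        unsigned to = (static_cast<unsigned>(addr.lane[i]) / 4u) % kWaveSize;
        dst.lane[to] = src.lane[i];  // later lanes overwrite earlier ones here
    }
    return dst;
}

// Demo: every lane names its right-hand neighbour and holds the value 100 + i.
constexpr int rotate_pull_demo(int lane) {
    Wave addr{}, src{};
    for (int i = 0; i < kWaveSize; ++i) {
        addr.lane[i] = ((i + 1) % kWaveSize) * 4;  // lane index times 4
        src.lane[i] = 100 + i;
    }
    return bpermute_model(addr, src).lane[lane];
}

constexpr int rotate_push_demo(int lane) {
    Wave addr{}, src{};
    for (int i = 0; i < kWaveSize; ++i) {
        addr.lane[i] = ((i + 1) % kWaveSize) * 4;
        src.lane[i] = 100 + i;
    }
    return permute_model(addr, src).lane[lane];
}
```

With the same address pattern, the pull form gives lane 0 the value held by lane 1, while the push form gives lane 0 the value held by lane 63. In device code the lane index is usually multiplied by 4 before being passed as the first argument.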
__builtin_amdgcn_ds_read_tr16_b64_v4bf16¶
Prototype:
_ExtVector<4, __bf16> __builtin_amdgcn_ds_read_tr16_b64_v4bf16(_ExtVector<4, __bf16> address_space<3> *)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_ds_read_tr16_b64_v4f16¶
Prototype:
_ExtVector<4, __fp16> __builtin_amdgcn_ds_read_tr16_b64_v4f16(_ExtVector<4, __fp16> address_space<3> *)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_ds_read_tr16_b64_v4i16¶
Prototype:
_ExtVector<4, short> __builtin_amdgcn_ds_read_tr16_b64_v4i16(_ExtVector<4, short> address_space<3> *)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_ds_read_tr4_b64_v2i32¶
Prototype:
_ExtVector<2, int> __builtin_amdgcn_ds_read_tr4_b64_v2i32(_ExtVector<2, int> address_space<3> *)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_ds_read_tr6_b96_v3i32¶
Prototype:
_ExtVector<3, int> __builtin_amdgcn_ds_read_tr6_b96_v3i32(_ExtVector<3, int> address_space<3> *)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_ds_read_tr8_b64_v2i32¶
Prototype:
_ExtVector<2, int> __builtin_amdgcn_ds_read_tr8_b64_v2i32(_ExtVector<2, int> address_space<3> *)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_ds_swizzle¶
Prototype:
int __builtin_amdgcn_ds_swizzle(int, _Constant int)
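Exchanges data between lanes according to the swizzle pattern encoded in the constant second argument and returns the value received by the calling lane. No LDS memory is read or written.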
__builtin_amdgcn_endpgm¶
Prototype:
void __builtin_amdgcn_endpgm()
Ends execution of the current wavefront (s_endpgm). Control does not return to the caller.
__builtin_amdgcn_exp2_bf16¶
Prototype:
__bf16 __builtin_amdgcn_exp2_bf16(__bf16)
Target Features: bf16-trans-insts
No documentation.
__builtin_amdgcn_exp2f¶
Prototype:
float __builtin_amdgcn_exp2f(float)
Returns 2 raised to the power of the argument, computed with the hardware v_exp_f32 instruction.
__builtin_amdgcn_fcmp¶
Prototype:
uint64_t __builtin_amdgcn_fcmp(double, double, _Constant int)
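Compares the first two double arguments in every active lane and returns a wave-wide mask with one bit per lane, set where the comparison holds. The constant third argument selects the comparison and uses the LLVM floating-point predicate numbering (for example, 1 is ordered-equal and 4 is ordered-less-than).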
__builtin_amdgcn_fcmpf¶
Prototype:
uint64_t __builtin_amdgcn_fcmpf(float, float, _Constant int)
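Compares the first two float arguments in every active lane and returns a wave-wide mask with one bit per lane, set where the comparison holds. The constant third argument selects the comparison and uses the LLVM floating-point predicate numbering (for example, 1 is ordered-equal and 4 is ordered-less-than).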
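How the wave-wide result of __builtin_amdgcn_fcmp and __builtin_amdgcn_fcmpf is assembled can be sketched with a host-side model. This is not device code and does not call the builtins. The names LaneValues, kLanes, fcmp_lane, fcmp_model and threshold_demo are hypothetical, only a subset of the predicates is modeled, and the sketch assumes a 64-lane wave with every lane active.

```cpp
#include <cstdint>

// Host-side model only; it shows how per-lane results form a 64-bit mask.
constexpr int kLanes = 64;  // assumption: wave64, all lanes active

// Subset of the LLVM floating-point comparison predicate values.
enum FCmpPred : int { kOEQ = 1, kOGT = 2, kOGE = 3, kOLT = 4, kOLE = 5 };

struct LaneValues {
    float v[kLanes];  // one float per lane
};

// Per-lane comparison. The C++ operators already return false when either
// operand is NaN, which matches the "ordered" predicates modeled here.
constexpr bool fcmp_lane(float a, float b, int pred) {
    switch (pred) {
    case kOEQ: return a == b;
    case kOGT: return a > b;
    case kOGE: return a >= b;
    case kOLT: return a < b;
    case kOLE: return a <= b;
    default:   return false;  // predicates outside this sketch
    }
}

// Bit i of the result is set when the comparison holds in lane i.
constexpr std::uint64_t fcmp_model(const LaneValues& a, const LaneValues& b,
                                   int pred) {
    std::uint64_t mask = 0;
    for (int i = 0; i < kLanes; ++i) {
        if (fcmp_lane(a.v[i], b.v[i], pred)) {
            mask |= std::uint64_t{1} << i;
        }
    }
    return mask;
}

// Demo: lane i holds a = i, every lane holds b = 2.
constexpr std::uint64_t threshold_demo(int pred) {
    LaneValues a{}, b{};
    for (int i = 0; i < kLanes; ++i) {
        a.v[i] = static_cast<float>(i);
        b.v[i] = 2.0f;
    }
    return fcmp_model(a, b, pred);
}
```

In this demo only lanes 0 and 1 satisfy the ordered-less-than predicate, so the two lowest bits of the mask are set. On hardware, lanes that are inactive at the call would be expected to contribute a zero bit.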
__builtin_amdgcn_fdot2¶
Prototype:
float __builtin_amdgcn_fdot2(_ExtVector<2, _Float16>, _ExtVector<2, _Float16>, float, _Constant bool)
Target Features: dot10-insts
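Returns the dot product of the two 2-element half-precision vectors added to the float accumulator passed as the third argument. The last argument is the clamp flag and must be a compile-time constant.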
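The arithmetic of this builtin can be sketched on the host as follows. This is a model under stated assumptions rather than the hardware definition: the half-precision inputs are represented as float, the clamp flag is assumed to clamp the result to [0.0, 1.0], and the rounding and fusion behaviour of the real instruction is not modeled. The name fdot2_model is hypothetical.

```c
#include <assert.h>

/* Host-side model of a 2-element dot product with accumulate.
   a and b stand in for the two half-precision vectors, acc for the
   float accumulator. clamp != 0 is assumed to clamp to [0.0, 1.0]. */
static float fdot2_model(const float a[2], const float b[2], float acc,
                         int clamp) {
    assert(a && b);
    float r = a[0] * b[0] + a[1] * b[1] + acc;
    if (clamp) {
        if (r < 0.0f) r = 0.0f;
        if (r > 1.0f) r = 1.0f;
    }
    return r;
}
```

For example, with a = {1, 2}, b = {3, 4} and an accumulator of 0.5 the model gives 11.5 without clamping and 1.0 with clamping.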
__builtin_amdgcn_fdot2_bf16_bf16¶
Prototype:
short __builtin_amdgcn_fdot2_bf16_bf16(_ExtVector<2, short>, _ExtVector<2, short>, short)
Target Features: dot9-insts
No documentation.
__builtin_amdgcn_fdot2_f16_f16¶
Prototype:
_Float16 __builtin_amdgcn_fdot2_f16_f16(_ExtVector<2, _Float16>, _ExtVector<2, _Float16>, _Float16)
Target Features: dot9-insts
No documentation.
__builtin_amdgcn_fdot2_f32_bf16¶
Prototype:
float __builtin_amdgcn_fdot2_f32_bf16(_ExtVector<2, short>, _ExtVector<2, short>, float, _Constant bool)
Target Features: dot12-insts
No documentation.
__builtin_amdgcn_fdot2c_f32_bf16¶
Prototype:
float __builtin_amdgcn_fdot2c_f32_bf16(_ExtVector<2, __bf16>, _ExtVector<2, __bf16>, float, _Constant bool)
Target Features: dot13-insts
No documentation.
__builtin_amdgcn_fence¶
Prototype:
void __builtin_amdgcn_fence(unsigned int, char const *, ...)
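Emits a memory fence. The first argument is the atomic ordering (for example __ATOMIC_RELEASE). The second argument is a string literal naming the synchronization scope (for example "workgroup" or "agent"). Optional trailing string arguments name the address spaces the fence applies to (for example "local" or "global").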
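A minimal usage sketch, assuming a release fence at workgroup scope is what the caller wants. The helpers release_fence_workgroup and publish are hypothetical names introduced for illustration. The host branch is only there so the sketch also builds off-device; it is not equivalent to the AMDGCN fence.

```c
#include <assert.h>

/* Hypothetical helper: release fence at workgroup scope on AMDGCN,
   plain release fence elsewhere so the sketch also compiles on a host. */
static inline void release_fence_workgroup(void) {
#if defined(__AMDGCN__)
    __builtin_amdgcn_fence(__ATOMIC_RELEASE, "workgroup");
#else
    __atomic_thread_fence(__ATOMIC_RELEASE);
#endif
}

/* Write a payload, fence, then raise a flag that a consumer polls. */
static void publish(int *payload, int *flag, int value) {
    assert(payload && flag);
    *payload = value;
    release_fence_workgroup();
    __atomic_store_n(flag, 1, __ATOMIC_RELAXED);
}
```

A consumer would pair this with an acquire fence after it observes the flag. Restricting the fence to particular address spaces, by passing trailing strings such as "local", is left out of the sketch.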
__builtin_amdgcn_flat_atomic_fadd_f32¶
Prototype:
float __builtin_amdgcn_flat_atomic_fadd_f32(float address_space<0> *, float)
Target Features: gfx940-insts
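Atomically adds the second argument to the float at the given flat address and returns the value that was stored there before the addition.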
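The contract of an atomic floating-point add that returns the previous value can be illustrated with a host-side compare-and-swap loop. This is a sketch using C11 atomics, not the lowering Clang performs for the builtin; the name atomic_fadd_f32_fallback is hypothetical, and relaxed ordering is an arbitrary choice for the example.

```c
#include <assert.h>
#include <stdatomic.h>

/* Host-side sketch: add value to *addr atomically and return the value
   that was stored before the addition, using a compare-and-swap loop. */
static float atomic_fadd_f32_fallback(_Atomic float *addr, float value) {
    assert(addr);
    float old = atomic_load_explicit(addr, memory_order_relaxed);
    /* On failure, old is refreshed with the current contents and the
       sum is recomputed on the next iteration. */
    while (!atomic_compare_exchange_weak_explicit(
        addr, &old, old + value, memory_order_relaxed, memory_order_relaxed)) {
    }
    return old;
}
```

A loop of this shape is a common fallback on targets that lack the required feature, at the cost of retries under contention.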
__builtin_amdgcn_flat_atomic_fadd_f64¶
Prototype:
double __builtin_amdgcn_flat_atomic_fadd_f64(double address_space<0> *, double)
Target Features: gfx90a-insts
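Atomically adds the second argument to the double at the given flat address and returns the value that was stored there before the addition.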
__builtin_amdgcn_flat_atomic_fadd_v2bf16¶
Prototype:
_ExtVector<2, short> __builtin_amdgcn_flat_atomic_fadd_v2bf16(_ExtVector<2, short> address_space<0> *, _ExtVector<2, short>)
Target Features: atomic-flat-pk-add-16-insts
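Atomically adds the second argument, element by element, to the packed pair of bfloat16 values at the given flat address and returns the pair that was stored there before the addition. The bfloat16 values are passed as 16-bit integers.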
__builtin_amdgcn_flat_atomic_fadd_v2f16¶
Prototype:
_ExtVector<2, _Float16> __builtin_amdgcn_flat_atomic_fadd_v2f16(_ExtVector<2, _Float16> address_space<0> *, _ExtVector<2, _Float16>)
Target Features: atomic-flat-pk-add-16-insts
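Atomically adds the second argument, element by element, to the packed pair of half-precision values at the given flat address and returns the pair that was stored there before the addition.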
__builtin_amdgcn_flat_atomic_fmax_f64¶
Prototype:
double __builtin_amdgcn_flat_atomic_fmax_f64(double address_space<0> *, double)
Target Features: gfx90a-insts
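Atomically replaces the double at the given flat address with the maximum of that value and the second argument, and returns the value that was stored there before the operation.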
__builtin_amdgcn_flat_atomic_fmin_f64¶
Prototype:
double __builtin_amdgcn_flat_atomic_fmin_f64(double address_space<0> *, double)
Target Features: gfx90a-insts
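Atomically replaces the double at the given flat address with the minimum of that value and the second argument, and returns the value that was stored there before the operation.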
__builtin_amdgcn_flat_load_monitor_b128¶
Prototype:
_ExtVector<4, int> __builtin_amdgcn_flat_load_monitor_b128(_ExtVector<4, int> address_space<0> *, _Constant int, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_flat_load_monitor_b32¶
Prototype:
int __builtin_amdgcn_flat_load_monitor_b32(int address_space<0> *, _Constant int, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_flat_load_monitor_b64¶
Prototype:
_ExtVector<2, int> __builtin_amdgcn_flat_load_monitor_b64(_ExtVector<2, int> address_space<0> *, _Constant int, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_flat_prefetch¶
Prototype:
void __builtin_amdgcn_flat_prefetch(void const address_space<0> *, _Constant int)
Target Features: vmem-pref-insts
No documentation.
__builtin_amdgcn_fmed3f¶
Prototype:
float __builtin_amdgcn_fmed3f(float, float, float)
Returns the median (middle value) of the three float arguments (v_med3_f32).
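Assuming the builtin computes the median of its three operands, the operation can be modeled on the host with two-way minimum and maximum steps. The sketch below does not model NaN handling, which on hardware depends on the floating-point mode; med3_model and clamp_via_med3 are hypothetical names.

```c
#include <assert.h>

static float min2(float x, float y) { return x < y ? x : y; }
static float max2(float x, float y) { return x > y ? x : y; }

/* Host-side model: median of three values, NaN inputs excluded. */
static float med3_model(float a, float b, float c) {
    assert(a == a && b == b && c == c); /* NaN is not modeled */
    float lo = min2(a, b);
    float hi = max2(a, b);
    return max2(lo, min2(hi, c));
}

/* A common use of a median-of-three: clamping x to [lo, hi], lo <= hi. */
static float clamp_via_med3(float x, float lo, float hi) {
    return med3_model(x, lo, hi);
}
```

The clamp helper shows why a single median operation is convenient: when lo <= hi, the median of x, lo and hi is x limited to that range.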
__builtin_amdgcn_fmed3h¶
Prototype:
__fp16 __builtin_amdgcn_fmed3h(__fp16, __fp16, __fp16)
Target Features: gfx9-insts
Returns the median (middle value) of the three half-precision arguments (v_med3_f16).
__builtin_amdgcn_fract¶
Prototype:
double __builtin_amdgcn_fract(double)
Returns the fractional part of the double argument, x - floor(x) (v_fract_f64).
__builtin_amdgcn_fractf¶
Prototype:
float __builtin_amdgcn_fractf(float)
Returns the fractional part of the float argument, x - floor(x) (v_fract_f32).
__builtin_amdgcn_fracth¶
Prototype:
_Float16 __builtin_amdgcn_fracth(_Float16)
Target Features: 16-bit-insts
Returns the fractional part of the half-precision argument, x - floor(x) (v_fract_f16).
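A host-side model of the fractional-part operation for float, assuming the result is x - floor(x). The clamp to the largest float below 1.0 mirrors how fract is conventionally defined (for example in OpenCL C) and is an assumption about, not a statement of, the hardware behaviour. The name fract_model is hypothetical, and the model only accepts finite inputs of magnitude below 2^24.

```c
#include <assert.h>

/* Host-side model of fract for float: x - floor(x), kept below 1.0. */
static float fract_model(float x) {
    /* NaN and large magnitudes are outside the scope of this sketch. */
    assert(x == x && x > -16777216.0f && x < 16777216.0f);
    float t = (float)(long long)x;       /* truncate toward zero */
    float fl = (t > x) ? t - 1.0f : t;   /* adjust to floor for negatives */
    float r = x - fl;
    const float below_one = 0x1.fffffep-1f; /* largest float below 1.0 */
    return r < below_one ? r : below_one;
}
```

The clamp matters for tiny negative inputs, where x - floor(x) would otherwise round up to exactly 1.0.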
__builtin_amdgcn_frexp_exp¶
Prototype:
int __builtin_amdgcn_frexp_exp(double)
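Returns the binary exponent of the double argument, such that the argument equals mantissa * 2^exponent with the magnitude of the mantissa in [0.5, 1.0) (v_frexp_exp_i32_f64).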
__builtin_amdgcn_frexp_expf¶
Prototype:
int __builtin_amdgcn_frexp_expf(float)
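Returns the binary exponent of the float argument, such that the argument equals mantissa * 2^exponent with the magnitude of the mantissa in [0.5, 1.0) (v_frexp_exp_i32_f32).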
__builtin_amdgcn_frexp_exph¶
Prototype:
short __builtin_amdgcn_frexp_exph(_Float16)
Target Features: 16-bit-insts
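Returns the binary exponent of the half-precision argument, such that the argument equals mantissa * 2^exponent with the magnitude of the mantissa in [0.5, 1.0).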
__builtin_amdgcn_frexp_mant¶
Prototype:
double __builtin_amdgcn_frexp_mant(double)
Returns the mantissa of the double argument, scaled so that its magnitude lies in [0.5, 1.0) (v_frexp_mant_f64).
__builtin_amdgcn_frexp_mantf¶
Prototype:
float __builtin_amdgcn_frexp_mantf(float)
Returns the mantissa of the float argument, scaled so that its magnitude lies in [0.5, 1.0) (v_frexp_mant_f32).
__builtin_amdgcn_frexp_manth¶
Prototype:
_Float16 __builtin_amdgcn_frexp_manth(_Float16)
Target Features: 16-bit-insts
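Returns the mantissa of the half-precision argument, scaled so that its magnitude lies in [0.5, 1.0) (v_frexp_mant_f16).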
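The exponent and mantissa builtins split a value the same way the C frexp function does. The sketch below models the float case on the host by editing the IEEE-754 bit pattern directly. It covers normal, finite, non-zero inputs only; zero, subnormal, infinite and NaN inputs are deliberately excluded because their hardware results are not described here. The names frexp_exp_model and frexp_mant_model are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Extract the biased 8-bit exponent field of a float. */
static int biased_exponent(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (int)((bits >> 23) & 0xffu);
}

/* Host-side model: exponent e such that x = m * 2^e with 0.5 <= |m| < 1. */
static int frexp_exp_model(float x) {
    int biased = biased_exponent(x);
    assert(biased != 0 && biased != 0xff); /* normal finite inputs only */
    return biased - 126;
}

/* Host-side model: mantissa m with 0.5 <= |m| < 1 and the sign of x. */
static float frexp_mant_model(float x) {
    int biased = biased_exponent(x);
    assert(biased != 0 && biased != 0xff); /* normal finite inputs only */
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    /* Keep sign and fraction bits, force the exponent field to 126. */
    bits = (bits & 0x807fffffu) | (126u << 23);
    float m;
    memcpy(&m, &bits, sizeof m);
    return m;
}
```

For instance, 8.0 splits into a mantissa of 0.5 and an exponent of 4, and -3.0 splits into -0.75 and 2.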
__builtin_amdgcn_get_fpenv¶
Prototype:
uint64_t __builtin_amdgcn_get_fpenv()
Returns the current floating-point environment of the wavefront as a 64-bit value.
__builtin_amdgcn_global_atomic_fadd_f32¶
Prototype:
float __builtin_amdgcn_global_atomic_fadd_f32(float address_space<1> *, float)
Target Features: atomic-fadd-rtn-insts
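Atomically adds the second argument to the float at the given global address and returns the value that was stored there before the addition.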
__builtin_amdgcn_global_atomic_fadd_f64¶
Prototype:
double __builtin_amdgcn_global_atomic_fadd_f64(double address_space<1> *, double)
Target Features: gfx90a-insts
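Atomically adds the second argument to the double at the given global address and returns the value that was stored there before the addition.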
__builtin_amdgcn_global_atomic_fadd_v2bf16¶
Prototype:
_ExtVector<2, short> __builtin_amdgcn_global_atomic_fadd_v2bf16(_ExtVector<2, short> address_space<1> *, _ExtVector<2, short>)
Target Features: atomic-global-pk-add-bf16-inst
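Atomically adds the second argument, element by element, to the packed pair of bfloat16 values at the given global address and returns the pair that was stored there before the addition. The bfloat16 values are passed as 16-bit integers.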
__builtin_amdgcn_global_atomic_fadd_v2f16¶
Prototype:
_ExtVector<2, _Float16> __builtin_amdgcn_global_atomic_fadd_v2f16(_ExtVector<2, _Float16> address_space<1> *, _ExtVector<2, _Float16>)
Target Features: atomic-buffer-global-pk-add-f16-insts
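Atomically adds the second argument, element by element, to the packed pair of half-precision values at the given global address and returns the pair that was stored there before the addition.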
__builtin_amdgcn_global_atomic_fmax_f64¶
Prototype:
double __builtin_amdgcn_global_atomic_fmax_f64(double address_space<1> *, double)
Target Features: gfx90a-insts
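Atomically replaces the double at the given global address with the maximum of that value and the second argument, and returns the value that was stored there before the operation.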
__builtin_amdgcn_global_atomic_fmin_f64¶
Prototype:
double __builtin_amdgcn_global_atomic_fmin_f64(double address_space<1> *, double)
Target Features: gfx90a-insts
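Atomically replaces the double at the given global address with the minimum of that value and the second argument, and returns the value that was stored there before the operation.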
__builtin_amdgcn_global_load_async_lds¶
Prototype:
void __builtin_amdgcn_global_load_async_lds(void address_space<1> *, void address_space<3> *, _Constant unsigned int, _Constant int, _Constant unsigned int)
Target Features: vmem-to-lds-load-insts
No documentation.
__builtin_amdgcn_global_load_async_to_lds_b128¶
Prototype:
void __builtin_amdgcn_global_load_async_to_lds_b128(_ExtVector<4, int> address_space<1> *, _ExtVector<4, int> address_space<3> *, _Constant int, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_global_load_async_to_lds_b32¶
Prototype:
void __builtin_amdgcn_global_load_async_to_lds_b32(int address_space<1> *, int address_space<3> *, _Constant int, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_global_load_async_to_lds_b64¶
Prototype:
void __builtin_amdgcn_global_load_async_to_lds_b64(_ExtVector<2, int> address_space<1> *, _ExtVector<2, int> address_space<3> *, _Constant int, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_global_load_async_to_lds_b8¶
Prototype:
void __builtin_amdgcn_global_load_async_to_lds_b8(char address_space<1> *, char address_space<3> *, _Constant int, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_global_load_lds¶
Prototype:
void __builtin_amdgcn_global_load_lds(void address_space<1> *, void address_space<3> *, _Constant unsigned int, _Constant int, _Constant unsigned int)
Target Features: vmem-to-lds-load-insts
No documentation.
__builtin_amdgcn_global_load_monitor_b128¶
Prototype:
_ExtVector<4, int> __builtin_amdgcn_global_load_monitor_b128(_ExtVector<4, int> address_space<1> *, _Constant int, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_global_load_monitor_b32¶
Prototype:
int __builtin_amdgcn_global_load_monitor_b32(int address_space<1> *, _Constant int, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_global_load_monitor_b64¶
Prototype:
_ExtVector<2, int> __builtin_amdgcn_global_load_monitor_b64(_ExtVector<2, int> address_space<1> *, _Constant int, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_global_load_tr16_b128_v8bf16¶
Prototype:
_ExtVector<8, __bf16> __builtin_amdgcn_global_load_tr16_b128_v8bf16(_ExtVector<8, __bf16> address_space<1> *)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_global_load_tr16_b128_v8f16¶
Prototype:
_ExtVector<8, __fp16> __builtin_amdgcn_global_load_tr16_b128_v8f16(_ExtVector<8, __fp16> address_space<1> *)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_global_load_tr16_b128_v8i16¶
Prototype:
_ExtVector<8, short> __builtin_amdgcn_global_load_tr16_b128_v8i16(_ExtVector<8, short> address_space<1> *)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_global_load_tr4_b64_v2i32¶
Prototype:
_ExtVector<2, int> __builtin_amdgcn_global_load_tr4_b64_v2i32(_ExtVector<2, int> address_space<1> *)
Target Features: transpose-load-f4f6-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_global_load_tr6_b96_v3i32¶
Prototype:
_ExtVector<3, int> __builtin_amdgcn_global_load_tr6_b96_v3i32(_ExtVector<3, int> address_space<1> *)
Target Features: transpose-load-f4f6-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_global_load_tr8_b64_v2i32¶
Prototype:
_ExtVector<2, int> __builtin_amdgcn_global_load_tr8_b64_v2i32(_ExtVector<2, int> address_space<1> *)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_global_load_tr_b128_v4bf16¶
Prototype:
_ExtVector<4, __bf16> __builtin_amdgcn_global_load_tr_b128_v4bf16(_ExtVector<4, __bf16> address_space<1> *)
Target Features: gfx12-insts,wavefrontsize64
No documentation.
__builtin_amdgcn_global_load_tr_b128_v4f16¶
Prototype:
_ExtVector<4, __fp16> __builtin_amdgcn_global_load_tr_b128_v4f16(_ExtVector<4, __fp16> address_space<1> *)
Target Features: gfx12-insts,wavefrontsize64
No documentation.
__builtin_amdgcn_global_load_tr_b128_v4i16¶
Prototype:
_ExtVector<4, short> __builtin_amdgcn_global_load_tr_b128_v4i16(_ExtVector<4, short> address_space<1> *)
Target Features: gfx12-insts,wavefrontsize64
No documentation.
__builtin_amdgcn_global_load_tr_b128_v8bf16¶
Prototype:
_ExtVector<8, __bf16> __builtin_amdgcn_global_load_tr_b128_v8bf16(_ExtVector<8, __bf16> address_space<1> *)
Target Features: gfx12-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_global_load_tr_b128_v8f16¶
Prototype:
_ExtVector<8, __fp16> __builtin_amdgcn_global_load_tr_b128_v8f16(_ExtVector<8, __fp16> address_space<1> *)
Target Features: gfx12-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_global_load_tr_b128_v8i16¶
Prototype:
_ExtVector<8, short> __builtin_amdgcn_global_load_tr_b128_v8i16(_ExtVector<8, short> address_space<1> *)
Target Features: gfx12-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_global_load_tr_b64_i32¶
Prototype:
int __builtin_amdgcn_global_load_tr_b64_i32(int address_space<1> *)
Target Features: gfx12-insts,wavefrontsize64
No documentation.
__builtin_amdgcn_global_load_tr_b64_v2i32¶
Prototype:
_ExtVector<2, int> __builtin_amdgcn_global_load_tr_b64_v2i32(_ExtVector<2, int> address_space<1> *)
Target Features: gfx12-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_global_prefetch¶
Prototype:
void __builtin_amdgcn_global_prefetch(void const address_space<1> *, _Constant int)
Target Features: vmem-pref-insts
No documentation.
__builtin_amdgcn_global_store_async_from_lds_b128¶
Prototype:
void __builtin_amdgcn_global_store_async_from_lds_b128(_ExtVector<4, int> address_space<1> *, _ExtVector<4, int> address_space<3> *, _Constant int, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_global_store_async_from_lds_b32¶
Prototype:
void __builtin_amdgcn_global_store_async_from_lds_b32(int address_space<1> *, int address_space<3> *, _Constant int, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_global_store_async_from_lds_b64¶
Prototype:
void __builtin_amdgcn_global_store_async_from_lds_b64(_ExtVector<2, int> address_space<1> *, _ExtVector<2, int> address_space<3> *, _Constant int, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_global_store_async_from_lds_b8¶
Prototype:
void __builtin_amdgcn_global_store_async_from_lds_b8(char address_space<1> *, char address_space<3> *, _Constant int, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_groupstaticsize¶
Prototype:
unsigned int __builtin_amdgcn_groupstaticsize()
Returns the size, in bytes, of the LDS (group segment) memory that is statically allocated by the kernel.
__builtin_amdgcn_iglp_opt¶
Prototype:
void __builtin_amdgcn_iglp_opt(_Constant int)
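Scheduling hint that asks the backend to apply an instruction-group-level parallelism (IGLP) optimization strategy to the enclosing scheduling region. The argument selects the strategy and must be a compile-time constant.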
__builtin_amdgcn_image_bvh8_intersect_ray¶
Prototype:
_ExtVector<10, unsigned int> __builtin_amdgcn_image_bvh8_intersect_ray(uint64_t, float, unsigned char, _ExtVector<3, float>, _ExtVector<3, float>, unsigned int, _ExtVector<4, unsigned int>, _ExtVector<3, float> *, _ExtVector<3, float> *)
Target Features: gfx12-insts
No documentation.
__builtin_amdgcn_image_bvh_dual_intersect_ray¶
Prototype:
_ExtVector<10, unsigned int> __builtin_amdgcn_image_bvh_dual_intersect_ray(uint64_t, float, unsigned char, _ExtVector<3, float>, _ExtVector<3, float>, _ExtVector<2, unsigned int>, _ExtVector<4, unsigned int>, _ExtVector<3, float> *, _ExtVector<3, float> *)
Target Features: gfx12-insts
No documentation.
__builtin_amdgcn_image_bvh_intersect_ray¶
Prototype:
_ExtVector<4, unsigned int> __builtin_amdgcn_image_bvh_intersect_ray(unsigned int, float, _ExtVector<4, float>, _ExtVector<4, float>, _ExtVector<4, float>, _ExtVector<4, unsigned int>)
Target Features: gfx10-insts
No documentation.
__builtin_amdgcn_image_bvh_intersect_ray_h¶
Prototype:
_ExtVector<4, unsigned int> __builtin_amdgcn_image_bvh_intersect_ray_h(unsigned int, float, _ExtVector<4, float>, _ExtVector<4, _Float16>, _ExtVector<4, _Float16>, _ExtVector<4, unsigned int>)
Target Features: gfx10-insts
No documentation.
__builtin_amdgcn_image_bvh_intersect_ray_l¶
Prototype:
_ExtVector<4, unsigned int> __builtin_amdgcn_image_bvh_intersect_ray_l(uint64_t, float, _ExtVector<4, float>, _ExtVector<4, float>, _ExtVector<4, float>, _ExtVector<4, unsigned int>)
Target Features: gfx10-insts
No documentation.
__builtin_amdgcn_image_bvh_intersect_ray_lh¶
Prototype:
_ExtVector<4, unsigned int> __builtin_amdgcn_image_bvh_intersect_ray_lh(uint64_t, float, _ExtVector<4, float>, _ExtVector<4, _Float16>, _ExtVector<4, _Float16>, _ExtVector<4, unsigned int>)
Target Features: gfx10-insts
No documentation.
__builtin_amdgcn_image_gather4_lz_2d_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_gather4_lz_2d_v4f32_f32(int, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_load_1d_v4f16_i32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_load_1d_v4f16_i32(int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_1d_v4f32_i32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_load_1d_v4f32_i32(int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_1darray_v4f16_i32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_load_1darray_v4f16_i32(int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_1darray_v4f32_i32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_load_1darray_v4f32_i32(int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_2d_f32_i32¶
Prototype:
float __builtin_amdgcn_image_load_2d_f32_i32(int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_2d_v4f16_i32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_load_2d_v4f16_i32(int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_2d_v4f32_i32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_load_2d_v4f32_i32(int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_2darray_f32_i32¶
Prototype:
float __builtin_amdgcn_image_load_2darray_f32_i32(int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_2darray_v4f16_i32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_load_2darray_v4f16_i32(int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_2darray_v4f32_i32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_load_2darray_v4f32_i32(int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_3d_v4f16_i32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_load_3d_v4f16_i32(int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_3d_v4f32_i32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_load_3d_v4f32_i32(int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_cube_v4f16_i32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_load_cube_v4f16_i32(int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_cube_v4f32_i32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_load_cube_v4f32_i32(int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_mip_1d_v4f16_i32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_load_mip_1d_v4f16_i32(int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_mip_1d_v4f32_i32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_load_mip_1d_v4f32_i32(int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_mip_1darray_v4f16_i32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_load_mip_1darray_v4f16_i32(int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_mip_1darray_v4f32_i32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_load_mip_1darray_v4f32_i32(int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_mip_2d_f32_i32¶
Prototype:
float __builtin_amdgcn_image_load_mip_2d_f32_i32(int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_mip_2d_v4f16_i32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_load_mip_2d_v4f16_i32(int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_mip_2d_v4f32_i32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_load_mip_2d_v4f32_i32(int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_mip_2darray_f32_i32¶
Prototype:
float __builtin_amdgcn_image_load_mip_2darray_f32_i32(int, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_mip_2darray_v4f16_i32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_load_mip_2darray_v4f16_i32(int, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_mip_2darray_v4f32_i32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_load_mip_2darray_v4f32_i32(int, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_mip_3d_v4f16_i32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_load_mip_3d_v4f16_i32(int, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_mip_3d_v4f32_i32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_load_mip_3d_v4f32_i32(int, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_mip_cube_v4f16_i32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_load_mip_cube_v4f16_i32(int, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_load_mip_cube_v4f32_i32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_load_mip_cube_v4f32_i32(int, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_sample_1d_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_1d_v4f16_f32(int, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_sample_1d_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_1d_v4f32_f32(int, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_sample_1darray_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_1darray_v4f16_f32(int, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_sample_1darray_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_1darray_v4f32_f32(int, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_sample_2d_f32_f32¶
Prototype:
float __builtin_amdgcn_image_sample_2d_f32_f32(int, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_sample_2d_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_2d_v4f16_f32(int, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_sample_2d_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_2d_v4f32_f32(int, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_sample_2darray_f32_f32¶
Prototype:
float __builtin_amdgcn_image_sample_2darray_f32_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_sample_2darray_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_2darray_v4f16_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_sample_2darray_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_2darray_v4f32_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_sample_3d_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_3d_v4f16_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_sample_3d_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_3d_v4f32_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_sample_cube_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_cube_v4f16_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_sample_cube_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_cube_v4f32_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_sample_d_1d_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_d_1d_v4f16_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_d_1d_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_d_1d_v4f32_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_d_1darray_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_d_1darray_v4f16_f32(int, float, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_d_1darray_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_d_1darray_v4f32_f32(int, float, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_d_2d_f32_f32¶
Prototype:
float __builtin_amdgcn_image_sample_d_2d_f32_f32(int, float, float, float, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_d_2d_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_d_2d_v4f16_f32(int, float, float, float, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_d_2d_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_d_2d_v4f32_f32(int, float, float, float, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_d_2darray_f32_f32¶
Prototype:
float __builtin_amdgcn_image_sample_d_2darray_f32_f32(int, float, float, float, float, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_d_2darray_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_d_2darray_v4f16_f32(int, float, float, float, float, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_d_2darray_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_d_2darray_v4f32_f32(int, float, float, float, float, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_d_3d_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_d_3d_v4f16_f32(int, float, float, float, float, float, float, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_d_3d_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_d_3d_v4f32_f32(int, float, float, float, float, float, float, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_l_1d_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_l_1d_v4f16_f32(int, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_l_1d_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_l_1d_v4f32_f32(int, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_l_1darray_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_l_1darray_v4f16_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_l_1darray_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_l_1darray_v4f32_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_l_2d_f32_f32¶
Prototype:
float __builtin_amdgcn_image_sample_l_2d_f32_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_l_2d_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_l_2d_v4f16_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_l_2d_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_l_2d_v4f32_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_l_2darray_f32_f32¶
Prototype:
float __builtin_amdgcn_image_sample_l_2darray_f32_f32(int, float, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_l_2darray_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_l_2darray_v4f16_f32(int, float, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_l_2darray_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_l_2darray_v4f32_f32(int, float, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_l_3d_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_l_3d_v4f16_f32(int, float, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_l_3d_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_l_3d_v4f32_f32(int, float, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_l_cube_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_l_cube_v4f16_f32(int, float, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_l_cube_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_l_cube_v4f32_f32(int, float, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_lz_1d_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_lz_1d_v4f16_f32(int, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_lz_1d_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_lz_1d_v4f32_f32(int, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_lz_1darray_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_lz_1darray_v4f16_f32(int, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_lz_1darray_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_lz_1darray_v4f32_f32(int, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_lz_2d_f32_f32¶
Prototype:
float __builtin_amdgcn_image_sample_lz_2d_f32_f32(int, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_lz_2d_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_lz_2d_v4f16_f32(int, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_lz_2d_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_lz_2d_v4f32_f32(int, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_lz_2darray_f32_f32¶
Prototype:
float __builtin_amdgcn_image_sample_lz_2darray_f32_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_lz_2darray_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_lz_2darray_v4f16_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_lz_2darray_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_lz_2darray_v4f32_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_lz_3d_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_lz_3d_v4f16_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_lz_3d_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_lz_3d_v4f32_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_lz_cube_v4f16_f32¶
Prototype:
_ExtVector<4, _Float16> __builtin_amdgcn_image_sample_lz_cube_v4f16_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_sample_lz_cube_v4f32_f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_image_sample_lz_cube_v4f32_f32(int, float, float, float, __amdgpu_texture_t, _ExtVector<4, int>, bool, int, int)
Target Features: extended-image-insts
No documentation.
__builtin_amdgcn_image_store_1d_v4f16_i32¶
Prototype:
void __builtin_amdgcn_image_store_1d_v4f16_i32(_ExtVector<4, _Float16>, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_1d_v4f32_i32¶
Prototype:
void __builtin_amdgcn_image_store_1d_v4f32_i32(_ExtVector<4, float>, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
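Stores four float components to a 1D image. The arguments are the data to store, the dmask (component enable mask), the s coordinate, the texture resource, and the texfailctrl and cache-policy values.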
__builtin_amdgcn_image_store_1darray_v4f16_i32¶
Prototype:
void __builtin_amdgcn_image_store_1darray_v4f16_i32(_ExtVector<4, _Float16>, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_1darray_v4f32_i32¶
Prototype:
void __builtin_amdgcn_image_store_1darray_v4f32_i32(_ExtVector<4, float>, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_2d_f32_i32¶
Prototype:
void __builtin_amdgcn_image_store_2d_f32_i32(float, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_2d_v4f16_i32¶
Prototype:
void __builtin_amdgcn_image_store_2d_v4f16_i32(_ExtVector<4, _Float16>, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_2d_v4f32_i32¶
Prototype:
void __builtin_amdgcn_image_store_2d_v4f32_i32(_ExtVector<4, float>, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_2darray_f32_i32¶
Prototype:
void __builtin_amdgcn_image_store_2darray_f32_i32(float, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_2darray_v4f16_i32¶
Prototype:
void __builtin_amdgcn_image_store_2darray_v4f16_i32(_ExtVector<4, _Float16>, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_2darray_v4f32_i32¶
Prototype:
void __builtin_amdgcn_image_store_2darray_v4f32_i32(_ExtVector<4, float>, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_3d_v4f16_i32¶
Prototype:
void __builtin_amdgcn_image_store_3d_v4f16_i32(_ExtVector<4, _Float16>, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_3d_v4f32_i32¶
Prototype:
void __builtin_amdgcn_image_store_3d_v4f32_i32(_ExtVector<4, float>, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_cube_v4f16_i32¶
Prototype:
void __builtin_amdgcn_image_store_cube_v4f16_i32(_ExtVector<4, _Float16>, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_cube_v4f32_i32¶
Prototype:
void __builtin_amdgcn_image_store_cube_v4f32_i32(_ExtVector<4, float>, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_mip_1d_v4f16_i32¶
Prototype:
void __builtin_amdgcn_image_store_mip_1d_v4f16_i32(_ExtVector<4, _Float16>, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_mip_1d_v4f32_i32¶
Prototype:
void __builtin_amdgcn_image_store_mip_1d_v4f32_i32(_ExtVector<4, float>, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
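Stores four float components to a chosen mip level of a 1D image. The arguments are the data to store, the dmask (component enable mask), the s coordinate, the mip level, the texture resource, and the texfailctrl and cache-policy values.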
__builtin_amdgcn_image_store_mip_1darray_v4f16_i32¶
Prototype:
void __builtin_amdgcn_image_store_mip_1darray_v4f16_i32(_ExtVector<4, _Float16>, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_mip_1darray_v4f32_i32¶
Prototype:
void __builtin_amdgcn_image_store_mip_1darray_v4f32_i32(_ExtVector<4, float>, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_mip_2d_f32_i32¶
Prototype:
void __builtin_amdgcn_image_store_mip_2d_f32_i32(float, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_mip_2d_v4f16_i32¶
Prototype:
void __builtin_amdgcn_image_store_mip_2d_v4f16_i32(_ExtVector<4, _Float16>, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_mip_2d_v4f32_i32¶
Prototype:
void __builtin_amdgcn_image_store_mip_2d_v4f32_i32(_ExtVector<4, float>, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_mip_2darray_f32_i32¶
Prototype:
void __builtin_amdgcn_image_store_mip_2darray_f32_i32(float, int, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_mip_2darray_v4f16_i32¶
Prototype:
void __builtin_amdgcn_image_store_mip_2darray_v4f16_i32(_ExtVector<4, _Float16>, int, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_mip_2darray_v4f32_i32¶
Prototype:
void __builtin_amdgcn_image_store_mip_2darray_v4f32_i32(_ExtVector<4, float>, int, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_mip_3d_v4f16_i32¶
Prototype:
void __builtin_amdgcn_image_store_mip_3d_v4f16_i32(_ExtVector<4, _Float16>, int, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_mip_3d_v4f32_i32¶
Prototype:
void __builtin_amdgcn_image_store_mip_3d_v4f32_i32(_ExtVector<4, float>, int, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_mip_cube_v4f16_i32¶
Prototype:
void __builtin_amdgcn_image_store_mip_cube_v4f16_i32(_ExtVector<4, _Float16>, int, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_image_store_mip_cube_v4f32_i32¶
Prototype:
void __builtin_amdgcn_image_store_mip_cube_v4f32_i32(_ExtVector<4, float>, int, int, int, int, int, __amdgpu_texture_t, int, int)
Target Features: image-insts
No documentation.
__builtin_amdgcn_interp_mov¶
Prototype:
float __builtin_amdgcn_interp_mov(unsigned int, unsigned int, unsigned int, unsigned int)
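Copies a vertex parameter value without interpolating it (v_interp_mov_f32). The arguments are the parameter selector, the attribute channel, the attribute index, and the M0 value.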
__builtin_amdgcn_interp_p1¶
Prototype:
float __builtin_amdgcn_interp_p1(float, unsigned int, unsigned int, unsigned int)
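Performs the first stage of parameter interpolation (v_interp_p1_f32). The arguments are the I barycentric coordinate, the attribute channel, the attribute index, and the M0 value.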
__builtin_amdgcn_interp_p1_f16¶
Prototype:
float __builtin_amdgcn_interp_p1_f16(float, unsigned int, unsigned int, bool, unsigned int)
No documentation.
__builtin_amdgcn_interp_p2¶
Prototype:
float __builtin_amdgcn_interp_p2(float, float, unsigned int, unsigned int, unsigned int)
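Performs the second stage of parameter interpolation (v_interp_p2_f32). The arguments are the result of the first stage, the J barycentric coordinate, the attribute channel, the attribute index, and the M0 value.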
__builtin_amdgcn_interp_p2_f16¶
Prototype:
__fp16 __builtin_amdgcn_interp_p2_f16(float, float, unsigned int, unsigned int, bool, unsigned int)
No documentation.
__builtin_amdgcn_inverse_ballot_w32¶
Prototype:
bool __builtin_amdgcn_inverse_ballot_w32(uint32_t)
Target Features: wavefrontsize32
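Returns, in each lane, the bit of the 32-bit mask that corresponds to that lane. This is the inverse of a ballot; the mask is expected to be uniform across the wavefront.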
__builtin_amdgcn_inverse_ballot_w64¶
Prototype:
bool __builtin_amdgcn_inverse_ballot_w64(uint64_t)
Target Features: wavefrontsize64
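Returns, in each lane, the bit of the 64-bit mask that corresponds to that lane. This is the inverse of a ballot; the mask is expected to be uniform across the wavefront.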
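As a rough host-side model of the inverse-ballot builtins, assuming lane i simply observes bit i of the mask: the function names below are hypothetical and are not part of Clang, and the sketch runs on the host rather than on a GPU.

```cpp
#include <cstdint>

// Host-side model, not the builtin itself: the value that lane `lane`
// would observe for a wave-uniform `mask`. `lane` must be in [0, 31].
constexpr bool inverse_ballot_w32_model(std::uint32_t mask, unsigned lane) {
  return ((mask >> lane) & 1u) != 0u;
}

// Same model for a 64-lane wavefront. `lane` must be in [0, 63].
constexpr bool inverse_ballot_w64_model(std::uint64_t mask, unsigned lane) {
  return ((mask >> lane) & 1u) != 0u;
}
```

With a mask of 0x5, the model returns true for lanes 0 and 2 and false for every other lane.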
__builtin_amdgcn_is_private¶
Prototype:
bool __builtin_amdgcn_is_private(void const address_space<0> *)
Returns true if the given generic (flat) pointer refers to the private (scratch) address space.
__builtin_amdgcn_ldexp¶
Prototype:
double __builtin_amdgcn_ldexp(double, int)
Returns the double argument multiplied by 2 raised to the power of the int argument.
__builtin_amdgcn_ldexpf¶
Prototype:
float __builtin_amdgcn_ldexpf(float, int)
Returns the float argument multiplied by 2 raised to the power of the int argument.
__builtin_amdgcn_ldexph¶
Prototype:
_Float16 __builtin_amdgcn_ldexph(_Float16, int)
Target Features: 16-bit-insts
Returns the _Float16 argument multiplied by 2 raised to the power of the int argument.
__builtin_amdgcn_lerp¶
Prototype:
unsigned int __builtin_amdgcn_lerp(unsigned int, unsigned int, unsigned int)
Target Features: lerp-inst
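Computes a rounded average of each pair of bytes in the first two arguments (v_lerp_u8). Each result byte is (a + b + r) >> 1, where r is the least significant bit of the corresponding byte of the third argument.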
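The per-byte arithmetic of v_lerp_u8 can be sketched on the host as follows. This assumes the builtin maps directly onto that instruction; `lerp_u8_model` is a hypothetical name used only for illustration.

```cpp
#include <cstdint>

// Host-side model of the per-byte rounded average. For each of the four
// bytes, the result is (a + b + r) >> 1, where r is bit 0 of the matching
// byte of `c`.
constexpr std::uint32_t lerp_u8_model(std::uint32_t a, std::uint32_t b,
                                      std::uint32_t c) {
  std::uint32_t result = 0;
  for (unsigned i = 0; i < 4; ++i) {
    const unsigned shift = 8u * i;
    const std::uint32_t x = (a >> shift) & 0xffu;
    const std::uint32_t y = (b >> shift) & 0xffu;
    const std::uint32_t round = (c >> shift) & 1u;
    // The sum is at most 255 + 255 + 1, so the halved value fits in a byte.
    result |= ((x + y + round) >> 1) << shift;
  }
  return result;
}
```

The third argument only selects the rounding direction: averaging 0xFF with 0x00 gives 0x7F when the rounding bit is clear and 0x80 when it is set.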
__builtin_amdgcn_load_async_to_lds¶
Prototype:
void __builtin_amdgcn_load_async_to_lds(void *, void address_space<3> *, _Constant unsigned int, _Constant int, _Constant unsigned int)
Target Features: vmem-to-lds-load-insts
No documentation.
__builtin_amdgcn_load_to_lds¶
Prototype:
void __builtin_amdgcn_load_to_lds(void *, void address_space<3> *, _Constant unsigned int, _Constant int, _Constant unsigned int)
Target Features: vmem-to-lds-load-insts
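Copies data from the memory addressed by the first argument directly into LDS at the address given by the second argument. The three constant arguments are the transfer size in bytes, an immediate offset, and the auxiliary cache-policy bits.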
__builtin_amdgcn_log_bf16¶
Prototype:
__bf16 __builtin_amdgcn_log_bf16(__bf16)
Target Features: bf16-trans-insts
Returns a hardware approximation of the base-2 logarithm of the __bf16 argument.
__builtin_amdgcn_log_clampf¶
Prototype:
float __builtin_amdgcn_log_clampf(float)
No documentation.
__builtin_amdgcn_logf¶
Prototype:
float __builtin_amdgcn_logf(float)
Returns a hardware approximation of the base-2 logarithm of the float argument (v_log_f32).
__builtin_amdgcn_make_buffer_rsrc¶
Prototype:
__amdgpu_buffer_rsrc_t __builtin_amdgcn_make_buffer_rsrc(void *, short, int64_t, int)
Builds a buffer resource descriptor from a base pointer, a stride, a number of records, and a flags word.
__builtin_amdgcn_mbcnt_hi¶
Prototype:
unsigned int __builtin_amdgcn_mbcnt_hi(unsigned int, unsigned int)
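Counts the set bits of the first argument that correspond to lanes 32 to 63 below the current lane, and adds the second argument to the count (v_mbcnt_hi_u32_b32).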
__builtin_amdgcn_mbcnt_lo¶
Prototype:
unsigned int __builtin_amdgcn_mbcnt_lo(unsigned int, unsigned int)
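Counts the set bits of the first argument that correspond to lanes 0 to 31 below the current lane, and adds the second argument to the count (v_mbcnt_lo_u32_b32).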
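The two mbcnt builtins are commonly chained to obtain a lane index or a rank among selected lanes. The host-side sketch below models that chaining under the assumption that each builtin counts mask bits belonging to lanes below the current one; the explicit `lane` parameter and all function names are hypothetical, since on hardware the lane is implicit.

```cpp
#include <cstdint>

// Number of set bits in a 32-bit value.
constexpr unsigned popcount32(std::uint32_t v) {
  unsigned n = 0;
  while (v != 0u) {
    v &= v - 1u;  // clear the lowest set bit
    ++n;
  }
  return n;
}

// Model of mbcnt_lo: counts bits of `mask` owned by lanes 0..31 that are
// below `lane`, then adds `base`. `lane` must be in [0, 63].
constexpr std::uint32_t mbcnt_lo_model(std::uint32_t mask, std::uint32_t base,
                                       unsigned lane) {
  const std::uint32_t below = lane >= 32u ? 0xffffffffu : ((1u << lane) - 1u);
  return base + popcount32(mask & below);
}

// Model of mbcnt_hi: the same count for the bits owned by lanes 32..63.
constexpr std::uint32_t mbcnt_hi_model(std::uint32_t mask, std::uint32_t base,
                                       unsigned lane) {
  const std::uint32_t below = lane <= 32u ? 0u : ((1u << (lane - 32u)) - 1u);
  return base + popcount32(mask & below);
}

// Chaining both halves with an all-ones mask yields the lane index.
constexpr std::uint32_t lane_id_model(unsigned lane) {
  return mbcnt_hi_model(~0u, mbcnt_lo_model(~0u, 0u, lane), lane);
}
```

Passing the two halves of an activity mask instead of all-ones gives each lane its rank among the selected lanes, which is the usual building block for stream compaction.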
__builtin_amdgcn_mfma_f32_16x16x16bf16_1k¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_mfma_f32_16x16x16bf16_1k(_ExtVector<4, short>, _ExtVector<4, short>, _ExtVector<4, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f32_16x16x16f16¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_mfma_f32_16x16x16f16(_ExtVector<4, _Float16>, _ExtVector<4, _Float16>, _ExtVector<4, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
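Performs a matrix fused multiply-add D = A × B + C on a 16×16 result tile with an inner dimension of 16, taking _Float16 inputs and accumulating in float. The matrix elements are distributed across the lanes of the wavefront. The three trailing constants are the cbsz, abid, and blgp modifiers.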
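A host-side reference can be useful when checking MFMA results. The sketch below assumes the builtin computes D = A × B + C on a single tile; `Tile`, `filled`, and `mfma_reference` are hypothetical names. It models neither the per-lane register layout nor the hardware's rounding behavior, and it uses float for the inputs where the instruction takes _Float16.

```cpp
// Hypothetical host-side tile; the hardware instead distributes these
// elements across the lanes of a wavefront.
template <int Rows, int Cols>
struct Tile {
  float v[Rows][Cols];
};

// Builds a tile with every element set to `value`.
template <int Rows, int Cols>
constexpr Tile<Rows, Cols> filled(float value) {
  Tile<Rows, Cols> t{};
  for (int r = 0; r < Rows; ++r)
    for (int c = 0; c < Cols; ++c)
      t.v[r][c] = value;
  return t;
}

// Reference D = A * B + C for an M x N result with inner dimension K.
template <int M, int N, int K>
constexpr Tile<M, N> mfma_reference(const Tile<M, K>& a, const Tile<K, N>& b,
                                    const Tile<M, N>& c) {
  Tile<M, N> d{};
  for (int i = 0; i < M; ++i) {
    for (int j = 0; j < N; ++j) {
      float acc = c.v[i][j];  // start from the accumulator input
      for (int k = 0; k < K; ++k)
        acc += a.v[i][k] * b.v[k][j];
      d.v[i][j] = acc;
    }
  }
  return d;
}
```

For the 16x16x16 shape, `mfma_reference(filled<16, 16>(1.0f), filled<16, 16>(2.0f), filled<16, 16>(0.5f))` produces a tile whose every element is 16 × 2 + 0.5.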
__builtin_amdgcn_mfma_f32_16x16x1f32¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_mfma_f32_16x16x1f32(float, float, _ExtVector<16, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f32_16x16x2bf16¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_mfma_f32_16x16x2bf16(_ExtVector<2, short>, _ExtVector<2, short>, _ExtVector<16, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f32_16x16x32_bf16¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_mfma_f32_16x16x32_bf16(_ExtVector<8, __bf16>, _ExtVector<8, __bf16>, _ExtVector<4, float>, _Constant int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_mfma_f32_16x16x32_bf8_bf8¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_mfma_f32_16x16x32_bf8_bf8(int64_t, int64_t, _ExtVector<4, float>, _Constant int, _Constant int, _Constant int)
Target Features: fp8-insts
No documentation.
__builtin_amdgcn_mfma_f32_16x16x32_bf8_fp8¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_mfma_f32_16x16x32_bf8_fp8(int64_t, int64_t, _ExtVector<4, float>, _Constant int, _Constant int, _Constant int)
Target Features: fp8-insts
No documentation.
__builtin_amdgcn_mfma_f32_16x16x32_f16¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_mfma_f32_16x16x32_f16(_ExtVector<8, _Float16>, _ExtVector<8, _Float16>, _ExtVector<4, float>, _Constant int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_mfma_f32_16x16x32_fp8_bf8¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_mfma_f32_16x16x32_fp8_bf8(int64_t, int64_t, _ExtVector<4, float>, _Constant int, _Constant int, _Constant int)
Target Features: fp8-insts
No documentation.
__builtin_amdgcn_mfma_f32_16x16x32_fp8_fp8¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_mfma_f32_16x16x32_fp8_fp8(int64_t, int64_t, _ExtVector<4, float>, _Constant int, _Constant int, _Constant int)
Target Features: fp8-insts
No documentation.
__builtin_amdgcn_mfma_f32_16x16x4bf16_1k¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_mfma_f32_16x16x4bf16_1k(_ExtVector<4, short>, _ExtVector<4, short>, _ExtVector<16, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f32_16x16x4f16¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_mfma_f32_16x16x4f16(_ExtVector<4, _Float16>, _ExtVector<4, _Float16>, _ExtVector<16, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f32_16x16x4f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_mfma_f32_16x16x4f32(float, float, _ExtVector<4, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
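Performs a matrix fused multiply-add D = A × B + C on a 16×16 result tile with an inner dimension of 4, using float inputs and a float accumulator. The matrix elements are distributed across the lanes of the wavefront. The three trailing constants are the cbsz, abid, and blgp modifiers.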
__builtin_amdgcn_mfma_f32_16x16x8_xf32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_mfma_f32_16x16x8_xf32(_ExtVector<2, float>, _ExtVector<2, float>, _ExtVector<4, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f32_16x16x8bf16¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_mfma_f32_16x16x8bf16(_ExtVector<2, short>, _ExtVector<2, short>, _ExtVector<4, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f32_32x32x16_bf16¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_mfma_f32_32x32x16_bf16(_ExtVector<8, __bf16>, _ExtVector<8, __bf16>, _ExtVector<16, float>, _Constant int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_mfma_f32_32x32x16_bf8_bf8¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_mfma_f32_32x32x16_bf8_bf8(int64_t, int64_t, _ExtVector<16, float>, _Constant int, _Constant int, _Constant int)
Target Features: fp8-insts
No documentation.
__builtin_amdgcn_mfma_f32_32x32x16_bf8_fp8¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_mfma_f32_32x32x16_bf8_fp8(int64_t, int64_t, _ExtVector<16, float>, _Constant int, _Constant int, _Constant int)
Target Features: fp8-insts
No documentation.
__builtin_amdgcn_mfma_f32_32x32x16_f16¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_mfma_f32_32x32x16_f16(_ExtVector<8, _Float16>, _ExtVector<8, _Float16>, _ExtVector<16, float>, _Constant int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_mfma_f32_32x32x16_fp8_bf8¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_mfma_f32_32x32x16_fp8_bf8(int64_t, int64_t, _ExtVector<16, float>, _Constant int, _Constant int, _Constant int)
Target Features: fp8-insts
No documentation.
__builtin_amdgcn_mfma_f32_32x32x16_fp8_fp8¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_mfma_f32_32x32x16_fp8_fp8(int64_t, int64_t, _ExtVector<16, float>, _Constant int, _Constant int, _Constant int)
Target Features: fp8-insts
No documentation.
__builtin_amdgcn_mfma_f32_32x32x1f32¶
Prototype:
_ExtVector<32, float> __builtin_amdgcn_mfma_f32_32x32x1f32(float, float, _ExtVector<32, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f32_32x32x2bf16¶
Prototype:
_ExtVector<32, float> __builtin_amdgcn_mfma_f32_32x32x2bf16(_ExtVector<2, short>, _ExtVector<2, short>, _ExtVector<32, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f32_32x32x2f32¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_mfma_f32_32x32x2f32(float, float, _ExtVector<16, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f32_32x32x4_xf32¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_mfma_f32_32x32x4_xf32(_ExtVector<2, float>, _ExtVector<2, float>, _ExtVector<16, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f32_32x32x4bf16¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_mfma_f32_32x32x4bf16(_ExtVector<2, short>, _ExtVector<2, short>, _ExtVector<16, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f32_32x32x4bf16_1k¶
Prototype:
_ExtVector<32, float> __builtin_amdgcn_mfma_f32_32x32x4bf16_1k(_ExtVector<4, short>, _ExtVector<4, short>, _ExtVector<32, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f32_32x32x4f16¶
Prototype:
_ExtVector<32, float> __builtin_amdgcn_mfma_f32_32x32x4f16(_ExtVector<4, _Float16>, _ExtVector<4, _Float16>, _ExtVector<32, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f32_32x32x8bf16_1k¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_mfma_f32_32x32x8bf16_1k(_ExtVector<4, short>, _ExtVector<4, short>, _ExtVector<16, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f32_32x32x8f16¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_mfma_f32_32x32x8f16(_ExtVector<4, _Float16>, _ExtVector<4, _Float16>, _ExtVector<16, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f32_4x4x1f32¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_mfma_f32_4x4x1f32(float, float, _ExtVector<4, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f32_4x4x2bf16¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_mfma_f32_4x4x2bf16(_ExtVector<2, short>, _ExtVector<2, short>, _ExtVector<4, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f32_4x4x4bf16_1k¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_mfma_f32_4x4x4bf16_1k(_ExtVector<4, short>, _ExtVector<4, short>, _ExtVector<4, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f32_4x4x4f16¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_mfma_f32_4x4x4f16(_ExtVector<4, _Float16>, _ExtVector<4, _Float16>, _ExtVector<4, float>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f64_16x16x4f64¶
Prototype:
_ExtVector<4, double> __builtin_amdgcn_mfma_f64_16x16x4f64(double, double, _ExtVector<4, double>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_f64_4x4x4f64¶
Prototype:
double __builtin_amdgcn_mfma_f64_4x4x4f64(double, double, double, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_i32_16x16x16i8¶
Prototype:
_ExtVector<4, int> __builtin_amdgcn_mfma_i32_16x16x16i8(int, int, _ExtVector<4, int>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
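Performs a matrix multiply-accumulate D = A × B + C on a 16×16 result tile with an inner dimension of 16, using 8-bit integer inputs and a 32-bit integer accumulator. Each int input packs four 8-bit elements. The three trailing constants are the cbsz, abid, and blgp modifiers.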
__builtin_amdgcn_mfma_i32_16x16x32_i8¶
Prototype:
_ExtVector<4, int> __builtin_amdgcn_mfma_i32_16x16x32_i8(int64_t, int64_t, _ExtVector<4, int>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_i32_16x16x4i8¶
Prototype:
_ExtVector<16, int> __builtin_amdgcn_mfma_i32_16x16x4i8(int, int, _ExtVector<16, int>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_i32_16x16x64_i8¶
Prototype:
_ExtVector<4, int> __builtin_amdgcn_mfma_i32_16x16x64_i8(_ExtVector<4, int>, _ExtVector<4, int>, _ExtVector<4, int>, _Constant int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_mfma_i32_32x32x16_i8¶
Prototype:
_ExtVector<16, int> __builtin_amdgcn_mfma_i32_32x32x16_i8(int64_t, int64_t, _ExtVector<16, int>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_i32_32x32x32_i8¶
Prototype:
_ExtVector<16, int> __builtin_amdgcn_mfma_i32_32x32x32_i8(_ExtVector<4, int>, _ExtVector<4, int>, _ExtVector<16, int>, _Constant int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_mfma_i32_32x32x4i8¶
Prototype:
_ExtVector<32, int> __builtin_amdgcn_mfma_i32_32x32x4i8(int, int, _ExtVector<32, int>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_i32_32x32x8i8¶
Prototype:
_ExtVector<16, int> __builtin_amdgcn_mfma_i32_32x32x8i8(int, int, _ExtVector<16, int>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_i32_4x4x4i8¶
Prototype:
_ExtVector<4, int> __builtin_amdgcn_mfma_i32_4x4x4i8(int, int, _ExtVector<4, int>, _Constant int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_mfma_scale_f32_16x16x128_f8f6f4¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_mfma_scale_f32_16x16x128_f8f6f4(_ExtVector<8, int32_t>, _ExtVector<8, int32_t>, _ExtVector<4, float>, _Constant int, _Constant int, _Constant int, int, _Constant int, int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_mfma_scale_f32_32x32x64_f8f6f4¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_mfma_scale_f32_32x32x64_f8f6f4(_ExtVector<8, int32_t>, _ExtVector<8, int32_t>, _ExtVector<16, float>, _Constant int, _Constant int, _Constant int, int, _Constant int, int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_mov_dpp¶
Prototype:
int __builtin_amdgcn_mov_dpp(int, _Constant int, _Constant int, _Constant int, _Constant bool)
Target Features: dpp
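Moves a value between lanes using a data-parallel primitive (DPP) control. The arguments are the source value followed by the dpp_ctrl, row_mask, bank_mask, and bound_ctrl operands, all four of which must be compile-time constants.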
__builtin_amdgcn_mov_dpp8¶
Prototype:
unsigned int __builtin_amdgcn_mov_dpp8(unsigned int, _Constant unsigned int)
Target Features: gfx10-insts
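Permutes values within each group of eight lanes (DPP8). The constant second argument packs eight 3-bit lane indices, one per lane of the group, starting with lane 0 in the lowest bits.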
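The DPP8 selector encoding can be sketched with a host-side model. The code below assumes a 32-lane wavefront with every lane active; `Wave32`, `lane_ids`, `dpp8_selector`, and `mov_dpp8_model` are hypothetical names introduced only for this illustration.

```cpp
#include <cstdint>

// Hypothetical container holding one 32-bit value per lane of a wave32.
struct Wave32 {
  std::uint32_t lane[32];
};

// Fills each lane with its own index.
constexpr Wave32 lane_ids() {
  Wave32 w{};
  for (unsigned i = 0; i < 32; ++i)
    w.lane[i] = i;
  return w;
}

// Packs eight 3-bit lane indices into a 24-bit selector. s0 is the source
// index for lane 0 of every group of eight lanes, s1 for lane 1, and so on.
constexpr std::uint32_t dpp8_selector(unsigned s0, unsigned s1, unsigned s2,
                                      unsigned s3, unsigned s4, unsigned s5,
                                      unsigned s6, unsigned s7) {
  return (s0 & 7u) | ((s1 & 7u) << 3) | ((s2 & 7u) << 6) | ((s3 & 7u) << 9) |
         ((s4 & 7u) << 12) | ((s5 & 7u) << 15) | ((s6 & 7u) << 18) |
         ((s7 & 7u) << 21);
}

// Each lane reads from the lane of its own group of eight that the
// selector names for its position.
constexpr Wave32 mov_dpp8_model(const Wave32& src, std::uint32_t sel) {
  Wave32 out{};
  for (unsigned i = 0; i < 32; ++i) {
    const unsigned group_base = i & ~7u;
    const unsigned pick = (sel >> (3u * (i & 7u))) & 7u;
    out.lane[i] = src.lane[group_base + pick];
  }
  return out;
}
```

A selector of `dpp8_selector(7, 6, 5, 4, 3, 2, 1, 0)` reverses every group of eight lanes, so lane 0 receives the value of lane 7 and lane 8 receives the value of lane 15.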
__builtin_amdgcn_mqsad_pk_u16_u8¶
Prototype:
uint64_t __builtin_amdgcn_mqsad_pk_u16_u8(uint64_t, unsigned int, uint64_t)
No documentation.
__builtin_amdgcn_mqsad_u32_u8¶
Prototype:
_ExtVector<4, unsigned int> __builtin_amdgcn_mqsad_u32_u8(uint64_t, unsigned int, _ExtVector<4, unsigned int>)
No documentation.
__builtin_amdgcn_msad_u8¶
Prototype:
unsigned int __builtin_amdgcn_msad_u8(unsigned int, unsigned int, unsigned int)
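Computes a masked sum of absolute differences (v_msad_u8). For each of the four bytes, the absolute difference between the first and second arguments is added to the accumulator passed as the third argument, except that bytes where the second (reference) argument is zero are skipped.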
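A host-side sketch of the masked sum of absolute differences, assuming the builtin maps directly onto v_msad_u8 and that bytes whose reference value is zero are skipped. `msad_u8_model` is a hypothetical name.

```cpp
#include <cstdint>

// Host-side model: adds |src - ref| for each byte to `acc`, skipping the
// bytes where the reference byte is zero.
constexpr std::uint32_t msad_u8_model(std::uint32_t src, std::uint32_t ref,
                                      std::uint32_t acc) {
  for (unsigned i = 0; i < 4; ++i) {
    const std::uint32_t s = (src >> (8u * i)) & 0xffu;
    const std::uint32_t r = (ref >> (8u * i)) & 0xffu;
    if (r != 0u)
      acc += s > r ? s - r : r - s;
  }
  return acc;
}
```

With `src = 0x0A141E28`, `ref = 0x05000F32`, and an accumulator of 100, the byte differences are 10, 15, and 5, the byte with a zero reference is ignored, and the model returns 130.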
__builtin_amdgcn_perm¶
Prototype:
unsigned int __builtin_amdgcn_perm(unsigned int, unsigned int, unsigned int)
Target Features: gfx8-insts
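Performs a byte permute (v_perm_b32). Each byte of the third argument selects the corresponding result byte, either one byte of the 64-bit value formed from the first argument (high half) and the second argument (low half), a sign-replicated byte, or a constant 0x00 or 0xFF.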
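The selector encoding of v_perm_b32 can be modeled on the host as below. This is a sketch that assumes the builtin forwards its three arguments unchanged to that instruction; `perm_model` is a hypothetical name.

```cpp
#include <cstdint>

// Host-side model of the byte permute. Selector values:
//   0..3   byte 0..3 of src1
//   4..7   byte 0..3 of src0
//   8..11  0xFF if the top bit of byte 1, 3, 5 or 7 of {src0, src1} is
//          set, else 0x00
//   12     0x00
//   13+    0xFF
constexpr std::uint32_t perm_model(std::uint32_t src0, std::uint32_t src1,
                                   std::uint32_t sel) {
  const std::uint64_t in = (std::uint64_t(src0) << 32) | src1;
  std::uint32_t out = 0;
  for (unsigned i = 0; i < 4; ++i) {
    const unsigned s = (sel >> (8u * i)) & 0xffu;
    std::uint32_t byte = 0;
    if (s >= 13u) {
      byte = 0xffu;
    } else if (s == 12u) {
      byte = 0x00u;
    } else if (s >= 8u) {
      const unsigned src_byte = 2u * (s - 8u) + 1u;  // bytes 1, 3, 5, 7
      byte = ((in >> (8u * src_byte + 7u)) & 1u) != 0u ? 0xffu : 0x00u;
    } else {
      byte = std::uint32_t((in >> (8u * s)) & 0xffu);
    }
    out |= byte << (8u * i);
  }
  return out;
}
```

A selector of 0x00010203 reverses the bytes of the second argument, so `perm_model(0, 0x11223344, 0x00010203)` gives 0x44332211.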
__builtin_amdgcn_perm_pk16_b4_u4¶
Prototype:
_ExtVector<2, unsigned int> __builtin_amdgcn_perm_pk16_b4_u4(unsigned int, unsigned int, _ExtVector<2, unsigned int>)
Target Features: tensor-cvt-lut-insts
No documentation.
__builtin_amdgcn_perm_pk16_b6_u4¶
Prototype:
_ExtVector<3, unsigned int> __builtin_amdgcn_perm_pk16_b6_u4(unsigned int, unsigned long int, _ExtVector<2, unsigned int>)
Target Features: tensor-cvt-lut-insts
No documentation.
__builtin_amdgcn_perm_pk16_b8_u4¶
Prototype:
_ExtVector<4, unsigned int> __builtin_amdgcn_perm_pk16_b8_u4(unsigned long int, unsigned long int, _ExtVector<2, unsigned int>)
Target Features: tensor-cvt-lut-insts
No documentation.
__builtin_amdgcn_permlane16¶
Prototype:
unsigned int __builtin_amdgcn_permlane16(unsigned int, unsigned int, unsigned int, unsigned int, _Constant bool, _Constant bool)
Target Features: gfx10-insts
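Permutes values within each row of 16 lanes. The arguments are the old value (kept for lanes that are not written), the source value, the low and high 32 bits of the lane selector (one 4-bit source index per lane of the row), and the constant fi (fetch inactive) and bound_ctrl flags.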
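The lane selector of permlane16 can be sketched with a host-side model. The code assumes a 32-lane wavefront with every lane active, so the old value and the fi and bound_ctrl flags play no role; `Wave32`, `lane_ids`, and `permlane16_model` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical container holding one 32-bit value per lane of a wave32.
struct Wave32 {
  std::uint32_t lane[32];
};

// Fills each lane with its own index.
constexpr Wave32 lane_ids() {
  Wave32 w{};
  for (unsigned i = 0; i < 32; ++i)
    w.lane[i] = i;
  return w;
}

// Each lane reads from the lane of its own row of 16 that is named by its
// 4-bit field of the 64-bit selector {sel_hi, sel_lo}. Lanes 0..7 of a row
// use sel_lo and lanes 8..15 use sel_hi.
constexpr Wave32 permlane16_model(const Wave32& src, std::uint32_t sel_lo,
                                  std::uint32_t sel_hi) {
  const std::uint64_t sel = (std::uint64_t(sel_hi) << 32) | sel_lo;
  Wave32 out{};
  for (unsigned i = 0; i < 32; ++i) {
    const unsigned row_base = i & ~15u;
    const unsigned pick = unsigned((sel >> (4u * (i & 15u))) & 0xfu);
    out.lane[i] = src.lane[row_base + pick];
  }
  return out;
}
```

Selector halves of 0x87654321 and 0x0FEDCBA9 rotate each row by one lane, so lane 0 reads lane 1 and lane 15 wraps around to lane 0; a selector of zero broadcasts the first lane of each row.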
__builtin_amdgcn_permlane16_swap¶
Prototype:
_ExtVector<2, unsigned int> __builtin_amdgcn_permlane16_swap(unsigned int, unsigned int, _Constant bool, _Constant bool)
Target Features: permlane16-swap
No documentation.
__builtin_amdgcn_permlane16_var¶
Prototype:
unsigned int __builtin_amdgcn_permlane16_var(unsigned int, unsigned int, unsigned int, _Constant bool, _Constant bool)
Target Features: gfx12-insts
No documentation.
__builtin_amdgcn_permlane32_swap¶
Prototype:
_ExtVector<2, unsigned int> __builtin_amdgcn_permlane32_swap(unsigned int, unsigned int, _Constant bool, _Constant bool)
Target Features: permlane32-swap
No documentation.
__builtin_amdgcn_permlane64¶
Prototype:
unsigned int __builtin_amdgcn_permlane64(unsigned int)
Target Features: gfx11-insts
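Swaps values between the lower and upper 32 lanes of a 64-lane wavefront, so that lane i reads the argument from lane i ^ 32 (v_permlane64_b32).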
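A host-side sketch of the half-wave swap, assuming a 64-lane wavefront with every lane active. `Wave64`, `lane_ids64`, and `permlane64_model` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical container holding one 32-bit value per lane of a wave64.
struct Wave64 {
  std::uint32_t lane[64];
};

// Fills each lane with its own index.
constexpr Wave64 lane_ids64() {
  Wave64 w{};
  for (unsigned i = 0; i < 64; ++i)
    w.lane[i] = i;
  return w;
}

// Lane i reads the value held by lane i ^ 32, which swaps the two halves.
constexpr Wave64 permlane64_model(const Wave64& src) {
  Wave64 out{};
  for (unsigned i = 0; i < 64; ++i)
    out.lane[i] = src.lane[i ^ 32u];
  return out;
}
```

Applying the model twice returns the original values, which is why the operation is often used as one step of a cross-half reduction.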
__builtin_amdgcn_permlane_bcast¶
Prototype:
int __builtin_amdgcn_permlane_bcast(int, int, int)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_permlane_down¶
Prototype:
int __builtin_amdgcn_permlane_down(int, int, int)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_permlane_idx_gen¶
Prototype:
int __builtin_amdgcn_permlane_idx_gen(int, int)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_permlane_up¶
Prototype:
int __builtin_amdgcn_permlane_up(int, int, int)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_permlane_xor¶
Prototype:
int __builtin_amdgcn_permlane_xor(int, int, int)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_permlanex16¶
Prototype:
unsigned int __builtin_amdgcn_permlanex16(unsigned int, unsigned int, unsigned int, unsigned int, _Constant bool, _Constant bool)
Target Features: gfx10-insts
Like __builtin_amdgcn_permlane16, but each lane reads from the other row of 16 lanes within its group of 32 lanes.
__builtin_amdgcn_permlanex16_var¶
Prototype:
unsigned int __builtin_amdgcn_permlanex16_var(unsigned int, unsigned int, unsigned int, _Constant bool, _Constant bool)
Target Features: gfx12-insts
No documentation.
__builtin_amdgcn_pk_add_max_i16¶
Prototype:
_ExtVector<2, short> __builtin_amdgcn_pk_add_max_i16(_ExtVector<2, short>, _ExtVector<2, short>, _ExtVector<2, short>, _Constant bool)
Target Features: pk-add-min-max-insts
No documentation.
__builtin_amdgcn_pk_add_max_u16¶
Prototype:
_ExtVector<2, unsigned short> __builtin_amdgcn_pk_add_max_u16(_ExtVector<2, unsigned short>, _ExtVector<2, unsigned short>, _ExtVector<2, unsigned short>, _Constant bool)
Target Features: pk-add-min-max-insts
No documentation.
__builtin_amdgcn_pk_add_min_i16¶
Prototype:
_ExtVector<2, short> __builtin_amdgcn_pk_add_min_i16(_ExtVector<2, short>, _ExtVector<2, short>, _ExtVector<2, short>, _Constant bool)
Target Features: pk-add-min-max-insts
No documentation.
__builtin_amdgcn_pk_add_min_u16¶
Prototype:
_ExtVector<2, unsigned short> __builtin_amdgcn_pk_add_min_u16(_ExtVector<2, unsigned short>, _ExtVector<2, unsigned short>, _ExtVector<2, unsigned short>, _Constant bool)
Target Features: pk-add-min-max-insts
No documentation.
__builtin_amdgcn_prng_b32¶
Prototype:
unsigned int __builtin_amdgcn_prng_b32(unsigned int)
Target Features: prng-inst
No documentation.
__builtin_amdgcn_qsad_pk_u16_u8¶
Prototype:
uint64_t __builtin_amdgcn_qsad_pk_u16_u8(uint64_t, unsigned int, uint64_t)
Target Features: qsad-insts
No documentation.
__builtin_amdgcn_raw_buffer_load_b128¶
Prototype:
_ExtVector<4, unsigned int> __builtin_amdgcn_raw_buffer_load_b128(__amdgpu_buffer_rsrc_t, int, int, _Constant int)
No documentation.
__builtin_amdgcn_raw_buffer_load_b16¶
Prototype:
unsigned short __builtin_amdgcn_raw_buffer_load_b16(__amdgpu_buffer_rsrc_t, int, int, _Constant int)
No documentation.
__builtin_amdgcn_raw_buffer_load_b32¶
Prototype:
unsigned int __builtin_amdgcn_raw_buffer_load_b32(__amdgpu_buffer_rsrc_t, int, int, _Constant int)
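Loads 32 bits from the buffer described by the resource argument. The two int arguments are the per-lane byte offset and the scalar byte offset, and the constant final argument holds the auxiliary cache-policy bits.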
__builtin_amdgcn_raw_buffer_load_b64¶
Prototype:
_ExtVector<2, unsigned int> __builtin_amdgcn_raw_buffer_load_b64(__amdgpu_buffer_rsrc_t, int, int, _Constant int)
No documentation.
__builtin_amdgcn_raw_buffer_load_b8¶
Prototype:
unsigned char __builtin_amdgcn_raw_buffer_load_b8(__amdgpu_buffer_rsrc_t, int, int, _Constant int)
No documentation.
__builtin_amdgcn_raw_buffer_load_b96¶
Prototype:
_ExtVector<3, unsigned int> __builtin_amdgcn_raw_buffer_load_b96(__amdgpu_buffer_rsrc_t, int, int, _Constant int)
No documentation.
__builtin_amdgcn_raw_buffer_store_b128¶
Prototype:
void __builtin_amdgcn_raw_buffer_store_b128(_ExtVector<4, unsigned int>, __amdgpu_buffer_rsrc_t, int, int, _Constant int)
No documentation.
__builtin_amdgcn_raw_buffer_store_b16¶
Prototype:
void __builtin_amdgcn_raw_buffer_store_b16(unsigned short, __amdgpu_buffer_rsrc_t, int, int, _Constant int)
No documentation.
__builtin_amdgcn_raw_buffer_store_b32¶
Prototype:
void __builtin_amdgcn_raw_buffer_store_b32(unsigned int, __amdgpu_buffer_rsrc_t, int, int, _Constant int)
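Stores the 32-bit first argument to the buffer described by the resource argument. The two int arguments are the per-lane byte offset and the scalar byte offset, and the constant final argument holds the auxiliary cache-policy bits.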
__builtin_amdgcn_raw_buffer_store_b64¶
Prototype:
void __builtin_amdgcn_raw_buffer_store_b64(_ExtVector<2, unsigned int>, __amdgpu_buffer_rsrc_t, int, int, _Constant int)
No documentation.
__builtin_amdgcn_raw_buffer_store_b8¶
Prototype:
void __builtin_amdgcn_raw_buffer_store_b8(unsigned char, __amdgpu_buffer_rsrc_t, int, int, _Constant int)
No documentation.
__builtin_amdgcn_raw_buffer_store_b96¶
Prototype:
void __builtin_amdgcn_raw_buffer_store_b96(_ExtVector<3, unsigned int>, __amdgpu_buffer_rsrc_t, int, int, _Constant int)
No documentation.
__builtin_amdgcn_raw_ptr_buffer_atomic_add_i32¶
Prototype:
int __builtin_amdgcn_raw_ptr_buffer_atomic_add_i32(int, __amdgpu_buffer_rsrc_t, int, int, _Constant int)
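Atomically adds the first argument to the 32-bit integer at the addressed buffer location and returns the value previously stored there. The remaining arguments are the buffer resource, the per-lane byte offset, the scalar byte offset, and the constant auxiliary cache-policy bits.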
__builtin_amdgcn_raw_ptr_buffer_atomic_fadd_f32¶
Prototype:
float __builtin_amdgcn_raw_ptr_buffer_atomic_fadd_f32(float, __amdgpu_buffer_rsrc_t, int, int, _Constant int)
Target Features: atomic-fadd-rtn-insts
No documentation.
__builtin_amdgcn_raw_ptr_buffer_atomic_fadd_v2f16¶
Prototype:
_ExtVector<2, _Float16> __builtin_amdgcn_raw_ptr_buffer_atomic_fadd_v2f16(_ExtVector<2, _Float16>, __amdgpu_buffer_rsrc_t, int, int, _Constant int)
Target Features: atomic-buffer-global-pk-add-f16-insts
No documentation.
__builtin_amdgcn_raw_ptr_buffer_atomic_fmax_f32¶
Prototype:
float __builtin_amdgcn_raw_ptr_buffer_atomic_fmax_f32(float, __amdgpu_buffer_rsrc_t, int, int, _Constant int)
Target Features: atomic-fmin-fmax-global-f32
No documentation.
__builtin_amdgcn_raw_ptr_buffer_atomic_fmax_f64¶
Prototype:
double __builtin_amdgcn_raw_ptr_buffer_atomic_fmax_f64(double, __amdgpu_buffer_rsrc_t, int, int, _Constant int)
Target Features: atomic-fmin-fmax-global-f64
No documentation.
__builtin_amdgcn_raw_ptr_buffer_atomic_fmin_f32¶
Prototype:
float __builtin_amdgcn_raw_ptr_buffer_atomic_fmin_f32(float, __amdgpu_buffer_rsrc_t, int, int, _Constant int)
Target Features: atomic-fmin-fmax-global-f32
No documentation.
__builtin_amdgcn_raw_ptr_buffer_atomic_fmin_f64¶
Prototype:
double __builtin_amdgcn_raw_ptr_buffer_atomic_fmin_f64(double, __amdgpu_buffer_rsrc_t, int, int, _Constant int)
Target Features: atomic-fmin-fmax-global-f64
No documentation.
__builtin_amdgcn_raw_ptr_buffer_load_async_lds¶
Prototype:
void __builtin_amdgcn_raw_ptr_buffer_load_async_lds(__amdgpu_buffer_rsrc_t, void address_space<3> *, _Constant unsigned int, int, int, _Constant int, _Constant int)
Target Features: vmem-to-lds-load-insts
No documentation.
__builtin_amdgcn_raw_ptr_buffer_load_lds¶
Prototype:
void __builtin_amdgcn_raw_ptr_buffer_load_lds(__amdgpu_buffer_rsrc_t, void address_space<3> *, _Constant unsigned int, int, int, _Constant int, _Constant int)
Target Features: vmem-to-lds-load-insts
No documentation.
__builtin_amdgcn_rcp¶
Prototype:
double __builtin_amdgcn_rcp(double)
Returns an approximation of the reciprocal (1.0 / x) of the double-precision operand.
__builtin_amdgcn_rcp_bf16¶
Prototype:
__bf16 __builtin_amdgcn_rcp_bf16(__bf16)
Target Features: bf16-trans-insts
Returns an approximation of the reciprocal (1.0 / x) of the bfloat16 operand.
__builtin_amdgcn_rcpf¶
Prototype:
float __builtin_amdgcn_rcpf(float)
Returns an approximation of the reciprocal (1.0 / x) of the single-precision operand.
__builtin_amdgcn_rcph¶
Prototype:
_Float16 __builtin_amdgcn_rcph(_Float16)
Target Features: 16-bit-insts
Returns an approximation of the reciprocal (1.0 / x) of the half-precision operand.
__builtin_amdgcn_read_exec¶
Prototype:
uint64_t __builtin_amdgcn_read_exec()
Returns the mask of currently active lanes (the EXEC mask) as a 64-bit value, with one bit per lane.
__builtin_amdgcn_read_exec_hi¶
Prototype:
unsigned int __builtin_amdgcn_read_exec_hi()
Returns the high 32 bits (bits 63:32) of the mask of currently active lanes.
__builtin_amdgcn_read_exec_lo¶
Prototype:
unsigned int __builtin_amdgcn_read_exec_lo()
Returns the low 32 bits (bits 31:0) of the mask of currently active lanes.
__builtin_amdgcn_readfirstlane¶
Prototype:
int __builtin_amdgcn_readfirstlane(int)
Returns the value of the argument held by the first active lane. The result is uniform across the wavefront.
__builtin_amdgcn_readlane¶
Prototype:
int __builtin_amdgcn_readlane(int, int)
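Returns the value of the first argument held by the lane whose index is given by the second argument. The lane index is expected to be uniform across the wavefront.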
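The two cross-lane reads above can be pictured with a small host-side model. The sketch below is plain C rather than device code: the lanes array, the exec mask parameter, and the _ref function names are all hypothetical, and the fallback to lane 0 when no lane is active is an assumption taken from the ISA description of V_READFIRSTLANE_B32.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side model only: lanes[i] is the value lane i would pass to the
   builtin, and bit i of exec is set when lane i is active. */

/* Models readfirstlane: the value held by the lowest-numbered active lane. */
int32_t readfirstlane_ref(const int32_t *lanes, uint64_t exec, unsigned wave_size) {
    assert(wave_size >= 1u && wave_size <= 64u);
    for (unsigned i = 0; i < wave_size; ++i) {
        if ((exec >> i) & 1u)
            return lanes[i];
    }
    return lanes[0]; /* assumption: lane 0 is used when no lane is active */
}

/* Models readlane: the value held by the lane with the given index. */
int32_t readlane_ref(const int32_t *lanes, unsigned lane, unsigned wave_size) {
    assert(lane < wave_size); /* out-of-range indices are not modelled */
    return lanes[lane];
}
```

In device code the same idea is usually used to broadcast a value that is known to be uniform, so that the compiler can keep it in a scalar register.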
__builtin_amdgcn_rsq¶
Prototype:
double __builtin_amdgcn_rsq(double)
Returns an approximation of the reciprocal square root (1.0 / sqrt(x)) of the double-precision operand.
__builtin_amdgcn_rsq_bf16¶
Prototype:
__bf16 __builtin_amdgcn_rsq_bf16(__bf16)
Target Features: bf16-trans-insts
Returns an approximation of the reciprocal square root (1.0 / sqrt(x)) of the bfloat16 operand.
__builtin_amdgcn_rsq_clamp¶
Prototype:
double __builtin_amdgcn_rsq_clamp(double)
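Returns an approximation of the reciprocal square root of the double-precision operand, with the result clamped to the largest finite magnitude instead of overflowing to infinity.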
__builtin_amdgcn_rsq_clampf¶
Prototype:
float __builtin_amdgcn_rsq_clampf(float)
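Returns an approximation of the reciprocal square root of the single-precision operand, with the result clamped to the largest finite magnitude instead of overflowing to infinity.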
__builtin_amdgcn_rsqf¶
Prototype:
float __builtin_amdgcn_rsqf(float)
Returns an approximation of the reciprocal square root (1.0 / sqrt(x)) of the single-precision operand.
__builtin_amdgcn_rsqh¶
Prototype:
_Float16 __builtin_amdgcn_rsqh(_Float16)
Target Features: 16-bit-insts
Returns an approximation of the reciprocal square root (1.0 / sqrt(x)) of the half-precision operand.
__builtin_amdgcn_s_barrier¶
Prototype:
void __builtin_amdgcn_s_barrier()
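Synchronizes the waves of a workgroup: each wave waits at the barrier until all other waves in the workgroup have reached it.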
__builtin_amdgcn_s_barrier_init¶
Prototype:
void __builtin_amdgcn_s_barrier_init(void *, int)
Target Features: gfx12-insts
No documentation.
__builtin_amdgcn_s_barrier_join¶
Prototype:
void __builtin_amdgcn_s_barrier_join(void *)
Target Features: gfx12-insts
No documentation.
__builtin_amdgcn_s_barrier_leave¶
Prototype:
void __builtin_amdgcn_s_barrier_leave(_Constant short)
Target Features: gfx12-insts
No documentation.
__builtin_amdgcn_s_barrier_signal¶
Prototype:
void __builtin_amdgcn_s_barrier_signal(_Constant int)
Target Features: gfx12-insts
No documentation.
__builtin_amdgcn_s_barrier_signal_isfirst¶
Prototype:
bool __builtin_amdgcn_s_barrier_signal_isfirst(_Constant int)
Target Features: gfx12-insts
No documentation.
__builtin_amdgcn_s_barrier_signal_var¶
Prototype:
void __builtin_amdgcn_s_barrier_signal_var(void *, int)
Target Features: gfx12-insts
No documentation.
__builtin_amdgcn_s_barrier_wait¶
Prototype:
void __builtin_amdgcn_s_barrier_wait(_Constant short)
Target Features: gfx12-insts
No documentation.
__builtin_amdgcn_s_buffer_prefetch_data¶
Prototype:
void __builtin_amdgcn_s_buffer_prefetch_data(__amdgpu_buffer_rsrc_t, _Constant int, unsigned int)
Target Features: gfx12-insts
No documentation.
__builtin_amdgcn_s_cluster_barrier¶
Prototype:
void __builtin_amdgcn_s_cluster_barrier()
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_s_dcache_inv¶
Prototype:
void __builtin_amdgcn_s_dcache_inv()
Invalidates the scalar data cache.
__builtin_amdgcn_s_dcache_inv_vol¶
Prototype:
void __builtin_amdgcn_s_dcache_inv_vol()
Target Features: ci-insts
No documentation.
__builtin_amdgcn_s_dcache_wb¶
Prototype:
void __builtin_amdgcn_s_dcache_wb()
Target Features: gfx8-insts
Writes back dirty data held in the scalar data cache.
__builtin_amdgcn_s_decperflevel¶
Prototype:
void __builtin_amdgcn_s_decperflevel(_Constant int)
No documentation.
__builtin_amdgcn_s_get_barrier_state¶
Prototype:
unsigned int __builtin_amdgcn_s_get_barrier_state(int)
Target Features: gfx12-insts
No documentation.
__builtin_amdgcn_s_get_named_barrier_state¶
Prototype:
unsigned int __builtin_amdgcn_s_get_named_barrier_state(void *)
Target Features: gfx12-insts
No documentation.
__builtin_amdgcn_s_getpc¶
Prototype:
uint64_t __builtin_amdgcn_s_getpc()
Returns the program counter as a 64-bit byte address (the address of the instruction that follows S_GETPC_B64).
__builtin_amdgcn_s_getreg¶
Prototype:
unsigned int __builtin_amdgcn_s_getreg(_Constant int)
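Reads a field of a hardware register. The constant argument is the S_GETREG_B32 immediate, which encodes the register ID, the bit offset of the field, and the field width.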
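As an illustration of how the constant argument of __builtin_amdgcn_s_getreg (and the first argument of __builtin_amdgcn_s_setreg) is usually built, the sketch below packs a register ID, bit offset, and field width into one immediate. The helper name hwreg_encode is hypothetical, and the bit layout is an assumption based on the 16-bit hwreg encoding used by the assembler; check it against the ISA document for the target in question.

```c
#include <assert.h>
#include <stdint.h>

/* Packs an S_GETREG_B32 / S_SETREG_B32 immediate.
   Assumed layout: bits [5:0]   register ID,
                   bits [10:6]  bit offset of the field,
                   bits [15:11] field width minus one. */
uint32_t hwreg_encode(uint32_t id, uint32_t offset, uint32_t width) {
    assert(id < 64u && offset < 32u && width >= 1u && width <= 32u);
    return id | (offset << 6) | ((width - 1u) << 11);
}
```

Because the builtin requires a compile-time constant, real code would write the same packing as a constant expression (for example a macro or an enumerator) rather than call a function at run time.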
__builtin_amdgcn_s_incperflevel¶
Prototype:
void __builtin_amdgcn_s_incperflevel(_Constant int)
No documentation.
__builtin_amdgcn_s_memrealtime¶
Prototype:
uint64_t __builtin_amdgcn_s_memrealtime()
Target Features: s-memrealtime
Returns the current value of the 64-bit real-time counter, which advances at a constant frequency.
__builtin_amdgcn_s_memtime¶
Prototype:
uint64_t __builtin_amdgcn_s_memtime()
Target Features: s-memtime-inst
Returns the current 64-bit timestamp.
__builtin_amdgcn_s_monitor_sleep¶
Prototype:
void __builtin_amdgcn_s_monitor_sleep(_Constant short)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_s_prefetch_data¶
Prototype:
void __builtin_amdgcn_s_prefetch_data(void const *, unsigned int)
Target Features: gfx12-insts
No documentation.
__builtin_amdgcn_s_sendmsg¶
Prototype:
void __builtin_amdgcn_s_sendmsg(_Constant int, unsigned int)
No documentation.
__builtin_amdgcn_s_sendmsg_rtn¶
Prototype:
unsigned int __builtin_amdgcn_s_sendmsg_rtn(_Constant unsigned int)
Target Features: gfx11-insts
No documentation.
__builtin_amdgcn_s_sendmsg_rtnl¶
Prototype:
uint64_t __builtin_amdgcn_s_sendmsg_rtnl(_Constant unsigned int)
Target Features: gfx11-insts
No documentation.
__builtin_amdgcn_s_sendmsghalt¶
Prototype:
void __builtin_amdgcn_s_sendmsghalt(_Constant int, unsigned int)
No documentation.
__builtin_amdgcn_s_setprio¶
Prototype:
void __builtin_amdgcn_s_setprio(_Constant short)
No documentation.
__builtin_amdgcn_s_setprio_inc_wg¶
Prototype:
void __builtin_amdgcn_s_setprio_inc_wg(_Constant short)
Target Features: setprio-inc-wg-inst
No documentation.
__builtin_amdgcn_s_setreg¶
Prototype:
void __builtin_amdgcn_s_setreg(_Constant int, unsigned int)
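Writes the second argument to a field of a hardware register. The constant argument is the S_SETREG_B32 immediate, which encodes the register ID, the bit offset of the field, and the field width.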
__builtin_amdgcn_s_sleep¶
Prototype:
void __builtin_amdgcn_s_sleep(_Constant int)
Puts the wave to sleep for a short period. The duration is approximate and scales with the constant argument.
__builtin_amdgcn_s_sleep_var¶
Prototype:
void __builtin_amdgcn_s_sleep_var(unsigned int)
Target Features: gfx12-insts
No documentation.
__builtin_amdgcn_s_ttracedata¶
Prototype:
void __builtin_amdgcn_s_ttracedata(int)
No documentation.
__builtin_amdgcn_s_ttracedata_imm¶
Prototype:
void __builtin_amdgcn_s_ttracedata_imm(_Constant short)
Target Features: gfx10-insts
No documentation.
__builtin_amdgcn_s_wait_asynccnt¶
Prototype:
void __builtin_amdgcn_s_wait_asynccnt(_Constant unsigned short)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_s_wait_event¶
Prototype:
void __builtin_amdgcn_s_wait_event(_Constant short)
Target Features: gfx11-insts
No documentation.
__builtin_amdgcn_s_wait_event_export_ready¶
Prototype:
void __builtin_amdgcn_s_wait_event_export_ready()
Target Features: gfx11-insts
No documentation.
__builtin_amdgcn_s_wait_tensorcnt¶
Prototype:
void __builtin_amdgcn_s_wait_tensorcnt(_Constant unsigned short)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_s_waitcnt¶
Prototype:
void __builtin_amdgcn_s_waitcnt(_Constant int)
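Waits until the outstanding-operation counters have dropped to the thresholds encoded in the constant argument. The bit layout of the encoding depends on the target generation.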
__builtin_amdgcn_s_wakeup_barrier¶
Prototype:
void __builtin_amdgcn_s_wakeup_barrier(void *)
Target Features: s-wakeup-barrier-inst
No documentation.
__builtin_amdgcn_sad_hi_u8¶
Prototype:
unsigned int __builtin_amdgcn_sad_hi_u8(unsigned int, unsigned int, unsigned int)
Target Features: sad-insts
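Computes the sum of absolute differences of the four unsigned bytes packed in the first two arguments, shifts the sum left by 16 bits, and adds the third argument.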
__builtin_amdgcn_sad_u16¶
Prototype:
unsigned int __builtin_amdgcn_sad_u16(unsigned int, unsigned int, unsigned int)
Target Features: sad-insts
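Computes the sum of absolute differences of the two unsigned 16-bit values packed in the first two arguments and adds the third argument.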
__builtin_amdgcn_sad_u8¶
Prototype:
unsigned int __builtin_amdgcn_sad_u8(unsigned int, unsigned int, unsigned int)
Target Features: sad-insts
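Computes the sum of absolute differences of the four unsigned bytes packed in the first two arguments and adds the third argument.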
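The three sum-of-absolute-differences builtins can be summarized by a host-side reference model. The sketch below is plain C, not device code; the _ref function names are hypothetical, and the arithmetic is an assumption based on the ISA descriptions of V_SAD_U8, V_SAD_HI_U8, and V_SAD_U16.

```c
#include <stdint.h>

/* Absolute difference of two unsigned values. */
static uint32_t abs_diff(uint32_t a, uint32_t b) {
    return a > b ? a - b : b - a;
}

/* Models sad_u8: four packed bytes per operand, plus an accumulator. */
uint32_t sad_u8_ref(uint32_t a, uint32_t b, uint32_t acc) {
    uint32_t sum = 0;
    for (int i = 0; i < 4; ++i)
        sum += abs_diff((a >> (8 * i)) & 0xFFu, (b >> (8 * i)) & 0xFFu);
    return sum + acc;
}

/* Models sad_hi_u8: the byte SAD lands in the upper 16 bits. */
uint32_t sad_hi_u8_ref(uint32_t a, uint32_t b, uint32_t acc) {
    return (sad_u8_ref(a, b, 0u) << 16) + acc;
}

/* Models sad_u16: two packed 16-bit values per operand. */
uint32_t sad_u16_ref(uint32_t a, uint32_t b, uint32_t acc) {
    uint32_t lo = abs_diff(a & 0xFFFFu, b & 0xFFFFu);
    uint32_t hi = abs_diff(a >> 16, b >> 16);
    return lo + hi + acc;
}
```

The accumulator argument lets several calls be chained, for example to accumulate a block-matching cost over a row of pixels four bytes at a time.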
__builtin_amdgcn_sat_pk4_i4_i8¶
Prototype:
unsigned short __builtin_amdgcn_sat_pk4_i4_i8(unsigned int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_sat_pk4_u4_u8¶
Prototype:
unsigned short __builtin_amdgcn_sat_pk4_u4_u8(unsigned int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_sbfe¶
Prototype:
unsigned int __builtin_amdgcn_sbfe(unsigned int, unsigned int, unsigned int)
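Extracts a bit field from the first argument and sign-extends it. The second argument is the bit offset of the field and the third argument is its width in bits.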
__builtin_amdgcn_sched_barrier¶
Prototype:
void __builtin_amdgcn_sched_barrier(_Constant int)
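Limits which instructions the scheduler may move across this point. The constant argument is a bit mask of the instruction classes that are allowed to cross; a mask of 0 allows none.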
__builtin_amdgcn_sched_group_barrier¶
Prototype:
void __builtin_amdgcn_sched_group_barrier(_Constant int, _Constant int, _Constant int)
No documentation.
__builtin_amdgcn_sdot2¶
Prototype:
int __builtin_amdgcn_sdot2(_ExtVector<2, short>, _ExtVector<2, short>, int, _Constant bool)
Target Features: dot2-insts
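Computes the dot product of two vectors of two signed 16-bit integers and adds the 32-bit accumulator (third argument). The constant argument enables clamping of the result.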
__builtin_amdgcn_sdot4¶
Prototype:
int __builtin_amdgcn_sdot4(int, int, int, _Constant bool)
Target Features: dot1-insts
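Computes the dot product of the four signed 8-bit integers packed in each of the first two arguments and adds the 32-bit accumulator (third argument). The constant argument enables clamping of the result.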
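A host-side reference model may make the packed operands of __builtin_amdgcn_sdot4 easier to picture. The sketch below is plain C, not device code. The name sdot4_ref is hypothetical, the packed operands are passed as unsigned bit patterns for convenience, and the clamp behaviour (saturating the final sum to the 32-bit signed range) is an assumption rather than a statement of the exact hardware rule.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sign-extends byte number i (0 = least significant) of a packed word. */
static int32_t signed_byte(uint32_t packed, int i) {
    int32_t v = (int32_t)((packed >> (8 * i)) & 0xFFu);
    return v >= 128 ? v - 256 : v;
}

/* Models sdot4: a[0]*b[0] + a[1]*b[1] + a[2]*b[2] + a[3]*b[3] + acc. */
int32_t sdot4_ref(uint32_t a, uint32_t b, int32_t acc, bool clamp) {
    int64_t sum = acc;
    for (int i = 0; i < 4; ++i)
        sum += (int64_t)signed_byte(a, i) * signed_byte(b, i);

    if (clamp) { /* assumption: saturate the final sum */
        if (sum > INT32_MAX) return INT32_MAX;
        if (sum < INT32_MIN) return INT32_MIN;
        return (int32_t)sum;
    }

    /* Without clamping, keep the low 32 bits (two's-complement wrap). */
    uint32_t low = (uint32_t)sum;
    if (low > (uint32_t)INT32_MAX)
        return (int32_t)(low - 0x80000000u) + INT32_MIN;
    return (int32_t)low;
}
```

For example, with a holding the bytes (1, -2, 3, -4) and b holding (5, 6, -7, 8), the products sum to -60, so an accumulator of 100 gives 40.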
__builtin_amdgcn_sdot8¶
Prototype:
int __builtin_amdgcn_sdot8(int, int, int, _Constant bool)
Target Features: dot1-insts
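Computes the dot product of the eight signed 4-bit integers packed in each of the first two arguments and adds the 32-bit accumulator (third argument). The constant argument enables clamping of the result.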
__builtin_amdgcn_set_fpenv¶
Prototype:
void __builtin_amdgcn_set_fpenv(uint64_t)
No documentation.
__builtin_amdgcn_sicmp¶
Prototype:
uint64_t __builtin_amdgcn_sicmp(int, int, _Constant int)
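Compares the two 32-bit operands in every active lane using the integer predicate selected by the constant argument, and returns a mask with one bit per lane that is set where the comparison holds.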
__builtin_amdgcn_sicmpl¶
Prototype:
uint64_t __builtin_amdgcn_sicmpl(int64_t, int64_t, _Constant int)
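Compares the two 64-bit operands in every active lane using the integer predicate selected by the constant argument, and returns a mask with one bit per lane that is set where the comparison holds.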
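The wave-wide compare builtins return a lane mask rather than a per-lane boolean, which a small host-side model can illustrate. The sketch below is plain C, not device code. The names sicmp_ref and holds, the per-lane arrays, and the exec parameter are hypothetical; the predicate values are the LLVM integer comparison codes, and treating the builtin's constant argument as one of these codes is an assumption based on how the builtin is lowered.

```c
#include <assert.h>
#include <stdint.h>

/* Subset of the LLVM integer predicate codes (llvm::CmpInst::Predicate). */
enum {
    ICMP_EQ = 32, ICMP_NE = 33,
    ICMP_SGT = 38, ICMP_SGE = 39, ICMP_SLT = 40, ICMP_SLE = 41
};

/* Evaluates one signed comparison for a single lane. */
static int holds(int32_t a, int32_t b, int pred) {
    switch (pred) {
    case ICMP_EQ:  return a == b;
    case ICMP_NE:  return a != b;
    case ICMP_SGT: return a > b;
    case ICMP_SGE: return a >= b;
    case ICMP_SLT: return a < b;
    case ICMP_SLE: return a <= b;
    default:       assert(0 && "predicate not modelled"); return 0;
    }
}

/* Models sicmp: bit i of the result is set when lane i is active and the
   comparison holds for that lane; bits of inactive lanes stay 0. */
uint64_t sicmp_ref(const int32_t *a, const int32_t *b,
                   uint64_t exec, unsigned wave_size, int pred) {
    assert(wave_size <= 64u);
    uint64_t mask = 0;
    for (unsigned i = 0; i < wave_size; ++i) {
        if (((exec >> i) & 1u) && holds(a[i], b[i], pred))
            mask |= (uint64_t)1 << i;
    }
    return mask;
}
```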
__builtin_amdgcn_sin_bf16¶
Prototype:
__bf16 __builtin_amdgcn_sin_bf16(__bf16)
Target Features: bf16-trans-insts
No documentation.
__builtin_amdgcn_sinf¶
Prototype:
float __builtin_amdgcn_sinf(float)
No documentation.
__builtin_amdgcn_sinh¶
Prototype:
_Float16 __builtin_amdgcn_sinh(_Float16)
Target Features: 16-bit-insts
No documentation.
__builtin_amdgcn_smfmac_f32_16x16x128_bf8_bf8¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_smfmac_f32_16x16x128_bf8_bf8(_ExtVector<4, int>, _ExtVector<8, int>, _ExtVector<4, float>, int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_smfmac_f32_16x16x128_bf8_fp8¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_smfmac_f32_16x16x128_bf8_fp8(_ExtVector<4, int>, _ExtVector<8, int>, _ExtVector<4, float>, int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_smfmac_f32_16x16x128_fp8_bf8¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_smfmac_f32_16x16x128_fp8_bf8(_ExtVector<4, int>, _ExtVector<8, int>, _ExtVector<4, float>, int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_smfmac_f32_16x16x128_fp8_fp8¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_smfmac_f32_16x16x128_fp8_fp8(_ExtVector<4, int>, _ExtVector<8, int>, _ExtVector<4, float>, int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_smfmac_f32_16x16x32_bf16¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_smfmac_f32_16x16x32_bf16(_ExtVector<4, short>, _ExtVector<8, short>, _ExtVector<4, float>, int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_smfmac_f32_16x16x32_f16¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_smfmac_f32_16x16x32_f16(_ExtVector<4, _Float16>, _ExtVector<8, _Float16>, _ExtVector<4, float>, int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_smfmac_f32_16x16x64_bf16¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_smfmac_f32_16x16x64_bf16(_ExtVector<8, __bf16>, _ExtVector<16, __bf16>, _ExtVector<4, float>, int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_smfmac_f32_16x16x64_bf8_bf8¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_smfmac_f32_16x16x64_bf8_bf8(_ExtVector<2, int>, _ExtVector<4, int>, _ExtVector<4, float>, int, _Constant int, _Constant int)
Target Features: fp8-insts
No documentation.
__builtin_amdgcn_smfmac_f32_16x16x64_bf8_fp8¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_smfmac_f32_16x16x64_bf8_fp8(_ExtVector<2, int>, _ExtVector<4, int>, _ExtVector<4, float>, int, _Constant int, _Constant int)
Target Features: fp8-insts
No documentation.
__builtin_amdgcn_smfmac_f32_16x16x64_f16¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_smfmac_f32_16x16x64_f16(_ExtVector<8, _Float16>, _ExtVector<16, _Float16>, _ExtVector<4, float>, int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_smfmac_f32_16x16x64_fp8_bf8¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_smfmac_f32_16x16x64_fp8_bf8(_ExtVector<2, int>, _ExtVector<4, int>, _ExtVector<4, float>, int, _Constant int, _Constant int)
Target Features: fp8-insts
No documentation.
__builtin_amdgcn_smfmac_f32_16x16x64_fp8_fp8¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_smfmac_f32_16x16x64_fp8_fp8(_ExtVector<2, int>, _ExtVector<4, int>, _ExtVector<4, float>, int, _Constant int, _Constant int)
Target Features: fp8-insts
No documentation.
__builtin_amdgcn_smfmac_f32_32x32x16_bf16¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_smfmac_f32_32x32x16_bf16(_ExtVector<4, short>, _ExtVector<8, short>, _ExtVector<16, float>, int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_smfmac_f32_32x32x16_f16¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_smfmac_f32_32x32x16_f16(_ExtVector<4, _Float16>, _ExtVector<8, _Float16>, _ExtVector<16, float>, int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_smfmac_f32_32x32x32_bf16¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_smfmac_f32_32x32x32_bf16(_ExtVector<8, __bf16>, _ExtVector<16, __bf16>, _ExtVector<16, float>, int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_smfmac_f32_32x32x32_bf8_bf8¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_smfmac_f32_32x32x32_bf8_bf8(_ExtVector<2, int>, _ExtVector<4, int>, _ExtVector<16, float>, int, _Constant int, _Constant int)
Target Features: fp8-insts
No documentation.
__builtin_amdgcn_smfmac_f32_32x32x32_bf8_fp8¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_smfmac_f32_32x32x32_bf8_fp8(_ExtVector<2, int>, _ExtVector<4, int>, _ExtVector<16, float>, int, _Constant int, _Constant int)
Target Features: fp8-insts
No documentation.
__builtin_amdgcn_smfmac_f32_32x32x32_f16¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_smfmac_f32_32x32x32_f16(_ExtVector<8, _Float16>, _ExtVector<16, _Float16>, _ExtVector<16, float>, int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_smfmac_f32_32x32x32_fp8_bf8¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_smfmac_f32_32x32x32_fp8_bf8(_ExtVector<2, int>, _ExtVector<4, int>, _ExtVector<16, float>, int, _Constant int, _Constant int)
Target Features: fp8-insts
No documentation.
__builtin_amdgcn_smfmac_f32_32x32x32_fp8_fp8¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_smfmac_f32_32x32x32_fp8_fp8(_ExtVector<2, int>, _ExtVector<4, int>, _ExtVector<16, float>, int, _Constant int, _Constant int)
Target Features: fp8-insts
No documentation.
__builtin_amdgcn_smfmac_f32_32x32x64_bf8_bf8¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_smfmac_f32_32x32x64_bf8_bf8(_ExtVector<4, int>, _ExtVector<8, int>, _ExtVector<16, float>, int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_smfmac_f32_32x32x64_bf8_fp8¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_smfmac_f32_32x32x64_bf8_fp8(_ExtVector<4, int>, _ExtVector<8, int>, _ExtVector<16, float>, int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_smfmac_f32_32x32x64_fp8_bf8¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_smfmac_f32_32x32x64_fp8_bf8(_ExtVector<4, int>, _ExtVector<8, int>, _ExtVector<16, float>, int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_smfmac_f32_32x32x64_fp8_fp8¶
Prototype:
_ExtVector<16, float> __builtin_amdgcn_smfmac_f32_32x32x64_fp8_fp8(_ExtVector<4, int>, _ExtVector<8, int>, _ExtVector<16, float>, int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_smfmac_i32_16x16x128_i8¶
Prototype:
_ExtVector<4, int> __builtin_amdgcn_smfmac_i32_16x16x128_i8(_ExtVector<4, int>, _ExtVector<8, int>, _ExtVector<4, int>, int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_smfmac_i32_16x16x64_i8¶
Prototype:
_ExtVector<4, int> __builtin_amdgcn_smfmac_i32_16x16x64_i8(_ExtVector<2, int>, _ExtVector<4, int>, _ExtVector<4, int>, int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_smfmac_i32_32x32x32_i8¶
Prototype:
_ExtVector<16, int> __builtin_amdgcn_smfmac_i32_32x32x32_i8(_ExtVector<2, int>, _ExtVector<4, int>, _ExtVector<16, int>, int, _Constant int, _Constant int)
Target Features: mai-insts
No documentation.
__builtin_amdgcn_smfmac_i32_32x32x64_i8¶
Prototype:
_ExtVector<16, int> __builtin_amdgcn_smfmac_i32_32x32x64_i8(_ExtVector<4, int>, _ExtVector<8, int>, _ExtVector<16, int>, int, _Constant int, _Constant int)
Target Features: gfx950-insts
No documentation.
__builtin_amdgcn_sqrt¶
Prototype:
double __builtin_amdgcn_sqrt(double)
Returns an approximation of the square root of the double-precision operand.
__builtin_amdgcn_sqrt_bf16¶
Prototype:
__bf16 __builtin_amdgcn_sqrt_bf16(__bf16)
Target Features: bf16-trans-insts
Returns an approximation of the square root of the bfloat16 operand.
__builtin_amdgcn_sqrtf¶
Prototype:
float __builtin_amdgcn_sqrtf(float)
Returns an approximation of the square root of the single-precision operand.
__builtin_amdgcn_sqrth¶
Prototype:
_Float16 __builtin_amdgcn_sqrth(_Float16)
Target Features: 16-bit-insts
Returns an approximation of the square root of the half-precision operand.
__builtin_amdgcn_struct_ptr_buffer_load_async_lds¶
Prototype:
void __builtin_amdgcn_struct_ptr_buffer_load_async_lds(__amdgpu_buffer_rsrc_t, void address_space<3> *, _Constant unsigned int, int, int, int, _Constant int, _Constant int)
Target Features: vmem-to-lds-load-insts
No documentation.
__builtin_amdgcn_struct_ptr_buffer_load_lds¶
Prototype:
void __builtin_amdgcn_struct_ptr_buffer_load_lds(__amdgpu_buffer_rsrc_t, void address_space<3> *, _Constant unsigned int, int, int, int, _Constant int, _Constant int)
Target Features: vmem-to-lds-load-insts
No documentation.
__builtin_amdgcn_sudot4¶
Prototype:
int __builtin_amdgcn_sudot4(_Constant bool, int, _Constant bool, int, int, _Constant bool)
Target Features: dot8-insts
No documentation.
__builtin_amdgcn_sudot8¶
Prototype:
int __builtin_amdgcn_sudot8(_Constant bool, int, _Constant bool, int, int, _Constant bool)
Target Features: dot8-insts
No documentation.
__builtin_amdgcn_swmmac_bf16_16x16x32_bf16_w32¶
Prototype:
_ExtVector<8, short> __builtin_amdgcn_swmmac_bf16_16x16x32_bf16_w32(_ExtVector<8, short>, _ExtVector<16, short>, _ExtVector<8, short>, int)
Target Features: wmma-128b-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_bf16_16x16x32_bf16_w64¶
Prototype:
_ExtVector<4, short> __builtin_amdgcn_swmmac_bf16_16x16x32_bf16_w64(_ExtVector<4, short>, _ExtVector<8, short>, _ExtVector<4, short>, int)
Target Features: wmma-128b-insts,wavefrontsize64
No documentation.
__builtin_amdgcn_swmmac_bf16_16x16x64_bf16¶
Prototype:
_ExtVector<8, __bf16> __builtin_amdgcn_swmmac_bf16_16x16x64_bf16(_Constant bool, _ExtVector<16, __bf16>, _Constant bool, _ExtVector<32, __bf16>, _ExtVector<8, __bf16>, int, _Constant bool, _Constant bool)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_bf16f32_16x16x64_bf16¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_swmmac_bf16f32_16x16x64_bf16(_Constant bool, _ExtVector<16, __bf16>, _Constant bool, _ExtVector<32, __bf16>, _ExtVector<8, float>, int, _Constant bool, _Constant bool)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_f16_16x16x128_bf8_bf8¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_swmmac_f16_16x16x128_bf8_bf8(_ExtVector<8, int>, _ExtVector<16, int>, _ExtVector<8, _Float16>, _ExtVector<2, int>, _Constant bool, _Constant bool)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_f16_16x16x128_bf8_fp8¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_swmmac_f16_16x16x128_bf8_fp8(_ExtVector<8, int>, _ExtVector<16, int>, _ExtVector<8, _Float16>, _ExtVector<2, int>, _Constant bool, _Constant bool)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_f16_16x16x128_fp8_bf8¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_swmmac_f16_16x16x128_fp8_bf8(_ExtVector<8, int>, _ExtVector<16, int>, _ExtVector<8, _Float16>, _ExtVector<2, int>, _Constant bool, _Constant bool)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_f16_16x16x128_fp8_fp8¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_swmmac_f16_16x16x128_fp8_fp8(_ExtVector<8, int>, _ExtVector<16, int>, _ExtVector<8, _Float16>, _ExtVector<2, int>, _Constant bool, _Constant bool)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_f16_16x16x32_f16_w32¶
Prototype:
_ExtVector<8, __fp16> __builtin_amdgcn_swmmac_f16_16x16x32_f16_w32(_ExtVector<8, __fp16>, _ExtVector<16, __fp16>, _ExtVector<8, __fp16>, int)
Target Features: wmma-128b-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_f16_16x16x32_f16_w64¶
Prototype:
_ExtVector<4, __fp16> __builtin_amdgcn_swmmac_f16_16x16x32_f16_w64(_ExtVector<4, __fp16>, _ExtVector<8, __fp16>, _ExtVector<4, __fp16>, int)
Target Features: wmma-128b-insts,wavefrontsize64
No documentation.
__builtin_amdgcn_swmmac_f16_16x16x64_f16¶
Prototype:
_ExtVector<8, _Float16> __builtin_amdgcn_swmmac_f16_16x16x64_f16(_Constant bool, _ExtVector<16, _Float16>, _Constant bool, _ExtVector<32, _Float16>, _ExtVector<8, _Float16>, int, _Constant bool, _Constant bool)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_f32_16x16x128_bf8_bf8¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_swmmac_f32_16x16x128_bf8_bf8(_ExtVector<8, int>, _ExtVector<16, int>, _ExtVector<8, float>, _ExtVector<2, int>, _Constant bool, _Constant bool)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_f32_16x16x128_bf8_fp8¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_swmmac_f32_16x16x128_bf8_fp8(_ExtVector<8, int>, _ExtVector<16, int>, _ExtVector<8, float>, _ExtVector<2, int>, _Constant bool, _Constant bool)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_f32_16x16x128_fp8_bf8¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_swmmac_f32_16x16x128_fp8_bf8(_ExtVector<8, int>, _ExtVector<16, int>, _ExtVector<8, float>, _ExtVector<2, int>, _Constant bool, _Constant bool)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_f32_16x16x128_fp8_fp8¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_swmmac_f32_16x16x128_fp8_fp8(_ExtVector<8, int>, _ExtVector<16, int>, _ExtVector<8, float>, _ExtVector<2, int>, _Constant bool, _Constant bool)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_f32_16x16x32_bf16_w32¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_swmmac_f32_16x16x32_bf16_w32(_ExtVector<8, short>, _ExtVector<16, short>, _ExtVector<8, float>, int)
Target Features: wmma-128b-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_f32_16x16x32_bf16_w64¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_swmmac_f32_16x16x32_bf16_w64(_ExtVector<4, short>, _ExtVector<8, short>, _ExtVector<4, float>, int)
Target Features: wmma-128b-insts,wavefrontsize64
No documentation.
__builtin_amdgcn_swmmac_f32_16x16x32_bf8_bf8_w32¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_swmmac_f32_16x16x32_bf8_bf8_w32(_ExtVector<2, int>, _ExtVector<4, int>, _ExtVector<8, float>, int)
Target Features: wmma-128b-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_f32_16x16x32_bf8_bf8_w64¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_swmmac_f32_16x16x32_bf8_bf8_w64(int, _ExtVector<2, int>, _ExtVector<4, float>, int)
Target Features: wmma-128b-insts,wavefrontsize64
No documentation.
__builtin_amdgcn_swmmac_f32_16x16x32_bf8_fp8_w32¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_swmmac_f32_16x16x32_bf8_fp8_w32(_ExtVector<2, int>, _ExtVector<4, int>, _ExtVector<8, float>, int)
Target Features: wmma-128b-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_f32_16x16x32_bf8_fp8_w64¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_swmmac_f32_16x16x32_bf8_fp8_w64(int, _ExtVector<2, int>, _ExtVector<4, float>, int)
Target Features: wmma-128b-insts,wavefrontsize64
No documentation.
__builtin_amdgcn_swmmac_f32_16x16x32_f16_w32¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_swmmac_f32_16x16x32_f16_w32(_ExtVector<8, __fp16>, _ExtVector<16, __fp16>, _ExtVector<8, float>, int)
Target Features: wmma-128b-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_f32_16x16x32_f16_w64¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_swmmac_f32_16x16x32_f16_w64(_ExtVector<4, __fp16>, _ExtVector<8, __fp16>, _ExtVector<4, float>, int)
Target Features: wmma-128b-insts,wavefrontsize64
No documentation.
__builtin_amdgcn_swmmac_f32_16x16x32_fp8_bf8_w32¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_swmmac_f32_16x16x32_fp8_bf8_w32(_ExtVector<2, int>, _ExtVector<4, int>, _ExtVector<8, float>, int)
Target Features: wmma-128b-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_f32_16x16x32_fp8_bf8_w64¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_swmmac_f32_16x16x32_fp8_bf8_w64(int, _ExtVector<2, int>, _ExtVector<4, float>, int)
Target Features: wmma-128b-insts,wavefrontsize64
No documentation.
__builtin_amdgcn_swmmac_f32_16x16x32_fp8_fp8_w32¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_swmmac_f32_16x16x32_fp8_fp8_w32(_ExtVector<2, int>, _ExtVector<4, int>, _ExtVector<8, float>, int)
Target Features: wmma-128b-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_f32_16x16x32_fp8_fp8_w64¶
Prototype:
_ExtVector<4, float> __builtin_amdgcn_swmmac_f32_16x16x32_fp8_fp8_w64(int, _ExtVector<2, int>, _ExtVector<4, float>, int)
Target Features: wmma-128b-insts,wavefrontsize64
No documentation.
__builtin_amdgcn_swmmac_f32_16x16x64_bf16¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_swmmac_f32_16x16x64_bf16(_Constant bool, _ExtVector<16, __bf16>, _Constant bool, _ExtVector<32, __bf16>, _ExtVector<8, float>, int, _Constant bool, _Constant bool)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_f32_16x16x64_f16¶
Prototype:
_ExtVector<8, float> __builtin_amdgcn_swmmac_f32_16x16x64_f16(_Constant bool, _ExtVector<16, _Float16>, _Constant bool, _ExtVector<32, _Float16>, _ExtVector<8, float>, int, _Constant bool, _Constant bool)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_i32_16x16x128_iu8¶
Prototype:
_ExtVector<8, int> __builtin_amdgcn_swmmac_i32_16x16x128_iu8(_Constant bool, _ExtVector<8, int>, _Constant bool, _ExtVector<16, int>, _ExtVector<8, int>, _ExtVector<2, int>, _Constant bool, _Constant bool, ...)
Target Features: gfx1250-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_i32_16x16x32_iu4_w32¶
Prototype:
_ExtVector<8, int> __builtin_amdgcn_swmmac_i32_16x16x32_iu4_w32(_Constant bool, int, _Constant bool, _ExtVector<2, int>, _ExtVector<8, int>, int, _Constant bool)
Target Features: wmma-128b-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_i32_16x16x32_iu4_w64¶
Prototype:
_ExtVector<4, int> __builtin_amdgcn_swmmac_i32_16x16x32_iu4_w64(_Constant bool, int, _Constant bool, int, _ExtVector<4, int>, int, _Constant bool)
Target Features: wmma-128b-insts,wavefrontsize64
No documentation.
__builtin_amdgcn_swmmac_i32_16x16x32_iu8_w32¶
Prototype:
_ExtVector<8, int> __builtin_amdgcn_swmmac_i32_16x16x32_iu8_w32(_Constant bool, _ExtVector<2, int>, _Constant bool, _ExtVector<4, int>, _ExtVector<8, int>, int, _Constant bool)
Target Features: wmma-128b-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_i32_16x16x32_iu8_w64¶
Prototype:
_ExtVector<4, int> __builtin_amdgcn_swmmac_i32_16x16x32_iu8_w64(_Constant bool, int, _Constant bool, _ExtVector<2, int>, _ExtVector<4, int>, int, _Constant bool)
Target Features: wmma-128b-insts,wavefrontsize64
No documentation.
__builtin_amdgcn_swmmac_i32_16x16x64_iu4_w32¶
Prototype:
_ExtVector<8, int> __builtin_amdgcn_swmmac_i32_16x16x64_iu4_w32(_Constant bool, _ExtVector<2, int>, _Constant bool, _ExtVector<4, int>, _ExtVector<8, int>, int, _Constant bool)
Target Features: wmma-128b-insts,wavefrontsize32
No documentation.
__builtin_amdgcn_swmmac_i32_16x16x64_iu4_w64¶
Prototype:
_ExtVector<4, int> __builtin_amdgcn_swmmac_i32_16x16x64_iu4_w64(_Constant bool, int, _Constant bool, _ExtVector<2, int>, _ExtVector<4, int>, int, _Constant bool)
Target Features: wmma-128b-insts,wavefrontsize64
No documentation.
__builtin_amdgcn_tanh_bf16¶
Prototype:
__bf16 __builtin_amdgcn_tanh_bf16(__bf16)
Target Features: bf16-trans-insts
No documentation.
__builtin_amdgcn_tanhf¶
Prototype:
float __builtin_amdgcn_tanhf(float)
Target Features: tanh-insts
No documentation.
__builtin_amdgcn_tanhh¶
Prototype:
__fp16 __builtin_amdgcn_tanhh(__fp16)
Target Features: tanh-insts
No documentation.
__builtin_amdgcn_tensor_load_to_lds¶
Prototype:
void __builtin_amdgcn_tensor_load_to_lds(_ExtVector<4, int>, _ExtVector<8, int>, _ExtVector<4, int>, _ExtVector<4, int>, _ExtVector<8, int>, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_tensor_store_from_lds¶
Prototype:
void __builtin_amdgcn_tensor_store_from_lds(_ExtVector<4, int>, _ExtVector<8, int>, _ExtVector<4, int>, _ExtVector<4, int>, _ExtVector<8, int>, _Constant int)
Target Features: gfx1250-insts
No documentation.
__builtin_amdgcn_trig_preop¶
Prototype:
double __builtin_amdgcn_trig_preop(double, int)
No documentation.
__builtin_amdgcn_trig_preopf¶
Prototype:
float __builtin_amdgcn_trig_preopf(float, int)
No documentation.
__builtin_amdgcn_ubfe¶
Prototype:
unsigned int __builtin_amdgcn_ubfe(unsigned int, unsigned int, unsigned int)
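Extracts a bit field from the first argument and zero-extends it. The second argument is the bit offset of the field and the third argument is its width in bits.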
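The bit-field extract builtins (__builtin_amdgcn_ubfe and its signed counterpart __builtin_amdgcn_sbfe) can be sketched with a host-side reference model. The code below is plain C, not device code. The _ref names are hypothetical, and the model assumes that only the low five bits of the offset and width are used and that offset + width does not exceed 32; other corner cases should be checked against the ISA descriptions of V_BFE_U32 and V_BFE_I32.

```c
#include <stdint.h>

/* Models ubfe: extract `width` bits starting at bit `offset`, zero-extended. */
uint32_t ubfe_ref(uint32_t src, uint32_t offset, uint32_t width) {
    offset &= 31u;
    width &= 31u;
    return (src >> offset) & ((1u << width) - 1u); /* width 0 yields 0 */
}

/* Models sbfe: same field, sign-extended from its top bit. The result is
   returned as an unsigned bit pattern, matching the builtin's prototype. */
uint32_t sbfe_ref(uint32_t src, uint32_t offset, uint32_t width) {
    uint32_t field = ubfe_ref(src, offset, width);
    width &= 31u;
    if (width == 0u)
        return 0u;
    uint32_t sign = 1u << (width - 1u);
    return (field ^ sign) - sign; /* propagate the sign bit upward */
}
```

For instance, extracting 4 bits at offset 4 from 0xF0 gives 0xF with the unsigned form and 0xFFFFFFFF (that is, -1) with the signed form.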
__builtin_amdgcn_udot2¶
Prototype:
unsigned int __builtin_amdgcn_udot2(_ExtVector<2, unsigned short>, _ExtVector<2, unsigned short>, unsigned int, _Constant bool)
Target Features: dot2-insts
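Computes the dot product of two vectors of two unsigned 16-bit integers and adds the 32-bit accumulator (third argument). The constant argument enables clamping of the result.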
__builtin_amdgcn_udot4¶
Prototype:
unsigned int __builtin_amdgcn_udot4(unsigned int, unsigned int, unsigned int, _Constant bool)
Target Features: dot7-insts
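Computes the dot product of the four unsigned 8-bit integers packed in each of the first two arguments and adds the 32-bit accumulator (third argument). The constant argument enables clamping of the result.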
__builtin_amdgcn_udot8¶
Prototype:
unsigned int __builtin_amdgcn_udot8(unsigned int, unsigned int, unsigned int, _Constant bool)
Target Features: dot7-insts
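Computes the dot product of the eight unsigned 4-bit integers packed in each of the first two arguments and adds the 32-bit accumulator (third argument). The constant argument enables clamping of the result.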
__builtin_amdgcn_uicmp¶
Prototype:
uint64_t __builtin_amdgcn_uicmp(unsigned int, unsigned int, _Constant int)
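Compares the two 32-bit operands in every active lane using the integer predicate selected by the constant argument, and returns a mask with one bit per lane that is set where the comparison holds.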
__builtin_amdgcn_uicmpl¶
Prototype:
uint64_t __builtin_amdgcn_uicmpl(uint64_t, uint64_t, _Constant int)
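Compares the two 64-bit operands in every active lane using the integer predicate selected by the constant argument, and returns a mask with one bit per lane that is set where the comparison holds.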
__builtin_amdgcn_update_dpp¶
Prototype:
int __builtin_amdgcn_update_dpp(int, int, _Constant int, _Constant int, _Constant int, _Constant bool)
Target Features: dpp
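Performs a DPP (data parallel primitives) cross-lane move. The arguments are, in order: the old value used for lanes that receive no data, the source value, the DPP control, the row mask, the bank mask, and the bound-control flag.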
__builtin_amdgcn_wait_asyncmark¶
Prototype:
void __builtin_amdgcn_wait_asyncmark(_Constant unsigned short)
Target Features: vmem-to-lds-load-insts
No documentation.
__builtin_amdgcn_wave_barrier¶
Prototype:
void __builtin_amdgcn_wave_barrier()
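Acts as a barrier for a single wavefront. It prevents the compiler from moving code across the call and is not expected to emit a hardware instruction.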
__builtin_amdgcn_wave_reduce_add_u32¶
Prototype:
uint32_t __builtin_amdgcn_wave_reduce_add_u32(uint32_t, _Constant int32_t)
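Returns the sum of the first argument over all active lanes of the wavefront, using unsigned 32-bit addition. The constant second argument selects the reduction strategy.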
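A host-side sketch of what a wave-wide add reduction is understood to compute follows. It assumes a 64-lane wavefront, that only active lanes contribute, and that the sum wraps modulo 2^32. The strategy argument is omitted because it is assumed to affect code generation only, not the result. The name ref_wave_reduce_add_u32 is hypothetical.

```c
#include <stdint.h>

/* Reference model: v[] holds the per-lane input, exec is the mask of active
   lanes. Every active lane would observe the same returned value. */
static inline uint32_t ref_wave_reduce_add_u32(const uint32_t v[64], uint64_t exec) {
  uint32_t acc = 0; /* identity for addition */
  for (int lane = 0; lane < 64; ++lane)
    if ((exec >> lane) & 1u)
      acc += v[lane]; /* unsigned arithmetic wraps modulo 2^32 */
  return acc;
}
```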
__builtin_amdgcn_wave_reduce_add_u64¶
Prototype:
uint64_t __builtin_amdgcn_wave_reduce_add_u64(uint64_t, _Constant int32_t)
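Returns the sum of the first argument over all active lanes of the wavefront, using unsigned 64-bit addition. The constant second argument selects the reduction strategy.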
__builtin_amdgcn_wave_reduce_and_b32¶
Prototype:
int32_t __builtin_amdgcn_wave_reduce_and_b32(int32_t, _Constant int32_t)
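Returns the bitwise AND of the 32-bit first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.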
__builtin_amdgcn_wave_reduce_and_b64¶
Prototype:
int64_t __builtin_amdgcn_wave_reduce_and_b64(int64_t, _Constant int32_t)
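Returns the bitwise AND of the 64-bit first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.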
__builtin_amdgcn_wave_reduce_fadd_f32¶
Prototype:
float __builtin_amdgcn_wave_reduce_fadd_f32(float, _Constant int32_t)
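Returns the single-precision floating-point sum of the first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.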
__builtin_amdgcn_wave_reduce_fadd_f64¶
Prototype:
double __builtin_amdgcn_wave_reduce_fadd_f64(double, _Constant int32_t)
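Returns the double-precision floating-point sum of the first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.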
__builtin_amdgcn_wave_reduce_fmax_f32¶
Prototype:
float __builtin_amdgcn_wave_reduce_fmax_f32(float, _Constant int32_t)
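Returns the single-precision floating-point maximum of the first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.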
__builtin_amdgcn_wave_reduce_fmax_f64¶
Prototype:
double __builtin_amdgcn_wave_reduce_fmax_f64(double, _Constant int32_t)
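Returns the double-precision floating-point maximum of the first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.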
__builtin_amdgcn_wave_reduce_fmin_f32¶
Prototype:
float __builtin_amdgcn_wave_reduce_fmin_f32(float, _Constant int32_t)
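Returns the single-precision floating-point minimum of the first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.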
__builtin_amdgcn_wave_reduce_fmin_f64¶
Prototype:
double __builtin_amdgcn_wave_reduce_fmin_f64(double, _Constant int32_t)
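Returns the double-precision floating-point minimum of the first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.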
__builtin_amdgcn_wave_reduce_fsub_f32¶
Prototype:
float __builtin_amdgcn_wave_reduce_fsub_f32(float, _Constant int32_t)
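Performs a single-precision floating-point subtraction reduction of the first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.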
__builtin_amdgcn_wave_reduce_fsub_f64¶
Prototype:
double __builtin_amdgcn_wave_reduce_fsub_f64(double, _Constant int32_t)
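Performs a double-precision floating-point subtraction reduction of the first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.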
__builtin_amdgcn_wave_reduce_max_i32¶
Prototype:
int32_t __builtin_amdgcn_wave_reduce_max_i32(int32_t, _Constant int32_t)
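Returns the signed 32-bit maximum of the first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.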
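The min and max reductions follow the same pattern as the other wave reductions, with a different combining operation and identity value. The host-side sketch below models the signed 32-bit maximum for an assumed 64-lane wavefront. The name ref_wave_reduce_max_i32 is hypothetical, and returning the identity when no lane is active is a modelling choice, not documented behaviour.

```c
#include <stdint.h>

/* Reference model: signed maximum over the lanes whose bit is set in exec. */
static inline int32_t ref_wave_reduce_max_i32(const int32_t v[64], uint64_t exec) {
  int32_t acc = INT32_MIN; /* identity for signed max */
  for (int lane = 0; lane < 64; ++lane)
    if (((exec >> lane) & 1u) && v[lane] > acc)
      acc = v[lane];
  return acc;
}
```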
__builtin_amdgcn_wave_reduce_max_i64¶
Prototype:
int64_t __builtin_amdgcn_wave_reduce_max_i64(int64_t, _Constant int32_t)
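Returns the signed 64-bit maximum of the first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.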
__builtin_amdgcn_wave_reduce_max_u32¶
Prototype:
uint32_t __builtin_amdgcn_wave_reduce_max_u32(uint32_t, _Constant int32_t)
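Returns the unsigned 32-bit maximum of the first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.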
__builtin_amdgcn_wave_reduce_max_u64¶
Prototype:
uint64_t __builtin_amdgcn_wave_reduce_max_u64(uint64_t, _Constant int32_t)
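Returns the unsigned 64-bit maximum of the first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.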
__builtin_amdgcn_wave_reduce_min_i32¶
Prototype:
int32_t __builtin_amdgcn_wave_reduce_min_i32(int32_t, _Constant int32_t)
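Returns the signed 32-bit minimum of the first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.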
__builtin_amdgcn_wave_reduce_min_i64¶
Prototype:
int64_t __builtin_amdgcn_wave_reduce_min_i64(int64_t, _Constant int32_t)
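Returns the signed 64-bit minimum of the first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.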
__builtin_amdgcn_wave_reduce_min_u32¶
Prototype:
uint32_t __builtin_amdgcn_wave_reduce_min_u32(uint32_t, _Constant int32_t)
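Returns the unsigned 32-bit minimum of the first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.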
__builtin_amdgcn_wave_reduce_min_u64¶
Prototype:
uint64_t __builtin_amdgcn_wave_reduce_min_u64(uint64_t, _Constant int32_t)
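Returns the unsigned 64-bit minimum of the first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.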
__builtin_amdgcn_wave_reduce_or_b32¶
Prototype:
int32_t __builtin_amdgcn_wave_reduce_or_b32(int32_t, _Constant int32_t)
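Returns the bitwise OR of the 32-bit first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.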
__builtin_amdgcn_wave_reduce_or_b64¶
Prototype:
int64_t __builtin_amdgcn_wave_reduce_or_b64(int64_t, _Constant int32_t)
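Returns the bitwise OR of the 64-bit first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.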
__builtin_amdgcn_wave_reduce_sub_u32¶
Prototype:
uint32_t __builtin_amdgcn_wave_reduce_sub_u32(uint32_t, _Constant int32_t)
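Performs an unsigned 32-bit subtraction reduction of the first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.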
__builtin_amdgcn_wave_reduce_sub_u64¶
Prototype:
uint64_t __builtin_amdgcn_wave_reduce_sub_u64(uint64_t, _Constant int32_t)
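Performs an unsigned 64-bit subtraction reduction of the first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.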
__builtin_amdgcn_wave_reduce_xor_b32¶
Prototype:
int32_t __builtin_amdgcn_wave_reduce_xor_b32(int32_t, _Constant int32_t)
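Returns the bitwise XOR of the 32-bit first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.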
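The bitwise reductions (AND, OR, XOR) can be modelled on the host in the same way as the arithmetic ones. The sketch below covers the 32-bit XOR case for an assumed 64-lane wavefront. The name ref_wave_reduce_xor_b32 is hypothetical.

```c
#include <stdint.h>

/* Reference model: XOR of the per-lane inputs over the active lanes. The
   computation is done on the unsigned bit pattern. */
static inline int32_t ref_wave_reduce_xor_b32(const int32_t v[64], uint64_t exec) {
  uint32_t acc = 0; /* identity for XOR */
  for (int lane = 0; lane < 64; ++lane)
    if ((exec >> lane) & 1u)
      acc ^= (uint32_t)v[lane];
  return (int32_t)acc;
}
```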
__builtin_amdgcn_wave_reduce_xor_b64¶
Prototype:
int64_t __builtin_amdgcn_wave_reduce_xor_b64(int64_t, _Constant int32_t)
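Returns the bitwise XOR of the 64-bit first argument over all active lanes of the wavefront. The constant second argument selects the reduction strategy.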
__builtin_amdgcn_wavefrontsize¶
Prototype:
unsigned int __builtin_amdgcn_wavefrontsize()
Returns the wavefront size, in lanes, for the current target (32 or 64).
__builtin_r600_implicitarg_ptr¶
Prototype:
unsigned char address_space<7> * __builtin_r600_implicitarg_ptr()
Returns a pointer, in address space 7, to the implicit kernel arguments on R600 targets.
__builtin_r600_read_tgid_x¶
Prototype:
unsigned int __builtin_r600_read_tgid_x()
Returns the workgroup (thread group) ID in the X dimension.
__builtin_r600_read_tgid_y¶
Prototype:
unsigned int __builtin_r600_read_tgid_y()
Returns the workgroup (thread group) ID in the Y dimension.
__builtin_r600_read_tgid_z¶
Prototype:
unsigned int __builtin_r600_read_tgid_z()
Returns the workgroup (thread group) ID in the Z dimension.
__builtin_r600_read_tidig_x¶
Prototype:
unsigned int __builtin_r600_read_tidig_x()
Returns the work-item (thread) ID within its workgroup in the X dimension.
__builtin_r600_read_tidig_y¶
Prototype:
unsigned int __builtin_r600_read_tidig_y()
Returns the work-item (thread) ID within its workgroup in the Y dimension.
__builtin_r600_read_tidig_z¶
Prototype:
unsigned int __builtin_r600_read_tidig_z()
Returns the work-item (thread) ID within its workgroup in the Z dimension.
__builtin_r600_recipsqrt_ieee¶
Prototype:
double __builtin_r600_recipsqrt_ieee(double)
Returns the IEEE reciprocal square root of the double-precision argument.
__builtin_r600_recipsqrt_ieeef¶
Prototype:
float __builtin_r600_recipsqrt_ieeef(float)
Returns the IEEE reciprocal square root of the single-precision argument.