Releases: intel/llvm
DPC++ daily 2022-06-22
[ESIMD] Implement stateless memory accesses enforcement (#6287) Adds the driver option -f[no-]sycl-esimd-force-stateless-mem. -fsycl-esimd-force-stateless-mem enables automatic conversion of stateful memory accesses (via SYCL accessors or a surface index) to stateless accesses within ESIMD kernels. It also disables the ESIMD intrinsics whose stateful accesses cannot be converted to stateless. The option defines the macro __ESIMD_FORCE_STATELESS_MEM, which maps ESIMD API calls that take accessors to the API calls that take pointers. It also passes a switch to sycl-post-link telling it to ignore the buffer_t attribute and use svmptr_t instead. -fno-sycl-esimd-force-stateless-mem tells the compiler not to convert stateful memory accesses to stateless; this is the default behavior. Draft of the design document/proposal for this change-set: #6187
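The macro-based mapping described above can be pictured with a small stand-alone sketch. This is not the ESIMD implementation: `ToyAccessor` and `toy_load` are hypothetical names invented for illustration, and only the macro name `__ESIMD_FORCE_STATELESS_MEM` comes from the commit message. It shows one plausible way an accessor-taking call can forward to its pointer-taking counterpart when the macro is defined.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a SYCL accessor: a "surface" plus a base offset.
struct ToyAccessor {
    std::vector<int>* surface;
    std::size_t base;
    // Raw pointer to the first element visible through the accessor.
    int* get_pointer() const { return surface->data() + base; }
};

// Pointer-based ("stateless") load. Hypothetical name.
inline int toy_load(const int* ptr, std::size_t offset) {
    assert(ptr != nullptr);
    return ptr[offset];
}

#ifdef __ESIMD_FORCE_STATELESS_MEM
// Macro defined: the accessor overload forwards to the pointer overload.
inline int toy_load(const ToyAccessor& acc, std::size_t offset) {
    return toy_load(acc.get_pointer(), offset);
}
#else
// Macro not defined (the default): access goes through the surface itself.
inline int toy_load(const ToyAccessor& acc, std::size_t offset) {
    assert(acc.surface != nullptr);
    return (*acc.surface)[acc.base + offset];
}
#endif
```

Both branches return the same value for the same arguments; only the access path differs. With the real compiler the macro would be defined by passing -fsycl-esimd-force-stateless-mem on the command line rather than by user code.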
DPC++ daily 2022-06-21
[SYCL][PI][CUDA] Fix too many streams getting synchronized (#6333) Fixes an off-by-one error introduced in https://github.com/intel/llvm/pull/6201 that caused queue synchronization to synchronize all streams when no stream had been used. The code still produced correct results, but the unnecessary synchronization could hurt performance in some cases.
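For readers unfamiliar with this class of bug, here is a minimal sketch of choosing which streams in a fixed-size pool need synchronizing. It is not the CUDA plugin code: the function `streams_to_sync` and the monotonic counters `last_synced` and `next_stream` are assumptions made for illustration. The case to get right is the boundary: when no stream has been used since the last synchronization the result must be empty, and only a full wrap-around should select every stream.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the pool indices of streams used since the last synchronization.
// last_synced and next_stream are hypothetical monotonically increasing
// counters; a stream's pool index is its counter value modulo pool_size.
std::vector<std::size_t> streams_to_sync(std::size_t last_synced,
                                         std::size_t next_stream,
                                         std::size_t pool_size) {
    assert(pool_size > 0);
    assert(next_stream >= last_synced);

    std::size_t used = next_stream - last_synced;
    // If the counter wrapped around the pool, every stream was used once.
    if (used > pool_size) {
        used = pool_size;
    }

    std::vector<std::size_t> indices;
    indices.reserve(used);
    for (std::size_t i = 0; i < used; ++i) {
        indices.push_back((last_synced + i) % pool_size);
    }
    return indices;  // empty when no stream was used
}
```

A comparison that is off by one at the `used == 0` boundary would treat "nothing used" like "everything used", which matches the symptom described in the commit message: results stay correct, but every stream is synchronized unnecessarily.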
DPC++ daily 2022-06-20
sycl-nightly/20220620 [SYCL] Add missing iterator header for std::back_inserter (#6329)
DPC++ daily 2022-06-18
sycl-nightly/20220618 [SYCL] Add group::get_linear_id(int dim) overload (#6320)
DPC++ daily 2022-06-17
[GHA] Uplift GPU RT version for Linux CI (#6300) Uplift GPU RT version for Linux to 22.23.23405 Co-authored-by: GitHub Actions <[email protected]>
DPC++ daily 2022-06-16
[GHA] Uplift GPU RT version for Nightly Builds (#6299) Uplift GPU RT version for Linux to 22.23.23405 Co-authored-by: GitHub Actions <[email protected]>
DPC++ daily 2022-06-15
sycl-nightly/20220615 [SYCL] Refactor SYCL kernel object handling in hierarchical paralleli…
DPC++ daily 2022-06-14
[SYCL] Add aspect for bfloat16 (#5720) This PR adds a new aspect, ext_oneapi_bfloat16, so that code can check at runtime whether the device supports the bfloat16 floating-point type. Only the CUDA implementation of this check is added. Updated test: intel/llvm-test-suite#888
DPC++ daily 2022-06-13
[SYCL][Doc] Initial draft of root-group proposal (#6163) Introduces a new group type (root_group) representing all work-items executing a kernel, along with an associated kernel property that enables all work-items within a root_group to synchronize. Signed-off-by: John Pennycook <[email protected]>
DPC++ daily 2022-06-12
[SYCL][Doc] Initial draft of root-group proposal (#6163) Introduces a new group type (root_group) representing all work-items executing a kernel, along with an associated kernel property that enables all work-items within a root_group to synchronize. Signed-off-by: John Pennycook <[email protected]>