Thus, a device-to-host copy is performed during the bounds check.

```julia
julia> x = AMDGPU.rand(Float32, 16);

julia> x[[1, 2, 3, 4]];
[D to H] ROCArray{Bool, 1, AMDGPU.Runtime.Mem.HIPBuffer}: (1,) -> Vector{Bool}: (1,)

julia> @inbounds x[[1, 2, 3, 4]];
[D to H] ROCArray{Bool, 1, AMDGPU.Runtime.Mem.HIPBuffer}: (1,) -> Vector{Bool}: (1,)
```