As diagnosed in https://github.com/JuliaGPU/CUDA.jl/issues/2615, automatically prefetching unified memory arrays is counterproductive when an array is used on multiple devices. For the time being, we should probably disable this prefetching whenever the system has more than one device.
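
A minimal sketch of the proposed gating, not the actual patch: `should_prefetch` is a hypothetical helper name that does not exist in CUDA.jl, and it takes the device count as a plain integer so the snippet runs without a GPU. In a real session the count could presumably come from `length(CUDA.devices())`.

```julia
# Hypothetical helper (not part of CUDA.jl): decide whether a unified
# memory array should be prefetched automatically.
# Prefetching stays on only when exactly one device is visible, since
# with several devices the array may be in use on another one.
should_prefetch(ndevices::Integer) = ndevices == 1

# Assumed usage with CUDA.jl loaded:
#   should_prefetch(length(CUDA.devices()))
```

Keying on the device count rather than on actual per-array usage is deliberately conservative: it also disables prefetching for arrays that only ever touch one device on a multi-GPU machine.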