I've noticed that a large share of the allocations in my Julia code come from Base.iterate when it doesn't get inlined. That might be a possible target for optimisation: at least, I see no reason why its return value couldn't be stack-allocated.
Here is a small example:
julia> struct ViewIterator{T}
           x::T
       end
julia> @noinline function Base.iterate(x::ViewIterator, i::Int=1)
           i > length(x.x) ? nothing : (@inbounds view(x.x, i:i), i+1)
       end
julia> function foo(x::Vector{Int})
           n = 0
           for i in ViewIterator(x)
               n += @inbounds i[1]
           end
           n
       end
foo (generic function with 1 method)
julia> v = rand(Int, 1000000);
julia> @time foo(v)
0.124638 seconds (1000.00 k allocations: 61.035 MiB, 89.47% gc time)
-754608176331557632
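The `@time` figure above comes from a single call, which can also include compilation. A minimal sketch of one way to isolate the per-call allocations is to warm the function up first and then use `@allocated`. The definitions are repeated from the example so that the block runs on its own; the variable name `bytes` is only illustrative, and no particular output is assumed since it may differ between Julia versions. For reference, the 61.035 MiB reported above over 1,000,000 iterations works out to roughly 64 bytes per iteration.

```julia
# Same definitions as in the example above.
struct ViewIterator{T}
    x::T
end

@noinline function Base.iterate(x::ViewIterator, i::Int=1)
    i > length(x.x) ? nothing : (@inbounds view(x.x, i:i), i+1)
end

function foo(x::Vector{Int})
    n = 0
    for i in ViewIterator(x)
        n += @inbounds i[1]
    end
    n
end

v = rand(Int, 1_000_000)

foo(v)                      # warm-up call, so compilation is not measured
bytes = @allocated foo(v)   # bytes allocated by the second call only

println("bytes allocated:     ", bytes)
println("bytes per iteration: ", bytes / length(v))
```

If the per-iteration figure stays well above zero after the warm-up, the allocations come from the loop itself rather than from compilation.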
These allocations appear to come from the Union{Tuple{SubArray, Int}, Nothing} return value of iterate. Interestingly, it's not the case that a non-inlined function returning a SubArray heap-allocates by itself: if we change the code as follows, the result is allocated on the stack:
julia> @noinline bar(x, i) = @inbounds view(x, i:i);
julia> function foo(x::Vector{Int})
           n = 0
           for i in 1:length(x)
               n += @inbounds bar(x, i)[1]
           end
           n
       end;
julia> @time foo(v)
0.002680 seconds
7211620882627564191
So something causes the small union to be heap-allocated, whereas the bare SubArray can be stack-allocated.
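To compare what the compiler infers for the two non-inlined functions, one option is `Base.return_types`, which lists the inferred return type for a given argument signature. The sketch below repeats the definitions from the examples so that it is self-contained; the names `iter_rts` and `bar_rts` are only illustrative. It shows the inferred types and nothing more: it does not by itself explain why one return value ends up on the heap.

```julia
struct ViewIterator{T}
    x::T
end

@noinline function Base.iterate(x::ViewIterator, i::Int=1)
    i > length(x.x) ? nothing : (@inbounds view(x.x, i:i), i+1)
end

@noinline bar(x, i) = @inbounds view(x, i:i)

# Inferred return types for the concrete argument types used in the examples.
iter_rts = Base.return_types(iterate, (ViewIterator{Vector{Int}}, Int))
bar_rts  = Base.return_types(bar, (Vector{Int}, Int))

println("iterate: ", iter_rts)   # expected to be a Union that includes Nothing
println("bar:     ", bar_rts)    # expected to be a concrete SubArray type
```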