Skip to content

Performance of ReinterpretArray, continued #28980

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
RalphAS opened this issue Aug 31, 2018 · 6 comments · Fixed by #37559
Closed

Performance of ReinterpretArray, continued #28980

RalphAS opened this issue Aug 31, 2018 · 6 comments · Fixed by #37559
Labels
performance Must go faster

Comments

@RalphAS
Copy link

RalphAS commented Aug 31, 2018

Although the problem described in #25014 was handled by #28707, there is more to be done.
To start, can the pattern of the latter PR be extended to setindex?

Example (based on a regression in QuartzImageIO)

using BenchmarkTools, ImageCore, Colors, FixedPointNumbers

const old = VERSION < v"0.7"

function fillcolor1!(buffer::AbstractArray{T, 3}, imgsrc, nc) where T
    imwidth, imheight = size(buffer, 2), size(buffer, 3)
    pixelptr = pointer(imgsrc)
    if old
        imbuffer = unsafe_wrap(Array, pixelptr, (nc, imwidth, imheight), false)
    else
        imbuffer = unsafe_wrap(Array, pixelptr, (nc, imwidth, imheight), own=false)
    end
    buffer[:, :, :] = imbuffer
end

function fillcolor2!(buffer::AbstractArray{T, 3}, imgsrc, nc) where T
    imwidth, imheight = size(buffer, 2), size(buffer, 3)
    pixelptr = pointer(imgsrc)
    if old
        imbuffer = unsafe_wrap(Array, pixelptr, (nc, imwidth, imheight), false)
    else
        imbuffer = unsafe_wrap(Array, pixelptr, (nc, imwidth, imheight), own=false)
    end
    buffer .= imbuffer
end

function runme()
    T = N0f8
    sz = (512,512)
    sz3 = (3,sz...)
    imgsrc = rand(T,sz3)
    if old
        buf1 = Array{RGB{T}}(sz)
    else
        buf1 = Array{RGB{T}}(undef,sz)
    end
    buf2 = similar(buf1)
    buf1r = reshape(reinterpret(T, buf1), (3, sz...))
    buf2r = reshape(reinterpret(T, buf1), (3, sz...))
    println("buf2r is a ",typeof(buf2r))
    @btime fillcolor1!($buf1r, $imgsrc, 3)
    @btime fillcolor2!($buf2r, $imgsrc, 3)
    buf2r == buf1r
end

v0.6.4:

buf2r is a Array{FixedPointNumbers.Normed{UInt8,8},3}
  908.177 μs (8 allocations: 224 bytes)
  47.144 μs (2 allocations: 112 bytes)
true

v0.7.0:

buf2r is a Base.ReshapedArray{Normed{UInt8,8},3,Base.ReinterpretArray{Normed{UInt8,8},2,RGB{Normed{UInt8,8}},Array{RGB{Normed{UInt8,8}},2}},Tuple{}}
  505.656 ms (7863300 allocations: 275.98 MiB)
  3.866 ms (2 allocations: 112 bytes)
true

v1.1.0-DEV.140:

buf2r is a Base.ReshapedArray{FixedPointNumbers.Normed{UInt8,8},3,Base.ReinterpretArray{FixedPointNumbers.Normed{UInt8,8},2,ColorTypes.RGB{FixedPointNumbers.Normed{UInt8,8}},Array{ColorTypes.RGB{FixedPointNumbers.Normed{UInt8,8}},2}},Tuple{}}
  1.031 s (7863300 allocations: 275.98 MiB)
  2.497 ms (2 allocations: 112 bytes)
true

Are the two versions of fillarray guaranteed to have equivalent results? If so, could the first now be made to throw?

@Keno
Copy link
Member

Keno commented Aug 31, 2018

Please don't use unsafe_wrap in this way. It's very hard to get right. In particular the example will randomly segfault. If you write it out as a proper assignment and the performance is still bad, we can work on that.

@RalphAS
Copy link
Author

RalphAS commented Sep 1, 2018

Fair enough; that code was extracted (with minor changes to make it stand-alone) from a current package where one hopes the unsafe_wrap makes more sense, for the sake of a spectacular example.

A cleaner case, from ImageCore, boils down to this:

const old = VERSION < v"0.7"
ssz = (1000,1000)
c = rand(RGB{Float64}, ssz...)
if old
    a = copy(reinterpret(Float64, c))
else
    a = copy(reinterpretc(Float64, c))
end
vchan = channelview(c)
vcol = colorview(RGB, a)

function myfill1!(A, val)
    for I in eachindex(A)
        A[I] = val
    end
    A
end

const zcol = zero(eltype(vcol))
const zchan = zero(eltype(vchan))

println("VERSION: ",VERSION)
println("arg type: ",typeof(vcol))
@btime myfill1!($vcol,zcol)
println("arg type: ",typeof(vchan))
@btime myfill1!($vchan,zchan)
VERSION: 0.6.4
arg type: Array{ColorTypes.RGB{Float64},2}
  1.891 ms (0 allocations: 0 bytes)
arg type: Array{Float64,3}
  1.835 ms (0 allocations: 0 bytes)

VERSION: 0.7.0
arg type: Base.ReshapedArray{ColorTypes.RGB{Float64},2,Base.ReinterpretArray{ColorTypes.RGB{Float64},3,Float64,Array{Float64,3}},Tuple{}}
  10.042 ms (0 allocations: 0 bytes)
arg type: Base.ReinterpretArray{Float64,3,ColorTypes.RGB{Float64},Array{ColorTypes.RGB{Float64},3}}
  33.735 ms (0 allocations: 0 bytes)

VERSION: 1.1.0-DEV.140
arg type: Base.ReshapedArray{ColorTypes.RGB{Float64},2,Base.ReinterpretArray{ColorTypes.RGB{Float64},3,Float64,Array{Float64,3}},Tuple{}}
  10.670 ms (0 allocations: 0 bytes)
arg type: Base.ReinterpretArray{Float64,3,ColorTypes.RGB{Float64},Array{ColorTypes.RGB{Float64},3}}
  29.510 ms (0 allocations: 0 bytes)

@RalphAS
Copy link
Author

RalphAS commented Sep 1, 2018

And to follow up on the original example, removing the pointless copy with unsafe_wrap, so I'm timing the simple functions

function fillcolor1!(buffer::AbstractArray{T, 3}, imgsrc, nc) where T
    buffer[:, :, :] = imgsrc
end

B-times:
v0.6.4:
908.451 μs (6 allocations: 112 bytes)
v0.7.0:
489.031 ms (7863298 allocations: 275.98 MiB)
v1.1.0-DEV.140:
1.048 s (7863298 allocations: 275.98 MiB)

function fillcolor2!(buffer::AbstractArray{T, 3}, imgsrc, nc) where T
    buffer .= imgsrc
end

v0.6.4
47.005 μs (0 allocations: 0 bytes)
v0.7.0
18.541 ms (0 allocations: 0 bytes)
v1.1.0-DEV.140:
18.528 ms (0 allocations: 0 bytes)

@Keno Keno added the performance Must go faster label Sep 2, 2018
@chethega
Copy link
Contributor

chethega commented Sep 3, 2018

Could we revive #27213, now that the 1.0 release is done? That seemed significantly less complex, more robust and more elegant than the alternatives.

@ifquant
Copy link

ifquant commented Sep 9, 2018

using Images
img1 = "./test1.png" #a png file 8MB size.
@time load(img1)
@time load(img1)
@time open(f->read(f, String), img1)

7.160414 seconds (21.85 M allocations: 1.549 GiB, 9.59% gc time)
2.421921 seconds (9.28 k allocations: 521.201 MiB, 5.13% gc time)
0.011717 seconds (11.96 k allocations: 9.105 MiB)

@timholy
Copy link
Member

timholy commented Sep 4, 2019

Bump.

using ColorTypes, BenchmarkTools

# Julia 0.6
julia> @btime rand(RGB{Float64}, 20, 20);
  1.855 μs (3 allocations: 9.59 KiB)

# Julia 1.2
julia> @btime rand(RGB{Float64}, 20, 20);
  12.046 μs (3 allocations: 9.58 KiB)

JuliaGraphics/ColorTypes.jl#122

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants