Skip to content

Conversation

pazner
Copy link
Contributor

@pazner pazner commented Sep 16, 2020

Fixes issue #197.

The issue is caused by threading, and if JULIA_NUM_THREADS is greater than 1, it can be reproduced with the following MWE, which compares the gradients computed with ∇conv_filter_im2col to those computed with ∇conv_filter_direct.

using NNlib, Random

function compare_grad()
    Random.seed!(0)

    x = rand(50,1,10)
    w = rand(3,1,1)
    dy = rand(50,1,10)

    cdims = DenseConvDims(size(x), size(w); stride=1, padding=1)
    dw1 = NNlib.∇conv_filter_im2col(x, dy, cdims)
    dw2 = NNlib.∇conv_filter_direct(x, dy, cdims)
    maximum(abs, dw1-dw2)
end

In a correct implementation, compare_grad() should always return zero. Since the random seed is set to zero, the result should also be deterministic. However, when using multithreading it returns seemingly random large numbers:

julia> Base.Threads.nthreads()
4

julia> compare_grad()
28.640897825466183

julia> compare_grad()
15.199002081309956

julia> compare_grad()
26.932418843285404

This is caused because ∇conv_filter_im2col! uses a threaded for-loop that attempts to accumulate the result to a single array dw, causing write conflicts.

I fixed this issue simply by disabling threading for this loop. Another option would be to introduce another thread-specific workspace for dw, and accumulate over threads, but I suspect this will not be worth it.

This PR also fixes some bugs involving GC.@preserve by removing commas from the argument list. Commas in this context cause the arguments to be treated as a tuple, as can be seen from the following example:

julia> @macroexpand @GC.preserve x y begin end
:($(Expr(:gc_preserve, quote
    #= REPL[11]:1 =#
end, :x, :y)))

julia> @macroexpand @GC.preserve x, y, begin end
:($(Expr(:gc_preserve, :((x, y, begin
          #= REPL[10]:1 =#
      end)))))

@CarloLucibello
Copy link
Member

yeah, better disable multi-threading and fix the bug. We can try to reimplement it later in a correct way and with proper benchmarking

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants