
`feature_transform`: add multithreading #47


Merged: timholy merged 2 commits into master from teh/parallelize_feat on Sep 25, 2021

Conversation


@timholy timholy commented Sep 24, 2021

If Julia is being run with multiple threads, this will now default
to using all threads in `feature_transform`. The multithreaded
implementation is relatively straightforward, although the recursive algorithm
makes it a little hard to wrap your head around:

  • `computeft!` for dimension `d` depends only on slices of dimensions `1:d`,
    and indeed for all except the final call to `voronoift!` just on dimensions `1:d-1`.
    Hence you can divide the last dimension `N` into chunks, skip the final
    `voronoift!` on dimension `N`, and give each chunk to a thread.

  • For the final `voronoift!` call along dimension `N`, it's a
    one-dimensional operation along this dimension. Hence you can just
    split the next-to-last dimension into chunks instead (see the sketch
    after this list).
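A minimal sketch of the chunking idea, for illustration only (`foreach_chunk!` and `f!` are hypothetical names, not the actual `computeft!`/`voronoift!` internals):

```julia
using Base.Threads: @spawn

# Split the last dimension of `A` into chunks and run `f!` on a view
# of each chunk in its own task. Illustrative sketch, not the PR's code.
function foreach_chunk!(f!, A::AbstractArray{T,N}, nthreads::Int) where {T,N}
    ax = axes(A, N)                              # the dimension being chunked
    chunklen = cld(length(ax), max(nthreads, 1)) # ceiling division
    @sync for chunk in Iterators.partition(ax, chunklen)
        @spawn f!(view(A, ntuple(_ -> Colon(), N - 1)..., chunk))
    end
    return A
end

# Usage: increment a 2-d array chunk-by-chunk along its last dimension.
A = zeros(4, 10)
foreach_chunk!(x -> x .+= 1, A, Threads.nthreads())
```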

This change was responsible for performance improvements in `distance_transform`
noted in JuliaImages/image_benchmarks#1.

I am unsure about the generalization to `Gray{Bool}`; Julia doesn't treat anything besides `Bool` as a boolean in `if` statements, but OTOH when we load a "binary image" it comes in as `Gray{Bool}`. I'd be interested in the thoughts of others.
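To make the `if` restriction concrete:

```julia
julia> using ColorTypes

julia> if true; "ok" end
"ok"

julia> if Gray(true); "ok" end   # Gray{Bool} is not accepted as a condition
ERROR: TypeError: non-boolean (Gray{Bool}) used in boolean context
```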

@timholy timholy force-pushed the teh/parallelize_feat branch 2 times, most recently from 18ed6fb to 7bae26f on September 24, 2021 at 17:53

@johnnychen94 johnnychen94 left a comment


Very nice work!

> I am unsure about the generalization to `Gray{Bool}`; Julia doesn't treat anything besides `Bool` as a boolean in `if` statements, but OTOH when we load a "binary image" it comes in as `Gray{Bool}`. I'd be interested in the thoughts of others.

I wouldn't be surprised if people try to directly pass the output from `ImageBinarization.binarize` (`Gray{Bool}`) into `feature_transform`.

```diff
-function feature_transform(I::AbstractArray{Bool,N}, w::Union{Nothing,NTuple{N}}=nothing) where N
+function feature_transform(img::AbstractArray{<:Union{Bool,AbstractGray{Bool}},N};
+                           weights::Union{Nothing,NTuple{N}}=nothing,
+                           nthreads::Int = length(img) < 1000 ? 1 : Threads.nthreads()) where N
```
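One way the `Gray{Bool}` case could be supported internally is to unwrap to a plain `Bool` array up front and reuse the existing `Bool` kernel; a hedged sketch (the `as_bool` helper is hypothetical, not part of the package):

```julia
using ColorTypes: AbstractGray, gray

# Hypothetical helper: plain Bool arrays pass through untouched;
# Gray{Bool} pixels are unwrapped with `gray`, giving an Array{Bool}.
as_bool(img::AbstractArray{Bool}) = img
as_bool(img::AbstractArray{<:AbstractGray{Bool}}) = gray.(img)
```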
@johnnychen94 (Member) commented:

Curious to ask, does this mean that ComputationalResources.jl is considered deprecated?

@timholy (Member, Author) replied:

I'm glad you asked. I'm not sure how to think about that. It was added before keyword arguments to give meaning to options that might otherwise be too simple to be unambiguous (as you've noted, `imfilter` has a complicated dispatch hierarchy and could be cleaned up with keyword arguments). That said, it does provide facilities for passing options, and that's not a bad thing. I could incorporate it here. Just not quite sure what to think...

@johnnychen94 (Member) replied:

From a more general perspective, I could imagine that algorithm `f` is expected to accept `img::CuArray` and call the CUDA implementation, in which case there won't be an `nthreads` keyword.

The difficulty with ComputationalResources is that if we don't separate the interface dispatch from the implementation dispatch, we need to handle the complicated dispatch hierarchy for every function. I personally think it is affordable to maintain one or two complicated dispatch hierarchies using ComputationalResources. So, for example, in ImageBinarization we can configure CR only for `binarize`/`binarize!` and leave all the implementation dispatch to concrete algorithms, e.g., https://github.com/zygmuntszpak/ImageBinarization.jl/blob/0c0c5eaefa997f07746c24887c9ed5fad6199d9b/src/algorithms/niblack.jl#L84-L85
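For illustration, a rough sketch of that split with hypothetical names (not the actual ImageBinarization surface): CR-aware dispatch lives only at the `binarize`/`binarize!` interface, and resource-specific methods sit behind it.

```julia
using ComputationalResources: AbstractResource, CPU1, CPUThreads

# Interface level: one CR-aware entry point; CPU1 is the default resource.
binarize(img, alg) = binarize(CPU1(), img, alg)
binarize(r::AbstractResource, img, alg) = binarize!(r, similar(img), img, alg)

# Implementation level: methods dispatch on the resource type, so each
# concrete algorithm never re-handles the whole resource hierarchy.
function binarize!(::CPU1, out, img, alg)
    # serial implementation goes here
    return out
end
function binarize!(::CPUThreads, out, img, alg)
    # threaded implementation goes here
    return out
end
```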

That said, the current `nthreads` keyword version looks good to me right now. We can re-raise this discussion when we have more concrete issues.

@timholy timholy force-pushed the teh/parallelize_feat branch 2 times, most recently from d3b8e06 to 5efaffd on September 25, 2021 at 09:38
timholy and others added 2 commits September 25, 2021 04:58
`feature_transform`: add multithreading

Julia 1.3 is required for `@spawn`

Co-authored-by: Johnny Chen <[email protected]>
@timholy timholy force-pushed the teh/parallelize_feat branch from 5efaffd to 41f9fa8 on September 25, 2021 at 09:59
@timholy timholy merged commit 524f02c into master Sep 25, 2021
@timholy timholy deleted the teh/parallelize_feat branch September 25, 2021 10:04
johnnychen94 pushed a commit that referenced this pull request May 21, 2022
`feature_transform`: add multithreading