Skip to content

add sync.Barrier? #1382

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
rsc opened this issue Jan 4, 2011 · 48 comments
Closed

add sync.Barrier? #1382

rsc opened this issue Jan 4, 2011 · 48 comments

Comments

@rsc
Copy link
Contributor

rsc commented Jan 4, 2011

something like this?
seems to come up a lot

// A Barrier waits for a collection of goroutines to finish.
// The main goroutine calls Add to set the number of
// goroutines to wait for.  Then each of the goroutines
// runs and calls Done when finished.  At the same time,
// the main goroutine can call Wait to block until the
// given number of goroutines have finished.
@gopherbot
Copy link
Contributor

Comment 1 by leimy2k:

Naive library based API implementation?
Seems to work with a very basic test.
package barrier
import (
    "sync"
)
type Group struct {
    count int // number of checkins
    total int
    signalin chan bool
    waiter * sync.RWMutex
}
func controller (g * Group) {
    for g.count != g.total {
        <- g.signalin
        g.count++
    }
    g.waiter.Unlock()
}
func NewGroup (size int) (* Group) {
    g := &Group{0,size, make(chan bool, size), &sync.RWMutex{}}
    g.waiter.Lock()
    go controller(g)
    return g
}
func (g * Group) Wait () {
    g.signalin <- true;
    g.waiter.RLock()
}

@gopherbot
Copy link
Contributor

Comment 2 by leimy2k:

Here's the test
package barrier
import (
    "testing"
)
func worker (g * Group, t *testing.T) {
    t.Log("<---")
    g.Wait()
    t.Log("--->")
}
func Test1(t *testing.T) {
    g := NewGroup(10)
    for i := 0; i < 9; i++ {  // we are the last worker
        go worker(g,t)
    }
    
    worker(g,t)
}

@gopherbot
Copy link
Contributor

Comment 3 by leimy2k:

Perhaps a better test:
package barrier
import (
    "testing"
)
func worker (g * Group, t *testing.T, bc chan <- bool) {
    t.Log("<---\n")
    g.Wait()
    t.Log("--->\n")
    bc <- true
}
func Test1(t *testing.T) {
    bc := make(chan bool)
    g := NewGroup(10)
    for i := 0; i < 10; i++ {  // we are the last worker
        go worker(g,t, bc)
    }
    
    for i := 0; i < 10; i++ {
        <- bc
    }
}

@gopherbot
Copy link
Contributor

Comment 4 by leimy2k:

And no I didn't follow the original directions to make an API of Add, Done and Wait,
because I don't think it makes a lot of sense :-).
Then again, I'm coming from an MPI programming background where you have
MPI_Barrier over a communicator, which is really a group of ranks (processes), that all
must call MPI_Barrier for any of them to escape the barrier.

@gopherbot
Copy link
Contributor

Comment 5 by leimy2k:

I implemented a Barrier with the following API for Go.
NewGroup(size)
AddN
Wait.
https://github.com/Leimy/Go-Barrier
There's gotest based unit tests that show how this works.  It's not precisely what's
stated above, but I'm not sure if what's stated above matches what I've read people have
asked for on the golang-nuts group posts which is "the last guy turns off the light".
I implemented that.

@gopherbot
Copy link
Contributor

Comment 6 by leimy2k:

https://github.com/Leimy/Go-Barrier implements 
NewGroup(Size)
AddN(Size)
Wait()
And some examples of how Wait is barrier synchronization are in the barrier_test.go
source file.
Sorry for so many edits.  I've been jumping around tasks and doing a bad job of keeping
this stuff clean :-(

@gopherbot
Copy link
Contributor

Comment 7 by leimy2k:

Not sure I did the codereview correctly but here we go:
http://golang.org/cl/3769044/

@rogpeppe
Copy link
Contributor

rogpeppe commented Jan 5, 2011

Comment 8:

AFAICS there are two quite different APIs here:
- A way for one goroutine to wait for others to complete.
- A way for several goroutines to wait for each other to change computation "phase".
The former is what's being asked for in the initial post in this thread
http://groups.google.com/group/golang-nuts/browse_thread/thread/b26b7e519ef9308
and what Russ proposes above.
The latter is what's being provided by leimy2k above, and is a more convention
semantic for something called a "Barrier".
Both are useful, but for quite different things. I'd suggest that the
former is a more common requirement. The semantics for the latter
need some thinking about (perhaps taking some notice of X10's Clocks?
 - see http://dist.codehaus.org/x10/documentation/languagespec/x10-latest.pdf).
I've attached an implementation, based on some code I wrote some time ago,
that implements the former, with slightly more relaxed semantics than Russ suggests
(neither Add nor Wait need be called in the main goroutine; it allows several goroutines
to wait for the same collection to complete).

Attachments:

  1. waiter.go (1260 bytes)

@niemeyer
Copy link
Contributor

niemeyer commented Jan 5, 2011

Comment 9:

Submitted a proposal closer to the original idea to:
http://golang.org/cl/3770045

@niemeyer
Copy link
Contributor

niemeyer commented Jan 5, 2011

Comment 10:

By the way, I believe this proposal enables both use cases described with the same
simple API suggested by Russ.

@gopherbot
Copy link
Contributor

Comment 11 by leimy2k:

Might want to put defers around the unlocks but the gist looks pretty good.  
I still think of what I implemented when I hear or read "barrier" but I understand I may
have a different programming background than others, and it may not be what's intended.  
I will still use mine in my own namespaces if my patch is not accepted so just get the
right one for the community.

@niemeyer
Copy link
Contributor

niemeyer commented Jan 5, 2011

Comment 12:

I'll be happy to, but I can't see a reason to do so given these operations.  Will be
happy to be convinced otherwise, though.
Regarding the distinction of what you implemented, is there any use case you were
looking for which isn't handled by this interface?

@rogpeppe
Copy link
Contributor

rogpeppe commented Jan 5, 2011

Comment 13:

> By the way, I believe this proposal enables both use cases described with the same
simple API suggested by Russ.
i don't think it does.
here's an example that i came across a while ago: implementing
a cellular automaton with concurrent calculation of cells,
a classic barrier algorithm.
the central goroutine looks like this:
func worker(c0, c1 *cells, x0, x1 int, b *Barrier) {
    for {
        calc(c0, c1, x0, x1)
        b.Wait()
        calc(c1, c0, x0, x1)
        b.Wait()
    }
}
it calculates a single stripe of cells within the rectangular cell
array. there's one goroutine for each stripe, and source and
destination cell arrays alternate with each step.
it's crucial that all stripes have finished processing before
the source and destination are swapped.
i'm not sure you could easily use something like russ's proposed
API to do this. e.g. this is wrong:
func worker(c0, c1 *cells, x0, x1 int, b *Barrier) {
    for {
        b.Add(1)
        calc(c0, c1, x0, x1)
        b.Done()
        b.Wait()
        b.Add(1)
        calc(c1, c0, x0, x1)
        b.Done()
        b.Wait()
    }
}

@rogpeppe
Copy link
Contributor

rogpeppe commented Jan 5, 2011

Comment 14:

> Might want to put defers around the unlocks
no. defers are only necessary when you're running arbitrary code that
may panic, and in this case there's no such code (if the code
does panic, then it's a bug and needs fixing a.s.a.p.).
a defer does at least one allocation (currently) so it's worth
avoiding.

@niemeyer
Copy link
Contributor

niemeyer commented Jan 5, 2011

Comment 15:

It's trivial to enable it to support this case comfortably.  I've added a preset value
which enables you to do this now:
func worker(c0, c1 *cells, x0, x1 int, b *Barrier) {
    for {
        calc(c0, c1, x0, x1)
        b.DoneWait()
        calc(c1, c0, x0, x1)
        b.DoneWait()
    }
}

@gopherbot
Copy link
Contributor

Comment 16 by leimy2k:

Rog, it sounds like an optimization to avoid the defer before one knows if it's really a
problem for anyone's code.  I may not have written enough Go yet so I'll just bend to
your experience with this.
Also the example in your version can be implemented pretty easily in my proposed API
below is your example:
// e.g. to start a collection of goroutines and wait for them to complete:
//  var w Waiter
//  w.Add(10)
//  for i := 0; i < 10; i++ {
//      go func() {
//              doSomething()
//          w.Done()
//      }()
//  }
//  w.Wait()
I'd do that like this:
    b := NewBarrier(10+1)
    for i:=0; i< 10; i++ {
        go func () {
            doSomething()
            b.Wait()
        }
    }
    b.Wait()
Yes, the Done call means you don't need N+1 items in the barrier group, and that may be
a lot clearer to some people, but I think Done and Wait are the same function
fundamentally, which is why I ignored the advice to implement them to begin with :-).  
In the end I don't really care which one gets adopted, (if any are)  because I
understand how both work and can make use of either.  Also I suspect that people may
understand N calls to Done, and 1 call to Wait a little bit better than N+1 Wait calls,
but that's not how I learned to use barriers.  I suspect I've a different background
than others though.
Dave

@niemeyer
Copy link
Contributor

niemeyer commented Jan 5, 2011

Comment 17:

It's not *just* a matter of interface.  This model will block all the goroutines
from terminating while they're not all finished, for no good reason.  The
Done() + Wait() model enables doing both the MPI-style barrier you'd like to have,
and the "let me know when everyone is done" one, with a straightforward interface.

@gopherbot
Copy link
Contributor

Comment 18 by leimy2k:

MPI style barriers have one API call
MPI_Barrier(Communicator)
The difference, that you very nearly pointed out, is that the ranks in a communicator in
an MPI job actually live for the entire program's runtime.   In this case Go can have a
barrier over processes that will want to complete.
I think I'm coding to a different set of problems at this point than other people.  In
fact this makes me think I should want a "reset" function in my API so I can set the
barrier "Check in" count back to 0 for multi-stage algorithms that might need a few
barrier checkpoints over a set of goroutines.
In that case I don't want the goroutines to exit early.  I just want them to pause
before continuing their work.
I think our solutions and APIs start to diverge here :-)

@niemeyer
Copy link
Contributor

niemeyer commented Jan 5, 2011

Comment 19:

> In that case I don't want the goroutines to exit early.  I just want them
> to pause before continuing their work.
> I think our solutions and APIs start to diverge here :-)
No, they don't. Look at example above. It supports both use cases in a simple way.

@gopherbot
Copy link
Contributor

Comment 20 by leimy2k:

Ah, ok, not sure how I missed it.  Saw the implementation you've provided too.  I guess
they're mostly equivalent now in terms of flexibility.

@gopherbot
Copy link
Contributor

Comment 21 by leimy2k:

ok

@gopherbot
Copy link
Contributor

Comment 22 by leimy2k:

I'm actually really trying to come up with a useful case where saying Done without Wait
at the same time is truly all that useful.
Again, I've been programming in an environment for a long time that doesn't have Done
without Wait married to it, so don't beat me up too badly.
I've also implemented a goroutine free version of my barrier, with reset capabilities,
and tests at 
https://github.com/Leimy/Go-Barrier/tree/master/barrier2
in case anyone is interested.  I have not sent it in via codereview as an integrated
part of the Go tree.
I suspect either Rog or n13m3y3r's implementation will get in at this point, but I do
think it's neat to be able to re-use the RWMutex for the behavior I'm looking for.

@gopherbot
Copy link
Contributor

Comment 23 by leimy2k:

pthread_barrier sort of works as DoneWait as well.
http://man.freetechsecrets.com/pthread_barrier_destroy.3.html
It only has init, wait and destroy functions.

@rsc
Copy link
Contributor Author

rsc commented Jan 5, 2011

Comment 24:

leimy2k's barrier2 does not address the common case
where you do not know the number of goroutines
when you start kicking them off.
m13m3y3r's API addresses rog's complaint but only
at the cost of making the API more complex.  i am not
sure it makes sense to mix the two.  i believe it is important
to have simple APIs that do one thing well, so that they
are not easy to misuse.  in this case, if you use Preset
and Done + Wait instead of DoneWait, you have races.
that's a sign of a dangerous API.
i would like this issue to be about the case where there
is just the one waiter; the helper goroutines exit or otherwise
idle after calling Done.
the other kind of barrier can happen some other day.

@niemeyer
Copy link
Contributor

niemeyer commented Jan 5, 2011

Comment 25:

The potential to misuse the API already exists with the original proposal, with Add() +
Done() + Wait(). Then, the DoneWait() and Preset() are trivial additions which enable
pthread/MPI-style barriers at pretty much no cost.
Do you really think dropping is the right choice?

@rsc
Copy link
Contributor Author

rsc commented Jan 5, 2011

Comment 26:

Yes, I do.
It is possible to misuse Add + Done + Wait but you have
to work at it.
It is impossible to use Preset + Done + Wait correctly.
Russ

@niemeyer
Copy link
Contributor

niemeyer commented Jan 5, 2011

Comment 27:

foo.Add(n)    + foo.Wait() + foo.Done()
foo.Preset(n) + foo.Wait() + foo.Done()
Where's the extra work?
Either way, I've updated the proposal to reduce the API to the part that is not
being debated, and included some proper testing:
http://golang.org/cl/3770045
I will post these further enhancements in a separate CL once the agreed bits go in.

@gopherbot
Copy link
Contributor

Comment 28 by leimy2k:

I did change the version I have at https://github.com/Leimy/Go-Barrier to allow one to
expand the size of the Barrier group after it's created.
I know you said it was common, but I don't think it is.   If it is then MPI missed the
mark as did pthread_barrier "big time".
Dave

@gopherbot
Copy link
Contributor

Comment 29 by leimy2k:

Is what's being defined by this thread properly called a barrier then?  I've got two
examples that look more like what I implemented than the proposed API.  (pthreads and
MPI).
If this is properly a barrier, what's the other thing called? :-)  I guess that's for
another thread.  
I think both APIs are useful, but I don't think it makes sense to combine them either as
it kind of muddies either up a bit and makes it a little less clear how to use them.
Honestly I'm just glad this is such an open process.

@rsc
Copy link
Contributor Author

rsc commented Jan 6, 2011

Comment 30:

What's common in MPI need not be common in Go or vice versa.

@gopherbot
Copy link
Contributor

Comment 31 by leimy2k:

Sure, Go's barrier can be very different from pthread_barrier, MPI_Barrier, Java's
CyclicBarrier
(http://download.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/CyclicBarrier.html)
etc.  All of these will work a little differently from Go's per this proposal.  My
argument is that this may confuse people.
I did just find that boost has a simple thread barrier like the API I proposed.
http://www.boost.org/doc/libs/1_33_1/boost/thread/barrier.hpp
Here is a link to what looks like a rather clever lazy Haskell barrier as well, also not
quite what is proposed here,
http://hpaste.org/11022
Of course goroutines are not quite threads, or MPI ranks, or Java threads, or boost
threads, or Haskell sparks,so perhaps they do deserve different treatment.

@rogpeppe
Copy link
Contributor

rogpeppe commented Jan 6, 2011

Comment 32:

> i would like this issue to be about the case where there
> is just the one waiter; the helper goroutines exit or otherwise
> idle after calling Done.
in that case, i think calling the type "Barrier" is asking for trouble,
because it doesn't correspond very well at all to the usual semantics
attached to such a name.
"Waiter" might not be good either - suggestions?

@rsc
Copy link
Contributor Author

rsc commented Jan 6, 2011

Comment 33:

calling it a WaitGroup would be fine with me.

@rogpeppe
Copy link
Contributor

rogpeppe commented Jan 6, 2011

Comment 34:

how about this?
http://golang.org/cl/3878041
it's substantially the same as niemeyer's, except that it uses a single channel rather
than an array of Mutexes, and doesn't allow a negative argument to Add.

@gopherbot
Copy link
Contributor

Comment 35 by leimy2k:

Why not something like "sync.WaitAll"?  Could make a WaitAll object, add stuff to it,
then call "Wait", while the workers call "Done".
One could implement a WaitAny policy too as a different mechanism.

@gopherbot
Copy link
Contributor

Comment 36 by leimy2k:

Some very fast thoughts...
sync.All  -> this thread's implementation
sync.Any -> like this thread, but Wait completes if any have called Done
sync.Barrier  -> a true barrier
Each has at least "Wait", and can be used with an interface of "Waiter" for that Wait
API. 
All and Any will additionally have "Add" functions to add to the group dynamically. 
Maybe those could also be WaiterAdd interface implementing or something like that. 
Anything that's a WaiterAdd is automatically a Waiter.
That would let people change synchronization types pretty fast.

@rogpeppe
Copy link
Contributor

rogpeppe commented Jan 6, 2011

Comment 37:

Any is trivially implemented with a buffered channel (which generalises easily to
waiting for any N completions)
WaitGroup seems ok to me. type names are generally nouns, which WaitAll is not.

@gopherbot
Copy link
Contributor

Comment 38 by leimy2k:

I'm glad Any is easy, and I can immediately see what you're saying, but in terms of
abstracting these to a more generic "Waiter" interface, I was thinking of including it
conceptually.   It's not that hard to lift that trivial implementation into an interface
conforming type.
As for the verb/noun convention thing, I totally agree with you, which is why I changed
my mind to make it more like 3 types
sync.All
sync.Any
sync.Barrier
All, Any and Barrier implement Waiter
Any and All implement WaiterPlus

@niemeyer
Copy link
Contributor

niemeyer commented Jan 6, 2011

Comment 39:

http://golang.org/cl/3770045/ updated:
- Now based on semaphores (faster, reduced memory use).
- Renamed to WaitGroup as requested.
- Improved testing.

@gopherbot
Copy link
Contributor

Comment 40 by leimy2k:

I suppose they could all implement WaiterPlus really (Wait and Add) and it could just be
called Waiter.  As Russ said, Go doesn't have to be like all the other APIs exactly. 
But I'd like to be able to initialize a WaitGroup, Any, All, or Barrier style
synchronizer with an initial count should I know it up front.
This is still too fluid in my mind perhaps :-)

@rogpeppe
Copy link
Contributor

rogpeppe commented Jan 6, 2011

Comment 41:

@niemeyer
that seems better - i was essentially using a channel as a semaphore anyway, and using
the semaphore primitive will definitely be faster.
the two implementations are even more similar now :-)
the principal difference is that your implementation allows a negative argument to Add.
i can't really see that it's useful to do that, and it means that the error reporting
can't be so explicit.
also the documentation is a bit lacking. if you want to take the comments and Add
semantics from my version, then i'd be happy. or i could do it the other way around if
you like.

@rogpeppe
Copy link
Contributor

rogpeppe commented Jan 6, 2011

Comment 42:

> I suppose they could all implement WaiterPlus really (Wait and Add) and it could just
be called Waiter.
is there really a significant advantage to having these three quite different
synchronisation primitives implement the same interface? i can't see that that an
algorithms written using one would allow changing for another without breaking.

@niemeyer
Copy link
Contributor

niemeyer commented Jan 6, 2011

Comment 43:

Improved documentation.  Thanks Roger.

@gopherbot
Copy link
Contributor

Comment 44 by leimy2k:

>> I suppose they could all implement WaiterPlus really (Wait and Add) and it could just
be called Waiter.
> is there really a significant advantage to having these three quite different
synchronisation primitives implement 
> the same interface? i can't see that that an algorithms written using one would allow
changing for another 
> without breaking.
I'm hard pressed to come up with a situation, other than some kind of heavy refactoring
of an algorithm where one would want to yank out a WaitAll style policy and replace with
a WaitAny style policy.  I suppose the only advantage you get is the potential to avoid
extra code maintenance by changing all the types used throughout a big stack of API
calls.  Maybe that's not enough, and makes the idea of putting these in an interface
reckless.
It's not really the same as say implementing Reader, where you don't care if it comes
from a File or the network, as it's really a Wait will behave differently based on the
policy implementing object you choose to feed a function requiring a Waiter.
I suppose you're correct that it's not an advantage, and that it's best to be explicit
about such policies.  Seems like too much inversion of control being injected
potentially deeply within a higher order algorithm :-)

@rogpeppe
Copy link
Contributor

rogpeppe commented Jan 6, 2011

Comment 45:

@neimeyer I'm still not entirely sure about that documentation. I liked Russ's original
form of words. One more iteration - I've updated mine to include some of your
improvements. We'll get there :-)
http://golang.org/cl/3878041

@niemeyer
Copy link
Contributor

niemeyer commented Jan 6, 2011

Comment 46:

I'd appreciate if you could stop picking changes out of the merge proposal. If you
want to do this yourself, I'll be happy to stop and review your code, but it's
pointless for me to be pushing this while you copy changes out.

@gopherbot
Copy link
Contributor

Comment 47 by jeff.allen:

I think this issue was fixed in:
  http://code.google.com/p/go/source/detail?r=85f202d81c64

@rsc
Copy link
Contributor Author

rsc commented Feb 16, 2011

Comment 48:

Status changed to Fixed.

@rsc rsc added fixed labels Feb 16, 2011
@rsc rsc self-assigned this Feb 16, 2011
@golang golang locked and limited conversation to collaborators Jun 24, 2016
@rsc rsc removed their assignment Jun 22, 2022
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants