-
Notifications
You must be signed in to change notification settings - Fork 18k
add sync.Barrier? #1382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Labels
Comments
Naive library based API implementation? Seems to work with a very basic test. package barrier import ( "sync" ) type Group struct { count int // number of checkins total int signalin chan bool waiter * sync.RWMutex } func controller (g * Group) { for g.count != g.total { <- g.signalin g.count++ } g.waiter.Unlock() } func NewGroup (size int) (* Group) { g := &Group{0,size, make(chan bool, size), &sync.RWMutex{}} g.waiter.Lock() go controller(g) return g } func (g * Group) Wait () { g.signalin <- true; g.waiter.RLock() } |
Perhaps a better test: package barrier import ( "testing" ) func worker (g * Group, t *testing.T, bc chan <- bool) { t.Log("<---\n") g.Wait() t.Log("--->\n") bc <- true } func Test1(t *testing.T) { bc := make(chan bool) g := NewGroup(10) for i := 0; i < 10; i++ { // we are the last worker go worker(g,t, bc) } for i := 0; i < 10; i++ { <- bc } } |
And no I didn't follow the original directions to make an API of Add, Done and Wait, because I don't think it makes a lot of sense :-). Then again, I'm coming from an MPI programming background where you have MPI_Barrier over a communicator, which is really a group of ranks (processes), that all must call MPI_Barrier for any of them to escape the barrier. |
I implemented a Barrier with the following API for Go. NewGroup(size) AddN Wait. https://github.com/Leimy/Go-Barrier There's gotest based unit tests that show how this works. It's not precisely what's stated above, but I'm not sure if what's stated above matches what I've read people have asked for on the golang-nuts group posts which is "the last guy turns off the light". I implemented that. |
https://github.com/Leimy/Go-Barrier implements NewGroup(Size) AddN(Size) Wait() And some examples of how Wait is barrier synchronization are in the barrier_test.go source file. Sorry for so many edits. I've been jumping around tasks and doing a bad job of keeping this stuff clean :-( |
Not sure I did the codereview correctly but here we go: http://golang.org/cl/3769044/ |
AFAICS there are two quite different APIs here: - A way for one goroutine to wait for others to complete. - A way for several goroutines to wait for each other to change computation "phase". The former is what's being asked for in the initial post in this thread http://groups.google.com/group/golang-nuts/browse_thread/thread/b26b7e519ef9308 and what Russ proposes above. The latter is what's being provided by leimy2k above, and is a more convention semantic for something called a "Barrier". Both are useful, but for quite different things. I'd suggest that the former is a more common requirement. The semantics for the latter need some thinking about (perhaps taking some notice of X10's Clocks? - see http://dist.codehaus.org/x10/documentation/languagespec/x10-latest.pdf). I've attached an implementation, based on some code I wrote some time ago, that implements the former, with slightly more relaxed semantics than Russ suggests (neither Add nor Wait need be called in the main goroutine; it allows several goroutines to wait for the same collection to complete). Attachments:
|
Submitted a proposal closer to the original idea to: http://golang.org/cl/3770045 |
Might want to put defers around the unlocks but the gist looks pretty good. I still think of what I implemented when I hear or read "barrier" but I understand I may have a different programming background than others, and it may not be what's intended. I will still use mine in my own namespaces if my patch is not accepted so just get the right one for the community. |
> By the way, I believe this proposal enables both use cases described with the same simple API suggested by Russ. i don't think it does. here's an example that i came across a while ago: implementing a cellular automaton with concurrent calculation of cells, a classic barrier algorithm. the central goroutine looks like this: func worker(c0, c1 *cells, x0, x1 int, b *Barrier) { for { calc(c0, c1, x0, x1) b.Wait() calc(c1, c0, x0, x1) b.Wait() } } it calculates a single stripe of cells within the rectangular cell array. there's one goroutine for each stripe, and source and destination cell arrays alternate with each step. it's crucial that all stripes have finished processing before the source and destination are swapped. i'm not sure you could easily use something like russ's proposed API to do this. e.g. this is wrong: func worker(c0, c1 *cells, x0, x1 int, b *Barrier) { for { b.Add(1) calc(c0, c1, x0, x1) b.Done() b.Wait() b.Add(1) calc(c1, c0, x0, x1) b.Done() b.Wait() } } |
> Might want to put defers around the unlocks no. defers are only necessary when you're running arbitrary code that may panic, and in this case there's no such code (if the code does panic, then it's a bug and needs fixing a.s.a.p.). a defer does at least one allocation (currently) so it's worth avoiding. |
Rog, it sounds like an optimization to avoid the defer before one knows if it's really a problem for anyone's code. I may not have written enough Go yet so I'll just bend to your experience with this. Also the example in your version can be implemented pretty easily in my proposed API below is your example: // e.g. to start a collection of goroutines and wait for them to complete: // var w Waiter // w.Add(10) // for i := 0; i < 10; i++ { // go func() { // doSomething() // w.Done() // }() // } // w.Wait() I'd do that like this: b := NewBarrier(10+1) for i:=0; i< 10; i++ { go func () { doSomething() b.Wait() } } b.Wait() Yes, the Done call means you don't need N+1 items in the barrier group, and that may be a lot clearer to some people, but I think Done and Wait are the same function fundamentally, which is why I ignored the advice to implement them to begin with :-). In the end I don't really care which one gets adopted, (if any are) because I understand how both work and can make use of either. Also I suspect that people may understand N calls to Done, and 1 call to Wait a little bit better than N+1 Wait calls, but that's not how I learned to use barriers. I suspect I've a different background than others though. Dave |
It's not *just* a matter of interface. This model will block all the goroutines from terminating while they're not all finished, for no good reason. The Done() + Wait() model enables doing both the MPI-style barrier you'd like to have, and the "let me know when everyone is done" one, with a straightforward interface. |
MPI style barriers have one API call MPI_Barrier(Communicator) The difference, that you very nearly pointed out, is that the ranks in a communicator in an MPI job actually live for the entire program's runtime. In this case Go can have a barrier over processes that will want to complete. I think I'm coding to a different set of problems at this point than other people. In fact this makes me think I should want a "reset" function in my API so I can set the barrier "Check in" count back to 0 for multi-stage algorithms that might need a few barrier checkpoints over a set of goroutines. In that case I don't want the goroutines to exit early. I just want them to pause before continuing their work. I think our solutions and APIs start to diverge here :-) |
I'm actually really trying to come up with a useful case where saying Done without Wait at the same time is truly all that useful. Again, I've been programming in an environment for a long time that doesn't have Done without Wait married to it, so don't beat me up too badly. I've also implemented a goroutine free version of my barrier, with reset capabilities, and tests at https://github.com/Leimy/Go-Barrier/tree/master/barrier2 in case anyone is interested. I have not sent it in via codereview as an integrated part of the Go tree. I suspect either Rog or n13m3y3r's implementation will get in at this point, but I do think it's neat to be able to re-use the RWMutex for the behavior I'm looking for. |
pthread_barrier sort of works as DoneWait as well. http://man.freetechsecrets.com/pthread_barrier_destroy.3.html It only has init, wait and destroy functions. |
leimy2k's barrier2 does not address the common case where you do not know the number of goroutines when you start kicking them off. m13m3y3r's API addresses rog's complaint but only at the cost of making the API more complex. i am not sure it makes sense to mix the two. i believe it is important to have simple APIs that do one thing well, so that they are not easy to misuse. in this case, if you use Preset and Done + Wait instead of DoneWait, you have races. that's a sign of a dangerous API. i would like this issue to be about the case where there is just the one waiter; the helper goroutines exit or otherwise idle after calling Done. the other kind of barrier can happen some other day. |
foo.Add(n) + foo.Wait() + foo.Done() foo.Preset(n) + foo.Wait() + foo.Done() Where's the extra work? Either way, I've updated the proposal to reduce the API to the part that is not being debated, and included some proper testing: http://golang.org/cl/3770045 I will post these further enhancements in a separate CL once the agreed bits go in. |
I did change the version I have at https://github.com/Leimy/Go-Barrier to allow one to expand the size of the Barrier group after it's created. I know you said it was common, but I don't think it is. If it is then MPI missed the mark as did pthread_barrier "big time". Dave |
Is what's being defined by this thread properly called a barrier then? I've got two examples that look more like what I implemented than the proposed API. (pthreads and MPI). If this is properly a barrier, what's the other thing called? :-) I guess that's for another thread. I think both APIs are useful, but I don't think it makes sense to combine them either as it kind of muddies either up a bit and makes it a little less clear how to use them. Honestly I'm just glad this is such an open process. |
Sure, Go's barrier can be very different from pthread_barrier, MPI_Barrier, Java's CyclicBarrier (http://download.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/CyclicBarrier.html) etc. All of these will work a little differently from Go's per this proposal. My argument is that this may confuse people. I did just find that boost has a simple thread barrier like the API I proposed. http://www.boost.org/doc/libs/1_33_1/boost/thread/barrier.hpp Here is a link to what looks like a rather clever lazy Haskell barrier as well, also not quite what is proposed here, http://hpaste.org/11022 Of course goroutines are not quite threads, or MPI ranks, or Java threads, or boost threads, or Haskell sparks,so perhaps they do deserve different treatment. |
> i would like this issue to be about the case where there > is just the one waiter; the helper goroutines exit or otherwise > idle after calling Done. in that case, i think calling the type "Barrier" is asking for trouble, because it doesn't correspond very well at all to the usual semantics attached to such a name. "Waiter" might not be good either - suggestions? |
how about this? http://golang.org/cl/3878041 it's substantially the same as niemeyer's, except that it uses a single channel rather than an array of Mutexes, and doesn't allow a negative argument to Add. |
Some very fast thoughts... sync.All -> this thread's implementation sync.Any -> like this thread, but Wait completes if any have called Done sync.Barrier -> a true barrier Each has at least "Wait", and can be used with an interface of "Waiter" for that Wait API. All and Any will additionally have "Add" functions to add to the group dynamically. Maybe those could also be WaiterAdd interface implementing or something like that. Anything that's a WaiterAdd is automatically a Waiter. That would let people change synchronization types pretty fast. |
I'm glad Any is easy, and I can immediately see what you're saying, but in terms of abstracting these to a more generic "Waiter" interface, I was thinking of including it conceptually. It's not that hard to lift that trivial implementation into an interface conforming type. As for the verb/noun convention thing, I totally agree with you, which is why I changed my mind to make it more like 3 types sync.All sync.Any sync.Barrier All, Any and Barrier implement Waiter Any and All implement WaiterPlus |
http://golang.org/cl/3770045/ updated: - Now based on semaphores (faster, reduced memory use). - Renamed to WaitGroup as requested. - Improved testing. |
I suppose they could all implement WaiterPlus really (Wait and Add) and it could just be called Waiter. As Russ said, Go doesn't have to be like all the other APIs exactly. But I'd like to be able to initialize a WaitGroup, Any, All, or Barrier style synchronizer with an initial count should I know it up front. This is still too fluid in my mind perhaps :-) |
@niemeyer that seems better - i was essentially using a channel as a semaphore anyway, and using the semaphore primitive will definitely be faster. the two implementations are even more similar now :-) the principal difference is that your implementation allows a negative argument to Add. i can't really see that it's useful to do that, and it means that the error reporting can't be so explicit. also the documentation is a bit lacking. if you want to take the comments and Add semantics from my version, then i'd be happy. or i could do it the other way around if you like. |
> I suppose they could all implement WaiterPlus really (Wait and Add) and it could just be called Waiter. is there really a significant advantage to having these three quite different synchronisation primitives implement the same interface? i can't see that that an algorithms written using one would allow changing for another without breaking. |
>> I suppose they could all implement WaiterPlus really (Wait and Add) and it could just be called Waiter. > is there really a significant advantage to having these three quite different synchronisation primitives implement > the same interface? i can't see that that an algorithms written using one would allow changing for another > without breaking. I'm hard pressed to come up with a situation, other than some kind of heavy refactoring of an algorithm where one would want to yank out a WaitAll style policy and replace with a WaitAny style policy. I suppose the only advantage you get is the potential to avoid extra code maintenance by changing all the types used throughout a big stack of API calls. Maybe that's not enough, and makes the idea of putting these in an interface reckless. It's not really the same as say implementing Reader, where you don't care if it comes from a File or the network, as it's really a Wait will behave differently based on the policy implementing object you choose to feed a function requiring a Waiter. I suppose you're correct that it's not an advantage, and that it's best to be explicit about such policies. Seems like too much inversion of control being injected potentially deeply within a higher order algorithm :-) |
@neimeyer I'm still not entirely sure about that documentation. I liked Russ's original form of words. One more iteration - I've updated mine to include some of your improvements. We'll get there :-) http://golang.org/cl/3878041 |
I think this issue was fixed in: http://code.google.com/p/go/source/detail?r=85f202d81c64 |
This issue was closed.
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
The text was updated successfully, but these errors were encountered: