Skip to content

Optimise ring.DoBatch() #1624

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Sep 5, 2019
Merged

Optimise ring.DoBatch() #1624

merged 11 commits into from
Sep 5, 2019

Conversation

bboreham
Copy link
Contributor

@bboreham bboreham commented Aug 28, 2019

ring.DoBatch contributes 20% of the memory allocation in my production distributors.

I started looking at this and realised the existing benchmark was not exercising the functionality - the cas() didn't update the ring so it was actually empty, and some test fixture code was also quite far up the profile.

Having fixed all that and made the benchmark call the higher-level function DoBatch(), the numbers before are:

BenchmarkBatch10x100-2     	   10000	    175156 ns/op	   56648 B/op	     334 allocs/op
BenchmarkBatch100x100-2    	    5000	    256146 ns/op	   87675 B/op	     720 allocs/op
BenchmarkBatch100x1000-2   	    1000	   1838020 ns/op	  578726 B/op	    3278 allocs/op

after:

BenchmarkBatch10x100-2     	   10000	    149385 ns/op	   12061 B/op	      29 allocs/op
BenchmarkBatch100x100-2    	   10000	    208285 ns/op	   30909 B/op	     238 allocs/op
BenchmarkBatch100x1000-2   	    1000	   1558331 ns/op	  112601 B/op	     217 allocs/op

The ReadRing function signatures have changed a bit, which may impact downstream projects using this code.

@tomwilkie
Copy link
Contributor

Nice set of optimisations! Sorry I've been so nit picky.

Once small concern: we're now taking a (read) lock in a loop. Not super concerned about consistency (the ring might update mid processing a batch) but do you think this will impact CPU usage?

@bboreham
Copy link
Contributor Author

bboreham commented Sep 5, 2019

do you think this will impact CPU usage?

The benchmarks show that CPU overall comes down. Profiling suggests that sort.Search() looking for each token is far more CPU-intensive than the lock.

I have addressed review comments; PTAL.

Copy link
Contributor

@gouthamve gouthamve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I've given a shot at optimising this a little while ago but couldn't get any gains, this is a nice approach!

Signed-off-by: Bryan Boreham <[email protected]>
@bboreham bboreham merged commit efbcc63 into master Sep 5, 2019
@bboreham bboreham deleted the optimise-dobatch branch September 5, 2019 13:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants