
Memory leak when using Task#with_timeout #176


Closed
jpaulgs opened this issue Sep 1, 2022 · 13 comments

jpaulgs commented Sep 1, 2022

#!/usr/bin/env ruby

require 'async'
require 'async/queue'

@input_channel = Async::LimitedQueue.new(100)

def consumer
  Async do |task|
    loop do
      task.with_timeout(0.001) do
        @input_channel.dequeue
      rescue Async::TimeoutError
        # no-op
      end
    end
  end
end

task = consumer

task.wait

Hey, I'm running Ruby 3.1.2 and version 2.0.3 of the async rubygem. The code above is similar to what we are using, and it exhibits a memory leak. The following screenshot is from running it for about 15 minutes; it never releases the memory.

[screenshot: memory usage growing over ~15 minutes]

I'll keep digging to see what I can find out.

jpaulgs (Author) commented Sep 1, 2022

Actually, the LimitedQueue is irrelevant here; the following code also shows the issue:

#!/usr/bin/env ruby

require 'async'

def consumer
  Async do |task|
    loop do
      task.with_timeout(0.001) do
        # no-op
      rescue Async::TimeoutError
        # no-op
      end
    end
  end
end

task = consumer

task.wait

[screenshot: memory usage growing]

ioquatix (Member) commented Sep 1, 2022

Hmm, thanks for reporting this, I'll take a look.

ioquatix (Member) commented Sep 2, 2022

#!/usr/bin/env ruby

require 'async'
require 'memory'

def consumer
  Async do |task|
    report = Memory.report do
      1000000.times do
        task.with_timeout(0.001) do
          # no-op
        rescue Async::TimeoutError
          # no-op
        end
      end
    end
    report.print
    binding.irb
  end
end

task = consumer

task.wait

Memory Profile

  • Total Allocated: (288.00 MiB in 4000000 allocations)
  • Total Retained: (7.00 MiB in 1000000 allocations)

By Gem (288.00 MiB in 4000000 allocations)

  • (288.00 MiB in 4000000 allocations) timers-4.3.3

By File (288.00 MiB in 4000000 allocations)

  • (160.00 MiB in 2000000 allocations) /home/samuel/.gem/ruby/3.1.2/gems/timers-4.3.3/lib/timers/timer.rb
  • (88.00 MiB in 1000000 allocations) /home/samuel/.gem/ruby/3.1.2/gems/timers-4.3.3/lib/timers/group.rb
  • (40.00 MiB in 1000000 allocations) /home/samuel/.gem/ruby/3.1.2/gems/timers-4.3.3/lib/timers/events.rb

By Location (288.00 MiB in 4000000 allocations)

  • (160.00 MiB in 2000000 allocations) /home/samuel/.gem/ruby/3.1.2/gems/timers-4.3.3/lib/timers/timer.rb:38
  • (88.00 MiB in 1000000 allocations) /home/samuel/.gem/ruby/3.1.2/gems/timers-4.3.3/lib/timers/group.rb:60
  • (40.00 MiB in 1000000 allocations) /home/samuel/.gem/ruby/3.1.2/gems/timers-4.3.3/lib/timers/events.rb:72

By Class (288.00 MiB in 4000000 allocations)

  • (160.00 MiB in 2000000 allocations) Proc
  • (88.00 MiB in 1000000 allocations) Timers::Timer
  • (40.00 MiB in 1000000 allocations) Timers::Events::Handle

ioquatix (Member) commented Sep 2, 2022

I see what is going on.

https://github.com/socketry/timers/blob/64a745a8b2f28849786c8c683ed2a4ef2eb0ae47/lib/timers/events.rb#L71

In order to avoid scheduling timeouts, which is expensive, we put them into a queue. Most timeouts are short-lived and don't fire, because the operation completes immediately.

e.g.

task.with_timeout(0.1) do
  io.read
end

Most of the time, io.read returns immediately, so we want the cost of the timeout to be minimal. We defer as much work as possible, since eagerly scheduling a timer also makes cleaning it up expensive.

Basically, with_timeout is a no-op unless some operation actually blocks.

There is a @queue which stores these as-yet-unprocessed timers, and they get processed only on the next call to wait.

https://github.com/socketry/timers/blob/64a745a8b2f28849786c8c683ed2a4ef2eb0ae47/lib/timers/events.rb#L107

However, if you have a tight loop, this queue can grow indefinitely.

I think the solution is just to flush the queue after it gets to a certain size, probably 100 or 1000. The goal is to avoid adding a huge overhead, while also minimising the cost of inserting timers.
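The threshold-based flush described above can be sketched roughly like this (DeferredQueue, Handle, and FLUSH_THRESHOLD are illustrative names, not the actual timers gem internals — see timers/events.rb for those):

```ruby
# Sketch of a size-bounded deferred timer queue: insertions are
# deferred so that timers cancelled before the next flush never pay
# the insertion cost, but the deferred queue is bounded in size.
Handle = Struct.new(:cancelled) do
  def cancelled?
    cancelled
  end
end

class DeferredQueue
  FLUSH_THRESHOLD = 1000

  def initialize
    @queue = []    # timers deferred but not yet inserted
    @sequence = [] # stand-in for the sorted timer structure
  end

  def schedule(handle)
    @queue << handle
    # Without this check, a tight loop that never context switches
    # grows @queue without bound -- the leak reported above.
    flush! if @queue.size >= FLUSH_THRESHOLD
    handle
  end

  # Discard cancelled handles; insert the live ones.
  def flush!
    @queue.each { |h| @sequence << h unless h.cancelled? }
    @queue.clear
  end

  def pending
    @queue.size
  end
end
```

In the common case (the timeout is cancelled before it fires), a handle never reaches @sequence at all; the flush simply discards it.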

jpaulgs (Author) commented Sep 2, 2022

I was starting to focus on @timers = Set.new in timers/group.rb:

#!/usr/bin/env ruby

require 'async'

def consumer
  Async do |task|
    loop do
      task.with_timeout(0.001) do |timer|
        timer.cancel
        print '.'
      rescue Async::TimeoutError
        # no-op
      end
    end
  end
end

task = consumer

task.wait

This does seem to stabilise the memory usage, but that may have more to do with the IO (which also seems to be required).

ioquatix (Member) commented Sep 2, 2022

Any time you context switch e.g. on IO, the timers queue is flushed. We just need to flush it more often.

ZimbiX commented Sep 2, 2022

Thanks for the speedy assistance, @ioquatix! ❤️

Should we be adding the timer.cancel line to our code, or are you working on / thinking about doing that internally? I see you've just put up a branch of timers here, which looks to address the flushing aspect (cheers!), but not the cancel (yet)?

(I've been debugging our memory leak with @jpaulgs and @oeoeaio)

ioquatix (Member) commented Sep 2, 2022

I'm going to release an update to fix the issue in the timers gem.

ioquatix (Member) commented Sep 2, 2022

socketry/timers@1ec78fb is the initial attempt at a fix, without introducing too much overhead.

To be honest, using a linked list here would make way more sense. My plan has always been to replace the timers gem with a C implementation in the io-event gem; I just haven't got around to it yet.

This should avoid the worst case, but it won't avoid the case where you create lots of timers and interleave cancelled and non-cancelled ones. We could do more work in flush!, but doing a lot of scans would erode much of the benefit.
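For context on why a linked list appeals here: cancellation becomes O(1), since a cancelled timer can unlink itself immediately instead of waiting to be scanned out of an array. A minimal sketch (TimerList and Node are illustrative names, not io-event or timers internals):

```ruby
# Minimal intrusive doubly-linked list, as one might use for timers:
# cancelling a node unlinks it in O(1), so cancelled timers cost
# nothing at flush time. Illustrative sketch only.
class TimerList
  class Node
    attr_accessor :prev, :next, :value

    def initialize(value)
      @value = value
    end

    # O(1) removal, regardless of list length.
    def cancel
      @prev.next = @next if @prev
      @next.prev = @prev if @next
      @prev = @next = nil
    end
  end

  def initialize
    @head = Node.new(nil) # sentinel
    @tail = Node.new(nil) # sentinel
    @head.next = @tail
    @tail.prev = @head
  end

  # Append a value and return its node, so the caller can cancel it.
  def append(value)
    node = Node.new(value)
    node.prev = @tail.prev
    node.next = @tail
    @tail.prev.next = node
    @tail.prev = node
    node
  end

  # Yield only the values still linked (i.e. not cancelled).
  def each
    node = @head.next
    while node != @tail
      yield node.value
      node = node.next
    end
  end
end
```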

oeoeaio commented Oct 27, 2022

We have tested the changes to the timers gem, and while they do appear to remove the accumulation of Timers::Timer instances, there is still a memory leak in our code (which uses a LimitedQueue as per the OP). Investigating further, we identified this line as the culprit. It appears that, when used in conjunction with Task#with_timeout, instances of Queue (and LimitedQueue) will accumulate fibers in their @waiting list in the absence of any other fiber signalling the Queue, leading to a memory leak.

We've made an attempt to fix this issue in #186
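The mechanism can be modelled in miniature. In this pure-Ruby sketch (MiniQueue is an illustrative model, not the actual Async::Queue source), every consumer that parks on an empty queue leaves an entry behind in @waiting unless it explicitly deregisters itself when its timeout fires, which is the shape of the fix as I understand it:

```ruby
# Illustrative model of the leak: a queue keeps a @waiting list of
# blocked consumers. With no producer ever signalling the queue, a
# consumer whose wait times out leaks its entry unless it removes
# itself. MiniQueue is a sketch, not the real Async::Queue.
class MiniQueue
  attr_reader :waiting

  def initialize
    @items = []
    @waiting = []
  end

  # A consumer finding the queue empty parks itself on @waiting.
  def park(waiter)
    @waiting << waiter
  end

  # Only a producer removes waiters; a real queue would resume the
  # parked fiber here.
  def enqueue(item)
    @items << item
    @waiting.shift
  end

  # The shape of the fix: a consumer that gives up (e.g. because its
  # timeout fired) deregisters itself instead of leaking the entry.
  def unpark(waiter)
    @waiting.delete(waiter)
  end
end
```

In the leaky behaviour, the loop from the OP effectively parks once per iteration and never unparks, so @waiting grows by one entry per timeout.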

ioquatix (Member):

Thanks for this update.

ioquatix mentioned this issue Oct 29, 2022
ioquatix (Member) commented Jan 6, 2023

The latest release should address the memory leaks.

ioquatix (Member) commented Jun 7, 2023

I am going to close this issue. Based on the reproduction scripts, the issue no longer exists.


4 participants