Open
Description
The Java client has a wakeup() method that can be called from another thread to break out of a blocking poll(). It is needed for some less common cases, such as the process pool processing example, where we have one consumer and a pool of workers.
Activity
dpkp commented on Oct 10, 2017
kafka-python supports wakeup via a socketpair in KafkaClient. This allows threads sharing a KafkaClient instance that is blocked waiting for IO to signal that wakeup is desired for other processing.
There is also a related, but slightly different, "wakeup" feature in the Java KafkaConsumer class. This allows threads that share a KafkaConsumer instance to trigger a WakeupException to be raised from the next consumer.poll() call. As far as I can tell, this is used primarily to signal a looping consumer that it should shut down. But I think the same could be done with simple external concurrency primitives, such as a shared threading.Event().
You filed this issue wrt the second (KafkaConsumer.wakeup()), correct?
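To make the threading.Event() idea concrete, here is a minimal sketch of a polling loop that shuts down when another thread sets a shared event. The function name consume_until_stopped and the handle callback are hypothetical; the consumer argument is assumed to be any object with poll(timeout_ms=...) and close(), as kafka-python's KafkaConsumer provides. Unlike the Java wakeup(), this does not interrupt a poll that is already blocking, so the poll timeout bounds how long a shutdown request can go unnoticed.

```python
import threading


def consume_until_stopped(consumer, stop_event, handle, poll_timeout_ms=500):
    """Poll in a loop until stop_event is set, then close the consumer.

    `consumer` is assumed to expose poll(timeout_ms=...) returning a mapping
    of partition -> list of records, plus close().
    """
    try:
        while not stop_event.is_set():
            # A short timeout keeps the loop responsive to the stop signal.
            batches = consumer.poll(timeout_ms=poll_timeout_ms)
            for partition, records in batches.items():
                for record in records:
                    handle(record)
    finally:
        # Always release the consumer, even if handle() raises.
        consumer.close()
```

Any other thread that holds the same threading.Event can then request shutdown by calling stop_event.set().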
tvoinarovskyi commented on Oct 11, 2017
Some time ago I wrote this gist: https://gist.github.com/tvoinarovskyi/05a5d083a0f96cae3e9b4c2af580be74. It lets the consumer delegate consumed requests to queues and consume from those queues in separate threads. The upside here is that we pause any partitions whose queues still hold data pending processing, so we can basically have a background heartbeat by calling poll(0) with all partitions paused.
The problem here is that when a thread finishes processing data from a queue, it needs to notify the consumer, so it will unpause this partition. In Java, the wakeup method can be used to do that, but in kafka-python I can't do it with the public API.
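For illustration, here is a rough sketch of the closest approximation I can see with public calls only. It is not the code from the gist, and the names poll_once, resume_finished, work_queues and done_queue are hypothetical. Workers put a finished partition on a shared done_queue, and the consumer thread drains it between short polls. The consumer is assumed to expose poll(timeout_ms=...), pause(*partitions) and resume(*partitions), as kafka-python's KafkaConsumer does.

```python
import queue


def resume_finished(consumer, done_queue):
    """Resume every partition that workers have reported as finished."""
    finished = []
    while True:
        try:
            finished.append(done_queue.get_nowait())
        except queue.Empty:
            break
    if finished:
        consumer.resume(*finished)
    return finished


def poll_once(consumer, work_queues, done_queue, poll_timeout_ms=100):
    """One loop iteration: resume finished partitions, poll, hand off, pause.

    `work_queues` maps each partition to the queue.Queue its worker reads.
    """
    resume_finished(consumer, done_queue)
    batches = consumer.poll(timeout_ms=poll_timeout_ms)
    for partition, records in batches.items():
        # Hand the batch to the worker and stop fetching from this
        # partition until the worker reports that it is done.
        work_queues[partition].put(records)
        consumer.pause(partition)
```

A worker would call done_queue.put(partition) after processing a batch. The catch is that the resume only happens on the next loop iteration, so it can be delayed by up to poll_timeout_ms, which is exactly the latency a wakeup call would remove.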