Description
We're facing an issue with the Kafka poll functionality. In particular, we suspect the culprit is the ensure_coordinator_ready function called by _coordinator.poll().
We're using Robot Framework, so unfortunately we can't collect detailed logs, but the following messages are printed in an infinite loop:
10:36:19.658 INFO <BrokerConnection node_id=**** host=**** [IPv4 ('', )]>: connecting to **** [('', ) IPv4]
10:36:19.765 INFO <BrokerConnection node_id= host= [IPv4 ('', )]>: Connection complete.
10:36:19.886 ERROR <BrokerConnection node_id= host= [IPv4 ('', )]>: socket disconnected
10:36:19.900 INFO <BrokerConnection node_id= host= [IPv4 ('****', ****)]>: Closing connection. KafkaConnectionError: socket disconnected
10:36:19.905 ERROR Error sending GroupCoordinatorRequest_v0 to node **** [KafkaConnectionError: socket disconnected]
After checking the kafka-python code, we noticed that the function here
https://github.com/dpkp/kafka-python/blob/master/kafka/coordinator/base.py#L241C9-L241C33
has no exit point from its while loop and offers no option to pass a timeout parameter.
Can this be improved/fixed?
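
To illustrate the request, here is a minimal sketch of the kind of deadline-bounded retry loop we have in mind. It is not kafka-python code: wait_until_ready, CoordinatorTimeout, and the is_ready/attempt callables are hypothetical names used only for this example, standing in for "is the coordinator known?" and "send one lookup request".

```python
import time


class CoordinatorTimeout(Exception):
    """Hypothetical error raised when the deadline passes before readiness."""


def wait_until_ready(is_ready, attempt, timeout_s, backoff_s=0.1):
    """Call attempt() until is_ready() is true or timeout_s elapses.

    is_ready and attempt are placeholder callables (assumptions for this
    sketch), not part of any real library API.
    Returns the number of attempts made.
    """
    deadline = time.monotonic() + timeout_s
    attempts = 0
    while not is_ready():
        if time.monotonic() >= deadline:
            # This is the exit point the unbounded loop is missing.
            raise CoordinatorTimeout(
                "not ready after %.2fs and %d attempts" % (timeout_s, attempts)
            )
        attempt()
        attempts += 1
        # Back off between retries, but never sleep past the deadline.
        remaining = deadline - time.monotonic()
        if remaining > 0:
            time.sleep(min(backoff_s, remaining))
    return attempts
```

With something like this, a caller whose broker keeps disconnecting would get an exception after timeout_s instead of looping forever, and could decide whether to retry, log, or fail the test.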