4.1.x: a priority classic queue runs into an exception in rabbit_variable_queue:d/1 #14228
Replies: 8 comments 6 replies
-
@klemenStanic we cannot suggest much without a way to reproduce. Even with a simulation of a workload using "Use a quorum queue, they use completely different storage from CQs" is the only recommendation I have.
-
According to the log, this is a CQ with priorities, the stack trace originates in
-
#13856 and #12848 could potentially be relevant but besides the
-
@klemenStanic have you used an older version of RabbitMQ before, and did it work without issues? If so, what version was it?
-
Just throwing in some more chips of information:
segments of the queue index (of prio4)
(segments sorted in a friendlier format)
The queue storage has the last message with seq_id 1806336 in its cache.
These are just random observations but it looks like it is the
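For readers following the segment numbers, here is a minimal sketch of how a seq_id could be mapped to an index segment and an offset within it. The figure of 4096 entries per segment is an assumption (it is not stated anywhere in this thread), and `locate` is a hypothetical helper, not a RabbitMQ function.

```python
# Assumption: 4096 index entries per segment file. This value is not
# confirmed in the thread; change it if the node is configured differently.
SEGMENT_ENTRY_COUNT = 4096


def locate(seq_id, entries_per_segment=SEGMENT_ENTRY_COUNT):
    """Return (segment_number, offset_in_segment) for a sequence id.

    Hypothetical helper for reading the listings above; it only does
    integer division and does not touch any RabbitMQ data.
    """
    return divmod(seq_id, entries_per_segment)


segment, offset = locate(1806336)
print(f"seq_id 1806336 -> segment {segment}, offset {offset}")
```

With a different entries-per-segment value the result changes, so treat the output as illustrative only.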
-
Since my last update, we completely wiped all RabbitMQ data (/var/lib/rabbitmq/mnesia) and started ingesting from scratch, with the same configuration. Some queues were still crashing, even non-priority ones.
-
The very first crash log would be good to have (if different from what you provided). Otherwise I wonder if some message got acked twice.
-
Sorry for the late response. We've identified a problem in our Celery application that caused the workers to shut down unexpectedly (killed with SIGKILL). Since we addressed this issue, there have been no further queue crashes.
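As background on why the signal type matters: SIGKILL cannot be caught, so a killed process gets no chance to finish or acknowledge in-flight work, whereas SIGTERM can be handled. The sketch below illustrates that general pattern with a plain Python loop. It is not Celery's implementation, and `run_worker`, `request_shutdown`, and `shutdown_requested` are hypothetical names used only for illustration.

```python
import signal
import threading

# Set when a termination request arrives; the loop checks it between tasks.
shutdown_requested = threading.Event()


def request_shutdown(signum, frame):
    """Record the shutdown request instead of exiting in the middle of a task."""
    shutdown_requested.set()


# SIGTERM and SIGINT can be handled. SIGKILL cannot be caught or ignored,
# so no handler like this ever runs when a process is killed with it.
signal.signal(signal.SIGTERM, request_shutdown)
signal.signal(signal.SIGINT, request_shutdown)


def run_worker(tasks, handle):
    """Process tasks one at a time (hypothetical loop, not Celery internals).

    The task being handled when the signal arrives is allowed to finish;
    no new task is started afterwards.
    """
    completed = []
    for task in tasks:
        if shutdown_requested.is_set():
            break  # stop taking new work; remaining tasks are left untouched
        completed.append(handle(task))
    return completed
```

A supervisor that sends SIGTERM and waits before escalating gives a loop like this time to drain; one that sends SIGKILL immediately does not.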
-
Describe the bug
We are experiencing an issue where a specific queue (classic) in our RabbitMQ server, used with Celery, frequently enters a crashed state. Other queues continue to function normally. Restarting the queue or the server temporarily resolves the issue, but it recurs after some time.
Celery throws this error:
RabbitMQ log shows this:
RabbitMQ version: 4.1.2
Erlang version: 27.3.4.1
Celery version: 5.5.2
Kombu version: 5.5.3
OS: linux/amd64
Clustered: no
Any plugins enabled: rabbitmq_delayed_message_exchange, rabbitmq_management, rabbitmq_prometheus, rabbitmq_shovel, rabbitmq_shovel_management
During the crash, there is no abnormal system resources usage.
Attaching the Server state at the time of the crash (from /var/log/[email protected]) and the corresponding erl_crash.dump.
rabbitmq_crash_log_subset.txt
erl_crash.dump.txt
Reproduction steps
Can't reliably reproduce.
Expected behavior
The queue not crashing.
Additional context
No response