Node fails to start with error access to vhost '/' refused for user 'XXXX': vhost '/' is down #10052
I have deployed RabbitMQ as a StatefulSet on a Kubernetes cluster, where I frequently hit the error access to vhost '/' refused for user 'XXXX': vhost '/' is down.
RabbitMQ version: 3.9.7 on Erlang 24.1.1 [jit]
Error stack:
Restarting the pod/container resolves the issue.
Replies: 3 comments 2 replies
RabbitMQ 3.9 has reached EOL. You will not get any further support from the core team unless you upgrade to the latest supported version.
is the log line you are looking for. Something prevents a message store from starting on the first boot. That something is an environment-specific problem: perhaps the storage device is not yet ready when the pod starts, or something similar.
I recall discussing this with another core team member and our conclusion was the following: if the mounted volume is not yet ready for writes by the time the node boots, it will fail to seed the data and the virtual host will then fail to start.
This is pretty clearly hinted at by one of the function names:
rabbit_variable_queue:do_start_msg_store/4
We have never seen this behavior outside of Kubernetes, and RabbitMQ nodes do not do anything creative when it comes to initializing the schema data store or the CQ message store. So there is nothing to "fix once and for all" in RabbitMQ.
A while ago we considered adding optional delays before and after the node boots, for very different reasons. The former might help here. Or you can inject a startup pause using a Kubernetes-specific method, e.g. an init container that would verify that the volume is ready (writeable).
To our knowledge, this behavior was never reported by those who use our Kubernetes Cluster Operator, most likely because it introduces a startup delay to work around a widely known unfortunate CoreDNS caching behavior/default. So a similar delay will likely help with volumes not being ready early enough.
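As a rough illustration of the volume readiness check mentioned above, here is a minimal Python sketch of what an init container could run before the RabbitMQ container starts. This is not something RabbitMQ or the Cluster Operator ships. The function names (is_writable, wait_for_writable, main), the probe file name, the timeout values and the default path /var/lib/rabbitmq are all assumptions made for illustration.

```python
import os
import time
import uuid


def is_writable(directory: str) -> bool:
    """Return True if a small probe file can be created, synced and removed."""
    # Hypothetical probe file name; a random suffix avoids clashing with real data.
    probe = os.path.join(directory, f".write-probe-{uuid.uuid4().hex}")
    try:
        with open(probe, "w") as f:
            f.write("ok")
            f.flush()
            os.fsync(f.fileno())  # force the write down to the storage device
        os.remove(probe)
        return True
    except OSError:
        # Covers a missing directory, a read-only mount, permission errors, etc.
        return False


def wait_for_writable(directory: str, timeout: float = 120.0,
                      interval: float = 2.0) -> bool:
    """Poll until the directory accepts writes or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if is_writable(directory):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def main(argv: list) -> int:
    """Return 0 once the volume is writable, 1 if it never became writable."""
    # Assumed default: the data directory path used elsewhere in this thread.
    data_dir = argv[1] if len(argv) > 1 else "/var/lib/rabbitmq"
    return 0 if wait_for_writable(data_dir) else 1
```

To use it as a script, the entry point would call sys.exit(main(sys.argv)). A non-zero exit code makes the init container fail, so Kubernetes retries it instead of starting the node against a volume that is not ready.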
Hi, thanks for your reply. I didn't check the discussion topics.
I have also changed the mount directory from /var/lib/rabbitmq/mnesia to /var/lib/rabbitmq and bound it to my NFS share on the Docker host.
I will try these fixes this week and give you feedback.