
Suppose the main RabbitMQ log file contains net_tick_timeout entries such as the following.
=ERROR REPORT==== 1-Oct-2020::00:39:53 ===
** Node rabbit@server1 not responding **
** Removing (timedout) connection **
=INFO REPORT==== 1-Oct-2020::00:39:53 ===
rabbit on node rabbit@server1 down
=INFO REPORT==== 1-Oct-2020::00:39:53 ===
node rabbit@server1 down: net_tick_timeout
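To find these entries in a larger log, a small script can help. The following is a minimal sketch, assuming the log uses the "node NAME down: net_tick_timeout" line format shown above; the function name nodes_with_tick_timeout is hypothetical and not part of RabbitMQ.

```python
import re

# Matches lines such as "node rabbit@server1 down: net_tick_timeout".
# Assumption: the log uses the line format shown in the excerpt above.
NODE_DOWN = re.compile(r"^node (\S+) down: net_tick_timeout$")


def nodes_with_tick_timeout(log_text):
    """Return the nodes reported down due to net_tick_timeout, in order seen."""
    found = []
    for line in log_text.splitlines():
        match = NODE_DOWN.match(line.strip())
        if match and match.group(1) not in found:
            found.append(match.group(1))
    return found


sample = """=INFO REPORT==== 1-Oct-2020::00:39:53 ===
rabbit on node rabbit@server1 down
=INFO REPORT==== 1-Oct-2020::00:39:53 ===
node rabbit@server1 down: net_tick_timeout
"""
print(nodes_with_tick_timeout(sample))  # ['rabbit@server1']
```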
The main RabbitMQ configuration file may define net_ticktime. If it does not, net_ticktime defaults to 60 seconds.
net_ticktime = 60
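To check which value is in effect, the configuration text can be read programmatically. This is a rough sketch, assuming the key = value layout shown above with "#" starting a comment; read_net_ticktime is a hypothetical helper, and the fallback of 60 reflects the default stated above.

```python
def read_net_ticktime(conf_text, default=60):
    """Return net_ticktime from key = value config text, or the default."""
    for raw in conf_text.splitlines():
        # Assumption: "#" starts a comment in this config format.
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "net_ticktime":
            return int(value)
    return default


print(read_net_ticktime("net_ticktime = 120"))      # 120
print(read_net_ticktime("# nothing defined here"))  # 60
```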
net_tick_timeout occurs when:
- One RabbitMQ node in a cluster has a TCP connection to another RabbitMQ node in the cluster, over which the nodes exchange periodic tick messages
- Nothing is received from the other node for roughly the net_ticktime duration (60 seconds by default)
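The timeout is not exact. According to the Erlang kernel documentation, a node is ticked every net_ticktime / 4 seconds, and an unresponsive node is detected between 0.75 and 1.25 times net_ticktime. The sketch below works through that arithmetic, assuming the default of four ticks per net_ticktime period; the function names are illustrative only.

```python
def tick_interval(net_ticktime=60):
    """Seconds between ticks, assuming four ticks per net_ticktime period."""
    return net_ticktime / 4


def detection_window(net_ticktime=60):
    """Return (earliest, latest) seconds at which a silent node is detected."""
    return (0.75 * net_ticktime, 1.25 * net_ticktime)


print(tick_interval(60))     # 15.0
print(detection_window(60))  # (45.0, 75.0)
```

With the default of 60 seconds, a node that stops responding is therefore reported down somewhere between 45 and 75 seconds later.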
When net_tick_timeout occurs, the unresponsive node is treated as down by its peers, although it remains a member of the cluster. The rabbitmqctl cluster_status command can be used to view the current nodes in the cluster. In this example, "nodes" shows that there are 3 nodes in the cluster, and "running_nodes" shows that only one of them is running.
~]# rabbitmqctl cluster_status
Cluster status of node rabbit@server1
[{nodes,[{disc,[rabbit@server1,rabbit@server2,rabbit@server3]}]},
{running_nodes,[rabbit@server1]},
{cluster_name,<<"rabbit@server1.example.com">>},
{partitions,[]},
{alarms,[{rabbit@server2,[]},{rabbit@server3,[]}]}]
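Comparing "nodes" with "running_nodes" by eye gets tedious on larger clusters. The following is a minimal sketch that does the comparison, assuming the Erlang-term output layout shown above, with "nodes" listed before "running_nodes"; other RabbitMQ versions may print a different layout, and not_running is a hypothetical helper name.

```python
import re

# Node names look like "rabbit@server1".
NODE_NAME = re.compile(r"[\w-]+@[\w.-]+")


def not_running(status_text):
    """Return cluster members that are absent from running_nodes."""
    # Assumption: "{nodes," appears before "{running_nodes," in the output.
    start = status_text.index("{nodes,")
    split = status_text.index("{running_nodes,")
    members = NODE_NAME.findall(status_text[start:split])
    running_block = re.search(r"\{running_nodes,\[([^\]]*)\]", status_text).group(1)
    running = set(NODE_NAME.findall(running_block))
    return [node for node in members if node not in running]


sample = """Cluster status of node rabbit@server1
[{nodes,[{disc,[rabbit@server1,rabbit@server2,rabbit@server3]}]},
{running_nodes,[rabbit@server1]},
{cluster_name,<<"rabbit@server1.example.com">>},
{partitions,[]},
{alarms,[{rabbit@server2,[]},{rabbit@server3,[]}]}]
"""
print(not_running(sample))  # ['rabbit@server2', 'rabbit@server3']
```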
The rabbitmqctl node_health_check command can be used to confirm that a node itself is healthy.
~]# rabbitmqctl node_health_check
Timeout: 70.0 seconds
Checking health of node rabbit@server1
Health check passed
In this scenario, you can try restarting the node that is not running, and then reissue the rabbitmqctl cluster_status command to see whether it is now listed under "running_nodes".