node offline

An attempt to update a cluster from version 3.0.5 to 3.5.3-x failed with an 'Internal error' message in the UI. Shortly afterwards, the cluster nodes were displayed as 'Offline' and the cluster state as 'Unavailable' on the main WebCP dashboard.

The file /usr/libexec/vstorage-ui-backend/.rnd is owned by the root user:

[root@node01 ~]# ll /usr/libexec/vstorage-ui-backend/.rnd
-rw------- 1 root root 1024 May 21 00:47 /usr/libexec/vstorage-ui-backend/.rnd
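
The same ownership check can be scripted, for example to run it on several nodes. The following is only a minimal sketch: the function name is_root_owned is hypothetical and not part of the product, and it assumes GNU stat (the -c '%u' format) is available on the node.

```shell
#!/bin/sh
# Hypothetical helper (not an ACI tool): succeeds if the given file exists
# and is owned by UID 0 (root). Assumes GNU stat, which supports -c '%u'.
is_root_owned() {
    # Return 2 if the path does not exist, so "missing" differs from "other owner".
    [ -e "$1" ] || return 2
    # %u prints the numeric user ID of the owner.
    [ "$(stat -c '%u' "$1")" -eq 0 ]
}
```

For example, `is_root_owned /usr/libexec/vstorage-ui-backend/.rnd && echo "owned by root"` prints a message only when the file matches the condition described above.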

 

The following behavior is observed in an Acronis Cyber Infrastructure (ACI) cluster:

1. "Node is offline" alerts suddenly appear for all nodes in the cluster for a short period of time. During this time, the storage status is displayed as 'Unavailable' on the WebCP dashboard.

2. The cluster is displayed as 'Healthy' when its status is checked via the CLI.

3. No actual interruptions of storage services are noticed while these alerts appear.

4. The cluster version is below 3.5.4-24.
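
The version condition in the last item can be checked with a small comparison helper. This is a sketch under stated assumptions: version_lt is a hypothetical function name, it relies on the version sort of GNU sort (the -V option), and the installed version string has to be supplied by the reader, since this article does not state how to obtain it.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 sorts strictly before version $2.
# Assumes GNU sort, whose -V option orders version-like strings.
version_lt() {
    # Equal versions are not "below" each other.
    [ "$1" != "$2" ] &&
        # After a version sort, the lower of the two versions comes first.
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}
```

For example, `version_lt "$installed" 3.5.4-24 && echo "below 3.5.4-24"`, where `installed` is a variable holding the cluster version string.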