I have recently been involved in two vSAN installations that showed this alert in the vSAN Health pane.
Another side effect is that the hosts on the warning list are unable to enter maintenance mode.
Both environments were running on Lenovo servers with Lenovo RAID controllers and had recently been updated to ESXi 6.7 Update 1. The problems started a few days after Update 1 was installed.
After a couple of conversations with VMware support, they finally discovered that the error lies in the way ESXi uses the “StorCLI” utility, which we had installed earlier so that Health Check could read the storage controller firmware version.
Since we know that the storage controller is supported, we decided to remove the StorCLI utility and that solved the problem!
Steps involved (on all hosts, one at a time):
- Put the host in maintenance mode – if it refuses, run this command on the host: “/etc/init.d/vsanmgmtd restart” and try again
- When the host has entered maintenance mode, uninstall StorCLI with the following command: “esxcli software vib remove -n vmware-storcli-007.0405.0000.0000”
- The version number may differ on your hosts – you can use “esxcli software vib list” to find the exact version number.
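
The steps above can be sketched as a small shell script for the ESXi shell. This is only a rough sketch under a few assumptions: that the VIB name is the first column of the “esxcli software vib list” output, that it contains “storcli”, and that “esxcli system maintenanceMode get” prints “Enabled” when the host is in maintenance mode. The function names are made up for this example, so check the output by hand on one host before relying on it.

```shell
#!/bin/sh
# Sketch only: locate the installed StorCLI VIB and remove it.
# Function names are hypothetical helpers, not part of esxcli.

# Read "esxcli software vib list" output on stdin and print the first
# column (assumed to be the VIB name) of every line mentioning "storcli".
find_storcli_vib() {
    grep -i storcli | awk '{print $1}'
}

# Remove the StorCLI VIB, but only when the host reports maintenance mode.
remove_storcli() {
    state=$(esxcli system maintenanceMode get)
    if [ "$state" != "Enabled" ]; then
        echo "Host is not in maintenance mode, nothing removed" >&2
        return 1
    fi

    vib_name=$(esxcli software vib list | find_storcli_vib | head -n 1)
    if [ -z "$vib_name" ]; then
        echo "No StorCLI VIB found" >&2
        return 1
    fi

    echo "Removing VIB: $vib_name"
    esxcli software vib remove -n "$vib_name"
}
```

Nothing runs until you call “remove_storcli” yourself, so the script can be sourced safely and the lookup tested first with “esxcli software vib list | find_storcli_vib”.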
If you want to know why we initially installed StorCLI, have a look at my colleague's blog over here: https://www.virtual-allan.com/vsan-health-and-controller-firmware-n-a/
Please note that this bug is not exclusive to Lenovo servers, since other vendors use the same controller chips and StorCLI utility.
VMware has confirmed that they are working on a fix.