We have a Hyper-V 2012 R2 cluster with several host nodes, all Dell PowerEdge servers. The nodes are connected to two PowerConnect 5524 switches in a Master/Master-Backup setup, which are in turn connected to an MD3220i SAN.
The idea behind this setup was that the cluster services would keep running as long as only one piece of hardware failed.
This was working fine until the other day, when the power went out for one of the switches and for one of the SAN's power supplies. This led to a loss of connectivity between some of the cluster nodes and the volumes on the SAN.
The SAN did lose some connectivity, but it remained powered on throughout, so we don't believe it was the point of failure. The logs on the switches suggest there was a complete loss of connectivity for the Master. This raises the question: is this truly a fault-tolerant system?
After some investigation, it appears that others with a similar setup have also seen a loss of connectivity while the Master-Backup takes over from the Master switch. There has been mention of an option called FastLink/PortFast, which might speed up the takeover and so reduce the likelihood of a connectivity failure.
Does anyone know how the switches should be configured to ensure that our services are fault tolerant?
Thanks in advance for any replies.