E.4. Common Behaviors: Two Member Cluster with Disk-based Tie-breaker

Loss of network connectivity to other member, shared media still accessible

Common Causes: Network connectivity lost.

Test Case: Disconnect all network cables from a member.

Expected Behavior: No failover occurs unless disk updates are also lost. In most cases services cannot be relocated, because the lock server requires network connectivity.

Verification: Run clustat to verify that services are still marked as running on the member, even though membership reports the member as inactive. Log messages state that the member is now in the PANIC state.
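The log check in this verification step could be scripted roughly as follows. This is a minimal sketch, not part of the cluster tools: the helper name count_panic_messages is hypothetical, the default log location /var/log/messages is an assumption about the syslog configuration, and the search string is taken only from the word PANIC used above, since the exact message text is not given here.

```shell
# Hypothetical helper: count log lines that mention the PANIC state.
# $1 is the log file to scan; /var/log/messages is an assumed default.
count_panic_messages() {
    logfile=${1:-/var/log/messages}
    # grep -c prints the number of matching lines (0 when none match).
    grep -c 'PANIC' "$logfile"
}

# Possible use on the disconnected member, after checking clustat by hand:
#   clustat
#   count_panic_messages
```

A non-zero count only shows that some line contains the word PANIC; the lines themselves still need to be read to confirm that they refer to this member.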

Loss of access to shared media

Common Causes: Shared media loses power, cable connecting a member to the shared media is disconnected.

Test Case: Unplug SCSI or Fibre Channel cable from a member.

Expected Behavior: No failover occurs unless the network is also lost. The configured action for loss of access to shared storage (reboot, halt, stop, or ignore) is taken; the default is reboot. The action may subsequently cause a failover.

System hang or crash (panic) on member X

Test Case: Suspend the cluquorumd and clumembd daemons by sending them the STOP signal, which simulates a hang without terminating them.

killall -STOP cluquorumd clumembd

Expected Behavior: The hung cluster member is fenced by the other cluster member. Services fail over. Configured watchdog timers may be triggered.
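The STOP signal used in this test case suspends a process rather than ending it, which is why it resembles a hang. The sketch below illustrates that on a harmless stand-in process (sleep) instead of the cluster daemons. The variable names are illustrative, and the state letters come from ps on Linux, where a stopped process is reported with a state beginning with T.

```shell
# Stand-in for a daemon: a background sleep, not a cluster process.
sleep 300 &
pid=$!

# Suspend it. The process stays in the process table but stops running.
kill -STOP "$pid"
sleep 1
state_stopped=$(ps -o stat= -p "$pid" | tr -d ' ')

# Resume it, then read the state again.
kill -CONT "$pid"
sleep 1
state_resumed=$(ps -o stat= -p "$pid" | tr -d ' ')

# Clean up the stand-in process.
kill -TERM "$pid"
wait "$pid" 2>/dev/null || true

echo "stopped=$state_stopped resumed=$state_resumed"
```

On a real member the suspended daemons stop sending updates while still appearing in the process list, so the other member sees the same symptoms as a hung system.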

Start of Cluster Services without network connectivity

Common Causes: Bad switch; one or both members are without network connectivity.

Test Case: Stop cluster services on all members. Disconnect all network cables from one member. Start cluster services on both members.

Verification: Some services may not start, because locks require network connectivity. The Cluster Manager requires a fully connected subnet, so this case is handled on a best-effort basis; technically, the cluster is inoperable.