Free high availability: Create a XenServer virtualization cluster

With the free Citrix XenServer virtualization platform, it's easy to create a highly available virtual server cluster; here's how


We get in return:

uuid ( RO) : 5ff9245d-726d-41ee-872b-1480ab4e2a56
name-label ( RW): xennode01
host-metrics-live ( RO): true

uuid ( RO) : a1716dba-7a75-4e99-94f6-27c00b8b122d
name-label ( RW): xennode03
host-metrics-live ( RO): false

uuid ( RO) : 79285776-847d-4ce0-acd3-86934a026634
name-label ( RW): xennode02
host-metrics-live ( RO): true

So now we have the uuid of the host that is down (note that its host-metrics-live value is false): a1716dba-7a75-4e99-94f6-27c00b8b122d. Now we enter:

xe vm-list is-control-domain=false resident-on=a1716dba-7a75-4e99-94f6-27c00b8b122d

This command lists the VMs the cluster thinks are running on the downed node. (The is-control-domain=false parameter removes dom0 from the list.)

We get:

uuid ( RO) : 5d21d7e2-5cb3-5e20-4307-b69d7eea8d94
name-label ( RW): Windows XP Test
power-state ( RO): running

uuid ( RO) : 2fbb543e-aac0-4488-8e57-099d2f71f01e
name-label ( RW): Fedora13_32
power-state ( RO): running
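With three hosts it is easy to pick the downed node's uuid out of the listing by eye. For larger pools, the same lookup can be sketched in a few lines of shell. This is only an illustration: the function name down_host_uuids is our own invention, not part of the xe CLI, and the parsing assumes the host-list output has the "field ( RO) : value" shape shown above.

```shell
#!/bin/bash
# Sketch only: down_host_uuids is a hypothetical helper, not an xe command.

# Read the output of
#   xe host-list params=uuid,name-label,host-metrics-live
# on stdin and print the uuid of every host whose host-metrics-live is false.
down_host_uuids() {
    awk -F': *' '
        # Remember the uuid of the record currently being read.
        /^ *uuid \(/ { gsub(/[ \t\r]+$/, "", $2); uuid = $2 }
        # When the liveness line says false, report the remembered uuid.
        /host-metrics-live/ && $2 ~ /false/ { print uuid }
    '
}

# Only touch the pool when the xe CLI is actually present on this machine.
if command -v xe >/dev/null 2>&1; then
    xe host-list params=uuid,name-label,host-metrics-live | down_host_uuids |
    while read -r uuid; do
        # Same query as in the text, once per downed host.
        xe vm-list is-control-domain=false resident-on="$uuid" </dev/null
    done
fi
```

The parser tracks the most recent uuid line and prints it only when the following host-metrics-live line reads false, so it does not depend on how many blank lines separate the records.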

Both VMs need to be recognized as turned off, so we enter:

xe vm-reset-powerstate resident-on=a1716dba-7a75-4e99-94f6-27c00b8b122d --force --multiple

This command forces the nodes in the pool to recognize the VMs resident on the host with this uuid as powered off. (The --multiple flag is only necessary if the command applies to more than one VM. Be careful, because incorrect usage can force all VMs in the cluster to be powered off.)

Returning to XenCenter, the console now shows that both the Windows VM and the Linux VM are off. Starting them moves the VMs to a different server (xennode01), and we are back in business.
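Because a mistyped filter combined with --multiple is the risky part of this procedure, it may be worth wrapping the reset in a small guard. The sketch below is an assumption-laden example, not a tested tool: reset_vms_on_host and the DRY_RUN variable are hypothetical names of our own, and the guard simply refuses to proceed unless its argument looks like a single uuid. By default it only prints the command it would run.

```shell
#!/bin/bash
# Sketch only: reset_vms_on_host and DRY_RUN are hypothetical names.

# Reset the recorded power state of the VMs on one host, but only if the
# argument looks like exactly one uuid. A blank or malformed value is
# rejected before it can be passed alongside --multiple.
reset_vms_on_host() {
    local uuid="$1"
    local re='^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'

    if [[ ! "$uuid" =~ $re ]]; then
        echo "refusing to reset: '$uuid' is not a host uuid" >&2
        return 1
    fi

    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: show the command instead of running it.
        echo "xe vm-reset-powerstate resident-on=$uuid --force --multiple"
    else
        xe vm-reset-powerstate resident-on="$uuid" --force --multiple
    fi
}
```

Calling reset_vms_on_host a1716dba-7a75-4e99-94f6-27c00b8b122d prints the command for review; running it with DRY_RUN=0 set in the environment executes it.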

Final notes
It is not difficult to create a highly available virtual server cluster using XenServer. Given the right approach, the availability can be carried even further. Within the XenServer functionality are methods to bond network cards, so the failure of any given NIC does not bring down a system. Using a different switch for each network card removes the switches from the possible failure list. Failure of the external storage can be overcome with the right RAID environment. And all this is possible with free software!

XenServer is built on a Linux kernel. A few people have suggested that we could script the recovery and even add a cron job that checks for downed nodes and runs the recovery script. This seems very plausible. One would just have to be sure that multiple copies of the script weren't executed from the different nodes, and that a node that went down and stayed down didn't prompt recurring script executions. If this were done properly, you would get automatic recovery of downed VMs!

As a starting point we offer the following. (Please note: We have not extensively tested this script.)

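#!/bin/bash

# Find the uuid of any host whose host-metrics-live field is false.
# sed joins each host record onto one line; awk prints the uuid value
# with its padding spaces removed.
downedhost=$(xe host-list params=uuid,name-label,host-metrics-live | \
    sed -e :a -e '$!N;s/\n/|/g;ta;s/|||/\n/g' | \
    grep false | \
    awk -F"[:,|]" '{ gsub(/ /, "", $2); print $2 }')

if [ -z "$downedhost" ]; then
    echo "Hosts all good."
else
    echo "$downedhost is down! Promoting myself to master."
    # Intended for the case where the downed host was the pool master.
    xe pool-emergency-transition-to-master
    xe pool-recover-slaves
    # Reset each downed host separately so resident-on always gets one uuid.
    for host in $downedhost; do
        xe vm-reset-powerstate resident-on="$host" --force --multiple
    done
fi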
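The two safeguards mentioned earlier (no overlapping runs, and no repeated recovery for a node that stays down) could be sketched as a pair of helpers wrapped around the recovery script. Everything here is an assumption made for illustration: the function names, the file paths under /var/run, and the availability of the util-linux flock command on the host are ours, and none of it has been tested on XenServer.

```shell
#!/bin/bash
# Sketch only: paths and function names are hypothetical.

LOCK_FILE="${LOCK_FILE:-/var/run/xen-recover.lock}"
STATE_FILE="${STATE_FILE:-/var/run/xen-recover.handled}"

# Return 0 the first time a host uuid is seen and 1 on every later call,
# so a host that stays down triggers recovery only once.
first_time_seen() {
    local uuid="$1"
    touch "$STATE_FILE"
    if grep -qxF -- "$uuid" "$STATE_FILE"; then
        return 1
    fi
    echo "$uuid" >> "$STATE_FILE"
    return 0
}

# Run a command under an exclusive, non-blocking lock. If another copy
# already holds the lock, give up immediately instead of queueing.
run_locked() {
    (
        flock -n 9 || exit 75
        "$@"
    ) 9>"$LOCK_FILE"
}
```

A cron-driven wrapper might call run_locked with the recovery script as its argument and consult first_time_seen for each downed uuid before resetting anything. Note the limits of this sketch: a lock file on local disk only prevents overlapping runs on the same node, so the simplest way to avoid different nodes racing each other is to schedule the job on one designated node. The state file also needs clearing once a host rejoins the pool, or a later failure of the same host will be ignored.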

 

Copyright © 2010 IDG Communications, Inc.
