Simon Horman works as a software engineer for VA Linux Systems Japan. In his downtime he also busies himself working on open source projects such as kexec-tools, kexec for Xen IA64, the Linux Virtual Server Project, and the Linux High-Availability Project, which seeks to provide high availability clustering solutions for Linux.
At this year's linux.conf.au in Melbourne, Australia, Horman will leave his Tokyo base to participate in the conference and to help organise the informal Linux HA Birds of a Feather session. He speaks to Howard Dahdah ahead of his arrival.
What's great about using Linux for high availability?
A tricky question, as I'm obviously somewhat biased. So rather than making a feature list and comparing it to others, I'll compare Linux HA to itself over time.
I first came to the project in about 1999, which was pretty early on in its history. I had some interest in doing simple fail-over of mail and web servers, which had come out of a need at the ISP I worked for at that time. My solution was fairly simple: only handling IP failover. And when the first release of Linux HA came about it was able to handle things much better - it could detect the failures too!
Even so, it couldn't do many things. For instance it could only manage two nodes and it didn't even do fencing - which is required to ensure that at most one node accesses sensitive resources such as shared disks. Fencing was added not too long after, but still, things were fairly rudimentary.
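The fencing rule described above can be illustrated with a minimal sketch. This is not Linux HA's actual code: the `Peer` record, the `deadtime` threshold, and the `fence` and `acquire_resources` callables are all hypothetical names chosen for illustration. The only point it tries to show is the ordering: detect the failure, fence the silent node, and only then take over its resources.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    """Hypothetical record of what the local node knows about its peer."""
    name: str
    last_heartbeat: float  # time (in seconds) the last heartbeat arrived
    fenced: bool = False


def peer_looks_dead(peer, now, deadtime=10.0):
    """Failure detection: no heartbeat for longer than `deadtime` seconds."""
    return now - peer.last_heartbeat > deadtime


def fail_over(peer, now, fence, acquire_resources, deadtime=10.0):
    """Take over the peer's resources only after it has been fenced.

    `fence` and `acquire_resources` are caller-supplied callables (assumed
    here, not a real API). `fence` must return True only once the peer is
    known to be powered off or cut off from shared storage.
    """
    if not peer_looks_dead(peer, now, deadtime):
        return "peer-alive"
    if not fence(peer):
        # A silent peer may still be writing to the shared disk, so
        # without confirmed fencing we must not touch the resources.
        return "fence-failed"
    peer.fenced = True
    acquire_resources()
    return "took-over"
```

In this sketch a failed fencing attempt leaves the resources untouched, which trades availability for safety: the service stays down rather than risk two nodes writing to the same disk.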
If we fast-forward to today, Linux HA can handle clusters of 8 nodes or more, it supports managing a large variety of resources, it has a GUI, it's constantly improving and, perhaps most importantly of all, it still supports the simple two-node first-generation cluster configurations.
So if you ask me what's great about Linux HA, I have to say it's the project's continuing growth.
What could distributions do to improve support of HA (or make it easier to deploy)?
At this time the HA solutions that distributions support vary somewhat. For instance, SuSE have used Linux HA for a long time, but I believe that Red Hat use a different code base. And there are of course distributions like Debian that don't ship a single Linux HA solution; rather, they ship a variety of packages and to some extent it's up to the end-user to put things together.
As with any software project, one of the most useful things for Linux HA is to get as wide a usage as possible, which stresses the code, APIs and feature set. While I'm certainly not advocating Linux HA as one solution to rule them all, it would certainly benefit Linux HA if more distributions were to use it as their first-line HA solution. I'm particularly referring to enterprise distributions as, to be honest, in the Linux HA developer space it's enterprise that provides much of the developer resources these days.
More generally, as with many projects, using the code and reporting problems helps. Getting involved and, in the case of companies, assigning resources to the development effort helps even more.
We used heartbeat2 at IDG and our head tech guy uses it to scare the junior techs: "You better get that finished soon or I'll make you change the cib file." Can you see configuration getting easier, or is it enough of a complex problem that it's likely to remain that way for a while?
That certainly isn't the first time that I have heard people speak of the cib configuration file in vain. And I'm certainly not going to claim that it's easy to work with. But to some extent the complexity that is present there represents the complexity and flexibility of the system. And while there is probably some room to simplify things without losing functionality, I suspect that the real answer to the problem is better tools to manage the configuration.
What are the advantages of Linux HA rather than the hardware solutions that are out there?
I think that when it comes to HA solutions it's really about finding a solution that best fits the need at hand. How strict are the HA requirements? How fast does the fail-over need to be? What, if any, performance hit is acceptable? How much hardware over-commit is reasonable?
Each solution meets these criteria in different ways. Linux HA has its parameters and it works within them. I think that the thing that is different about Linux HA is that the code is open source. This basically means that within Linux HA's definition of what HA is, people are able to customize and extend the code to meet their needs.
I know that is a bit of a catch-cry of the open source world, but in Linux HA we really do see people adding to the code to meet their needs.
What's the best comment you've ever gotten about the name "ultramonkey"?
I think the most amusing comment was in a meeting where some Japanese people likened the project name to that of the TV series "Monkey Magic", which is sometimes referred to as "Super Monkey" in Japan.