This document outlines the process required to install and configure Linux for use in a cluster environment. While no prior knowledge of either Linux or clusters is assumed, the reader should be forewarned that this is not a trivial task.
GNU/Linux is a freely or inexpensively available operating system based around the Linux kernel, which was developed by Linus Torvalds. GNU is a recursive acronym that stands for "GNU's Not Unix". GNU is basically a collection of open source implementations of common Unix utilities and programs, written to provide an alternative to expensive commercial Unix software. Both are open source, meaning that the source code for the kernel and the GNU applications is freely available to anyone. Since the source code is available, anyone who can code can modify it. This has led to many implementations of GNU/Linux, of which clustering is but one.
A cluster is a group of computers which work together toward a final goal. Some would argue that a cluster must at least consist of a message passing interface and a job scheduler. The message passing interface works to transmit data among the computers (commonly called nodes or hosts) in the cluster. The job scheduler is just what it sounds like. It takes job requests from user input or other means and schedules them to be run on the number of nodes required in the cluster. It is possible to have a cluster without either of these components, however. Consider a cluster built for a single purpose. There would be no need for a job scheduler and data could be shared among the hosts with simple methods like a CORBA interface.
By definition, however, a cluster must consist of at least two nodes, a master and a slave. The master node is the computer that users are most likely to interact with since it usually has the job scheduler running on it. The master can also participate in computation like the slave nodes do, but this is not required or even recommended in large clusters. The slave nodes are just that. They respond to the requests of the master node and, in general, do most of the computing.
The First Step: Hardware Considerations
To build a cluster, one must have access to computers on which to install the
software. Therefore, it makes sense to cover this early in the process.
As stated earlier, it is necessary to have at least two machines when building a cluster. It is not necessary that these machines have the same levels of performance. The only requirement is that they both share the same architecture. For instance, the cluster should only consist of all Intel machines or all Apple machines but not a mixture of the two. It is possible in theory to mix architectures when building a cluster by using Java, but that is outside the scope of this document.
Strictly speaking, the only hardware requirements when building a cluster are two computers and some type of networking hardware to connect them with. This is far from ideal, however.
To maximize the benefits of a cluster, the right hardware must be used. It is generally accepted that for optimal performance, all nodes except the master node must have identical hardware specifications. This is because one node which takes longer to do its work can slow the entire cluster down, as the rest of the nodes must stop what they are doing and wait for the slow node to catch up. This is not always the case, but it is a consideration that must be made. Having identical hardware specs also simplifies the setup process a great deal, as it allows each hard drive to be imaged from a master instead of configuring each node individually.
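The cost of one slow node can be illustrated with a small calculation. The sketch below assumes a workload in which every node must finish its share before the next step can begin; the function names and the timings are invented for illustration and are not measurements from any real cluster.

```python
def step_time(node_seconds):
    """Time for one synchronized step: everyone waits for the slowest node."""
    return max(node_seconds)


def idle_fraction(node_seconds):
    """Fraction of the total node-time that is spent waiting."""
    slowest = max(node_seconds)
    available = slowest * len(node_seconds)  # node-seconds reserved for the step
    busy = sum(node_seconds)                 # node-seconds spent computing
    return (available - busy) / available


# Hypothetical per-node times, in seconds, for the same share of work.
identical = [10.0, 10.0, 10.0, 10.0]
mixed = [10.0, 10.0, 10.0, 20.0]  # one node is half as fast as the others

print(step_time(identical), idle_fraction(identical))
print(step_time(mixed), idle_fraction(mixed))
```

With these made-up numbers, a single slow node doubles the time for the step and leaves more than a third of the available node-time idle.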
There are four main considerations when building the master node. They
are: processor speed, disk speed, network speed, and RAM.
Processor Speed
This is especially critical if the master node participates in computation. The master node will be handling many more tasks than the slave nodes, so a faster processor may be required to keep it from lagging behind. Keep in mind that since the master node can be kept quite busy doling out work to the other nodes, a slowdown here can have a huge negative impact on the entire cluster as the slave nodes waste time waiting for their next instruction.
Disk Speed
Since most work done on the cluster will be saved as files on a hard drive at some time or another, disk speed for the master node is absolutely critical. It is made even more so by the fact that most clusters make use of NFS, which means that every node in the cluster will be competing for access to the master node's disk. A fast SCSI drive is recommended and a RAID array of type 5 or 0+1 is ideal, but an IDE drive will work as well.
Network Speed
This is critical as well. Time spent transmitting data is time wasted. The faster the network, the better the performance of the cluster. This can be mitigated a good deal if the programmer expressly tries to minimize the ratio of time on the network to time on the processor, but it never hurts to have more network speed. Fast Ethernet is recommended and Gigabit Ethernet is ideal, but basically any network speed will work. While not part of the master node per se, it is strongly recommended that a switch be used instead of a hub when designing the cluster network.
RAM
RAM is crucial in the master node for two reasons. First, the more RAM, the more processes can be run without accessing the disk. Second, the Linux kernel can and will cache its disk writes to memory and keep them there until they must be written to disk. Both of these increase the speed of the master node, which is critical to good overall cluster performance.
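The network speed trade-off discussed above can be made concrete with a rough estimate. The sketch below assumes the nominal link rates of 100 Mbit/s for Fast Ethernet and 1000 Mbit/s for Gigabit Ethernet, and ignores protocol overhead, latency and contention, so real transfers will be slower. The function names and the 100 MB payload are illustrative only.

```python
def transfer_seconds(payload_bytes, link_megabits_per_second):
    """Idealized time to move a payload over a link at its nominal rate."""
    bits = payload_bytes * 8
    return bits / (link_megabits_per_second * 1_000_000)


def network_to_cpu_ratio(network_seconds, compute_seconds):
    """Ratio of time on the network to time on the processor (lower is better)."""
    return network_seconds / compute_seconds


payload = 100_000_000  # hypothetical 100 MB result set
fast_ethernet = transfer_seconds(payload, 100)
gigabit = transfer_seconds(payload, 1000)

print(fast_ethernet, gigabit)
# Assume, for illustration, 60 seconds of computation per payload.
print(network_to_cpu_ratio(fast_ethernet, 60.0))
```

Under these assumptions the same payload takes about ten times longer on Fast Ethernet than on Gigabit Ethernet, which matters most when the computation per payload is short.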
The slave nodes need to accomplish two tasks: perform the computations assigned to them and then send that data back out over the network. For this reason, their disk performance is not critical. In fact, it is common to have nodes without hard drives in a cluster. These diskless nodes further reduce the cost of building a cluster and eliminate some of the time required to set a cluster up. This document, however, assumes that the slave nodes will have hard drives.
The three most important hardware considerations for slave nodes are
processor speed, network speed and RAM.
Processor Speed
Since a node's primary function is computation, it makes sense to use the fastest possible processor. The more processing power the better. Multiple processors for each node (i.e. SMP) can be desirable but add another degree of complexity to programming applications for the cluster. Not only must the programmer take distributed processing into consideration, but SMP as well. As of the time of this writing, Intel Pentium III processors offer a good price/performance ratio, and Pentium IV processors offer good performance if the programmer makes use of SSE2 (Intel's special set of instructions designed to provide enhanced floating point performance), but AMD's Athlon processors offer an outstanding price/performance ratio and even better performance for most applications. AMD processors, however, produce much more heat and use more power than the other two, so make a decision based on all these factors.
Network Speed
This affects the slave nodes in exactly the same way that it does the master node. See that section above for more information.
RAM
This affects the slave nodes in exactly the same way that it does the master node. See that section above for more information.
As mentioned earlier, a switch is more desirable than a hub when designing clusters due to the increased speed it offers. It is also a good idea to purchase a KVM (Keyboard, Video, Mouse) switch to allow easy access to each individual node. Also, consider using 1U cases when building a cluster to reduce the space requirements and for the increased organization they offer. Keep heat in mind as well. 50 or more nodes can produce a significant amount of heat and affect both the stability of the cluster and the comfort of the operator. This is especially important when using AMD based machines in a small room.
The Second Step: Installing Linux
After the machines are assembled, the next logical step is to install Linux.
There are many distributions of Linux available and different people prefer
different distributions for different reasons. This document uses RedHat 7.1
with the default kernel as a basis. Other versions of Linux can be used,
however, and some are more or less desirable depending on the experience level
of the cluster administrator or personal taste.
Before installing Linux, it is a good idea to gather as much information
as possible about the PC Linux will be installed on. The areas defined
below should be considered the minimum a user should know before
attempting to install Linux.
Monitor
Find out the monitor's maximum horizontal and vertical refresh rates as well as the maximum resolution and color depths it supports. The monitor manufacturer's website should have all the information required. It is possible that the installation process will detect the information it needs automatically or that the monitor will be included in a list the user can choose from during install time. Nevertheless, it is a good idea to have this information handy, as misconfiguration can physically damage the monitor.
Video Card
As with the monitor, it is possible that the installer will detect and configure the card automatically, but this is not always the case. Find out the make and model of the card as well as the amount of RAM it has. Most modern cards will not need a special RAMDAC setting or clockchip setting, so these steps can be skipped when the time comes.
Hard Drive
Be sure to know the size of the drive, as this will affect how the partitions are set up later in the install process. Note that some ATA100 drives or controllers may be problematic when using the 2.4.2 kernel that comes with RedHat 7.1.
Network Card
Know the make and model of the card. It is very likely that the installer will detect this properly, but there is always a chance that configuration will have to be done manually.
Linux will need to be installed on every node of the cluster. Thanks to the easy to use installers that ship with most Linux distributions, this task is becoming trivial. Nevertheless, you may want to refer to these Red Hat guides:
The Official RedHat Linux x86 Installation Guide
The Official Red Hat Linux Reference Guide
Red Hat's Official Linux Reference Guide: Appendix B--Introduction to Disk Partitions
| Device | Size | Mount Point |
| /dev/sda1 | 9.6G | / |
| /dev/sda2 | 53M | /boot |
| /dev/sda5 | 16G | /home |
| /dev/sdb6 | 6.1G | /usr/local |
For the slave nodes, you just need the / partition and the swap partition, since the applications will all be installed on the master node's /usr/local partition, and since all user files will be stored on the master node's /home partition. The /home and /usr/local directories of the slave nodes are mounted from those on the master node, so they can be any size on the slave node. If you type "cd /home" or "cd /usr/local" on a slave node, you are actually going to the directory on the master node thanks to NFS (Network File System).
id:3:initdefault:
to
id:5:initdefault:
The Third Step: Configuring the Nodes
Now that Linux is functioning properly on every node, it is time to configure them to work with the clustering software. The steps below outline the basic configuration steps clustering software will need to operate properly. Keep in mind that these steps are just a guideline. Remember that many of the files that will be created or edited in this process will grant access to the root user only. It is therefore prudent to log in as root to save time.
The clustering software must be installed on the master and slave nodes, of course. This can be done now or later. For mimosa, we used the Portland Group's Cluster Development Kit. This step is not covered in detail in this document. Be sure to read the documentation for the clustering software to be used, as different software may require different configuration procedures.
Type "setup" and make sure that rsh, rlogin, and rexec are checked. Also make sure that NFS is running.
You will need to decide what IP addresses to use for the nodes of your cluster. For what it's worth, Spector recommends using a Net 10 special address class--that is, giving each node an address of 10.X.X.X, such as: 10.0.2.1, 10.0.2.2, 10.0.2.3, etc. However, for mimosa, we elected to use: 192.168.0.1, 192.168.0.2, 192.168.0.3, etc. You might try the same.
Edit /etc/hosts on every cluster node, adding the names and IP addresses of every node in the cluster. This allows these machines to be accessed by name instead of by IP number. A typical /etc/hosts file will look something like this, where mimosa is the hostname of the master node and the others, apart from localhost, are the hostnames of the slave nodes:
127.0.0.1 localhost
192.168.0.1 mimosa
192.168.0.2 node1-1
192.168.0.3 node1-2
192.168.0.4 node1-3
192.168.0.5 node1-4
192.168.0.6 node1-5
192.168.0.7 node1-6
192.168.0.8 node1-7
192.168.0.9 node1-8
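Since the same list has to be present on every node, it can be convenient to generate it rather than type it by hand. The following is a minimal sketch, assuming the addressing and naming pattern shown above (master at .1, slaves named node1-N on consecutive addresses); the function name and its parameters are illustrative and not part of any Red Hat tool.

```python
def hosts_lines(master, slave_count, network="192.168.0"):
    """Build /etc/hosts entries for one master and a number of slave nodes."""
    lines = ["127.0.0.1 localhost", f"{network}.1 {master}"]
    for index in range(1, slave_count + 1):
        # Slaves follow the master: node1-1 at .2, node1-2 at .3, and so on.
        lines.append(f"{network}.{index + 1} node1-{index}")
    return lines


# Print the entries; redirect the output to a file and review it before
# copying it to /etc/hosts on each node.
print("\n".join(hosts_lines("mimosa", 8)))
```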
Hosts defined in /etc/hosts.equiv are considered to be equivalent to the localhost for security purposes. This means that users of these machines can access the localhost without supplying a password. This can be a significant security risk, but is required so that RSH will be able to log into each machine without a password. Not all clustering software requires this, so check to see if it does.
A typical /etc/hosts.equiv file looks something like this:
192.168.0.1 mimosa
192.168.0.2 node1-1
192.168.0.3 node1-2
192.168.0.4 node1-3
192.168.0.5 node1-4
192.168.0.6 node1-5
192.168.0.7 node1-6
192.168.0.8 node1-7
192.168.0.9 node1-8
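Because these address lists are typed by hand into several files on every node, a repeated address is an easy slip to make. The sketch below is one way to check for it; it assumes the simple "address name" layout used in the listings above, and the function name and sample text are hypothetical.

```python
from collections import defaultdict


def duplicate_addresses(text):
    """Return {address: [names]} for every address listed more than once."""
    names_by_address = defaultdict(list)
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on blanks
        if len(fields) >= 2:
            names_by_address[fields[0]].append(fields[1])
    return {addr: names for addr, names in names_by_address.items()
            if len(names) > 1}


# Hypothetical sample containing one repeated address.
sample = """\
192.168.0.6 node1-5
192.168.0.6 node1-6
192.168.0.8 node1-7
"""
print(duplicate_addresses(sample))
```

To check a real file, the same function could be given the contents of /etc/hosts read from disk.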
The .rhosts file should exist in each user's home directory. Notice the "." at the beginning of the file name. This means that the file will be hidden. This file is also required so users can use RSH to connect to each node without supplying a password.
A typical .rhosts file looks something like this:
mimosa
node1-1
node1-2
node1-3
node1-4
node1-5
node1-6
node1-7
node1-8
/etc/securetty is a list of ttys from which root can log in. Allowing root to log in through the remote services makes administration of the nodes easier and is highly recommended. Simply add "rsh", "rexec", and "rlogin" to the end of the file.
The /etc/pam.d directory contains configuration files that affect logins for the various services defined there.
Modify the rsh, rlogin, and rexec files by rearranging the lines so that the line with "rhosts" is the first line and the line with "securetty" is the second line. An example of these files after modification is given below:
auth required /lib/security/pam_rhosts_auth.so
auth required /lib/security/pam_securetty.so
auth required /lib/security/pam_nologin.so
auth required /lib/security/pam_env.so
account required /lib/security/pam_stack.so service=system-auth
session required /lib/security/pam_stack.so service=system-auth
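The rearrangement described above is mechanical, so it can be sketched as a small function. This is only an illustration of the ordering rule (the "rhosts" line first, the "securetty" line second, everything else in its existing order); the starting order in the sample is hypothetical, and any real file should be backed up and reviewed by hand before it is changed.

```python
def reorder_pam(lines):
    """Move the 'rhosts' line first and the 'securetty' line second."""
    rhosts = [line for line in lines if "rhosts" in line]
    securetty = [line for line in lines if "securetty" in line]
    rest = [line for line in lines
            if "rhosts" not in line and "securetty" not in line]
    return rhosts + securetty + rest  # the remaining lines keep their order


# Hypothetical starting order, using the module names from the example above.
before = [
    "auth required /lib/security/pam_nologin.so",
    "auth required /lib/security/pam_securetty.so",
    "auth required /lib/security/pam_env.so",
    "auth required /lib/security/pam_rhosts_auth.so",
    "account required /lib/security/pam_stack.so service=system-auth",
]
print("\n".join(reorder_pam(before)))
```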
The /etc/exports file should only be modified on the master node. It determines which directories will be exported by NFS. Exporting a directory allows every host to access it and eliminates the need to replicate work on every node. For example, if user joe adds a file named foo.txt in his home directory on the master node, foo.txt will be accessible from every node that is configured to mount this export. Typically, it is a good idea to export each user's home directory. This is accomplished by adding the following line to /etc/exports.
/home 192.168.0.0/255.255.255.0(rw,no_root_squash)
It is also a good idea to export /usr/local, as many user programs will be installed here. Do this by adding the following line to /etc/exports.
/usr/local 192.168.0.0/255.255.255.0(rw,no_root_squash)
This file should only be modified on the slave nodes.
/etc/fstab is a list of devices and directories that will be mounted at boot time. Since NFS is being used, it is necessary to modify /etc/fstab to mount these NFS directories.
The following lines should be added to mount the two exports set up above.
[hostname of master node]:/usr/local /usr/local nfs
[hostname of master node]:/home /home nfs
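Each exported directory on the master node pairs with one mount entry on the slave nodes. The sketch below builds both lines from the same inputs to make that correspondence explicit. It follows the formats shown above and uses mimosa as the master hostname; the function names and default arguments are illustrative assumptions.

```python
def export_line(directory, network="192.168.0.0", netmask="255.255.255.0",
                options=("rw", "no_root_squash")):
    """Build one /etc/exports entry for the master node."""
    return f"{directory} {network}/{netmask}({','.join(options)})"


def fstab_line(master, directory):
    """Build the matching /etc/fstab entry for a slave node."""
    return f"{master}:{directory} {directory} nfs"


for shared in ("/home", "/usr/local"):
    print("master /etc/exports:", export_line(shared))
    print("slave  /etc/fstab:  ", fstab_line("mimosa", shared))
```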
On each node, type ifconfig and make sure that the machine has its appropriate interior IP address. (Such as 192.168.0.X).
On each node, go to /etc/rc.d/init.d/ and type ./network stop. On the master node, also type ./nfs stop
On the master node, type ./nfs start
On each node, type ./network start.
Now that the required modifications have been made, enable rsh, rlogin and rexec on all nodes. There are several ways to do this. On RedHat, as root enter the command "setup". Choose "System Services" then select each of the items listed above. When configuring the master node be sure to select "nfs" as well. Select items by pressing the space bar.
1. Restart the machines to force the changes to take effect.
2. Ensure that rsh will work without a password by trying, from the
master node, to rsh to a slave node. If rsh works as expected, follow
the directions in the clustering software documentation to install and
configure the clustering software.
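With more than a handful of slave nodes, the rsh check in step 2 can be repeated for each node from a small script run on the master. The following is a sketch under a few assumptions: the rsh client is installed and on the PATH, the node names match those in /etc/hosts, and a node counts as reachable if running the hostname command on it returns some output within ten seconds. The function names are illustrative.

```python
import subprocess


def rsh_check_command(node):
    """Command line that asks a node to report its own hostname over rsh."""
    return ["rsh", node, "hostname"]


def check_nodes(nodes, run=subprocess.run):
    """Return {node: True/False} depending on whether rsh produced output."""
    results = {}
    for node in nodes:
        try:
            completed = run(rsh_check_command(node), capture_output=True,
                            text=True, timeout=10)
            # Require output as well as a zero exit status, since the exit
            # status alone may not reflect a failure on the remote side.
            results[node] = (completed.returncode == 0
                             and completed.stdout.strip() != "")
        except (OSError, subprocess.TimeoutExpired):
            results[node] = False  # rsh missing, or the node did not answer
    return results
```

Calling check_nodes(["node1-1", "node1-2"]) on the master node would then show which slaves still need attention before the clustering software is installed.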
The most glaring issue with this process is the security implications it raises. Allowing root to log in remotely to a node via rsh is not generally recommended in a general purpose Linux installation. Using /etc/hosts.equiv and the other methods to allow logins without supplying a password is usually frowned on as well. In fact, rsh and rlogin are considered insecure themselves. For these reasons, it is absolutely imperative that, if the master node can be accessed from the Internet, great care is taken to ensure the security of the cluster.
Remember that when a user is added to the system, it will be necessary to add that user to all nodes. This is most easily accomplished by simply copying /etc/passwd and /etc/shadow from the master node to all other nodes.
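One way to do the copying is with rcp, the remote copy program that belongs to the same family as rsh. The sketch below only builds the command lines so they can be reviewed first; it assumes rcp is installed, that root can rcp to each node without a password as configured above, and that the node names are the ones used in this document. The function name is illustrative.

```python
def account_copy_commands(nodes, files=("/etc/passwd", "/etc/shadow")):
    """Build one rcp command per node and file, preserving modes and times."""
    return [["rcp", "-p", path, f"{node}:{path}"]
            for node in nodes
            for path in files]


# Print the commands for two slave nodes; run them as root on the master.
for command in account_copy_commands(["node1-1", "node1-2"]):
    print(" ".join(command))
```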
Hostnames can be changed either with RedHat's setup utility under "Networking" or with the hostname command, like this:
hostname host1
