
MCSR Computing
How to Build a Beowulf Linux Cluster

This document outlines the process required to install and configure Linux for
use in a cluster environment. While no prior knowledge of either Linux or
clusters is assumed, the reader should be forewarned that this is not a trivial
task.

What is GNU/Linux?
What is a Cluster?
The First Step: Hardware Considerations
Clusters in an Ideal World
The Master Node
Processor Speed
Disk Speed
Network Speed
RAM
The Slave Nodes
Processor Speed
Network Speed
RAM
Other Hardware Components
The Second Step: Installing Linux
Gather Hardware Information
Monitor
Video Card
Hard Drive
Network Card
Install Linux
The Third Step: Configuring the Nodes
General Linux System Files
/etc/hosts
/etc/hosts.equiv
.rhosts
/etc/securetty
/etc/pam.d
/etc/exports
/etc/fstab
Enabling Required Services
Testing the Configuration
Adding a User
General Notes

What is GNU/Linux?

GNU/Linux is a freely or inexpensively available operating system built around
the Linux kernel, which was originally developed by Linus Torvalds. GNU is a
recursive acronym that stands for "GNU's Not Unix"; the GNU project is a
collection of open source implementations of common Unix utilities and
programs, written to provide an alternative to expensive commercial Unix
software. Both the kernel and the GNU tools are open source, meaning that
their source code is freely available to anyone. Since the source code is
available, anyone who can code can modify it. This has led to many uses of
GNU/Linux, of which clustering is but one.


What is a Cluster?

A cluster is a group of computers which work together toward a final goal. Some
would argue that a cluster must at least consist of a message passing interface
and a job scheduler. The message passing interface works to transmit data among
the computers (commonly called nodes or hosts) in the cluster. The job scheduler
is just what it sounds like. It takes job requests from user input or other
means and schedules them to be run on the number of nodes required in the
cluster. It is possible to have a cluster without either of these components,
however. Consider a cluster built for a single purpose. There would be no need
for a job scheduler and data could be shared among the hosts with simple methods
like a CORBA interface.

By definition, however, a cluster must consist of at least two nodes, a master
and a slave. The master node is the computer that users are most likely to
interact with since it usually has the job scheduler running on it. The master
can also participate in computation like the slave nodes do, but it is not
required or even recommended in large clusters. The slave nodes are just that.
They respond to the requests of the master node and, in general, do most of the
computing.


The First Step: Hardware Considerations

To build a cluster, one must have access to computers on which to install the
software. Therefore, it makes sense to cover this early in the process.

As stated earlier, it is necessary to have at least two machines when building a
cluster. It is not necessary that these machines have the same levels of
performance. The only requirement is that they both share the same architecture.
For instance, the cluster should only consist of all Intel machines or all Apple
machines but not a mixture of the two. It is possible in theory to mix
architectures when building a cluster by using Java, but that is outside the
scope of this document.

Strictly speaking, the only hardware requirements for building a cluster are
two computers and some type of networking hardware to connect them. This is
far from ideal, however.

Clusters in an Ideal World

To maximize the benefits of a cluster, the right hardware must be used. It is
generally accepted that for optimal performance, all nodes except the master
node must have identical hardware specifications. This is due to the fact that
one node which takes longer to do its work can slow the entire cluster down as
the rest of the nodes must stop what they are doing and wait for the slow node
to catch up. This is not always the case, but it is a consideration that must be
made. Having identical hardware specs also simplifies the setup process a great
deal as it will allow each hard drive to be imaged from a master instead of
configuring each node individually.

The Master Node

There are four main considerations when building the master node. They
are: processor speed, disk speed, network speed, and RAM.

Processor Speed

This is especially critical if the master node participates in
computation. The master node will be handling many more tasks
than the slave nodes so a faster processor may be required to
keep it from lagging behind. Keep in mind that since the
master node can be kept quite busy doling out work to the other
nodes, a slowdown here can have a huge negative impact on the
entire cluster as the slave nodes waste time waiting for their
next instruction.

Disk Speed

Since most work done on the cluster will be saved as files on a
hard drive at some time or another, disk speed for the master
node is absolutely critical, made even more so by the fact
that most nodes make use of NFS, which means that every node in
the cluster will be competing for access to the master node's
disk. A fast SCSI drive is recommended and a RAID array (level 5
or 0+1) is ideal, but an IDE drive will work as well.

Network Speed

This is critical as well. Time spent transmitting data is time
wasted. The faster the network, the better the performance of
the cluster. This can be mitigated a good deal if the
programmer expressly tries to minimize the ratio of time on the
network to time on the processor, but it never hurts to have more
network speed. Fast Ethernet is recommended, Gigabit Ethernet is
ideal but basically any network speed will work. While not part
of the master node per se, it is strongly recommended that a
switch be used instead of a hub when designing the cluster
network.

RAM

RAM is crucial in the master node for two reasons. First, the
more RAM, the more processes can be run without accessing the
disk. Second, the Linux kernel can and will cache its disk
writes to memory and keep them there until they must be written
to disk. Both of these increase the speed of the master node
which is critical to good overall cluster performance.

The Slave Nodes

The slave nodes need to accomplish two tasks: perform the computations
assigned to them and then send that data back out over the network. For
this reason, their disk performance is not critical. In fact, it
is common to have nodes without hard drives in a cluster. These diskless
nodes further reduce the cost of building a cluster and eliminate some
of the time required to set a cluster up. This document, however,
assumes that the slave nodes will have hard drives.

The three most important hardware considerations for slave nodes are
processor speed, network speed and RAM.

Processor Speed

Since a node's primary function is computation, it makes sense
to use the fastest possible processor. The more processing
power, the better. Multiple processors per node (i.e., SMP)
can be desirable but add another degree of complexity to
programming applications for the cluster: not only must the
programmer take distributed processing into consideration, but
SMP as well. As of this writing, Intel Pentium IIIs
offer a good price/performance ratio, and Pentium 4s offer good
performance if the programmer uses SSE2 (Intel's special set
of instructions designed to provide enhanced floating point
performance), but AMD's Athlon processors offer an outstanding
price/performance ratio and even better performance for most
applications. AMD processors, however, produce much more heat
and use more power than the other two, so make a decision based
on all of these factors.

Network Speed

This affects the slave nodes in exactly the same way that
it does the master node. See that section above
for more information.

RAM

This affects the slave nodes in exactly the same way that it
does the master node. See that section above for more
information.

Other Hardware Components

As mentioned earlier, a switch is more desirable than a hub when
designing a cluster network due to the increased speed it offers.
It is also a good idea to purchase a KVM (keyboard/video/mouse)
switch to allow easy access to each individual node. Also, consider
using 1U cases when building a cluster to reduce space requirements
and for the increased organization they offer. Keep heat in mind as
well: 50 or more nodes can produce a significant amount of heat and
affect both the stability of the cluster and the comfort of the
operator. This is especially important when using AMD-based machines
in a small room.


The Second Step: Installing Linux

After the machines are assembled, the next logical step is to install Linux.
There are many distributions of Linux available and different people prefer
different distributions for different reasons. This document uses RedHat 7.1
with the default kernel as a basis. Other versions of Linux can be used,
however, and some are more or less desirable depending on the experience level
of the cluster administrator or personal taste.

Gather Hardware Information

Before installing Linux, it is a good idea to gather as much information
as possible about the PC Linux will be installed on. The areas defined
below should be considered the minimum a user should know before
attempting to install Linux.

Monitor

Find out the maximum horizontal and vertical refresh rates as
well as the maximum resolution and color depths it supports. The
monitor manufacturer's website should have all the information
required. It is possible that the installation process will
detect the information it needs automatically or that the
monitor will be included in a list the user can choose from
during install time. Nevertheless, it is a good idea to have
this information handy as misconfiguration can actually
physically damage the monitor.

Video Card

As with the monitor, it is possible that the installer will
detect and configure the card automatically, but this is not
always the case. Find out the make and model of the card as well
as the amount of RAM it has. Most modern cards will not need a
special RAMDAC setting or clockchip setting so these steps can
be skipped when the time comes.

Hard Drive

Be sure to know the size of the drive, as this will affect how
the partitions are set up later in the install process. Note
that some ATA100 drives or controllers may be problematic with
the 2.4.2 kernel that comes with RedHat 7.1.

Network Card

Know the make and model of the card. It is very likely that the
installer will detect this properly but there is always a chance
that configuration will have to be done manually.

Install Linux

Linux will need to be installed on every node of the cluster. Thanks to the easy-to-use installers that ship with most Linux distributions, this task is becoming trivial. Nevertheless, you may want to refer to these Red Hat guides:
The Official RedHat Linux x86 Installation Guide
The Official Red Hat Linux Reference Guide
Red Hat's Official Linux Reference Guide: Appendix B--Introduction to Disk Partitions

  1. Boot from the CD-ROM.
  2. Choose an install mode, then press ENTER
  3. If you have a driver diskette for any special devices, like a monitor, sound card, etc., insert that diskette and press ENTER
    Otherwise, say no, and continue with the installation.
  4. Follow the directions on the screen. The process is fairly straightforward until partitioning.
  5. When installing Linux on the master node, it is recommended to use separate partitions. This is not necessary, but it can allow for easier administration in the long run. The Master node of Mimosa, the cluster at the Mississippi Center for Supercomputing Research where this document is being written, is configured basically like this:

    Device     Size   Mount Point
    /dev/sda1  9.6G   /
    /dev/sda2  53M    /boot
    /dev/sda5  16G    /home
    /dev/sdb6  6.1G   /usr/local
    Do not forget to add a swap partition when partitioning the drive. The swap partition should be at least 128 megabytes; the Red Hat Linux Reference Guide recommends twice the amount of RAM or 32 MB, whichever is larger.

    For the slave nodes, only the / partition and the swap partition are needed, since the applications will all be installed on the master node's /usr/local partition and all user files will be stored on the master node's /home partition. The /home and /usr/local directories of the slave nodes are mounted from the master node, so they can be any size on the slave nodes. If you type "cd /home" or "cd /usr/local" on a slave node, you are actually going to the directory on the master node, thanks to NFS (Network File System).

  6. Since it is assumed that Linux will be the only operating system on the master node and the slave nodes, install LILO, the Linux bootloader, to the MBR (Master Boot Record) of the primary hard disk.
  7. Now choose the packages to install. At the minimum, make sure that NFS (Network File System) and RSH (Remote Shell) are installed. It is also a good idea to install SSH as a backup in case RSH fails, but this is not required. One way to ensure that these packages get installed is to simply choose everything. This is not a bad idea if disk space is not a concern, as you can always turn off unneeded services later. Otherwise, you must check the "Select Individual Packages" box on the "Package Group Selection" page.
  8. RedHat will now ask about installing and configuring their firewall. If the cluster has no contact with the outside world or is behind a very good firewall, it is advisable to not install the firewall at all. If the firewall is installed, make sure that it allows connections via SSH and RSH.
  9. Configure the X Window System using the monitor and video card information gathered earlier. The installer will usually detect both, but verify the monitor settings before accepting them, since a misconfiguration can physically damage the monitor.
  10. Once the packages have installed, the user is presented with an option to make a boot disk. Making a boot disk is always a very good idea.
  11. One of the last steps in installing Linux is deciding whether or not the computer should boot up into graphical mode. Setting the default run level to 5, graphical mode, can cause some problems if it has not been fully tested. Generally, it is a better choice to have the system's default run level set for 3, multi-user with networking, until it is certain that the X window system will not cause any problems. Once it has been determined that the system functions properly in run level 5, it is possible to set the computer to boot into graphical mode by editing /etc/inittab and changing the line that looks like:

    id:3:initdefault:

    to

    id:5:initdefault:

  12. Now Linux is installed. Reboot the computer and continue to the next step.
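The run-level change in step 11 can also be scripted. Here is a minimal sketch, demonstrated on a sample line so it can be run anywhere; to edit the real file, apply the same sed expression to /etc/inittab and write the result back:

```shell
# Demonstrate the run-level substitution on a sample inittab line.
# On a real node: sed 's/^id:3:/id:5:/' /etc/inittab > /tmp/inittab
# and then copy /tmp/inittab back over /etc/inittab as root.
echo 'id:3:initdefault:' | sed 's/^id:3:/id:5:/'
# prints: id:5:initdefault:
```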


The Third Step: Configuring the Nodes

Now that Linux is functioning properly on every node, it is time to configure them to work with the clustering software. The steps below outline the basic configuration steps clustering software will need to operate properly. Keep in mind that these steps are just a guideline. Remember that many of the files that will be created or edited in this process will grant access to the root user only. It is therefore prudent to log in as root to save time.

The clustering software must, of course, be installed on the master and slave nodes. This can be done now or later. For mimosa, we used the Portland Group's Cluster Development Kit. This step is not covered in detail in this document; be sure to read the documentation for the clustering software to be used, as different software may require different configuration procedures.

On each node, as root, type "setup" and make sure that rsh, rlogin, and rexec are checked. Also make sure that NFS is running. (This is covered in more detail under "Enabling Required Services" below.)

You will need to decide what IP addresses to use for the nodes of your cluster. For what it's worth, Spector recommends using the Net 10 private address space--that is, giving each node an address of the form 10.X.X.X, such as 10.0.2.1, 10.0.2.2, 10.0.2.3, etc. However, for mimosa, we elected to use 192.168.0.1, 192.168.0.2, 192.168.0.3, etc. You might try the same.

General Linux System Files

/etc/hosts

Edit this file on every cluster node, adding the names and IP addresses of every node in the cluster. This allows these machines to be accessed by name instead of by IP number. A typical /etc/hosts file will look something like this, where mimosa is the hostname of the master node and node1-1 through node1-8 are the hostnames of the slave nodes:

127.0.0.1 localhost
192.168.0.1 mimosa

192.168.0.2 node1-1
192.168.0.3 node1-2
192.168.0.4 node1-3
192.168.0.5 node1-4
192.168.0.6 node1-5
192.168.0.7 node1-6
192.168.0.8 node1-7
192.168.0.9 node1-8
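For a larger cluster, the slave entries can be generated instead of typed by hand. The following sketch assumes the 192.168.0.x addresses and node1-N names used in the listing above; adjust both for your own cluster, then append the output to /etc/hosts on every node:

```shell
# Print /etc/hosts entries for slave nodes node1-1 through node1-8,
# giving node1-N the address 192.168.0.(N+1), as in the listing above.
i=1
while [ $i -le 8 ]; do
    echo "192.168.0.$((i + 1)) node1-$i"
    i=$((i + 1))
done
```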

/etc/hosts.equiv

Hosts listed in this file are considered to be equivalent to
the localhost for security purposes. This means that users of
these machines can access the localhost without supplying a
password. The format is one hostname per line, optionally
followed by a username. This can be a significant security
risk, but it is required so that RSH will be able to log into
each machine without a password. Not all clustering software
requires this, so check whether yours does.

A typical /etc/hosts.equiv file looks something like this:

mimosa

node1-1
node1-2
node1-3
node1-4
node1-5
node1-6
node1-7
node1-8

.rhosts

This file should exist in each user's home directory. Notice the
"." at the beginning of the file name. This means that this file
will be hidden. This file is also required so users can use RSH
to connect to each node without supplying a password.

A typical .rhosts file looks something like this:

mimosa
node1-1
node1-2
node1-3
node1-4
node1-5
node1-6
node1-7
node1-8
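The same file can be generated rather than typed. This sketch writes the .rhosts file shown above into the current directory; the hostnames are the mimosa example names, so substitute your own:

```shell
# Write a .rhosts listing the master node and slaves node1-1..node1-8.
{
    echo mimosa
    printf 'node1-%d\n' 1 2 3 4 5 6 7 8
} > .rhosts
# rsh ignores a .rhosts file that other users can write to, so tighten it:
chmod 600 .rhosts
```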

/etc/securetty

This file is a list of ttys from which root can log in. Letting
root log in over the r-services allows for easier administration
of the nodes and is highly recommended here. Simply add "rsh",
"rexec", and "rlogin" to the end of the file.

/etc/pam.d

This directory contains configuration files that affect logins
for the various services defined in it.

Modify the rsh, rlogin, and rexec files by rearranging the lines
to have the line with "rhosts" as the first line and the line
with "securetty" as the second line. An example of these files
after modification is given below:

        auth       required     /lib/security/pam_rhosts_auth.so
        auth       required     /lib/security/pam_securetty.so
        auth       required     /lib/security/pam_nologin.so
        auth       required     /lib/security/pam_env.so
        account    required     /lib/security/pam_stack.so service=system-auth
        session    required     /lib/security/pam_stack.so service=system-auth
 

/etc/exports

This file should only be modified on the master node.

This file determines which directories will be exported by NFS.
This will allow every host to access these directories and will
eliminate the need to replicate work on every node. For example,
if user joe adds a file named foo.txt in his home directory on
the master node, foo.txt will be accessible from every node that
is configured to mount this export. Typically, it is a good idea
to export each user's home directory. This is accomplished by
adding the following line to /etc/exports.

/home 192.168.0.0/255.255.255.0(rw,no_root_squash)

It is also a good idea to export /usr/local, as many user
programs will be installed there. Do this by adding
the following line to /etc/exports.

/usr/local 192.168.0.0/255.255.255.0(rw,no_root_squash)
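With both lines added, the NFS-related portion of the master node's /etc/exports reads:

```
/home      192.168.0.0/255.255.255.0(rw,no_root_squash)
/usr/local 192.168.0.0/255.255.255.0(rw,no_root_squash)
```

After editing, running "exportfs -r" as root makes the NFS server re-read the file, and "showmount -e localhost" lists what is actually being exported.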

/etc/fstab

This file should only be modified on the slave nodes.

/etc/fstab is a list of devices and directories that will be
mounted at boot time. Since NFS is being used, it is necessary
to modify /etc/fstab to mount these NFS directories.

The following lines should be added to mount the two exports set
up above.

[hostname of master node]:/usr/local /usr/local nfs
[hostname of master node]:/home /home nfs
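For example, if the master node's hostname is mimosa, as in the earlier listings, the added entries would read as follows (the trailing "defaults 0 0" fields are the usual mount options and dump/fsck flags):

```
mimosa:/usr/local /usr/local nfs defaults 0 0
mimosa:/home      /home      nfs defaults 0 0
```

Running "mount -a" as root then mounts everything listed in /etc/fstab and is a quick way to verify the new entries.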

On each node, type "ifconfig" and make sure that the machine has its appropriate internal IP address (such as 192.168.0.X).
On each node, change to /etc/rc.d/init.d/ and type "./network stop". On the master node, also type "./nfs stop".
On the master node, type "./nfs start".
On each node, type "./network start".

Enabling Required Services

Now that the required modifications have been made, enable rsh, rlogin,
and rexec on all nodes. There are several ways to do this. On RedHat, as
root, enter the command "setup". Choose "System Services", then select
each of the items listed above. When configuring the master node, be sure
to select "nfs" as well. Select items by pressing the space bar.
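Alternatively, the same services can be enabled non-interactively with chkconfig (on RedHat 7.1, rsh, rlogin, and rexec are managed through xinetd). The sketch below uses RUN=echo so it only prints the commands and can be run anywhere as a dry run; set RUN to the empty string and run it as root to actually apply them:

```shell
# Dry run: each command is echoed rather than executed.
RUN=echo
$RUN chkconfig rsh on
$RUN chkconfig rlogin on
$RUN chkconfig rexec on
$RUN chkconfig nfs on                  # master node only
$RUN /etc/rc.d/init.d/xinetd restart   # pick up the xinetd changes
```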

Testing the Configuration

1. Restart the machines to force the changes to take effect.

2. Ensure that rsh will work without a password by trying, from the
master node, to rsh to a slave node. If rsh works as expected, follow
the directions in the clustering software documentation to install and
configure the clustering software.
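A quick way to check every slave at once is a loop that runs hostname on each node over rsh. The sketch below uses RSH=echo so it is a harmless dry run that can be executed anywhere; change RSH to rsh on the real master node (the hostnames again follow the mimosa example):

```shell
# Dry run: prints the command arguments instead of contacting the nodes.
RSH=echo
for h in node1-1 node1-2 node1-3 node1-4 node1-5 node1-6 node1-7 node1-8; do
    $RSH "$h" hostname
done
```

With the real rsh, each correctly configured node should answer with its own hostname and no password prompt.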

Adding a User
To create a user, log onto the master node. The easiest way is to use the X Window System: find linuxconf under Programs, System, go to User Accounts, and add the user. To clone this user to the slave nodes, use these commands:

rsh [slave node hostname]
cd /etc
rcp root@[masternode]:/etc/passwd .
rcp root@[masternode]:/etc/shadow .

Make sure you notice the "." at the end of each rcp command; it means "copy this file to the current directory." If you have problems with rcp, you may have to type the full path (/usr/bin/rcp followed by the rest of the command). The passwd and shadow files contain the account information for all the users on that particular machine.
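The per-node copy can also be driven from the master node in a single loop, pushing the two files to every slave with rcp (this assumes the passwordless rsh/rcp setup described earlier is working). RCP=echo makes this a dry run; set RCP to rcp to perform the real copies:

```shell
# Dry run: prints each rcp command instead of executing it.
RCP=echo
for h in node1-1 node1-2 node1-3 node1-4 node1-5 node1-6 node1-7 node1-8; do
    $RCP /etc/passwd "root@$h:/etc/passwd"
    $RCP /etc/shadow "root@$h:/etc/shadow"
done
```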


General Notes

The most glaring issue with this process is the security implications it raises.
Allowing root to log in remotely to a node via rsh is not generally recommended
in a general purpose Linux installation. Using /etc/hosts.equiv and the other
methods to allow logins without supplying a password is usually frowned on as
well. In fact, rsh and rlogin are considered insecure themselves. For these
reasons, if the master node can be accessed from the Internet, it is
absolutely imperative that great care be taken to ensure the security of
the cluster.

Remember that when a user is added to the system, it will be necessary to add
that user to all nodes. This is most easily accomplished by simply copying
/etc/passwd and /etc/shadow from the master node to all other nodes.

Hostnames can be changed with RedHat's setup utility under "Networking".
Alternatively, they can be changed with the hostname command, like this:

                hostname host1
 


--------------------------------
Last Modified: Tuesday, 18-Dec-2001 15:55:11 CST
Copyright © 1997-2005 The Mississippi Center for Supercomputing Research. All Rights Reserved.