Building a Beowulf Cluster/Installation, Configuration, and Administration

This chapter covers the basics of installation, configuration, and administration. The two most important topics are (1) network setup with DHCP and (2) sharing files over the network with the Network File System (NFS). After covering these, we turn to some more general ideas of administration.

Most Beowulf clusters use a network structure of one master (also called the head node) and several slaves, so that computing jobs are distributed from the master to the slaves. All machines of the cluster are connected to a switch. The head additionally interfaces with an external network. Master and slaves share user data over the network. It is easiest to install only the master and one slave (called the golden slave) in a first step. A later chapter covers cloning, which is the process of creating machines that are identical to the golden slave. As an alternative to cloning, one may choose a diskless boot.

In practice, this means that the master (or head node) has two network interfaces (say eth0 and eth1), one connected to the outside world and one connected to the cluster intranet through a network switch. All other computers (the slaves, or nodes) are connected to the switch. To start processes on the cluster, a user logs in on the master and spawns the processes from there onto the slaves.
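To make the layout concrete, the following Python sketch draws up an address plan for the cluster intranet: the master's internal interface gets the first usable address and each slave gets the next one. The subnet 192.168.1.0/24, the host names (master, node01, ...) and the function name plan_cluster_addresses are assumptions chosen for illustration; they are not prescribed by this book, and any private range will do.

```python
import ipaddress


def plan_cluster_addresses(subnet, n_slaves):
    """Return a {hostname: address} plan for the cluster intranet.

    The master's internal interface takes the first usable address of
    the subnet; the slaves take the addresses that follow.
    Host names are hypothetical placeholders.
    """
    network = ipaddress.ip_network(subnet)
    names = ["master"] + [f"node{i:02d}" for i in range(1, n_slaves + 1)]
    # zip() stops at the shorter input, so a too-small subnet yields
    # fewer entries than names; that case is rejected below.
    plan = dict(zip(names, (str(addr) for addr in network.hosts())))
    if len(plan) < len(names):
        raise ValueError(f"{subnet} is too small for {len(names)} machines")
    return plan


if __name__ == "__main__":
    # Print the plan in the "address<TAB>name" layout used by /etc/hosts.
    for name, addr in plan_cluster_addresses("192.168.1.0/24", 3).items():
        print(f"{addr}\t{name}")
```

A plan like this is only a starting point: the same name-to-address mapping is what the DHCP setup and the hosts file on each machine will later need to agree on.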

Outline

 * /OS and Software/
 * /Diskless Boot/
 * /Networking/
 * /Shared directories and SSH/
 * /Miscellaneous/