Building a Beowulf Cluster/Installation, Configuration, and Administration/Networking

We want the computers to work in a local network.

The easiest way to set up the network is with DHCP: the master hands out dynamic addresses to the slaves on the basis of the physical (MAC) addresses of their network interfaces. DHCP simplifies the installation of new nodes, because the MAC address and the hostname are the only things that differ among the nodes, and the DHCP server on the master can handle a new node with one new entry in its configuration file.

In this example the machines get the IP addresses 192.168.1.1 through 192.168.1.8, where 8 is the master.

The idea is to give the slaves the names nodei, corresponding to their IP addresses 192.168.1.i.
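The mapping from MAC address to nodei and 192.168.1.i can be sketched as a small shell helper that prints entries for the ISC dhcpd configuration file. This is only an illustration: the MAC addresses are placeholders, the count of seven slaves and the master address 192.168.1.250 are taken from the text, and announcing the master as default router is an assumption.

```shell
#!/bin/sh
# Sketch: print dhcpd.conf entries that pin node1..node7 to
# 192.168.1.1..192.168.1.7 by MAC address.

MASTER_IP=192.168.1.250   # static address of the master, as in the text

# Placeholder MAC addresses (documentation range); replace with the real
# ones, listed in node order.
MACS="00:00:5e:00:53:01 00:00:5e:00:53:02 00:00:5e:00:53:03 00:00:5e:00:53:04
      00:00:5e:00:53:05 00:00:5e:00:53:06 00:00:5e:00:53:07"

gen_dhcpd_conf() {
    # Subnet declaration for the cluster intranet; the master is announced
    # as the default router (an assumption, matching its forwarding role).
    cat <<EOF
subnet 192.168.1.0 netmask 255.255.255.0 {
    option subnet-mask 255.255.255.0;
    option routers $MASTER_IP;
}
EOF
    i=1
    for mac in $MACS; do
        # One host entry per slave: MAC address -> fixed IP and hostname
        cat <<EOF
host node$i {
    hardware ethernet $mac;
    fixed-address 192.168.1.$i;
    option host-name "node$i";
}
EOF
        i=$((i + 1))
    done
}

gen_dhcpd_conf
```

Review the output before redirecting it into the dhcpd configuration file on the master. Adding a node then amounts to appending one MAC address to the list.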

Note that the DHCP server provides IP addresses for the other machines, not for itself. Give the master a static IP address (192.168.1.250 here). On Red Hat based distributions this is configured in /etc/sysconfig/network-scripts/ifcfg-eth0 or /etc/sysconfig/network-scripts/ifcfg-eth1.

I configured eth0 for the organization network and eth1 for the cluster intranet. /etc/sysconfig/network-scripts/ifcfg-eth0 therefore holds your organization's network settings, and /etc/sysconfig/network-scripts/ifcfg-eth1 holds the static address of the master on the cluster intranet.
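A minimal sketch of the master's intranet interface file, assuming the static address 192.168.1.250 given above and a /24 netmask (the netmask is an assumption, not stated in the text):

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth1 on the master (sketch)
DEVICE=eth1
# No DHCP client on this interface; the address is set statically below
BOOTPROTO=none
IPADDR=192.168.1.250
# Assumed netmask for the 192.168.1.x cluster intranet
NETMASK=255.255.255.0
ONBOOT=yes
```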

For the slaves, the interface to the cluster intranet is configured to obtain its address from the master by DHCP.
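On a slave the interface file can be very short. The sketch below assumes that the slave's interface to the cluster intranet is eth0, which the text does not state:

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth0 on a slave (sketch)
# Assumed device name; use the interface that is cabled to the intranet
DEVICE=eth0
# Ask the DHCP server on the master for address and hostname
BOOTPROTO=dhcp
ONBOOT=yes
```

Because the file contains nothing node-specific, the same file can be used on every slave.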

If you do not run a DNS service on the head, use the DNS service of your organization's network.

Note that in /etc/hosts the loopback line (the first line) does not include the hostname. This avoids problems with message-passing systems (PVM, MPI).
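One way to produce such a file is sketched below. The hostname "master" is hypothetical, and the count of seven slaves and the master address are taken from the example above.

```shell
#!/bin/sh
# Sketch: print an /etc/hosts file for the master and seven slaves.

MASTER_IP=192.168.1.250   # static address of the master, as in the text
MASTER_NAME=master        # hypothetical hostname
NODES=7                   # node1..node7 -> 192.168.1.1..192.168.1.7

gen_hosts() {
    # The loopback line names only localhost, never the machine's hostname
    printf '127.0.0.1\tlocalhost.localdomain localhost\n'
    printf '%s\t%s\n' "$MASTER_IP" "$MASTER_NAME"
    i=1
    while [ "$i" -le "$NODES" ]; do
        printf '192.168.1.%d\tnode%d\n' "$i" "$i"
        i=$((i + 1))
    done
}

gen_hosts
```

Each machine then resolves its own hostname to its intranet address rather than to 127.0.0.1, which is the address the other nodes need in order to connect back.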

You need to activate IP forwarding on the head in order to have internet access on all machines. Enable the firewall and include masquerading, so that traffic from the cluster intranet is masqueraded as it leaves the head through the interface to the organization network. You do this by editing the /etc/sysconfig/iptables file or by using a graphical tool, e.g. system-config-firewall on Red Hat based systems. Be careful not to make the firewall too restrictive, because it can then block traffic the cluster itself needs, such as the DHCP requests from the slaves.
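The forwarding and masquerading rules can be sketched in the format of /etc/sysconfig/iptables as follows. The sketch assumes eth0 faces the organization network and eth1 the cluster intranet, as described above, and a 192.168.1.0/24 intranet. It covers forwarding only, so merge the rules into your existing file instead of replacing it.

```shell
#!/bin/sh
# Sketch: print masquerading and forwarding rules in iptables-save format.

EXT_IF=eth0   # interface to the organization network (as in the text)
INT_IF=eth1   # interface to the cluster intranet (as in the text)

gen_iptables() {
    cat <<EOF
*nat
:PREROUTING ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
# Rewrite the source address of cluster traffic leaving through $EXT_IF
-A POSTROUTING -s 192.168.1.0/24 -o $EXT_IF -j MASQUERADE
COMMIT
*filter
:INPUT ACCEPT [0:0]
:FORWARD DROP [0:0]
:OUTPUT ACCEPT [0:0]
# Let the nodes out, and let the replies back in
-A FORWARD -i $INT_IF -o $EXT_IF -j ACCEPT
-A FORWARD -i $EXT_IF -o $INT_IF -m state --state ESTABLISHED,RELATED -j ACCEPT
COMMIT
EOF
}

gen_iptables
```

The kernel must also be told to forward packets: set net.ipv4.ip_forward = 1 in /etc/sysctl.conf, or run sysctl -w net.ipv4.ip_forward=1 as root to enable it until the next reboot.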

In /etc/sysconfig/network you need to have:
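The exact contents depend on the distribution and release. A typical file on the head of a Red Hat based system might look like the sketch below, in which every value is an example: the hostname is hypothetical and the gateway is a placeholder address for the gateway of your organization's network.

```shell
# /etc/sysconfig/network on the head (sketch, example values only)
NETWORKING=yes
# Hypothetical hostname of the head
HOSTNAME=master
# Placeholder: replace with the gateway of the organization network
GATEWAY=192.0.2.1
```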

You have to restart the network services and start the DHCP server daemon (dhcpd). To have dhcpd start at boot on Fedora, run the ntsysv program, find dhcpd in the list of services, and mark its entry.
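On releases that use SysV init scripts, the same steps can be done from the command line. The sketch below prints the commands by default and executes them only when DRY_RUN=0 is set. The function names are illustrative, and chkconfig is used here as the command-line counterpart of ntsysv.

```shell
#!/bin/sh
# Sketch: restart networking, then start dhcpd and enable it at boot.

run() {
    # Print the command unless DRY_RUN=0 is set explicitly
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

start_cluster_services() {
    run service network restart   # pick up the new interface settings
    run service dhcpd start       # start the DHCP server now
    run chkconfig dhcpd on        # and at every boot
}

start_cluster_services
```

To execute the commands for real, run the script as root on the master with DRY_RUN=0. On systemd-based releases the corresponding commands are systemctl start dhcpd and systemctl enable dhcpd.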

You may want to set up your printers on the master and the slaves (you can copy an existing printer configuration recursively from /etc/cups, e.g. from your local office desktop computer).

Useful references

 * Linux Networking HOWTO
 * Linux DHCP HOWTO
 * Linux Firewall HOWTO