Cluster-Handbook/Munin

Installation of the software Munin
Munin is a monitoring system that runs on Linux. It measures the server load. The setup described here requires a 64-bit computer.

First install the Munin packages with the following command:

sudo apt-get install munin munin-node

This downloads and unpacks the complete Munin software on the Linux operating system. Once this is done, open the configuration file:

sudo nano /etc/munin/munin.conf

In this file, activate the settings you need by removing the # character at the start of the line, so that the program can read and apply them. The Munin plug-ins are then available in the directory /usr/share/munin/plugins. Restart the node so that it accepts all new settings:

sudo /etc/init.d/munin-node restart

The following command installs the web server:

sudo apt-get install apache2

The following command then opens the configuration of the Apache status module:

sudo nano /etc/apache2/mods-available/status.conf

Here ExtendedStatus must be set to On for Munin to work as desired. The status module must also be enabled with sudo a2enmod status.
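Whether the setting is really active can be checked from the command line. The following is a minimal sketch, not part of Munin or Apache: the function name extended_status_on is hypothetical, and it assumes that the directive stands on a line of its own.

```shell
#!/bin/sh
# extended_status_on CONF_FILE
# Returns 0 if CONF_FILE contains an active "ExtendedStatus On" line.
# Lines that start with # are comments and therefore do not match.
# (Hypothetical helper for illustration only.)
extended_status_on() {
    conf_file="$1"
    grep -Eiq '^[[:space:]]*ExtendedStatus[[:space:]]+On[[:space:]]*$' "$conf_file"
}
```

Possible use: extended_status_on /etc/apache2/mods-available/status.conf && echo "ExtendedStatus is On"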

After that, the plug-ins are enabled. To do this, enter the following commands on the command line:

sudo ln -s /usr/share/munin/plugins/apache_accesses /etc/munin/plugins/

sudo ln -s /usr/share/munin/plugins/apache_processes /etc/munin/plugins/

sudo ln -s /usr/share/munin/plugins/apache_volume /etc/munin/plugins/
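The linking step can also be wrapped in a small function that skips plug-ins that are not installed. This is only a sketch: the function name enable_plugins is hypothetical, and the plug-in names used in the example call (apache_accesses, apache_processes, apache_volume) are assumed to be the three Apache plug-ins meant here.

```shell
#!/bin/sh
# enable_plugins SRC_DIR DEST_DIR NAME...
# Creates a symbolic link in DEST_DIR for every named plug-in found in SRC_DIR.
# Plug-ins that do not exist in SRC_DIR are reported on stderr and skipped.
# (Hypothetical helper for illustration only.)
enable_plugins() {
    src_dir="$1"
    dest_dir="$2"
    shift 2
    for name in "$@"; do
        if [ ! -e "$src_dir/$name" ]; then
            echo "skipping $name: not found in $src_dir" >&2
            continue
        fi
        # -f replaces a link left over from an earlier run
        ln -sf "$src_dir/$name" "$dest_dir/$name"
    done
}
```

Run as root, a call could look like this: enable_plugins /usr/share/munin/plugins /etc/munin/plugins apache_accesses apache_processes apache_volume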

For the changed settings to take effect, restart the Apache web server with:

sudo /etc/init.d/apache2 restart

The following command installs the Perl library libwww-perl:

sudo apt-get install libwww-perl

The Apache plug-ins need this library to read the server status page, whose values are then shown in the graphs.

Working with Munin
Munin needs a web server so that its web interface can be displayed. For this purpose, open the configuration file again with this command:

sudo nano /etc/munin/munin.conf

Change the host name shown there, localhost.localdomain (the name of the local domain), to Master. The address of the master must be set to 127.0.0.1. The worker gets the address 10.0.2.2. (Each working group was given a different IP suffix, here 2.2.)

(Cf. http://help.ubuntu-se.org/9.10/serverguide/sv/munin.html)
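The resulting host entries can be sketched with a small helper that prints one entry per host. The function name munin_host_entry is hypothetical, and the use_node_name line is an assumption carried over from the localhost example in the configuration file.

```shell
#!/bin/sh
# munin_host_entry NAME ADDRESS
# Prints a host entry in the format used in munin.conf.
# (Hypothetical helper for illustration only.)
munin_host_entry() {
    printf '[%s]\n    address %s\n    use_node_name yes\n' "$1" "$2"
}
```

Calling munin_host_entry Master 127.0.0.1 and munin_host_entry Worker 10.0.2.2 prints the two entries for the setup described here, which can then be compared with the contents of /etc/munin/munin.conf.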

On a Windows computer the same name must always be used. If the web browser cannot open Munin, the name must be changed in the /etc/hosts file (edited with sudo). Then enter the IP address of the master followed by /munin in the web browser and try to open the Munin page. If the installation was successful, Munin can be accessed and measures the server load. However, it takes some time before results appear, because Munin plots the values per day, month, and year for the different workloads on the servers. It displays the minimum and maximum values (see the picture of Munin below). In addition, the system takes its measurements at different times. Updates for Munin are also reported by the program. It also shows when a server cannot be reached, for example during a power failure or a computer crash.
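When the browser cannot find the Munin page, it helps to check which address a name is mapped to in the hosts file. The following is a minimal sketch; the function name host_ip is hypothetical, and the hosts file is assumed to use the usual layout of an address followed by one or more names.

```shell
#!/bin/sh
# host_ip NAME [HOSTS_FILE]
# Prints the address that NAME is mapped to in HOSTS_FILE (default /etc/hosts).
# Prints nothing if the name is not listed.
# (Hypothetical helper for illustration only.)
host_ip() {
    awk -v name="$1" '
        $1 ~ /^#/ { next }                 # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break       # rest of the line is a comment
                if ($i == name) { print $1; exit }
            }
        }
    ' "${2:-/etc/hosts}"
}
```

For example, host_ip Master prints the address currently assigned to the name Master, which should match the address entered in the browser.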



[[File:MySQL queries.jpg|frame|center|Example of how Munin displays server utilization levels (Source: http://zockertown.de/s9y/index.php?/archives/1426-Munin-ist-schon-toll.html)]]

The advantage of the program is that, even with a large number of servers, you can quickly detect which server is down and react to the failure. That server must then be repaired or replaced as needed.

Example of a computer cluster in Munin, cf. http://munin.ping.uio.no/

Overview • ping.uio.no
o aquarius.ping.uio.no [ disk exim network processes system ]
o bache.ping.uio.no [ disk network nfs postfix processes system time ]
o bambi.ping.uio.no [ disk network nfs processes system time ]
o bimbo.ping.uio.no [ disk exim network nfs other processes system ]
o bottolf.ping.uio.no [ disk exim network nfs processes system time ]
o cirrus.ping.uio.no [ disk exim network processes sensors system ]
o cumulus.ping.uio.no [ disk exim network processes sensors system ]
o freddy.ping.uio.no [ disk network nfs postfix processes sensors system time ]
o galactica.ping.uio.no [ disk exim network nfs postfix printing processes system ]
o gud.ping.uio.no [ disk network nfs postfix printing processes sensors system ]
o kjell.ping.uio.no [ disk network nfs postfix processes sensors system time ]
o knuth.ping.uio.no [ Apache disk mysql network nfs postfix processes sensors system time ]
o m.ping.uio.no [ disk exim network nfs printing processes sensors system ]
o matz.ping.uio.no [ disk network nfs processes system ]
o meg.ping.uio.no [ disk network nfs other processes system ]
o pike.ping.uio.no [ Apache disk exim munin network printing processes sensors system time virtual machines ]
o ponnypetra.ping.uio.no [ disk network other processes system ]
o rosa.ping.uio.no [ disk network nfs processes system time ]
o rossum.ping.uio.no [ Apache disk exim network nfs other processes system time ]
o tetra.ping.uio.no [ disk network processes system ]
o urias.ping.uio.no [ disk network nfs other processes system time ]
o utslett.ping.uio.no [ disk munin network processes system ]

The overview shows the individual servers and the categories that are monitored on each of them. In Ubuntu, all service packages provide start and stop functions. These control the services.

To control the Munin node, enter:

sudo /etc/init.d/munin-node start|stop|restart|force-reload|try-restart

“restart” restarts the service; a running instance on the server is stopped first. “try-restart” restarts the service only if it is already running.

Warnings

If the utilization limits set in the Munin server are exceeded, the affected values are usually displayed in red. Alerts can then be sent by e-mail, for example so that the maximum disk space is not exceeded unnoticed. For this purpose, open the file munin.conf (see wiki.ubuntuusers.de/Munin) and add the following lines, which send a notification every time a warning or critical state occurs:

contacts me
contact.me.command mail -s "Munin notification ${var:host}" user@example.com
contact.me.always_send warning critical
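The contact lines can also be appended to the configuration file by a small helper, which avoids typing errors in the quoting. This is a sketch only: the function name add_mail_contact is hypothetical, and the lines are written with lowercase directive names and the ${var:host} placeholder, the form Munin's configuration expects.

```shell
#!/bin/sh
# add_mail_contact CONF_FILE ADDRESS
# Appends an e-mail contact named "me" to CONF_FILE.
# \$ keeps ${var:host} literal so that Munin, not the shell, expands it.
# (Hypothetical helper for illustration only.)
add_mail_contact() {
    cat >> "$1" <<EOF
contacts me
contact.me.command mail -s "Munin notification \${var:host}" $2
contact.me.always_send warning critical
EOF
}
```

Run as root, a call could look like this: add_mail_contact /etc/munin/munin.conf user@example.com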

The e-mail address must be adapted to your own system. In addition, the utilization values at which the server threatens to overflow should be defined, so that a warning reaches the user in time. Beforehand, a mail server such as Postfix should be installed and configured so that the e-mails can be delivered to all users. The limits are set for each host as follows (see the example from the Munin configuration file):

[localhost.localdomain] (or [Master])
address 127.0.0.1
use_node_name yes
<plugin>.<fieldname>.(critical|warning) <value>

The plug-in name can be taken from the URL of the graph. The field name can be copied from the Munin graph page, where it is shown under “Internal name”. Either critical or warning can be chosen as the level. The value is determined as described above; when it is reached or exceeded, a warning e-mail is sent to all users.

Example of a Server Warning entry in Munin

[localhost.localdomain]
address 127.0.0.1
use_node_name yes
df._dev_evms_hda2.warning 70
df._dev_evms_hda2.critical 95
df._dev_mapper_hda5.warning 70
df._dev_mapper_hda5.critical 70

Here 70 was set as the warning value and 95 as the critical value. The values should be chosen carefully and not too low, because otherwise the user receives warning e-mails unnecessarily and may be alarmed. Warnings should in any case be sent for the truly critical values, so that the system can be restored from a backup if necessary.
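How two such limits divide the disk usage into states can be sketched as follows. This is not Munin's own code: the function names df_state and used_percent are hypothetical, only whole percentages are handled, and the sketch assumes that a state is triggered once the value exceeds the limit.

```shell
#!/bin/sh
# df_state USED_PERCENT WARNING CRITICAL
# Prints ok, warning or critical for a disk usage value in percent.
# Assumption: a state is reached when the value is greater than the limit.
# (Hypothetical helper for illustration only.)
df_state() {
    used="$1"
    warning="$2"
    critical="$3"
    if [ "$used" -gt "$critical" ]; then
        echo critical
    elif [ "$used" -gt "$warning" ]; then
        echo warning
    else
        echo ok
    fi
}

# used_percent MOUNT_POINT
# Prints the current usage of a file system as a number without the % sign.
used_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

For example, df_state "$(used_percent /)" 70 95 classifies the current usage of the root file system against the two limits from the example above.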

CPU main processor

Munin can also measure the load on the main processor (CPU), the central processing unit that executes programs. This also works for central host computers to which a number of terminals are connected. Munin can also compare the earlier performance data of a server with its current data. The node collects the performance data; the master fetches and stores them and generates the graphs for the web interface. The data are stored and the graphs are drawn with RRDtool.
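The performance data travel between node and master over a simple text protocol: munin-node listens on TCP port 4949 by default and answers commands such as list and fetch with lines of the form fieldname.value number. The following sketch extracts one value from such an answer; the function name fetch_field is hypothetical.

```shell
#!/bin/sh
# fetch_field FIELD
# Reads the answer to a munin-node "fetch" command on standard input
# and prints the value of the given field, or nothing if it is missing.
# (Hypothetical helper for illustration only.)
fetch_field() {
    awk -v key="$1.value" '$1 == key { print $2; exit }'
}
```

Assuming the nc tool is installed and the node accepts connections from the local machine, the user field of the CPU plug-in could be read like this: printf 'fetch cpu\nquit\n' | nc localhost 4949 | fetch_field user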



Munin errors and cleanup
Various types of errors can occur. For example, the IP address may change from one day to the next, so that the Munin page can no longer be reached in the browser. In this case, the address in the configuration file needs to be adjusted. Changing the name of the localdomain server is not so easy.
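To see which address is currently entered for a host, the configuration file can be read with a small helper. This is a minimal sketch: the function name conf_address is hypothetical, and it assumes that the host section is written as a plain name in square brackets, without a group prefix.

```shell
#!/bin/sh
# conf_address HOST CONF_FILE
# Prints the address configured for [HOST] in a munin.conf style file.
# Prints nothing if the host or its address line is missing.
# (Hypothetical helper for illustration only.)
conf_address() {
    awk -v section="[$1]" '
        $1 ~ /^\[/ { in_host = ($1 == section); next }   # a new section starts
        in_host && $1 == "address" { print $2; exit }
    ' "$2"
}
```

For example, conf_address Master /etc/munin/munin.conf prints the stored address, which can then be compared with the address the machine actually has.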

White bars in the graphic:

The cause may be that the user has misconfigured a graph file, or that a mistake occurred while unpacking the package. Mistakes with permissions also happen easily during installation; as a result, for example, no warning e-mails can be sent when the server overflows.