System Monitoring with Xymon/Other Docs/FAQ

Q. What is hobbit ?
A. Hobbit is the former name of the Xymon system monitoring tool. http://www.hswn.dk/hobbit/help/about.html

Q. Why should I use Xymon instead of BB ?

 * Speed. Hobbit runs much faster and is less resource intensive than Big Brother.
 * More functionality built in
 * GPL
 * Integrated Trending
 * A write up of why I use hobbit at http://gendalia.public.iastate.edu/Hobbit.txt

When is the next version going to be released ?
A. Hobbit is a FOSS project whose developers contribute in their spare time. It is under active development, but a release date for the next version has not been announced. The current development snapshot can be downloaded from SourceForge if required. If you are chasing a specific bug fix or feature that is not in the current production release, please search the archive and/or post a query to the general discussion mailing list to see if a solution is available. You can subscribe to that list by sending an e-mail to hobbit-subscribe@hswn.dk. A searchable archive of the list is available at http://www.hswn.dk/hobbiton/ To be notified of new releases, subscribe to the hobbit announcement list by sending an e-mail to hobbit-announce-subscribe@hswn.dk.

There is another answer for this question at http://www.hswn.dk/hobbiton/2008/02/msg00227.html

Where can I find more tests ?
A. Deadcat http://www.deadcat.net Deadcat is mostly Big Brother scripts, but most of them will work on Hobbit without too much fuss.

The Shire Project (Xymonton) https://wiki.xymonton.org/doku.php The Shire Project is specific to Xymon (better than Deadcat).

How do I put duplicate hosts in the bb-hosts file ?
A. The first occurrence of the host is entered as normal. All further occurrences should be:

0.0.0.0 hostname # noconn

How does the conn test work if the first ping doesn't respond ?
A. bbtest-net calls either hobbitping or fping (configurable via FPING= in hobbitserver.cfg).

How many pings are sent simultaneously ?
A. Exactly how many parallel connections are used depends on your operating system. The default is FD_SETSIZE/4, which amounts to 256 on many Unix systems. You can choose the number of concurrent connections with the "--concurrency=N" option to bbtest-net.
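The default above is plain integer arithmetic. The following is a minimal Python sketch of it, not code taken from bbtest-net; the function name is hypothetical, and the FD_SETSIZE value of 1024 is an assumption that happens to be common on Unix systems.

```python
def default_concurrency(fd_setsize):
    """Default number of parallel connections: FD_SETSIZE / 4, rounded down."""
    return fd_setsize // 4


# 1024 is an assumed FD_SETSIZE; check your platform's headers for the real value.
print(default_concurrency(1024))  # prints 256
```

If your platform defines a different FD_SETSIZE, the default changes accordingly, which is why "--concurrency=N" is the more predictable way to set it.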

How many packets are sent each time ?
A. hobbitping normally stops pinging a host after receiving a single response, and uses that to determine the round-trip time.

What is the ping time-out ?
A. The default timeout is 5 seconds.

Can we configure it?
A. Yes. The --timeout=N parameter to hobbitping allows you to set the timeout to N seconds.

What do the conn test (ping) colours mean?
A. Green means an OK response, yellow means packet loss (not delays, just lost packets), and red means there was no response.

How do I get the hobbit client to download etc/hobbitclient.cfg from the server (or other files in etc or ext), then restart?
A. Why? If the clients are set up to use server-side config, they don't even need the /etc/*.cfg files locally...

Is there a way within Hobbit of creating a graph with data from multiple RRD files ?
A. If you want to create a graph from multiple rrds, I would suggest using the 3rd party tool called drraw http://web.taranis.org/drraw/

How do I disable the REPEAT alerts ?
A. Leaving off REPEAT= gives the default of a repeat alert every 30 minutes. Setting REPEAT=0 defaults to every minute. To disable repeat alerts, set REPEAT=365d. If your problem lasts that long, you have bigger problems than too many alerts.

The data being sent from my clients is being truncated.
A. The maximum message size can be set in hobbitserver.cfg:

MAXMSG_CLIENT=
MAXMSG_DATA=
MAXMSG_STATUS=

Change the above parameters to suit your needs.

Do I need to restart the Hobbit server after editing the config files ?
A. No. All files are reread every 5 minutes. The only exception is hobbitserver.cfg; changes to this file require a restart.

Do I need to restart my Hobbit client after making a change ?
A. No. All client config files are reread periodically.

How do I monitor a web page ?
A. Add the following line to the bb-hosts file:

ip.ad.dr.ress www.example.com # http://www.example.com/

or, for https:

ip.ad.dr.ress www.example.com # https://user:passwd@www.example.com/

Note that there are also content checks and post and browser options available. See the "HTTP TESTS" section of the bb-hosts(5) manual for details.

My test only runs every hour. How do I get it to not go purple after 30 minutes ?
A. Sometimes you want something monitored/tested only once each hour (remote Internet site), once each day (backup job or DNS update), or even less often. Or perhaps more frequently (a critical connection once per minute or less).

There is a LIFETIME option when you call bb on the client to report status. It defines how long a status message is considered valid (i.e. fresh and not yet stale). Search for "LIFETIME" on the bb man page for a bit more info. The lifetime is appended to the status keyword of the message, in the form status+LIFETIME. The defaults are 5 minute testing intervals and 30 minute status lifetimes, so 5 misses are accepted by default. The lifetime should be adjusted for the acceptable number of misses, otherwise false positives might occur. For example, if a host goes completely offline for 2 hours, or happens to miss two consecutive hourly reports, an hourly test probably doesn't need immediate debugging. Conversely, for something being tested every minute, a lifetime of 30 minutes should probably be adjusted downward.
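As an illustration of the lifetime arithmetic and the status+LIFETIME form, here is a minimal Python sketch. It is not part of Hobbit: the function names, the host www.example.com and the "backup" column are hypothetical, and it assumes the server listens on the conventional TCP port 1984 and that lifetimes are given in minutes.

```python
import socket


def lifetime_minutes(interval_min, allowed_misses):
    """Lifetime that tolerates `allowed_misses` consecutive missed reports."""
    return interval_min * (allowed_misses + 1)


def build_status(host, test, colour, text, lifetime_min):
    """Build a status message that carries an explicit lifetime in minutes."""
    # Host and test name are separated by a dot, so dots inside the
    # hostname are written as commas.
    return "status+%d %s.%s %s %s" % (
        lifetime_min, host.replace(".", ","), test, colour, text)


def send_message(server, message, port=1984):
    """Send one message to the server over TCP (assumed port 1984)."""
    with socket.create_connection((server, port), timeout=10) as conn:
        conn.sendall(message.encode("utf-8"))


# Defaults described above: 5-minute interval, 5 accepted misses.
print(lifetime_minutes(5, 5))  # prints 30

# Hypothetical hourly backup check that tolerates one missed report.
message = build_status("www.example.com", "backup", "green",
                       "backup finished", lifetime_minutes(60, 1))
print(message)
# send_message("127.0.0.1", message)  # commented out: needs a reachable server
```

In a shell script the same message would normally be handed to the bb command instead of being sent over a socket directly.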

A couple of sample scenarios:
 * If a script reports backup status only when a backup is finished, and the host or display is not communicative at that exact time, purple might or might not be desirable for that host's backup status in that NOC.
 * If a script checks the size of each subdirectory within a parent directory, and the status is more intended for graphing than for green/yellow/red reporting, many misses may be acceptable.

Warning: The man page suggests "slightly more than the interval between your tests". But if that were actually followed, the default lifetime would be 6 minutes. It isn't 6, and it's not designed to be easily changed from 30: Instead of being an option in a configuration file, it's hardcoded in the  file in the   function:

Does Hobbit encrypt its transmissions ?
A. Natively, no. Some people have reported success using external encryption. Later versions might support data encryption.

How do I monitor files? I have added FILE /path/to/my/file to hobbitclient.cfg but still nothing happens.
A. You also need to add monitored files to client-local.cfg. This tells the client to send the file metadata to the server. The server then uses the metadata and the config in hobbitclient.cfg to determine test results. Also make sure hobbit has read access to the files.

How do I monitor log files other than the default ones?
Answer 1, from "Kauffman, Tom":

> On your hobbit server -
>
> 1) set up etc/client-local.cfg to reference the logs you want AND any exclusions. For AIX, I have:
> [aix]
> log:/var/log/syslog:10240
>        ignore 3004-004
>        ignore 3004-035
>        ignore 3004
> log:/var/log/console.log:10240
> log:/var/log/dsmsched.log:10240
>
> 2) set up etc/hobbit-clients.cfg to create your alerting criteria. For AIX, I've got these set:
> HOST=%.*
>        LOG /var/log/syslog %.*crit.su.*to.root red
>        LOG /var/log/syslog %.*crit.su   yellow
>        LOG %/var/(adm|log)/console.log %.*not.responding.still.trying yellow
>
> Change client-local.cfg first. Allow 15 to 20 minutes for this to propagate to the client;
> look for a file called logfetch. .cfg in client/tmp. This should match your entries in
> client-local.cfg.
>
> Once the logs start coming in, play with the client-local.cfg and a test system, to track
> what you're interested in.
>
> Tom

Answer 2. I have added LOG /path/to/my/logfile WARNING COLOR=yellow to hobbitclient.cfg but still nothing happens.

You also need to add monitored log files to client-local.cfg. This tells the client to send the log file to the server. The Hobbit message protocol is bi-directional: it is not just the hobbit client sending messages to the server, the hobbit server can also instruct the hobbit client to send in log files other than the default ones.

The server then uses the sent data and the config in hobbitclient.cfg to determine test results. Also make sure hobbit has read access to the log.

An example client-local.cfg file.

# following are by OS type to ask hobbit clients send in messages file.
[sunos]
log:/var/adm/messages:10240

[osf1]
log:/var/adm/messages:10240

[aix]
log:/var/adm/syslog/syslog.log:10240

[hp-ux]
log:/var/adm/syslog/syslog.log:10240

[win32]

[freebsd]
log:/var/log/messages:10240

[netbsd]
log:/var/log/messages:10240

[openbsd]
log:/var/log/messages:10240

[linux]
log:/var/log/messages:10240
dir:/tmp
ignore MARK

[linux22]
log:/var/log/messages:10240
ignore MARK

[redhat]
log:/var/log/messages:10240
ignore MARK

[debian]
log:/var/log/messages:10240
ignore MARK

[suse]
log:/var/log/messages:10240
ignore MARK

[mandrake]
log:/var/log/messages:10240
ignore MARK

[redhatAS]
log:/var/log/messages:10240
ignore MARK

[redhatES]
log:/var/log/messages:10240
ignore MARK

[rhel3]
log:/var/log/messages:10240
ignore MARK

[irix]
log:/var/adm/SYSLOG:10240

[darwin]
log:/var/log/system.log:10240

[sco_sv]
log:/var/adm/syslog:10240

# following are by machine names to ask hobbit clients send in messages file.
[caoffice2435.mainoffice.test.com]
log:/var/adm/messages:10240

[caoffice2436.mainoffice.test.com]
log:/var/adm/messages:10240

[caoffice2437.mainoffice.test.com]
log:/var/adm/messages:10240

[caoffice2444.mainoffice.test.com]
log:/var/adm/messages:10240

[caoffice2445.mainoffice.test.com]
log:/var/adm/messages:10240

[caoffice2141.comm.test.com]
# Solaris 10 OS log
#   "log:FILENAME:MAXDATA"
log:/var/adm/messages:10240
ignore MARK
# hobbit server logs
log:/var/opt/hobbitserver42/log/acknowledge.log:10240
log:/var/opt/hobbitserver42/log/bb-display.log:10240
log:/var/opt/hobbitserver42/log/bb-network.log:10240
log:/var/opt/hobbitserver42/log/bb-retest.log:10240
log:/var/opt/hobbitserver42/log/bbcombotest.log:10240
log:/var/opt/hobbitserver42/log/cgierror.log:10240
log:/var/opt/hobbitserver42/log/clientdata.log:10240
log:/var/opt/hobbitserver42/log/history.log:10240
log:/var/opt/hobbitserver42/log/hobbitd.log:10240
log:/var/opt/hobbitserver42/log/hobbitlaunch.log:10240
log:/var/opt/hobbitserver42/log/hostdata.log:10240
log:/var/opt/hobbitserver42/log/il02bbhostsallinone.ksh.log:10240
log:/var/opt/hobbitserver42/log/notifications.log:10240
log:/var/opt/hobbitserver42/log/page.log:10240
log:/var/opt/hobbitserver42/log/rrd-data.log:10240
log:/var/opt/hobbitserver42/log/rrd-status.log:10240
log:/var/opt/hobbitserver42/log/runwebalizer.log:10240
# httpd server logs
log:/var/opt/httpd222/log/access_log:102400
log:/var/opt/httpd222/log/error_log:102400
# httpd server logs
log:/var/log/maillog:102400
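To check a client-local.cfg before waiting for it to propagate, the "log:FILENAME:MAXDATA" entries and their ignore lines can be listed with a small script. The following Python sketch is an illustration only and not part of Hobbit; the function name is hypothetical, and it assumes that section headers are written as [name], that lines starting with # are comments, and that an ignore line belongs to the most recent log: line.

```python
def parse_client_local(text):
    """Return {section: [{"path", "maxdata", "ignore"}]} for log: entries."""
    sections = {}
    current = None   # entry list of the section being read
    entry = None     # most recent log: entry, target of following ignore lines
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], [])
            entry = None
        elif current is not None and line.startswith("log:"):
            # Split from the right so the path itself may contain colons.
            path, maxdata = line[len("log:"):].rsplit(":", 1)
            entry = {"path": path, "maxdata": int(maxdata), "ignore": []}
            current.append(entry)
        elif entry is not None and line.startswith("ignore "):
            entry["ignore"].append(line[len("ignore "):].strip())
        # Any other directive (for example dir:) is skipped by this sketch.
    return sections


SAMPLE = """\
[aix]
log:/var/log/syslog:10240
       ignore 3004-004
       ignore 3004-035
log:/var/log/console.log:10240

[win32]
"""

config = parse_client_local(SAMPLE)
for section, entries in config.items():
    print(section, entries)
```

The sketch only reads the file; it does not validate paths or check that the client can actually read the logs.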

What's the meaning of the track alert-mail-number in the subject of hobbit alert emails ?
A. The number is the ack-code you can use for acknowledging the alert. They are random numbers generated for each alert.

How do I restrict access to Hobbit pages to specific people or groups ?
A. Apache has its own authentication. Use it. To give one group access to some info, and another group access to other info, use the PAGE statement in bb-hosts. This will create a new directory for each page, which can be controlled by Apache's authentication system.

I am having a problem with devmon ....
A. Please post devmon related questions to the devmon support mailing list.

How do I create custom scripts and graphs ?
A. http://xymonton.org/tutorials:customgraph

Why does my custom graph only show in the trends column but not its own column ?
A. This is typical for tables. One must append the column name to the --multigraphs setting. More details on --multigraphs are in the hobbitsvc.cgi documentation (man hobbitsvc.cgi).
 1) Look in the source file web/hobbitsvc.c.
 2) Find the default multigraphs assignment and note its value.
 3) Edit the configuration file server/etc/hobbitcgi.cfg.
 4) Find the CGI_SVC_OPTS assignment.
 5) Add a --multigraphs value to the assigned string, including the columns that tend to have tabular data. For example: CGI_SVC_OPTS="--env=/home/hobbit/server/etc/hobbitserver.cfg --no-svcid --history=top --multigraphs=disk,if_load,if_dsc" (Note that, while the setting in hobbitsvc.c starts and ends with a comma, the setting in hobbitcgi.cfg does not.)
 6) Then try out the change (refresh the web page).

I don't want to display column foo in my display. How do I do that ?
A. Add the entry NOCOLUMNS:foo,bar to hide columns foo and bar.

How do I check to ensure something is not running ?
A. In bb-hosts:

1.2.3.4 my.host.com # !ftp

This will cause the test to go red if FTP is running.

I want to monitor Windows servers. How do I do that ?
If you have to, use BBWin as your client. http://bbwin.sourceforge.net/

What are the translations between BBWin's XML and Central Mode?
(Note: untested but hopefully accurate) (Note: not completely accurate but good starting point)

I am having a problem with bbwin on my Windows......
A. Please post bbwin issues to the bbwin forum

The BBWin client works well. I've re-used a script that previously worked with Big Brother. The result is left in C:\BBWin\logs.

Everything else is reported to the server, but the external script is not.

<load name="uptime" value="uptime.dll" />
<load name="who" value="who.dll" />
<setting name="loglevel" value="3" />
<setting name="logpath" value="C:\BBWin\logs\BBWin.log" />
<setting name="logreportfailure" value="false" />
<setting name="hostname" value="vs3k-gap" />
<setting name="alwaysgreen" value="false" />
<setting name="default" warnlevel="90" paniclevel="95" delay="3" />
<setting name="alwaysgreen" value="false" />
<setting name="default" warnlevel="85%" paniclevel="95%" />
<setting name="remote" value="false" />
<setting name="cdrom" value="false" />
<setting name="timer" value="1m" />
<setting name="logstimer" value="60s" />
<load value="C:\BBWin\ext\sqlv.cmd" timer="1m" />
<setting name="alwaysgreen" value="false" />
<setting name="physical" warnlevel="78" paniclevel="98" />
<setting name="page" warnlevel="70" paniclevel="90" />
<setting name="virtual" warnlevel="78" paniclevel="90" />
<setting name="alwaysgreen" value="false" />
<setting name="delay" value="1h" />
<match logfile="System" type="error" alarmcolor="red" />
<match logfile="System" type="warning" alarmcolor="yellow" />
<match logfile="Application" type="error" alarmcolor="red" />
<match logfile="Application" type="warning" alarmcolor="yellow" />
<match logfile="Security" type="fail" />
<setting name="alwaysgreen" value="false" />
<setting name="autoreset" value="false" />
<setting name="alarmcolor" value="yellow" />
<setting name="Windows Time" value="started" autoreset="true" alarmcolor="red" />
<setting name="delay" value="30m" />
<setting name="maxdelay" value="365d" />

The script generates this output:

 * OK

echo green   ***Error en Chequeo SQLV, Servidor:%computername% >>  C:\BBWin\logs\sqlv
echo ^&green ***Error en Chequeo SQLV, Servidor:%computername% >>  C:\BBWin\logs\sqlv

 * BAD

echo red    ***OK en Chequeo SQLV, Servidor:%computername% >>  C:\BBWin\logs\sqlv
echo ^&red  ***OK en Chequeo SQLV, Servidor:%computername% >>  C:\BBWin\logs\sqlv

I set up my client, but nothing is appearing on the status page ?
A. Check the ghost client reports or the hobbitd status page. It could be misconfigured.

We are changing the name of a host, and want to keep monitoring it, and keep the history. Is there a way ?
A. Check the Hobbit "Tips & Tricks" page in the help menu:

~/server/bin/bb 127.0.0.1 "rename OLDHOSTNAME NEWHOSTNAME"

I don't want to use rrdtool to create data averages and round-robin the data, I want to keep all data forever. Can I do it ?
A. There's a method of doing this in the current snapshot, including a new hobbitd_rrd manpage that describes how to run it and what the input to your custom script looks like. The option is --processor=COMMAND. It feeds the raw data via standard input into COMMAND, which can then process the data into another storage system.
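A COMMAND for this purpose can be very small. The Python sketch below is an illustration only: it assumes nothing about the input beyond it being line-oriented text on standard input (the real format is described in the hobbitd_rrd manpage and is not reproduced here), and the function name and the timestamp prefix are hypothetical choices.

```python
import io
import time


def archive_raw(source, sink, now=None):
    """Copy every input line to `sink`, prefixed with a Unix timestamp."""
    stamp = int(time.time() if now is None else now)
    count = 0
    for line in source:
        text = line.rstrip("\n")
        sink.write("%d %s\n" % (stamp, text))
        count += 1
    return count


# Demonstration with in-memory streams; the input lines are made up.
demo_in = io.StringIO("first raw line\nsecond raw line\n")
demo_out = io.StringIO()
print(archive_raw(demo_in, demo_out, now=0))  # prints 2
print(demo_out.getvalue(), end="")
```

Used as a real COMMAND, the script would call archive_raw with sys.stdin as the source and an append-mode file (or a database writer) as the sink.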

I don't have a compiler on my AIX system. Where can I get a precompiled Hobbit client ?
A. http://www.docum.org/twiki/bin/view/Hobbit/HobbitClients

How do I unsubscribe from the Hobbit mailing list ?
A. If you must go, send an e-mail to hobbit-unsubscribe@hswn.dk

How do I add a hobbit search engine plugin into Internet Explorer 7 ?
A. Await contribution.

How do I add a hobbit search engine plugin into FireFox 2 ?
A. Copy the following file and save it as "c:\Program Files\Mozilla Firefox\searchplugins\hobbit.xml" (this is the OpenSearch description format, which is compatible with both Firefox 2 and Internet Explorer 7). Adjust the URL for your site accordingly.

Then restart Firefox. You should see a blue Hobbit smiley icon show up in the search bar.

References:
 * http://www.opensearch.org/Specifications/OpenSearch/1.1#OpenSearch_description_document
 * http://developer.mozilla.org/en/docs/Creating_OpenSearch_plugins_for_Firefox

Is it possible to disable and acknowledge alerts via email, how?
Include an IGNORE exception with the hosts or alerts that you wish to ignore:

HOST=* SERVICE=*
    IGNORE HOST=%[a-z]{3}[0-9]{4}
    MAIL admin@foo.com

or perhaps more specific:

HOST=* COLOR=red
    IGNORE HOST=marketing.foo.com SERVICE=cpu TIME=4:1500:1800
    MAIL admin@foo.com

and check the configuration:

$ cd ~/server
$ ./bin/bbcmd hobbitd_alert --test tic0102 comm
00026606 2012-03-01 10:52:51 *** Match with 'IGNORE HOST=%[a-z]{3}[0-9]{4}' ***
00026606 2012-03-01 10:52:51 IGNORE rule found
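Before editing the alert configuration, the regular expression in an IGNORE HOST=% rule can be tried against a few hostnames. This Python sketch is only an approximation of what hobbitd_alert does: it assumes the text after the leading % is an ordinary regular expression that may match anywhere in the hostname, and the function name and the host db12 are made up for the example.

```python
import re

# Text after the leading '%' in: IGNORE HOST=%[a-z]{3}[0-9]{4}
PATTERN = "[a-z]{3}[0-9]{4}"


def ignored(hostname):
    """True if the hostname would be caught by the IGNORE pattern."""
    return re.search(PATTERN, hostname) is not None


for host in ("tic0102", "marketing.foo.com", "db12"):
    print(host, ignored(host))
```

Only tic0102 consists of three letters followed by four digits, so it is the only name reported as ignored. The authoritative check remains the hobbitd_alert test run shown above.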

Can I configure a maximum time limit a alert can be acknowledged for ?
A.

Can I restrict what hosts can be disabled on the enable/disable page ?
A.

Can I enable/disable on only one display server and have it show on both?
Typical installations with dual display servers have them configured to act independently. Each Xymon display server lists only itself in the XYMSERVERS setting in xymonserver.cfg, so usually the enable/disable page only applies to the server that served the form.

However, the XYMSERVERS value can be overridden for each CGI script, and the enable/disable form will send updates to all Xymon servers defined in XYMSERVERS. To override it for enable/disable, perform the following steps.

Create the file xymonserver-enadis.cfg containing:

include /etc/xymon/xymonserver.cfg
XYMSERVERS="display1 display2" # replace with IP addresses of Xymon servers

Then edit cgioptions.cfg and add this line:

XYMONENV_ENADIS=/usr/lib/xymon/server/etc/xymonserver-enadis.cfg

and modify CGI_ENADIS_OPTS to reference the new variable:

CGI_ENADIS_OPTS="--env=$XYMONENV_ENADIS"

Can the SMS alert format be reconfigured to display more or less information ?
A. Not directly, but an alert script (smsplus) has been written to provide more information than the default SMS output does. The script can be easily modified using the following environment variables (from the hobbit documentation):

Q. How do I enable SNMP monitoring with Hobbit Server?
A 1. http://cerebro.victoriacollege.edu/hobbit-trap.html

A 2. http://devmon.sourceforge.net

Q. How to configure multiple yellow to red alert ?
On Wed, Apr 18, 2007 at 04:11:13PM -0400, Galen Johnson wrote:

> I'll admit I haven't put a lot of legwork into this but...is it > possible to configure hobbit to go red on a test after a certain > number of cycles at yellow? I have some tests that I don't mind > if they are yellow for small period but if they are there too long > I need to know.

A. Use the "badTEST" setting in bb-hosts (see the man-page). This delays a yellow or red status from appearing until it has stayed yellow (or red) for a number of test cycles. So you could use this to suppress the yellow status until it had been yellow for some time, so when it does turn yellow you know this is something you have to handle.

Or if your custom test reported a "red" status, you could make it go yellow for the first 5 test cycles, and red after that.

Henrik

Note: Currently, this only works for the ping test.
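For a custom test, the escalation Henrik describes (yellow for the first 5 bad test cycles, red after that) can be done in the test script itself. The Python sketch below shows only the counting logic; the function names are hypothetical, and a real script would have to keep the failure counter between runs, for example in a small state file.

```python
def escalated_colour(consecutive_bad, yellow_cycles=5):
    """Colour to report after `consecutive_bad` failed cycles in a row."""
    if consecutive_bad <= 0:
        return "green"
    if consecutive_bad <= yellow_cycles:
        return "yellow"
    return "red"


def colours_for(results, yellow_cycles=5):
    """Map a sequence of check results (True = passed) to reported colours."""
    bad = 0
    colours = []
    for passed in results:
        bad = 0 if passed else bad + 1  # a good result resets the counter
        colours.append(escalated_colour(bad, yellow_cycles))
    return colours


# One good cycle, three bad ones, then recovery, with a 2-cycle yellow window.
print(colours_for([True, False, False, False, True], yellow_cycles=2))
# prints ['green', 'yellow', 'yellow', 'red', 'green']
```

The colour returned here is what the script would put into its status message for that cycle.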

Q. How do I use hobbit client with BB server ?
A. On Wed, Sep 06, 2006 at 01:48:02PM -0500, Rich Smrcina wrote:
> Is the new Hobbit client compatible with the old BigBrother server? BigBrother is run by a different part of the organization and I may not be able to get them to change to Hobbit, but for my Linux guests and my z/VM systems, I would be interested in converting to the new Hobbit code.

In the default mode, you cannot use the Hobbit client to report to a Big Brother system. No data would ever show up, because a Big Brother server doesn't know how to feed the client data through the hobbitd_client module, which takes care of converting the client data into status columns.

However, you *can* run the Hobbit client in the local-configuration mode. When the configure script asks "Server side client configuration, or client side [server] ?", answer "client", and then launch the client with the "--local" option.

In this mode, the client sends normal "status" messages to the Hobbit/BB server. I'm not sure if alerts will work, though, since the Hobbit client doesn't generate the "page" messages that the BB server expects to trigger sending out alerts. (Hobbit ignores these messages completely, so it did seem like a waste of time to generate them).

Note that this isn't really described very well anywhere. It means you will have to maintain the client configuration on the client, not on the Hobbit server.

Regards, Henrik

Q. How to enable fping as non-root user in Solaris 10 ?
The problem:

bash-3.00$ more bb-network.log
2006-08-29 17:31:02 Execution of 'hobbitping -Ae' failed - program not suid root?
2006-08-29 17:31:02
2006-08-29 17:31:02 Cannot get RAW socket: Permission denied
bash-3.00$

The fix:

There are 3 files that need to be updated so that fping can be executed as root (uid=0) on Solaris by a named user, in this case the hobbit user.

The three files in question are:

/etc/user_attr
/etc/security/exec_attr
/etc/security/prof_attr

These have been updated with the following lines.

In /etc/user_attr:

hobbit::::profiles=Hobbit Commands

In /etc/security/exec_attr:

Hobbit Commands:solaris:cmd:::/usr/local/hobbit/server/bin/hobbitping:uid=0

In /etc/security/prof_attr:

Hobbit Commands:::Hobbit Commands:

Regards,

Mike Rowell, edited by T.J. Yang

References: http://docs.sun.com/app/docs/doc/816-4557/6maosrjfc?a=view

Q. Can you use a hobbit server, but keep your bb clients?
A. Yes. Hobbit is 100% compatible with BB clients. But beware that after BB 1.9c the license became more restrictive, and you need to pay for a per-seat license.

Q. What is Big Sister ?

 * http://sourceforge.net/projects/bigsister/
 * Big Sister is a port of an early version of Big Brother, written in Perl.

Q. Is there a quick overview of System Monitoring Tools ?

 * http://www.generalconcepts.com/resources/monitoring

Q. What features should a monitoring system have ?

 * /Generic Monitoring System Features/

Q. How safe is it to migrate to Hobbit from BB ?
-Original Message-
From: Henrik Stoerner [mailto:henrik@hswn.dk]
Sent: Tuesday, August 01, 2006 5:13 PM
To: hobbit@hswn.dk
Subject: Re: [hobbit] Hobbit newbie from BB: differences and what may I lose from migrating?

Hi Jordan,

I'll try to answer your questions. Since I also develop Hobbit I am probably slightly biased when it comes to the "is-this-more-difficult-to-do-than-with-BB" type of questions, but I am sure others will voice their opinions on that.

On Tue, Aug 01, 2006 at 12:36:29PM -0700, Jordan Mendler wrote:
> First, after reading through whatever I could find on the website I am still a little bit confused about configuration and setup. With BB, you install and configure each client and server on the local machine, except for the universal bb-hosts. Is this the same on Hobbit, or does Hobbit use a central configuration file that is modified only on the server to configure clients? I am trying to figure out the difference between installing, maintaining and configuring BB and Hobbit setups.

First, let me stress that Hobbit is fully compatible with your existing BB clients. You can keep your current client setup and just switch to Hobbit on the server side, and all of your clients will continue to work as they do with BB as the server. So you can migrate the server side first, and then migrate clients when you find that it is convenient to do so - or you want to take advantage of some of the new stuff that is in Hobbit.

The Hobbit client configuration is maintained on the Hobbit server. Clients in Hobbit are designed to be *really* dumb; they just collect data, and all of the configuration of what to monitor, what thresholds to use for e.g. disk utilization and so on is configured only on the Hobbit server.

This is a major difference between Hobbit and BB. With BB you have delegated the client administration to whoever manages each server. Hobbit centralizes the monitoring configuration, so you will probably have a group of people who take more control of the monitoring setup.

> Hobbit looks a lot more complex to setup, but once I get my feet wet is it any harder than BB?

I think it is easier, once you get used to the Hobbit way of doing things. But as I said, I am biased.

> Second is performance. I know this list may be biased toward Hobbit, but is it actually faster? We have about 50-100 clients on BB and I did not notice any performance issues.

With that number of systems monitored, you probably will not see a huge difference. BB works quite well for a small number of systems, but when you move beyond a couple of hundred boxes the overhead of generating webpages through shell scripts becomes very noticeable. On my setup, the servers were simply choking on the disk I/O caused by BB saving every status in a separate file, and from the huge number of small cut-grep-awk-sed etc. commands that ran to generate webpages.

> Hobbit looks like it is very complex, so does this mean it uses a lot of resources on the client and server? What speed/ram server is usually the minimum recommended for a dedicated Hobbit server? Would something like a dual Pentium II 266mhz have any performance issues as a server, if it does nothing else? What about for clients? We still have some testing, staging and production servers left that are single chip Pentium III 700-850 mhz, and even a couple Pentium II's. Just need to make sure all the resources used for things like graphs are taken from the server and not each client.

The Hobbit server uses fewer resources than the BB server. The main resource usage is memory; Hobbit keeps everything in memory except the history logs and the RRD files used for graphs. That doesn't mean a whole lot, though: Here's a ps listing of the Hobbit processes running on my main monitoring system - it handles about 2500 hosts:

$ ps vax|cut -c1-100|egrep "PID|hobbit"
  PID TTY     STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
  732 ?       Ss     1:24      0   101  1802   696  0.0 hobbitlaunch
  735 ?       S    2434:37     1   162 31357 29784  2.8 hobbitd
 1470 ?       S     14:50      0    99  2332  1088  0.1 hobbitd_channel --channel=stachg
 1471 ?       S     25:18      0   108  2515  1048  0.1 hobbitd_history
 1472 ?       S    964:26      0    99  2332  1264  0.1 hobbitd_channel --channel=page
 1473 ?       S    1227:34     0   154  5661  3912  0.3 hobbitd_alert
 1474 ?       S    4090:05     0    99  2332  1264  0.1 hobbitd_channel --channel=status
 1475 ?       D    2962:15     0   178  7381  4392  0.4 hobbitd_rrd
 1476 ?       S    259:55      0    99  2332  1208  0.1 hobbitd_channel --channel=data
 1477 ?       S    494:13      0   178  5141  2128  0.2 hobbitd_rrd
 1478 ?       S    126:20      0    99  2844  1832  0.1 hobbitd_channel --channel=client
 1480 ?       S    291:20      0   146  4485  2792  0.2 hobbitd_client
 5552 ?       S      0:00      0   669  2002  1352  0.1 sh -c vmstat 300 2 1>/usr/lib/hobbit/client/

As you can see, the biggest chunk of memory goes to the "hobbitd" process which is the one that keeps all state information. It's currently using some 31 MB of memory. (This box has 1 GB RAM).

A rough estimate of how much memory Hobbit needs would be the size of your bbvar/logs/ directory, plus 30 MB.
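The rule of thumb above is easy to compute for an existing installation. This Python sketch is an illustration, not a Hobbit tool: the function names are hypothetical, the demonstration uses a temporary directory in place of a real bbvar/logs/ directory, and it assumes 1 MB = 1024 * 1024 bytes.

```python
import os
import tempfile


def dir_size_bytes(path):
    """Total size in bytes of all files below `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # file vanished between listing and stat
    return total


def estimated_memory_mb(logs_dir, base_mb=30.0):
    """Rule of thumb: size of the logs directory plus 30 MB."""
    return dir_size_bytes(logs_dir) / (1024.0 * 1024.0) + base_mb


# Demonstration: a stand-in logs directory holding a single 1 MB status file.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "host.cpu"), "wb") as handle:
        handle.write(b"x" * (1024 * 1024))
    print(estimated_memory_mb(demo_dir))  # prints 31.0
```

On a real server, pass the path of your own bbvar/logs/ directory instead of the temporary one.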

As for CPU usage, your PII/266 should be adequate for 50-100 servers. The box I'm running on is an old (7-8 years) Solaris server with a 900 MHz UltraSparc II processor. That's roughly comparable to a PII running at 1.2 GHz. And it handles 25 times as many hosts as you are aiming for.

> Third is plugins. Are BB plugins compatible with Hobbit?

Yes.

> Also how hard are plugins to write for Hobbit?

Plugins that run on the monitored client systems are as easy to write as for BB, since it is basically the same thing.

Hobbit also allows you to write plugins for the Hobbit server, which receive events from the Hobbit server daemon. This is used by the core Hobbit tools - e.g. the hobbitd_rrd processes you see in the ps-listing above are a plugin that handles updating of the RRD files from the status- and data-messages that are sent to Hobbit. There aren't any third-party plugins that use this yet (at least, I don't know of any), but writing them is fairly simple since it basically involves reading data from a pipe and processing it in whatever way you want.

> I don't know if these even exist for bb, but I ultimately would like to integrate plugins that 1) monitor legato tape backup,

Don't know about this.

> 2) run nmap to see what ports are open/can be seen from an external machine,

The Hobbit client in version 4.2 (about to be released soon) reports details about the network services running on a host. So you can check for which ports are open/listening for connections, and trigger alerts if any unwanted ports show up.

> 3) run 'lshw -html' to show a list of all the hardware on the system,

This would typically be a client-side test.

> 4) monitor uptime,

This is standard.

> 5) monitor OS and kernel versions (uname -a and head -n 1 /etc/issue),

This data is collected by the Hobbit client.

> 6) maybe some more router/network monitoring stuff and

Hobbit comes with built-in network service monitoring. There is also an SNMP add-on which can be used for monitoring devices such as routers.

> Fourth is relay. By this I mean monitoring systems on a private subnetwork that are only accessible to the Hobbit server by going through an intermediate server. Is this possible with Hobbit and is it any more difficult to do than on BB?

Two ways of doing that. First, there is a proxy utility which is used to forward Hobbit messages from one network to another. This is used if your client systems on the private subnet are allowed to make outgoing connections to the proxy, and the proxy can connect to the real Hobbit server.

Second, Hobbit 4.2 includes a set of tools where it's the server that contacts clients to pick up the data they have collected (i.e. the traffic is initiated by the server, where the normal BB setup is for the client to initiate the connection). Useful for DMZ style setups where clients are not allowed to generate outbound connections.

> Fifth is portability. BB is very portable, I can make a 'model' client for say Red Hat and tar it and distribute it very easily to every server I have using only a few commands. Is Hobbit the same, or are there client dependencies or other things that may make this more difficult.

The Hobbit client uses only the system libraries and standard utilities found on your client systems. You will need at least one system where you can compile the client binaries (that's similar to the BB requirements), since a few of the client-side tools are written in C.

Once you have a client compiled for an OS, it is as portable as any binary that is dynamically linked on your platform. I.e. you can just copy it over as long as the same run-time libraries are available.

So far, we haven't managed to find any unix-like system that couldn't run the Hobbit client, including some rather odd ones. The current list of client-side data collectors is:

hobbitclient-aix.sh   hobbitclient-darwin.sh  hobbitclient-freebsd.sh hobbitclient-hp-ux.sh  hobbitclient-irix.sh    hobbitclient-linux.sh hobbitclient-netbsd.sh hobbitclient-openbsd.sh hobbitclient-osf1.sh hobbitclient-sunos.sh

> Sixth is development. How active is the development of Hobbit, how big is the community, etc? How many people can attest to having fully functional hobbit setups, how long has it been around and how often are new releases usually made?

Hobbit started back in late 2002 when it was called the "bbgen toolkit". It was renamed to Hobbit in March 2005 when it had developed into a complete replacement for BB. More details in the hobbit(7) man-page available online at http://www.hswn.dk/hobbit/help/manpages/

It is actively being developed by me, but people on this list have made contributions of code. Some have picked up special projects like the Windows client and run that completely on their own. I'd say Hobbit currently has a very active user community, and the development community is slowly growing beyond just myself.

There are currently 433 subscribers to the Hobbit mailing list. According to the Sourceforge download statistics, it is downloaded about 1000 times per month. http://sourceforge.net/project/stats/?group_id=128058&ugn=hobbitmon&type=&mode=year

There was a thread on the mailing list back in May about who uses Hobbit. The results were summarized here: http://en.wikibooks.org/wiki/System_Monitoring_with_Hobbit/User_Guide#Who_use_Hobbit_.3F

New releases have usually happened frequently - 2-4 times a year. The current interval between the 4.1.2 release and version 4.2 is unusually long - a whole year. I don't expect that to happen again.

> Also I saw something this morning about a Windows client -- how stable is that?

From what I hear it should be usable. But you can stick with the current BBNT client until it reaches version 1.0.

> How stable is the Solaris version?

Rock-solid.

> Is there a client for Mac OSX?

Yes. It will run the Hobbit server also, if you want to.

> Is Hobbit like BB in the sense that you can change paths to system binaries like grep and sed to allow easy use on other UNIXes like OSX?

Adding a client for a new OS will require implementing both a client-side script to collect whatever data is interesting for this system, and implementing the data parsing on the Hobbit server-side. So it is somewhat more challenging. But since Hobbit already supports all of the common Unix systems, I doubt that you will need to worry about that. If you do have a system which is not on the list, I will help you with adding support for it.

> When will 4.2 be officially released as a production version?

Probably by the end of this week.

> Since we have a working BB setup for now, I need to decide if I should try to start migrating now or if I should wait some time for Hobbit to develop more before I migrate from BB.

I don't think you have to wait. But it's for You to decide.

Regards, Henrik

Q. Does NOCOLUMNS have to refer to a client line or can it refer to a page/subpage in bb-hosts ?
A. Waiting for contribution.

32 bit
CC=cc CFLAGS="-mr -Qn -xstrconst -xO2 -xtarget=ultra2 -xarch=v8plusa" CC_LD_RT="-R"

Meaning of each option: -mr: -Qn: -xstrconst: -xO2 -xtarget=ultra2 -xarch=v8plusa

64 bit
CC=cc CFLAGS="-mr -Qn -xstrconst -xO2 -xtarget=general -xarch=v9" CC_LD_RT="-R"

Q. Example of 32-bit hobbit server
bash-3.00# file hobbitd
hobbitd:       ELF 32-bit MSB executable SPARC32PLUS Version 1, V8+ Required, UltraSPARC1 Extensions Required, dynamically linked, not stripped
bash-3.00#

Warning messages from Sun Compiler
The following are warning messages produced when building with the Sun compiler.

Q. "pointer to unsigned char "=" pointer to char"
 * The error message:
bash-3.00# gmake
cc -mr -Qn -xstrconst -xO2 -xtarget=ultra2 -xarch=v8plusa -D_REENTRANT -DSunOS -o safequery safequery.c
"safequery.c", line 12: warning: assignment type mismatch: pointer to unsigned char "=" pointer to char
bash-3.00#

 * The source
 * Note: getenv is documented as returning char *, not unsigned char *.

Others
"loadhosts.c", line 463: warning: statement not reached
    prototype: pointer to char : "/opt/build/hobbit-4.2.0/include/../lib/strfunc.h", line 16
    argument : pointer to unsigned char
"loadhosts.c", line 463: warning: statement not reached
"loadhosts.c", line 521: warning: return value type mismatch
"hobbitd_alert.c", line 543: warning: assignment type mismatch: pointer to char "=" pointer to unsigned char
"hobbitd_alert.c", line 665: warning: assignment type mismatch: pointer to unsigned char "=" pointer to char
"hobbitd_alert.c", line 692: warning: argument #1 is incompatible with prototype:
    prototype: pointer to unsigned char : "/opt/build/hobbit-4.2.0/include/../lib/encoding.h", line 17
    argument : pointer to char
"hobbitd_alert.c", line 701: warning: assignment type mismatch: pointer to unsigned char "=" pointer to char
"hobbitd_alert.c", line 719: warning: assignment type mismatch: pointer to unsigned char "=" pointer to char
"do_alert.c", line 182: warning: argument #2 is incompatible with prototype:
    prototype: pointer to char : "/opt/build/hobbit-4.2.0/include/../lib/strfunc.h", line 16
    argument : pointer to unsigned char
"do_alert.c", line 190: warning: argument #1 is incompatible with prototype:
    prototype: pointer to char : "/opt/build/hobbit-4.2.0/include/../lib/misc.h", line 25
    argument : pointer to unsigned char
"do_alert.c", line 253: warning: argument #1 is incompatible with prototype:
    prototype: pointer to char : "/opt/build/hobbit-4.2.0/include/../lib/misc.h", line 25
    argument : pointer to unsigned char
"do_alert.c", line 272: warning: argument #1 is incompatible with prototype:
    prototype: pointer to char : "/opt/build/hobbit-4.2.0/include/../lib/misc.h", line 25
    argument : pointer to unsigned char

How do I monitor HP-UX network log ?
On Mon, Jan 28, 2008 at 12:06:22PM -0500, Robert Herron wrote:
> HP-UX stores its network log in a binary file (/var/adm/nettl.LOG000) that you view with the netfmt command. Before I start working on my own, does anyone have an EXT script to monitor it? If so, could I have a copy?

Alternatively, you could modify the HP-UX client script to generate a normal Hobbit "msgs" section with the text-output from the netfmt command; then Hobbit can process it as if it were an ordinary text-based logfile.

E.g. at the bottom of the hobbitclient-hp-ux.sh script running on your clients, just before the "exit" command add this:

echo "[msgs:/var/adm/nettl.LOG000]"
netfmt ...whatever needs to go here to get the text-output ...

Then you can use a normal log-file entry on the Hobbit server to process the log data.
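The addition to the client script can be sketched as a small helper. This is only a sketch: the function name emit_msgs_section is made up for illustration, and because the netfmt arguments are left open in the reply above, a stand-in command is used for the demonstration.

```sh
#!/bin/sh
# Hypothetical helper: print a Hobbit "[msgs:<logfile>]" section header,
# followed by the text output of whatever command is passed in.
emit_msgs_section() {
    logfile="$1"
    shift
    echo "[msgs:$logfile]"
    "$@"
}

# Demonstration with a stand-in command. On a real HP-UX client the
# netfmt command (with suitable arguments) would take its place.
emit_msgs_section /var/adm/nettl.LOG000 echo "sample decoded log line"
```

Keeping the header and the command together in one function makes it harder to emit one without the other when more binary logs are added later.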

Regards, Henrik

Q. How do I enable RSS on hobbit server ?
A.
 * In the hobbitserver.cfg file, change the BBGENOPTS variable from the following:
BBGENOPTS="--recentgifs --subpagecolumns=2"    # Standard options for bbgen.
 * To the following (RSS is enabled by "--rss --rsslimit=yellow"):
BBGENOPTS="--recentgifs --subpagecolumns=2 --rss --rsslimit=yellow"     # Standard options for bbgen.
 * The newly created RSS files in the www directory should resemble the following:
bash-3.00$ ls -lrt /opt/moto/hobbitserver42/www/*.rss
-rw-r--r--  1 hobbits  hobbits      714 Jan 22 07:50 /opt/moto/hobbitserver42/www/bb.rss
-rw-r--r--  1 hobbits  hobbits     6184 Jan 22 07:50 /opt/moto/hobbitserver42/www/bb2.rss
-rw-r--r--  1 hobbits  hobbits      337 Jan 22 07:50 /opt/moto/hobbitserver42/www/bbnk.rss
bash-3.00$

Q. How do I configure SMF for hobbit on Solaris 10 ?
A. From: Everett, Vernon [mailto:Vernon.Everett@woodside.com.au] For those of you familiar with Solaris 10, you should know about services, but for some, adding new ones is a little tricky. To get Hobbit working as a service we need to do the following.

Create a file named /var/svc/manifest/application/hobbit.xml with the following content:

Take note of lines 37, 47 and 57, the lines that start "exec=". You may need to edit the path to your Hobbit start script.

To avoid confusion or possible issues, shut down your hobbit client at this point using the runclient script.

Now, as root, run the command:
# svccfg import /var/svc/manifest/application/hobbit.xml

We should now have a service called hobbit:
# svcs | grep hobbit
online 9:23:05 svc:/application/hobbit:default
(It will probably have gone online at this point.)

You can now treat it as you would a regular service. If it hasn't gone online, kick it off as normal. It may be necessary to do a disable and then an enable, but that should get it going.
# svcadm enable hobbit

And because we have set the default as enabled, the service should start automatically when you do a reboot.

Confirm it's all good by doing:
# ps -efa | grep hobbit
All the usual scripts should be running.

If you don't want it as a service anymore, as root run:
# svccfg delete hobbit
This will remove the service, and allow you to continue running it from the runclient script.

Q. How do I configure Hobbit Client for Solaris 10 using SMF ?
A. copied from http://xymonton.org/addons:hobbitsmf These are service manifest files for Solaris 10. These will allow you to import the hobbit start and stop scripts for the server/client into the new Solaris Service Management Facility (Solaris 10 replacement of /etc/rcN.d).

Installation

1. mkdir -p /var/svc/manifest/application/monitoring/hobbit
2. copy the client and server xml files to /var/svc/manifest/application/monitoring/hobbit
3. import the service(s)

svccfg import /var/svc/manifest/application/monitoring/hobbit/server.xml
svccfg import /var/svc/manifest/application/monitoring/hobbit/client.xml

4. enable the service(s)

svcadm enable svc:/application/monitoring/hobbit/client:default
svcadm enable svc:/application/monitoring/hobbit/server:default

Source Hobbit Client

<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>

<service_bundle type="manifest" name="hobbit:client">
<service name="application/monitoring/hobbit/client" type="service" version="1">
    <create_default_instance enabled='false' />
    <single_instance />
    <dependency name="filesystem" grouping="require_all" restart_on="none" type="service">
        <service_fmri value="svc:/system/filesystem/local"/>
    </dependency>
    <dependency name="network" grouping="require_all" restart_on="none" type="service">
        <service_fmri value="svc:/network/initial"/>
    </dependency>
    <dependency name="multi-user-server" grouping="require_any" restart_on="error" type="service">
        <service_fmri value="svc:/milestone/multi-user-server:default"/>
    </dependency>
    <exec_method type="method" name="start" exec="/usr/local/hobbit/client/runclient.sh start" timeout_seconds="30">
        <method_context>
            <method_credential user="hobbit" group="bb" supp_groups="" />
        </method_context>
    </exec_method>
    <exec_method type="method" name="stop" exec="/usr/local/hobbit/client/runclient.sh stop" timeout_seconds="30">
        <method_context>
            <method_credential user="hobbit" group="bb" supp_groups="" />
        </method_context>
    </exec_method>
    <property_group name='startd' type='framework'>
        <propval name='ignore_error' type='astring' value='core,signal' />
    </property_group>
    <stability value="Unstable"/>
    <template>
        <common_name>
            <loctext xml:lang="C"> Hobbit Client </loctext>
        </common_name>
        <documentation>
            <doc_link name='hobbit_monitor_site' uri='http://hobbitmon.sourceforge.net/' />
        </documentation>
    </template>
</service>
</service_bundle>

Q. How do I configure Hobbit Server for Solaris 10 using SMF ?
A. copied from http://xymonton.org/addons:hobbitsmf

<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>

<service_bundle type="manifest" name="hobbit:server">
<service name="application/monitoring/hobbit/server" type="service" version="1">
    <create_default_instance enabled='false' />
    <single_instance />
    <dependency name="filesystem" grouping="require_all" restart_on="none" type="service">
        <service_fmri value="svc:/system/filesystem/local"/>
    </dependency>
    <dependency name="network" grouping="require_all" restart_on="none" type="service">
        <service_fmri value="svc:/network/initial"/>
    </dependency>
    <dependency name="multi-user-server" grouping="require_any" restart_on="error" type="service">
        <service_fmri value="svc:/milestone/multi-user-server:default"/>
    </dependency>
    <exec_method type="method" name="start" exec="/usr/local/hobbit/server/hobbit.sh start" timeout_seconds="30">
        <method_context>
            <method_credential user="hobbit" group="bb"/>
        </method_context>
    </exec_method>
    <exec_method type="method" name="stop" exec="/usr/local/hobbit/server/hobbit.sh stop" timeout_seconds="30">
        <method_context>
            <method_credential user="hobbit" group="bb"/>
        </method_context>
    </exec_method>
    <property_group name='startd' type='framework'>
        <propval name='ignore_error' type='astring' value='core,signal' />
    </property_group>
    <stability value="Unstable"/>
    <template>
        <common_name>
            <loctext xml:lang="C"> Hobbit Server </loctext>
        </common_name>
        <documentation>
            <doc_link name='hobbit_monitor_site' uri='http://hobbitmon.sourceforge.net/' />
            <manpage title="hobbit" section="1" manpath="/usr/local/man"/>
        </documentation>
    </template>
</service>
</service_bundle>

Q. How do I enable devmon on hobbit server ?
A. This http://www.techagent.com/devmon_snmp_hobbit_setup.htm has some procedures to enable devmon on hobbit.

Q. How do I remove a test ?
A. On Tue, Sep 05, 2006 at 08:07:34AM +0200, Ulric Eriksson wrote:
> I have figured out how to remove a single test from one host, or all tests from a single host.

The command,  bb 127.0.0.1 "hobbitdboard" is your friend, combined with a bit of scripting. E.g:

> Is it possible to remove a single test from *all* hosts?

bb 127.0.0.1 "hobbitdboard test=MYTEST fields=hostname" | while read H; do bb 127.0.0.1 "drop $H MYTEST"; done

> Or all tests from all hosts?

bb 127.0.0.1 "hobbitdboard test=info fields=hostname" | while read H; do bb 127.0.0.1 "drop $H"; done

> Or all tests that are purple?

bb 127.0.0.1 "hobbitdboard color=purple fields=hostname,testname" | while read L; do
    HOST=`echo $L | cut -d'|' -f1`
    TEST=`echo $L | cut -d'|' -f2`
    bb 127.0.0.1 "drop $HOST $TEST"
done
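Before dropping anything for real, it can help to preview the commands such a loop would run. The sketch below assumes the hobbitdboard output is one hostname|testname pair per line, as the cut -d'|' calls above imply. The function name make_drop_commands and the sample hosts are made up, and the drop commands are only printed, never executed.

```sh
#!/bin/sh
# Print (rather than run) one "drop" command per HOST|TEST line on stdin.
# IFS='|' lets "read" split the two fields without calling cut.
make_drop_commands() {
    while IFS='|' read -r HOST TEST; do
        # Skip lines that do not carry both fields.
        [ -n "$HOST" ] && [ -n "$TEST" ] || continue
        echo "bb 127.0.0.1 \"drop $HOST $TEST\""
    done
}

# Sample input in the hostname|testname layout (made-up hosts).
printf '%s\n' 'web1.example.com|http' 'db1.example.com|disk' | make_drop_commands
```

Once the printed list looks right, the output of the real hobbitdboard query could be piped through the same function and then into sh.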

Q. How can I request the hobbit server to ask hobbit client to run a command locally based on an alert.
A. Run this as a client extension:

#!/bin/sh

# Get the current status of the "msgs" column
MSGSSTATUS=`$BB $BBDISP "query $MACHINE.msgs" | awk '{ print $1 }'`

# Get the command we must run from the client config
CMD=`grep "^msgsrecovercmd:" $BBTMP/logfetch.$MACHINEDOTS.cfg | sed -e 's!^msgsrecovercmd:!!'`

# If "msgs" is red and there is a command, run it
if test "$MSGSSTATUS" = "red" -a "$CMD" != ""
then
    $CMD
fi

exit 0

Before doing this, consider the security implications of having your servers run commands that they fetch from a remote host without authentication.

Regards, Henrik

Q. How do I configure "GROUP" alerts ?
A. The "GROUP" keyword is used in hobbit to classify process names or disk partition names into different groups. This feature is needed in a big IT environment, which usually has different teams responsible for different areas of the IT infrastructure. For example, an IT organization usually consists of network, data/storage, backup, database, application and Unix teams.

 * hobbit-client.cfg : Here we specify which process or partition alerts which GROUP.
 * hobbit-alerts.cfg : Here we specify which email address receives each GROUP alert.

Example
This is a simple setup for learning purposes. Assume that when / is over 93% usage we want the Unix team to be paged, and when /boot is over 15 percent the apps team needs to be alerted. Also, if the cron process is dead, the Unix team should be alerted, and when there are more than 120 Xvnc processes the apps team needs to be alerted as well.

hobbit-client.cfg
HOST=t-rh9.mywork.com
        DISK / 93 98           GROUP=UNIX_TEAM_PARTITION
        DISK /boot 15 20       GROUP=APPS_TEAM_PARTITION
        PROC cron 1 -1 yellow  GROUP=UNIX_TEAM_PROCESS
        PROC Xvnc 1 120 yellow GROUP=APPS_TEAM_PROCESS
        PROC defunct 0 0  red
        LOG /var/log/messages WARNING COLOR=yellow
        LOG /var/log/maillog  WARNING COLOR=yellow
        LOG /var/opt/hobbitclient42/log/clientlaunch.log WARNING COLOR=yellow
        LOG /var/opt/hobbitclient42/log/hobbitclient.log WARNING
        FILE /etc/passwd SIZE>0  OWNERID=root  COLOR=yellow

hobbit-alerts.cfg
The names after GROUP have to be exactly the same as those used in hobbit-client.cfg.

GROUP=UNIX_TEAM_PROCESS
        MAIL site02unix-admin-email@site02ad2141.mywork.com FORMAT=TEXT

GROUP=UNIX_TEAM_PARTITION
        MAIL site02unix-admin-email@site02ad2141.mywork.com FORMAT=TEXT

GROUP=APPS_TEAM_PROCESS
        MAIL site02unix-admin-email@site02ad2141.mywork.com FORMAT=TEXT

GROUP=APPS_TEAM_PARTITION
        MAIL site02unix-admin-email@site02ad2141.mywork.com FORMAT=TEXT

Debugging
Hobbit provides a very powerful debugging tool for tracing the alert rules. In the following examples we check a host against all the rules in hobbit-alerts.cfg.

 * How do I debug the "Oversize data/client msg" error message?
 * The following is an example error message:
Oversize data/client msg from 10.5.64.212 truncated (n=2068326, limit 1961984)
First line: linux2.test.com|linux|linux
 * Check the size of the msg file that was sent from the hobbit client side:
# ls -l msg.linux2.test.com.txt
-rw-r--r-- 1 hobbitc hobbitc 2068297 Jan 29 07:54 msg.linux2.test.com.txt
 * Find out why msg.linux2.test.com.txt is so big.
 * Set up the debugging environment by running bbcmd. It sets up all the hobbit environment variables.
bash-3.00$ bin/bbcmd
2007-07-23 14:28:21 Using default environment file /etc/opt/hobbitserver42/hobbitserver.cfg
bash-3.00$
 * Also have a look at the command syntax of hobbitd_alert:
$ bin/hobbitd_alert --debug --test
Usage: hobbitd_alert --test HOST SERVICE [options]
Possible options:
[--duration=SECONDS] [--color=COLOR] [--group=GROUPNAME] [--time=TIMESPEC]
$ bash
bash-3.00$

 * Using the hobbitd_alert debugging command: a successful match. The /boot disk usage really has to be over 16%.
 * Look for the "Match with" keywords to locate the exact rule that matched.

bash-3.00$ bin/hobbitd_alert --debug --test t-rh9.mywork.com disk --group=APPS_TEAM_PARTITION
2007-07-23 14:38:57 Opening file /etc/opt/hobbitserver42/bb-hosts
2007-07-23 14:38:57 Opening file /etc/opt/hobbitserver42/hobbit-alerts.cfg
2007-07-23 14:38:57 Compiling regex (t-rh9).mywork.com
2007-07-23 14:38:57 Compiling regex (t-rh9).mywork.com
2007-07-23 14:38:57 send_alert t-rh9.mywork.com:DISK state 0
00018286 2007-07-23 14:38:57 send_alert t-rh9.mywork.com:DISK state Paging
00018286 2007-07-23 14:38:57 Matching host:service:page 't-rh9.mywork.com:DISK:' against rule line 122
00018286 2007-07-23 14:38:57 Failed 'HOST=$site02test SERVICE=cpu,disk,memory,files,telnet' (service not in include list)
00018286 2007-07-23 14:38:57 Matching host:service:page 't-rh9.mywork.com:DISK:' against rule line 125
00018286 2007-07-23 14:38:57 Failed 'HOST=$site02test SERVICE=conn' (service not in include list)
00018286 2007-07-23 14:38:57 Matching host:service:page 't-rh9.mywork.com:DISK:' against rule line 129
00018286 2007-07-23 14:38:57 Failed 'GROUP=UNIX_TEAM_PROCESS' (group not in include list)
00018286 2007-07-23 14:38:57 Matching host:service:page 't-rh9.mywork.com:DISK:' against rule line 132
00018286 2007-07-23 14:38:57 Failed 'GROUP=UNIX_TEAM_PARTITION' (group not in include list)
00018286 2007-07-23 14:38:57 Matching host:service:page 't-rh9.mywork.com:DISK:' against rule line 135
00018286 2007-07-23 14:38:57 Failed 'GROUP=APPS_TEAM_PROCESS' (group not in include list)
00018286 2007-07-23 14:38:57 Matching host:service:page 't-rh9.mywork.com:DISK:' against rule line 138
00018286 2007-07-23 14:38:57 *** Match with 'GROUP=APPS_TEAM_PARTITION' ***
2007-07-23 14:38:57 Found a first matching rule
00018286 2007-07-23 14:38:57 Matching host:service:page 't-rh9.mywork.com:DISK:' against rule line 138
00018286 2007-07-23 14:38:57 *** Match with 'GROUP=APPS_TEAM_PARTITION' ***
2007-07-23 14:38:57  repeat t-rh9.mywork.com|DISK|mail|site02unix-admin-email@site02ad2141.mywork.com at 0
2007-07-23 14:38:57  Alert for t-rh9.mywork.com:DISK to site02unix-admin-email@site02ad2141.mywork.com
00018286 2007-07-23 14:38:57 Mail alert with command 'mailx -s "Hobbit [12345] t-rh9.mywork.com:DISK CRITICAL (RED)" site02unix-admin-email@site02ad2141.mywork.com'
2007-07-23 14:38:57 No more secondary matching rule
bash-3.00$

 * Another check. The Xvnc process limit really has to be triggered.

bash-3.00$ bin/hobbitd_alert --debug --test t-rh9.mywork.com PROC --group=APPS_TEAM_PARTITION
2007-07-23 14:55:38 Opening file /etc/opt/hobbitserver42/bb-hosts
2007-07-23 14:55:38 Opening file /etc/opt/hobbitserver42/hobbit-alerts.cfg
2007-07-23 14:55:38 Compiling regex (t-rh9).mywork.com
2007-07-23 14:55:38 Compiling regex (t-rh9).mywork.com
2007-07-23 14:55:38 send_alert t-rh9.mywork.com:PROC state 0
00018526 2007-07-23 14:55:38 send_alert t-rh9.mywork.com:PROC state Paging
00018526 2007-07-23 14:55:38 Matching host:service:page 't-rh9.mywork.com:PROC:' against rule line 122
00018526 2007-07-23 14:55:38 Failed 'HOST=$site02test SERVICE=cpu,disk,memory,files,telnet' (service not in include list)
00018526 2007-07-23 14:55:38 Matching host:service:page 't-rh9.mywork.com:PROC:' against rule line 125
00018526 2007-07-23 14:55:38 Failed 'HOST=$site02test SERVICE=conn' (service not in include list)
00018526 2007-07-23 14:55:38 Matching host:service:page 't-rh9.mywork.com:PROC:' against rule line 129
00018526 2007-07-23 14:55:38 Failed 'GROUP=UNIX_TEAM_PROCESS' (group not in include list)
00018526 2007-07-23 14:55:38 Matching host:service:page 't-rh9.mywork.com:PROC:' against rule line 132
00018526 2007-07-23 14:55:38 Failed 'GROUP=UNIX_TEAM_PARTITION' (group not in include list)
00018526 2007-07-23 14:55:38 Matching host:service:page 't-rh9.mywork.com:PROC:' against rule line 135
00018526 2007-07-23 14:55:38 Failed 'GROUP=APPS_TEAM_PROCESS' (group not in include list)
00018526 2007-07-23 14:55:38 Matching host:service:page 't-rh9.mywork.com:PROC:' against rule line 138
00018526 2007-07-23 14:55:38 *** Match with 'GROUP=APPS_TEAM_PARTITION' ***
2007-07-23 14:55:38 Found a first matching rule
00018526 2007-07-23 14:55:38 Matching host:service:page 't-rh9.mywork.com:PROC:' against rule line 138
00018526 2007-07-23 14:55:38 *** Match with 'GROUP=APPS_TEAM_PARTITION' ***
2007-07-23 14:55:38  repeat t-rh9.mywork.com|PROC|mail|site02unix-admin-email@site02ad2141.mywork.com at 0
2007-07-23 14:55:38  Alert for t-rh9.mywork.com:PROC to site02unix-admin-email@site02ad2141.mywork.com
00018526 2007-07-23 14:55:38 Mail alert with command 'mailx -s "Hobbit [12345] t-rh9.mywork.com:PROC CRITICAL (RED)" site02unix-admin-email@site02ad2141.mywork.com'
2007-07-23 14:55:38 No more secondary matching rule
bash-3.00$
Q. How do I get trimhistory to work for a hobbit server on Solaris ?
A. The default example in the hobbit manpage is for a hobbit server on Linux.
 * Run bbcmd first to inherit BBHIST and the other variables.
 * 31676645 is roughly one year expressed in seconds.
 * ./trimhistory --debug --cutoff=`/usr/bin/perl -e 'printf "%d\n", (time-31676645);'`
 * References : http://solarisjedi.blogspot.com/2006/06/solaris-date-command-and-epoch-time.html
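The cutoff calculation can also be wrapped in a small script so the age is given in days rather than as a raw number of seconds. This is a sketch only: the function names now_epoch and cutoff_epoch are made up, and it assumes perl is available as a fallback on systems whose date command does not support %s.

```sh
#!/bin/sh
# Current time in epoch seconds. Use "date +%s" where it works and fall
# back to Perl where it does not.
now_epoch() {
    now=`date +%s 2>/dev/null`
    case "$now" in
        ''|*[!0-9]*) perl -e 'print time, "\n"' ;;
        *) echo "$now" ;;
    esac
}

# Cutoff for trimhistory: now minus a number of days (default 365).
cutoff_epoch() {
    days="${1:-365}"
    echo $(( $(now_epoch) - days * 24 * 60 * 60 ))
}

# Only prints the command line; run it by hand after bbcmd.
echo "./trimhistory --debug --cutoff=`cutoff_epoch 365`"
```

Printing the command instead of running it leaves the decision to trim history with the operator.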

Q. How do I exclude "info" and "trends" columns from the NK overview page?
A. As of Hobbit 4.1.2 it is hard-coded that the "info" and "trends" columns show up on all pages, including the NK and BB2 pages. If you want those columns removed you'll have to edit the Hobbit source code. The change is pretty simple. In the hobbit-4.1.2/bbdisplay/pagegen.c file, lines 121-123 look like this:

/* TRENDS and INFO columns are always included on non-BB pages */
if (strcmp(column->name, xgetenv("INFOCOLUMN")) == 0) return 1;
if (strcmp(column->name, xgetenv("TRENDSCOLUMN")) == 0) return 1;

Change the "return 1" on both lines to "return 0", save the file, run "make" and either run "make install", or copy the bbdisplay/bbgen program to ~hobbit/server/bin/. Next time the NK page is updated, those columns will be gone.

Q. How do I use the internal HTTP test feature of Hobbit to test a Squid proxy server?
A. If you want to check that the service is actually functional use something like this in your bb-hosts file:
0.0.0.0  squid.domain.com   # http://squid.domain.com:8080/http://www.google.com/

Q. How do I use the internal HTTP test feature of Hobbit to test a proxy server with authentication to a Windows domain?
A. If you want to check that the service is actually functional use something like this in your bb-hosts file:
0.0.0.0  servername.domain.com   # \
http://domain\username:password@servername.domain.com:8080/http://www.google.com/

Q. How do I compare graphs from different hosts on one page?
A. As of Hobbit 4.1.2 there isn't a front-end to build the URLs needed for the graphs, but you can do it by hand for any graphs with -multi definitions (e.g. load average, swap):

 * 1) Find the base graph you want, e.g. the cpu "load average" graph for one of your hosts.
 * 2) In your browser, right-click the graph and select "view image" or "open image". You now have a view of your load graph only.
 * 3) In the address bar field you'll see the URL for this image. E.g.

http://hobbit.domain.com/hobbit-cgi/hobbitgraph.sh?host=host1.domain.com&service=la&graph_width=576&graph_height=120&disp=host1%2edomain%2ecom&nostale&graph=hourly&action=view

Now, you can add more hosts after the "host=..." part of the URL. Just list all of your hosts separated by commas, like this:

http://hobbit.domain.com/hobbit-cgi/hobbitgraph.sh?host=host1.domain.com,host2.domain.com,host3.domain.com&service=la&graph_width=576&graph_height=120&disp=host1%2edomain%2ecom&nostale&graph=hourly&action=view
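Building the comma-separated host list by hand gets tedious with many hosts, so it can be scripted. The sketch below only reuses the query parameters seen in the example URL. The function name multi_host_graph_url is made up, and the disp value is simply taken from the first host with its dots written as %2e, which is an assumption based on that example.

```sh
#!/bin/sh
# Build a hobbitgraph.sh URL for one service across several hosts.
# $1 = URL of the CGI, $2 = service name, remaining args = hostnames.
multi_host_graph_url() {
    base="$1"; service="$2"; shift 2
    # disp: first hostname with dots written as %2e, as in the example URL.
    disp=`echo "$1" | sed 's/\./%2e/g'`
    hosts=""
    for h in "$@"; do
        hosts="${hosts:+$hosts,}$h"
    done
    echo "${base}?host=${hosts}&service=${service}&graph_width=576&graph_height=120&disp=${disp}&nostale&graph=hourly&action=view"
}

multi_host_graph_url http://hobbit.domain.com/hobbit-cgi/hobbitgraph.sh la \
    host1.domain.com host2.domain.com host3.domain.com
```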

Q. I just upgraded from the BigBrother client to the Hobbit client and I don't get any status for CPU or disk, but I get status for other tests
A. Most common cause is that the hostname used by the client is different from the hostname you have in your bb-hosts file. On the client, what's the hostname reported by the "uname -n" command ? If that is different from the hostname you have in the bb-hosts file, start the client with the "--hostname=THE.REAL.HOSTNAME" option.
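The hostname comparison can be scripted when there are many clients to check. This sketch assumes a copy of the bb-hosts file is at hand and that host entries use the "IP hostname # tags" layout. The function name host_in_bbhosts and the sample file contents are made up for illustration.

```sh
#!/bin/sh
# Succeed if the given hostname is the second field of a non-comment
# line in a bb-hosts style file ("IP hostname # tags").
host_in_bbhosts() {
    awk -v h="$1" '$1 !~ /^#/ && $2 == h { found = 1 } END { exit !found }' "$2"
}

# Demonstration against a throwaway file with one made-up entry.
tmp="${TMPDIR:-/tmp}/bb-hosts.sample.$$"
printf '%s\n' '# sample' '10.0.0.5  web1.example.com  # http://web1.example.com/' > "$tmp"
if host_in_bbhosts "web1.example.com" "$tmp"; then
    echo "web1.example.com: listed"
fi
if ! host_in_bbhosts "web1" "$tmp"; then
    echo "web1: not listed, start the client with --hostname=..."
fi
rm -f "$tmp"
```

On a real client the first argument would be the output of "uname -n".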

Q. How do I fix "Oversize status msg from 192.168.1.31 for test.my.com:ports truncated (n=508634, limit=262144)"
A.

Try increasing the value of MAXMSG_STATUS in ~server/etc/hobbitserver.cfg. From the documentation:

MAXMSG_STATUS The maximum size of a "status" message in kB, default: 256. Status messages are the ones that end up as columns on the web display. The default size should be adequate in most cases, but some extension scripts can generate very large status messages - close to 1024 kB. You should only change this if you see messages in the hobbitd log file about status messages being truncated.

In the log line, limit=262144 is 256 kB. Divide the n value by 1024 (508634/1024 is about 497), then set MAXMSG_STATUS slightly higher, e.g. MAXMSG_STATUS="500", and restart the hobbit server.
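The conversion from the n value in the log to kB is simple integer arithmetic. A minimal sketch, rounding up so the result is never smaller than the message (the function name bytes_to_kb is made up):

```sh
#!/bin/sh
# Convert the byte count "n" from the hobbitd log into kB, rounded up.
bytes_to_kb() {
    echo $(( ($1 + 1023) / 1024 ))
}

# n=508634 is the value from the log line in the question.
echo "message size: `bytes_to_kb 508634` kB"
```

Any MAXMSG_STATUS value at or above the printed number covers that message; choosing 500 leaves a little headroom.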

B. On Wed, May 03, 2006 at 03:43:19PM +0200, Dominique Frise wrote:
> Hi,
>
> hobbitd.log
> 2006-05-03 12:34:27 Oversize data/client msg from 130.223.5.20 truncated (n=815825, limit 524288)
> First line: godzilla|sunos

"godzilla" - a Solaris host - sent a too-large "client" message of 815825 bytes. There's a limit set in Hobbit for the size of client message at 512 KB, so the message was truncated.

> [bb (at) iris hobbit]$ cat clientdata.log
> 2006-05-03 12:34:28 Worker process died with exit code 0, terminating

This is interesting. If the truncated message caused hobbitd_client to crash, I would have expected a different exit-code. I'll have to check how it handles truncated messages.

> How can this happend?

Don't know, but apparently some input from your host caused it.

> Has this been fixed in latest snapshot?

Probably not. Which version are you running?

> Which worker process died? (hobbitd_client is still running)

It's restarted automatically by hobbitlaunch.

Henrik
 * We need to investigate why the hobbit client message is oversize. As shown below, we found a msg.*.txt file that is over 512 kB. This is abnormal for the message a client collects from one system.

bash-3.00# wc msg.k206.test.com.txt
7943  55662  611936 msg.k206.test.com.txt
bash-3.00# ls -l msg.k206.test.com.txt
-rw-r--r--  1 hobbitc  hobbitc   611936 May  2 18:35 msg.v04k206.test.com.txt
bash-3.00#

 * Further investigation found that the [ports] section of msg.*.txt has too much output from the two "netstat -na" commands.

bash-3.00# grep netstat /opt/hobbitclient42/bin/hobbitclient-sunos.sh
netstat -rn
echo "[netstat]"
netstat -s
netstat -na -f inet -P tcp | tail +3
netstat -na -f inet6 -P tcp | tail +5
bash-3.00#
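To see which section of a client message is the large one, the lines per section can be counted. This is a sketch under the assumption that each section starts with a line of the form "[name]", as the echo "[netstat]" line in the client script suggests. The function name section_line_counts and the sample input are made up.

```sh
#!/bin/sh
# Count the lines in each "[section]" of a client message read from
# stdin and list the biggest sections first.
section_line_counts() {
    awk '
        /^\[.*\]$/ { name = substr($0, 2, length($0) - 2); next }
        name != "" { count[name]++ }
        END { for (n in count) print count[n], n }
    ' | sort -rn
}

# Small made-up sample; a real msg.*.txt file would be fed in instead.
printf '%s\n' '[uptime]' 'up 3 days' '[ports]' 'tcp a' 'tcp b' 'tcp c' | section_line_counts
```

Run against a real message file (section_line_counts < msg.HOSTNAME.txt), the first line of output names the section to look at.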

Q. I'd like to change the default period for graph display: by default hobbit displays trends with 4 graphs (48h, 12d, 48d, 576d periods). Is it possible to change it to 48h, 7d, 30d, 365d, for example?
A. It isn't configurable, but in the hobbit-4.2.0/web/hobbitgraph.c file near the top of the file you'll find these lines:

#define HOUR_GRAPH "e-48h"
#define DAY_GRAPH  "e-12d"
#define WEEK_GRAPH "e-48d"
#define MONTH_GRAPH "e-576d"

Change them to suit you. Then search that same file for the HOUR_GRAPH etc. further down; you'll find 1 place where each is used like this:

period = HOUR_GRAPH; persecs = 48*60*60;

and you need to change that "persecs" calculation also for all 4 graph types.

Change the legend:

//persecs = 12*24*60*60;
persecs = 7*24*60*60;
//glegend = "Last 12 Days";
glegend = "Last 7 Days";
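The persecs values for the periods asked about in the question can be worked out ahead of editing the C file. A small sketch of the arithmetic (the function name period_seconds is made up):

```sh
#!/bin/sh
# Seconds covered by a graph period written as <number>h or <number>d,
# e.g. 48h or 7d.
period_seconds() {
    n="${1%[hd]}"
    case "$1" in
        *h) echo $(( n * 60 * 60 )) ;;
        *d) echo $(( n * 24 * 60 * 60 )) ;;
        *)  echo "unsupported period: $1" >&2; return 1 ;;
    esac
}

for p in 48h 7d 30d 365d; do
    echo "$p -> persecs = `period_seconds $p`"
done
```

In the C source the same values can be left as products, such as 7*24*60*60, which keeps the intent readable.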

Then run "make" (from the hobbit-4.2.0 directory) and "make install" (or just copy the "web/hobbitgraph.cgi" file to your ~hobbit/server/bin/ directory).

Q. Why does my Hobbit server have the following http response graph ?
Why does the response time of the http service differ so much? Is this a misconfigured http server?



A. Hundreds of reasons could cause a delay on an http server. The graph only varies within a range of 2 to 10 ms, which is quite normal. Because of the automatic scaling, the differences in the graph look more drastic than they actually are.

Q. My Hobbit server has the following wrsmd* network traffic trending graph ?


A. Inbound traffic being much higher than outbound is normal. The Hobbit server receives lots of bb/hb messages from clients.

Q. What are those wrsmd* interfaces ?
A. The wrsmd(7D) (WCI Remote Shared Memory (WRSM) DLPI driver) status was reported by the Solaris 10 command "/usr/bin/kstat -p -s '[or]bytes64'" used for [ifstat] in hobbitclient-sunos.sh. I said "was" because on all our patched Solaris 10 servers, we have not seen this for quite a while.

You can avoid this output by using "/usr/bin/kstat -p -s '[or]bytes64' | grep -v wrsmd | sort" for [ifstat]. You need also to remove all ifstat.wrsmd*.rrd files.

Dominique UNIL - University of Lausanne

Q. Graph after correction
The wrsmd interfaces now disappeared from the graph.



Q. How can we use Hobbit/BBWin client to collect Bit/s on the Network interfaces?
A. I'm very interested in this question, but have not found an answer yet.

semop failed, Invalid argument ?
A. Frequently seen when hobbit dies ungracefully. Stop Hobbit, and run:

# ipcs |grep hobbit
0x0100ba76 162758665 hobbit    600        262144     2
0x0200ba76 162791434 hobbit    600        262144     2

And remove any remaining shared memory segments.

# ipcrm -M 0x0200ba76

bash-3.00# tail hobbitd.log
2008-02-18 12:28:00 semop failed, Invalid argument
2008-02-18 12:28:00 How did this happen? clients=-1, s.sem_op=0
2008-02-18 12:28:00 semop failed, Invalid argument
2008-02-18 12:28:00 How did this happen? clients=-1, s.sem_op=0
2008-02-18 12:28:00 semop failed, Invalid argument
2008-02-18 12:28:00 How did this happen? clients=-1, s.sem_op=0
2008-02-18 12:28:00 semop failed, Invalid argument
2008-02-18 12:28:00 How did this happen? clients=-1, s.sem_op=0
2008-02-18 12:28:00 semop failed, Invalid argument
2008-02-18 12:28:00 How did this happen? clients=-1, s.sem_op=0
bash-3.00#
bash-3.00#
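When several segments are left behind, the ipcrm commands can be generated from the ipcs listing instead of typed one by one. This sketch assumes the column layout of the listing shown in the answer (key in the first column, owner in the third); ipcs output differs between operating systems, so check the columns first. The function name print_ipcrm_commands is made up, and the commands are only printed, not run.

```sh
#!/bin/sh
# Read "ipcs" output on stdin and print (not run) an ipcrm command for
# each segment whose owner column matches the given user.
print_ipcrm_commands() {
    awk -v owner="$1" '$1 ~ /^0x/ && $3 == owner { print "ipcrm -M " $1 }'
}

# The two lines from the ipcs listing in the answer.
printf '%s\n' \
    '0x0100ba76 162758665 hobbit    600        262144     2' \
    '0x0200ba76 162791434 hobbit    600        262144     2' | print_ipcrm_commands hobbit
```

Review the printed commands while Hobbit is stopped, then run them as root.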