Tuesday, July 21, 2009

Configuring Nagios - Overview

Since my original posting "How-To Install Nagios on Ubuntu Linux", I have received many requests for more information about configuring Nagios. Figuring this out was no small undertaking for me. I want to start with an overview of the files that Nagios uses for configuration, starting and stopping Nagios, and some other helpful information.

Configuration files:
Depending on the options given during installation, Nagios puts its configuration files in either /usr/local/nagios/etc or /etc/nagios:
  • nagios.cfg - main Nagios server configuration file
  • commands.cfg - some predefined commands
  • templates.cfg - some predefined templates for monitored hosts
  • timeperiods.cfg - time periods, such as monitoring windows or working hours
  • cgi.cfg - the configuration of the web interface
  • resource.cfg - user-defined macros
The files above handle the "general" configuration of Nagios. Adding and configuring hosts is so flexible that it can actually be difficult. There are many ways to set up hosts, host groups, services, and service groups. I have seen a single file with all of them defined; I have also seen the definitions broken down into one config file per site, or even one config file per object. There is a balance there, and you just have to find it. If you aren't sure what you want to do, keep them all in one file for your own sanity.
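
To make the "everything in one file" option concrete, here is a minimal sketch of a host and one service defined together. It assumes the stock sample configuration is installed, so that the linux-server and generic-service templates (templates.cfg) and the check_ping command (commands.cfg) exist; the host name web01 and its alias are made up for illustration.

```
# One host and one service kept in the same file (illustrative values).
define host {
    use         linux-server        ; template assumed to come from templates.cfg
    host_name   web01               ; hypothetical name
    alias       Example web server
    address     10.1.2.2
}

define service {
    use                  generic-service   ; template assumed to come from templates.cfg
    host_name            web01
    service_description  PING
    check_command        check_ping!100.0,20%!500.0,60%   ; warning and critical thresholds
}
```

Nagios only reads a file like this if nagios.cfg points to it with a cfg_file= or cfg_dir= directive.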

Although the "checks" aren't configuration files, they are integral to configuring the system. All of the check plugins for hosts and services are located in /usr/local/nagios/libexec. They may be written in a variety of languages (such as Perl), and they don't all take the same arguments. The easiest way to find the proper syntax is to run a check with the -h switch, which brings up its help.

Every check must report one of a fixed set of states back to the server (OK, Warning, Critical, or Unknown), so they all behave the same in that respect. The difference lies in the type of check you are performing. I find it best to start out by running the check from the command line to make sure it works. This can be done by changing to the libexec directory and running a check:
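
The state a check reports is carried in its exit code: 0 for OK, 1 for Warning, 2 for Critical, and 3 for Unknown, with a one-line message printed for the web interface. The sketch below shows that convention in a small shell function. The name check_free_pct and its arguments are hypothetical, and treating the thresholds as "less than" is an assumption of this sketch; it is not how the real check_disk is implemented.

```shell
#!/bin/sh
# Hypothetical helper: map a free-space percentage to a Nagios plugin state.
# Return codes follow the plugin convention: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.
check_free_pct() {
    # Expect exactly three arguments: FREE, WARN and CRIT (whole-number percentages).
    if [ "$#" -ne 3 ]; then
        echo "DISK UNKNOWN - usage: check_free_pct FREE WARN CRIT"
        return 3
    fi
    for n in "$1" "$2" "$3"; do
        case "$n" in
            ''|*[!0-9]*)
                echo "DISK UNKNOWN - arguments must be whole numbers"
                return 3 ;;
        esac
    done
    # Test the critical threshold first, since it is the more severe state.
    if [ "$1" -lt "$3" ]; then
        echo "DISK CRITICAL - $1% free"
        return 2
    elif [ "$1" -lt "$2" ]; then
        echo "DISK WARNING - $1% free"
        return 1
    fi
    echo "DISK OK - $1% free"
    return 0
}
```

A script meant to be called by Nagios would end with something like check_free_pct "$@"; exit $? so that the function's return code becomes the plugin's exit code.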

./check_disk -w 10% -c 5% -p /dev/sda1

The above check looks at the free space on the /dev/sda1 partition (selected with -p). check_disk only inspects the machine it runs on; to check a disk on a remote host such as 10.1.2.2, the plugin has to run on that host, for example through NRPE. If free space drops below 10%, you will receive a warning (-w), and if it drops below 5%, a critical alert (-c) will be sent.

If, like me, you install this on a Red Hat derivative, you can use (sudo) service nagios restart to restart Nagios. If you use a Debian derivative, you can use (sudo) /etc/init.d/nagios restart.
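
Before restarting, it can help to let Nagios check its own configuration, since a broken config file will keep the daemon from coming back up. The nagios binary's -v switch does this. The sketch below wraps it in a small function; the function name is hypothetical, and the default paths assume a source install under /usr/local/nagios, so adjust them if your files live in /etc/nagios.

```shell
#!/bin/sh
# Hypothetical wrapper: run the Nagios pre-flight check (nagios -v) on a config file.
# Default paths assume a source install under /usr/local/nagios.
verify_nagios_config() {
    bin=${1:-/usr/local/nagios/bin/nagios}
    cfg=${2:-/usr/local/nagios/etc/nagios.cfg}
    if [ ! -x "$bin" ]; then
        echo "nagios binary not found at $bin" >&2
        return 1
    fi
    # -v verifies the configuration and exits non-zero if it finds errors.
    "$bin" -v "$cfg"
}
```

With that in place, something like verify_nagios_config && sudo /etc/init.d/nagios restart only restarts Nagios when the check passes.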

More to come...
