Community:Splunk for Nagios
From Splunk Wiki
Splunk-2-Nagios
This topic discusses monitoring the availability of Splunk with Nagios. If you're looking for information on monitoring and analyzing your Nagios logs with Splunk, point Splunk to the location of your Nagios logs the way you would for any plain text log file.
What is Nagios?
Nagios is an open source monitoring system designed to monitor resources using assorted plugins and SNMP. Out of the box, it supports monitoring of hosts and network services directly over TCP/IP and resource monitoring via SNMP scripts, and it provides the framework required to build your own plugins. For information on writing custom plugins, refer to the development site for Nagios plugins: http://nagiosplug.sourceforge.net/
What Splunk-2-Nagios does
The Splunk-2-Nagios integration will alert when splunkd or Splunk Web are down, and will send Splunk alerts to Nagios.
Note: The Splunk-2-Nagios application currently does not support having Nagios alert on a Splunk license violation. This will be resolved in a future version of the Splunk-2-Nagios application.
What Nagios logs
Nagios has a default log file called nagios.log kept in the base directory under var. If your base directory is /usr/pkg/nagios, the location is /usr/pkg/nagios/var/nagios.log.
Service and host events are logged to this main log file for historical purposes. Configure the log file name and location via the log_file directive in nagios.cfg.
To log messages to the syslog facility as well, set the use_syslog option to 1. Otherwise set it to 0.
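To confirm where a particular installation writes its log, the two directives described above can be read directly from nagios.cfg. The following is a minimal sketch: the helper name nagios_cfg_value and the default path are assumptions for illustration and are not part of Nagios or Splunk-2-Nagios.

```shell
#!/bin/sh
# Print the value of a "name=value" directive from a Nagios main config file.
# nagios_cfg_value is a hypothetical helper, not something Nagios ships.
nagios_cfg_value() {
    # $1 = directive name, $2 = path to nagios.cfg
    # Commented-out lines do not match because the pattern is anchored at the
    # start of the line; if the directive appears twice, the last one wins.
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" | tail -n 1
}

# Example path only; adjust it to your own base directory.
NAGIOS_CFG="${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}"

if [ -r "$NAGIOS_CFG" ]; then
    echo "log_file   = $(nagios_cfg_value log_file "$NAGIOS_CFG")"
    echo "use_syslog = $(nagios_cfg_value use_syslog "$NAGIOS_CFG")"
fi
```

The log_file value printed here is the path you would point Splunk at if you also want to index the Nagios log.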
Download Splunk-2-Nagios
Download Splunk for Nagios from SplunkBase.
Install Splunk-2-Nagios
1. Determine the location of your Nagios install. Commonly, this is /usr/local/nagios.
2. Edit install.sh and modify the DEFAULT_ lines at the top of the file. Their definitions are as follows:
- DEFAULT_NAGIOS_DIR = directory containing the Nagios components
- DEFAULT_NAGIOS_CFG = file to which the Nagios commands should be added. You can put this into the default etc/nagios.cfg or into etc/objects/commands.cfg if you have them separated that way.
- DEFAULT_NAGIOS_SERVICES_CFG = file to which the Nagios services should be added. You can put this into the default etc/nagios.cfg or into etc/objects/localhost.cfg if you have them separated that way.
- DEFAULT_SPLUNK_DIR = bin directory of your Splunk installation
- DEFAULT_IP_ADDRESS = IP of your Nagios server
- DEFAULT_NAGIOS_USER = user Nagios runs as
- DEFAULT_NAGIOS_GROUP = group Nagios runs as
For example:
DEFAULT_NAGIOS_DIR=/usr/local/nagios/libexec
DEFAULT_NAGIOS_CFG=/usr/local/nagios/etc/objects/commands.cfg
DEFAULT_NAGIOS_SERVICES_CFG=/usr/local/nagios/etc/objects/localhost.cfg
DEFAULT_SPLUNK_DIR=/opt/splunk/bin
DEFAULT_IP_ADDRESS=127.0.0.1
DEFAULT_NAGIOS_USER=nagios
DEFAULT_NAGIOS_GROUP=nagcmd
3. Run the ./install.sh script.
This prompts you with several questions and default answers, and adds the commands to the misccommands file within the Nagios install.
You should see the following message:
Installation complete!
Read the included documentation for more information on how to configure Nagios to use the Splunk integration components.
To confirm that the Splunk commands were added, check /usr/local/nagios/etc/objects/commands.cfg (or whichever file you set as DEFAULT_NAGIOS_CFG in step 2).
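The check described above can be scripted. The sketch below counts the lines that mention Splunk in the commands file and then asks Nagios to validate its configuration with its -v option. The helper name count_splunk_entries and all three default paths are assumptions; substitute the locations you used during installation.

```shell
#!/bin/sh
# Count the lines in a Nagios object file that mention Splunk.
# count_splunk_entries is a hypothetical helper name used for illustration.
count_splunk_entries() {
    # grep -c prints the number of matching lines; -i ignores case.
    grep -ci 'splunk' "$1"
}

# Assumed default locations; override them through the environment if needed.
COMMANDS_CFG="${COMMANDS_CFG:-/usr/local/nagios/etc/objects/commands.cfg}"
NAGIOS_BIN="${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}"
NAGIOS_CFG="${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}"

if [ -r "$COMMANDS_CFG" ]; then
    echo "Lines mentioning splunk in $COMMANDS_CFG: $(count_splunk_entries "$COMMANDS_CFG")"
fi

# nagios -v runs a pre-flight check of the whole configuration
# without starting the daemon.
if [ -x "$NAGIOS_BIN" ]; then
    "$NAGIOS_BIN" -v "$NAGIOS_CFG"
fi
```

A count of zero suggests that install.sh wrote to a different file than the one being inspected.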
What gets installed
The following components get installed in /usr/local/nagios/libexec (or whatever directory you specified during the installation).
check_splunk
handle_alert
splunk_service_notification
splunk_host_notification
getSplunkLicenseInfo.py
The following services are added to etc/objects/localhost.cfg (or whatever file you specified during the installation).
splunk port status
splunk procs status
splunk license status
Send Nagios host alerts with Splunk URL via email
Send Nagios service alerts with Splunk URL via email
These are the default services that get added. You can modify them to take advantage of the optional parameters that check_splunk accepts. To list those parameters, run check_splunk without arguments:
$ libexec/check_splunk
No plugin arguments specified.
check_splunk - Nagios plugin for Splunk
Copyright (c) 2005-2008 Splunk, Inc.
Usage: ./check_splunk {license|ports|procs|search} [options]
license = Checks the status of the Splunk Server license.
This argument accepts optional parameters. Full command usage is shown below:
license [-cd {critval}] [-wd {warnval}] [-cu {critval}] [-wu {warnval}] [-s {url} (http(s)://server:port)]
-cd {critval} = Return a CRITICAL state if the license expires in less than {critval} days.
-wd {warnval} = Return a WARNING state if the license expires in less than {warnval} days.
-cu {critval} = Return a CRITICAL state if the number of daily bytes indexed exceeds
{critval} percent of license allowance.
-wu {warnval} = Return a WARNING state if the number of daily bytes indexed exceeds
{warnval} percent of license allowance.
-s {url} = Remote Splunk server. Defaults to https://localhost:8089 if not defined
ports = Checks the status of the Splunk Server TCP sockets.
procs = Checks the status of the splunkd and splunkWeb processes.
search = Checks the Splunk Server for matches to a query.
This argument requires additional parameters. Full command usage is shown below:
search [-c {critval}] [-w {warnval}] [-u {user}] [-p password] {queryString}
-c {critval} = Return a CRITICAL state if number of matches surpasses specified value.
-w {warnval} = Return a WARNING state if number of matches surpasses specified value.
Note: {critval} can be either greater or lesser than {warnval}.
-u {user} = Splunk Enterprise user account
-p {password} = Password for Splunk Enterprise user's account
{queryString} = Any valid query string. The results will be piped to count(_raw) to force a return count
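As an illustration of the usage text above, the following sketch runs two of the documented modes and translates the exit status into a state name. It assumes that check_splunk follows the standard Nagios plugin convention (0 = OK, 1 = WARNING, 2 = CRITICAL, anything else = UNKNOWN). The helper nagios_state_name, the install path, and the threshold values are all illustrative assumptions.

```shell
#!/bin/sh
# Map a plugin exit status to its conventional Nagios state name.
# nagios_state_name is a hypothetical helper for illustration.
nagios_state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Assumed install location; adjust to the directory you chose.
CHECK_SPLUNK="${CHECK_SPLUNK:-/usr/local/nagios/libexec/check_splunk}"

if [ -x "$CHECK_SPLUNK" ]; then
    # Example thresholds only: warn 30 days before license expiry and go
    # critical at 7 days; warn at 80 percent of the daily indexing allowance
    # and go critical at 95 percent.
    "$CHECK_SPLUNK" license -wd 30 -cd 7 -wu 80 -cu 95 -s https://localhost:8089
    echo "license check state: $(nagios_state_name $?)"

    # The procs mode takes no further options.
    "$CHECK_SPLUNK" procs
    echo "procs check state: $(nagios_state_name $?)"
fi
```

The same argument strings can be placed in the service definitions that install.sh added, in place of the defaults.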
Deployment notes
Edit the following block in handle_alert:
# URL to local Splunk server
# Change this to match the address of your local server. Do NOT include a trailing slash!
SPLUNK_BASEURL="http://192.168.1.1"   # Your Splunk server's IP
# Default Nagios variables
DEFAULT_NAGIOS_COMMAND_FILE="/usr/local/nagios/var/rw/nagios.cmd"
DEFAULT_NAGIOS_HOSTNAME="somehost"   # Host that you are alerting from
DEFAULT_NAGIOS_SERVICEDESCRIPTION="someservice"   # Service that you are alerting about
DEFAULT_NAGIOS_RETURNCODE=1   # Severity of the service
DEFAULT_NAGIOS_OUTPUTPREPEND="An Alert was just triggered!"   # Name this whatever you like
In check_splunk, splunk_host_notification, and splunk_service_notification, make sure the $SPLUNK_HOME variable is set:
#
# Setup the running environment before we do anything.
#
if [ -z "${SPLUNK_HOME}" ] ; then
    SPLUNK_HOME=/opt/splunk
fi
Note: For the services specified by splunk_host_notification and splunk_service_notification to work, the user that Nagios runs as must be able to send email. To test this, open a terminal, su to that user (for example, nagios), and type:
/bin/echo "some data" | /usr/bin/mail -s "some subject" somemailid@somedomain.com
A mail should arrive at somemailid@somedomain.com if it worked.
On Mac OS X, for example, you must first run:
sudo chmod 775 $TMPDIR
for the mail command to work.
Also, because the $CONTACTEMAIL$ macro is available only to notifications and not to event handlers, the integration uses the $CONTACTGROUPMEMBERS$ macro (available from Nagios 3 onwards). Ensure that a contactgroup named 'admins' exists and that its members field is set to the email ID to which notifications should be sent. For example:
define contactgroup{
    contactgroup_name    admins
    alias                Nagios Administrators
    members              somemailid@somedomain.com
}
You will probably also need to define a matching contact for this contactgroup to work.
Restart Nagios for these changes to take effect.
Enable a remote integration
To enable a remote integration, you must configure NRPE or some other remote Nagios plugin.
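With NRPE, one possible arrangement is to define the check_splunk modes as NRPE commands on the Splunk host and call them from the Nagios server with check_nrpe. The sketch below only prints the nrpe.cfg lines that would be needed. The helper nrpe_command_line, the command names check_splunk_*, and the plugin directory are assumptions for illustration; Splunk-2-Nagios itself does not install any of them.

```shell
#!/bin/sh
# Emit an nrpe.cfg command definition for one check_splunk mode.
# nrpe_command_line is a hypothetical helper; the command names are illustrative.
nrpe_command_line() {
    # $1 = check_splunk mode (license, ports or procs), $2 = plugin directory
    echo "command[check_splunk_$1]=$2/check_splunk $1"
}

# Assumed plugin directory on the Splunk host.
PLUGIN_DIR="${PLUGIN_DIR:-/usr/local/nagios/libexec}"

# Lines to add to nrpe.cfg on the Splunk host:
for mode in license ports procs; do
    nrpe_command_line "$mode" "$PLUGIN_DIR"
done

# On the Nagios server, the matching check would then be invoked as:
#   $PLUGIN_DIR/check_nrpe -H <splunk-host> -c check_splunk_procs
```

After adding the lines to nrpe.cfg, restart the NRPE daemon on the Splunk host so that it picks up the new commands.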