AllGoodBits.org

More Automated Monitoring with Nagios and Puppet

This article continues and improves upon a previous article; you should read that one first.

There are several weaknesses with the previous setup, but the big one at the moment is that I can't edit my nginx puppet module and add code like:

@@nagios_service { "nginx-${fqdn}":
  host_name => $fqdn,
  check_command => 'check_http -I $HOSTADDRESS$ -w 3 -c 5 -u "http://${ipaddress}/server-status"'
}

In other words, I want to be able to have my puppet declarations automatically add nagios configuration to the monitoring server when a node is declared as a webserver or a database server or whatever.

The approach that I will detail draws upon Adam Kosmin's article in Linux Journal and Matt Gurski's.

The first two files, init.pp and params.pp, are standard puppet module fare: nothing interesting, and supplied only for completeness.

modules/nagios/manifests/init.pp:

class nagios {
  include nagios::params
}

modules/nagios/manifests/params.pp:

class nagios::params {

  $resource_dir = '/etc/nagios/resource.d'
  $user         = 'nagios'

  case $::operatingsystem {

    centos,redhat,fedora: {
      $service = 'nagios'
    }
    debian: {
      $service = 'nagios3'
    }
    solaris: {
      $service = 'cswnagios'
    }
    default: {
      fail("This module is not supported on $::operatingsystem")
    }
  }
}

The following files, resource.pp and resource/*.pp are the interesting ones. They create an abstraction for puppet's nagios_* resource types.

modules/nagios/manifests/resource.pp:

define nagios::resource(
  $export,
  $type,
  $host_use = 'generic-host',
  $ensure = 'present',
  $owner = 'nagios',
  $address = '',
  $hostgroups = '',
  $servicegroups = '',
  $check_command = ''
) {

  include nagios::params

  # figure out where to write the file
  # replace spaces with an underscore and convert everything to lowercase
  $target = inline_template("${nagios::params::resource_dir}/${type}_<%=name.gsub(/\\s+/, '_').downcase %>.cfg")

  case $export {
    true, false: {}
    default: { fail("The export parameter must be set to true or false.") }
  }

  case $type {
    host: {
      nagios::resource::host { $name:
        ensure        => $ensure,
        use           => $host_use,
        check_command => $check_command,
        address       => $address,
        hostgroups    => $hostgroups,
        target        => $target,
        export        => $export,
      }
    }

    hostgroup: {
      nagios::resource::hostgroup { $name:
        ensure => $ensure,
        target => $target,
        export => $export,
      }
    }

    default: {
      fail("Unknown type passed to this define: $type")
    }
  }

  # create or export the file resource needed to support
  # the nagios type above
  nagios::resource::file { $target:
    ensure       => $ensure,
    export       => $export,
    resource_tag => "nagios_${type}",
    requires     => "Nagios_${type}[${name}]",
  }
}

modules/nagios/manifests/resource/file.pp:

define nagios::resource::file(
  $resource_tag,
  $requires,
  $export = true,
  $ensure = 'present',
) {

  include nagios::params

  if $export {

    @@file { $name:
      ensure  => $ensure,
      tag     => $resource_tag,
      owner   => $nagios::params::user,
      mode    => '0644',
      require => $requires,
    }
  } else {

    file { $name:
      ensure  => $ensure,
      tag     => $resource_tag,
      owner   => $nagios::params::user,
      mode    => '0644',
      require => $requires,
    }
  }
}

modules/nagios/manifests/resource/host.pp:

define nagios::resource::host(
  $address,
  $hostgroups,
  $export,
  $target,
  $check_command,
  $use,
  $ensure = 'present'
) {

  include nagios::params

  if $export {

    @@nagios_host { $name:
      ensure        => $ensure,
      address       => $address,
      check_command => $check_command,
      use           => $use,
      target        => $target,
      hostgroups    => $hostgroups ? {
        ''      => undef,
        default => $hostgroups,
      },
    }
  } else {

    nagios_host { $name:
      ensure        => $ensure,
      address       => $address,
      check_command => $check_command,
      use           => $use,
      target        => $target,
      require       => File[$nagios::params::resource_dir],
      hostgroups    => $hostgroups ? {
        ''      => undef,
        default => $hostgroups,
      },
    }
  }
}

modules/nagios/manifests/resource/hostgroup.pp:

define nagios::resource::hostgroup(
  $target,
  $ensure = 'present',
  $hostgroup_alias = '',
  $export = false
) {

  include nagios::params

  if $export {
    fail("It is not appropriate to export the Nagios_hostgroup type since it will result in duplicate resources.")
  } else {
    nagios_hostgroup { $name:
      ensure  => $ensure,
      target  => $target,
      require => File[$nagios::params::resource_dir],
    }
  }
}
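
As a usage illustration, here is a minimal sketch of how these defines might be used directly on the monitoring server to declare the hostgroup that exported hosts join. The hostgroup name 'all-servers' is the default used in the export class; the file resource for the resource directory is an assumption, included because the non-exported branches require File[$nagios::params::resource_dir] to exist somewhere in the catalog:

# Assumed to live in a class applied only to the monitoring server.
include nagios::params

# Assumption: nothing else in the catalog manages this directory yet.
file { $nagios::params::resource_dir:
  ensure => directory,
  owner  => $nagios::params::user,
}

# Hostgroups are declared locally, never exported (see the fail() above).
nagios::resource { 'all-servers':
  type   => 'hostgroup',
  export => false,
}

Given the naming rule in resource.pp, this should write the definition to /etc/nagios/resource.d/hostgroup_all-servers.cfg.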

This means that our modules/nagios/manifests/export.pp changes as follows:

class nagios::export {
  nagios::resource { $::fqdn:
    type          => 'host',
    address       => inline_template("<%= has_variable?('my_nagios_interface') ? eval('ipaddress_' + my_nagios_interface) : ipaddress %>"),
    #address      => $::ipaddress,
    # Inside ERB the variable is referenced without '$'; otherwise puppet
    # interpolates it into the string before the template is evaluated.
    hostgroups    => inline_template("<%= has_variable?('my_nagios_hostgroups') ? my_nagios_hostgroups : 'all-servers' %>"),
    #hostgroups   => 'all-servers',
    check_command => 'check-host-alive!3000.0,80%!5000.0,100%!10',
    export        => true,
  }
}

The two commented-out attributes show a simpler approach. As written, the class hints at how you might start to take advantage of hostgroups, and how it could be adapted to cope with additional network interfaces. But one step at a time, eh?

At this point, we have automated generation of nagios configuration that allows host checking and NRPE.
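
The collecting side on the monitoring server might look something like the sketch below. The class name nagios::monitor is hypothetical, and the sketch assumes that stored configs (PuppetDB) are enabled. The tag 'nagios_host' matches the resource_tag that nagios::resource passes to nagios::resource::file:

class nagios::monitor {
  include nagios::params

  # Remove this if the nagios service is already managed elsewhere.
  service { $nagios::params::service:
    ensure => running,
    enable => true,
  }

  # Realize every exported host definition and refresh nagios on change.
  Nagios_host <<| |>> {
    notify => Service[$nagios::params::service],
  }

  # Realize the matching file resources so that the generated .cfg files
  # are owned by the nagios user and readable.
  File <<| tag == 'nagios_host' |>> {
    notify => Service[$nagios::params::service],
  }
}

Collecting by tag keeps the monitoring server from picking up unrelated exported files.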

Generate Nagios configuration to monitor services

Monitoring hosts is a really minimal baseline. We really care about whether services are available.

Checking the default virtual host

In my puppet world, the classes that are responsible for dealing with nginx servers include code to export a nagios_service resource. This comes from modules/nginx/manifests/init.pp itself:

@@nagios_service { "${fqdn}-check_http_by_ip":
  use                 => 'generic-service',
  service_description => 'check_http_by_ip',
  host_name           => $fqdn,
  target              => "/etc/nagios/resource.d/${fqdn}-check_http_by_ip.cfg",
  check_command       => 'check_http_by_ip',
}

Therefore a node, nginx.example.com, that declares class {'nginx':} will export a nagios_service resource that is stored in PuppetDB and, once collected on the monitoring server, results in the following content in /etc/nagios/resource.d/nginx.example.com-check_http_by_ip.cfg:

define service{
  service_description check_http_by_ip
  check_command       check_http_by_ip
  host_name           nginx.example.com
  use                 generic-service
}

In /etc/nagios/objects/commands.cfg, I define that command:

define command{
  command_name        check_http_by_ip
  command_line        $USER1$/check_http -I $HOSTADDRESS$ -u "http://$HOSTADDRESS$/server-status" -w 5 -c 15
}
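
If you would rather keep the command definition in puppet as well, the nagios_command resource type could express the same thing. This is only a sketch of an alternative to editing commands.cfg by hand: the target path is an assumption, nagios must be configured to read that directory, and the command should be defined in one place only, not both.

# Single quotes keep puppet from trying to interpolate the nagios $MACRO$s.
nagios_command { 'check_http_by_ip':
  ensure       => present,
  command_line => '$USER1$/check_http -I $HOSTADDRESS$ -u "http://$HOSTADDRESS$/server-status" -w 5 -c 15',
  target       => '/etc/nagios/resource.d/command_check_http_by_ip.cfg',
}

# Same pattern as nagios::resource::file: make the generated file
# readable by the nagios user.
file { '/etc/nagios/resource.d/command_check_http_by_ip.cfg':
  owner   => 'nagios',
  mode    => '0644',
  require => Nagios_command['check_http_by_ip'],
}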

which causes nagios to send an HTTP GET request for http://<host>/server-status using the IP address of the host, avoiding a DNS lookup and any vhost complexity.

All of my nginx servers respond to <IP>/server-status requests because my puppet module for nginx uses a template to populate /etc/nginx/conf.d/default.conf to serve status information:

server {
  listen          *:80;
  server_name     <%= ipaddress -%> 127.0.0.1 localhost;
  location        /server-status {
    stub_status   on;
    access_log    off;
    allow         127.0.0.1;
    allow         <%= monitoring_host -%>;
    # without a final deny, the allow rules above restrict nothing
    deny          all;
  }
}
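
For context, the resource that renders such a template could be as simple as the following sketch. The class name nginx::status, the template name default.conf.erb, the $monitoring_host value, and the Service['nginx'] resource are all assumptions made for illustration; the template's ipaddress and monitoring_host variables are looked up from the scope in which template() is called.

class nginx::status {
  # Hypothetical value; in practice this might come from a class parameter.
  $monitoring_host = '192.0.2.10'

  file { '/etc/nginx/conf.d/default.conf':
    ensure  => file,
    owner   => 'root',
    mode    => '0644',
    content => template('nginx/default.conf.erb'),
    # Assumes Service['nginx'] is declared elsewhere in the nginx module.
    notify  => Service['nginx'],
  }
}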

Checking other Virtual Hosts

In that same puppet module, I define a custom resource, nginx::site, that is declared for each nginx virtual host. It creates nginx config for that virtual host. It also exports a nagios_service resource that results in a nagios service check. Here is the code from the define:

@@nagios_service { "${name}-check_http":
  use                 => 'generic-service',
  service_description => "vhost-${name}",
  host_name           => $fqdn,
  target              => "/etc/nagios/resource.d/${name}-check_http.cfg",
  check_command       => 'check_http',
}

Therefore a node, nginx.example.com, that declares nginx::site { foo: .... } will export a nagios_service resource that is stored in PuppetDB and, once collected on the monitoring server, results in the following content in /etc/nagios/resource.d/foo-check_http.cfg:

define service{
  service_description vhost-foo
  check_command       check_http
  host_name           nginx.example.com
  use                 generic-service
}

In /etc/nagios/objects/commands.cfg, I should (already) have a command for that:

define command{
  command_name        check_http
  command_line        $USER1$/check_http -H $HOSTNAME$ -w 5 -c 15
}

or something analogous.

Now it should be fairly straightforward, if a little tedious, to start adding other service checks as part of the definition of other puppet resources. (HTTP) Application server checks will look very similar; the nagios plugin, check_http, gives some significant flexibility. Other checks with other plugins will look a little different and you might need to start making more data available to the exported resource @@nagios_service so that the commands can be more sophisticated.
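
As one sketch of what "more data" could mean, the vhost name can be passed to the command as a nagios argument, so that the check asks for the virtual host itself while still connecting by IP address. The command name check_http_vhost and the target paths are hypothetical; -I and -H are standard check_http options.

# Inside define nginx::site: export one check per vhost. Including the
# fqdn in the title avoids collisions when two nodes serve the same vhost.
@@nagios_service { "${::fqdn}-${name}-check_http_vhost":
  use                 => 'generic-service',
  service_description => "vhost-${name}",
  host_name           => $::fqdn,
  target              => "/etc/nagios/resource.d/${::fqdn}-${name}-check_http_vhost.cfg",
  # Everything after '!' arrives in the command as $ARG1$.
  check_command       => "check_http_vhost!${name}",
}

# On the monitoring server: connect to the host's address, but send
# the vhost name in the Host header.
nagios_command { 'check_http_vhost':
  ensure       => present,
  command_line => '$USER1$/check_http -I $HOSTADDRESS$ -H $ARG1$ -w 5 -c 15',
  target       => '/etc/nagios/resource.d/command_check_http_vhost.cfg',
}

As with the host resources, the generated files need to be readable by the nagios user before nagios will load them.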