class Arborist::Monitor::SNMP::CPU

Machine load/cpu checks.

Sets the current 1-, 5-, and 15-minute load averages under the 'load' attribute, and calculates CPU utilization, adding a warning when it exceeds the configured threshold.

Constants

LOADKEYS

When walking the load OIDs, the iteration index of each returned value selects the matching label from this list.
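A minimal sketch of that index-to-label mapping follows. The label list and the walk result are assumptions for illustration: the actual value of LOADKEYS is not shown on this page, and the OIDs below are the UCD-SNMP laLoad entries, which may or may not be what OIDS[:load] contains.

```ruby
# Assumed labels; the real LOADKEYS constant is not shown on this page.
loadkeys = %i[load1 load5 load15]

# Hypothetical walk result: [oid, value] pairs in the order the agent returns them.
walked = [
  ['1.3.6.1.4.1.2021.10.1.3.1', '0.42'],
  ['1.3.6.1.4.1.2021.10.1.3.2', '0.37'],
  ['1.3.6.1.4.1.2021.10.1.3.3', '0.30']
]

load = {}
walked.each_with_index do |(_, value), idx|
  next unless loadkeys[idx]          # skip rows beyond the known labels
  load[loadkeys[idx]] = value.to_f   # store each value under its label
end

p load
```

Because the labels are matched purely by position, the mapping only holds if the agent returns the three averages in 1, 5, 15 minute order.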

OIDS

OIDS for discovering system load.

Public Class Methods

node_properties()

Return the properties used by this monitor.

# File lib/arborist/monitor/snmp/cpu.rb, line 39
def self::node_properties
        return USED_PROPERTIES
end

run( nodes )

Class-level run: creates a new instance and immediately runs it against the given nodes.

# File lib/arborist/monitor/snmp/cpu.rb, line 46
def self::run( nodes )
        return new.run( nodes )
end

Public Instance Methods

run( nodes )

Perform the monitoring checks.

Calls superclass method Arborist::Monitor::SNMP#run
# File lib/arborist/monitor/snmp/cpu.rb, line 53
def run( nodes )
        super do |host, snmp|
                self.find_load( host, snmp )
        end
end
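The two run methods above combine a class-level convenience wrapper with a superclass method that yields once per host. The following is a rough sketch of that pattern only. SketchSNMPMonitor and SketchCPUMonitor are hypothetical stand-ins, not the real Arborist classes, and the string used as a connection is a placeholder for an open SNMP session.

```ruby
# Hypothetical stand-in for the parent monitor; not the real Arborist class.
class SketchSNMPMonitor
  attr_reader :results

  def initialize
    @results = {}
  end

  # Yields each host with a (fake) connection, then returns the results hash.
  def run(nodes)
    nodes.each_key do |host|
      snmp = "connection-to-#{host}"   # placeholder for an open SNMP session
      yield host, snmp
    end
    results
  end
end

# Hypothetical subclass that follows the same shape as the listing above.
class SketchCPUMonitor < SketchSNMPMonitor
  # Class-level run: build an instance and run it immediately.
  def self.run(nodes)
    new.run(nodes)
  end

  # Bare `super` forwards `nodes`; the block supplies the per-host work.
  def run(nodes)
    super do |host, snmp|
      results[host] = { checked_with: snmp }
    end
  end
end

out = SketchCPUMonitor.run({ 'web01' => {}, 'db01' => {} })
p out
```

In this arrangement the parent owns iteration and connection handling, so a subclass only has to say what to do with each host.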

Protected Instance Methods

find_load( host, snmp )

Collect the load information for host from an existing (and open) snmp connection.

# File lib/arborist/monitor/snmp/cpu.rb, line 116
def find_load( host, snmp )
        info = self.format_load( snmp )

        config  = self.identifiers[ host ].last['config'] || {}
        warn_at = config[ 'warn_at' ] || self.class.warn_at
        usage   = info.dig( :cpu, :usage ) || 0

        if usage >= warn_at
                info[ :warning ] = "%0.1f utilization exceeds %0.1f percent" % [ usage, warn_at ]
        end

        self.results[ host ] = info
end
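The threshold lookup in find_load prefers a per-node 'warn_at' setting and falls back to the class-level default. Here is a small sketch of that order using plain hashes. The helper name warning_for and the default of 80 are assumptions for illustration; the real class-level default is not shown on this page.

```ruby
# Hypothetical helper mirroring the lookup order used in find_load.
def warning_for(info, node_config, default_warn_at)
  warn_at = node_config['warn_at'] || default_warn_at   # per-node override wins
  usage   = info.dig(:cpu, :usage) || 0                 # missing usage counts as 0
  return nil unless usage >= warn_at

  "%0.1f utilization exceeds %0.1f percent" % [usage, warn_at]
end

default_warn_at = 80   # assumed default, for illustration only

over     = warning_for({ cpu: { usage: 85.0 } }, {}, default_warn_at)
override = warning_for({ cpu: { usage: 85.0 } }, { 'warn_at' => 90 }, default_warn_at)
missing  = warning_for({}, {}, default_warn_at)

p over, override, missing
```

With these inputs only the first call produces a warning: the second node raises its own threshold above the measured usage, and the third has no usage figure at all.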
format_load( snmp )

Find load data and add extra details for reporting: the CPU count, a usage percentage, and a human-readable message.

# File lib/arborist/monitor/snmp/cpu.rb, line 66
def format_load( snmp )
        info = { cpu: {}, load: {} }
        cpus = snmp.walk( oid: OIDS[:cpu] ).each_with_object( [] ) do |(_, value), acc|
                acc << value
        end

        info[ :cpu ][ :count ] = cpus.size

        # Windows SNMP doesn't have a concept of "load" over time,
        # so we have to just use the current averaged CPU usage.
        #
        # This means that windows machines will very likely want to
        # adjust the default "overutilization" number, considering
        # it's really just how much of the CPU is used at the time of
        # the monitor run, along with liberal use of the Observer "only
        # alert after X events" pragmas.
        #
        if self.system =~ /windows\s+/i
                info[ :cpu ][ :usage ] = cpus.inject( :+ ).to_f / cpus.size
                info[ :message ] = "System is %0.1f%% in use." % [ info[ :cpu ][ :usage ] ]


        # UCDavis stuff is better for alerting only after there has been
        # an extended load event.  Use the 5 minute average to avoid
        # state changes on transient spikes.
        #
        else
                snmp.walk( oid: OIDS[:load] ).each_with_index do |(_, value), idx|
                        next unless LOADKEYS[ idx ]
                        info[ :load ][ LOADKEYS[idx] ] = value.to_f
                end

                percentage = (( ( info[:load][ :load5 ] / cpus.size) - 1 ) * 100 ).round( 1 )

                if percentage < 0
                        info[ :message ] = "System is %0.1f%% idle." % [ percentage.abs ]
                        info[ :cpu ][ :usage ] = percentage + 100
                else
                        info[ :message ] = "System is %0.1f%% overloaded." % [ percentage ]
                        info[ :cpu ][ :usage ] = percentage
                end
        end

        return info
end
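To make the non-Windows arithmetic in format_load concrete, the sketch below reproduces just that calculation in a standalone helper. The name load_to_usage and the sample inputs are hypothetical; the formulas and message strings are copied from the listing above.

```ruby
# Hypothetical helper reproducing only the non-Windows branch of format_load.
def load_to_usage(load5, cpu_count)
  # Per-core load relative to 1.0, expressed as a percentage.
  percentage = (((load5 / cpu_count) - 1) * 100).round(1)

  if percentage < 0
    { message: "System is %0.1f%% idle." % [percentage.abs], usage: percentage + 100 }
  else
    { message: "System is %0.1f%% overloaded." % [percentage], usage: percentage }
  end
end

idle = load_to_usage(1.0, 4)   # 0.25 load per core
busy = load_to_usage(6.0, 4)   # 1.5 load per core

p idle, busy
```

A 5-minute load of 1.0 on four cores gives a per-core load of 0.25, so the helper reports 75.0% idle and a usage of 25.0. A load of 6.0 on the same machine gives 1.5 per core, reported as 50.0% overloaded. Note that in the overloaded branch the listing stores the overload percentage itself as the usage value, and the sketch keeps that behavior unchanged.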