High CPU Usage
In this example, we report unusually high CPU usage, that is, unusually low CPU idle time.
The CPUUsage script might send an alert message like the following:
                                PIKT ALERT
                         Tue Apr 24 15:17:12 2007
                                 montreal

    URGENT:
        CPUUsage
            Report unusually high CPU usage

            Cpu(s): 48.2% us,  1.3% sy,  0.0% ni, 48.6% id,  1.6% wa,  0.1% hi,  0.2% si

            top - 15:17:10 up 18 min,  2 users,  load average: 1.50, 1.24, 0.81
            Tasks:  97 total,   1 running,  95 sleeping,   0 stopped,   1 zombie
            Cpu(s): 48.2% us,  1.3% sy,  0.0% ni, 48.6% id,  1.6% wa,  0.1% hi,  0.2% si
            Mem:   1034932k total,   533352k used,   501580k free,    95180k buffers
            Swap:  4016168k total,        0k used,  4016168k free,   179540k cached

              PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
             7452 boyce     16   0  220m  24m  11m S  100  2.4  13:42.90 java_vm
             7415 boyce     15   0  258m 132m  28m S   22 13.1   1:53.86 firefox-bin
             6005 root      15   0  177m  45m 7844 S   14  4.5   0:33.87 X
                1 root      15   0  1464  512  456 S    0  0.0   0:01.03 init
            ...
The script follows.
    CPUUsage

            // if we ever need to check this on a per-machine (or per-
            // hostgroup) basis, we really should set up a new objects file,
            // CPUUsage.obj, with fields like so:
            //
            //   //host   //#cpuidlim
            //
            // then read the data in using =readvals() and process in the usual
            // manner

            init
                    status =piktstatus
                    level =piktlevel
                    task "Report unusually high CPU usage"
    #if gentoo | suse | database3
                    input proc "=cat =hstdir/log/top.CPUUsage 2>/dev/null | =head -n 3 | =tail -n 1"
                    dat $ky 1       // invariant key, "Cpu(s):"
                    dat #cpuid 8
    #elsif redhat
                    input proc "=cat =hstdir/log/top.CPUUsage 2>/dev/null | =head -n 5 | =tail -n 1"
                    dat $ky 1       // invariant key, "CPU0"
                    dat #cpuid $-1
    #endif
                    keys $ky

            begin
                    doexec wait "=top -b -n1 -d1 2>/dev/null > =hstdir/log/top.CPUUsage"
                    doexec wait "=psall > =hstdir/log/ps.CPUUsage"
                    set #dotopps = #false()
                    set #cpuidloglim = 100%         // i.e., log always
                    if    $alert() =~ "EMERGENCY"
                            set #cpuidlim = 10%
                    elsif $alert() =~ "Urgent"
                            set #cpuidlim = 50%
                    else  // if $alert() =~ "CPUUsage"
                            set #cpuidlim = 100%
                    endif

    #ifdef debug
            rule
                    output $inlin
                    output "\#cpuid is $text(100*#cpuid,1)%"
                    quit
    #endifdef

            rule    // log high usage
                    if $alert() =~ "Urgent"
                            if #cpuid <= #cpuidloglim
                                    =output_alarm_log($inlin)
                            endif
                    endif

            rule    // report unusually high cpu usage
                    if #cpuid <= #cpuidlim
    #if missioncritical
                            =hourly(output mail $inlin
                                    set #dotopps = #true(), )
    #else
                            =every_four_hours(output mail $inlin
                                              set #dotopps = #true(), )
    #endif
                    endif

            end
                    if #dotopps && $alert() ne "CPUUsage"
                            output mail =newline
                            =outputfile(mail, "=hstdir/log/top.CPUUsage")
                            output mail =newline
                            =outputfile(mail, "=hstdir/log/ps.CPUUsage")
                    fi
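To make the threshold logic easier to follow, here is a minimal Python sketch of the same decision for the gentoo/suse input format. It is not part of PIKT: the function names idle_fraction and should_report are hypothetical, and it assumes that field 8 of the "Cpu(s):" line holds the idle percentage and that the script treats it as a fraction (as the debug rule's 100*#cpuid suggests). The substring tests only approximate the script's =~ matches.

```python
# Hypothetical illustration, not PIKT code: mirrors the idle-time
# thresholds of the CPUUsage script for one "Cpu(s):" line from top.


def idle_fraction(cpu_line: str) -> float:
    """Return CPU idle time as a fraction, taken from whitespace field 8."""
    fields = cpu_line.split()
    if len(fields) < 8 or fields[0] != "Cpu(s):":
        raise ValueError(f"unexpected top summary line: {cpu_line!r}")
    # Field 8 (index 7) looks like "48.6%"; strip the sign and scale.
    return float(fields[7].rstrip("%")) / 100.0


def idle_limit(alert: str) -> float:
    """Pick the idle limit for an alert name, as in the begin section."""
    if "EMERGENCY" in alert:
        return 0.10
    if "Urgent" in alert:
        return 0.50
    return 1.00  # any other alert: the limit is 100%, so always report


def should_report(cpu_line: str, alert: str) -> bool:
    """True when idle time is at or below the limit for this alert."""
    return idle_fraction(cpu_line) <= idle_limit(alert)


if __name__ == "__main__":
    line = ("Cpu(s): 48.2% us, 1.3% sy, 0.0% ni, 48.6% id, "
            "1.6% wa, 0.1% hi, 0.2% si")
    for alert in ("EMERGENCY", "Urgent", "CPUUsage"):
        print(alert, should_report(line, alert))
```

With the summary line from the sample alert, idle time is 48.6%, which is at or below the 50% Urgent limit but above the 10% EMERGENCY limit. That is consistent with the sample message appearing under URGENT.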
For more examples, see Samples.