System Reboots
In this example, we monitor and report system reboots.
The SysReboot script might send an alert message like the following:
PIKT ALERT
Thu Oct 11 12:32:02 2007
vienna

CRITICAL:
    SysReboot
        Scan the 'last' command output for signs of recent system reboots

reboot system boot 2.6.17-gentoo-r4 Thu Oct 11 11:58 (00:33)
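The last line of the alert is the raw 'last' record that matched the filter. As an illustration only (this is not part of PIKT), the following Python sketch shows one way the day of week, month, day, and time could be pulled out of such a record. The regular expression and the function name parse_boot_line are assumptions made for this example.

```python
import re

# Day of week, month, day of month, and HH:MM, in the order `last` prints them.
# This pattern is an assumption for illustration, not PIKT's field handling.
STAMP = re.compile(
    r"\b(Mon|Tue|Wed|Thu|Fri|Sat|Sun)\s+"
    r"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+"
    r"(\d{1,2})\s+(\d{2}):(\d{2})\b"
)


def parse_boot_line(line):
    """Return (dow, mon, day, hour, minute), or None if no timestamp is found."""
    m = STAMP.search(line)
    if m is None:
        return None
    dow, mon, day, hour, minute = m.groups()
    return dow, mon, int(day), int(hour), int(minute)


sample = "reboot system boot 2.6.17-gentoo-r4 Thu Oct 11 11:58 (00:33)"
print(parse_boot_line(sample))  # ('Thu', 'Oct', 11, 11, 58)
```

Searching for the timestamp pattern, rather than counting fields from either end of the line, keeps the sketch independent of how many fields precede or follow the timestamp.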
The script follows.
SysReboot

    init
        status =piktstatus
        level =piktlevel
        task "Scan the 'last' command output for signs of recent system reboots"
        input proc "last | =head"
        filter "=egrep 'system boot|shutdown'"
        dat $dow $-4 // day of week; used below to skip scheduled reboots
        dat $mon $-3
        dat $date $-2
        dat $time $-1

    begin
        set #first = #true()
        // ensure that we set $mdtprev anew in the current script
        // run for recall as %mdtprev in the next script run (the
        // 'set $mdtprev' below is conditional)
        if #defined(%mdtprev)
            set $mdtprev = %mdtprev
        else
            set $mdtprev = ""
        endif

    rule
        set #daynumber_reboot = #daynumber($dow)

    rule // since the "last" command output doesn't show the year,
         // we need a way to determine if we are wrapping around
         // to previous years; keeping track of month turnovers is
         // a kludgy way to do this
        set #monthnumber_reboot = #monthnumber($mon)
        if #monthnumber_reboot != @monthnumber_reboot
            =incr(#monthturnovers)
        endif

    rule
        do #split($time, ":")
        set #hour_reboot = #val($1)

    rule // ignore scheduled reboots
        if =reboot_period(#daynumber_reboot, #hour_reboot)
            next
        endif

    rule
        // using =set_lineage to determine the age gives better
        // results than the following:
        // if #age($mon,$date,$time) <= 1
        =set_lineage($inline)
        if #lineage == #err()
           || #lineage < 0
           || #monthturnovers > 1 // are at least 2 months
                                  // into the past
            next // not in current year, or bad input line
        endif
        // since this alarm is run daily, only report reboots in the
        // last 24 hours
        if #lineage < =secs_in_day
            set $mdt = "$mon $date $time"
            if #defined(%mdtprev)
               && %mdtprev =~ $mdt
                quit
            endif
            output mail $inline
            if #first
                set #first = #false()
                set $mdtprev = $mdt
            else
                set $mdtprev .= " $mdt"
            endif
        endif
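The script has to cope with two things: 'last' does not print the year, and a reboot already reported in the previous run should not be reported again. As a rough illustration only, the Python sketch below handles both. It swaps in a different technique for the year: instead of counting month turnovers, it picks the latest year that does not place the timestamp in the future. The function names infer_datetime and new_reboots, and the use of a set to hold the previously reported timestamps, are assumptions made for this example.

```python
from datetime import datetime, timedelta

MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


def infer_datetime(mon, day, hour, minute, now):
    """Guess the missing year: the latest one that is not in the future."""
    for year in (now.year, now.year - 1):
        try:
            stamp = datetime(year, MONTHS[mon], day, hour, minute)
        except ValueError:  # e.g. Feb 29 in a non-leap year
            continue
        if stamp <= now:
            return stamp
    return None


def new_reboots(stamps, now, already_reported):
    """Return reboot stamps from the last 24 hours not reported before.

    stamps: (mon, day, hour, minute) tuples, newest first, as `last` lists them.
    already_reported: set of "Mon D HH:MM" strings kept from the previous run.
    """
    fresh = []
    for mon, day, hour, minute in stamps:
        when = infer_datetime(mon, day, hour, minute, now)
        if when is None or now - when >= timedelta(days=1):
            continue  # bad line, or older than 24 hours
        key = f"{mon} {day} {hour:02d}:{minute:02d}"
        if key in already_reported:
            break  # mirrors the script's 'quit': the rest was seen last run
        fresh.append(key)
    return fresh


now = datetime(2007, 10, 11, 12, 32)
stamps = [("Oct", 11, 11, 58), ("Oct", 10, 9, 0), ("Sep", 30, 8, 0)]
print(new_reboots(stamps, now, set()))  # ['Oct 11 11:58']
```

The returned list plays the role of $mdtprev in the script: saved after each run, it becomes the already_reported set for the next one.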
This is just one example. You could add rules or write new scripts to, for example, remove /var/crash files after a reboot or log uptime data.
For more examples, see Samples.