PIKT Log Scan Macro
The pikt_log_scan_alarms_macros.cfg file defines =piktlog_scan(), a script macro that scans PIKT log file output for noteworthy entries.
Note how this technique saves us from having to write dozens of separate log file scanning scripts, one for each PIKT log file. We write one generalized script macro and tailor it to each log file through the parameters we pass to =piktlog_scan(): A names the log file, and B1 and B2 are regular expressions for alarm-specific entries to bypass early and late in the rule sequence.
piktlog_scan(A, B1, B2)

        init
                status =piktstatus
                level =piktlevel
                task "Scan the PIKT (A).log for noteworthy entries"
                input logfile "=logdir/(A).log"

        begin
                =checkpoint(=lalim)
                // assume no crisis (yet)
                set #crisis = #false()
#if dbserver | prodserver
                // on production servers, only process non-emergency pikt
                // log files in the early morning, during production downtime
                if $alert() !~~ "red|emergency"
                        if #hour() > 5
                                quit
                        fi
                fi
#endif

        rule    // automatic bypasses
                if $inlin =~~ "==> =logdir/.+ <=="
                        next
                endif

        rule    // special alarm-specific bypasses, first bypass
                if $inlin =~~ "(B1)"
                        next
                endif

        rule    // bypass empty lines
                if ! #length($inlin)
                        next
                endif

        rule
                =set_hr($inlin)

        rule    // report ERROR entries
                if $inlin =~ "ERROR]"
                        output mail $inlin
                        next
                endif

        rule    // report inactive alarms
                if $inlin =~ "INFO].+suspended status"  // && =monday
                        output mail $inlin
                        next
                endif

        rule    // report stderr output from invoked commands
                if =notlogmsg
                        if $inlin !~~ "broken pipe"
                                output mail $inlin
                        endif
                        next
                endif

        rule    // report corrupted lines, then bypass
                if #hr == #err()
                        output mail $inlin
                        next
                endif

        rule    // report entries due to a possibly badly formed input
                // command that results in no input
                if $inlin =~ "WARNING].+no input data"
                        =outputmail $inlin
                        next
                endif

        rule    // bypass commonplace stuff
                if $inlin =~ "(DEBUG]|NOTICE]|INFO]| last line is improperly date/time-stamped| no answer from)"
                        next
                endif

        rule    // special alarm-specific bypasses, second bypass
                if $inlin =~~ "(B2)"
                        next
                endif

        rule    // possibly report WARNING entries
                if $inlin =~ "WARNING]"
                        =outputmail $inlin
                        next
                endif

        rule    // report anything not filtered out
                output mail $inlin

        end
                if #crisis
                        // =page()
                endif
                quit
You might invoke the =piktlog_scan() macro in your alarms.cfg file like this:
///////////////////////////////////////////////////////////////////////////////
//
// logs_pikt_alarms.cfg
//
///////////////////////////////////////////////////////////////////////////////

#if piktmaster

PIKTREDLogScan
        =piktlog_scan(RED, =nonesuch, =nonesuch)

PIKTREDTestLogScan
        =piktlog_scan(REDTest, =nonesuch, =nonesuch)

#endif

///////////////////////////////////////////////////////////////////////////////

PIKTEMERGENCYLogScan
        =piktlog_scan(EMERGENCY, runaway...procs, =nonesuch)

PIKTEMERGENCYTestLogScan
        =piktlog_scan(EMERGENCYTest, =nonesuch, =nonesuch)

///////////////////////////////////////////////////////////////////////////////

PIKTSysAdminsUrgentLogScan
        =piktlog_scan(SysAdminsUrgent, dmesgrf: no such file|stat source.+dmesg, =nonesuch)

PIKTSysAdminsUrgentTestLogScan
        =piktlog_scan(SysAdminsUrgentTest, dmesgrf: no such file|stat source.+dmesg, =nonesuch)

///////////////////////////////////////////////////////////////////////////////

PIKTCodersUrgentLogScan
        =piktlog_scan(CodersUrgent, dmesgrf: no such file|stat source.+dmesg, =nonesuch)

PIKTCodersUrgentTestLogScan
        =piktlog_scan(CodersUrgentTest, dmesgrf: no such file|stat source.+dmesg, =nonesuch)

///////////////////////////////////////////////////////////////////////////////

PIKTSysAdminsCriticalLogScan
        =piktlog_scan(SysAdminsCritical, =nonesuch, =nonesuch)

PIKTSysAdminsCriticalTestLogScan
        =piktlog_scan(SysAdminsCriticalTest, =nonesuch, =nonesuch)

///////////////////////////////////////////////////////////////////////////////

[...]

///////////////////////////////////////////////////////////////////////////////

PIKTCheckDatabaseLogScan
        =piktlog_scan(CheckDatabase, =nonesuch, =nonesuch)

///////////////////////////////////////////////////////////////////////////////

PIKTDownSystemsLogScan
        =piktlog_scan(DownSystems, =nonesuch, =nonesuch)

///////////////////////////////////////////////////////////////////////////////

PIKTDownRpcLogScan
        =piktlog_scan(DownRpc, =nonesuch, =nonesuch)

///////////////////////////////////////////////////////////////////////////////

PIKTSysRebootsLogScan
        =piktlog_scan(SysReboots, =nonesuch, =nonesuch)

///////////////////////////////////////////////////////////////////////////////
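To make the parameter passing concrete, here is a rough sketch of what the PIKTEMERGENCYLogScan invocation amounts to once EMERGENCY, runaway...procs, and =nonesuch are substituted for (A), (B1), and (B2). It is an illustration only, not actual preprocessor output. Nested macros such as =piktstatus, =logdir, and =nonesuch are left unexpanded, the "[rest of ... unchanged]" comments are ours and mark parts of the macro body left out for brevity, and the layout PIKT really produces may differ.

```
PIKTEMERGENCYLogScan

        init
                status =piktstatus
                level =piktlevel
                // (A) replaced by EMERGENCY
                task "Scan the PIKT EMERGENCY.log for noteworthy entries"
                input logfile "=logdir/EMERGENCY.log"

        // [rest of begin section and earlier rules unchanged]

        rule    // special alarm-specific bypasses, first bypass
                // (B1) replaced by runaway...procs
                if $inlin =~~ "runaway...procs"
                        next
                endif

        // [intervening rules unchanged]

        rule    // special alarm-specific bypasses, second bypass
                // (B2) replaced by =nonesuch, itself a macro left unexpanded here
                if $inlin =~~ "=nonesuch"
                        next
                endif

        // [remaining rules and end section unchanged]
```

We assume here that =nonesuch stands for a pattern that never matches a real log line, so passing it effectively switches that bypass off. Its definition is not shown on this page.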
Output from these scripts might look like this:
DEBUG:
    PIKTSysAdminsUrgentLogScan
        Scan the PIKT SysAdminsUrgent.log for noteworthy entries

        ping: unknown host grenada
        ping: unknown host grenada

DEBUG:
    PIKTSysAdminsCriticalLogScan
        Scan the PIKT SysAdminsCritical.log for noteworthy entries

        telnet: grenada: Name or service not known
        grenada: Unknown host
        telnet: connect to address 10.10.5.73: Connection refused
        telnet: connect to address 10.10.5.34: Connection refused
        telnet: grenada: Name or service not known
For more examples, see Samples.