Full Disks Macro
The disks_full_alarms_macros.cfg file defines a script macro, =diskfull(), for reporting when disks on remote systems are full. Because the macro invokes piktc, it should be run from the piktmaster system.
Why have the piktmaster poll remote systems for full disks? Why not have each system run its own full-disk script? They should do that too, but if the disk holding a system's /var fills up, outbound alert e-mail stops. You would then never learn about that system's full disk(s), or about any other problem that PIKT detects but cannot send a report for. Hence the need for a remote-systems full-disks script that runs on the piktmaster.
The =diskfull() script macro follows.
///////////////////////////////////////////////////////////////////////////////
//
// disks_full_alarms_macros.cfg
//
///////////////////////////////////////////////////////////////////////////////

diskfull(SYS, BYPASSSYS, BYPASSMOUNTS, CAPLIM, CAPLIMDEFAULT)

        init
                status =piktstatus
                level =piktlevel
                task "Report full filesystems"
                // the '0 0 0 0 0' following are to satisfy the expected
                // number of fields in the =dfdata statements below
                input proc "=bindir/piktc -xU +C 'echo HOST:`\=hostname` 0 0 0 0 0; \=df -k -l | \=sed \'1,1d\' | \=egrep -v /mapper/ | \=egrep -v /export/ | \=egrep -v /repository/' +H (SYS) -H (BYPASSSYS)"
                =dffilter
                =dfdata
                keys $fsname $mount

        begin
                =initmisscrit
                if $alert() =~ "DiskFullSystems|DiskFullServers|DiskFullClients"
                        set #interactive = #true()
                else
                        set #interactive = #false()
                fi

        rule
                if $inlin =~ "^HOST:([[:graph:]]+) .+$"
                        set $host = $1
                        =setmisscrit    // determine if host is mission-critical
#ifdef debug
                        output $host
#elsedef
                        if #interactive
                                output $host
                                output =newline
                        fi
#endifdef
                        next
                fi

#ifdef debug
        rule
                output $inlin
#endifdef

        rule
                if $mount =~~ "(BYPASSMOUNTS)"
                        next
                fi

        rule
                set $capmsg = "On $host, filesystem $mount on $fsname is $text(100*#cap,0)% full, $text(#kbytes) KB capacity, $text(#avail) Kb left"

        rule    // report full disks for interactive scripts
                if #interactive
                        if #cap >= (CAPLIMDEFAULT)
                                output $capmsg
                                output =newline
                        fi
                        next
                fi

        // non-interactive scripts after this point

        rule
#ifdef pikttest
                if #cap >= (CAPLIMDEFAULT)
                        if #misscrit
                                =hourly(output mail $capmsg, )
                        else
                                =every_two_hours(output mail $capmsg, )
                        fi
#elsedef
                if (#cap >= (CAPLIM)) // || (#avail == 0)
                        if #misscrit
                                =every_six_hours(output mail $capmsg, )
                        else
                                =daily(output mail $capmsg, )
                        fi
#endifdef
                fi

        end
                quit

///////////////////////////////////////////////////////////////////////////////
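For readers less familiar with Pikt script syntax, the core logic of the macro can be sketched in Python. This is only an illustration, not part of PIKT: it assumes the input consists of a `HOST:<name> 0 0 0 0 0` marker line per host followed by that host's `df -k -l` rows in the usual six-column layout (the =dffilter and =dfdata macros that do the real field parsing are not shown here). The function name, its parameters, and the sample rows are hypothetical.

```python
import re


def full_filesystems(lines, bypass_mounts=r"^/cdrom", cap_limit_percent=90):
    """Yield a report message for each filesystem at or above the limit.

    `lines` imitates the macro's input: one 'HOST:<name> 0 0 0 0 0' marker
    line per host, followed by that host's df rows (header already removed).
    """
    bypass = re.compile(bypass_mounts)
    host = None
    for line in lines:
        marker = re.match(r"^HOST:(\S+) .+$", line)
        if marker:
            host = marker.group(1)  # remember which host the next rows belong to
            continue
        fields = line.split()
        if len(fields) != 6:
            continue  # assumption: ignore wrapped or malformed df rows
        fsname, kbytes, _used, avail, cap, mount = fields
        if bypass.search(mount):
            continue  # skip mounts matching the bypass pattern
        cap_percent = int(cap.rstrip("%"))
        if cap_percent >= cap_limit_percent:
            yield (f"On {host}, filesystem {mount} on {fsname} is "
                   f"{cap_percent}% full, {kbytes} KB capacity, {avail} Kb left")


# Made-up sample input for illustration only.
sample = [
    "HOST:firenze 0 0 0 0 0",
    "/dev/hda2 28842780 28842780 0 100% /",
    "/dev/hda1 101089 20000 75870 21% /boot",
    "/dev/cdrom 650000 650000 0 100% /cdrom",
]

for message in full_filesystems(sample):
    print(message)
```

With the sample rows above, only the root filesystem is reported: /boot is below the limit and /cdrom matches the bypass pattern. The sketch leaves out the macro's mission-critical check and its mail throttling (=hourly, =daily, and so on).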
You might invoke the =diskfull() macro in your alarms.cfg file like this:
///////////////////////////////////////////////////////////////////////////////
//
// disks_alarms.cfg
//
///////////////////////////////////////////////////////////////////////////////

#if piktmaster

DiskFull

        =diskfull(all, down sick sidelined, ^/cdrom, 100%, 90%)

#endif  // piktmaster

///////////////////////////////////////////////////////////////////////////////

#if piktmaster

SystemsDiskFull

        =diskfull(all, down sick sidelined, ^/cdrom, 100%, 90%)

#endif  // piktmaster

///////////////////////////////////////////////////////////////////////////////

#if piktmaster

ServersDiskFull

        =diskfull(missioncritical, down sick sidelined, ^/cdrom, 100%, 90%)

#endif  // piktmaster

///////////////////////////////////////////////////////////////////////////////

#if piktmaster

ClientsDiskFull

        =diskfull(nonmissioncritical, down sick sidelined, ^/cdrom, 100%, 90%)

#endif  // piktmaster

///////////////////////////////////////////////////////////////////////////////
Here, 'down' is a host group of systems known to be down (specified in down_systems.cfg); 'sick' is a host group of systems (specified in sick_systems.cfg) with system or network problems that prevent or hinder their PIKT operation (e.g., they are Windows systems!); and 'sidelined' covers systems that are up but that we don't currently care about. The five arguments correspond, in order, to the macro's SYS, BYPASSSYS, BYPASSMOUNTS, CAPLIM, and CAPLIMDEFAULT parameters.
Output from this script might look like this:
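The macro passes its first two arguments to piktc as `+H (SYS) -H (BYPASSSYS)`, that is, poll the hosts in the first set except those in the second. A minimal Python sketch of that include-minus-exclude selection follows. It is only an illustration of the set logic: the `select_hosts` function and every host name except firenze are hypothetical, and PIKT itself resolves host groups from its own configuration files.

```python
def select_hosts(groups, include, exclude):
    """Return, sorted, the hosts in any `include` group but in no `exclude` group."""
    included = set().union(*(groups.get(name, set()) for name in include))
    excluded = set().union(*(groups.get(name, set()) for name in exclude))
    return sorted(included - excluded)


# Hypothetical host groups for illustration only.
groups = {
    "all": {"firenze", "milano", "napoli", "torino"},
    "down": {"napoli"},
    "sick": {"torino"},
    "sidelined": set(),
}

print(select_hosts(groups, ["all"], ["down", "sick", "sidelined"]))
# prints ['firenze', 'milano']
```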
URGENT:
    DiskFull
        Report full filesystems

        On firenze, filesystem / on /dev/hda2 is 100% full, 28842780 KB capacity, 0 Kb left
For more examples, see Samples.