Developing Alarm Scripts
[posted 1999/09/04]
Another in the new series on PIKT operations, this one revisiting the topic of developing alarm scripts:
At the UofC's Business School, we give users the option of accessing their e-mail via the DMailWeb program. This program is troublesome, to say the least. If users fail to log out properly, or under other mysterious circumstances (the program is buggy), erroneous information is left in their user data files, which blocks their next access. So, several times daily, we get urgent e-mail from students (and others) about their not being able to access their e-mail, and would we please fix the problem pronto. The fix involves editing out (clearing) several lines from the user data file. We have developed a Perl script to handle this at the command line on a case-by-case basis, but there has to be a better, more proactive solution.
And of course there is. One strategy is to automatically clear every user data file overnight at a time when no sane person would be checking e-mail.
For the file clearing, I've written a Perl script that I maintain in the PIKT programs.cfg file:
///////////////////////////////////////////////////////////////////////////////

#ifndef generic

#if warsaw | seville

dmwclear.pl     // clear the dmailweb user.dat file

        #!=perl

        # dmwclear.pl -- clear the dmailweb user.dat file

        die "Usage: dmwclear.pl <udfile>\n" unless $#ARGV == 0 ;

        $udf = $ARGV[0] ;
        die "$udf: No such file\n" unless -e $udf ;

        $BAK = "$udf.bak" ;
        $TMP = "$udf.tmp" ;

        # stop here rather than rename an empty temp file over the original
        open(UDF, $udf) or die "$udf: $!\n" ;
        open(TMP, "> $TMP") or die "$TMP: $!\n" ;
        while (<UDF>) {
                # drop the stale session lines
                next if (/^(diskuse|remote_addr|pop_cur|pophost)\s/) ;
                # reset the quota line, pass everything else through
                if (/^disk_quota\s/) {
                        print TMP "disk_quota 100000\n" ;
                }
                else {
                        print TMP $_ ;
                }
        }
        close(UDF) ;
        close(TMP) ;

        # give the new file the owner, group, and mode of the original
        ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,$atime,$mtime,$ctime,
         $blksize,$blocks) = stat($udf) ;
        chown($uid, $gid, $TMP) ;
        chmod($mode, $TMP) ;

        # keep the original as a .bak file, then move the new file into place
        unlink($BAK) ;
        rename($udf, $BAK) ;
        rename($TMP, $udf) ;

        exit 0 ;

#endif  // warsaw | seville

#endifdef  // generic

///////////////////////////////////////////////////////////////////////////////
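To see the effect of the dmwclear.pl filter without touching a real user data file, here is a rough shell approximation run against a made-up sample. It is only a sketch: the field names diskuse, remote_addr, pop_cur, pophost, and disk_quota come from the Perl script, while every value, the "fullname" field, and the dmw_filter function name are hypothetical.

```shell
#!/bin/sh
# Rough sed approximation of the dmwclear.pl filter, for illustration only.
# Reads user.dat-style text on stdin and writes the cleared version to stdout.
# First expression: delete the stale session lines.
# Second expression: reset the disk_quota line to the fixed value.
dmw_filter() {
    sed -E \
        -e '/^(diskuse|remote_addr|pop_cur|pophost)[[:space:]]/d' \
        -e 's/^disk_quota[[:space:]].*/disk_quota 100000/'
}

# Made-up sample. Only the field names used by dmwclear.pl are real;
# "fullname" and all of the values are hypothetical.
sample='fullname Evel Knievel
diskuse 4242
disk_quota 5000
remote_addr 10.0.0.1
pop_cur 3
pophost mail.example.edu'

printf '%s\n' "$sample" | dmw_filter
```

Only the fullname line and a reset disk_quota line should survive. Unlike dmwclear.pl, this sketch makes no backup and does not preserve ownership or permissions, so it is useful for previewing the filter, not for replacing the script.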
I install dmwclear.pl on both warsaw and seville (the two systems on which we run DMailWeb) with the command
vienna# piktc -iv +P dmwclear.pl +H warsaw seville
In the configs/macros/piktfiles_prg_macros.cfg file, I created a macro for this new program:
#if warsaw | seville
dmwclear        =prgdir/dmwclear.pl
#endif
This =dmwclear macro is not essential. I could just specify "=prgdir/dmwclear.pl" in the alarm script if I wished.
Here is the PIKT wrapper script for dmwclear.pl:
///////////////////////////////////////////////////////////////////////////////

#ifndef generic

#if warsaw | seville

DMailWebClearNotice

        init
                status active
                level notice
                task "Clear DMailWeb user.dat files"
#if warsaw
                input proc "=find /opt/web/dmisp_22e/workarea -name user.dat -print"
#elsif seville
                input proc "=find /home/apache/dmisp_20n/workarea -name user.dat -print"
#endif

        rule
                // if $inlin =~ "eknievel"              // for spot testing
                                                        // on warsaw
                // if $inlin =~ "\/u_s[[:alpha:]]\/"    // staff accts; for
                                                        // initial trial run
                        exec wait "=dmwclear $inlin"
                // endif

#endif  // warsaw | seville

#endifdef  // generic

///////////////////////////////////////////////////////////////////////////////
Initially, for testing, I added this script to the Test alert in alerts.cfg, then installed the Test alert with
vienna# piktc -iv +A Test +H warsaw
For the spot test, I uncommented the "eknievel" if-endif.
On warsaw, I did the spot test with
warsaw# pikt +A Test
After verifying that eknievel's user.dat was cleared, and that a pre-clear backup file was made, I checked the Test.log:
warsaw 524) cat /var/pikt/log/Test.log
Sep 4 10:04:19 INFO: alert run begin
Sep 4 10:04:37 INFO: in dostm(), DMailWebClearNotice, exec wait "/pikt/lib/programs/dmwclear.pl /opt/web/dmisp_22e/workarea/u_ek/eknievel/user.dat 2>> /pikt/var/log/Test.log; /usr/bin/rm -f /pikt/etc/Test.exec.lock 2>/dev/null"
Sep 4 10:04:38 INFO: ran DMailWebClearNotice
Sep 4 10:04:38 INFO: alert run end
Seeing just the one eknievel entry reassured me that the spot test if-then worked and that just the one user's user.dat was cleared.
I then removed all traces of the Test alert on warsaw with
vienna# piktc -tv +A Test +H warsaw
At this point, I know that the dmwclear.pl program and DMailWebClearNotice script work as designed. I could go full bore, but I prefer to do a trial run on just the staff accounts. So, I commented out the "eknievel" line and uncommented the "\/u_s[[:alpha:]]\/" line, since staff accounts have DMailWeb working directories of the form u_s[a-z].
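Before trusting the staff-only pattern on a live run, it can be tried against a couple of paths at the command line. The sketch below uses grep -E as a stand-in for the Pikt regex match, on the assumption that the two treat this simple pattern the same way. The eknievel path is the one from the Test.log; the u_st/staffer path and the is_staff_path function name are made up for illustration.

```shell
#!/bin/sh
# Report whether a workarea path falls under a staff directory (u_s + one letter).
# grep -E with a POSIX character class approximates the Pikt pattern
# "\/u_s[[:alpha:]]\/" used in the rule.
is_staff_path() {
    printf '%s\n' "$1" | grep -Eq '/u_s[[:alpha:]]/'
}

# First path: taken from the Test.log excerpt (not a staff account).
# Second path: hypothetical staff account, invented for this example.
for p in \
    /opt/web/dmisp_22e/workarea/u_ek/eknievel/user.dat \
    /opt/web/dmisp_22e/workarea/u_st/staffer/user.dat
do
    if is_staff_path "$p"; then
        echo "staff: $p"
    else
        echo "skip:  $p"
    fi
done
```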
Next, I moved DMailWebClearNotice from the Test stanza in alerts.cfg to the end of the Notice stanza in that file.
I then installed on warsaw as follows:
vienna# piktc -iv +A Notice +P dmwclear.pl +H warsaw
processing warsaw...
installing file(s)...
Notice.alt installed
installing file(s)...
dmwclear.pl installed
If all goes well, overnight--it is Saturday morning as I write this, and we schedule our Notice alerts to run before 4 AM--the Notice alert will run on warsaw, and only staff data files will be cleared. I will verify this tomorrow by checking the Notice.log file on warsaw. (Just to be on the safe side, I think I'll also back up the entire workarea directory late this evening. Call it an innate, experience-reinforced caution that predates PIKT.)
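One way to do that check is to pull the cleared paths out of the log and list any that fall outside the staff directories. This is a sketch only: it assumes Notice.log entries have the same shape as the Test.log excerpt, the nonstaff_clears function name is made up, and the demo feeds it the one exec line from that excerpt rather than a real Notice.log.

```shell
#!/bin/sh
# Print every user.dat path that DMailWebClearNotice handed to dmwclear.pl
# and that does NOT sit under a u_s[a-z] (staff) directory.
# $1 = log file to inspect. grep -o is a common extension, not strict POSIX.
nonstaff_clears() {
    grep 'DMailWebClearNotice, exec wait' "$1" \
        | grep -Eo '/[^ "]*/user\.dat' \
        | grep -Ev '/u_s[[:alpha:]]/'
}

# Demo input: the exec line copied from the Test.log excerpt.
log=$(mktemp)
cat > "$log" <<'EOF'
Sep 4 10:04:37 INFO: in dostm(), DMailWebClearNotice, exec wait "/pikt/lib/programs/dmwclear.pl /opt/web/dmisp_22e/workarea/u_ek/eknievel/user.dat 2>> /pikt/var/log/Test.log; /usr/bin/rm -f /pikt/etc/Test.exec.lock 2>/dev/null"
EOF
nonstaff_clears "$log"
rm -f "$log"
```

Run against the spot-test line, this prints the eknievel path, since that account is not under a staff directory. After a staff-only run, the hoped-for result is no output at all. On warsaw the call would presumably be nonstaff_clears /var/pikt/log/Notice.log, with the path assumed by analogy with Test.log.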
Sometime next week, if all still goes well, I will comment out the if-endif to clear every single user file. I will then install on both of our DMailWeb servers:
vienna# piktc -iv +A Notice +P dmwclear.pl +H warsaw seville
There is a small chance that someone might be using DMailWeb at the ungodly hour of 4 AM. It remains to be seen whether the PIKT-run clearing operation will interfere with an ongoing DMailWeb session. If it does, I am confident that appropriate modifications to DMailWebClearNotice will take care of the problem.
Unfortunately, this overnight-clearing strategy won't help the case where a user accesses mail in the morning, fails to log out properly, then attempts access again in the afternoon. We still have the other command-line Perl script to handle that. But the automated PIKT clearing should handle most such problems. If need be, and with the proper safeguards built into a modified DMailWebClear script, we might be able to run this safely during daylight hours.
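One possible safeguard, sketched here and not tested against DMailWeb, is to clear only files that have sat untouched for a while. It rests on an unverified assumption: that DMailWeb rewrites user.dat during a session, so a recent modification time hints that someone is logged in. The -mmin test is a GNU/BSD find extension and may be missing from other finds; the idle_user_dats name, the 60-minute threshold, and the demo directory names are all made up.

```shell
#!/bin/sh
# List only the user.dat files that have not been modified for more than
# a given number of minutes.
# $1 = workarea directory, $2 = idle threshold in minutes.
idle_user_dats() {
    find "$1" -name user.dat -mmin +"$2" -print
}

# Self-contained demo in a scratch directory with invented account names.
work=$(mktemp -d)
mkdir -p "$work/u_ac/active" "$work/u_id/idle"
touch "$work/u_ac/active/user.dat"                 # modified just now
touch -t 199909040400 "$work/u_id/idle/user.dat"   # backdated, long idle
idle_user_dats "$work" 60                          # lists only the idle file
rm -rf "$work"
```

If the assumption holds, the same -mmin test could presumably be added to the find command in the script's input proc, so that files in active use never reach dmwclear.pl.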
You may be asking yourself why I wrote a separate Perl script, dmwclear.pl, and didn't simply have Pikt clear the user files directly (with a combination of #fopen(), #fread(), #fwrite(), and #fclose() calls). I could do that, but I know that Perl is much faster for this sort of thing. When Pikt matures and benefits from a small army of co-developers working to optimize the script interpreter the way Perl has so benefitted (or else Pikt is rewritten using Guile), maybe then Pikt will run as fast as Perl. Anyway, given that DMailWebClearNotice has to run through more than 3,000 user accounts nightly, I'll go with the faster (if not simpler) option for now.
For more examples, see Developer's Notes.