Multiple Home Directories
[posted 2001/11/29]
Here's some more recent cool PIKT stuff that you might find directly useful or, failing that, a source of inspiration for solving your own unique problems.
We recently discovered that one user had duplicate home directories, the one actually in use and another orphaned one. The orphaned home directory was probably left in place by a glitch in our move_homedir.pl utility a couple months back.
You should understand our setup a bit. Aside from personal desktops and central servers, we have three sets of shared user systems, for the fac (faculty), phd, and mba user communities. As an example, for the phd user community, we have six shared systems. Several user data disks are attached to each system. All disks are available on all systems by means of NFS crossmounts and automounting.
In order to check whether a phd user has multiple home directories across all phd systems, we need a list of all phd disks. We already had per-machine disk lists, as in

///////////////////////////////////////////////////////////////////////////////

DirsUser        // now auto-generated

...

#elif phd6
#  verbatim <objects/dirs_user/dirs_user_phd6_objects.cfg> [/pikt/lib/programs/dirsuser.sh 10 phd6 2>/dev/null | /usr/bin/egrep -v "crspdat"]

...

///////////////////////////////////////////////////////////////////////////////
but we lacked a DirsUser objects list for all of the phd machines (and similarly for the fac and mba machines).
The solution adopted was to add the following objects at the end of objects/dirs_user_objects.cfg:
///////////////////////////////////////////////////////////////////////////////

#if mba

DirsUserMBA

#  verbatim [cat /pikt/lib/configs/objects/dirs_user/dirs_user_mba*_objects.cfg | sort | uniq]

#endif

#if fac

DirsUserFac

#  verbatim [cat /pikt/lib/configs/objects/dirs_user/dirs_user_fac*_objects.cfg | sort | uniq]
#  include <objects/dirs_user/dirs_user_fac0-bak_objects.cfg>

#endif

#if phd

DirsUserPhd

#  verbatim [cat /pikt/lib/configs/objects/dirs_user/dirs_user_phd*_objects.cfg | sort | uniq]

#endif

///////////////////////////////////////////////////////////////////////////////
That is, cat all the individual dirs_user objects files in each group, and include the result "as is" (verbatim) into objects/dirs_user_objects.cfg using the virtual #include file technique shown above.
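The bracketed command is ordinary shell, so the merge is easy to try outside PIKT. The following is only a sketch: merge_disk_lists and the scratch file names are made up for illustration, and 'sort -u' stands in for the 'sort | uniq' pair.

```shell
#!/bin/sh
# Sketch only: merge per-machine disk lists into one de-duplicated group list.
# merge_disk_lists and the demo file names are hypothetical, not part of PIKT.

# merge_disk_lists FILE... : print every distinct line from the files, sorted
merge_disk_lists() {
    sort -u "$@"
}

# Demo in a scratch directory, so nothing under /pikt is touched.
demo=$(mktemp -d)
printf '%s\n' /export/home /pub/phd_disk_1 /pub/phd_disk_5 \
    > "$demo/dirs_user_phd1_objects.cfg"
printf '%s\n' /export/home /pub/phd_disk_5 /pub/phd_disk_6 \
    > "$demo/dirs_user_phd2_objects.cfg"

# The glob plays the role of dirs_user_phd*_objects.cfg in the real config.
merge_disk_lists "$demo"/dirs_user_phd*_objects.cfg
rm -rf "$demo"
```

Because the sort is lexical rather than numeric, a disk such as /pub/phd_disk_13 sorts ahead of /pub/phd_disk_5 in the merged list.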
When installed to a phd machine, DirsUserPhd.obj looks like:
/export/home
/pub/phd_disk_1
/pub/phd_disk_13
/pub/phd_disk_16
...
/pub/phd_disk_47
/pub/phd_disk_48
/pub/phd_disk_5
/pub/phd_disk_6
I then developed this new alarm script:
///////////////////////////////////////////////////////////////////////////////

MultipleDirsUserWarning

        init
                status active
                level warning
                task "Report multiple home directories"
#  if mba
                input proc "for f in `=cat =objdir/DirsUserMBA.obj`; do =ls -1 $f; done | =sort | =uniq -d"
#  elsif fac
                input proc "for f in `=cat =objdir/DirsUserFac.obj`; do =ls -1 $f; done | =sort | =uniq -d"
#  elsif phd
                input proc "for f in `=cat =objdir/DirsUserPhd.obj`; do =ls -1 $f; done | =sort | =uniq -d"
#  endif
                dat $name 1

#  if phd
        begin
                =remind(2002, 7, 1, "REVIEW THE PHD MULTIPLEDIRSUSERWARNING EXCEPTIONS")
#  endif

        rule    // bypass these
                if    $name eq "."
                   || $name eq ".."
                   || $name eq "lost+found"
                   || $name eq "tmp"
                   || $name eq "pikt"
#  if fac
                   || $name eq "crsp"
#  endif
#  if phd
                   || $name =~ "scully|muldur"  // these are legitimate
                                                // duplicates
#  endif
                        next
                fi

        rule
                output mail "multiple $name directories:"
#  if mba
                =outputproc(mail, "for f in `=cat =objdir/DirsUserMBA.obj`; do =ls -ld \$f/$name 2>/dev/null; done")
#  elsif fac
                =outputproc(mail, "for f in `=cat =objdir/DirsUserFac.obj`; do =ls -ld \$f/$name 2>/dev/null; done")
#  elsif phd
                =outputproc(mail, "for f in `=cat =objdir/DirsUserPhd.obj`; do =ls -ld \$f/$name 2>/dev/null; done")
#  endif
                =outputproc(mail, "=ls -ld /home/$name 2>/dev/null")
                output mail =newline

#endif  // mba | fac | phd

///////////////////////////////////////////////////////////////////////////////
The 'input proc' statement(s) generate a list of accounts with multiple data directories. The 'begin' section schedules a reminder to review the phd exceptions registered in the following rule. In the final rule, for each duplicate directory, an 'ls -ld' is output, followed by an 'ls -ld' of the user's /home link. Here is some sample output:
multiple krychek directories:
drwx------  11 krychek  phd         1024 Oct 12 14:40 /pub/phd_disk_30/krychek
drwx------  14 krychek  phd         1024 Aug 15 15:49 /pub/phd_disk_37/krychek
lrwxrwxrwx   1 root     other         25 Nov  7 07:26 /home/krychek -> /pub/phd_disk_37/krychek
From this, we can infer that /pub/phd_disk_37/krychek is the most likely official home directory, and that /pub/phd_disk_30/krychek is a probable orphan.
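That inference can be made mechanical by comparing each copy against the target of the /home link. The sketch below is not part of the alarm script; classify_homes and its arguments are hypothetical, it assumes a readlink command is available, and it compares paths as plain strings, so relative links or trailing slashes would need normalizing first.

```shell
#!/bin/sh
# Sketch only: label each copy of a duplicated home directory.
# classify_homes and its arguments are hypothetical, not part of PIKT.

# classify_homes NAME LINKDIR LISTFILE
#   NAME     : the duplicated directory name (e.g. a login name)
#   LINKDIR  : directory holding the per-user symlinks (/home in the text)
#   LISTFILE : file with one disk directory per line (like DirsUserPhd.obj)
classify_homes() {
    name=$1 linkdir=$2 listfile=$3
    # Where does the user's link point? Left empty if there is no symlink.
    target=$(readlink "$linkdir/$name" 2>/dev/null) || target=
    while IFS= read -r disk; do
        [ -d "$disk/$name" ] || continue      # this disk has no such directory
        if [ "$disk/$name" = "$target" ]; then
            echo "in-use  $disk/$name"
        else
            echo "orphan? $disk/$name"        # a candidate only; review by hand
        fi
    done < "$listfile"
}

# Usage (paths as in the sample output above):
#   classify_homes krychek /home /path/to/DirsUserPhd.obj
```

The "orphan?" label is deliberately tentative: a directory the link does not point at is only a probable orphan, and legitimate duplicates still need a human decision.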
I'm not too happy about the #if's in the script. I now prefer to keep scripts simple on the surface and hide away #if and #ifdef customizations in the *_macros.cfg and *_objects.cfg files. So, for example, we could replace the series of 'input proc' statements above with just one macroized 'input proc' statement:
input proc "for f in `=cat =dirs_user`; do =ls -1 $f; done | =sort | =uniq -d"
And similarly substitute just one macroized =outputproc() statement in the final rule.
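The same separation can be illustrated in plain shell, as a rough analogy rather than PIKT macro syntax: resolve the group-specific list file in one place, and keep the pipeline itself free of conditionals. The function names list_for_group and dup_names are invented for this sketch, and the object directory is passed in explicitly rather than assumed.

```shell
#!/bin/sh
# Sketch only: one place for the per-group choice, one generic pipeline.
# list_for_group and dup_names are hypothetical names, not PIKT features.

# list_for_group GROUP OBJDIR : print the path of that group's disk-list file
list_for_group() {
    case $1 in
        mba) echo "$2/DirsUserMBA.obj" ;;
        fac) echo "$2/DirsUserFac.obj" ;;
        phd) echo "$2/DirsUserPhd.obj" ;;
        *)   echo "unknown group: $1" >&2; return 1 ;;
    esac
}

# dup_names LISTFILE : print each entry name found under more than one disk
dup_names() {
    while IFS= read -r disk; do
        ls -1 "$disk" 2>/dev/null             # skip disks that are unreadable
    done < "$1" | sort | uniq -d
}

# Usage, with the object directory supplied by the caller:
#   dup_names "$(list_for_group phd "$objdir")"
```

The trade-off is the one described here: the calling line stays short, but a reader has to look in a second place to see what each group actually resolves to.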
But at the time, I decided against it. Macros are tremendously efficient, but they can be a pain to manage sometimes, too. It's a judgment call.
Anyway, the above approach works well enough so far. In the first sweep of our systems, we discovered about a dozen cases of orphaned home directories. Disposing of those orphans reclaimed over half a gigabyte of disk space all told.
What's with the extra #include directive in
#if fac

DirsUserFac

#  verbatim [cat /pikt/lib/configs/objects/dirs_user/dirs_user_fac*_objects.cfg | sort | uniq]
#  include <objects/dirs_user/dirs_user_fac0-bak_objects.cfg>

#endif
?
Most of the fac disks are hosted on a dedicated disk server, fac0-bak. fac0-bak is on a separate 192.168.0.0 subnet, out of direct reach of the piktmaster. (There is a way to work around this problem, I wager. I just haven't figured it out yet.) Here is the dirs_user_fac0-bak_objects.cfg file:
///////////////////////////////////////////////////////////////////////////////
//
// dirs_user_fac0-bak_objects.cfg
//
// we now need to explicitly specify the fac disks, which are hosted
// on fac0-bak, and which is not running PIKT (because fac0-bak is on
// the bak subnet and is therefore unreachable from the piktmaster,
// also is therefore unpollable
//
///////////////////////////////////////////////////////////////////////////////

#indent

/pub/fac_disk_23
...
/pub/fac_disk_34

#unindent

///////////////////////////////////////////////////////////////////////////////
Is our setup sort of screwy? You bet! But with a bit of cleverness, we can configure PIKT around any peculiarity.
For more examples, see Developer's Notes.