Report Recent Google Googlebot Visits
In this example, we report recent visits by Googlebot, Google's search engine spider. Why should you care? For one thing, new pages won't be indexed until Googlebot comes crawling. For another, suppose you are doing SEO (Search Engine Optimization) on your web pages. You can't judge whether your SEO efforts raise or lower your positions in the SERPs (Search Engine Results Pages) until Googlebot has fetched each changed page. Moreover, you might just want a general sense of search engine crawler activity. If the spider is not visiting your pages, perhaps your robots.txt file is in error, or the spider can't find the pages because of broken links, or you have some other problem.
The GooglebotVisit script might send an alert message like the following:
                                PIKT ALERT
                         Thu Sep 8 23:40:06 2005
                                 calgary

INFO:
    GooglebotVisit
        Report recent Googlebot visits

        devnotes/devnotes_1.13.x_check_file_status.html: 08/Sep/2005:22:55:26
        intro/intro_system_security.html: 08/Sep/2005:18:23:50
        ref/ref.3.ifdef_endifdef_define_setdef.html: 08/Sep/2005:22:00:06
        ref/ref.3.parse_errors.html: 08/Sep/2005:21:46:19
        samples/PerUserProcessCounts.obj.html: 08/Sep/2005:21:56:57
        samples/aliases_files.cfg.html: 08/Sep/2005:18:57:14
        samples/samples.html: 08/Sep/2005:21:57:14
In the GooglebotVisit script that follows, note how it takes its input from a Perl script, googlebot.pl.
GooglebotVisit

        init
                status =piktstatus
                level =piktlevel
                task "Report recent Googlebot visits"
                input proc "=httpd_cgibin_root/googlebot.pl pikt"
                seps "|"
                dat $page [1]
                dat $date [2]
                keys $page

        rule
                if $date ne %date
                        output mail "$page: $date"
                fi
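The googlebot.pl helper is not shown here. Judging from the seps and dat lines above, it presumably prints one "page|date" record per page, giving the time of the most recent Googlebot fetch. The following is a minimal Python sketch of that idea, not the actual Perl script. It assumes the web server writes the Apache "combined" log format and that the crawler can be recognized by the substring "Googlebot" in the user-agent field. The function names are hypothetical. The changed_pages function mirrors our reading of the rule: a page is reported when its date differs from the date recorded for the same page (the key) on the previous run.

```python
import re
from datetime import datetime

# Assumed Apache "combined" log format; the real log format is not shown
# in this example, so adjust the pattern to match your server.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[(?P<date>[^\] ]+)[^\]]*\] '
    r'"(?:GET|HEAD) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def latest_bot_visits(lines, agent_substring="Googlebot"):
    """Return {page: date string} with the most recent visit per page."""
    latest = {}
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or agent_substring not in match.group("agent"):
            continue  # malformed line, or not the crawler we want
        page = match.group("path").lstrip("/")
        date = match.group("date")
        stamp = datetime.strptime(date, "%d/%b/%Y:%H:%M:%S")
        if page not in latest or stamp > latest[page][0]:
            latest[page] = (stamp, date)
    return {page: date for page, (_, date) in latest.items()}


def format_records(visits):
    """Render visits as "page|date" lines, matching seps "|" above."""
    return [f"{page}|{date}" for page, date in sorted(visits.items())]


def changed_pages(previous, current):
    """List "page: date" for pages whose date differs from the last run."""
    return [
        f"{page}: {date}"
        for page, date in sorted(current.items())
        if previous.get(page) != date
    ]
```

Passing a different agent_substring would adapt the same sketch to another crawler. A page with no entry from the previous run counts as changed, so newly crawled pages are reported the first time they appear.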
A companion script, GoogleMediabotVisit, reports recent visits by the Google mediabot (the ad server crawler). (Note the similarity between the two scripts, the only significant difference being the change in the date field position.) You could also write scripts to report recent visits by the robots, spiders, and crawlers of other search engines, such as Yahoo!, MSN Search, AltaVista, Excite, and so on.
For more examples, see Samples.