Report Recent Googlebot Visits

In this example, we report recent visits by Googlebot, Google's search engine spider.  Why is this a concern?  For one thing, if you have introduced new pages, they won't be indexed until Googlebot comes crawling.  For another, suppose you are doing SEO (Search Engine Optimization) on your web pages.  You can't judge whether your SEO efforts raise or lower your positions in the SERPs (Search Engine Results Pages) until Googlebot has fetched each page.  Moreover, you might simply want a general sense of search engine crawler activity.  If the spider is not visiting your pages, perhaps your robots.txt file is in error, or the spider can't find some pages because of broken links, or you have some other problem.

The GooglebotVisit script might send an alert message like the following:

                                PIKT ALERT
                         Thu Sep  8 23:40:06 2005
                                 calgary

INFO:
    GooglebotVisit
        Report recent Googlebot visits

        devnotes/devnotes_1.13.x_check_file_status.html: 08/Sep/2005:22:55:26
        intro/intro_system_security.html: 08/Sep/2005:18:23:50
        ref/ref.3.ifdef_endifdef_define_setdef.html: 08/Sep/2005:22:00:06
        ref/ref.3.parse_errors.html: 08/Sep/2005:21:46:19
        samples/PerUserProcessCounts.obj.html: 08/Sep/2005:21:56:57
        samples/aliases_files.cfg.html: 08/Sep/2005:18:57:14
        samples/samples.html: 08/Sep/2005:21:57:14

In the GooglebotVisit script below, note that it takes its input from a Perl script, googlebot.pl.  Each input line is split on "|" into a page field and a date field, and records are keyed by page.

GooglebotVisit

        init
                status =piktstatus
                level =piktlevel
                task "Report recent Googlebot visits"
                input proc "=httpd_cgibin_root/googlebot.pl pikt"
                seps "|"
                dat $page [1]
                dat $date [2]
                keys $page

        rule
                if $date ne %date
                        output mail "$page: $date"
                fi
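
The googlebot.pl helper is not shown on this page.  As a rough illustration of the kind of "page|date" records the script above expects, here is a minimal sketch in Python (not the author's Perl).  It assumes an Apache combined-format access log; the regular expression, the function names, and the sample log lines (IP addresses, byte counts, time zone) are all made up for the example.

```python
import re

# Assumed Apache "combined" log layout; adjust if your server logs differently.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<date>[^\s\]]+)[^\]]*\] '
    r'"(?:GET|HEAD) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def latest_visits(lines, agent="Googlebot"):
    """Map each page to the timestamp of the crawler's most recent fetch."""
    latest = {}
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or agent not in m.group("agent"):
            continue  # unparsable line, or not the crawler we care about
        page = m.group("path").lstrip("/")
        # Access logs are chronological, so a later line overwrites an earlier one.
        latest[page] = m.group("date")
    return latest

def format_records(latest):
    """Render 'page|date' lines, matching seps "|" and the two dat fields."""
    return [f"{page}|{date}" for page, date in sorted(latest.items())]

# Hypothetical sample log lines, for illustration only.
SAMPLE = [
    '192.0.2.10 - - [08/Sep/2005:18:57:14 -0500] "GET /samples/samples.html HTTP/1.1" '
    '200 5123 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.77 - - [08/Sep/2005:19:00:00 -0500] "GET /samples/samples.html HTTP/1.1" '
    '200 5123 "-" "Mozilla/5.0"',
    '192.0.2.10 - - [08/Sep/2005:21:57:14 -0500] "GET /samples/samples.html HTTP/1.1" '
    '200 5123 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.10 - - [08/Sep/2005:18:23:50 -0500] "GET /intro/intro_system_security.html HTTP/1.1" '
    '200 900 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
]

if __name__ == "__main__":
    for record in format_records(latest_visits(SAMPLE)):
        print(record)
    # prints:
    # intro/intro_system_security.html|08/Sep/2005:18:23:50
    # samples/samples.html|08/Sep/2005:21:57:14
```

Because only the latest timestamp per page is emitted, a rule such as "if $date ne %date" has just one record per page to compare.  Passing a different agent substring (for example, "Mediapartners-Google" for the Google mediabot) would yield the equivalent feed for another crawler.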

A companion script, GoogleMediabotVisit, reports recent visits by the Google mediabot (the ad server crawler).  (Note the similarity between the two scripts, the only significant difference being the change in the date field position.)  You could also write scripts to report recent visits by the robots, spiders, and crawlers of other search engines, such as Yahoo!, MSN Search, AltaVista, Excite, and so on.

For more examples, see Samples.

 
Page best viewed at 1024x768 or greater.   Page last updated 2008-03-27.   This site is PIKT® powered.
PIKT® is a registered trademark of the University of Chicago.   Copyright © 1998-2008 Robert Osterlund. All rights reserved.