URL Status
In the url_status.pl Perl script, we determine and report the status of a given URL.
We use two subroutines from the LWP::Simple module, getstore() and is_success(), to fetch a web page and test the result. LWP--"Library for World Wide Web in Perl"--is a set of Perl modules providing an application programming interface (API) to the World Wide Web. For more information on LWP, visit this site.
The getstore() subroutine gets a document and stores it in a file. The return value is the HTTP response code.
Unless the fetch succeeds, we print the status code and the URL. A fetch counts as successful when two conditions hold: the HTTP status code is 200 (HTTP_OK) or some other code signifying success, and the document was stored in a local file of non-zero size. (We zero out the file before attempting the fetch, so a stale copy cannot pass the size test.) For successful fetches, we report nothing. For example, url_status.pl might report:

    # /usr/local/bin/url_status.pl https://pikt.org/foo.html /tmp/junk.url_status
    404 https://pikt.org/foo.html

where 404 is the HTTP response code for "URL not found" (HTTP_NOT_FOUND).
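The success test and the status-code names mentioned above can be tried out without touching the network. The following is a minimal sketch, assuming the HTTP::Status module (installed alongside LWP) is available; describe_status() is a hypothetical helper introduced here for illustration and is not part of url_status.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;
# HTTP::Status supplies is_success(), status_message(), and the HTTP_* constants.
use HTTP::Status qw(:constants is_success status_message);

# Hypothetical helper (not in url_status.pl): render a status code together
# with its standard message and the verdict that is_success() gives it.
sub describe_status {
    my ($code) = @_;
    my $verdict = is_success($code) ? "success" : "failure";
    my $text    = status_message($code) // "unknown";
    return sprintf "%d %s (%s)", $code, $text, $verdict;
}

# Two success codes and two failure codes.
print describe_status($_), "\n" for (HTTP_OK, 204, HTTP_NOT_FOUND, 500);
```

Because is_success() accepts any code in the 2xx range, a status such as 204 passes the first half of the test just as 200 does.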
The script follows.
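    #!/usr/bin/perl -w

    use strict;
    use LWP::Simple;

    # expect exactly two arguments: the URL and a local file name
    if (@ARGV == 2) {
        my $url = $ARGV[0];
        my $fil = $ARGV[1];
        # zero out the file so that a stale copy cannot pass the size test
        system("/bin/cp /dev/null $fil");
        # fetch the document into the file; the return value is the HTTP status code
        my $stat = getstore($url, $fil);
        # report only failures: a non-success status code or an empty file
        unless (is_success($stat) && (-s $fil)) {
            printf "%s %s\n", $stat, $url;
        }
    }
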
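As a variation, here is a minimal sketch that performs the same check but truncates the file with Perl's own open() rather than shelling out to /bin/cp, so the file name is never passed through a shell. The check_url() subroutine is a hypothetical name introduced for this example; it is not how url_status.pl itself is organized.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw(getstore is_success);

# Hypothetical helper: return nothing when the fetch succeeded and the stored
# file is non-empty; otherwise return the HTTP status code.
sub check_url {
    my ($url, $file) = @_;

    # Opening for write truncates the file to zero length.
    open(my $fh, '>', $file) or die "cannot truncate $file: $!";
    close($fh);

    my $status = getstore($url, $file);
    return if is_success($status) && -s $file;
    return $status;
}

# Same command-line interface as the original script: URL, then local file.
if (@ARGV == 2) {
    my ($url, $file) = @ARGV;
    my $status = check_url($url, $file);
    print "$status $url\n" if defined $status;
}
```

Keeping the check in a subroutine makes it possible to exercise it against file:// URLs, which LWP also handles, before pointing it at a live site.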
url_status.pl is used by CheckBrokenLinksExternal and other Pikt scripts.
For more examples, see Samples.