Find Spelling Errors

spell_check.pl is a simple Perl script that finds spelling errors in a text document, for example an HTML file. It uses the find_words script to identify the words in the specified text file. Each word is looked up first in the standard ispell dictionary and then, if not found there, in the specified local dictionary (in this case PIKTDictionary.obj). Any word found in neither dictionary is a possible spelling error and is output, one word per line. Any reported word that is not a genuine spelling mistake should be added to the local dictionary file (PIKTDictionary.obj).
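
Adding false positives to the local dictionary can be done by hand, but a small helper keeps the file in the format the script expects (one lower-case word per line). The following is a minimal sketch, not part of the PIKT distribution: the name add_words is hypothetical, and it assumes the existing dictionary file, if any, ends with a newline.

```perl
use strict;
use warnings;

# add_words: hypothetical helper (not part of PIKT).  Appends new words to a
# local dictionary file, one lower-case word per line, skipping words that
# are already present.  Returns the number of words actually added.
sub add_words {
    my ($dictionary, @new_words) = @_;

    # Remember the words already in the dictionary, if the file exists.
    my %seen;
    if (open(my $in, '<', $dictionary)) {
        while (my $line = <$in>) {
            $line =~ s/\s//g;           # remove spaces and the newline
            next if $line eq '';        # bypass empty lines
            $seen{lc($line)} = 1;
        }
        close($in);
    }

    # Append only the words not seen before.
    my $added = 0;
    open(my $out, '>>', $dictionary)
        or die "cannot append to $dictionary: $!\n";
    for my $word (@new_words) {
        my $w = lc($word);
        $w =~ s/\s//g;
        next if $w eq '' or $seen{$w}++;
        print {$out} "$w\n";
        $added++;
    }
    close($out) or die "cannot close $dictionary: $!\n";
    return $added;
}
```

For example, add_words('PIKTDictionary.obj', 'piktc', 'Osterlund') would append "piktc" and "osterlund" unless they are already listed.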

spell_check.pl is called by the ReportPIKTSpellingErrors Pikt script.

#!/usr/bin/perl

# spell_check.pl:  spell check a text file; find all words in a text
#                  file, look up each in the specified dictionary
#                  file (with one lower case word per line), and output
#                  those not found (one word per line)
#
#                  Usage:  spell_check.pl -d <dictionary> -f <file>

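while (@ARGV) {
        if ($ARGV[0] eq "-d") {
                shift;
                $dictionary = $ARGV[0];
                shift;
                next;
        }
        if ($ARGV[0] eq "-f") {
                shift;
                $file = $ARGV[0];
                shift;
                next;
        }
        shift;                  # discard any unrecognized argument
}

die "Usage:  spell_check.pl -d <dictionary> -f <file>\n"
        unless defined $dictionary && defined $file;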

open(DICTIONARY, '<', $dictionary)
        or die "cannot open dictionary $dictionary: $!\n";
while(<DICTIONARY>) {
        chomp;
        next if /^$/;           # bypass empty lines
        $word = $_;
        $word =~ s/\s//g;       # remove spaces
        $word =~ tr/A-Z/a-z/;   # convert to lower case
#       print "#$word#\n";
        $dict{$word}++;         # add word to internal dictionary
}
close(DICTIONARY);

#foreach $key (keys %dict) {
#       print "$key\n";
#}

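# skip lines containing a literal "..."; the backslashes are doubled
# because "\." inside a double-quoted Perl string is just "."
open(WORDS, "/bin/egrep -v '\\.\\.\\.' $file | /usr/local/bin/find_words | /usr/bin/ispell -l |")
        or die "cannot start word pipeline: $!\n";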
while(<WORDS>) {
        chomp;
        next if /^$/;           # bypass empty lines
        $word = $_;
        $word =~ tr/A-Z/a-z/;   # convert to lower case
        next if $dict{$word};   # skip if found in internal dictionary
        print "$word\n";        # print if not found
}
close(WORDS);
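
The local-dictionary lookup can be exercised on its own, without ispell or find_words installed. The sketch below is not part of the PIKT distribution: it factors the script's two loops into hypothetical helpers, load_dictionary and unknown_words, and takes its candidate words from a plain list rather than from the pipeline.

```perl
use strict;
use warnings;

# load_dictionary: hypothetical helper.  Reads a dictionary file with one
# word per line and returns a hash reference keyed by lower-cased word,
# normalizing each line the same way the script does.
sub load_dictionary {
    my ($path) = @_;
    my %dict;
    open(my $fh, '<', $path) or die "cannot open $path: $!\n";
    while (my $word = <$fh>) {
        $word =~ s/\s//g;               # remove spaces and the newline
        next if $word eq '';            # bypass empty lines
        $dict{lc($word)} = 1;           # add word to internal dictionary
    }
    close($fh);
    return \%dict;
}

# unknown_words: hypothetical helper.  Returns the lower-cased candidates
# that are not in the dictionary, in their original order.
sub unknown_words {
    my ($dict, @candidates) = @_;
    return grep { !$dict->{$_} }
           map  { lc($_) }
           grep { $_ ne '' } @candidates;
}
```

In the script itself the candidates are the words that ispell -l has already rejected, so anything unknown_words returns would be missing from both dictionaries.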

For more examples, see Samples.

 
Page last updated 2008-03-27.
PIKT® is a registered trademark of the University of Chicago.   Copyright © 1998-2008 Robert Osterlund. All rights reserved.