A while ago, I wrote about the ocrtext SpamAssassin plugin. There is a version of this plugin that is still in active development: FuzzyOCR. It is very similar to the older ocrtext plugin, but instead of using regular expressions to cope with the inaccuracies of OCR, it uses String::Approx to do fuzzy matching.
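To give a feel for why fuzzy matching suits OCR output better than regular expressions, here is a rough sketch of approximate substring matching in Python. It is not FuzzyOCR's code and not the String::Approx API (that is a Perl module); the function name `fuzzy_find` and the `max_errors` threshold are made up for illustration. It just checks whether a word appears somewhere in the text within a given number of single-character edits.

```python
def fuzzy_find(word, text, max_errors):
    """Return True if `word` occurs somewhere in `text` with at most
    `max_errors` single-character edits (insert, delete, substitute).

    Hypothetical helper for illustration only; not the String::Approx API.
    """
    word = word.lower()
    text = text.lower()

    # prev[i] = fewest edits needed to match word[:i] against some
    # substring of the text ending at the previous position.
    prev = list(range(len(word) + 1))
    if prev[-1] <= max_errors:
        return True  # the word is short enough to match an empty substring

    for ch in text:
        curr = [0]  # a match is allowed to start anywhere in the text
        for i, wc in enumerate(word, start=1):
            cost = 0 if wc == ch else 1
            curr.append(min(
                prev[i - 1] + cost,  # match or substitute
                prev[i] + 1,         # extra character in the text
                curr[i - 1] + 1,     # character of the word missing from the text
            ))
        if curr[-1] <= max_errors:
            return True
        prev = curr
    return False


if __name__ == "__main__":
    # OCR misreads "i" as "1": one substitution, still a hit.
    print(fuzzy_find("viagra", "buy v1agra now", 1))
    # Unrelated text stays a miss.
    print(fuzzy_find("viagra", "lunch at noon tomorrow", 1))
```

With a regular expression you would have to anticipate each OCR mistake (`v[i1l]agra` and so on); with an error budget, one word list covers misreads you did not think of, at the cost of tuning the threshold to avoid false positives.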
This wiki page has some pretty good steps on setting this plugin up.
Also see:
http://wiki.apache.org/spamassassin/FuzzyOcrPlugin