.maildelivery - simple rule-based
procmail - complex rule-based
Ifile - full text analysis
This is used with the MH family of mail readers. It is not very sophisticated, but it is useful enough if your needs are limited.
If you use sendmail as the mail transport agent, create .forward containing: "|/bin/sh -c 'exec >> /tmp/out 2>&1; /usr/lib/mh/slocal -user cbbrowne -verbose -debug'"
If, like me, you use Qmail, a "more modern" mail transport agent, create .qmail containing: | /usr/lib/mh/slocal -user cbbrowne
    Subject "Connect Log" | A "/usr/lib/mh/rcvstore +Logs"
    # CT = "Canada Trust"
    Subject "Portfolio Valuation" | A "/usr/lib/mh/rcvstore +CT"
    # Stuff from family & Anil goes to appropriate mailboxes
    From "email@example.com" | A "/usr/lib/mh/rcvstore +Brownes/Brad"
    From "firstname.lastname@example.org" | A "/usr/lib/mh/rcvstore +Brownes/Dad"
    From "email@example.com" | A "/usr/lib/mh/rcvstore +Friends/Anil"
    From "firstname.lastname@example.org" | A "/usr/lib/mh/rcvstore +Brownes/Dave"
    From "email@example.com" | A "/usr/lib/mh/rcvstore +Brownes/Doug"
    # Use Ifilter on the remainder...
    * - ^ ? /home/cbbrowne/bin/ifilter
    # And if it's not been hit just yet, toss it in the inbox
    # It also means that if ifilter should happen to crash
    # for some reason, the mail will *not* be lost; the failed status code
    # means that processing will fall out to this rule
    default - ^ ? "/usr/lib/mh/rcvstore +inbox"
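The fall-through behaviour described in those comments can be modelled roughly as follows. This is a simplified Python sketch, not slocal's implementation: it handles only the "A" and "?" result codes that appear in the rules, treats patterns as case-insensitive substrings, and ignores the action column. The names run_rules, rules and actions are hypothetical.

```python
def run_rules(headers, rules, actions):
    """Simplified model of a .maildelivery rule loop.

    headers: dict mapping header name to value
    rules:   list of (field, pattern, result, action_name) tuples
    actions: dict mapping action_name to a callable that returns True on success
    Returns the names of the actions that delivered the message.
    """
    delivered = False
    delivered_by = []
    for field, pattern, result, action_name in rules:
        if field == "default":
            # "default" only matches if nothing has delivered the message yet.
            matches = not delivered
        elif field == "*":
            matches = True
        else:
            matches = pattern.lower() in headers.get(field, "").lower()
        if not matches:
            continue
        # "?" runs the action only if the message is still undelivered;
        # "A" always runs it.
        if result == "?" and delivered:
            continue
        if actions[action_name](headers):
            delivered = True
            delivered_by.append(action_name)
    return delivered_by


rules = [
    ("Subject", "Connect Log", "A", "logs"),
    ("*", "-", "?", "ifilter"),
    ("default", "-", "?", "inbox"),
]
# Pretend ifilter crashed: its action reports failure.
actions = {
    "logs": lambda h: True,
    "ifilter": lambda h: False,
    "inbox": lambda h: True,
}
print(run_rules({"Subject": "hello"}, rules, actions))
```

Because the failed ifilter action leaves the message undelivered, the final "default" rule still fires and the message lands in the inbox.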
procmail is included with all distributions of Linux that I know of.
    :0 c
    backup
    # Preserve all mail in backup files Just In Case...
    :0 ic
    | cd backup && rm -f dummy `ls -t msg.* | sed -e 1,32d`

    :0:.saplock
    * ^Sender: SAP R/3 Discussion
    * Subject: *BASIS
    | /usr/lib/mh/rcvstore +SAP/BASIS

    :0:.saplock
    * ^Sender: SAP R/3 Discussion
    | /usr/lib/mh/rcvstore +SAP

    :0:.alpha
    * ^Resent-From: axp-list
    | /usr/lib/mh/rcvstore +Alpha

    :0:.ntlug
    * ^Sender: owner-ntlug
    | /usr/lib/mh/rcvstore +Linux/NTLug

    :0:.general
    * ^From:
    | /usr/lib/mh/rcvstore +inbox

    #:0:.portfolio
    #* ^Subject: Portfolio
    #| /usr/lib/mh/rcvstore +CT
    #
    #:0:.debian
    #* ^Resent-From: debian-announce
    #| /usr/lib/mh/rcvstore +Linux/Debian
    #
    #:0:.dilbert
    #* ^Subject: New Dilbert
    #| /usr/lib/mh/rcvstore +dilbert
This works in conjunction with MH/EXMH, providing a filter that examines your message refiling patterns in order to determine where future mail ought to go.
Searches through your mail folders, counting the frequencies of words and messages. Infrequently used words are discarded, and the statistical results are written to a file, .idata.
Use irefile when a message is misclassified. It does two things:
Moves the message to the right folder
Dribbles frequency statistics into .idata.queue
Updates .idata by adding in frequency information from .idata.queue
Reads the message, computes a "relevance value" for each folder based on the statistics in .idata, and files the message in the folder with the best "relevance value."
Relevance is based on the log function so that high occurrences of common words (like "the" or "and") are appropriately discounted.
Text search systems commonly use "stop lists," removing these common words because they just make queries messy. Since Ifile uses the entire text of the incoming message as a "query key," even members of a typical "stop list" can usefully contribute to the result.
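To make the idea of a log-damped "relevance value" concrete, here is a small Python sketch. It is not Ifile's actual formula or file format: the scoring function, the min_count threshold and the names train, relevance and classify are assumptions chosen purely for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def train(folders, min_count=1):
    """Count word frequencies per folder, dropping words seen fewer than min_count times."""
    stats = {}
    for name, messages in folders.items():
        counts = Counter()
        for text in messages:
            counts.update(tokenize(text))
        stats[name] = Counter({w: c for w, c in counts.items() if c >= min_count})
    return stats


def relevance(message, counts):
    """Score a message against one folder; logs damp the effect of very common words."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    score = 0.0
    for word, n in Counter(tokenize(message)).items():
        score += math.log(1 + n) * math.log(1 + counts.get(word, 0)) / math.log(1 + total)
    return score


def classify(message, stats):
    """Return the folder with the highest relevance value."""
    return max(stats, key=lambda name: relevance(message, stats[name]))


stats = train({
    "SAP": ["the basis transport failed", "the basis system upgrade"],
    "Linux": ["the kernel patch compiled", "the kernel module loaded"],
})
print(classify("the new kernel patch", stats))  # Linux
```

Note that "the" contributes the same amount to both folders here, so the decision rests on the words that actually distinguish them, without needing a stop list.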
TMDA takes the approach of only accepting mail from those willing to jump through the hoop of a quasi-authentication process.
All mail coming from non-authenticated sources gets queued, and a message is sent back asking the sender to authenticate. Spammers certainly won't bother; senders who do authenticate go onto the "authenticated" list, and their messages are passed straight through from then on.
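The queue-and-challenge flow just described can be sketched in a few lines of Python. This is an illustrative model only, not the tool's real code or configuration: the class ChallengeGate, its method names, and the use of a random token as the challenge are all assumptions.

```python
import secrets


class ChallengeGate:
    """Hold mail from unknown senders until they answer a challenge."""

    def __init__(self):
        self.authenticated = set()   # senders who have answered a challenge
        self.queue = {}              # sender -> messages held back
        self.tokens = {}             # challenge token -> sender
        self.inbox = []              # messages actually delivered
        self.challenges_sent = []    # (sender, token) pairs, i.e. outgoing challenges

    def receive(self, sender, message):
        if sender in self.authenticated:
            self.inbox.append(message)
            return "delivered"
        if sender not in self.queue:
            # First message from this sender: issue exactly one challenge.
            token = secrets.token_hex(8)
            self.tokens[token] = sender
            self.challenges_sent.append((sender, token))
        self.queue.setdefault(sender, []).append(message)
        return "queued"

    def confirm(self, token):
        """Called when a sender replies to the challenge with a valid token."""
        sender = self.tokens.pop(token, None)
        if sender is None:
            return False
        self.authenticated.add(sender)
        self.inbox.extend(self.queue.pop(sender, []))
        return True
```

A sender who never replies simply stays in the queue, which is the whole point: the cost of authenticating falls on the sender, not the recipient.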
The "email client" equivalent to Google; written in Java, targeted primarily at MacOS-X, with some ability to run on other Java-supported platforms.
Mailfilter allows the gentle user to apply a set of rules to the headers of mail in their POP3 mailbox, deleting messages that match particular rules. It can function in a "rule-based" as well as in a "score-based" manner.
It is not nearly as good a classifier as (say) Ifile, but has the merit that it can expunge mail before you download it. I found this extremely useful in the wake of 2003's Swen-32 email virus, which was pummelling my mailboxes with exorbitant quantities of infected messages. My systems may be immune to the viruses, but that doesn't mean I want to consume the bandwidth to download the messages.
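The difference between the "rule-based" and "score-based" modes can be illustrated with a short Python sketch that works on already-fetched headers. It does not use Mailfilter's own configuration syntax and leaves out the POP3 side entirely; the function should_delete and its parameters are hypothetical.

```python
import re

FLAGS = re.IGNORECASE | re.MULTILINE


def should_delete(headers, deny_rules, score_rules, max_score):
    """Decide from the headers alone whether a message should be deleted.

    deny_rules:  regexes; a single match deletes the message (rule-based)
    score_rules: (regex, points) pairs; the message is deleted when the
                 summed points exceed max_score (score-based)
    """
    text = "\n".join(f"{name}: {value}" for name, value in headers.items())
    if any(re.search(pattern, text, FLAGS) for pattern in deny_rules):
        return True
    score = sum(points for pattern, points in score_rules
                if re.search(pattern, text, FLAGS))
    return score > max_score


deny = [r"^Subject:.*security update"]
scores = [(r"^Subject:.*free", 60), (r"^From:.*@example\.net", 50)]

virus = {"From": "x@example.com", "Subject": "Microsoft Security Update"}
promo = {"From": "promo@example.net", "Subject": "Free offer"}
friend = {"From": "friend@example.org", "Subject": "Free on Friday?"}
print([should_delete(h, deny, scores, 100) for h in (virus, promo, friend)])
```

A deny rule acts on a single match, while the score rules need several weak signals to add up before a message is dropped, which makes them less likely to discard legitimate mail.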
Ifile was designed for use in conjunction with MH for email. I have written Perl scripts that go through a news spool and pass news messages on to ifilter. I'm still in the experimental stages on this; at present it doesn't handle outgoing news very well. But it does provide fairly good message classification. Various forms of "spam" get quite accurately classified as such, so that it takes only a quick browse to "nuke the spam." I still have to complete the circle, turning at least portions of this back into news. (Maybe it's time to try out GNUS.)