Wanted: Spam trap extension for Mozilla Thunderbird

I'd like to see someone write a spam-trap extension for Mozilla Thunderbird that would simply delete any messages that match messages received on a spam-only account. I'd be willing to pay for such an extension.

Concept

I first saw this idea in use on unstable.nl. At the bottom of the page was this puzzling declaration:

spam-trap@unstable.nl - Please send spam.
Humans may write to andreas@unstable.nl

I presumed that Andreas had programmed his mail client or retriever to delete from andreas@ any messages that are identical or similar to messages that appear on spam-trap@. I later contacted him on Jabber, and he confirmed my suspicions, adding that he only sees one piece of spam per week. I was impressed.

Specification

A Mozilla Thunderbird extension could easily implement this concept. Have the user specify an address they own but don't use, such as an outdated Hotmail account. Then delete from the user's other accounts any messages that are identical or similar to those arriving at that address. Defining "similar" is the hard part, of course, but I have some ideas:

  • Compute a quick hash of each embedded attachment and compare the hashes (otherwise large attachments may have a disproportionate effect on the comparison)
  • Use a diff function on the textual parts
  • Strip query strings from URLs and embedded forms (a query string may carry a hashed copy of the recipient's email address)
  • Compare some email headers
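
To make the first three ideas concrete, here is a rough Python sketch of what the comparison might look like. It is only an illustration: a real Thunderbird extension would be written in JavaScript, and the function names, the URL pattern, and the 0.9 threshold are all my own assumptions rather than anything tested against real spam. Header comparison is left out, since I don't yet know which headers are worth comparing.

```python
import hashlib
import re
from difflib import SequenceMatcher
from urllib.parse import urlsplit, urlunsplit

# Rough pattern for http(s) URLs in a message body (an assumption, not exhaustive).
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def strip_query(url):
    """Drop the query string and fragment, which may encode the recipient."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def normalize_text(body):
    """Strip URL query strings, collapse whitespace, and lowercase the text."""
    without_queries = URL_RE.sub(lambda m: strip_query(m.group(0)), body)
    return " ".join(without_queries.split()).lower()


def attachment_digests(attachments):
    """Hash each attachment (bytes) so it counts once, whatever its size."""
    return {hashlib.sha256(data).hexdigest() for data in attachments}


def is_similar(trap_body, body, trap_attachments=(), attachments=(), threshold=0.9):
    """Guess whether a message matches one seen on the spam-trap account.

    The threshold is a hypothetical starting point and would need tuning.
    """
    # Any attachment shared with a trap message is treated as a match.
    if attachment_digests(trap_attachments) & attachment_digests(attachments):
        return True
    # Otherwise fall back to a diff-style ratio over the normalized text.
    ratio = SequenceMatcher(
        None, normalize_text(trap_body), normalize_text(body)
    ).ratio()
    return ratio >= threshold
```

With this sketch, two copies of a message that differ only in a tracking parameter (say, `?id=abc123` versus `?id=zzz999` on the same link) normalize to the same text and are treated as a match, while an unrelated personal note falls well below the threshold.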

Research

I don't know much about email headers or routing, so I don't know how same-session spam messages are similar or different. Research into this would be necessary. Perhaps public data on this already exists.

Problems

This technique of filtering may be circumvented if spammers start sending out messages with more randomization and scrambling. Additionally, if this filtering technique were to become popular, unforeseen loopholes would undoubtedly arise. In both cases, however, I am certain that spammers would be required to use more processing power, and therefore incur more cost to themselves.

Bounty

This is a cool enough idea to warrant a bounty, especially if research is required. I would be willing to pay $50 out of my own pocket for the first successful solution, and I'm sure others would be willing to contribute. Alternatively, if someone can find a fatal flaw in the idea before any serious work is done, I am willing to pay that person $5-10. (I might pay more if they devise a new specification that is not vulnerable to the same flaw.)

A "successful solution" is defined as open source/free software, cross-platform, reasonably non-buggy, and able to implement at least the core feature of the request (here, deletion of mail on one account upon receipt of a similar message in another). A "fatal flaw" is defined as a reasonably simple concept or proof-of-concept which, if implemented, would defeat any reasonable solution.

Please, if you plan to implement this idea, leave a note here so that people are not duplicating efforts. If there is a change in status, I will notify every person who leaves a comment, unless they request otherwise. (Yeah, I know, opt-out emailing...)

Are you willing to pledge bounty money for an implementation? Leave a note here to motivate potential developers. (Your pledge isn't binding, even though mine is.)


Responses: 3 so far

  1. Cory Capron says:

    There's a Jaws or Critters joke in there somewhere... I'm just too tired to figure it out.

    Cool idea. Happy hunting!

  2. Christoph Mueller says:

    Nice idea.

    Possible problems are that spammers often attach individual random junk to messages so comparing will be difficult, and that emails will not arrive simultaneously, so with frequent checks for new mail there is a good chance that the real account will receive the spam first, and when the trap account gets hit it is already filtered.

    That said, if you find something like this, please let me know.

  3. Tim McCormack says:

    There is always a pattern: A spam generation program can only work with so many templates and so many variable fields, the mail proxy that the messages are sent through will generate similar headers each time, and even the image generation programs for the image-based spam leave signatures.

    In other words, until we're dealing with true AI spam engines, this strategy should work. And by the time we have AI, we won't need this kind of spam filter anymore.