Bulc Club

Why Any Definition of Spam Never Seems to Satisfy Your Hunger

News September 18, 2015

In an article written for ModernCrypto.org last September, Mike Hearn outlines a comprehensive and fascinating history of spam from the perspective of an engineer who has been on the inside of one of the largest webmail providers in the world (Gmail). Mr. Hearn proves in the article that not only does he have a rich, extensive perspective on arguably one of the digital age’s most pervasive pests, but he can also communicate some rather complex ideas in ways that are understandable for non-technical readers.

The article by Mr. Hearn is a highly recommended read for any email user curious about how spam (also: junkmail, unsolicited email, abusemail, and bulkmail) rose to the status of one side in a “war” waged against its own purpose. Although the article makes a tremendous effort to answer reader questions before they arise, a subject as technical as spam is bound to leave many seeking further answers.

The article begins: “War” is a good analogy: there were two opposing sides and many interesting battles, skirmishes, tactics and weapons.

For users hosting email privately instead of through a large webmail service (such as Gmail), the war is hardly two-sided.

The sending side is as strong as it is with large webmail, but the receiving side is only as strong as:

a) The software (consumer or enterprise) installed on the mail server or mail client

b) The accessibility and proficiency of the team responsible for the mail server, if it’s a managed solution at all

c) Whether the volume of abusemail versus time necessary to defend against it is disproportionately greater than just deleting the offending message(s) and moving on

How is Social Filtering different from Automated Reputation Systems?

The article states: Reputations are moving averages that are calculated based on a careful blend of manual feedbacks from the Report Spam/Not Spam buttons and “auto feedbacks” generated by the spam filter itself. Obviously, manual feedbacks have a lot more weight in the system and that allows the filter to self-correct.
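As a rough illustration of the quoted idea, a reputation can be sketched as a moving average in which a manual click moves the score further than an automatic verdict. The weights, the smoothing factor, and the function name below are assumptions made for this sketch; neither the article nor Bulc Club publishes these values.

```python
# Illustrative only: these constants are assumptions, not values taken
# from the article or from any real mail provider.
MANUAL_WEIGHT = 5.0  # a Report Spam / Not Spam click counts for more...
AUTO_WEIGHT = 1.0    # ...than a verdict generated by the filter itself
ALPHA = 0.1          # base smoothing factor of the moving average


def update_reputation(reputation, is_spam, manual):
    """Return the new reputation after one feedback event.

    reputation: current score in [0, 1], where 1.0 means "always spam".
    is_spam:    True for a spam verdict, False for a not-spam verdict.
    manual:     True if a user clicked a button, False for auto feedback.
    """
    weight = MANUAL_WEIGHT if manual else AUTO_WEIGHT
    step = min(1.0, ALPHA * weight)  # heavier feedback moves the average further
    target = 1.0 if is_spam else 0.0
    return reputation + step * (target - reputation)


# The filter wrongly scores a sender as spam three times in a row...
score = 0.0
for _ in range(3):
    score = update_reputation(score, is_spam=True, manual=False)
after_auto = score

# ...and a single manual "Not Spam" click pulls the score back down.
after_manual = update_reputation(score, is_spam=False, manual=True)
```

With these assumed weights, one manual correction undoes more than the three automatic verdicts added, which is the self-correcting behavior the quote describes.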

The age-old assumption:
the more dependency on user feedback, the greater the opportunity for user error.

But if you ask any email user if they are certain the message they marked as spam is indeed spam, they will assure you they know what they are doing.

The balance between dependency on user feedback and an automated reputation system is an interesting one, because the time required to flag a message is equal to the time required to simply delete it. Without any incentive to do so, why contribute this evaluative data at all? Furthermore, for calculating these reputations, identifying legitimate mail is arguably just as valuable as identifying spam. But unless it’s a passive process, an email user has no apparent incentive to identify legitimate mail and should not be counted on to do so. In a large web-based email service, the incentive is obvious: mark a message as “Not Spam,” move that message from a spam folder back to the inbox, and assume future messages from the same sender will be protected from filtering. But on a private system, these instructions are typically recorded on only one email account and are minimally effective for future messages. No other mailboxes on the same server, or even on the same account or domain, will receive the benefits of this user action.

The obstacles to making a private mail system function like a large web-based email system are manifold, but they center on three principles:

a) Accountability. In the system’s infancy, a user won’t have the confidence that the reputation data is strong enough to be an effective filter, and at its maturity, a user won’t have the confidence that his ratings will be significant. In the beginning, even early adopters might feel that the system should be revisited later, when the network has grown large enough to not be considered just a fad. And in the end, it’s the same inhibition that keeps a voter from the polls: thinking a single rating can’t have any effect against a massive history of prior ratings. That sense of one-in-a-million insignificance should guarantee empty voting booths and lottery offices, but in spite of those odds, the incentive is enough to prevail.

The article states: I don’t want to think about how you’d build one of these outside a highly controlled environment, it was enough of a headache even in proprietary/centralised setting…

b) Flexibility. Once you define a list of server environments, multiply it by the available mail clients (such as Microsoft Outlook), and then again by the devices that access each mailbox, you end up with an unmanageable and ever-changing set of technologies to support. Large webmail is (conceptually) centralized and, its own hurdles notwithstanding, far more controllable as a service than private email. The solution will need to be one that’s not directly integrated, but rather added in series to any pre-existing and future architectures. But if it’s not integrated, then how do you give an email user immediate access to the controls necessary for a service like this?

c) Privacy. A large web email user may not actually say it out loud, but they at least suspect that some kind of sacrifice of their privacy is likely a trade-off for the use of an email service that is robust, dependable and — most of all — free. Privately leased mail servers, on the other hand, are not free (at least not yet), but one of the many benefits sought by these leaseholders is possibly the peace of mind that comes from large web email independence. The paradox is suddenly visible: once I have become a member of a social network, especially one that’s tasked with filtering my email, is my private mail still private?

The Bulc Club approach is a rather aggressive one. But at no time during the concepting, development, and subsequent deployment of this approach was the sacrifice of privacy ever a considered option. Unlike some modern technologies which are a closely guarded secret, Bulc Club’s model was published and shared with experts during concepting. The evolution of the ideas generated through this collaboration was not only re-published, but further sharing was (and is) fully encouraged.

The model evaluates a sender address and (originating) mail server domain only. It’s been proposed that these two factors could be doubled to include the sending mail server’s IP and nameserver registration as well, but while these properties are being recorded, they aren’t a compromise of user privacy.

The Bulc Club Member Console presents to its members supplemental message details such as the forwarder/alias that the user has created to organize messages, and the subject of the messages they received. But this last property is purely for identification purposes and is not evaluated or even used by the reputation calculations.

Lastly, messages that are held as a result of high ratings (by the user himself or by Bulc Club’s Member Ratings) are deleted after the user is given a conservative review period of 30 days. During the 30 days, messages are held securely for members who wish to have a held message delivered despite its rating. This review period is absolutely necessary to guarantee that no message (even legitimate spam) is ever discarded before a user has an opportunity to have it delivered anyway. After this point, the ratings information is maintained but all trace of the message itself is expunged.

From the Member Console, a user is given an additional 30 days (totaling 60 for each message) to review which messages have since been erased, before those records too are cleaned from the system. This strict policy of (a) evaluating only the message properties relevant to the reputation calculation, (b) documenting messages that have been sent versus held, and (c) finally wiping the message from existence is the core of Bulc Club operations.
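The two 30-day windows described above can be read as a simple lifecycle for a held message. The sketch below is one possible reading, not Bulc Club’s actual implementation: the function name, the state labels, and the treatment of the boundary days are all assumptions.

```python
from datetime import date

# The two 30-day windows come from the text; the names and the exact
# handling of the boundary day are assumptions made for this sketch.
REVIEW_DAYS = 30  # the held message can still be delivered on request
RECORD_DAYS = 30  # the console still lists the message as erased


def held_message_state(held_on, today):
    """Classify a held message by its age in days (hypothetical helper)."""
    age = (today - held_on).days
    if age < 0:
        raise ValueError("today is earlier than the hold date")
    if age < REVIEW_DAYS:
        return "held"  # message body stored, delivery possible
    if age < REVIEW_DAYS + RECORD_DAYS:
        return "erased_record_visible"  # body expunged, console entry remains
    return "record_cleaned"  # only the ratings information survives


state_week_two = held_message_state(date(2015, 9, 1), date(2015, 9, 18))
state_day_30 = held_message_state(date(2015, 9, 1), date(2015, 10, 1))
state_day_60 = held_message_state(date(2015, 9, 1), date(2015, 10, 31))
```

In every state the ratings information is kept, matching the statement that only the message itself is expunged.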

In addition to the forwarding of member email, Bulc Club DOES GENERATE native email messages — which sure sounds like a procedure contrary to the club’s mission. These messages, however, are limited to the following:

a) Verification Emails: guaranteeing that member email will be forwarded to a member’s chosen/authentic address (member accounts and free accounts)

b) Password Reset Emails: for members who need to have their passwords changed, a standard process for online accounts

c) Membership Account Notifications: a verification message that’s sent to the member email when their account is deactivated or canceled

d) Referrals: an invitation from a Bulc Club member to a non-member only sent upon explicit request on behalf of the member

While the first three of the above were decidedly necessary for the healthy function of the system, the last was discussed at length among the system’s collaborators and engineers. It was ultimately decided that members become members because they believe in the system. As such, they will reserve referrals for colleagues and friends who will also believe in the system, and believing in an anti-abuse system also means not abusing the system itself.

The article states: Reputation contains an inherent problem. You need lots of users, which implies accounts must be free. If accounts are free then spammers can sign-up for accounts and mark their own email as not spam, effectively doing a sybil attack on the system. This is not a theoretical problem.

So far, the discussion has been about the incentives on the receiving side of the war. The incentives on the enemy side are fairly simple: delivered messages pay, clicked links in those messages pay, and ad-to-purchase conversions (however impossible to imagine) pay. If every mail-blast is a battle, imagine a battle where one spammer sends one million messages. For each recipient that marks the received message as spam, the reputation of the spammer is decimated. One user, one click means: 100% ABUSE. That user will not receive another message from the flagged sender or offending mail server and the potential for a future payout to the spammer diminishes slightly. How about 10 users, 10 clicks? The reputation is now 10-to-0 with odds that even a riskier bettor might not take.

The spammer, no doubt, has the technology to reverse the effects of the flag by creating his own account (or accounts) working within the parameters of the reputation engine, and marking the message as safe. It will cost them a membership fee, but in their mind, it’s a small price to pay for uninterrupted future battles, right?

It’s important to remember that smart reputation systems are designed to protect receivers. This means the false account created to offset the rating will successfully receive its own abusemail, in accordance with its own decision. Meanwhile, the reputation, re-calculated to 10-to-1, will continue to prevent the 10 other members from receiving abusemail, along with every other member of the network who hasn’t rated or even received a prior email message from that sender. Bulc Club’s reputation system also factors into the calculation what’s called “implicit flagging.” Active ratings traditionally mean marking a sender with “Report Spam” or “Not Spam.” Systems that incorporate implicit flagging strengthen the reputation calculations by passive rather than active means. In effect, if a user receives a message that was filtered by other member ratings, the message is held as a possible false positive. If that message is reviewed but no instruction is given to deliver it anyway (passively flagging the message as “Not Not Spam”), that decision not to deliver the message is factored in just as integrally as its two active counterparts.
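The 10-to-0 and 10-to-1 scenario, together with implicit flagging, can be sketched as a small tally. This is a hypothetical model only: the class and method names, the simple-majority hold rule, and the decision to count an implicit flag exactly like an explicit spam flag are assumptions, since the actual Bulc Club calculation is not given here.

```python
from dataclasses import dataclass, field


@dataclass
class SenderReputation:
    """Hypothetical per-sender tally; not Bulc Club's actual data model."""
    ratings: dict = field(default_factory=dict)       # member -> "spam" / "not_spam"
    implicit_flags: set = field(default_factory=set)  # members who left mail held

    def rate(self, member, verdict):
        if verdict not in ("spam", "not_spam"):
            raise ValueError("verdict must be 'spam' or 'not_spam'")
        self.ratings[member] = verdict
        self.implicit_flags.discard(member)  # an explicit rating replaces a passive one

    def implicit_flag(self, member):
        # Reviewed a held message without asking for delivery ("Not Not Spam").
        if member not in self.ratings:
            self.implicit_flags.add(member)

    def tally(self):
        against = sum(v == "spam" for v in self.ratings.values())
        in_favor = sum(v == "not_spam" for v in self.ratings.values())
        return against + len(self.implicit_flags), in_favor

    def should_hold(self, member):
        own = self.ratings.get(member)
        if own is not None:
            return own == "spam"    # a member's own rating always wins
        against, in_favor = self.tally()
        return against > in_favor   # assumed rule: simple majority


rep = SenderReputation()
for i in range(10):
    rep.rate(f"member{i}", "spam")
tally_before = rep.tally()                   # ten flags, none in favor

rep.rate("sybil", "not_spam")                # the spammer's own account
tally_after = rep.tally()
sybil_held = rep.should_hold("sybil")        # the sybil gets its own mail
member_held = rep.should_hold("member0")     # earlier raters stay protected
newcomer_held = rep.should_hold("newcomer")  # so do members who never rated

rep.implicit_flag("newcomer")                # newcomer reviews, does not release
tally_implicit = rep.tally()
```

Under these assumptions the sybil account only changes what the sybil itself receives, while the passive decision of the newcomer strengthens the tally against the sender.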

The topics discussed in Mr. Hearn’s article were beyond thought-provoking, inspiring this reaction article and doubtless more from organizations like Bulc Club that have made it their mission to rid the world of spam. Bulc Club has always encouraged the opinions of others through its Twitter feed and Medium articles, and would like to hear your opinion as well. Write a reply, send Bulc Club a tweet, or best yet, join our fight by becoming a member today.