> The new definition of spam is "whatever our users say spam is", a definition that cannot be argued with and is simultaneously crisp enough to implement, yet vague enough to adapt to whatever spammers come up with.
> Eventually it had to be replaced with an online system that recalculates scores on the fly. This system is a tremendously impressive piece of engineering - it's basically a global, real time peer to peer learning system. There are no masters. The filter is distributed throughout the world and can tolerate the loss of multiple datacenters.
> I don't want to think about how you'd build one of these outside a highly controlled environment, it was enough of a headache even in the proprietary/centralised setting ....
> The reputation system was generalised to calculate reputations over *features* of messages beyond just sending domain. A message feature can be, for example, a list of the domains found in clickable hyperlinks. Links would turn out to be a critical battleground that would be extensively fought over in the years ahead.
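
A rough sketch of what reputation over message features could look like, not the system described above. The feature names (`sender_domain`, `link_domain`), the `FeatureReputation` class, and the smoothed spam-ratio score are all assumptions made for illustration; the quoted text does not say how scores are actually computed.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Naive href extractor; a real filter would use a proper HTML parser.
HREF_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)


def extract_features(sender_domain, html_body):
    """Return a set of hypothetical (kind, value) features for one message."""
    features = {("sender_domain", sender_domain.lower())}
    for url in HREF_RE.findall(html_body):
        host = urlparse(url).hostname  # None for relative or malformed URLs
        if host:
            features.add(("link_domain", host.lower()))
    return features


class FeatureReputation:
    """Tracks user spam / not-spam reports per feature (assumed design)."""

    def __init__(self, prior_spam=1.0, prior_ham=1.0):
        self.prior_spam = prior_spam
        self.prior_ham = prior_ham
        self.counts = defaultdict(lambda: [0, 0])  # feature -> [spam, ham]

    def report(self, features, is_spam):
        for feature in features:
            self.counts[feature][0 if is_spam else 1] += 1

    def score(self, feature):
        """Smoothed fraction of reports calling this feature spam (0..1)."""
        spam, ham = self.counts.get(feature, (0, 0))
        total = spam + ham + self.prior_spam + self.prior_ham
        return (spam + self.prior_spam) / total
```

Because every feature is scored independently, a link domain that appears in mail from many otherwise clean senders can still accumulate a bad reputation of its own, which is one way to read why links became such a battleground.
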
> Gmail was hit especially hard by this because early on Paul Buchheit (the creator) decided not to include the client IP address in email headers. This was either a win for user privacy or a blatant violation of the RFCs, depending on who you asked. It also turned Gmail into the world's biggest anonymous remailer...
> All major webmail and social services force users to perform phone verification if they trip an abuse filter. This sends a random code via SMS or voice call to a phone number and verifies the user can receive it. It works because phone numbers are a resource that has a cost associated with it, yet almost all users have one.
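
The challenge/response flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions: `PhoneVerifier`, the `send_code` callback (standing in for an SMS or voice gateway), the six-digit code length, the ten-minute expiry, and the three-attempt limit are all hypothetical choices, not details from the quoted post.

```python
import hmac
import secrets
import time


class PhoneVerifier:
    """Issues one-time codes and checks them (hypothetical design)."""

    def __init__(self, send_code, ttl_seconds=600, max_attempts=3,
                 clock=time.monotonic):
        self.send_code = send_code      # assumed callback: (phone, code) -> None
        self.ttl = ttl_seconds
        self.max_attempts = max_attempts
        self.clock = clock
        self.pending = {}               # account_id -> [code, expiry, attempts_left]

    def start(self, account_id, phone_number):
        # secrets gives an unpredictable code; random.randint would not.
        code = f"{secrets.randbelow(10**6):06d}"
        self.pending[account_id] = [code, self.clock() + self.ttl,
                                    self.max_attempts]
        self.send_code(phone_number, code)

    def check(self, account_id, submitted):
        entry = self.pending.get(account_id)
        if entry is None:
            return False
        code, expiry, attempts_left = entry
        if self.clock() > expiry or attempts_left <= 0:
            del self.pending[account_id]
            return False
        entry[2] -= 1                   # count this attempt
        # Constant-time comparison avoids leaking digits via timing.
        if hmac.compare_digest(code.encode(), submitted.encode()):
            del self.pending[account_id]
            return True
        return False
```

The sketch only proves that the user can receive a code. The cost argument in the quote additionally relies on limiting how many accounts a single number may verify, which is not modelled here.
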
> When you have central control everything becomes a million times easier because you can change anything at any time. You can terminate accounts and control signups. If you don't have central control, you have to rely exclusively on inbound filtering and have to just suck it up when spammers try to find ways around your defences.
> Another approach would be to allow cross-signing - an entity with good reputation can temporarily countersign mail to give it a reputational boost and trigger cross-propagation of reputations. That entity could employ whatever techniques they liked to verify the sender's legitimacy.
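
One way to picture a temporary countersignature is sketched below. Everything here is an assumption made for illustration: the function names, the `message_id|expires_at` payload, and the rule that a valid endorsement lifts the sender to the endorser's reputation. HMAC is used only as a standard-library stand-in; a real scheme would need public-key signatures so that any receiver can verify the endorsement without sharing a secret with the endorser. Cross-propagation of reputations is not modelled.

```python
import hashlib
import hmac


def countersign(key: bytes, message_id: str, expires_at: int) -> str:
    """Endorser's tag over the message ID and an expiry time (HMAC stand-in)."""
    payload = f"{message_id}|{expires_at}".encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def effective_reputation(sender_rep, endorser_rep, key, message_id,
                         expires_at, tag, now):
    """Apply the endorser's boost only while the countersignature is valid."""
    expected = countersign(key, message_id, expires_at)
    valid = now <= expires_at and hmac.compare_digest(
        expected.encode(), tag.encode())
    # Assumed boost rule: a valid endorsement can raise, never lower, the score.
    return max(sender_rep, endorser_rep) if valid else sender_rep
```

Binding the expiry into the signed payload is what makes the boost temporary: once `now` passes `expires_at`, the message falls back to the sender's own reputation.
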