
> Do you have any data on the percentage of spam email sources (of all email sources) that send 1-10 emails per day?

Do you have any method by which all the different mailservers worldwide can successfully collaborate to determine precisely how many emails a particular mailserver sends per day?

You're proposing that mailservers with a low emails-per-day metric be trusted. How is that metric established? How does everyone agree on it? Do we have to implement a new protocol in addition to SMTP in order for this to be viable?

The most obvious response to this is "well each service provider can just count the stats internally and use that" but it's a non-starter. That means that Google with their, what, 1000 email servers (or more) now has to figure this out. And every other service provider too.

Finally, given all the service providers out there, and the total number of "consumer" internet connected computers, giving all the computers a free 1-10 email pass means that spam is back to the worst of 2000-2010 levels. Here's some math:

There are 115mm households in the US http://quickfacts.census.gov/qfd/states/00000.html

81% internet usage https://en.wikipedia.org/wiki/Internet_in_the_United_States#...

That's 93mm consumer connections

Let's suppose that there are 20 big players in the mail provider category: Google, Yahoo, Microsoft (Hotmail, Office 365, etc.), AOL, Comcast, AT&T, Time Warner, etc. 20 seems like a good number.

Since these systems can't be made to communicate with one another easily about the sent-count of every IP address in the world, they just do it locally. That means that the virus that infected your computer and turned it into a spam-bot gets:

10 free messages * 20 major providers * 93mm computers = 18 billion spams per day
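A rough sketch of why purely local counting multiplies the allowance: each provider below keeps its own per-IP counter and knows nothing about the others, so one infected host gets the full quota at every provider. The `Provider` class, the `accept` method, and the example IP address are hypothetical, made up only for illustration; real mail servers are far more involved.

```python
from collections import Counter

FREE_PER_DAY = 10  # the proposed free allowance per sending host


class Provider:
    """Hypothetical mail provider that only sees its own traffic."""

    def __init__(self, name):
        self.name = name
        self.sent_today = Counter()  # sender IP -> messages accepted today

    def accept(self, sender_ip):
        # Local decision only: no knowledge of what other providers have seen.
        if self.sent_today[sender_ip] >= FREE_PER_DAY:
            return False
        self.sent_today[sender_ip] += 1
        return True


providers = [Provider(f"provider-{i}") for i in range(20)]
bot_ip = "198.51.100.23"  # documentation-range address, stands in for one spam-bot

delivered = 0
for provider in providers:
    for _ in range(15):  # the bot tries 15 messages at each provider
        if provider.accept(bot_ip):
            delivered += 1

print(delivered)
```

Each provider cuts the bot off after 10 messages, but the bot still lands 10 messages at every one of the 20 providers.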

And that's just from the computers in the US. Once you take this global, you're probably talking at least 100 major providers and 300-500 million computers. At 10 free messages each, that's 300-500 billion spams a day, and it heads toward a trillion as the provider count grows.
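For anyone who wants to check the arithmetic, here is the same back-of-the-envelope calculation as a short script. The inputs are the figures quoted above; the function name `spam_per_day` is just an illustrative helper, and the whole thing keeps the worst-case assumption that every connection is a spam-bot.

```python
def spam_per_day(free_messages, providers, infected_hosts):
    """Upper bound: every host spends its full allowance at every provider."""
    return free_messages * providers * infected_hosts


us_households = 115_000_000
us_connections = us_households * 81 // 100  # 81% internet usage

us_estimate = spam_per_day(10, 20, us_connections)

# Global guess from the comment: 100 providers, 300-500 million computers.
global_low = spam_per_day(10, 100, 300_000_000)
global_high = spam_per_day(10, 100, 500_000_000)

print(f"US connections: {us_connections:,}")
print(f"US estimate:    {us_estimate:,} per day")
print(f"Global range:   {global_low:,} to {global_high:,} per day")
```

With 100 providers the global product comes out at 300 to 500 billion per day. The total scales linearly with the provider count, so reaching a trillion takes roughly 200 or more providers at the high end of the computer estimate.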

Seems like a reasonable idea until you look at how it would actually be implemented. Then it doesn't seem so great.



Modern spam filtering is not purely based on white/black-lists. Statistical analysis of the content is performed.

You don't need to auto-spam-box sources that are sending low volume emails to the same set of addresses, especially when the messages don't get categorized as spam by the content analysis.

You don't need the kind of global coordination you are talking about. Further, whitelists don't solve the problem. Under your scenarios, you can just as easily receive low volume spam from millions of fake accounts at the big providers. You're just relying on the provider to solve your spam problem.


Sorry, but these claims just demonstrate that you have not operated large-volume mail servers.

Yes, most of us do need to automatically block such sources because content analysis does not do a good job, and experience demonstrates that the vast majority of such messages are still spam.

> You're just relying on the provider to solve your spam problem.

Yes, we are. And you should be extremely happy they put in the effort they do, or your e-mail would be completely useless.


> Modern spam filtering is not purely based on white/black-lists. Statistical analysis of the content is performed.

Not purely, no, but content analysis is typically performed after whitelist/blacklist checks. If a host connecting to my mail servers is on a blacklist, there is no content analysis because the mail will be rejected before it gets that far.
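As a minimal sketch of that ordering: the blacklist check runs first and rejects outright, and content scoring only happens for hosts that pass it. Everything here is an assumption for illustration: the `handle_connection` and `content_score` names, the word list, the threshold, and the example addresses. The word-fraction score is a toy stand-in for real statistical content analysis.

```python
# Hypothetical blacklist of sending hosts (documentation-range addresses).
BLACKLIST = {"203.0.113.7"}

# Toy stand-in for a trained statistical model.
SPAMMY_WORDS = {"viagra", "lottery", "winner", "free"}


def content_score(body):
    """Fraction of words in the message that appear in the toy spam list."""
    words = body.lower().split()
    if not words:
        return 0.0
    return sum(word in SPAMMY_WORDS for word in words) / len(words)


def handle_connection(client_ip, body, threshold=0.5):
    # Step 1: list check. A blacklisted host is rejected before the
    # message body is ever examined.
    if client_ip in BLACKLIST:
        return "reject"
    # Step 2: content analysis, only for hosts that passed the list check.
    if content_score(body) >= threshold:
        return "spam-folder"
    return "inbox"


print(handle_connection("203.0.113.7", "meeting notes attached"))
print(handle_connection("192.0.2.10", "free lottery winner"))
print(handle_connection("192.0.2.10", "meeting notes attached"))
```

The first message is perfectly innocent in content, but it never reaches the scoring step because the connecting host is already on the list.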


> 10 free messages * 20 major providers * 93mm computers = 18 billion spams per day

I don't get this. Is this assuming all 93 million consumer connections are infected with the same malware?


It's assuming that they're all infected by some malware. Of course not all of them are, but realistically there are also thousands of providers, not 20.



