As a Microsoft employee, I try to avoid writing on areas that blatantly promote Microsoft. However, I think this question is generic enough to involve Microsoft in the discussion: Can IP addresses ever be used for statistical analysis of malicious Web sites?
I’ve been a malware fighter for more than 20 years. I consider myself fairly up-to-date on the subject of malicious mobile code, malware, hackers, and exploitation vectors in general.
So I was surprised to read another of Google’s recent studies, which purports that IIS Web servers are twice as likely to contain malware as Apache Web servers (although Apache and IIS Web servers contained malicious Web sites in equal numbers).
This astounded me for several reasons. First, my personal experience tells me it isn’t so. I run multiple IIS and Apache Web servers on my honeynet, and my Apache Web servers get 89 percent more hacking traffic than my IIS servers. Most of the traffic is PHP/CGI/MySQL based. This is not unexpected, as the Internet contains at least twice as many Apache Web servers, and popularity draws malicious hacking.
Second, in general and contrary to conventional wisdom, the average Apache Web administrator has less security knowledge than the average IIS administrator. I find Apache Web administrators much more likely to download and use dubious code from the Internet (which a previous Google study revealed often contained malware).
While both types of Web administrators, in general, really don’t care about security, IIS is helped by the fact that it has had only three published vulnerabilities over the last four years, as compared to Apache’s 33.
Even if we include application coding errors, ASP and ASP.Net compare favorably against PHP and CGI. PHP proponents are desperately trying to put more security into PHP, but there are a ton of insecure PHP applications out there; just read one of the many vulnerability lists.
Maybe hackers are breaking in using SQL injection or back-end database vulnerabilities? MS-SQL hasn’t had a severe vulnerability since 2003, while Oracle, MySQL, and other databases have had dozens to more than 100.
IIS 6 comes secure by default. Unless the administrator goes out of their way to make it vulnerable or unless the application adds a vulnerability, it’s very secure. When Apache is installed, its defaults are more permissive and less secure.
But the mental kicker for me is my knowledge of Web site infectors. Most Web sites are not maliciously modified by individual hackers. Like client-side attacks, most Web site infections are automated. The most popular Web site attack tool, Web Attacker Toolkit, is responsible for 30 to 80 percent of all infected Web sites, depending on whose statistics you believe. It is a PHP/CGI infector. The MPack Web site infection tool, which is in the press these days for its large-scale infections, again, infects PHP-based Web sites. I’ve yet to come across a Web site attacking tool on the same scale for IIS.
I like Google and the many fine folks who work there. I’ve written positively about their related research recently. But aside from my own personal experiences and knowledge, another recent post from the Washington Post's security blog made me question Google’s summary conclusions.
The article discusses how one Web hosting company, IPOWER, is potentially responsible for more than 250,000 malicious Web sites. I doubt the figure is anywhere near accurate, but it's based on the fact that nine of IPOWER’s Web servers contained 2,650 malicious Web sites (33 percent of the 8,192 virtual Web servers).
What are those servers running? No surprise: Apache and PHP.
Whether you believe the larger number or just the 33 percent figure, the data shows that IPOWER appears to average about 910 virtual Web sites per server (8,192 sites spread across nine servers).
The study in which Google purports that IIS is twice as likely to contain malware includes the following disclaimer: “Note that these figures may have some margin of error as it is not unusual to find hundreds of domains served by a single IP address.” Essentially, any Web server containing multiple virtual Web sites will cause the Google study to have sampling error.
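To illustrate why shared hosting skews a count keyed on IP addresses, here is a minimal sketch. The records are entirely hypothetical (documentation-range addresses and example domains, not data from Google's study or from IPOWER), and the function names `count_by_ip` and `count_by_site` are made up for this illustration. It shows only that the same set of malicious sites can yield different per-platform tallies depending on whether you count addresses or sites.

```python
from collections import defaultdict

# HYPOTHETICAL records, invented for illustration only:
# (ip_address, server_software, malicious_site_name)
records = [
    ("192.0.2.10", "Apache", "a1.example"),
    ("192.0.2.10", "Apache", "a2.example"),
    ("192.0.2.10", "Apache", "a3.example"),
    ("192.0.2.20", "IIS", "b1.example"),
    ("192.0.2.30", "IIS", "c1.example"),
]

def count_by_ip(records):
    """Count each IP address once, no matter how many sites it hosts."""
    seen_ips = set()
    counts = defaultdict(int)
    for ip, software, _site in records:
        if ip not in seen_ips:
            seen_ips.add(ip)
            counts[software] += 1
    return dict(counts)

def count_by_site(records):
    """Count every malicious site, including sites that share an IP."""
    counts = defaultdict(int)
    for _ip, software, _site in records:
        counts[software] += 1
    return dict(counts)

by_ip = count_by_ip(records)
by_site = count_by_site(records)
print("By IP address:", by_ip)
print("By Web site:  ", by_site)
```

In this made-up data set, counting by IP address makes IIS look worse (two addresses versus one), while counting by site makes Apache look worse (three sites versus two). Which direction the real numbers move depends on data I don't have.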
In the IPOWER case alone, the sampling error appears to be roughly 30,000 percent (9 servers serving up 2,650 malicious Web sites, or about 294 per server). I think any reader would generally agree that Apache Internet Web servers are significantly more likely to host multiple Internet Web sites on a single server than IIS.
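The IPOWER arithmetic can be checked with a few lines. This is a rough sketch using only the figures quoted in this column; the variable names are my own, and the "undercount" calculation assumes each server corresponds to a single IP address, which may not hold in practice.

```python
# Figures quoted in this column for the IPOWER case.
servers = 9
virtual_sites = 8192
malicious_sites = 2650

# Average number of virtual Web sites hosted on each server.
sites_per_server = virtual_sites / servers

# Share of the hosted sites that were malicious (roughly a third).
malicious_share = malicious_sites / virtual_sites

# Malicious sites per server. If each server is one IP address (an
# assumption), an IP-based count sees 1 where there are about this many.
malicious_per_server = malicious_sites / servers
undercount_percent = malicious_per_server * 100

print(f"Sites per server:      {sites_per_server:.0f}")
print(f"Malicious share:       {malicious_share:.1%}")
print(f"Malicious per server:  {malicious_per_server:.0f}")
print(f"Undercount (percent):  {undercount_percent:,.0f}")
```

The result lands a little under the round 30,000 percent figure, but the order of magnitude is the point.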
How many Web servers used in the Google study contained multiple malicious Web sites? According to my research, it is not an insignificant number. And when I find them, the servers often contain dozens to hundreds, if not thousands, of malicious Web sites. This is because the Web hosting firm has not patched the Apache software or uses a management add-on that's long been known to be vulnerable. One compromise leads to hundreds.
If this is the case, is it ever possible to accurately report IIS vs. Apache malware statistics based upon IP addresses alone? I contend that the potential sampling error is just too large for this to be successful. How large is it? I contacted Google and members of the research team directly to ask for clarification of their findings. I was just pointed back to the same published report. I then told them that if I could have the IP addresses used in the study, I would do my own analysis, even if it required a nondisclosure agreement. I want to find the truth, because the truth is more important than whether the outcome supports IIS or Apache.
So far, they’ve not responded to my requests for the data.