Hard data on why your users should avoid file-sharing sites

A new study provides solid proof that people all over the world are actively Dumpster diving in file-sharing services

Of course your company's firewall blocks access to RapidShare.com, Easyshare.com, and other well-known file-sharing sites. Your users probably hate you for it. After all, when they need to send a large file to somebody outside the corporate firewall, the file-sharing sites make access fast, easy, and free. And no doubt your users have found plenty of devious ways to work around IT file-sharing restrictions: going to proxy servers or lesser-known file-sharing sites that you haven't blocked, perhaps, or uploading files while not connected to the corporate network.

You've probably told your customers a thousand times that file-sharing sites aren't secure, and the admonitions no doubt have gone in one ear and out the other, assuming your warnings even made it all the way to the second ear.

There's a new study (PDF) that you and your recalcitrant clients should read. Writing for the fourth Usenix Workshop on Large-Scale Exploits and Emergent Threats (LEET '11) last month, researchers at the University of Leuven and the graduate school at Institut Eurécom poked around file-sharing sites, and what they found will raise -- no, curl -- your eyebrows.

Although there are many variations on the theme, file-sharing sites generally have you upload a file, then hand you a URL that other people can use to access the file. You send the URL to your co-workers, family, or 10,000 of your closest friends. They simply pop the URL into a Web browser and download the file.

The study found that many of the most popular sites generate sequential URLs: If your file is located at www.fileshareplace.com/id=123456, the next file is located at www.fileshareplace.com/id=123457. The researchers put together a little crawler that poked at sequential URLs and downloaded whatever it could find. Online data Dumpster diving, one URL at a time.
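To see why sequential URLs are so easy to harvest, here is a minimal Python sketch. It makes no network requests and is not the researchers' crawler. The function names are made up for illustration, and the random-token alternative is an assumption about what a more careful site could do, not something the study prescribes.

```python
import secrets


def neighbor_ids(file_id: int, count: int = 3) -> list[int]:
    """Sequential scheme: the IDs that follow a known one are trivially predictable."""
    return [file_id + offset for offset in range(1, count + 1)]


def random_token(num_bytes: int = 16) -> str:
    """Hypothetical alternative: a URL-safe token drawn from a cryptographic RNG."""
    return secrets.token_urlsafe(num_bytes)


known_id = 123456  # the example ID used in the article
predicted = neighbor_ids(known_id)  # the next three IDs after the known one
token = random_token()              # 16 random bytes, so knowing one token reveals nothing about the next

print(predicted)
print(token)
```

Anyone holding one valid sequential ID can list its neighbors with simple addition, whereas a token built from 16 random bytes gives an outsider nothing to count up from.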

The researchers intentionally created a slow crawler, so they wouldn't get knocked off any of the hosting sites. The result? One month of harvesting netted 310,735 unique files, including 27,700 JPEGs, 13,400 ZIPs, 7,000 PDFs, 4,000 .doc files, 1,200 .xls files, and almost 1,000 .ppt files. They even came up with a handful of SQL files -- presumably databases sitting there on the file-sharing sites, ripe for the taking.

Some of the file-sharing sites are a little more sophisticated. They generate nonsequential URLs, but in many cases the URLs aren't hard to guess. The researchers created three new crawlers: one that generated random six-digit numeric URLs, one that made eight-digit numeric URLs, and one that made six-character alphanumeric URLs. They turned the three crawlers loose for just five days -- all running on one machine, from one IP address -- and came up with roughly 700, 600, and 300 files, respectively. Statistically, the crawlers found about 1.1 hits for every 1,000 attempts.
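The arithmetic behind those numbers can be sketched in a few lines of Python. Treat it as a back-of-the-envelope illustration under stated assumptions: the article doesn't say which characters the alphanumeric URLs use (62 symbols, mixed-case letters plus digits, is assumed here), the 128-bit token is a hypothetical comparison point, and the implied attempt count only holds if the 1.1-per-1,000 rate applies across all three crawlers combined.

```python
HIT_RATE = 1.1 / 1000  # hits per attempt, as reported in the article

# Number of possible identifiers under each scheme.
keyspaces = {
    "6-digit numeric": 10 ** 6,
    "8-digit numeric": 10 ** 8,
    "6-char alphanumeric (assumed 62 symbols)": 62 ** 6,
    "128-bit random token (hypothetical)": 2 ** 128,
}


def attempts_per_hit(hit_rate: float = HIT_RATE) -> float:
    """Average number of guesses needed to land on one real file."""
    return 1 / hit_rate


def implied_attempts(hits: int, hit_rate: float = HIT_RATE) -> float:
    """Guesses implied by a given number of hits at the stated rate."""
    return hits / hit_rate


total_hits = 700 + 600 + 300  # rough per-crawler totals from the article

for name, size in keyspaces.items():
    print(f"{name}: {size:,} possible URLs")
print(f"About {attempts_per_hit():.0f} guesses per hit")
print(f"Roughly {implied_attempts(total_hits):,.0f} guesses implied in total")
```

At that rate a guesser lands on a real file about once every 900 tries, which is well within reach of a single slow crawler. A 128-bit identifier space is so large that the same blind guessing would be hopeless.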

But but but -- I can hear your users sputter -- but nobody really goes data Dumpster diving like that, do they?

Well, yes they do. The researchers put honeypot files on all of the file-sharing servers that use sequential URLs. They used different kinds of files -- HTML and PDF, as well as .exe and .doc files, all of which "phoned home" in various ways, many of them requiring the user to knowingly click to allow the interaction. They didn't post or otherwise identify the URLs involved.

In a one-month trial, 275 files phoned home, from 80 unique IP addresses. Half of the honeypotted IPs were in Russia, 25 percent in Ukraine, but the other 25 percent came from all over, including the United States, the United Kingdom, Europe, and the Middle East. There's no question that people all over the world are actively Dumpster diving in file-sharing services.

Some file-sharing sites use more sophisticated protection measures, such as CAPTCHA codes, download delays, or required passwords. As you no doubt know, many of those measures can be bypassed as well, with varying levels of difficulty.

Adding injury to insult, the researchers found that 13 percent of the file-sharing sites use the same, publicly available software to run the site. And they found at least one directory traversal hole in that software.

If you have users who put company data on file-sharing sites -- "please, really, just this once because I have to get the file out, you know?" -- they're playing with fire.

This article, "Hard data on why your users should avoid file-sharing sites," was originally published at InfoWorld.com.

Copyright © 2011 IDG Communications, Inc.
