A researcher has released 10 million usernames and passwords collected from data breaches over the last decade, a step he worries could be legally murky but one that he says will help security research.
The data comes from major breaches at companies including Adobe Systems and Stratfor. All of it has already been publicly released and can be found through Web searches, said Mark Burnett, a Utah-based security consultant who has written several networking and security books.
Most of the passwords are likely invalid, and he has scrubbed other information such as domain names to make the data unusable for hackers, Burnett said. Still, anyone who finds a username or password on the list that they still use should change it.
The security concern around such a release is "something I didn't take lightly," Burnett said in a phone interview. "I don't want to put users at risk."
Burnett, who has studied password security for 15 years, said the data came from public sources. He's also been collecting leaked data using scripts that scrape forums, IRC, Usenet groups, Pastebin, torrent releases and other sources.
"This data is extremely valuable for academic and research purposes and for furthering authentication security," Burnett wrote in a blog post.
He devotes a large portion of his blog post to discussing points of U.S. law that might apply to such a release, and why he believes it is likely not a violation. Burnett said the release is not "technically illegal," but that doesn't mean law enforcement couldn't use it as a pretext for some other line of questioning.
Burnett compiled the data, cleaning it up and removing duplicate credentials. The result is a .txt file with the credentials, which is already being studied by those with an interest in it, he said.
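The cleanup described above, scrubbing domain names and removing duplicate credentials, can be illustrated with a short sketch. This is not Burnett's actual script: the function name `clean_credentials` and the tab-separated `username<TAB>password` line format are assumptions made purely for illustration.

```python
def clean_credentials(lines):
    """Return unique (username, password) pairs with email domains removed.

    Assumes each input line looks like "username<TAB>password"; that layout
    is hypothetical and not confirmed by the source.
    """
    seen = set()
    cleaned = []
    for line in lines:
        line = line.rstrip("\r\n")
        if "\t" not in line:
            continue  # skip blank or malformed lines
        username, password = line.split("\t", 1)
        # Drop everything from "@" onward so the account's domain is not exposed.
        username = username.split("@", 1)[0]
        pair = (username, password)
        if pair not in seen:  # keep only the first occurrence of each pair
            seen.add(pair)
            cleaned.append(pair)
    return cleaned
```

Two entries that differ only by email domain collapse into one pair once the domain is stripped, which is one way such a list could shrink during cleanup.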
The 100 most common passwords have barely changed over the years, with the same weak ones appearing again and again.
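A ranking of this kind comes from a simple frequency count. The sketch below shows one way a researcher might compute it; the function name `top_passwords` and the input shape (a list of `(username, password)` tuples) are assumptions for illustration, not a description of how Burnett's file is analyzed.

```python
from collections import Counter


def top_passwords(pairs, n=100):
    """Return the n most frequent passwords as (password, count) tuples.

    `pairs` is assumed to be an iterable of (username, password) tuples.
    """
    counts = Counter(password for _, password in pairs)
    return counts.most_common(n)
```

Running the same count on breach data from different years and comparing the resulting lists is how one could check whether the most popular choices have shifted.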
But that's not the only problem. With an overwhelming number of websites requiring registration, many people continue to reuse the same credentials over and over again, putting themselves at risk if any one of those websites suffers a data breach. Hackers often try stolen credentials on other sites to see if they work.