Compliance has turned us into pack rats. Even outside the heavily regulated health and financial industries, many of us now reflexively play it safe and save anything that seems important. But surprisingly, this "structured" information -- documents, e-mail, transaction records -- accounts for only about half of all enterprise data.
The other half is so-called dark matter. Generated by servers, routers, desktops, switches, and other systems, dark matter generally takes the form of log files that record errors, system access attempts, and countless other events. Dark matter in IT, like that mysterious stuff floating in deep space, is both widely distributed and hidden despite its enormous mass.
[ Dark matter makes up almost half of the enterprise data explosion, the most pressing problem in IT. ]
Typically, IT pays attention to dark matter only after something goes wrong. When there's a security breach, you go straight to the log files to see when and how the breach began and which systems may have been compromised. When a server goes down, log files usually reveal the cause of the failure. Otherwise, dark matter stays in the dark.
But what if you monitored those log files en masse as a matter of course? Could you drill into dark matter and detect security breaches in progress or sound the alarm based on a pattern of errors before a server falls over?
The answer to that question points to some of the most interesting enterprise technology around -- including SEM (security event management), cloud-based distributed computing, and advanced search technology expressly designed for dark matter.
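To make the idea of sounding the alarm on a pattern of errors concrete, here is a minimal Python sketch. It is not how any SEM product works; the log format (`timestamp LEVEL message`), the 60-second window, the threshold of three, and the names `parse_line` and `error_bursts` are all assumptions made up for illustration.

```python
from collections import deque
from datetime import datetime, timedelta


def parse_line(line):
    # Assumed (hypothetical) format: "2009-06-01T12:00:00 LEVEL free-text message"
    timestamp, level, _message = line.split(" ", 2)
    return datetime.fromisoformat(timestamp), level


def error_bursts(lines, window=timedelta(seconds=60), threshold=3):
    """Yield the timestamp of each ERROR that brings the number of
    errors seen within `window` up to at least `threshold`."""
    recent = deque()  # timestamps of errors still inside the window
    for line in lines:
        when, level = parse_line(line)
        if level != "ERROR":
            continue
        recent.append(when)
        # Drop errors that have aged out of the sliding window.
        while recent and when - recent[0] > window:
            recent.popleft()
        if len(recent) >= threshold:
            yield when


sample = [
    "2009-06-01T12:00:00 INFO service started",
    "2009-06-01T12:00:10 ERROR disk read failed",
    "2009-06-01T12:00:20 ERROR disk read failed",
    "2009-06-01T12:00:25 INFO heartbeat",
    "2009-06-01T12:00:30 ERROR disk read failed",
    "2009-06-01T12:05:00 ERROR disk read failed",
]

alerts = list(error_bursts(sample))
for when in alerts:
    print("possible failure in progress at", when.isoformat())
```

In the sample, the three errors inside one minute trigger a single alert, while the lone error five minutes later does not. A real deployment would tune the window and threshold per system and would read from live log streams rather than a list.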
To take a timely example, ArcSight -- one of the leading SEM vendors -- just announced FraudView, which mines security log data for statistically significant patterns of nefarious activity. According to Reed Henry, senior vice president of marketing for ArcSight, FraudView is already being used to detect wire fraud in wholesale banks and "pump and dump" stock schemes in retail brokerages.
On the raw technology side, there's Apache Hadoop, a Java programming framework designed for data-intensive parallel processing. Hadoop turns out to be perfect for pulling together log files distributed across an organization for analysis. Amazon now provides turnkey Hadoop services, so customers can shovel huge quantities of log data onto Amazon servers and crunch on it mercilessly, teasing out patterns that may yield profound insights on, say, application or datacenter architecture.
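The style of processing Hadoop is built around can be imitated in a few lines of plain Python. The sketch below does not use Hadoop or any of its APIs; it only mimics the map, shuffle, and reduce steps in a single process to count errors per host. The `host LEVEL message` log format and every function name here are hypothetical.

```python
from collections import defaultdict


def map_line(line):
    """Map step: emit (host, 1) for every error line.
    Assumed (hypothetical) format: "host LEVEL free-text message"."""
    host, level, _message = line.split(" ", 2)
    if level == "ERROR":
        yield host, 1


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as a framework would
    do between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: collapse all values for one key into a total."""
    return key, sum(values)


def errors_per_host(lines):
    pairs = (pair for line in lines for pair in map_line(line))
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


logs = [
    "web01 ERROR timeout contacting db01",
    "web02 INFO request served",
    "web01 ERROR timeout contacting db01",
    "db01 ERROR too many connections",
]

counts = errors_per_host(logs)
print(counts)
```

Because the map step looks at one line at a time and the reduce step looks at one key at a time, the same logic can in principle be spread across many machines, which is what makes this model a natural fit for log files scattered around an organization.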

Hello Eric - Steve posted some comments on your post here http://www.prismmicrosys.com/Logtalk/?p=363. In short, while mining dark matter is useful because it makes log data readily available, merely indexing that data has its limits: it assumes the data exists and is accurate. One of the first things hackers do when they penetrate a system is alter or delete log data. Unless the data is preserved and secured, logs lose much of their usefulness.
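One common way to make altered or deleted log entries detectable is a hash chain, in which each entry's digest depends on every entry before it. The Python sketch below illustrates that general technique only; it is not taken from the linked post or from any product, and the names `chain_digests` and `first_tampered_index` are hypothetical. It also assumes the digests are stored somewhere the intruder cannot reach, since anyone who can rewrite both the log and the digests can rebuild the chain.

```python
import hashlib

GENESIS = "0" * 64  # fixed starting value for the chain


def chain_digests(entries):
    """Return one SHA-256 digest per entry, each covering the entry
    and the digest of everything that came before it."""
    digests = []
    previous = GENESIS
    for entry in entries:
        previous = hashlib.sha256((previous + entry).encode("utf-8")).hexdigest()
        digests.append(previous)
    return digests


def first_tampered_index(entries, digests):
    """Return the index of the first entry that no longer matches the
    stored digests, or None if the log is intact."""
    recomputed = chain_digests(entries)
    for index, (expected, actual) in enumerate(zip(digests, recomputed)):
        if expected != actual:
            return index
    if len(entries) != len(digests):
        # Entries were appended or truncated after the digests were stored.
        return min(len(entries), len(digests))
    return None


original = [
    "login ok user=alice",
    "login failed user=root",
    "sudo granted user=alice",
]
stored = chain_digests(original)  # assumed to be kept off the logged host

altered = list(original)
altered[1] = "login ok user=root"  # simulate an intruder editing one line

print(first_tampered_index(original, stored))
print(first_tampered_index(altered, stored))
print(first_tampered_index(original[:2], stored))
```

The first check reports no tampering, the second points at the edited entry, and the third flags the position where the truncated log stops matching the stored chain.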