Alexandre Rafalovitch recently wrote to me about our recent survey of system administrators and why production IT systems are so difficult to troubleshoot. Alexandre writes:
By now we pretty much established that until the developers themselves try to support/troubleshoot their own products in production (or get loud enough feedback), they will not understand how to make their products easier to manage post-deployment.
The surveys of the "how do you deal with it now" kind should always include questions on why commercial solutions are not suitable (usually due to installation/license difficulties) and also what the companies creating the products could do to make things easier in the long run.
I think this is sort of the whole point of the survey results. Attendees at Camp Sys Admin overwhelmingly stated that they already have so much data it's killing them. It's interesting to think about the amount of IT data being generated every day in a typical enterprise shop. Forget about the network, firewall and security data. I'm talking about just your basic web servers, application servers and databases. Hundreds of gigabytes to several terabytes of IT data a day is not atypical for a good-sized data center.
The notion that IT people need even more data generated by developers kind of misses the point. Troubleshooting production applications is a whole lot different than debugging code in development or staging environments. Production systems involve many technologies and systems that just don't appear in pre-production environments.
When I was at Yahoo, every day we were firefighting production problems that never manifested themselves in development or staging systems. At Splunk, my current company, we routinely see problems with our software that only occur with multi-terabyte data sets in very large production systems. Perhaps in another post I'll discuss how we build QA environments to handle this.
Alexandre comments further that commercial solutions are not suitable for solving production troubleshooting problems. True, today's solutions most often require extensive amounts of code instrumentation. IT people generally don't want to and/or can't instrument code in production environments. For starters, generally we don't even own most of the code running our applications and services.
So we're left to deal with all the ever-changing evidence our machines generate. And boy, it sure is piling up quickly.
How much IT data do you have in your data center? Write me with your estimate at thebaum@splunk.com.
