Splunk makes log-file searches a slam dunk
Free-format search tool handles disparate event types, eases admin headaches
InfoWorld first looked at Splunk in a preview of the Splunk Server, where it was dubbed “like Google for log files.” That simile is especially apt: Google freed us from the need to learn the intricacies of database regular expression searching, and Splunk does the same for log files.
Splunk is a free-format search tool that helps you correlate time- and date-based events across a huge number of logs -- Apache, FTP, security, MTA, DBMS, and so on. Splunk pulls in data from log files, then indexes and organizes it, determines similarities and differences between events, and allows you to search across all events by time, date, and keywords. Splunk Professional beefs up Splunk Server, handling greater log volume and more servers, and includes a rich scripting language as well as features such as automatic data collection.
I examined Splunk during my time working on the InteropNet HotStage NOC (network operations center), so testing was performed on a veritable enterprise toy box with gear from APC, Avaya, Computer Associates, Cyclades, Extreme Networks, Fluke Networks, Gigamon, Juniper Networks, Network General, Network Physics, and 3M. (I also did a fresh install after using Splunk at HotStage.)
With this much variety, I was able to take advantage of the wealth of log information -- and the fact that InteropNet’s address space is well known to hackers. The constant configuration changes during InteropNet HotStage allowed Splunk to provide a contextual look into the syslog, at one point helping us find a piece of equipment configured with our show’s root password instead of the correct SNMP read string. It was a simple search that saved our bacon -- but also one that could be easily missed.
Installation is a 10- to 20-minute affair because Splunk links all necessary libraries into its binary distribution, eliminating the frustration of chasing down the missing dependencies that plague many other Linux apps.
Splunk organizes log data from disparate sources, so you can perform queries across the entire database or by data source type. As with any Web search engine, you have to ask the right questions to get the answer you need.
For example, I asked Splunk to display Avaya S8300 SIP PBX call detail records and ExtremeWare switch events -- both syslog and SNMP traps tailed into the Splunk database -- that occurred during the time range of a trouble ticket coming in via the CA help desk. With the narrowed-down data sources correlated by time, I could find out whether problems were related to VoIP or infrastructure simply by looking at the offending time slice. You can also attach user-defined tags to records, which in turn can be used to add fields that complement Splunk’s built-in ability to turn static log terms into search nouns.
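To make the time-slice idea concrete, here is a minimal Python sketch of the kind of correlation described above: filter events from a couple of sources down to the window around a trouble ticket and read them in order. This is not Splunk's query language or its internals; the event tuples, the source names (`avaya_cdr`, `extreme_syslog`, `apache`), the timestamps, and the `events_in_window` helper are all hypothetical stand-ins for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical events as (timestamp, source, message) tuples.
# These are made-up illustrations, not real Splunk output.
EVENTS = [
    (datetime(2006, 4, 1, 10, 2, 5), "avaya_cdr", "call dropped ext 4012"),
    (datetime(2006, 4, 1, 10, 2, 7), "extreme_syslog", "port 1:12 link down"),
    (datetime(2006, 4, 1, 11, 30, 0), "extreme_syslog", "port 1:12 link up"),
    (datetime(2006, 4, 1, 10, 3, 0), "apache", "GET /index.html 200"),
]


def events_in_window(events, start, end, sources):
    """Return events from the chosen sources with start <= timestamp <= end,
    sorted oldest first."""
    hits = [e for e in events if e[1] in sources and start <= e[0] <= end]
    return sorted(hits, key=lambda e: e[0])


# Assume the ticket was opened at 10:05 and look at the five minutes before it.
ticket_opened = datetime(2006, 4, 1, 10, 5, 0)
window_start = ticket_opened - timedelta(minutes=5)

matches = events_in_window(
    EVENTS, window_start, ticket_opened, {"avaya_cdr", "extreme_syslog"}
)
for ts, source, message in matches:
    print(ts.isoformat(), source, message)
```

Seeing a PBX event and a switch event land two seconds apart in the same slice is the sort of hint that points toward infrastructure rather than the VoIP gear itself.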
Regardless of whether the message is terminated Unix- or Windows-style, or whether your gear puts out single- or multiple-line records, Splunk will characterize the data on the fly and quickly tune itself to index even the weirdest log record. For applications that are not syslog-enabled, Splunk includes a simple Python script to push any file-based log into the Splunk system.
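For a rough feel of what handling mixed line endings and multiline records involves, the following Python sketch splits raw log text into records. It is only an illustration under a stated assumption, namely that every new record begins with an ISO-style date and continuation lines do not; Splunk's actual characterization logic is not documented here, and the `split_records` helper and sample text are hypothetical.

```python
import re

# Assumption: a new record starts with an ISO-style date (YYYY-MM-DD);
# any line that does not is treated as a continuation of the previous record.
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2} ")


def split_records(raw):
    """Split raw log text into records, tolerating Unix and Windows line
    endings and folding continuation lines into the preceding record."""
    records = []
    # str.splitlines() breaks on LF, CRLF, and bare CR alike.
    for line in raw.splitlines():
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += "\n" + line
    return records


# Made-up sample mixing Windows (CRLF) and Unix (LF) terminators.
sample = (
    "2006-04-01 10:02:05 ERROR failed to bind\r\n"
    "  Traceback: socket in use\r\n"
    "2006-04-01 10:02:09 INFO retrying\n"
)

records = split_records(sample)
for record in records:
    print(repr(record))
```

A real indexer has to infer the record boundary rather than be told it, which is the part the review credits Splunk with doing automatically.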