Splunk makes log-file searches a slam dunk

Free-format search tool handles disparate event types, eases admin headaches

InfoWorld first looked at Splunk in a preview of the Splunk Server, where it was dubbed “like Google for log files.” That simile is especially apt: Google freed us from the need to learn the intricacies of database regular expression searching, and Splunk does the same for log files.

Splunk is a free-format search tool that helps you correlate time- and date-based events across a huge number of logs -- Apache, FTP, security, MTA, DBMS, and so on. Splunk pulls in data from log files, then indexes and organizes it, determines similarities and differences between events, and allows you to search across all events by time, date, and keywords. Splunk Professional beefs up Splunk Server, handling greater log volume and more servers, and includes a rich scripting language as well as features such as automatic data collection.

I examined Splunk during my time working on the InteropNet HotStage NOC (network operations center), so testing was performed on a veritable enterprise toy box with gear from APC, Avaya, Computer Associates, Cyclades, Extreme Networks, Fluke Networks, Gigamon, Juniper Networks, Network General, Network Physics, and 3M. (I also did a fresh install after using Splunk at HotStage.)

With this much variety, I was able to take advantage of the wealth of log information -- and the fact that InteropNet’s address space is well known to hackers. The constant configuration changes during InteropNet HotStage allowed Splunk to provide a contextual look into the syslog, at one point helping us find a piece of equipment configured with our show’s root password instead of the correct SNMP read string. It was a simple search that saved our bacon -- but also a problem that could easily have been missed.

Installation is a 10- to 20-minute affair, because Splunk links all necessary libraries into its binary distribution, thereby eliminating the frustration of chasing down missing dependencies that plagues many other Linux apps.

Splunk organizes log data from disparate sources, so you can perform queries across the entire database or by data source type. As with any Web search engine, you have to ask the right questions to get the answer you need.

For example, I asked Splunk to display Avaya S8300 SIP PBX call detail records and ExtremeWare switch events -- both syslog and SNMP traps tailed into the Splunk database -- that occurred during the time range of a trouble ticket coming in via the CA help desk. With the narrowed-down data sources correlated by time, I could find out whether problems were related to VoIP or infrastructure simply by looking at the offending time slice. You can also attach user-defined tags to records, which in turn can be used to add fields that complement Splunk’s internal capability of turning static log terms into search nouns.
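The time-slice correlation described above can be illustrated outside Splunk with a minimal Python sketch. It is not Splunk's query mechanism; the event records, source names, and timestamps are hypothetical sample data, and the two-minute window is an arbitrary assumption.

```python
from datetime import datetime, timedelta

def parse_ts(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS' timestamp (format assumed for this sketch)."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")

def events_in_window(events, start, end):
    """Return events whose timestamp falls within [start, end], oldest first."""
    hits = [e for e in events if start <= e["time"] <= end]
    return sorted(hits, key=lambda e: e["time"])

# Hypothetical events from two different sources, already merged into one list.
events = [
    {"time": parse_ts("2006-04-20 10:02:11"), "source": "pbx_cdr",
     "text": "call dropped ext 4012"},
    {"time": parse_ts("2006-04-20 10:02:09"), "source": "switch_syslog",
     "text": "port 1:14 link down"},
    {"time": parse_ts("2006-04-20 11:30:00"), "source": "switch_syslog",
     "text": "port 2:3 link up"},
]

# Hypothetical trouble-ticket time; look two minutes either side of it.
ticket_time = parse_ts("2006-04-20 10:02:30")
window = timedelta(minutes=2)
hits = events_in_window(events, ticket_time - window, ticket_time + window)

for e in hits:
    print(e["time"], e["source"], e["text"])
```

Seeing the switch event land two seconds before the dropped call is the kind of side-by-side view that points toward infrastructure rather than the PBX.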

Regardless of whether the message is terminated Unix- or Windows-style, or whether your gear puts out single- or multiple-line records, Splunk will characterize the data on the fly and quickly tune itself to index even the weirdest log record. For applications that cannot log via syslog, Splunk includes a simple Python script to push any file-based log into the Splunk system.
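I did not reproduce Splunk's bundled script here, but the general idea of pushing a file-based log toward a syslog listener can be sketched in a few lines of Python. Everything below is an assumption for illustration: the `applog` tag, the local0/info priority, and the bare `<PRI>tag: message` layout, which omits the timestamp and hostname fields a full BSD-syslog message would carry.

```python
import socket

FACILITY_LOCAL0 = 16  # standard syslog facility code for local0
SEVERITY_INFO = 6     # standard syslog severity code for informational

def to_syslog_datagram(line, tag="applog",
                       facility=FACILITY_LOCAL0, severity=SEVERITY_INFO):
    """Wrap one log line in a minimal syslog-style payload.

    rstrip() drops both Unix (\\n) and Windows (\\r\\n) line endings.
    """
    pri = facility * 8 + severity
    return f"<{pri}>{tag}: {line.rstrip()}".encode("utf-8")

def forward_file(path, host, port=514):
    """Send every non-blank line of a log file to a UDP syslog listener.

    host and port are placeholders for whatever collector you run.
    Returns the number of lines sent.
    """
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock, \
            open(path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            if line.strip():
                sock.sendto(to_syslog_datagram(line), (host, port))
                sent += 1
    return sent
```

Calling `forward_file("/path/to/app.log", "loghost.example.com")` would replay the file once; a production version would also need to follow the file as it grows.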

One of my frustrations with narrowly focused log analyzers is the sometimes massive amount of scripting needed to answer a simple question, such as whether a VoIP call event intersects with a switch threshold event. With Splunk, clicking on the Set Timerange control on any of the Splunk interface screens tunes it to the suspect period. From there, the search bar further refines the search.

You can also use Splunk to correlate system and network events with those directly involved in the development cycle, saving time for programmers who need to search multiple environments. And though the Splunk Base wiki -- intended to serve as a community knowledge base -- wasn’t available during my review, it will certainly be useful for researching events and solutions.

The product does have some holes. Splunk can index SNMP information, but it doesn’t directly tie into management consoles such as CA Unicenter or HP OpenView; instead, it can run a command-line script as a work-around. For example, you could set up a Live Splunk -- a search set to run at specific intervals -- to look for high-priority alerts from Snort and shoot someone an e-mail. In this case, I would prefer the bells and whistles to go off in the NOC -- scripting makes the response very flexible, but I’d like to eventually have the ability to send traps to a central console.
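As a rough idea of what such a work-around script might do, here is a minimal Python sketch that picks high-priority Snort alerts out of a batch of lines and builds an e-mail about them. It is not Splunk's interface: it assumes the alert lines carry a `[Priority: N]` field, as in Snort's fast alert output, and the addresses, sample alerts, and SMTP host are placeholders.

```python
import re
import smtplib
from email.message import EmailMessage

# Assumes each alert line contains a "[Priority: N]" field.
PRIORITY_RE = re.compile(r"\[Priority:\s*(\d+)\]")

def high_priority(lines, max_priority=1):
    """Keep lines whose priority number is at or below max_priority (1 = most urgent)."""
    hits = []
    for line in lines:
        match = PRIORITY_RE.search(line)
        if match and int(match.group(1)) <= max_priority:
            hits.append(line.strip())
    return hits

def build_alert_mail(hits, sender, recipient):
    """Build (but do not send) a plain-text message listing the alerts."""
    msg = EmailMessage()
    msg["Subject"] = f"{len(hits)} high-priority Snort alert(s)"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(hits))
    return msg

def send_mail(msg, smtp_host="localhost"):
    """Hand the message to an SMTP relay; the host name is an assumption."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)

# Hypothetical sample alerts.
sample = [
    "[**] [1:1000001:1] Example exploit attempt [**] [Priority: 1] {TCP} 192.0.2.10:4431 -> 192.0.2.20:80",
    "[**] [1:1000002:1] Example ping sweep [**] [Priority: 3] {ICMP} 192.0.2.11 -> 192.0.2.20",
]
hits = high_priority(sample)
msg = build_alert_mail(hits, "noc@example.com", "oncall@example.com")
print(msg["Subject"])
```

The same filter could just as easily feed a script that raises an alarm in the NOC, which is where the flexibility of scripting pays off.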

Another gripe involves the amount of manual editing required during the initial setup: it’s simple to edit the sample config.xml files, but it’s also pretty easy to make a mistake. Thankfully, a new configuration GUI will be available in the next major release (support for FreeBSD, Mac OS X, and Solaris is also on the road map), but for now, I suggest making a backup copy of the config file before you start editing.
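The backup step is easy to automate. The following Python sketch copies a config file to a timestamped backup before you edit it and afterward checks that the edited file is still well-formed XML. The backup naming scheme is my own assumption, and the check catches only XML syntax errors such as an unclosed tag, not settings Splunk itself would reject.

```python
import shutil
import time
import xml.etree.ElementTree as ET
from pathlib import Path

def backup_config(path):
    """Copy the file to '<name>.<timestamp>.bak' alongside it and return the new path."""
    src = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = src.with_name(f"{src.name}.{stamp}.bak")
    shutil.copy2(src, dest)  # copy2 also preserves modification times
    return dest

def is_well_formed(path):
    """Return True if the file parses as XML, False on a syntax error."""
    try:
        ET.parse(path)
        return True
    except ET.ParseError:
        return False
```

A typical session would call `backup_config("config.xml")`, make the edits, then run `is_well_formed("config.xml")` before restarting anything.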

Nevertheless, Splunk seems well prepared to succeed in a market that’s often the realm of homegrown search tools. If you’re interested in finding out what’s really happening in context across all your systems, take a good look at Splunk Professional and save yourself some eyestrain.

InfoWorld Scorecard: Splunk Professional, v1.0
Manageability (10.0%): 8.0
Scalability (30.0%): 8.0
Ease of use (20.0%): 7.0
Security (10.0%): 8.0
Value (10.0%): 8.0
Interoperability (20.0%): 9.0
Overall Score (100%): 8.0

Copyright © 2006 IDG Communications, Inc.
