August 29, 2003

Look deep into log files

Omnisight offers powerful, flexible analysis if you have the skills

As every server administrator knows, log files are the pulse of a network infrastructure. They tell us what has occurred in an application or service, and if they stop growing, something is wrong. Log files can tell us who is using our services, how many users are using a particular resource, how often, and for how long. Logs can also be extremely valuable as forensic evidence in computer crime investigation and litigation. The trick is to be able to use log files to analyze trends in resource utilization, identify and remove security threats, and provide a useful audit trail of user action, without being buried by the sheer volume or resorting to the onerous task of manual inspection.

A large infrastructure can generate many gigabytes of logs from several services and applications every day. These logs are usually archived for a period of time, analyzed, and then discarded on a predetermined schedule. The data contained in these logs may be extremely important or completely irrelevant. Either way, the logs need to be perused to determine which is which, and what is worth flagging for further investigation. For instance, the only way to really know how a Web site is performing is to generate reports based on the Web server log-file data, and use those reports to determine if there are problems with the servers or with the site itself.

Addamark Technologies addresses these issues with Omnisight 2.5. Currently implemented at Lehman Brothers, Yahoo, and Agilent Technologies, among others, Omnisight allows systems managers to extract meaningful data from truly massive log files generated by services and applications by providing a means to import, store, and perform deep analysis on that data.

Heavy Parsing

Addamark’s Omnisight is best described as a programming framework for log-file analysis. Relying on open source packages such as Apache, with a heart written in C and a nervous system written in Perl, Omnisight is not a tool for the faint of heart or light of wallet. Omnisight runs on Red Hat Linux 7.3 or Red Hat Enterprise Linux AS 2.1, with support for Red Hat 8 nearing completion. Omnisight is designed to be implemented in a distributed environment and installed on a cluster. Exchanging SSH (Secure Shell) public keys for the root user between the cluster servers permits seamless installation of the cluster, but could be viewed as a minor security risk. In my testing, however, the cluster installation was simple and was controlled from a single installer script. When the installation was complete, three Red Hat Linux servers were ready to handle log files.

Omnisight isn’t designed to handle live log files, but to import large, static log files into a database. Importing a log file first requires creating a parsing file that describes the data to be indexed. For instance, to parse an Apache Web server log, you’ll need to create a file containing the specific log format, parsing rules, variable declarations, and potentially embedded Perl code to handle special-case log files and the varying reporting formats found in many applications. These files must be written with care and tested thoroughly, because any mismatch between the parsing file and the actual log format will result in parsing errors and lost data. Once the parsing file is complete, it is referenced by the indexing engine, which then imports the log file.
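
To give a sense of what such a parsing description has to capture, here is a minimal sketch in Python. It is not Omnisight's parsing-file syntax, which is not reproduced in this review; it assumes the standard Apache Common Log Format, and the names CLF_PATTERN, parse_line, and parse_log are hypothetical, chosen only for illustration. The point it shows is the one made above: a line that deviates from the declared format fails to parse, so a careful importer should set such lines aside for inspection rather than lose them.

```python
import re
from datetime import datetime

# Assumed input: Apache Common Log Format, i.e.
#   host ident authuser [timestamp] "request" status bytes
# This regex plays the role of the "log format" declaration (illustrative only).
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Return a dict of typed fields, or None if the line deviates from the format."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert text fields to typed values, as variable declarations would.
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    # Apache writes "-" when no bytes were sent; treat that as zero.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


def parse_log(lines):
    """Split lines into parsed records and rejected (line number, text) pairs."""
    records, rejected = [], []
    for number, line in enumerate(lines, start=1):
        record = parse_line(line.rstrip("\n"))
        if record is None:
            # Keep the offending line instead of silently dropping it.
            rejected.append((number, line))
        else:
            records.append(record)
    return records, rejected


if __name__ == "__main__":
    sample = [
        '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /apache_pb.gif HTTP/1.0" 200 2326',
        'this line does not match the declared format',
    ]
    records, rejected = parse_log(sample)
    print(len(records), "parsed;", len(rejected), "rejected")
```

Collecting the rejected lines alongside the parsed records is a design choice for the sketch: it makes format mismatches visible during testing, which is exactly when a parsing description needs to be checked against real log data.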

Test Center Scorecard
Criterion weights: 30%, 20%, 20%, 20%, 10%
Addamark Omnisight 2.5 scores: 9, 8, 7, 9, 8
Overall score: 8.3 (Very Good)

©1994-2009 Infoworld, Inc.