Normalization. Collected data is often parsed into its individual data fields as it enters the data stream. Parsed (or structured) data is typically easier to index, retrieve, and report on. Unparsed data (also known as raw or unstructured data) can normally be collected, but it isn't as easy to index, retrieve, or report on. Administrators often have to write their own parsers, or else treat each unstructured message as a single data field and rely on keyword searches to retrieve information.
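As a rough illustration of the difference between parsed and unparsed data, the sketch below splits a syslog-style line into named fields and falls back to a single raw field when the line doesn't match. The line layout, the field names, and the helpers parse_line and keyword_search are assumptions made for this example, not the behavior of any product in the review.

```python
import re

# Hypothetical syslog-style layout: "<timestamp> <host> <program>[<pid>]: <message>"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<program>[^\[\s:]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of named fields, or a single 'raw' field if parsing fails."""
    match = LINE_PATTERN.match(line)
    if match is None:
        # Unparsed data is kept whole, so only keyword search can reach it.
        return {"raw": line}
    return match.groupdict()


def keyword_search(records, keyword):
    """Naive case-insensitive keyword search across every field of every record."""
    keyword = keyword.lower()
    return [
        record for record in records
        if any(keyword in str(value).lower()
               for value in record.values() if value is not None)
    ]


records = [
    parse_line("Mar  3 14:07:55 web01 sshd[2211]: Failed password for root from 10.0.0.5"),
    parse_line("custom-app says: disk almost full on volume D"),
]
```

The first record can be queried by field (for instance, every event where host is web01), while the second can only be found by scanning its text for a keyword.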
Normalization is the process of resolving different representations of the same type of data into a common format in a shared database. In a log management database, this may involve converting reported event times to a common time format, say, from local time to Coordinated Universal Time (UTC). It may also mean resolving IP addresses to host names, or any other step that makes disparate information more uniform. The more parsed and normalized data you have, the better. When reviewing products, be sure to examine the number of parsers included to make sure they cover the majority of the log information in your environment.
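A minimal sketch of these two normalization steps might look like the following. It assumes events arrive either with epoch-second times or with ISO 8601 strings that carry a UTC offset, and it uses a hypothetical static table (HOST_NAMES) in place of a real DNS or asset-inventory lookup; the event field names are likewise invented for the example.

```python
from datetime import datetime, timezone

# Hypothetical address-to-name table; a real deployment might query DNS instead.
HOST_NAMES = {"10.0.0.5": "web01.example.com"}


def to_utc(value):
    """Convert epoch seconds or an ISO 8601 string with an offset to a UTC string."""
    if isinstance(value, (int, float)):
        moment = datetime.fromtimestamp(value, tz=timezone.utc)
    else:
        moment = datetime.fromisoformat(value)
        if moment.tzinfo is None:
            # Without an offset the local time cannot be converted reliably.
            raise ValueError("timestamp has no UTC offset: " + value)
        moment = moment.astimezone(timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def normalize(event):
    """Return a copy of the event with a UTC time and a host name where one is known."""
    result = dict(event)
    result["time"] = to_utc(event["time"])
    result["host"] = HOST_NAMES.get(event["source_ip"], event["source_ip"])
    return result
```

Under these assumptions, an event reported as 09:07:55 at UTC-5 and one reported as 14:07:55 UTC end up with the same stored time, which is what makes them comparable in reports.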
Indexing. To optimize data retrieval for search queries, filters, and reporting, data needs to be indexed as it is stored. Indexing generally works on parsed data, although some vendors will also index unstructured data for faster retrieval.
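One simple way to picture indexing at storage time is an inverted index that maps each field and value pair to the records containing it. The FieldIndex class below is a hypothetical in-memory sketch of that idea, assuming field values are hashable; it is not how any particular vendor implements indexing.

```python
from collections import defaultdict


class FieldIndex:
    """Toy in-memory store that indexes each record as it is added."""

    def __init__(self):
        self.records = []
        self.index = defaultdict(set)  # (field, value) -> positions in self.records

    def add(self, record):
        """Store the record and index every one of its fields."""
        position = len(self.records)
        self.records.append(record)
        for field, value in record.items():
            self.index[(field, value)].add(position)
        return position

    def search(self, **criteria):
        """Return records matching every field=value criterion, in insertion order."""
        if not criteria:
            return list(self.records)
        matches = [self.index.get((field, value), set())
                   for field, value in criteria.items()]
        hits = set.intersection(*matches)
        return [self.records[position] for position in sorted(hits)]
```

A query such as search(host="web01", event="logon_failed") then intersects two small sets of positions instead of scanning every stored record.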
Storage. Captured data needs to be written to medium- or long-term storage. All products save to local hard drives, and some can store to external storage, such as a SAN or NAS. All the products tested allow event messages to be exported for long-term storage and retrieved later if needed. If you're concerned about legal chain of custody requirements, make sure the solution you're evaluating cryptographically signs all stored messages.
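To give a sense of how stored messages can be made tamper-evident, the sketch below tags each message with an HMAC-SHA256 value that also covers the previous tag, so that edits, deletions, and reordering all break verification. This is a simplified stand-in: it assumes a shared secret key, whereas a product built for chain of custody may use true digital signatures with a private key. The function names seal and verify are invented for the example.

```python
import hashlib
import hmac


def seal(messages, key):
    """Return (message, tag) pairs; each tag covers the message and the previous tag."""
    sealed, previous = [], b""
    for message in messages:
        tag = hmac.new(key, previous + message.encode("utf-8"), hashlib.sha256).digest()
        sealed.append((message, tag.hex()))
        previous = tag
    return sealed


def verify(sealed, key):
    """Recompute the chain and report whether every stored tag still matches."""
    previous = b""
    for message, tag_hex in sealed:
        expected = hmac.new(key, previous + message.encode("utf-8"), hashlib.sha256).digest()
        # compare_digest avoids leaking where the mismatch occurs through timing.
        if not hmac.compare_digest(expected.hex(), tag_hex):
            return False
        previous = expected
    return True
```

Because each tag depends on the one before it, changing or removing a single archived message invalidates every tag that follows.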
Correlation. Correlation is the process of taking different events from the same or different sources and recognizing them as a single event. For example, some log management products can recognize a packet flood or a password guessing attack, rather than simply reporting multiple dropped packets or failed logons. Correlation reflects product intelligence. Products that excel at correlation are known as security information and event management (SIEM) products. A number of products in the review combine log management (log collection, storage, querying, and reporting) with SIEM functionality, but only their log management functionality was evaluated.
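The password guessing example can be sketched as a sliding-window rule: many failed logons from one source within a short period are reported as one alert instead of many separate events. The threshold, the window length, the event tuple layout, and the function name correlate_failed_logons are all assumptions chosen for illustration; real correlation engines are considerably more elaborate.

```python
from collections import defaultdict, deque


def correlate_failed_logons(events, threshold=5, window_seconds=60):
    """Return one alert per source with `threshold` failures inside the window.

    `events` is an iterable of (epoch_seconds, source, event_type) tuples,
    assumed to be sorted by time.
    """
    recent = defaultdict(deque)  # source -> timestamps of recent failures
    alerted = set()
    alerts = []
    for timestamp, source, event_type in events:
        if event_type != "logon_failed":
            continue
        window = recent[source]
        window.append(timestamp)
        # Drop failures that have aged out of the sliding window.
        while window and timestamp - window[0] > window_seconds:
            window.popleft()
        if len(window) >= threshold and source not in alerted:
            alerted.add(source)
            alerts.append({
                "type": "password_guessing",
                "source": source,
                "first_seen": window[0],
                "failures": len(window),
            })
    return alerts
```

With these defaults, five failed logons from one address inside a minute produce a single password_guessing alert, while two failures spread several minutes apart produce none.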
Note that for centralized log management to work well, it's very important that all incoming log information carry accurate time stamps. Make sure that all monitored clients have the correct time and time zone. This helps with reporting, forensic analysis, and legal requirements.