BI's new power users rely on advanced search

Increasing cross-over between enterprise search and BI is getting real-time business analytics into the hands of more employees

In the evolving world of business intelligence, swift and targeted access to reports and analysis is the name of the game. But the frequent inability of employees to locate the results they need from high-end BI applications is prompting several enterprise search vendors to step in and address the challenge.

Because BI relies on data generated by accounting, sales, CRM systems, and other back-end applications, it represents a lot of data. IT departments that have made substantial investments in BI packages from Cognos, Information Builders, Oracle, and SAP, among others, are looking at ways to better expose that data and make it all actionable at a much faster clip. Meanwhile, the number of workers who need daily access to BI data to be more effective at their jobs steadily rises.

"A lot of things are changing in the industry to help expose more BI information," says Frank Brooks, chief data architect at Blue Cross Blue Shield of Tennessee. "We had reached the point where we had so much BI information that it was difficult to go and find just one piece of it. So we had to counter that."

Brooks and his team deployed IBM WebSphere Content Discovery for Business Intelligence, which, in tandem with other integrated applications, allows more workers to access critical BI data required for negotiating rates with various care providers and for processing claims. Rather than, say, waiting for biweekly reports and sifting through them, employees can now access a portal to search an array of applications where BI information is stored.

Brooks is one of many IT managers taking advantage of the increasing cross-over between enterprise search and BI. Following news in April of Google OneBox, which extended the reach of the Google Search Appliance to BI, IBM and Microsoft announced new products and features for customers who want to marry search functionality with BI to get real-time business analytics into the hands of more employees. In May, Fast Search and Transfer joined its Enterprise Search Platform with the Cognos 8 Business Intelligence solution to deliver corporate content directly to workers who are not necessarily sophisticated BI consumers.

According to Vinod Baya, director at PricewaterhouseCoopers' Technology Center in San Jose, corporate users today are having difficulty getting to BI data due to three principal problems: "They aren't aware that a BI report exists for the analysis they need; or if they know it exists, they can't find it; or they can find it, but it doesn't contain all the information they need." Enterprise search, he says, can help with all three pain points.

Getting to the data

On many BI systems, the reports are designed by analysts versed in the software package's report writer. These reports are catalogued as templates and generally run on a recurring basis, such as month-end. Then, the resulting documents are distributed to specific mailing lists of users.

The problem of finding the data in such a situation is two-fold for any user who isn't on the regular mailing list. Firstly, how do you know if the report even exists? And secondly, if the report is known to exist, how do you access it? The latter problem is especially common because reports often sit on file servers where they are assigned cryptic names by the BI software.

Without an enterprise search engine that can locate the report or its underlying data, a user has few ways of getting the information. This situation leads to undesirable results: the employee forgoes the search or expends significant effort culling the data from other sources and re-creating the report. The latter results in duplication of effort and the risk that two reports that purport to present the same data have differing figures. Even when users can find reports, the documents will often lack the desired data. And because the reports are template-driven, users cannot easily modify the reports to provide different data.

Regulatory compliance is also driving the need for BI search. Compliance officers need to be able to search through CRM databases and e-mail stores, for example, looking for dangerous phrases such as "We guarantee" or "I shouldn't be telling you… ."

How BI search is different

One way that search makes it easier to extend access to BI is that users already know how to employ it due to their familiarity with Web-based search engines. With little training, users can be shown how to use additional options, similar to those in the Advanced Search features on the Web engines.

What happens behind the scenes in an enterprise search, however, varies significantly from the operation of Web search engines. Most Web queries today target unstructured data, such as HTML, PowerPoint presentations, and PDF files. Because these resources have a document-orientation, the engine can make intelligent decisions about the meaning and the relevance of the data. (Web pages even have specific tags to facilitate this process.)

Structured data, by contrast, does not generally provide this contextual information. Open a database and read a column of figures called "part" and you have very little knowledge of what that number refers to (part number, cost, inventory, location, among other things). As Baya points out, "this problem will eventually be solved by use of metadata; and this is already happening via the support for XML in databases. But with regard to the vast majority of structured data today, there is no easy solution."

BI software solves this problem in part by the use of templates and the definition of data relationships by trained analysts. Because of this, many enterprise search engines today -- such as Google and X1 -- hand off searches of structured data to the BI software and then federate (that is, combine) the results with items from their own search index.
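The hand-off-and-federate pattern can be sketched roughly as follows. This is a minimal illustration, not any vendor's implementation: the Hit type, the two stand-in backends, and the simple merge by score are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Hit:
    source: str   # which backend produced the hit
    title: str
    score: float  # assumed to be normalized to 0..1 by each backend


def federate(query: str,
             backends: List[Callable[[str], List[Hit]]],
             limit: int = 10) -> List[Hit]:
    """Send the query to every backend and merge the hits by score."""
    merged: List[Hit] = []
    for search in backends:
        merged.extend(search(query))
    # Highest score first; sorted() is stable, so ties keep backend order.
    return sorted(merged, key=lambda h: h.score, reverse=True)[:limit]


# Hypothetical stand-ins for the engine's own index and a BI package.
def search_document_index(query: str) -> List[Hit]:
    return [Hit("index", f"Memo mentioning {query}", 0.62)]


def search_bi_reports(query: str) -> List[Hit]:
    return [Hit("bi", f"Month-end report: {query}", 0.91)]


results = federate("claims by region",
                   [search_document_index, search_bi_reports])
```

In practice the hard part is that scores from different systems are rarely comparable, so a real deployment would need some normalization step that this sketch simply assumes has already happened.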

Unstructured data has its own challenges. The first is pure volume. As Mark Andrews, program director of IBM's Information Management Strategy, points out, a typical business user will deal with 70 e-mails per work day (including receiving and sending). In a company of 25,000 employees, that's nearly half a billion e-mails per year that must be stored (for compliance purposes) and made searchable. Add to this all the other documents (HTML, word processing, spreadsheets, and presentations) and you have a tremendous capacity issue that leads to another challenge: with many searches turning up thousands of results, how do you rank the results for relevance?
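The "nearly half a billion" figure can be checked with quick arithmetic. The article gives the per-user volume and the headcount; the number of work days per year (250 here) is an assumption, since the source does not state one.

```python
EMAILS_PER_USER_PER_DAY = 70   # figure quoted in the article
EMPLOYEES = 25_000             # figure quoted in the article
WORK_DAYS_PER_YEAR = 250       # assumption; the article does not state this

emails_per_day = EMAILS_PER_USER_PER_DAY * EMPLOYEES
emails_per_year = emails_per_day * WORK_DAYS_PER_YEAR

print(f"{emails_per_day:,} per day, {emails_per_year:,} per year")
# prints "1,750,000 per day, 437,500,000 per year"
```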

As Matthew Glotzbach, head of products at Google Enterprise, observes, "Unlike Web searches, you don't typically have spamming sites that are trying to fool your algorithms, but you also don't have a large set of usage data to guide you." Google doesn't reveal its algorithms, but it does try to establish the "authoritativeness" of specific entries.

IBM, which is more forthcoming about its algorithms, uses a blend of weighting factors for relevance in its enterprise search. These include: user click patterns, the format and position of an entry in a document (headings have higher relevance than in-text entries), metadata (so that text in a link will be ranked differently than similar text in the body of a document), and so on.
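A blend of weighting factors like the one described might look something like the sketch below. It is not IBM's algorithm: the three signals are taken from the description above, but the Match type, the weights, and the linear combination are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Match:
    in_heading: bool     # term found in a heading rather than body text
    in_link_text: bool   # term found in the text of a link
    click_rate: float    # share of past searchers who clicked this result, 0..1


# Illustrative weights only; a real engine would tune these
# and would use many more signals.
WEIGHTS = {"heading": 0.4, "link_text": 0.2, "clicks": 0.4}


def relevance(m: Match) -> float:
    """Blend the signals into a single score between 0 and 1."""
    return (
        WEIGHTS["heading"] * float(m.in_heading)
        + WEIGHTS["link_text"] * float(m.in_link_text)
        + WEIGHTS["clicks"] * m.click_rate
    )


heading_hit = Match(in_heading=True, in_link_text=False, click_rate=0.5)
body_hit = Match(in_heading=False, in_link_text=False, click_rate=0.5)
```

With these made-up weights, a heading match outranks a body-text match that has the same click history, which is the behavior the description implies.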

Most products today (see "Enterprise search vendors bet heavily on BI") provide a way of increasing relevance of certain documents or URLs so that they occupy first place in a given search. (For example, a query on "sexual harassment" can be tweaked so that the company's policy is always the first item returned.) In addition, many products enable customization for company-specific lingo. This permits search engines to know, for example, that a query regarding "Region 1" refers to the Eastern seaboard.
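Both customizations are simple to picture in code. The sketch below assumes two hypothetical administrator-maintained tables, PINNED and LINGO, and uses the article's own examples as entries; actual products expose these features through their own administration tools.

```python
from typing import Dict, List

# Hypothetical admin-maintained tables, seeded with the examples above.
PINNED: Dict[str, str] = {"sexual harassment": "HR policy: sexual harassment"}
LINGO: Dict[str, str] = {"region 1": "eastern seaboard"}


def expand_query(query: str) -> str:
    """Append the plain-language meaning of any company term in the query."""
    q = query.lower()
    extras = [meaning for term, meaning in LINGO.items() if term in q]
    return " ".join([q] + extras)


def apply_pins(query: str, ranked: List[str]) -> List[str]:
    """Move the pinned document, if any, to the top of the ranked list."""
    pinned = PINNED.get(query.lower())
    if pinned is None:
        return ranked
    return [pinned] + [doc for doc in ranked if doc != pinned]
```

For example, expand_query("Sales in Region 1") yields a query that also contains "eastern seaboard", and apply_pins("sexual harassment", ...) puts the policy document first regardless of where the ranking placed it.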

Proceed with security in mind

Access control is a central aspect of BI search. The problem runs in two directions: how does an employee access all the data needed for a report, and how is an employee blocked from seeing confidential data? In a perfect world, single sign-on would address the first issue, and access to an LDAP directory server would resolve the second. The problem is in the implementation: much of the data is located on systems whose access control is not neatly defined by a corporate-wide access mechanism.

The problem is actually worse than it appears. Says Maxime Tiran, an engineer in IBM's Data Management division, "When company IT departments set up enterprise-wide searching tools, they are frequently horrified by the kinds of confidential data that is widely accessible and completely unprotected on their intranets."

Security schemes vary, and sites contemplating adding search to their BI need to determine how access control is handled by the products they're considering. Many products simply pass the user credentials to the BI package or other back-end software and rely on those applications to limit the returned results according to their built-in access mechanisms. This aspect is a particular strength of Oracle's Secure Enterprise Search product.
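The pass-through approach can be sketched as below. Everything here is hypothetical: BiBackend stands in for a BI package that enforces its own access rules, and group membership stands in for whatever credentials a real product would forward.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Report:
    title: str
    allowed_groups: FrozenSet[str]  # groups permitted to read the report


class BiBackend:
    """Hypothetical BI package that filters results by its own access rules."""

    def __init__(self, reports: List[Report]) -> None:
        self._reports = reports

    def search(self, query: str, user_groups: FrozenSet[str]) -> List[Report]:
        q = query.lower()
        return [
            r for r in self._reports
            if q in r.title.lower() and r.allowed_groups & user_groups
        ]


def search_as_user(backend: BiBackend, query: str,
                   user_groups: FrozenSet[str]) -> List[Report]:
    # The search layer does no filtering of its own; it only forwards identity.
    return backend.search(query, user_groups)


backend = BiBackend([
    Report("Provider rates 2006", frozenset({"contracts"})),
    Report("Provider rates summary", frozenset({"contracts", "claims"})),
])
visible = search_as_user(backend, "provider rates", frozenset({"claims"}))
```

The trade-off is that the search layer is only as safe as each back end's own rules, which is why unprotected content on an intranet surfaces as soon as it becomes searchable.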

On the horizon

Michael Corcoran, who heads up Corporate Strategy for Information Builders, predicts that the integration of BI and search will only become tighter. Search engines will gain better access to BI data and the BI companies will facilitate this process. For example, Information Builders today can take data from transactions in process and make it available to Google's Enterprise search engine.

Information Builders has a division that provides some 300 connectors to data sources and it is actively using them to broaden the reach of search capabilities and its own BI products. Says Corcoran: "This greater integration will really help users. Today, BI still requires users to know where their data is. For example, they still must specify 'call center data.' However, the needed data could be anywhere, and the user should not need to know its origins to be able to locate it."

The next step, says IBM's Andrews, involves integrating analytics functions with search and being able to query the data in a variety of ways to probe for market opportunities that equate to increased sales and greater efficiencies. For the time being, however, most enterprises will be content just to have better access to the business intelligence they currently generate.

Copyright © 2006 IDG Communications, Inc.
