April 02, 2007

Rethinking business intelligence

BI has a reputation for being a resource sink that delivers reports almost no one reads. It doesn’t have to be that way. And you can no longer afford to let it be.

But if you start with that sort of architectural model, you’re likely to fail, says Scott Sognefest, a partner in Deloitte Consulting’s BI practice. “There’s a growing realization that you can’t put BI technology on top of a big pile of data. It’s expensive and inefficient,” he says. “You wouldn’t build a factory and then decide what products you want to produce after it’s built, but that’s what people do in the BI space.”

So understand the business case first. Then you can begin the messy work IT organizations have struggled with for years: building and refining a common data model and ensuring the data you need from multiple systems is consistent. “Data quality and data integrity are not going away. There’s no easy way to solve them,” says Betsy Burton, a Gartner vice president.

Forrester’s Evelson agrees. Before launching a BI initiative, he says, “I would have a data governance effort — and drop everything else.”

BI vendors have tried to address data quality and integration issues with MDM (master data management) solutions, but efforts to govern, cleanse, and reconcile data go beyond BI to affect every corner of the organization. In many instances, BI stakeholders have lacked the clout to drive enterprisewide MDM, yielding frustration when business execs want to scale BI beyond the original requirements that drove adoption.

Until a company cleans up its data act globally — a long-running project if there ever was one — the best strategy is to reduce the data sources to those that serve well-defined business objectives. “You’ve got no business putting in BI unless you’ve whittled down those core systems,” Martens says. That can eliminate conflicting sources and yield manageable data integration and cleansing. Keeping data close to home also keeps it closer to its context and metadata, something that can get lost when data is transformed for storage in a data warehouse. “ETL [extract, transform, load] will cost you hugely,” Martens adds, referring to the common method of pulling huge chunks of static data from legacy systems.

Reducing the number of data sources helps avoid grunt work, but data quality must still be up to par. Some data will always be dirty, perhaps because it comes from outside sources or because it is inherently difficult to collect. One common example is customer birth dates, notes Anne Milley, director of technology product marketing at SAS Institute. Customers who see no reason to share their age enter false data, such as the easy-to-enter 11/11/11, or nothing at all.

In such cases, consider whether you really need that information for your analysis and, if so, how your analysis will account for the missing data so results remain meaningful, she says. Do this thinking before you deploy data collection, transformation, mining, analysis, or reporting systems, she adds.
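As a rough illustration of that kind of up-front thinking, the Python sketch below treats known placeholder birth dates as missing and reports what share of records remains usable. The placeholder set, the function names, and the 70 percent threshold are all assumptions made for this example; none of them come from SAS or from any particular BI product.

```python
from datetime import date

# Hypothetical placeholder values: ways "11/11/11" might be stored,
# plus another common default. Adjust to what your own data shows.
PLACEHOLDER_BIRTH_DATES = {
    date(1911, 11, 11),
    date(2011, 11, 11),
    date(1900, 1, 1),
}


def clean_birth_dates(values):
    """Return (cleaned, usable_share).

    Placeholders and blanks become None so that later analysis
    cannot mistake them for real birth dates.
    """
    cleaned = [
        None if v is None or v in PLACEHOLDER_BIRTH_DATES else v
        for v in values
    ]
    usable = sum(v is not None for v in cleaned)
    usable_share = usable / len(cleaned) if cleaned else 0.0
    return cleaned, usable_share


def age_analysis_is_meaningful(usable_share, minimum_share=0.7):
    """Assumed rule of thumb: skip age-based analysis below the threshold."""
    return usable_share >= minimum_share


raw = [date(1980, 5, 17), date(1911, 11, 11), None, date(1975, 1, 2)]
cleaned, usable_share = clean_birth_dates(raw)
print(usable_share, age_analysis_is_meaningful(usable_share))
```

Making the usable share explicit forces the decision to be made before reports are built: either the analysis can tolerate the gap, or the field should not drive it.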

Fleet management services provider PHH Arval provides a simple example of how such compromises can be reached. The company tracks odometer readings when truckers refuel to aid customer analyses of vehicle efficiency, delivery costs, and conformance to safety standards. But many drivers don’t take the time to transcribe odometer readings and instead enter guesstimates at the fuel terminals where this data is collected. To adjust analyses appropriately, PHH Arval created a statistical processing model that took this data weakness into account, says Greg Corrigan, the company’s vice president of BI.
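The article does not describe how PHH Arval's statistical model works, so the Python sketch below is not that model. It is a much simpler, hypothetical heuristic that shows the general idea of flagging readings that look like guesses so that downstream analyses can discount them. The function name and the rules (suspiciously round numbers, or readings lower than the last trusted one) are assumptions for illustration only.

```python
def flag_suspect_readings(readings, round_to=100):
    """Return one boolean per reading; True means it looks like a guess.

    Assumed heuristics, not PHH Arval's method:
    - a reading that is an exact multiple of `round_to` is suspicious
    - a reading lower than the last trusted reading is suspicious
    """
    flags = []
    last_trusted = None
    for reading in readings:
        is_round = reading % round_to == 0
        went_backwards = last_trusted is not None and reading < last_trusted
        suspect = is_round or went_backwards
        flags.append(suspect)
        if not suspect:
            # Only trusted readings become the baseline for later checks.
            last_trusted = reading
    return flags


odometer = [45210, 45500, 45893, 45650, 46321]
flags = flag_suspect_readings(odometer)
trusted = [r for r, suspect in zip(odometer, flags) if not suspect]
print(flags, trusted)
```

A real model would likely weight or estimate around suspect values rather than simply exclude them, but even a crude flag makes the data weakness visible to whoever reads the resulting efficiency figures.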

©1994-2009 Infoworld, Inc.