February 06, 2006

Clean house, clean data

Developers must consider data logic before writing the first piece of code

Before you build anything, you have to get your house in order -- ripping out the old, reorganizing and cleaning up what’s left. I know this firsthand: Right now I’m neck-deep in a house remodeling project that will ultimately transform my grungy old basement into a family room, home office, and bathroom.

I hired a contractor to do the construction, but he wasn’t able to start until I had cleaned out the contents of the basement, sorting and categorizing everything, then labeling it and putting it in boxes for later use. What a nightmare! Of course, my basement is now spotless, even if it is piled high with boxes labeled “Phone Stuff,” “Electrical Cords,” “Office: Paper,” and “Mini-Golf Supplies” (don’t ask).

I cite my renovation tribulations here because, oddly enough, software developers and information architects go through a similar process when they begin building -- although with less sweating and swearing, I assume. Often the biggest challenge they face, according to Contributing Editor Galen Gruman, involves data (see “Whipping Data Into Shape,” page 26).

“In a traditional environment, all your data is locked up in systems not designed to share with anything out of that environment,” Gruman says. “As enterprises integrate applications, however, the data suddenly lives in a heterogeneous environment without clear context to ensure its proper use in the new world.” To make sure all the data they deal with plays well together, developers have to think through the data logic -- including semantics, structure, and context -- before they begin. After all, you have to know what’s in your boxes, and in your walls, to figure out what to do with them after your renovation.

This is especially critical with SOA projects, which cross knowledge and subject boundaries. “If you’re working within a closed environment on services around, say, customer ID, there is a shared understanding, a context, of what a customer is,” Gruman explains. But partners in a supply chain might have a different definition of the term “customer.” So when you start sharing services across domains, “there’s a mismatch between assumptions. Then weird mistakes crop up, and services don’t produce what you think they should,” he says.

The key, then, for developers “is to treat the data abstractly,” Gruman says, “separating it from its container or passing along context to yield a common, predictable understanding of data across systems.” This requires architects and data modelers to understand what data they’re using and then to create maps or documentation so that the next person doesn’t have to reinvent the process from scratch.
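
For readers who think better in code, here is one way "passing along context" might look. It is a minimal Python sketch, not anything drawn from Gruman's research: the context tags, the `ContextualValue` class, and the `CANONICAL_TERMS` map are all hypothetical names invented for illustration. The idea is simply that an ID labeled "customer" travels with a tag saying which meaning of "customer" applies, and that a documented map translates each meaning into a shared vocabulary.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical context tags: each names what "customer" means in one domain.
RETAIL_BUYER = "retail:end-buyer"
SUPPLIER_ACCOUNT = "supply-chain:reseller-account"


@dataclass(frozen=True)
class ContextualValue:
    """A value carried together with the context needed to interpret it."""
    value: str
    context: str


# Hypothetical documented map from each domain's meaning to a shared term.
CANONICAL_TERMS = {
    RETAIL_BUYER: "person",
    SUPPLIER_ACCOUNT: "organization",
}


def to_canonical(item: ContextualValue) -> Tuple[str, str]:
    """Translate a tagged value into the shared vocabulary, or fail loudly."""
    if item.context not in CANONICAL_TERMS:
        raise ValueError(f"no documented mapping for context {item.context!r}")
    return (CANONICAL_TERMS[item.context], item.value)


# The same ID means different things depending on where it came from.
retail_customer = ContextualValue("C-1001", RETAIL_BUYER)
partner_customer = ContextualValue("C-1001", SUPPLIER_ACCOUNT)

print(to_canonical(retail_customer))
print(to_canonical(partner_customer))
```

The design choice worth noting is the refusal to guess: a value arriving with an undocumented context raises an error instead of being quietly treated as a "customer," which is the kind of weird mistake Gruman describes.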

It’s not a pretty business. Fortunately, Gruman’s down-and-dirty research points to the most effective strategies for data transformation and provides a terrific starting point for developers embarking on strategic cross-domain projects. Or you can take my approach and just label everything “Mini-Golf Supplies.”

©1994-2009 Infoworld, Inc.