InfoWorld
STRATEGIC DEVELOPER  

Set my data free

Companies should be able to deliver personal data in the form that most benefits the customer

By Jon Udell  
April 12, 2006
 

Last weekend I helped a friend categorize her Schedule C expenses. All of her business income is in QuickBooks, but the expenses aren’t. I would have to reconstruct those from bank and credit card records. Although this friend has online accounts at both institutions, my Spidey sense was tingling: I knew there was going to be trouble.

As it turned out, the bank was a knock-over. It doesn’t export data in QuickBooks’ IIF (Intuit Interchange Format) but does offer CSV (comma-separated values). I had long ago written a CSV-to-IIF translator. So with a minimum of fuss I was able to suck the 2005 expenses into QuickBooks, where my friend could begin tagging them.
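A translator of this kind can be quite small. The following is a minimal Python sketch, not the tool mentioned in the column. It assumes a hypothetical bank export with `Date`, `Description`, and `Amount` columns (expenses negative), and it assumes the commonly documented tab-delimited IIF layout of `!TRNS`/`!SPL`/`!ENDTRNS` header rows followed by `TRNS`/`SPL`/`ENDTRNS` records. The account names and the function name `csv_to_iif` are placeholders; the exact fields QuickBooks accepts should be checked against Intuit's own documentation.

```python
import csv
import io
from decimal import Decimal

# Assumed IIF header rows: tab-delimited, each header keyword starts with "!".
HEADER_ROWS = [
    ["!TRNS", "TRNSTYPE", "DATE", "ACCNT", "NAME", "AMOUNT", "MEMO"],
    ["!SPL", "TRNSTYPE", "DATE", "ACCNT", "NAME", "AMOUNT", "MEMO"],
    ["!ENDTRNS"],
]


def csv_to_iif(csv_text, bank_account="Checking",
               expense_account="Uncategorized Expenses"):
    """Turn a bank CSV export into IIF text (hypothetical column names)."""
    rows = [list(r) for r in HEADER_ROWS]
    for rec in csv.DictReader(io.StringIO(csv_text)):
        amount = Decimal(rec["Amount"])
        # Assumption: money leaving the account is a CHECK, money arriving a DEPOSIT.
        trns_type = "CHECK" if amount < 0 else "DEPOSIT"
        # Tabs inside a field would break the tab-delimited output.
        name = rec["Description"].replace("\t", " ").strip()
        # The TRNS line hits the bank account; the SPL line balances it
        # against a catch-all account, to be re-tagged later in QuickBooks.
        rows.append(["TRNS", trns_type, rec["Date"], bank_account,
                     name, str(amount), ""])
        rows.append(["SPL", trns_type, rec["Date"], expense_account,
                     name, str(-amount), ""])
        rows.append(["ENDTRNS"])
    return "\n".join("\t".join(r) for r in rows) + "\n"


sample = "Date,Description,Amount\n01/15/2005,Office Depot,-42.50\n"
iif_text = csv_to_iif(sample)
```

Every transaction lands in a single catch-all account on purpose: the point is only to get the records into QuickBooks, where the tagging happens by hand.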

The credit card company’s defenses, though, were more formidable. Its site had a CSV dumper, too, but when I asked for 2005 transactions, all I got back was fourth-quarter records. The 12 statements from 2005 are available as PDFs, but that wasn’t what I had in mind.

Agent: I’m sorry, sir, there’s nothing else we can do.

Me: Or rather, nothing else you will do.

Agent: Would you like me to fax you those statements?

Me: (Grumble.)

Here’s a little secret I didn’t tell them. I have a superpower that enables me to do battle with the evil of data lock-in. I can’t leap tall buildings or crush lumps of coal into diamonds, but when I look at the barriers that divide one data format from another, they seem hardly to exist. For me, data transformation is almost an autonomic reflex, like breathing.

But PDF? Please, oh please, don’t make me dig the data out of those PDF files. I cajoled, I begged, I threatened. But sadly, convincing organizations to make exceptions is not a superpower I possess. So I found a PDF-to-Excel translator and went to work.

The results weren’t pretty. Entropy runs only one way, after all. It takes work to convert a less orderly system into a more orderly one. So naturally the output had to be massaged.

As I ran through a series of regular-expression search-and-replace operations in my programmer’s text editor, I was dimly aware of the fact that I was exercising a freak talent. What do normal people do? Transcribe the numbers by hand, I guess. Or, perhaps equally likely in the case of Schedule C, just invent them.
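For readers curious what that massaging looks like, here is a rough Python sketch of the same idea done with the `re` module rather than in a text editor. The line layout it expects (an MM/DD date, a description, then a dollar amount with an optional trailing `CR` for credits) is an assumption made for illustration, as are the sample text and the function name `parse_statement_lines`. Real statement dumps vary, and the pattern would need adjusting for each one.

```python
import re
from decimal import Decimal

# Assumed shape of a transaction line in the extracted text:
#   MM/DD  <description>  <amount, optionally with $ and commas>  [CR]
LINE_RE = re.compile(
    r"^\s*(?P<date>\d{2}/\d{2})\s+"        # transaction date
    r"(?P<desc>.+?)\s+"                     # description, matched lazily
    r"(?P<amount>-?\$?[\d,]+\.\d{2})"       # amount such as $1,234.56
    r"(?P<credit>\s*CR)?\s*$"               # optional credit marker
)


def parse_statement_lines(text, year=2005):
    """Keep only lines that look like transactions; drop page furniture."""
    records = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # headers, page numbers, totals, blank lines
        month, day = m.group("date").split("/")
        # Collapse the runs of spaces left over from column alignment.
        desc = re.sub(r"\s{2,}", " ", m.group("desc"))
        # Strip currency symbol and thousands separators before parsing.
        amount = Decimal(re.sub(r"[$,]", "", m.group("amount")))
        if m.group("credit"):
            amount = -amount  # convention here: charges positive, credits negative
        records.append((f"{year}-{month}-{day}", desc, amount))
    return records


raw = """Page 2 of 5
01/15   OFFICE   DEPOT #123  BOSTON MA     $1,234.56
02/03 ACME HOSTING 19.95
Total fees charged in 2005 $0.00
03/10 PAYMENT THANK YOU  $200.00 CR
"""
records = parse_statement_lines(raw)
```

Lines that do not match are dropped silently, which is convenient for page headers but also hides any transaction the pattern fails to recognize, so the output is worth checking against the statement totals.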

It doesn’t have to be this way. PayPal, for example, will happily disgorge all my transactions as far back as 2000. It even offers IIF, on the assumption that not everyone can easily convert to it from CSV.

I know what you’re thinking: That’s a security risk. And you’re right, it is. But am I any safer with all my data sitting in PDFs? I’m tempted to joke that, if we regard PDF as a mode of encryption, my statement history actually is safer than a raw transaction history would be.

But seriously, I should be able to encrypt my historical data in any format, so long as it’s not necessary to the operation of the service and I’m willing to be responsible for the key.

If I don’t make that choice, though, let’s get real. If I want to turn my data into HTML, or IIF, or PDF for that matter, I will. If you can do those transformations for me, great. But first things first. Just give me my data when I ask you for it. Not ink on paper, not a bitmapped image of ink on paper, and not even a vector representation of ink on paper. Just the data.





 


 
Jon Udell is lead analyst and blogger in chief at the InfoWorld Test Center.
