"For us, the issue is: how do you enable a world where you can keep everything cost-effectively? We want a way to keep everything and then make it valuable. Having all that data helps us do our jobs better," he said.
LiveOffice currently stores some 4 petabytes of data on disk and adds another 5TB to that pool each day. LiveOffice encrypts all of the data for customer safety.
LiveOffice uses data analytics tools, such as MapReduce technology like Hadoop and distributed databases like Cassandra, to mine massive data stores on Isilon arrays. The tools let it search data for legal discovery and regulatory compliance requests, and they offer insight into customers' habits.
Stephen Martino, director of production operations at Harvard Medical School, said the time is coming when there will be a demand from corporate users for mining services.
What IT managers need is a way to track who is using what, and that capability still appears to be missing from the tools vendors provide, he said.
"A researcher has no boundaries on how much they can store, even 1TB to 2TB per day. I think the biggest struggle we have is you need to gather data that spells out who in the research lab is consuming data for chargeback," he said.
Paul English, director of IT at 3TIER, which provides extensive weather data to renewable energy companies, said his IT staff had been spending hours a day in meetings to figure out where data goes and who is responsible for managing it. "We've never not been dealing with big data," he said. "We want to keep 10 or 20 years of climatological data. We have growth potential of many petabytes."
To address the data deluge, his company installed 14 Isilon NAS arrays to create an expandable pool, accessible by anyone in his company.
"Now [capacity is] delivered more as a utility," he said.
One continuing issue, the IT managers said, is data movement: migrating data to the correctly priced storage tier and keeping it as close as possible to the people using it.
"You're talking terabytes per day that you can never keep up with on [the] operations side," Martino said. "You can never get that data from one site to another."
The solution for Harvard Medical School was to use EMC's Isilon clustered NAS array, which provided a single name space to which any group could store and access data.
Lowey said TGen must constantly move data back and forth between gene sequencing computers in Phoenix and a supercomputer in Tempe that's used to process results.
"We had a one gigabit dedicated link. That didn't last. Now we have a 10Gbit [Ethernet] link, and we're actually playing with the idea of using InfiniBand," he said.
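For a rough sense of why a one gigabit link stops being enough, the sketch below estimates how long a bulk transfer takes at a given link speed. It is back-of-envelope arithmetic only: the 1TB batch size is a hypothetical figure rather than a TGen number, the function name `transfer_hours` is made up for illustration, and the default assumes the full line rate is usable, which real links rarely deliver.

```python
def transfer_hours(terabytes: float, link_gbits_per_sec: float,
                   efficiency: float = 1.0) -> float:
    """Estimate the hours needed to move `terabytes` over a network link.

    Assumptions (not from the article): decimal units (1TB = 10**12 bytes)
    and `efficiency` as the usable fraction of the nominal line rate.
    """
    bits_to_move = terabytes * 1e12 * 8
    usable_bits_per_sec = link_gbits_per_sec * 1e9 * efficiency
    return bits_to_move / usable_bits_per_sec / 3600


# Compare the two link speeds mentioned above for a hypothetical 1TB batch.
for gbits in (1, 10):
    print(f"{gbits} Gbit/s link: {transfer_hours(1, gbits):.2f} hours per TB")
```

At the ideal line rate, each terabyte takes a little over two hours on a one gigabit link and roughly 13 minutes on a 10 gigabit link, so a site generating several terabytes a day can occupy the slower link for much of the day.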
One current quandary the IT managers all agreed on was how big data is changing the way they think about storing information. Most said they want to store everything because they don't know what its value may be to the company at some later point in time.
"We're all in the same boat," Lowey said.
Lucas Mearian covers storage, disaster recovery and business continuity, financial services infrastructure and health care IT for Computerworld. Follow Lucas on Twitter at @lucasmearian, send email to firstname.lastname@example.org or subscribe to Lucas's RSS feed.