As corporate data stores continue to grow, in some cases by more than 50 percent a year, the expanding task of managing them and mining them for information is forcing a change in how IT workers are trained.
The discussion was sponsored by EMC's Isilon division, which manufactures clustered network-attached storage (NAS) systems used to house massive data warehouses under single domain name spaces.
Chris McNally, a storage architect with IT hosting company Sungard, said he is helping to cross-train employees to learn how various systems fit into a larger IT ecosystem.
For example, McNally said, AIX and backup administrators have volunteered to undergo storage area network (SAN) and cloud storage training at Sungard.
"So instead of me being the storage guy who has to argue with these other guys about how this works and what we do, they're now able to make intelligent requests with regard to storage," he said. "It creates a better product in the end."
James Lowey, director of network and computer systems at genome sequencer company Translational Genomics Institute (TGen), said workers in his company's traditional IT shop are required to learn how networks, operating systems and storage interact.
Lowey said mapping human genomes currently creates 2TB worth of new data every week and he expects that to grow to 10TB per week by year's end. The genomic data is used to tailor drug compounds to treat diseases specific to a person's genetic profile.
He noted that coming up with the best way to mine that data for important information continues to be a sticking point.
Looking to come up with a solution to that problem, EMC last year spent more than $3 billion to acquire companies like Isilon and data warehousing and analytics company Greenplum.
"Do I keep [data] or do I not keep it has been an age-old question," said Paul Rutherford, CTO of Isilon.
For Lowey, keeping all the data produced is an issue his company is struggling with. On one hand, there's nothing more personal than genomic data. So keeping everything means keeping everything secure for as long as you have it. But the data store also continues to be a good source of information that can be mined for its scientific value in creating custom drug treatments.
"The reason we keep everything forever is that we're not sure about what it is we have," he said. "In life sciences there's so much to learn, and so much unknown."
EMC announced here that it has ramped up programs to train and certify "data scientists." A data scientist spends his or her time determining the value of a corporation's data.
Nick Mehta, CEO of cloud storage provider LiveOffice, said data persists whether it's properly stored or not.