How to wrangle distributed data

Delivering fast and reliable access to enterprise storage across a wide area network can be a huge challenge. Here are some common strategies and pitfalls

Managing corporate data is hard enough if your organization has only one site. Throw in a multitude of remote sites spread across the country, and you're talking about a real challenge. As data grows at unprecedented rates, higher costs for new and upgraded WAN circuits seem inevitable. But take heart; there is hope. All you have to do is avoid the mistakes most people make in managing distributed environments -- and embrace the right approach.

The status quo

Today, in most multisite scenarios, each site has its own storage pool -- whether it's NAS/SAN or a traditional file server -- and maintains its own data, sharing with other sites as necessary. Though not particularly storage-efficient, with proper management this can work well if each site represents a distinct business unit with its own data. But few of us are lucky enough to have such hard lines of demarcation. More often than not, significant portions of data must be shared across sites.


In the conventional model, there are a few ways to deal with this. One is to define a home location for the data in question and have all of the other sites access this storage pool over the WAN as needed. This generally results in lots of WAN bandwidth being consumed by the same files moving from servers at one site to clients at another.

Another option is to replicate copies of the data to all remote sites so that they can be accessed locally, but this invites the nightmare scenario in which users at more than one site modify the same files concurrently.
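To see why concurrent edits are so hard for naive replication to handle, consider a minimal sketch of version-vector bookkeeping (the names and structure here are illustrative assumptions, not any particular replication product): each replica counts its own edits per site, and comparing two replicas' counters reveals whether one copy strictly supersedes the other or the two sites edited independently -- the conflict case plain file copying silently papers over.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: each replica keeps a per-site edit counter
# (a simple version vector). A local write bumps that site's counter;
# at sync time, comparing vectors distinguishes "one copy is strictly
# newer" from "both sites edited concurrently" (a true conflict).

@dataclass
class Replica:
    site: str
    versions: dict = field(default_factory=dict)  # site -> edit count

    def write(self):
        self.versions[self.site] = self.versions.get(self.site, 0) + 1

def compare(a: Replica, b: Replica) -> str:
    sites = set(a.versions) | set(b.versions)
    a_newer = any(a.versions.get(s, 0) > b.versions.get(s, 0) for s in sites)
    b_newer = any(b.versions.get(s, 0) > a.versions.get(s, 0) for s in sites)
    if a_newer and b_newer:
        return "conflict"   # concurrent edits at both sites
    return "a" if a_newer else ("b" if b_newer else "equal")

ny = Replica("nyc")
la = Replica("la")
ny.write()   # New York edits the file
la.write()   # Los Angeles edits its local copy before any sync
print(compare(ny, la))  # -> conflict
```

Real replication engines resolve such conflicts with policies like last-writer-wins or by quarantining the losing copy -- either of which can quietly discard someone's work, which is exactly why this model is described as a nightmare.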

Clearly, there must be a better way.

Bringing the users to the data

Perhaps the easiest way to deal with storage in a distributed multisite network is to keep all of the storage resources at a single site and bring the remote users to it through the use of Server-Based Computing (SBC). Between Microsoft Terminal Services, Citrix XenApp, and the various available VDI implementations, there is a remote computing option that will fit nearly any workload.

The primary benefit of this approach is that it replaces short-lived, high-bandwidth file transfers with relatively constant, low-bandwidth user sessions. Moreover, expenditures on data center infrastructure such as storage and backup hardware can be centralized in a single location, which can create massive capital and operational efficiencies.
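A back-of-envelope calculation makes the trade-off concrete (all figures below are illustrative assumptions, not measured or vendor numbers): opening a file over the WAN is a short burst that wants the whole link to itself, while a screen-remoting session draws a small, steady trickle.

```python
# Illustrative sketch of "bursty file transfers vs. steady sessions".
# All numbers are assumptions chosen for the example.

file_mb = 20         # size of a typical shared document, in MB
open_seconds = 10    # tolerable wait to open it across the WAN
burst_mbps = file_mb * 8 / open_seconds   # link rate one open consumes

session_kbps = 30    # nominal average draw of one thin-client session
users = 25           # users at the remote site
steady_mbps = users * session_kbps / 1000  # whole site's steady draw

print(f"one file open:    ~{burst_mbps:.0f} Mbit/s burst")
print(f"{users} SBC sessions: ~{steady_mbps:.2f} Mbit/s steady")
```

Under these assumptions, a single file open demands roughly 16 Mbit/s for ten seconds, while the entire site's remote sessions draw under 1 Mbit/s continuously -- which is why SBC traffic is so much easier to size a circuit for.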

This approach is not without its drawbacks. The most critical issue is that SBC makes each remote site heavily dependent on the reliability of its WAN circuit. If that circuit goes down, the remote site doesn't just lose access to corporate data housed at the headquarters site; it is dead in the water until connectivity is reestablished. There are many excellent ways of providing highly reliable redundant connectivity for remote sites, but depending upon where your remote sites are located, this may not always be a viable option.
