InfoWorld's top 10 emerging enterprise technologies

2009's up-and-coming technologies for business that will have the greatest impact in years to come

Page 2 of 3

Non-x86 processor vendors are also deeply involved in this fray. For example, Tilera currently sells a 16-core chip and expects to ship a 100-core monster in 2010. What will IT do with so many cores? In the case of Tilera, the chips go into videoconferencing equipment enabling multiple simultaneous video streams at HD quality. In the case of Intel, the many cores enable the company to explore new forms of computing on a single processor, such as doing graphics from within the CPU. On servers, the many-core era will enable huge scalability and provide platforms that can easily run hundreds of virtual machines at full speed.

It's clear the many-core era -- which will surely evolve into the kilo- and megacore epoch -- will enable us to perform large-scale operations with ease and at low cost, while enabling true supercomputing on inexpensive PCs.

-- Andrew Binstock

6. Solid-state drives

SSDs (solid-state drives) have been around since the last century, but recently, we've seen an explosion of new products and a dramatic drop in SSD prices. In the past, SSDs have been used primarily for applications that demand the highest possible performance. Today we're seeing wider adoption, with SSDs being used as external caches to improve performance in a range of applications. Gigabyte for gigabyte, SSDs are still a lot more expensive than disk, but they are cheaper than piling on internal server memory.

Compared to hard drives, SSDs are not only faster for both reads and writes, they also support higher transfer rates and consume less power. On the downside, SSDs have limited life spans, because each cell in an SSD supports a limited number of writes.

[ Wondering where SSDs fit into your datacenter architecture? See "Four considerations for SSD deployment." ]

There are two types of SSDs: single-level cell (SLC) and multilevel cell (MLC). SLCs are faster than MLCs and last as much as 10 times longer (and, as you might imagine, cost a lot more). Write endurance has been a big barrier to SSDs, but increasing write specs and the smarter use of built-in DRAM caches are making the value proposition more attractive. Some manufacturers increase the longevity of drives by adding more actual capacity than the stated capacity, and they use wear-leveling algorithms to spread data over the extra cells.
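To make the over-provisioning and wear-leveling idea concrete, here is a minimal sketch in Python. It is a toy model, not any vendor's firmware: the class name, the cell counts, and the "always write to the least-worn cell" policy are all assumptions made for illustration.

```python
class WearLeveledDrive:
    """Toy model: every write is steered to the least-worn physical cell."""

    def __init__(self, stated_cells, spare_cells):
        # Physical capacity exceeds the stated capacity (over-provisioning).
        self.write_counts = [0] * (stated_cells + spare_cells)

    def write(self):
        # Choose the cell with the fewest writes so far, then record the write.
        target = min(range(len(self.write_counts)),
                     key=self.write_counts.__getitem__)
        self.write_counts[target] += 1
        return target


# Hypothetical drive: 8 advertised cells plus 2 hidden spares.
drive = WearLeveledDrive(stated_cells=8, spare_cells=2)
for _ in range(1000):
    drive.write()

wear_spread = max(drive.write_counts) - min(drive.write_counts)
```

In this model the 1,000 writes land evenly on all 10 physical cells, so each cell absorbs fewer writes than it would if only the 8 advertised cells were in play.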

But the most dramatic story is pricing. A 32GB SSD has gone from over $1,000 to under $100 in the last five years, though this is still about 46 times as expensive as a SATA drive in dollars per gigabyte. As new solutions to the wear problem emerge from the lab, we expect SSD adoption to accelerate even more, as the hunger for high performance in cloud computing and other widely shared applications increases.
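The per-gigabyte comparison can be checked with a few lines of arithmetic. This sketch assumes the SSD costs exactly $100 (the text says only "under $100"), so the results are upper bounds rather than quoted prices.

```python
ssd_price_usd = 100.0    # assumption: the ceiling implied by "under $100"
ssd_capacity_gb = 32
price_ratio = 46         # SSD cost per gigabyte relative to SATA, from the text

ssd_usd_per_gb = ssd_price_usd / ssd_capacity_gb
implied_sata_usd_per_gb = ssd_usd_per_gb / price_ratio

print(f"SSD:  ${ssd_usd_per_gb:.2f} per GB")
print(f"SATA: ${implied_sata_usd_per_gb:.3f} per GB (implied)")
```

Under that assumption the SSD works out to a little over $3 per gigabyte, and the implied SATA price is roughly 7 cents per gigabyte.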

-- Logan Harbaugh

5. NoSQL databases

Data is flowing everywhere like never before. And the days when "SQL" and "database" were interchangeable are fading fast, in part because old-fashioned relational databases can't handle the flood of data from Web 2.0 apps.

The hottest Web sites are spewing out terabytes of data that bear little resemblance to the rows and columns of numbers from the accounting department. Instead, the details of traffic are stored in flat files and analyzed by cron jobs running late at night. Diving into and browsing this data require a way to search for and collate information, which a relational database might be able to handle if it weren't so overloaded with mechanisms to keep the data consistent in even the worst possible cases.

[ In InfoWorld's "Slacker databases break all the old rules," Peter Wayner reviews four NoSQL databases: Amazon SimpleDB, CouchDB, Google App Engine, and Persevere. ]

Sure, you can make anything fit into a relational database with enough work, but that means you're paying for all of the sophisticated locking and rollback mechanisms developed for the accounting department to keep track of money. Unless the problem requires all of the sophistication and assurance of a top-of-the-line database, there's no need to invest in that overhead, or suffer its performance consequences.

The solution? Relax the strictures and come up with a new approach: NoSQL. Basic NoSQL databases are simple key/value pairs that bind together a key with a pile of attributes. There's no table filled with blank columns and no problem adding new ad hoc tags or values to each item. Transactions are optional.
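A bare-bones illustration of the key/value idea, sketched in Python with an in-memory dictionary. The `put` and `get` helpers and the sample keys are hypothetical and do not correspond to any particular product's API.

```python
store = {}


def put(key, **attributes):
    # Merge new attributes into whatever is already stored under the key.
    # There is no schema, so any item can carry any set of attributes.
    store.setdefault(key, {}).update(attributes)


def get(key):
    return store.get(key, {})


put("user:1", name="Ada")
put("user:2", name="Grace", favorite_color="green")  # ad hoc attribute
put("user:1", last_login="2009-11-30")                # added later, no migration

print(get("user:1"))
```

Only the items that need an attribute carry it, so there are no blank columns to maintain.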

Today's NoSQL solutions include Project Voldemort, Cassandra, Dynamite, HBase, Hypertable, CouchDB, and MongoDB, and it seems like more are appearing every day. Each offers slightly different ways to access the data. CouchDB, for instance, wants you to write your query as a JavaScript function. MongoDB has included sharding -- where a large database is broken into pieces and distributed across multiple servers -- from the beginning.
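Sharding can be sketched in a few lines of Python. This example routes each key to a server by hashing it, which is one common approach and not necessarily how MongoDB or any other product partitions data; the server names are made up.

```python
import hashlib

SHARDS = ["server-a", "server-b", "server-c"]  # hypothetical server names


def shard_for(key):
    # Hash the key so that the same key always maps to the same server.
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]


for key in ("user:1", "user:2", "order:9000"):
    print(key, "->", shard_for(key))
```

A simple modulo scheme like this reshuffles most keys whenever a server is added or removed, which is one reason production systems tend to use more elaborate partitioning.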

Simple key/value pairs are just the start. Neo4J, for instance, offers a graph database that uses queries that are really routines for wandering around a network. If you want the names of the dogs of all of the friends of a friend, the query takes only a few lines to code.
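The friends-of-a-friend example might look something like this when written as a plain graph walk in Python. This is not Neo4J's API; the dictionaries stand in for the graph, and the people and dogs are invented.

```python
# Hypothetical social graph: who counts whom as a friend.
friends = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave", "erin"],
}
# Hypothetical ownership edges: who owns which dogs.
dogs = {"dave": ["Rex"], "erin": ["Fido", "Spot"]}


def dogs_of_friends_of_friends(person):
    names = set()
    for friend in friends.get(person, []):
        for friend_of_friend in friends.get(friend, []):
            if friend_of_friend != person:
                names.update(dogs.get(friend_of_friend, []))
    return sorted(names)


print(dogs_of_friends_of_friends("alice"))  # ['Fido', 'Rex', 'Spot']
```

The query is just two hops along "friend" edges followed by one hop along an "owns" edge, which is the kind of traversal a graph database is built to express directly.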

The real game is keeping the features that are necessary while avoiding the ones that aren't. Project Cassandra, for instance, promises to offer consistent answers "eventually," which may be several seconds in a heavily loaded system. Neo4J requires the addition of Lucene or some other indexing package if you want to look for particular nodes by name or content because Neo4J will only help you search through the network itself.

All of these new projects are just the latest to rediscover the speed that might be found by relaxing requirements. Look for more adjustments that relax the rules while enhancing backward compatibility and ease-of-use. And expect a new era of data processing like nothing we've experienced before.

-- Peter Wayner

4. I/O virtualization

I/O virtualization addresses an issue that plagues servers running virtualization software such as VMware or Microsoft Hyper-V. When a large number of virtual machines runs on a single server, I/O becomes a critical bottleneck, both for VM communication with the network and for connecting VMs to storage on the back end. I/O virtualization not only makes it easier to allocate bandwidth across multiple VMs on a single server, it paves the way to dynamically managing the connections between pools of physical servers and pools of storage.

But let's start with the individual server. Take, for example, VMware's recommendation to allocate one gigabit Ethernet port per VM. A server that supports 16 VMs would therefore need four four-port gigabit Ethernet NICs, plus additional Ethernet (iSCSI), SCSI, or Fibre Channel adapters for the necessary storage. Many servers don't have enough empty slots to support that many adapters, even if the cooling capacity were adequate. And 16 VMs per host is barely pushing it, considering that today's Intel and AMD servers pack anywhere from 8 to 24 cores and support hundreds of gigabytes of RAM. Consolidation ratios can go much higher.
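The adapter arithmetic is easy to reproduce. This Python sketch assumes one gigabit port per VM and four-port NICs, as in the example above; the function name and the 40-VM case are illustrative only.

```python
import math


def nics_needed(vm_count, ports_per_nic=4, ports_per_vm=1):
    # Round up, because a partly used NIC still occupies a whole slot.
    return math.ceil(vm_count * ports_per_vm / ports_per_nic)


print(nics_needed(16))  # 4 four-port NICs for 16 VMs
print(nics_needed(40))  # 10 NICs for a denser host
```

The slot count grows linearly with the consolidation ratio, and that is before any storage adapters are added.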

[ I/O virtualization is key to the highly scalable architecture of Cisco's Unified Computing System. See "Test Center review: Cisco UCS wows." ]

In response, I/O virtualization vendors such as Xsigo and Cisco have come up with a way to give each server one very high-speed connection instead of multiple Ethernet and Fibre Channel connections. One adapter per server can then provide many virtual connections. These adapters are not custom HBAs, but standard 10 gigabit InfiniBand or Ethernet adapters used with drivers in the OS that let the OS treat the single fast connection as multiple network and storage connections. Since everything is running over a single pipe, the system can grant bandwidth to the virtual connections as needed, providing maximum performance where appropriate.
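One way to picture bandwidth being granted "as needed" over a single pipe is a proportional-share calculation. The Python sketch below is an assumption for illustration: the article does not describe the allocation policy Xsigo or Cisco actually use, and the connection names and demands are invented.

```python
def allocate(demands_gbps, pipe_gbps=10.0):
    """Share one physical pipe among virtual connections."""
    total = sum(demands_gbps.values())
    if total <= pipe_gbps:
        # Everything fits, so each connection gets what it asked for.
        return dict(demands_gbps)
    # Oversubscribed: scale every demand down by the same factor.
    scale = pipe_gbps / total
    return {name: demand * scale for name, demand in demands_gbps.items()}


light_load = allocate({"vm-network": 1.0, "storage": 2.0})
heavy_load = allocate({"vm-network": 8.0, "storage": 12.0})
print(light_load)
print(heavy_load)
```

With fixed one-gigabit ports, an idle port's capacity is stranded; with a shared pipe, whatever one virtual connection leaves unused is available to the others.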

Typically, a single adapter resides in each server, connected by a single cable to the appliance or switch, which then provides both network and storage ports to connect to storage and other networks. This simplifies datacenter cabling, as well as the installation of each server. It also eases the task of transferring adapters to another system if a server fails. In solutions such as Cisco UCS, I/O virtualization makes server provisioning, repurposing, and failover extremely flexible and potentially completely automated, as it's handled entirely in software. Further, because the I/O virtualization systems can emulate either multiple Ethernet or Fibre Channel connections running at varying speeds, available bandwidth can be tailored to the requirements of VM migration or other heavy loads.

Virtualizing I/O does require drivers that support the specific OS in use. The major operating systems and virtualization platforms are supported, including VMware ESX and Windows Server 2008 Hyper-V, but not necessarily all versions of Linux and Xen or other open source virtualization platforms. If you're using supported OSes, I/O virtualization can make running a large datacenter much simpler and far less expensive, particularly as increased processing power and memory support allow servers to handle ever larger numbers of virtual machines.

-- Logan Harbaugh

3. Data deduplication

Data is the lifeblood of any business. The problem is what to do with all of it. According to IDC, data in the enterprise doubles every 18 months, straining storage systems to the point of collapse. The blame for this bloat often falls on compliance regulations that mandate the retention of gobs of messages and documents. More significant, though, is that there's no expiration date on business value. Analyzing data dating back years allows users to discover trends, create forecasts, predict customer behavior, and more.

Surely there must be a way to reduce the immense storage footprint of all of this data, without sacrificing useful information. And there is, thanks to a technology known as data deduplication.

Every network contains masses of duplicate data, from multiple backup sets to thousands of copies of the employee handbook to identical file attachments sitting on the same e-mail server. The basic idea of data deduplication is to locate duplicate copies of the same file and eliminate all but one original copy. Each duplicate is replaced by a simple placeholder pointing to the original. When users request a file, the placeholder directs them to the original and they never know the difference.

Deduplication takes several forms, from simple file-to-file detection to more advanced methods of looking inside files at the block or byte level. Basically, dedupe software works by analyzing a chunk of data, be it a block, a series of bits, or the entire file. This chunk is run through an algorithm to create a hash that is, for practical purposes, unique to its contents. If the hash is already in the index, that means that chunk of data is a duplicate and doesn't need to be stored again. If not, the chunk is stored and its hash is added to the index, and the process moves on to the next chunk.
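The hash-and-index loop described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product works: the fixed 4KB chunk size, the choice of SHA-256, and the function names are all assumptions.

```python
import hashlib


def deduplicate(data, chunk_size=4096):
    index = {}   # hash -> the single stored copy of that chunk
    recipe = []  # ordered hashes needed to rebuild the original data
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in index:
            index[digest] = chunk  # first time seen: store it
        recipe.append(digest)      # duplicates cost only a pointer
    return index, recipe


def restore(index, recipe):
    return b"".join(index[digest] for digest in recipe)


# Hypothetical data: three identical chunks followed by one different chunk.
data = b"A" * 4096 * 3 + b"B" * 4096
index, recipe = deduplicate(data)
print(len(recipe), "chunks referenced,", len(index), "chunks stored")
```

Here four chunks are referenced but only two are stored, and `restore` rebuilds the original bytes from the recipe.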

Data deduplication isn't just for data stored in a file or mail system. The benefits in backup situations, especially with regard to disaster recovery, are massive. On a daily basis, the percentage of changed data is relatively small. When transferring a backup set to another datacenter over the WAN, there's no need to move the same bytes each and every night. Use deduplication and you vastly reduce the backup size. WAN bandwidth usage goes down and disaster recovery ability goes up.
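The backup case follows the same logic: hash tonight's backup and ship only the chunks the remote site has not already seen. The Python sketch below assumes fixed 4KB chunks and SHA-256 hashes, and the backup contents are invented.

```python
import hashlib


def chunks_to_send(backup, remote_hashes, chunk_size=4096):
    """Return only the chunks whose hashes the remote site lacks."""
    to_send = {}
    for offset in range(0, len(backup), chunk_size):
        chunk = backup[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in remote_hashes:
            to_send[digest] = chunk
    return to_send


# Hypothetical backups: only the middle chunk changes overnight.
monday = b"A" * 4096 + b"B" * 4096 + b"C" * 4096
tuesday = b"A" * 4096 + b"X" * 4096 + b"C" * 4096

remote_hashes = set(chunks_to_send(monday, set()))  # first night sends everything
delta = chunks_to_send(tuesday, remote_hashes)
bytes_over_wan = sum(len(chunk) for chunk in delta.values())
print(bytes_over_wan, "of", len(tuesday), "bytes cross the WAN")
```

On the second night, only the one changed chunk crosses the WAN instead of the full backup set.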

More and more backup products are incorporating data deduplication, and deduplication appliances have been maturing over the past few years. File system deduplication is on its way too. When it comes to solving real-world IT problems, few technologies have a greater impact than data deduplication.

-- Keith Schultz

2. Desktop virtualization

Desktop virtualization has been with us in one form or another seemingly forever. You could probably even say that it's been emerging since the mid-1990s. But there's more to desktop virtualization today than most of us could have imagined even two or three years ago. Yet another milestone is just around the corner: truly emergent technology in the guise of the desktop hypervisor.

Long the leader in this space, Citrix Systems' XenApp and XenDesktop are examples of how desktop virtualization just might put a desktop server farm in every datacenter and a thin client on every desktop. XenDesktop weaves together all the prevalent desktop and application virtualization technologies into a single package: traditional application and desktop sessions, application streaming, and VDI (Virtual Desktop Infrastructure). No matter which way you turn, the detriments of each are generally offset by the benefits of another.

[ Desktop virtualization, three ways: Check out InfoWorld's detailed evaluation of VMware View, Citrix XenDesktop, and Citrix XenApp. ]

The client hypervisor takes desktop virtualization the last mile. Picture each desktop running its own bare-metal virtualization layer that abstracts the baseline hardware to whatever VM you wish to push to the desktop, where it can be centrally managed, synced with a mirror on a server, and easily replaced (or even reset by the user) when things go wrong. Citrix isn't alone with this concept -- VMware is developing a similar solution, and both promise to hit the market in 2010.

Regardless of what solutions are available today and what solutions may be on the horizon, enterprise desktop management remains one of the biggest points of pain in any organization. While the model for datacenter architecture has changed systemically in the past 20 years, the model for deploying desktops hasn't. In most places, it's still one fat box per user, with some mishmash of management tools layered across the top to protect the users from themselves and protect the network from the users.
