How to integrate with the cloud

All it takes is a credit card to spin up a SaaS application. But consider how you integrate with that cloud app, or you'll be condemned to create another silo.

When businesses decide to go to the cloud for an enterprise application and open an account with a SaaS (software as a service) provider, they typically don't consider how that SaaS app will integrate with their existing software.

But integration is crucial. By now, every business understands you can't have multiple applications operating on different versions of the same customer record, for example, without those versions being updated and reconciled.


Without a solid integration strategy, data quality quickly becomes a problem. You don't want a new SaaS system to be hindered by having to enter data twice -- or worse, by not having the correct data available when a core business process requires it.

So how do enterprises that adopt SaaS applications develop an effective approach to integration? As always, the process begins with business requirements. The good news is that new, innovative integration technologies offer cost efficiencies unavailable just a few years ago -- although in some cases, requirements dictate that you opt for an old-school integration solution.

With SaaS, latency will be more of an issue, and impoverished APIs may limit integration benefits. In general, integration of SaaS apps is restricted to data integration and asynchronous process integration, ruling out the closely coupled application clusters some enterprises depend on. Within these constraints, how far you decide to push integration with SaaS apps depends on your business needs.

Making SaaS play nice with data

The beauty -- and the downside -- of SaaS is that the businesspeople don't need IT to establish accounts and to get up and running. IT has less work to do in the short term. But without integration, SaaS silos spring up, resulting in duplicate data, inaccurate reports, and ultimately, damaging data discrepancies.

Integration technology allows clouds and core enterprise systems to share data while dealing with the different ways that the data is structured. This is accomplished through data mediation subsystems that manage the underlying differences in both structure and content in flight. With SaaS in particular, you need a flexible integration solution, because both the source and target system interfaces change more frequently than those presented by traditional enterprise software.

Back in the '90s, integration technology was immature and expensive. These days, you can find lightweight open source integration solutions, such as that provided by Jitterbit, or cloud-delivered integration offered by the likes of Boomi (now a part of Dell) or Pervasive Software. Even integration appliances have emerged, such as that offered by Cast Iron Systems (now a part of IBM).

This is on top of the fifth- or sixth-generation, enterprise-class integration solutions sold by IBM, Informatica, Oracle, Software AG, and other established players that have been around for years.

So how do you choose the right solution from the dozens available? It helps to start by understanding typical integration patterns and the features they demand.

The fundamentals of integration

There are several ways to move data from one system to another, some more sophisticated than others. For example, many enterprises still rely on the primitive FTP method to transfer data -- even when integrating newfangled SaaS with local applications.

The typical way to accomplish this is to lay down data from the source systems into a file once a day. Next, transfer that file from the cloud provider to the enterprise server and load the data into the target application or database. While this may seem reasonable, no mechanisms deal with the differences in data structure or content. Also, the transfer can happen once or twice a day at most, so data latency is an issue. Finally, failures may leave the source or target systems with bad or inaccurate data. Although FTP seems like the simplest approach, it's never the right one.

In the same vein, some organizations opt to build integration technology themselves, in effect coding an integration server from scratch. While this keeps developers busy, the results are almost always ineffective and inefficient. Now that such a broad range of integration solutions is available and affordable, there's no excuse to go down the path of ground-up custom coding.

That leaves you with commercially available and open source solutions, which vary widely. Navigating the technology requires some basic understanding, including the concepts of semantic mediation, connectivity, validation, and routing.

Semantic mediation (also known as data transformation) is the process of dealing with the differences in data structures or data semantics between the source system and the target system -- say, from a SaaS application to an SAP target system. The structures and data content are changed in flight while moving from source to target, such as First_Name (char 20) to F_Name (char 10). Data is sent to the target using the target's native structure, even though the structure consumed from the source is foreign.

Typically the links between source and target structures are set up using maps that chart the structure from the source schema to the target schema. Within most integration engines, this is typically a visual, drag-and-drop process. Structures can be mediated in a matter of minutes, and information can flow between two very different data structures.
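
A field map of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the First_Name to F_Name pairing and its column widths come from the example above, while the Last_Name entry, the function name, and the truncation policy are hypothetical. A real integration engine might reject or flag an oversized value rather than cut it.

```python
# Hypothetical field map: source field -> (target field, max length).
# Only First_Name -> F_Name (char 20 -> char 10) comes from the article;
# the Last_Name entry is made up for illustration.
FIELD_MAP = {
    "First_Name": ("F_Name", 10),
    "Last_Name": ("L_Name", 20),
}


def mediate(source_record, field_map=FIELD_MAP):
    """Rename source fields and fit values to the target column widths."""
    target_record = {}
    for source_field, (target_field, max_len) in field_map.items():
        value = source_record.get(source_field, "")
        # Truncation is an assumed policy, not the only reasonable one.
        target_record[target_field] = str(value).strip()[:max_len]
    return target_record


print(mediate({"First_Name": "Maximiliana-Josephine", "Last_Name": "Smith"}))
```

The drag-and-drop mapping tools in commercial engines produce something conceptually similar: a declarative table that the engine applies to each record in flight.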

Connectivity is the ability of the integration technology to adapt to the interfaces provided by the cloud or enterprise-based systems -- typically, APIs. Adapters account for the differences in the interfaces and the way the integration technology deals with the data. For example, you invoke a Web service on the SaaS application that produces data bound to a structure, and the adapter is able to consume that data into the integration engine, where it is manipulated as required -- and then sent out through another adapter to a local application, such as an ERP or an inventory control system.
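
The adapter idea can be sketched roughly as follows. Every name here (SaaSSourceAdapter, ErpTargetAdapter, run_integration, fake_service) is hypothetical, and the Web service is replaced by a function that returns canned JSON so the example runs without a network. It shows the shape of the pattern, not any vendor's API.

```python
import json


class SaaSSourceAdapter:
    """Hypothetical adapter: turns a Web service's JSON reply into records."""

    def __init__(self, fetch):
        # `fetch` stands in for the real Web service call; it returns JSON text.
        self._fetch = fetch

    def read(self):
        return json.loads(self._fetch())["records"]


class ErpTargetAdapter:
    """Hypothetical adapter: hands records to a local application."""

    def __init__(self):
        self.written = []  # stands in for the ERP's own interface

    def write(self, record):
        self.written.append(record)


def run_integration(source, target, transform=lambda record: record):
    """Pull from one adapter, manipulate in the engine, push to another."""
    count = 0
    for record in source.read():
        target.write(transform(record))
        count += 1
    return count


def fake_service():
    # Canned response used instead of a live SaaS endpoint.
    return json.dumps({"records": [{"id": 1, "name": "acme"}]})


erp = ErpTargetAdapter()
moved = run_integration(
    SaaSSourceAdapter(fake_service),
    erp,
    transform=lambda r: {**r, "name": r["name"].upper()},
)
print(moved, erp.written)
```

Because the engine only talks to the read and write methods, swapping one SaaS provider or local application for another means replacing an adapter, not rewriting the flow.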

Validation is the ability of an integration server to validate data, such as making sure a ZIP code is correct. Routing is the ability to make sure the right data ends up getting to the right system.
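
As a minimal sketch of both ideas, assuming U.S. ZIP codes and a made-up routing table: the check below tests format only and cannot tell whether a ZIP code actually exists, and the destination names are hypothetical.

```python
import re

# Format check only: five digits, optionally followed by a hyphen and
# a four-digit extension.
ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")


def is_valid(record):
    return bool(ZIP_PATTERN.fullmatch(str(record.get("zip", ""))))


# Hypothetical routing table: record type -> destination system name.
ROUTES = {"customer": "crm", "order": "erp"}


def route(record):
    """Return the destination for a record, or a holding label for bad data."""
    if not is_valid(record):
        return "rejected"
    return ROUTES.get(record.get("type"), "unrouted")


print(route({"type": "customer", "zip": "94105"}))
print(route({"type": "order", "zip": "9410"}))
```

Validating before routing keeps bad records out of every downstream system at once, which is the point of doing it in the integration layer rather than in each application.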

The way this technology works is rather simple: It reacts to events, such as a customer record being updated or a sale being recorded. In reacting to the event, it carries out some preprogrammed function, such as extracting the changed data from the local enterprise system, accounting for the differences in structure and content, and updating the remote cloud-based system with the changed data, typically in less than a second. These events can occur at a rate of hundreds or thousands a minute, or just a few per day.
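
The event-driven flow described above might look like this in miniature. The class, the event name, and the in-memory dictionary standing in for the cloud-based system are all assumptions made for the sketch. A production engine would add queuing, retries, and error handling.

```python
from collections import defaultdict


class IntegrationEngine:
    """Hypothetical event dispatcher: maps event names to handler functions."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_name, handler):
        self._handlers[event_name].append(handler)

    def emit(self, event_name, payload):
        # Run every handler registered for this event and collect results.
        return [handler(payload) for handler in self._handlers[event_name]]


cloud_copy = {}  # stands in for the remote cloud-based system


def sync_customer(changed):
    """Apply only the changed fields to the cloud copy of the customer."""
    customer = cloud_copy.setdefault(changed["id"], {})
    customer.update(changed["fields"])
    return changed["id"]


engine = IntegrationEngine()
engine.on("customer.updated", sync_customer)
engine.emit("customer.updated", {"id": 42, "fields": {"city": "Boston"}})
print(cloud_copy)
```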

Choosing the right integration solution

Today, you have a choice of where your integration technology resides. It can live in a cloud, be bolted into a rack in your data center, or be installed on a server in your data center like conventional software. There are good and bad points to each.

Using a SaaS integration service to integrate a SaaS application can be a highly effective, low-effort option. In this approach -- offered by Boomi, Informatica, Pervasive Software, and others -- the idea is to supply a multitenanted integration engine that is shared by many customers but behaves as if it were local. You get all the advantages of using a cloud-based service, such as no hardware or software footprint and pay-as-you-go pricing. Prices start at about $1,000 per month and go up to roughly $5,000 per month, depending upon the number of connections and the amount of data transferred.

But cloud-delivered integration has its downsides. First of all, SaaS integration has the same problem as SaaS in general: Its availability is in the hands of the provider. With integration, an outage may bring down multiple applications, and latency and performance issues may be beyond your control. You also need some way of dealing with interfaces that are not Port 80 compliant and, thus, can't transfer data outside the firewall. (Many on-demand integration providers still require a small piece of software that runs locally to deal with the Port 80 issue.) Additionally, while the pay-as-you-go pricing seems attractive, you may discover that purchasing integration software outright is actually more cost effective over the long haul.

The appliance approach to integration was brought to us by Cast Iron Systems as a way of providing its clients with preconfigured hardware and software solutions. Some configuration and/or programming may be required to meet your exact needs, but these integration-in-a-box solutions do arrive with the ability to connect to popular SaaS providers, along with any number of local enterprise applications. Indeed, integration appliance vendors target the SaaS-to-enterprise integration space.

The advantage of an appliance-based approach is the ease of installation and configuration, as well as the price point, which begins at about $50,000 per appliance (not including yearly maintenance). The main drawback is that many appliances provide less robust integration than their software counterparts.
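
To put the price points side by side, here is a rough break-even calculation using only the figures quoted in this article: cloud-delivered integration at about $1,000 to $5,000 per month, and an appliance starting at about $50,000. It deliberately ignores the appliance's yearly maintenance and any staffing or hosting costs, so treat the result as a back-of-the-envelope estimate rather than a cost model. The function name is hypothetical.

```python
import math


def breakeven_months(upfront_cost, monthly_fee):
    """Whole months of subscription fees needed to match an upfront cost."""
    return math.ceil(upfront_cost / monthly_fee)


APPLIANCE_PRICE = 50_000      # article's starting price, maintenance excluded
for fee in (1_000, 5_000):    # article's low and high monthly figures
    print(fee, breakeven_months(APPLIANCE_PRICE, fee))
```

Under these assumptions, the subscription matches the appliance's purchase price after 10 months at the high end and 50 months at the low end, which is why the volume of connections and data matters so much to the decision.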

If you want more, you'll need to turn to good old EAI (enterprise application integration) software, which delivers a broad set of general-purpose integration capabilities, including SaaS-to-enterprise integration. IBM, Informatica, Oracle, Software AG, and others offer integration software products with adapters that support hundreds of enterprise systems.

The biggest disadvantage of EAI software is the cost. You must maintain hardware and software in your data center, on top of paying as much as half a million dollars for each license, with a yearly fee for program maintenance.

The great advantage of EAI software is maturity. This is typically fifth- or sixth-generation technology, well-tested and feature-rich. It can provide core integration services for internal systems as well as connectivity to the cloud. If you're already running a large enterprise IT operation, very likely EAI is already in place, so you're talking at most an incremental increase in licensing cost.

Complex data integration, with many sources and many targets, pretty much demands an EAI solution. But as cloud- and appliance-based solutions continue to improve, they will emerge as viable options as well.

Building a bridge to the cloud

The good news is that we've been working on the SaaS-to-enterprise integration problem for almost 10 years now. We know what works and what does not. Moreover, many single-purpose solutions that focus on cloud-to-enterprise integration, such as appliances and on-demand integration technology, have emerged to solve this problem at a relatively low cost.

But in approaching integration, you still need to think hard about your current business needs and what you'll require in the future. In fact, the richness of the API set may well be a key factor in determining which SaaS application you choose in the first place. Smart integration means greater business efficiency. If you ignore integration until data coherency becomes a problem, you'll spend time on workarounds or replacement solutions rather than reaping the benefits of greater business efficiency.

This article is titled "How to integrate with the cloud." For more, see David Linthicum's Cloud Computing blog.

Copyright © 2011 IDG Communications, Inc.
