Review: 4 Java clouds face off
CloudBees, Google App Engine, Red Hat OpenShift, and VMware Cloud Foundry reveal the pleasures and perils of coding on a public cloud platform
Pricing may be the most difficult issue for both buyers and sellers for years to come. People are already cheesed off at the way Google stopped subsidizing its App Engine. Some users are complaining that their costs doubled or tripled with, irony of ironies, one click of a button. But who can blame Google? While the company has excellent financial engineers, I'm not sure they can know the fair price for a round trip to the Bigtable data store. It probably fluctuates with the rainfall in the Northwest, where hydropower is the cheapest power source for some of Google's newest data centers.
Perhaps I'm overthinking it. Things can go wrong anywhere. Prices will fluctuate. The cloud can be more flexible and automated, thus saving us money on people who will minister to the racks and make sure the data is flowing smoothly. If that Web 3.0 application turns out to be a big hit but the cloud is too expensive, it will still bring in enough revenue to pay for all of the reprogramming required to move the app to a set of in-house servers. If it's one of those Web things where the revenues never scale with the costs, well, the price of experimentation couldn't be lower. That's what clouds are ultimately about: They simplify experimentation and change.
Just choosing a cloud can involve plenty of experimentation. The simplest option is to spin up a raw machine from the Amazon or Rackspace cloud, but these don't offer much of what the cloud marketeers promise. Sure, I pushed the button and started up a new machine in just a few seconds, but then I spent more than a few hours logged in as root installing the JVM and the rest of the stack. Once I finally got a machine configuration I liked, I was so proud of it that I wanted to put a picture of it on the fridge. I made sure to store it away so I could start it up as many times as I liked.
If you've got the time and the inclination to build up a machine image with the software you like, raw cloud machines can offer you most of what you want from the cloud with few problems of lock-in. Both Amazon and Rackspace make it easy to store an image and hit the replication button again and again. You choose the software and you decide how many machines you want. In theory, there are more machines there whenever you need them. I experimented with spinning up new machines for the daily housekeeping work, and it was nice to spend only 1.5 cents per hour for them. After the work is done, they're gone.
Of course, you've got to do all of the thinking yourself. Do you want 100 machines or 102? Yes, you control your costs, but you won't have time to react when demand shifts unless you build more intelligence on top of the raw machines.
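To make the "100 machines or 102" question concrete, here is a minimal sketch in Java of the kind of sizing rule you would have to write yourself. Everything in it is an assumption for illustration: the class name FleetSizer, the per-machine capacity, and the headroom factor are hypothetical, and neither Amazon nor Rackspace supplies this logic. The only figure borrowed from above is the 1.5 cents per machine-hour.

```java
// A hypothetical sizing rule for a fleet of raw cloud machines.
// None of these names come from a real cloud API.
class FleetSizer {
    // Assumed number of requests per second one machine can serve.
    private final double capacityPerMachine;
    // Assumed safety margin: 0.25 means 25 percent spare capacity.
    private final double headroom;

    FleetSizer(double capacityPerMachine, double headroom) {
        if (capacityPerMachine <= 0 || headroom < 0) {
            throw new IllegalArgumentException("capacity must be > 0 and headroom >= 0");
        }
        this.capacityPerMachine = capacityPerMachine;
        this.headroom = headroom;
    }

    // Machines needed to serve the given load with the margin applied.
    // Always keeps at least one machine running.
    int machinesNeeded(double requestsPerSecond) {
        double padded = requestsPerSecond * (1.0 + headroom);
        return Math.max(1, (int) Math.ceil(padded / capacityPerMachine));
    }

    // Hourly bill in dollars for a fleet at a flat per-machine rate in cents.
    static double hourlyCostDollars(int machines, double centsPerMachineHour) {
        return machines * centsPerMachineHour / 100.0;
    }
}
```

With an assumed capacity of 50 requests per second per machine and 25 percent headroom, 4,000 requests per second calls for 100 machines and 4,080 calls for 102. The hard part, of course, is not the arithmetic but measuring the load and acting on the answer quickly enough.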
Java clouds: Google App Engine
There's something warm and comfortable about using Google's App Engine. What began as a fairly radical tool has slowly matured into an asset that's easier to understand and use, if only because the world has adopted many of the ideas.
The basic architectural themes have remained the same. You upload a small kernel of code with your business logic, and App Engine deploys enough instances to satisfy the demand. If you want to store data or synchronize your work between sessions, you have to use Google's proprietary data stores and caching, but everything else feels fairly standard. The first versions of App Engine used Python, but now you can push up Java WAR files filled with JSPs, servlets, and server-side logic. The administration is handled through a separate Web interface. Command-line issues are pretty much relegated to the past.
I think the biggest challenges for programmers will be adjusting to Google's nonrelational data stores. When App Engine first appeared, there weren't so many NoSQL projects around, and the idea of storing collections of name-value pairs was more of a novelty. Anyone approaching App Engine with a bit of experience with NoSQL won't be shocked at all by the simple solution that App Engine forces upon everyone who wants to keep data around. But anyone who still thinks of JOINs and normalized data will need to break from the table-oriented, relational past and adjust to a new way of doing things.
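For readers coming from the relational side, a rough sketch in plain Java may help show what the shift looks like. This is not App Engine's API; the class NameValueSketch and its property names are hypothetical, and the "store" is just an in-memory list. The point is the shape of the data: each order carries its own copy of the customer's name, so reading it back takes a single equality filter instead of a JOIN against a customer table.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a store of name-value entities.
class NameValueSketch {
    // Each entity is simply a bag of name-value pairs.
    private final List<Map<String, Object>> orders = new ArrayList<>();

    // Store an order with the customer's name copied into it.
    // A relational design would keep the name in a separate table.
    void putOrder(String orderId, String customerId, String customerName, long totalCents) {
        Map<String, Object> entity = new LinkedHashMap<>();
        entity.put("orderId", orderId);
        entity.put("customerId", customerId);
        entity.put("customerName", customerName); // duplicated on purpose
        entity.put("totalCents", totalCents);
        orders.add(entity);
    }

    // Look up orders by an equality filter on one property. No JOIN needed,
    // because everything the caller wants is already inside each entity.
    List<Map<String, Object>> ordersFor(String customerId) {
        List<Map<String, Object>> matches = new ArrayList<>();
        for (Map<String, Object> entity : orders) {
            if (customerId.equals(entity.get("customerId"))) {
                matches.add(entity);
            }
        }
        return matches;
    }
}
```

The price of this convenience is the duplication: if a customer changes her name, every one of her orders has to be rewritten, which is exactly the chore normalization was invented to avoid.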
App Engine offers two classes of data store, so the architect must decide whether to pay for additional power. The basic model makes one data center the master and all the others slaves. If the master data center fails or begins scheduled maintenance, your data can't be stored. You must be ready to live with a "planned read-only period." Many modern Web applications (think Facebook) can easily survive these kinds of glitches, but applications requiring banklike levels of availability and consistency will need to look elsewhere.
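One way an application might live with a read-only period is to hold writes aside and replay them once the store accepts data again. The sketch below is only an illustration of that idea in plain Java. It does not use App Engine's real API: FakeStore, StoreReadOnlyException, and the readOnly flag are all hypothetical stand-ins for whatever the platform actually reports.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical write buffer for riding out a read-only period.
class ReadOnlyPeriodSketch {
    // Stand-in for the error a store raises while it refuses writes.
    static class StoreReadOnlyException extends RuntimeException {}

    // Stand-in for a data store that can be switched to read-only.
    static class FakeStore {
        final Map<String, String> data = new HashMap<>();
        boolean readOnly = false;

        void put(String key, String value) {
            if (readOnly) throw new StoreReadOnlyException();
            data.put(key, value);
        }
    }

    private final FakeStore store;
    private final Deque<String[]> pending = new ArrayDeque<>();

    ReadOnlyPeriodSketch(FakeStore store) {
        this.store = store;
    }

    // Returns true if the write landed now, false if it was deferred.
    // Older deferred writes are replayed first so ordering is preserved.
    boolean save(String key, String value) {
        flush();
        if (pending.isEmpty()) {
            try {
                store.put(key, value);
                return true;
            } catch (StoreReadOnlyException e) {
                // fall through and queue the write
            }
        }
        pending.addLast(new String[] {key, value});
        return false;
    }

    // Replays deferred writes in order, stopping at the first refusal.
    // Returns how many writes were applied.
    int flush() {
        int applied = 0;
        while (!pending.isEmpty()) {
            String[] kv = pending.peekFirst();
            try {
                store.put(kv[0], kv[1]);
            } catch (StoreReadOnlyException e) {
                break;
            }
            pending.removeFirst();
            applied++;
        }
        return applied;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

A queue held in memory like this one disappears if the instance is shut down, so it only suits data you can afford to lose. That is the Facebook end of the spectrum, not the bank end.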