First look: Google's high-flying cloud for Python code

Google App Engine simplifies the problem of deploying and scaling Web applications, but not without a few wrinkles and question marks

One of the joys of being a Web programmer is heading to a dinner party, a haircut, or a reunion and fielding pitches for everyone's dream of a brilliant Web application. Everyone is always happy to cut you in for 5, 10, maybe even 15 percent of the equity if you just build out the Web site that's sort of like a combination of Twitter, AltaVista, Eliza, TurboTax, and the corner pharmacy, but cooler.

Google App Engine is meant for dreams like these. You write a bit of code in Python, customize some HTML, and bingo, you've got your database-backed dynamic Web site up and running in a few short minutes. The magic comes when the world starts flocking to your Web application, and Google's cloud of computers quickly adapts to the load, handling everything the public demands. There's no need for you to buy servers, load balancers, or special DNS tables. Google's application cloud handles all of the grungy deployment headaches.
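
To give a feel for how little code a dynamic page needs, here is a minimal sketch that uses only the WSGI helpers in Python's standard library. It is not App Engine's own framework, and the names application, render_page, and simulate_request are my own inventions for illustration. It assumes a single page that greets the visitor by a name passed in the query string.

```python
# A minimal sketch built on the standard library's WSGI support.
# This is NOT App Engine's own framework; every name here is hypothetical.
from html import escape
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults

PAGE = "<html><body><h1>Hello, {name}!</h1></body></html>"


def render_page(name):
    """Fill the HTML template, escaping anything the visitor typed."""
    return PAGE.format(name=escape(name))


def application(environ, start_response):
    """Handle one request: read ?name=... and return an HTML page."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    body = render_page(name).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def simulate_request(query_string=""):
    """Call the application the way a WSGI server would, without a network."""
    environ = {}
    setup_testing_defaults(environ)
    environ["QUERY_STRING"] = query_string
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], body.decode("utf-8")


status, page = simulate_request("name=Ada")
print(status)
print(page)
```

The same application function could be handed to any WSGI server for desktop testing; the hosting platform's job is to run many copies of it behind the scenes.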


I played around with the App Engine SDK and, sure enough, developed and deployed applications on my desktop with just a few minutes of work. I didn't upload them to the cloud because I didn't make it into the beta program, but I was able to simulate the experience on my office server. The billions of hits haven't shown up yet, but it has only been a few hours now. It works and it is quite simple.

Google me this
A trickier question is whether this is what a future Web application really needs. There is little doubt that App Engine makes it simple to get incoming data, make some decisions, store it in a database, and then move on. The more complicated questions are often political, technical, and almost aesthetic. There will be a number of programmers who look at App Engine and melt with excitement, and there will be many who tilt their head like a dog that can't understand his master.

Being a Python lover certainly helps, but it isn't necessary because the language isn't that much different from the other scripting languages. A good programmer should be able to shift gears quickly and easily. There are rumors that Google has a number of other languages waiting around the corner, but there are equally good arguments that this may not be happening as soon as some devotees would like.

Java programmers, in particular, are used to being known for building the most scalable and flexible applications because the language and the API are some of the most sophisticated ensembles around. The J2EE standard nurtured tools that simplified some of these problems, even though it never really turned out to be as simple as the sales literature promised. Today, Java's sophistication is probably hurting the language as much as helping it. A quick survey of Web hosting services shows that shared hosting for JSP applications begins around $10 a month, while Python shared services can cost as little as $2 a month. The JVM may speed things up and provide better service, but it comes with a hefty memory footprint. If the brutally competitive Web hosting business can support five Python sites for every Java site, then perhaps Google is more interested in the long tail, the niche Web sites, than the big iron.

There are other advantages that probably encouraged Google's choice of Python. The most popular implementations are open source, and the language's creator, Guido van Rossum, works there. This must have made it much simpler for the company to create the slightly crippled version of Python that runs on the app server. This sandbox forbids some potentially dangerous operations such as writing to the file system, a restriction that could pretty much prevent building Flickr-like upload services unless you feel like storing these big blocks of data in the database. Your code isn't allowed to spawn subthreads, and it had better be efficient because it looks like App Engine will kill any thread that takes too long. This is probably necessary given the endless loops that will be created by newbies, but it pretty much means that App Engine is really just for front ends to databases that don't do much independent thinking or computation.

Smells like SQL
It's probably best to think of the system as a thin layer of business logic in front of a simple database, the kind that DBAs like to call a "data store" to emphasize the point that you can't do most of the complicated things that Oracle allows. The database is nicely integrated with Python, but it only offers the kind of basic search and store functions that developers will need to squirrel away their users' information. You set up the data objects in Python, hit the save method, and the data disappears into the cloud where all of the instances of the application can find it. The query language is pretty close to SQL, but it comes with a slightly different syntax, which means that you won't be able to use any of the millions of tools that sort of speak SQL to generate reports or produce graphs. Furthermore, the data store API doesn't include old-fashioned joins, an omission that will break some of the code written for traditional databases. The simplicity is nice, but there's a reason why everyone ends up using standard databases for the core of their projects.

So there is a certain amount of lock-in hiding in the API. Porting your application to something like MySQL won't be automatic but I doubt it would be hard at all. Going in the other direction, though, could be both healthy and annoying. Because there's no way to join tables, you're effectively forced to denormalize your tables. Most Web developers end up doing this eventually to help things scale, so I guess it makes sense to start out that way even if it seems a bit messy.
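
To make the denormalization trade-off concrete, here is a rough sketch in plain Python, with ordinary dictionaries standing in for tables. Nothing here is the data store API; the authors and posts data and every function name are hypothetical. The first function does what a join would do, and the second copies the author's name into each post so that a read never needs a second lookup.

```python
# Hypothetical data: two "tables" that a relational design would join.
authors = {1: {"name": "Ada"}, 2: {"name": "Grace"}}
posts = [
    {"id": 10, "author_id": 1, "title": "First post"},
    {"id": 11, "author_id": 2, "title": "Second post"},
]


def titles_with_authors_joined(posts, authors):
    """Relational style: look up the author for every post (a join)."""
    return [(p["title"], authors[p["author_id"]]["name"]) for p in posts]


def denormalize(posts, authors):
    """Copy the author's name into each post so reads need no join."""
    return [dict(p, author_name=authors[p["author_id"]]["name"]) for p in posts]


def titles_with_authors_flat(flat_posts):
    """Join-free style: everything needed is already on the record."""
    return [(p["title"], p["author_name"]) for p in flat_posts]


flat_posts = denormalize(posts, authors)
print(titles_with_authors_joined(posts, authors))
print(titles_with_authors_flat(flat_posts))
```

The messy part shows up on writes: if an author changes her name, every copied author_name has to be updated as well, which is the price paid for cheap reads.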

There are omissions. The documentation mentions Web services and AJAX (Asynchronous JavaScript and XML), but there's very little support for them out of the box. Perhaps there will be some grand catalog encouraging mashups in the future. It would also be nice to offer some basic templates for data structures so that most of the applications could begin with the same standard formats for dates, locations, and other things. Currently you don't get much besides the simple Python application framework with a bit of MVC (model-view-controller).

Nor are there many of the tools that might be essential. The samples and the tools all run through the command line, probably the preference of the developing team. I can see that developers might want more sophisticated tools for profiling the code and tracking every click. Google suggests profiling by dumping the profile information between <pre> tags in an HTML document. Using <table> tags would probably confuse the command line jockeys.
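
For the curious, the dump-it-in-a-page approach might look something like the sketch below. It uses the standard library's cProfile and pstats modules; the handle_request function is a hypothetical stand-in for real application work, and profile_to_pre is my own name, not anything from Google's documentation.

```python
import cProfile
import html
import io
import pstats


def handle_request():
    """Hypothetical stand-in for real request-handling work."""
    return sum(i * i for i in range(10000))


def profile_to_pre(func, limit=10):
    """Run func under cProfile and return (result, stats wrapped in <pre>)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)

    # Capture the text report in memory instead of printing to stdout.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(limit)

    # Escape the report so names like <genexpr> survive inside HTML.
    report = "<pre>\n" + html.escape(stream.getvalue()) + "</pre>"
    return result, report


result, report = profile_to_pre(handle_request)
print(report)
```

The report is plain text, so it only makes sense inside a block that preserves whitespace, which is presumably why <pre> is the suggested wrapper.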

Sky's the limit
The plan is to charge when applications exceed some limits, a perfectly fair plan but one that makes me a bit nervous after years of basic pricing for servers. The terms and conditions suggest that you only get "200 million megacycles of CPU per day." You can see a snapshot of resource consumption, but this seems like an especially squirrelly metric that could be skewed in odd ways by factors beyond the developer's control. If you send a weird query to the database, it may burn cash in a way that you didn't anticipate. One of the biggest headaches for Java programmers comes when an instance of an application on one server starts asking for data that happens to be sitting on another server. Inter-server communication can slow fast boxes to a crawl, and entity locking could get scary if two users start nibbling at the same bytes at the same time.
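
Some back-of-the-envelope arithmetic shows why the metric makes me uneasy. The sketch below takes the quoted figure of 200 million megacycles per day at face value; the per-request costs and the 1 GHz clock speed are assumptions of mine, not numbers from Google, and the function names are hypothetical.

```python
# Quota figure quoted from the terms and conditions.
DAILY_QUOTA_MEGACYCLES = 200_000_000


def requests_per_day(megacycles_per_request):
    """How many requests fit in the daily quota at a given cost each."""
    return DAILY_QUOTA_MEGACYCLES // megacycles_per_request


def cpu_hours(clock_ghz):
    """Convert the daily quota to hours on one CPU at an assumed clock speed."""
    cycles = DAILY_QUOTA_MEGACYCLES * 1_000_000
    seconds = cycles / (clock_ghz * 1e9)
    return seconds / 3600


# Assumed costs: a cheap page versus a page that triggers a weird query.
print(requests_per_day(100))    # cheap request, assumed 100 megacycles
print(requests_per_day(5000))   # expensive request, assumed 5,000 megacycles
print(round(cpu_hours(1.0), 1)) # hours of one assumed 1 GHz processor
```

Under these assumptions the quota covers two million cheap requests a day but only 40,000 expensive ones, so a single badly behaved query pattern could shrink the free allowance by a factor of 50.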

I know I should be happy that App Engine will bring up new servers when the demand arrives, but all I can think about is watching the meter spin when an errant query starts chasing down data on other servers. Getting wildly popular may turn out to be more of a nightmare than a dream because Google will dutifully roll out more versions of your applications, burn more megacycles, and put it on your tab. I'm sure Google will come up with ways of limiting the size of the bill, but all I can think of is firing up a slick Web site and repeating Woody Allen's line from "Manhattan": "God, you're so beautiful I can hardly keep my eyes on the meter."

Google also lets you access Google accounts, the creepy feature that links your search history with your Gmail account. The users of your application don't need to set up a separate log-in or a separate account. You can get a user object with all of this information when they show up, if they've recently been logged in to read their Gmail. If you don't want to use this feature, you could always spin up your own user accounts with the database, of course.

Some of the FUD spread by rival camps suggests that Google just wants to use App Engine as a way to nurture Python developers so that the company can hire them away. Others see it as a cynical way to gain control and lock people into their Google accounts. Others think this is just a technique for Google to build a big plantation with you, the Python developer, toiling away to monetize its app cloud and add more value to the Google account.

Tenant's rights
That's all just a bit too cynical. While the terms and conditions include a number of scary phrases giving Google the power to do pretty much anything with your baby, they seem like rational responses to the scary prospect of letting anyone put applications on your cloud. Copyright violations, spammers, and pornographers must keep the lawyers at Google up late at night. The lock-in is a real problem, but it is mitigated a bit by some of the open source licenses. Python and Django are pretty much free if you want to take your application and run with it. The hurdles and caveats are annoying, but the App Engine formula seems like a serious play for the low end of the marketplace where small developers create niche applications.

The service is best for simple applications that plan on staying simple for the time being. While the cloud's ability to scale the application quickly is a nice feature, the limitations of the service will be constraining for anyone who has big dreams built on complex code. The sandbox offers only limited services, and the legal issues are still new. While the Google lawyers did a pretty good job of anticipating many of the potential potholes for the service, that doesn't mean the potholes will go away. Google reserves the right to "pre-screen, review, flag, filter, modify, refuse or remove any or all Content from the Service." Will Google be a good hosting provider and treat the small fry like a partner, or will it just nuke entire applications when a DMCA notice shows up? Time will tell.

It's worth thinking a bit about the long-term plan when your hairdresser chats away about a brilliant Web application while cutting your hair. Some whispers I've heard suggest that Google might just steal your application, perhaps copying it. I'm not sure why hosting it with Google would make it any easier for them, but maybe forcing you to map it onto their architecture might help a bit.

There are any number of competitors. Amazon has its own cloud, but it takes a very different approach, giving the user an empty Linux shell. That may offer plenty of freedom, but it offers none of the handholding. It will probably take you longer to install a JVM on Amazon's Elastic Compute Cloud than to spin up a three-page Web site with Google's App Engine. But Amazon's SimpleDB also offers a richer API, including real Web services for REST and SOAP queries.

The biggest competitors may be the old-school Web hosting programs that let you share a server for a few bucks a month. They may not scale automatically, but they give you plenty of control and an older, more established type of user agreement. And while they may not be as magic as Amazon, they have a number of tools for migrating customers to bigger boxes. The last time I asked my shared hosting service to move to a new server with a different version of MySQL, it was done in an hour or two. That's not automatic, but it only took an e-mail message.

Work in progress
It is almost unfair to review the Google App Engine when it is just a beta operation, but Google has a habit of leaving some tools in beta form for a long time. There are a number of places where the documentation and the code suggest that Google will add more functionality pretty soon. The basic framework and the database are both quite nice, although limited. I can imagine Google adding better automatic features for generating the CRUD (Create, Read, Update, Delete) routines common in these applications. Integration with Google's Wallet might also be quite useful, although it's bound to be complicated by the banking system. Some people have already experimented with mapping the Google Web Toolkit to the system, even though that's written in Java and translated into JavaScript.

Google might also provide some good tools that allow the different hosted applications to share user information, essentially allowing a user to move their preferences and some of their data to other applications. This kind of inter-application linking could be pretty cool.

Time will tell what Google delivers. In the meantime, this is a good sandbox for playing with simple database applications. There's a very good reason why the beta version has a waiting list.
