The problem with letting go, however, is that it depends on the performance characteristics of opening socket connections. The TCP stack is set up to guard against orphaned packets from a previous connection interrupting a new connection; this is part of the reliability guarantee that TCP layers on top of IP. TCP does this by making you wait before reusing the same socket pair (the same combination of addresses and ports). Thus, the number of TCP socket connections you can open per second is limited. One way of escaping this limit is to reuse connections across multiple request cycles -- a fundamentally sound idea that most PHP applications (due to the PHP concurrency model) simply cannot take advantage of.
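As a rough illustration of that limit, the sketch below estimates how many new connections per second one client can open to a single server address and port when every closed connection keeps its local port unavailable for a fixed waiting period. The port range and the 60-second wait are assumptions (common Linux defaults), not figures from this article, and the function name is made up for the example.

```python
def max_new_connections_per_second(first_port, last_port, wait_seconds):
    """Upper bound on new connections per second from one client IP
    to one server IP and port.

    Each closed connection keeps its local port unavailable for
    `wait_seconds`, so at most one connection per usable port can be
    opened during each waiting period.
    """
    usable_ports = last_port - first_port + 1
    return usable_ports / wait_seconds


# Assumed values: a common Linux ephemeral port range and a 60-second wait.
rate = max_new_connections_per_second(32768, 60999, 60)
print(f"about {rate:.0f} new connections per second")
```

The point is the shape of the calculation rather than the exact number: the ceiling grows only with more ports or more address pairs, which is why reusing connections matters.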
If you examine the active connections on your Web server or database server when running a PHP application (on Unix/Linux servers, type netstat -na), you'll see a large number of connections to or from the database in TIME_WAIT state. Were you instead running your application on a runtime environment that allowed pooled connections, you would see a fixed number (the size of the database connection pool) in ESTABLISHED state. The bottom line: PHP applications are a load on the database due to the constraints of the concurrency model.
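To make the pooled alternative concrete, here is a minimal sketch of a fixed-size connection pool. It assumes a long-lived process that can hold connections between requests; `FakeConnection` and `ConnectionPool` are hypothetical stand-ins, not a real database driver.

```python
import queue


class FakeConnection:
    """Stand-in for a real database connection (hypothetical)."""

    opened = 0  # counts how many connections were ever opened

    def __init__(self):
        FakeConnection.opened += 1

    def query(self, sql):
        return f"result of {sql}"


class ConnectionPool:
    """Opens a fixed number of connections once and hands them out for reuse."""

    def __init__(self, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(FakeConnection())

    def run(self, sql):
        conn = self._idle.get()  # blocks if every connection is in use
        try:
            return conn.query(sql)
        finally:
            self._idle.put(conn)  # return the connection instead of closing it


pool = ConnectionPool(size=5)
results = [pool.run(f"SELECT {i}") for i in range(1000)]
print(FakeConnection.opened)  # connections opened for 1,000 queries
```

However many queries run, the number of open connections stays at the pool size, which is the fixed count you would see in netstat.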
Why is PHP this way? Linux did not originally support threads; it supported only subprocesses. Windows NT-derived operating systems always supported threads (though heavier ones than modern Linux native threads) and thus would outscale Linux by a large margin. Unfortunately, no one believed the Microsoft-funded studies that proved it.
To scale PHP on a relational database, you need to shard your data. This means splitting the data by some reasonable key. This might mean East Coast customers go on one RDBMS, Midwest customers go on another, and the West Coast on a third. This is a lot of complexity to swallow when you chose PHP because it was "simple" and "free."
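A minimal sketch of that kind of key-based routing might look like the following. The region names and host names are hypothetical placeholders, and a real application would also need a way to move customers between shards.

```python
# Hypothetical mapping from a customer's region to the database holding their data.
SHARDS = {
    "east": "db-east.example.internal",
    "midwest": "db-midwest.example.internal",
    "west": "db-west.example.internal",
}


def shard_for(region):
    """Return the database host for a customer's region."""
    try:
        return SHARDS[region]
    except KeyError:
        raise ValueError(f"no shard configured for region {region!r}") from None


print(shard_for("midwest"))
```

Every query path in the application now has to know the customer's region before it can pick a database, which is where the extra complexity comes from.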
The cloud and NoSQL are game changers
In the cloud, if we can trade a conventional RDBMS for a database that autoshards and can balance connections to each node, PHP can scale pretty well. Rather than have a series of unpooled connections to one or two machines, you can spread them among several database servers.
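One simple way to picture automatic sharding is to hash each key and use the result to pick a node, so that keys (and the connections that serve them) spread across the cluster without a hand-maintained mapping. The sketch below uses a plain modulo over a hypothetical node list; it is an illustration only, and real databases generally use more elaborate schemes so that adding a node moves fewer keys.

```python
import hashlib
from collections import Counter

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical node names


def node_for(key, nodes=NODES):
    """Pick a node from a stable hash of the key, so the same key
    always lands on the same node."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


# Count how many of 10,000 sample keys land on each node.
load = Counter(node_for(f"customer-{i}") for i in range(10_000))
print(load)
```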
More Web servers limit the impact of the lack of connection pooling on the database clients. More database nodes and sharding reduce the impact on the server nodes. I think it's clear the move to NoSQL and the cloud is a big scalability win even for existing runtimes. The economic choices that have made PHP so successful may even make it more successful in the cloud and prevent the rework to a thread-safe PHP from ever having to take place.
Together, the cloud and NoSQL greatly mitigate these issues or make them simply a deployment detail. It means we may be able to hire an offshore team of PHP coders to knock one out on a NoSQL database so long as we have a good NoSQL schema and a reasonable cloud deployment scheme.
This article, "How to make PHP apps scale," was originally published at InfoWorld.com.