Let's say that, following Hurricane Sandy, you've been asked to set up replication to a disaster recovery site. Your company has chosen to back up its core operations in Boston with space in a colocation center in Chicago -- about a thousand miles away. You've done the math and determined that you'll need a 500Mbps circuit to carry the volume of data you must replicate to meet your recovery-point SLAs.
Once your Chicago site and connectivity are lit up, you decide to test the connection. First, a ping shows a round-trip time of 25ms -- not horrible for such a long link (at least 11ms of which is simple light-lag). Next, you decide to make sure you're getting the bandwidth you're paying for. You fire up your laptop and FTP a large file to a Windows 2003 management server on the other side of the link. As soon as the transfer finishes, you know something's wrong -- your massive 500Mbps link is pushing only about 21Mbps.
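If you want to sanity-check those numbers yourself, a few lines of Python will do it. Treat this as a rough sketch rather than a measurement: it uses the round figure of 1,000 miles from the scenario, assumes light in glass fiber travels at about two-thirds of its vacuum speed, and invents a 10GB test file purely for illustration.

```python
# Back-of-the-envelope check of the figures in the scenario.
MILES_TO_KM = 1.609344
C_VACUUM_KM_S = 299_792.458   # speed of light in a vacuum
FIBER_FRACTION = 2 / 3        # assumption: light in fiber moves at roughly 2/3 of c


def round_trip_ms(one_way_miles, speed_km_s):
    """Minimum round-trip propagation delay, in milliseconds."""
    round_trip_km = 2 * one_way_miles * MILES_TO_KM
    return round_trip_km / speed_km_s * 1000


def utilization_pct(observed_mbps, provisioned_mbps):
    """Share of the provisioned circuit actually being used, in percent."""
    return observed_mbps / provisioned_mbps * 100


def transfer_seconds(size_gb, rate_mbps):
    """Seconds needed to move size_gb (decimal gigabytes) at rate_mbps."""
    return size_gb * 8000 / rate_mbps


vacuum_ms = round_trip_ms(1000, C_VACUUM_KM_S)
fiber_ms = round_trip_ms(1000, C_VACUUM_KM_S * FIBER_FRACTION)

print(f"Light-lag floor (vacuum): {vacuum_ms:.1f} ms")   # roughly 10.7 ms
print(f"Light-lag floor (fiber):  {fiber_ms:.1f} ms")    # roughly 16.1 ms
print(f"Link utilization: {utilization_pct(21, 500):.1f}%")  # 4.2%

# Hypothetical 10GB file, chosen only to make the gap concrete.
print(f"10GB at 500Mbps: {transfer_seconds(10, 500):.0f} s")
print(f"10GB at 21Mbps:  {transfer_seconds(10, 21):.0f} s")
```

The vacuum figure lines up with the 11ms floor, and since real fiber is slower and rarely runs in a straight line, a 25ms ping is perfectly reasonable. The utilization figure is the one that should bother you: the transfer is using just over 4 percent of the circuit.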
Do you know what's wrong with this picture? If not, keep reading because this problem has probably affected you before without your realizing it. If you decide to move to the cloud or implement this kind of replication, it's likely to strike again.
First, understand that the answer is related to Transmission Control Protocol (TCP), one of the two main transport protocols that most applications use to communicate over IP networks such as the Internet. (The other is User Datagram Protocol, or UDP.) What matters here is that TCP has built-in congestion and packet-loss detection capabilities, whereas UDP does not.