HTTP/2 was approved in February 2015 as the successor to the original Web communication protocol. While the specification is in the last stages of finalization, the standard has already been implemented by early adopters such as Jetty and Netty, and it will be incorporated into Servlet 4.0. Find out how HTTP/2 renovates HTTP's text-based protocol for better latency, then see techniques like server push, streaming, multiplexing, and header compression implemented in a client-server example using Jetty.
High-speed browser networking
In the early days of the Web, Internet connection bandwidth was the most important limiting factor for a faster browsing experience. That has changed in the years since, and these days many consumers use broadband technologies for Internet access. As of 2014, Akamai's State of the Internet report showed that the average connection speed for customers in the United States exceeded 11 Mbit/s.
As Internet connection speeds have increased, the importance of latency to Web application performance has become more apparent. When the Web was new, the delay in sending a request and waiting for a response was much less than the total time to download all of the response data, but today that is no longer the case. "High bandwidth equals high speed" is no longer a valid maxim, but that doesn't mean we can ignore the importance of bandwidth. For use cases that require bulk data transfer such as video streaming or large downloads, bandwidth is still a roadblock. In contrast to Web pages, these types of content use long-running connections, which stream a constant flow of data. Such use cases are bandwidth bound in general.
Bandwidth versus latency
Bandwidth is the amount of data that can be transferred per second; it determines how much data can move over a connection in a given time. You can liken bandwidth to the diameter of a water pipe: with a larger diameter, more water can be carried. For this reason bandwidth is especially important for media streaming and large downloads.
Latency is the time it takes data to travel between a source and destination. Given an empty pipe, latency would measure the time taken for water to travel through the pipe from one end to the other.
Downloading a Web page is like moving water through a bidirectional empty pipeline. Request data travels through the network connection from the end user's side to the server's side; upon receiving the request, the server sends response data back through the same bidirectional connection. The total time it takes for data to travel from one end of the connection to the other and back again is called the round-trip time (RTT).
Latency is constrained by the speed of light. For instance, the distance between Dallas and Paris is approximately 7900 km (4900 miles), and the speed of light is almost 300 km/ms. Because a round trip covers that distance twice, the theoretical minimum is 2 × 7900 km ÷ 300 km/ms ≈ 53 ms. This means you will never get a better RTT than ~50 milliseconds for a connection between Dallas and Paris without changing the laws of physics. In practice round-trip times are much higher, because the refractive index of optical fiber slows light to roughly two-thirds of its vacuum speed, and other network components add overhead. According to Akamai's network performance comparison monitor, the RTT for the public transatlantic link between Dallas and Paris in August 2014 was ~150 ms. (Please note, however, that this doesn't include the last-mile latencies.)
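The back-of-the-envelope calculation above is easy to verify. Here is a minimal Java sketch, using the same approximate figures:

public class MinimumRtt {
    public static void main(String[] args) {
        double distanceKm = 7_900;         // Dallas to Paris, one way (approximate)
        double lightSpeedKmPerMs = 300;    // speed of light in a vacuum, ~300 km/ms
        // A round trip covers the distance twice
        double minRttMs = 2 * distanceKm / lightSpeedKmPerMs;
        System.out.printf("Theoretical minimum RTT: %.1f ms%n", minRttMs);  // ~52.7 ms
    }
}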
What does latency mean for an application user? From a usability perspective, an application feels instantaneous if it responds to user input within 100 ms. Responses within one second generally won't interrupt the user's flow of thought, although the user will notice the delay. A delay longer than 10 seconds will be perceived as a non-responsive or broken service.
This means highly responsive applications should have a latency of less than one second; for instant responsiveness you should aim for a latency within 100 milliseconds. In the early days of the Internet, web-based applications were far from being highly responsive.
Latency in the HTTP protocol
HTTP 0.9
The original HTTP version 0.9, defined in 1991, did not consider latency a factor in application responsiveness. In order to perform an HTTP 0.9 request you had to open a new TCP connection, which was closed by the server after the response had been transmitted. To establish a new connection, TCP uses a three-way handshake, which requires an extra network round trip before data can be exchanged. That additional handshake round trip alone would double the minimum latency of the Dallas-Paris link in my previous example.
HTTP 0.9 is a very simple text-based protocol, as you can see below. In Listing 1, I have used telnet on the client side to query a Web page addressed by http://www.1and1.com/web-hosting. The telnet utility is a program that allows you to establish a connection to a remote server and to transfer raw network data.
Listing 1. HTTP 0.9 request-response exchange
$ telnet www.1and1.com 80
Trying 74.208.255.133...
Connected to www.1and1.com.
Escape character is '^]'.
GET /web-hosting
<html>
<head>
<title>The page is temporarily unavailable</title>
<style>
body { font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body bgcolor="white" text="black">
<table width="100%" height="100%">
<tr>
<td align="center" valign="middle">
The page you are looking for is temporarily unavailable.<br/>
Please try again later.
</td>
</tr>
</table>
</body>
</html>
Connection closed by foreign host.
An HTTP 0.9 request consists of the word GET, a space, and the document address, terminated by a CR LF (carriage return, line feed) pair. The response to the request is a message in HTML, terminated by the server closing the connection.
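To make the exchange concrete in code, here is a minimal sketch of an HTTP 0.9-style request in plain Java. The host name is a placeholder, and note that many modern servers no longer answer HTTP 0.9 requests:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;

public class Http09Client {
    public static void main(String[] args) throws IOException {
        // HTTP 0.9: a fresh TCP connection per request, closed by the server afterwards
        try (Socket socket = new Socket("www.example.com", 80);
             Writer out = new OutputStreamWriter(socket.getOutputStream(), "US-ASCII");
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream(), "US-ASCII"))) {
            // The request: the word GET, a space, the document address, CR LF
            out.write("GET /\r\n");
            out.flush();
            // The response has no headers; it ends when the server closes the connection
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}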
HTTP 1.0
Released in 1996, HTTP 1.0 expanded HTTP 0.9 with extended operations and richer meta-information. The HEAD and POST methods were added, and the concept of header fields was introduced. The HTTP 1.0 header set also included the Content-Length header field, which notes the size of the entity body. Instead of indicating the end of a message by terminating the connection, you could use the Content-Length header for that purpose. This was a beneficial update for at least two reasons: First, the receiver could distinguish a valid response from an invalid one, where the connection broke down while the entity body was streaming. Second, connections did not necessarily need to be closed.
In Listing 2 the response message includes a Content-Length field. Additionally, the request message includes a User-Agent header field, which is typically used for statistical purposes and debugging.
Listing 2. HTTP/1.0 request-response exchange
$ telnet www.google.com 80
Trying 173.194.113.20...
Connected to www.google.com.
Escape character is '^]'.
GET /index.html HTTP/1.0
User-Agent: CERN-LineMode/2.15 libwww/2.17b3
HTTP/1.0 302 Found
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Location: http://www.google.de/index.html?gfe_rd=cr&ei=X2knVYebCaaI8QfdhIDAAQ
Content-Length: 268
Date: Fri, 10 Apr 2015 06:10:39 GMT
Server: GFE/2.0
Alternate-Protocol: 80:quic,p=0.5
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.de/index.html?gfe_rd=cr&ei=X2knVYebCaaI8QfdhIDAAQ">here</A>.
</BODY></HTML>
Connection closed by foreign host.
In contrast to HTTP 0.9, an HTTP 1.0 response begins with a status line. The response header fields allow the server to pass additional information about the response, and the entity body is separated from the headers by an empty line.
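To illustrate that structure, here is a rough sketch of how a client might split such a response into a status line, header fields, and entity body, using the Content-Length header to find the end of the body. This is a deliberate simplification; real clients must handle chunked transfers, malformed headers, and much more:

import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ResponseParser {
    // Parses a simplified HTTP/1.0 response from the given stream
    public static void parse(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(new BufferedInputStream(raw));
        String statusLine = readLine(in);              // e.g. "HTTP/1.0 302 Found"
        int contentLength = 0;
        String header;
        while (!(header = readLine(in)).isEmpty()) {   // an empty line ends the headers
            if (header.toLowerCase().startsWith("content-length:")) {
                contentLength = Integer.parseInt(header.substring(15).trim());
            }
        }
        byte[] body = new byte[contentLength];         // read exactly Content-Length bytes
        in.readFully(body);
        System.out.println(statusLine);
        System.out.println(new String(body, "UTF-8"));
    }

    // Reads one CR-LF terminated line, dropping the line terminator
    private static String readLine(DataInputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') sb.append((char) b);
        }
        return sb.toString();
    }
}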
Even though the functionality became much more powerful with HTTP 1.0, it didn't do much to improve latency. HTTP 1.0 still required a new TCP connection for each request, so every request carried the cost of setting up a new TCP connection.
HTTP/1.1
With HTTP/1.1 persistent connections became the default, removing the need to initiate a new TCP connection for each request. The HTTP connection in Listing 3 remains open after receiving a response and can be re-used for the next request. (The last line "Connection closed by foreign host" is missing.)
Listing 3. HTTP/1.1 request-response exchange
$ telnet www.google.com 80
Trying 173.194.112.179...
Connected to www.google.com.
Escape character is '^]'.
GET /index.html HTTP/1.1
User-Agent: CERN-LineMode/2.15 libwww/2.17b3
Host: www.google.com:80
HTTP/1.1 302 Found
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Location: http://www.google.de/index.html?gfe_rd=cr&ei=hW4nVYy_D8OH8QeKloG4Bg
Content-Length: 268
Date: Fri, 10 Apr 2015 06:32:37 GMT
Server: GFE/2.0
Alternate-Protocol: 80:quic,p=0.5
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.de/index.html?gfe_rd=cr&ei=hW4nVYy_D8OH8QeKloG4Bg">here</A>.
</BODY></HTML>
Making persistent connections the norm in HTTP/1.1 does much to improve latency. Re-using persistent connections to the same server makes succeeding request-response exchanges much cheaper. Re-using open connections also removes the overhead of the TCP handshake. The HTTP/1.1 protocol enables Web application developers to call the same server multiple times within a single Web session, especially for Web pages featuring linked resources such as images.
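In Java, HTTP client libraries typically handle this re-use transparently. As a minimal sketch, assuming Jetty 9's HttpClient is on the classpath (the URLs are placeholders), two requests to the same server can travel over one persistent connection:

import org.eclipse.jetty.client.HttpClient;
import org.eclipse.jetty.client.api.ContentResponse;

public class PersistentConnectionDemo {
    public static void main(String[] args) throws Exception {
        HttpClient client = new HttpClient();
        client.start();
        // Both requests target the same server; the client keeps the underlying
        // HTTP/1.1 connection alive and re-uses it where possible, avoiding a
        // second TCP handshake
        ContentResponse first = client.GET("http://www.example.com/");
        ContentResponse second = client.GET("http://www.example.com/index.html");
        System.out.println(first.getStatus() + " / " + second.getStatus());
        client.stop();
    }
}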
Challenges in HTTP/1.1
Upon receiving a Web page, the Web browser starts to load the embedded page elements. Typically, the browser will load linked resources in parallel to reduce the total latency of page loading. The browser has to use multiple connections in parallel because a connection cannot be re-used before a response is received. In order to improve the total web-page loading time the browser must use quite a few connections in parallel.
Using parallel persistent connections is not enough to improve latency, however, because connections are not free. A dedicated connection consumes significant resources on both the client and server side. Each open connection can consume a dedicated thread or process on the server side, depending on the HTTP server in use. For this reason popular browsers do not allow more than eight connections to the same domain.
HTTP/1.1 attempts to resolve this issue via HTTP pipelining, which specifies that the next request can be sent before the response has been received. This is not a perfect solution, however. Because the server must send responses in the same order that requests are received, a large or slow response can block all the responses queued behind it, a problem known as head-of-line blocking.
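The following sketch illustrates the idea of pipelining over a raw socket: both requests are written before any response is read, and the responses must come back in request order. The host and paths are placeholders, and since many servers and proxies handle pipelining poorly, this is illustrative only:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;

public class PipeliningSketch {
    public static void main(String[] args) throws IOException {
        try (Socket socket = new Socket("www.example.com", 80)) {
            Writer out = new OutputStreamWriter(socket.getOutputStream(), "US-ASCII");
            // Send both requests up front, without waiting for the first response;
            // "Connection: close" on the last one tells the server to close afterwards
            out.write("GET /a.html HTTP/1.1\r\nHost: www.example.com\r\n\r\n");
            out.write("GET /b.html HTTP/1.1\r\nHost: www.example.com\r\n"
                    + "Connection: close\r\n\r\n");
            out.flush();
            // The server must answer in request order: if /a.html is slow to
            // produce, the response for /b.html is stuck in the queue behind it
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), "US-ASCII"));
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}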
Introducing HTTP/2
HTTP/2 addresses latency issues by providing an optimized transport mechanism for HTTP semantics. A major goal of HTTP/2 was to maintain high-level compatibility with HTTP/1.1. Most of HTTP/1.1's high-level syntax -- such as methods, status codes, and header fields -- is unchanged, and HTTP/2 does not obsolete HTTP/1.1's message syntax. Because the two protocols share the same URI schemes and default port numbers, HTTP/1.1 and HTTP/2 traffic can be served over the same default port.
The raw network protocol for HTTP/2 is completely different from the protocol for HTTP 1.1. HTTP/2 is not a text-based protocol; instead, it defines a binary, multiplexed network protocol. Telnet-based debugging will not work for HTTP/2. Instead you could use the popular command-line tool curl or another HTTP/2-compatible client.
The basic protocol unit of HTTP/2 is a frame. In HTTP/2, frames are exchanged over a TCP connection instead of text-based messages. Before being transmitted, an HTTP message is split into individual HTTP/2 frames. HTTP/2 provides different types of frames for different purposes, such as HEADERS, DATA, SETTINGS, or GOAWAY frames.
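Every frame begins with a fixed nine-octet header: a 24-bit payload length, an 8-bit frame type, 8 bits of flags, and a 31-bit stream identifier preceded by a reserved bit. A minimal sketch of decoding that header in Java:

import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class FrameHeaderDecoder {
    // Decodes the fixed 9-octet header that starts every HTTP/2 frame
    public static void decode(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        byte[] h = new byte[9];
        in.readFully(h);
        int length = ((h[0] & 0xFF) << 16)    // 24-bit payload length
                   | ((h[1] & 0xFF) << 8)
                   |  (h[2] & 0xFF);
        int type  = h[3] & 0xFF;              // 0x0 = DATA, 0x1 = HEADERS,
                                              // 0x4 = SETTINGS, 0x7 = GOAWAY, ...
        int flags = h[4] & 0xFF;
        int streamId = ((h[5] & 0x7F) << 24)  // the high bit is reserved
                     | ((h[6] & 0xFF) << 16)
                     | ((h[7] & 0xFF) << 8)
                     |  (h[8] & 0xFF);
        System.out.printf("length=%d type=0x%x flags=0x%x stream=%d%n",
                length, type, flags, streamId);
    }
}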
When establishing an HTTP connection the server has to know which network protocol to use. There are two ways to inform the server that it should use HTTP/2.
1. Server upgrade to HTTP/2
The first way to initiate an HTTP/2 protocol response is to use the HTTP Upgrade header. In this case the client would begin by making a clear-text request, which would later be upgraded to the HTTP/2 protocol version.
Listing 4. Upgrade HTTP request
GET /index.html HTTP/1.1
Host: server.example.com
Connection: Upgrade, HTTP2-Settings
Upgrade: h2c
HTTP2-Settings: <base64url encoding of HTTP/2 SETTINGS payload>
An HTTP/2-compatible server would accept the upgrade with a Switching Protocols response. After the empty line terminating the 101 response, the server would begin sending HTTP/2 frames.
Listing 5. Switching Protocols HTTP response
HTTP/1.1 101 Switching Protocols
Connection: Upgrade
Upgrade: h2c
[ HTTP/2 connection ...
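Putting both messages together, a client-side sketch of the h2c upgrade might look like the following. The server name is the placeholder from the listings above, and the HTTP2-Settings value sent here is the base64url encoding of an empty SETTINGS payload, which is the simplest legal value:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;

public class H2cUpgradeSketch {
    public static void main(String[] args) throws IOException {
        try (Socket socket = new Socket("server.example.com", 80)) {
            Writer out = new OutputStreamWriter(socket.getOutputStream(), "US-ASCII");
            // An empty SETTINGS payload base64url-encodes to the empty string
            out.write("GET /index.html HTTP/1.1\r\n"
                    + "Host: server.example.com\r\n"
                    + "Connection: Upgrade, HTTP2-Settings\r\n"
                    + "Upgrade: h2c\r\n"
                    + "HTTP2-Settings: \r\n"
                    + "\r\n");
            out.flush();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), "US-ASCII"));
            String statusLine = in.readLine();
            if (statusLine != null && statusLine.startsWith("HTTP/1.1 101")) {
                // From here on the connection speaks binary HTTP/2: the client
                // must now send the connection preface followed by a SETTINGS frame
                System.out.println("Server switched to HTTP/2 (h2c)");
            } else {
                System.out.println("Server stayed on HTTP/1.1: " + statusLine);
            }
        }
    }
}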