BackgroundSSL is used by HTTPS (and others) to secure the pipe between the browser and the web server. Once used primarily by sites performing financial transactions (banks, ecommerce), it is more and more used by services which require a login (e.g. Gmail, Twitter). As such, a fast HTTPS connection is more important than ever. HTTPS connection starts out as a TCP 3-way handshake, followed by an SSL/TLS handshake. Let's take a quick look at each one in more detail.
TCP 3-way HandshakeTCP 3-way handshake starts with the client sending a SYN packet to the server. Upon receipt, the server replies with its own SYN packet but also piggybacks the acknowledgement (that it received the client's SYN) in the same packet. Thus, the server replies with a SYN-ACK packet. Finally, when the client receives the server's SYN-ACK packet, it replies with an ACK to signify that it has received the SYN from the server. At this point the TCP connection is established and the client can immediately start sending data. Therefore, the latency cost of connection establishment is 1 round-trip in the case of the first message sent by the client (as in HTTP case) and 1.5 round-trips in the case where the server is the first to speak its mind (e.g. POP3).
There are a number of reasons for SYN-packets and 3-way handshake. First, the SYN (synchronize) packets are used to exchange the initial sequence numbers that will be used by both parties to ensure transport reliability (and flow control). Second, it is used to detect old (duplicate) SYN packets arriving at the host by asking the sender to validate them (see RFC793, Figure 8). Lastly, the handshake is used as a security measure. For example, suppose a TCP server is sitting behind a firewall that only accepts traffic from IP 184.108.40.206. If the connection would become established with the very first SYN packet, an attacker could create TCP/IP packet whose source IP would be 220.127.116.11 instead of his own and put the data in that very packet. The server would receive the SYN+data, create a connection and forward the data to the application. By sending a SYN-ACK and expecting it to echo the sequence number in the final ACK packet, the server can be certain that the source IP was not spoofed.
Interestingly, RFC793 does allow data to be included in the SYN packets. However, for the reasons described above, it requires the stack to queue the data for the delivery to the application only after the successful 3-way handshake.
SSL/TLS HandshakeOnce a TCP connection is established, SSL handshake begins with the client sending ClientHello message containing version, random bits, and a list of cipher-suites it supports (e.g. AES-256). It can also include a number of extension fields. The server responds with ServerHello containing selected version and cipher-suite, random bits and extension fields. The server then follows up with its certificate and ServerHelloDone message. The three messages can end up within one TCP packet. Finally, the client sends the cipher key (actually bits that will be used to compute it) and both the client and server send messages to switch from plain-text to encrypted communication.
The takeaway here is that just like a TCP handshake, SSL also requires several packets to be exchanged before the application level communication can commence. These exchanges add another 2 round-trips worth of latency.
Proposal: Combine TCP and SSL HandshakesWhat if TCP and SSL handshakes could be combined to save a round trip? Conceptually, TCP resides at layer 4 (transport) of the OSI reference model. Since it seems that nobody can quite figure out where SSL fits in, it doesn't seem all that unnatural to create "Secure TCP" protocol (not unlike IPSec). SSL handshake can then begin with the very first SYN packet:
TCP SYN, SSL ClientHello -->
<-- TCP SYN+ACK, SSL ServerHello+Cert+ServerHelloDone
TCP ACK, SSL ClientKeyExch+... -->
<-- TCP ACK, SSL ChangeCipherSpec+...
This scheme would almost have to push SSL processing into the kernel (since that's where TCP is usually handled) but perhaps some hybrid solution could be implemented (e.g. PKI done in userspace).