Why TCP needs 3 handshakes

215 points by thunderbong 4 days ago

veblen 5 hours ago

Imagine two blind people who want to have a conversation. Before they start, each person needs to ensure that the other can both speak and hear. Typically, one person begins by asking, 'Can you hear me?' to check if the other can hear them. The second person responds with 'Yes,' confirming that they can hear. Then, the second person asks, 'Can you hear me?' and the first person replies, 'Yes,' completing the process.

In total, there are four exchanges (two questions and two answers). However, if you look closely, the second person's reply of 'Yes' already confirms that they can both hear and speak. Therefore, the second 'Can you hear me?' is unnecessary. With just three exchanges (one question and two answers), both people know that they can send and receive messages.
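
A minimal sketch of the bookkeeping in this analogy, assuming strict turn-taking so that anything you hear after you have spoken counts as a reply to you. The function name and the four "facts" are made up for illustration and are not TCP terminology.

```python
# Track what each person has confirmed after every message in the exchange.
# Assumption: messages strictly alternate, so hearing something after you
# have spoken means the other side heard you.

FULL = {"I can hear", "peer can speak", "I can speak", "peer can hear"}

def run_exchange(messages):
    """messages: list of (sender, text) with sender in {"A", "B"}."""
    known = {"A": set(), "B": set()}
    has_spoken = {"A": False, "B": False}
    for sender, _text in messages:
        receiver = "B" if sender == "A" else "A"
        # Hearing anything proves the receiver can hear and the sender can speak.
        known[receiver].update({"I can hear", "peer can speak"})
        # A reply to something the receiver said proves the receiver's own
        # words got through as well.
        if has_spoken[receiver]:
            known[receiver].update({"I can speak", "peer can hear"})
        has_spoken[sender] = True
    return known

two = run_exchange([("A", "Can you hear me?"), ("B", "Yes")])
three = run_exchange([("A", "Can you hear me?"), ("B", "Yes"), ("A", "Yes")])
```

After two messages only A has the full picture; B still lacks proof that its "Yes" arrived, which is what the third message supplies.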

blamarvt 4 hours ago

What if the first blind person was also deaf and just trolling the other blind person? Wouldn't the second "Can you hear me?" be needed then?
- kbmr 2 hours ago
  
  Troll Control Protocol
- colejohnson66 an hour ago
  
  Then you set the “evil” bit
tsimionescu 4 hours ago

Actually, the second answer is also unnecessary. The conversation can go like this:
A: Can you hear me?
B: Yes
A: What time is it?
B: 5 o'clock
A: Thank you, goodbye!
B: Goodbye!
Nothing is lost compared to:
A: Can you hear me?
B: Yes
A: Yes
A: What time is it?
[...]
- GoatOfAplomb 2 hours ago
  
  You could also just start with "What time is it?" and see what you get back, right?
  - athenot 2 hours ago
    
    I believe that's UDP. :)
- dullcrisp 3 hours ago
  
  Well there the “what time is it?” serves as the “great, I can hear you too!” but I take it that the point is that B needs some reply from A to know that they can hear them.

Pikamander2 17 hours ago

> Theoretically, even more than three handshakes would not guarantee a "completely reliable" TCP connection. However, through three handshakes, it can at least be confirmed that the connection is "basically usable." Increasing the number of handshakes would merely increase the confidence level in the "connection availability."

This sounds like a variation of the Two Generals' Problem: https://en.wikipedia.org/wiki/Two_Generals%27_Problem

stavros 17 hours ago

Kind of, but not exactly. The article treats the channels as immutable, i.e. you can tell whether a channel is working or not by sending one packet. Under this assumption, you'd have to send three packets for both sides to discover whether all four ways work (server send, server receive, client send, client receive), but after that you'd need no more assurance.
In the two generals problem, the channel can fail at any time (which is what happens in real life), so no amount of handshakes can assure you. Because of the above, I don't agree with their conclusion that more handshakes is better. Either you assume immutable channels, so you only need three, or mutable channels that can fail any time, so you need infinite.
- cortesoft 4 hours ago
  
  > Either you assume immutable channels, so you only need three, or mutable channels that can fail any time, so you need infinite.
  This is true if you only care about having 100% confidence. Sending more handshakes allows you to do statistical analysis to give you more confidence that the connection is reliable.
- im3w1l 6 hours ago
  
  Well it's quite common to assume a channel has failed if it's inactive too long, with periodic keep-alive messages to ensure it doesn't go inactive in the case where there is nothing to say.
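
For TCP specifically, the keep-alive idea above can be requested from the kernel through socket options. This is a rough sketch: `enable_keepalive` and its default timings are illustrative assumptions, and the finer-grained options exist only on some platforms, so the code sets them only when Python exposes them.

```python
import socket

def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Ask the kernel to probe an idle TCP connection (timings are examples)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These knobs are platform-specific; skip any the platform lacks or rejects.
    for name, value in (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # failed probes before giving up
    ):
        option = getattr(socket, name, None)
        if option is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
        except OSError:
            pass

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(s)
keepalive_on = s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
s.close()
```

Many application protocols send their own heartbeat messages instead, since the kernel defaults for these timers tend to be long.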

sedatk 21 hours ago

TL;DR: "Because if TCP handshake were 2-way only, the receiver couldn't confirm if it could send packets successfully or sender could receive them".

And that sounds bogus to me. Connection initiation isn't about testing if packets can reach or not. It's about acknowledgement, building a two peer consensus about how to handle upcoming packets from a peer, not reliability checks. In that sense, I don't even think the third step is necessary, but apparently, it's needed to handle the case of both endpoints going into a timeout loop, this article explains it perfectly: https://www.baeldung.com/cs/handshakes

f1shy 17 hours ago

Thanks for the link with the correct explanation:
Particularly, the two-way handshake presents potential problems when the ACK message from the server delays too much. Thus, if a connection timeout occurs, the client sends another SYN message with a new sequence number (Z, for example) to the server. However, if the server previously sent an ACK (which is delayed), it’ll discard this new SYN message. The client, in turn, receives the delayed ACK and assumes that it refers to the last sent SYN message. Here’s where the error happens: the client will send messages with the sequence number Z, while the server expects messages following the sequence number X.
- tsimionescu 4 hours ago
  
  This is a bogus explanation, one that has nothing to do with TCP.
  In TCP every segment has a SEQ number and an ACK number, regardless of it being a handshake segment or a data segment. This completely negates the described problem: the server's first response to a SYN includes "SEQ=Y, ACK=X". If the client which just sent "SYN seq=Z" receives the ACK for SEQ=X, it will drop this ACK and wait for a new one. Also, a server which receives a new SYN has no reason to drop it, it will send a new ACK instead.
- zokier 11 hours ago
  
  That is possibly an even more wrong explanation. Hosts do not blindly assume what is getting acknowledged by ACKs
- dcuthbertson 11 hours ago
  
  Thanks for explaining the why. I'm glad it falls somewhere between the X and the Z.
phicoh 11 hours ago

The thing to consider is who is the first to send data over the connection.
If it is the client, then a two-way handshake is enough: the client sends a SYN, the server sends a SYN+ACK, then the client sends its data, which ACKs the server's SYN. These days that is the most common model. HTTP and TLS work like that.
However, what if the server sends first? For example, the banner of SMTP. In that case the client sends a SYN, the server sends a SYN+ACK, the client sends an ACK, and only then does the server start sending data.
In general, an operating system doesn't know what is going to happen. So the client's kernel will just send the ACK immediately even if it will be followed by a data segment shortly after.
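
A small loopback experiment illustrating that the handshake is handled by the kernels rather than by the applications: on common stacks `connect()` returns, and data can be sent, before the server program has called `accept()`. The function name is made up for this sketch, and the behaviour depends on the listen backlog having room.

```python
import socket

def connect_before_accept(payload=b"hello"):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(5)
    client.connect(("127.0.0.1", port))  # handshake completed by the kernels
    client.sendall(payload)              # server app has not called accept() yet

    conn, _addr = listener.accept()      # the connection was already queued
    conn.settimeout(5)
    data = b""
    while len(data) < len(payload):      # recv may return partial data
        chunk = conn.recv(len(payload) - len(data))
        if not chunk:
            break
        data += chunk
    for s in (conn, client, listener):
        s.close()
    return data

received = connect_before_accept(b"hello")
```

The client's kernel acknowledged the SYN+ACK without knowing whether the application would write anything next, which matches the point above.
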
toast0 20 hours ago

You can only reach consensus if the peers can each receive packets that the other sent.
So reaching consensus and testing for reachability in both directions are the same.
In a client/server scenario, the client knows the connection is good when it receives the SYN+ACK, but the server doesn't know until it receives the resulting ACK. So the third packet is necessary to communicate consensus to the server; it doesn't need to be a pure ACK though, it can have data, if the client's stack makes it possible to queue outgoing data before the SYN+ACK is received.
- sedatk 19 hours ago
  
  > You can only reach consensus if the peers can each receive packets that the other sent. So reaching consensus and testing for reachability in both directions are the same.
  No, that's not the intent of a handshake. Assume a hypothetical Internet where every node has guaranteed connectivity to every other node that never fails. Do we suddenly lose the need to do a 3-way handshake? No. It's not about testing connectivity, that's semantically wrong. And what's the meaning of testing connectivity for a connection of an arbitrary length and with a quality of unknown degree throughout? It doesn't make any sense.
  There is no "knowing the connection is good", there is a process of building up consensus. The peers are only interested in an answer to this question: "Does the other party assume a connection?".
  If we knew the answer to that question beforehand, we wouldn't need a handshake at all. Reachability, transmissibility are all irrelevant. And, UDP actually works like that. Both endpoints assume willingness to connect. That's why you don't need a handshake with UDP.
  TCP-FO worked like that too, removing the need for a TCP handshake completely, because it could persist the consensus information.
  - tsimionescu 3 hours ago
    
    We don't even need to assume a perfect internet where all hosts are always connected, we just need to assume an internet where all hosts are who they say they are, and there is no way to forge your source address. In that case, yes, you wouldn't need a three-way handshake. The client would send a SYN, the server would send a SYN-ACK plus any data it wants (e.g. an SMTP banner), and the client would send any data it wants as well with the next segment (e.g. an SMTP request). If the server doesn't receive anything, it could re-send the SYN-ACK+data, just like any other packet.
    The reason this isn't done in reality is mainly that using a server's resources to send a lot of data blindly, in response to a very small request, is wasteful (if the reverse path is not accessible, the server is using all those resources retrying for nothing). It is also dangerous, if the source IP of the SYN is forged, then the server ends up sending lots of data to a victim's IP. This often happens with DNS, for example.
  - toast0 12 hours ago
    
    UDP doesn't include a protocol handshake, and there is no consensus on if the peers are communicating, unless the application adds that.
    Lack of consensus/lack of handshake on UDP is why protocols with large responses become tools for DDoS reflection, and TCP protocols with large responses don't. TCP protocols with large responses can still be used for DDoS, but only when the target takes an active part --- either filling its outbound bandwidth to serve the large response to distributed requestors or being induced into making a request with a large response that fills its inbound bandwidth.
    TCP Fast Open is interesting, but didn't seem to go anywhere. Speculatively including one MSS worth of data in the handshake process seems like it could be useful in some circumstances, but the server side implications of processing a request without an expectation that the client will receive the response are hard. (Of course, TCP can't and doesn't guarantee the client will receive the response, but when you have recent consensus on communication, you can expect it will). The crypto context shows evidence of past communication, but does not show evidence of current communication.
    
    cryptonector 6 hours ago
    
    TCP-FO lets you use recent proof of return routability to get 1 MSS of data in the SYN packet. It's akin (if you squint hard) to multiplexing multiple connections onto one. So TCP-FO ought to have become useful, but mainly it's going to be useful in local networks, not on the big bad Internet. And TCP-FO does require more state on the server.
m463 19 hours ago

without a 3-way handshake, wouldn't it be easier to do spoofing? or a man-in-the-middle attack?
I think now there are ways to do the 3-way handshake in hardware at hardware speeds, and only involve software if the connection has been vetted. This can protect against Denial-Of-Service attacks.
- tsimionescu 4 hours ago
  
  There is no significant difference between a two-way handshake and a three-way handshake, given the other parts of the TCP protocol, IF the client is the one that sends the first piece of data after the handshake. It is in fact very common for optimized hosts to send data with the "third part of the handshake", which makes it perfectly equivalent to using a two-way handshake. This happens because the first segment the client sends after the handshake must ACK the server's sequence number, regardless of whether this is "the third part of the handshake" or "the first data packet after a two-way handshake".
  The problem appears if the server would like to be the first to send data on a new connection. If the server included its data with the SYN-ACK, that would work perfectly well with benign clients, but it would be a vector for DoS attacks. An attacker could send a small packet with a forged source IP, and cause the server to send a large response to the victim's IP. So, the server can't safely send data until it receives an ACK with its secret SEQ number from the client.
- yencabulator 3 hours ago
  
  Nobody's going to bother with new stuff for better spoof protection of TCP handshakes when TLS removes the actual attack and HTTP/3 obsoletes the whole mechanism.
- toast0 11 hours ago
  
  > I think now there are ways to do the 3-way handshake in hardware at hardware speeds, and only involve software if the connection has been vetted. This can protect against Denial-Of-Service attacks.
  Do we need hardware for this? Syncookies have provided a software method to handle large volumes of inbound syn without memory restrictions since the late 90s, and it's been in all major platforms except Mac since the late 2000s; Apple forked FreeBSD's tcp months before FreeBSD added syncookies, and last I checked, Apple never pulled them in. My testing is a bit old, but I had more trouble generating line rate syns at 2x10g than handling them several years ago. Are syncookies in software enough at 100g? I'm not sure, but I'd assume so. There's plenty of things to hardware accelerate on a NIC, but syn handling doesn't seem worthwhile IMHO.
- sedatk 19 hours ago
  
  No and no. IP spoofing was already trivial with 3-way handshake. That's why random TCP sequence numbers were introduced. It's now harder, but for some scenarios, SYN cookies might also be needed.
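
A toy version of the SYN cookie idea mentioned in this subthread: the server derives its initial sequence number from the connection details and a secret, keeps no per-connection state, and accepts the final ACK only if it echoes that number plus one. This is a simplified sketch under stated assumptions, not what any real kernel does; real implementations also encode things like the MSS. `SECRET`, `cookie_isn` and `ack_is_valid` are hypothetical names.

```python
import hashlib
import hmac
import struct

SECRET = b"server-local secret"  # hypothetical; a real stack generates and rotates this

def cookie_isn(src_ip, src_port, dst_ip, dst_port, client_isn, counter):
    """Derive a 32-bit server ISN from the 4-tuple, client ISN and a time counter."""
    msg = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}|{client_isn}|{counter}".encode()
    digest = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return struct.unpack("!I", digest[:4])[0]

def ack_is_valid(src_ip, src_port, dst_ip, dst_port, client_isn,
                 ack_number, counter, max_age=1):
    """Recompute the cookie for recent counters and compare with the ACK number."""
    for age in range(max_age + 1):
        isn = cookie_isn(src_ip, src_port, dst_ip, dst_port, client_isn, counter - age)
        if (isn + 1) % 2**32 == ack_number % 2**32:
            return True
    return False

# Example addresses come from the documentation ranges.
conn = ("198.51.100.7", 40000, "203.0.113.1", 80, 1000)
isn = cookie_isn(*conn, counter=5)
good_ack = (isn + 1) % 2**32
```

A spoofer who never sees the SYN-ACK has to guess a 32-bit value, so the server commits memory and sends data only to peers that demonstrably received its reply.
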
zokier 17 hours ago

Yeah, as a proof it is dubious as it doesn't really define what "established" means in this context.
The baeldung article is just wrong
> The client, in turn, receives the delayed ACK and assumes that it refers to the last sent SYN message
ACKs have acknowledgement numbers, so this sort of confusion can not happen.
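
To make that concrete, here is a sketch of the acceptability check a client applies while waiting for a SYN-ACK, following the rule in RFC 9293 that an ACK is acceptable only if ISS < SEG.ACK <= SND.NXT in modular sequence space. The X and Z values reuse the names from the quoted article; the helper names are my own.

```python
MOD = 2**32

def seq_lt(a, b):
    """True if a comes strictly before b in 32-bit wrapping sequence space."""
    return a != b and ((b - a) % MOD) < 2**31

def seq_leq(a, b):
    return a == b or seq_lt(a, b)

def ack_acceptable_in_syn_sent(iss, snd_nxt, seg_ack):
    """iss: ISN of our latest SYN; snd_nxt: next sequence number we will send."""
    return seq_lt(iss, seg_ack) and seq_leq(seg_ack, snd_nxt)

X, Z = 1000, 5000                      # old and new client ISNs from the example
snd_nxt = (Z + 1) % MOD                # the SYN itself consumes one sequence number
delayed = ack_acceptable_in_syn_sent(Z, snd_nxt, X + 1)   # stale ACK for the old SYN
fresh = ack_acceptable_in_syn_sent(Z, snd_nxt, Z + 1)     # ACK for the current SYN
```

The delayed segment acknowledging X+1 fails the check, so the client does not treat it as a reply to the SYN carrying Z.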

gorfian_robot 5 hours ago

back in the day we had some chats with Vint Cerf during the development of Delay-tolerant networking (DTN), primarily for use in space scenarios (though there are other scenarios). no way SYN, SYN-ACK, ACK type was gonna cut it. I found a light overview here: https://www.quantamagazine.org/vint-cerfs-plan-for-building-...

krackers 21 hours ago

There's this interesting comment by "John Day" on that page, does anyone have more context/detail?

whycombagator 13 hours ago

No, but I found the comment more interesting after learning what his background is (based on the name/email left in the comment):
> John Day has been involved in research and development of computer networks since 1970, when his group at the University of Illinois was the 12th node on ARPANet (precursor to the Internet) and has developed and designed protocols for everything from the data link layer to the application layer. Also making fundamental contributions to research on distributed databases. He managed the development of the OSI reference model, naming and addressing, and a major contributor to the upper-layer architecture. He was a major contributor to the development of network management architecture, working in the area since 1984 and building and deploying LAN products and a network management system, a decade ahead of comparable systems. Mr. Day has published Patterns in Network Architecture: A Return to Fundamentals (Prentice Hall, 2008), which has been characterized (embarrassingly) as “the most important book on network protocols in general and the Internet in particular ever written.” The book analyzes the fundamental flaws in the Internet and proposes what appears to be the only path forward. Today Mr. Day splits his time between making this new path a reality and teaching at Boston University. Mr. Day is also a recognized scholar in the history of cartography focusing on 17thC China, and is past President of the Boston Map Society.
despair3435 19 hours ago

You can find the original paper by Watson online which explains it in more detail. The 3 way handshake is in fact not necessary. I believe the delta-t protocol was one of the available protocols in OSI as well. TCP/IP being the standard now is not due to the fact that it was technically the best. In fact, there are multiple shortcomings.
The delta-t protocol is also used in RINA, which was invented by John Day. It is also used in Ouroboros (https://arxiv.org/pdf/2001.09707), and I can confirm it works. ;)
- tonyg 18 hours ago
  
  The actual delta-t protocol spec has historically been quite hard to find, but is freely available from here: https://www.osti.gov/biblio/5542785
  Also related and of interest in this connexion: CurveCP and its handshake, https://curvecp.org/packets.html
Joel_Mckay 20 hours ago

There were several competing standards on the early networks, and almost every ambitious commercial entity wanted to embed their licensed IP into the web's core transport layer or lower on the OSI stack.
We take for granted the inter-connectivity of most modern equipment, but to this day companies still try to create synthetic technology monopolies to cash-in. i.e. to sustain a tenuous service commodity out of something that has essentially been free since the mid 1990s.
https://xkcd.com/927/
Philosophically it doesn't matter TCP is imperfect, but rather that the inter-connectivity is compatible with the inertia of the installed infrastructure.
One can indeed optimistically ignore the TCP connection drop and syn part of standards to tunnel/reverse-proxy through certain censorship firewalls... but it still does not make it safe for the people that live under such regimes.
Does this make it more or less clear? =3
- Bluecobra 12 hours ago
  
  > We take for granted the inter-connectivity of most modern equipment, but to this day companies still try to create synthetic technology monopolies to cash-in. i.e. to sustain a tenuous service commodity out of something that has essentially been free since the mid 1990s.
  This is why as a network engineer I always advocate for open standards everywhere I can to avoid vendor lock-in. The classic one is using OSPF instead of EIGRP on Cisco routers (or their other proprietary protocols). Nowadays this is much trickier with cloud computing and black box stuff like SDN/SDWAN.
  - Joel_Mckay 10 hours ago
    
    Most good engineers I've met follow the same open standards prioritization recommendation.
    The drama that goes on inside Cisco could fill a soap opera season. =3

geon 18 hours ago

Great conclusion:

3 is an arbitrary number, but one that works well in practice.

ali_piccioni 17 hours ago

The information can be compressed into a table that shows what knowledge (columns) each party gains (rows) at each stage (cells) of the connection.

            Can        Can
            Transmit | Receive

 Client     SYN-ACK  | SYN-ACK
 Server     ACK      | SYN

ninkendo 16 hours ago

Is your bottom row backwards? Surely the server learns that the client can transmit when it receives a SYN and that it must have received the SYN-ACK when the ACK comes back.

teleforce 21 hours ago

I think the more important question is why TCP only uses positive acknowledgement (ACK) and not negative acknowledgement (NACK).

toast0 20 hours ago

Selective acknowledgement (SACK) effectively indicates which sequence numbers are missing when there's packet loss. It's optional, but afaik, used by nearly everything on today's internet.
keeperofdakeys 20 hours ago

What would a NACK add? TCP can already send an ACK for the last successful sequence number, telling the sender to retransmit packets after that sequence number. Due to latency and large window sizes, it's far more efficient to just resend all the data than NACK individual packets.
- retSava 19 hours ago
  
  Perhaps GP considers a NACK semantically easier to understand on an intuitive level, instead of the implicit NACK with "I know you sent 8, but we're really at 4 so resume from that". But I do agree that it's more efficient for the remote host to just immediately establish "4 was the last point we are ok at".
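
A rough sketch of how a receiver could derive both the cumulative ACK and SACK-style blocks discussed above from the byte ranges that have arrived. It ignores sequence-number wraparound and the limit on how many SACK blocks fit in one header, and `ack_state` is an illustrative name rather than any stack's API.

```python
def ack_state(received, start=0):
    """received: iterable of (first_byte, end_byte_exclusive) ranges that arrived.

    Returns (cumulative_ack, sack_blocks), where cumulative_ack is the next
    byte expected in order and sack_blocks are ranges held beyond a gap.
    """
    merged = []
    for lo, hi in sorted(received):
        if merged and lo <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], hi)   # extend the previous range
        else:
            merged.append([lo, hi])
    cumulative = start
    sack = []
    for lo, hi in merged:
        if lo <= cumulative:
            cumulative = max(cumulative, hi)         # contiguous with what we have
        else:
            sack.append((lo, hi))                    # data past a hole
    return cumulative, sack

# Eight 100-byte segments were sent; the fifth (bytes 400-499) was lost.
arrived = [(0, 100), (100, 200), (200, 300), (300, 400),
           (500, 600), (600, 700), (700, 800)]
cum_ack, sack_blocks = ack_state(arrived)
```

Here the cumulative ACK stays at byte 400 ("we're really at 4"), while the single SACK block tells the sender that bytes 500 to 799 need not be resent.
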
ay 16 hours ago

It is not possible to distinguish between the absence of negative ACK and the loss of it.

ingen0s 17 hours ago

That’s commitment

ggm 20 hours ago

1RTT FTW!

Avamander 12 hours ago

TFO has unfortunately been ruined by middleboxes meddling. Maybe once IPv6 becomes more popular and NATs (that also mangle) fall out of favor we'll have more luck enabling it.

deathanatos 21 hours ago

I've always considered it one handshake, with 3 packets, which are for some reason then called "3 way". But it's one 3-way-handshake.

qwertox 20 hours ago

It is only one handshake, with 3 packets.
The author made a mistake in the title, but the content is mostly correct (fails at "First Handshake", "Second Handshake" and "Third Handshake").
He should call them stages, phases or something like that.
- AStonesThrow 19 hours ago
  
  To be really specific, they are not "packets" but at the TCP protocol level, they are "segments"
  TFA appears to use both terms sort of interchangeably, which is uncool when describing a formalized protocol standard!
  Segments are encapsulated in IP datagrams, or packets, which are, in turn, encapsulated in other PDU types, such as Ethernet frames.
  - Ekaros 19 hours ago
    
    And one could ask are segments without well data segment a segments? As mostly 2 first PDUs in exchange do not contain data...
    
    AStonesThrow 18 hours ago
    
    Yes they are. I don't know what "well data segment" is, but perhaps you mean a payload. TCP streams consist of segments, and the fragmentation or collection of those logical segments into packets is a function of the lower-layer protocols which are not TCP.
    https://www.rfc-editor.org/rfc/rfc9293.html
    
    tsimionescu 4 hours ago
    
    I think one important mention here is that fragmentation happens at two levels: most commonly, an upper layer protocol message (such as an HTTP request) is fragmented into several TCP segments, each with one TCP header. Those TCP segments themselves can sometimes be fragmented into multiple IP packets if the TCP MSS (maximum segment size) is larger than the Ethernet MTU (maximum transmission unit).
    IP fragmentation is typically avoided, and in IPv6 it is only allowed at the originator of a packet - if a router is trying to route an IPv6 packet larger than the outgoing link's MTU, it must drop the packet and send a signal about the MTU back. An IPv4 router may fragment the packet, but this rarely works well in practice.
  - iwontberude 9 hours ago
    
    Appreciate pedantry when it’s absolutely warranted. Well said!
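
To put numbers on the segment versus packet distinction in this subthread: with headers that carry no options, the largest TCP payload that fits in one unfragmented IP packet is the MTU minus the IP and TCP headers. The helper names below are made up for this sketch, and TCP options such as timestamps would shrink the usable payload further.

```python
import math

IPV4_HEADER = 20  # bytes, without options
IPV6_HEADER = 40  # bytes, fixed header only
TCP_HEADER = 20   # bytes, without options

def mss_for_mtu(mtu, ipv6=False):
    """Largest TCP payload that fits in one IP packet of the given MTU."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - TCP_HEADER

def segments_needed(message_len, mss):
    """How many TCP segments an upper-layer message of message_len bytes needs."""
    return math.ceil(message_len / mss)

mss_v4 = mss_for_mtu(1500)             # typical Ethernet MTU
mss_v6 = mss_for_mtu(1500, ipv6=True)
count = segments_needed(4000, mss_v4)  # e.g. a 4000-byte HTTP request
```

With a 1500-byte MTU this gives 1460 bytes per segment over IPv4 and 1440 over IPv6, so a 4000-byte message is carried in three segments.
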
ddfs123 20 hours ago

I've considered it to be 4 handshakes but the middle 2 are merged.
- ay 19 hours ago
  
  There is a scenario called “simultaneous open”, which has 4 packets handshake and is a wonderful source of corner cases and debugging. Mostly doesn’t happen these days, but is possible by standard and is explicitly described.
  - winternewt 18 hours ago
    
    It has been used for NAT traversal when both parties are behind NAT, but sadly it doesn't work with all routers.
odo1242 12 hours ago

The reason it’s considered a “3-way” handshake is mostly latency related. If the delay sending a packet between the client and server is k, there are 3 hops involved in establishing a connection so the time is 3k. Confusingly, this has nothing to do with how many devices are involved.
- jabiko 9 hours ago
  
  > The reason it’s considered a “3-way” handshake is mostly latency related
  It's not latency related. It's because there are 3 packets involved in completing the handshake.
  > If the delay sending a packet between the client and server is k, there are 3 hops involved in establishing a connection so the time is 3k.
  Also I think calling it "hops" is not correct. A hop occurs when a packet passes through a router.
  > Confusingly, this has nothing to do with how many devices are involved.
  In a TCP handshake there are only two devices involved (disregarding any middleboxes that meddle with the connection).
Uptrenda 18 hours ago

It's just 1 packet: an RST with a data field containing a swear word.
- evilc00kie 18 hours ago
  
  Jokes like that make me realize I understand the matter more than I want to :'D
- odo1242 12 hours ago
  
  I did not know RST packets had a data field lol
foobiekr 8 hours ago

it's called 3-way because from the point of view of a network message diagram, there are three arrows which change direction at each step.
what it really amounts to is a three phase transaction.
sedatk 21 hours ago

"3 shake handshake"? :)
- butterfly42069 21 hours ago
  
  You, the server and the NSA Collection.
- nasretdinov 19 hours ago
  
  3 hands handshake, nothing unusual here
  - beng-nl 16 hours ago
    
    Next TCP: Zaphod
- AStonesThrow 8 hours ago
  
  It all started when Jon Postel obtained the Holy Hand Grenade of Antioch and reverse-engineered it...
