|FROM ||Ruben Safir
|SUBJECT ||Subject: [Hangout - NYLXS] tcp networking
Date: Sun, 21 Jul 2019 00:24:28 -0400
List-Id: NYLXS Tech Talk and Politics
TCP_NODELAY: 2019 Best Practices for TCP Optimization
How to use TCP settings and algorithms to get better performance out of
Updated February 22, 2019
If you want to really understand TCP optimization techniques, how to
decide which to use, and how to implement them, you have come to the
right place. This post was so long and rich we decided to split it into
sections and give it a table of contents. Enjoy!
[Image: Albert Einstein struggling to understand TCP_NODELAY and Nagle
Delays (probably).]
Table of Contents:
1. TCP background information: Why Nagle's Algorithm and Delayed ACK
   were implemented and how they interact
2. Nagle's algorithm and Delayed ACK do not play well together in a
3. What are TCP_NODELAY and TCP_QUICKACK, and what do they do?
4. Resources on Delayed ACK, Nagle Delays, Tinygrams, Silly Window
   Syndrome, and other TCP issues
5. Should I enable TCP_NODELAY?
6. How do I figure out if I should enable TCP_NODELAY?
7. How do I know if TCP_NODELAY is helping?
8. How can I resolve the issues caused by Nagle's algorithm and Delayed
1. TCP Background Information: Why Nagle's Algorithm and Delayed ACK
Were Implemented, and How They Interact
Today's internet is a large and global TCP/IP network that sends web
pages and huge files of all types across great distances. A lot has
changed since the internet was initially built, when small academic and
government networks largely used the Telnet and Network Control Program
(NCP) protocols. The internet has grown exponentially since its
inception, and as more types of traffic, devices, and protocols have
come online, the importance of managing this traffic efficiently has
grown as well.
When the TCP/IP stack took over as the dominant protocol in the early
1980s, leaving Telnet to more specialized purposes, there were finally
settings available to optimize traffic flow and avoid congestion and
data loss. Even now, though, it can be difficult to know when and how to
use these settings. This article will make clear some of the best use
cases for common TCP optimization settings and techniques, specifically
Nagle's Algorithm, TCP_NODELAY, Delayed ACK and TCP_QUICKACK.
Nagle's algorithm, named after its creator John Nagle, is one mechanism
for improving TCP efficiency by reducing the number of small packets
sent over the network. The goal was to prevent a node from transmitting
many small packets if the application delivers data to the socket rather
slowly. If a process is causing many small packets to be transmitted, it
may be creating undue network congestion. This is especially true if the
payload of a packet is smaller than the TCP header data.
[Illustration caption: "You wouldn't rent a whole moving truck to move
one dresser. Why send a 1-byte Telnet instruction in a 40-byte ..."]
This is analogous to loading one dresser into a huge moving truck and
then driving across town. Unless the dresser needs to get there
immediately, you might as well wait and fill the truck up. That's what
Nagle's algorithm does. It optimizes data transfer by consolidating
multiple small writes into a single TCP segment so that the ratio of
header data to payload is more efficient. The TCP and IPv4 headers
together take up at least 40 bytes (20 bytes each, without options), and
there are plenty of applications that can emit a single byte of payload.
If your environment is configured to send data immediately, you could
end up sending a 41-byte packet with only one byte of actual payload.
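To put numbers on that overhead, here is a minimal sketch. It assumes
40 bytes of headers per packet (a 20-byte IPv4 header plus a 20-byte TCP
header, no options); the function name is made up for illustration.

```python
# Assumed header sizes: 20-byte IPv4 header + 20-byte TCP header, no options.
IPV4_HEADER = 20
TCP_HEADER = 20


def payload_efficiency(payload_bytes, header_bytes=IPV4_HEADER + TCP_HEADER):
    """Return the fraction of a packet that is application payload."""
    if payload_bytes < 0:
        raise ValueError("payload size cannot be negative")
    return payload_bytes / (payload_bytes + header_bytes)


if __name__ == "__main__":
    # Compare a 1-byte keystroke, a small message, and a full 1460-byte segment.
    for size in (1, 100, 1460):
        share = payload_efficiency(size)
        print(f"{size:>5}-byte payload: {share:.1%} of the packet is data")
```

Under these assumptions a 1-byte payload is 1/41 of the packet, while a
full 1460-byte segment is mostly data. That gap is what Nagle's
algorithm tries to close.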
TCP delayed acknowledgment, or Delayed ACK, is another technique used by
some TCP implementations in an effort to improve network performance and
reduce congestion. Delayed ACK was invented to reduce the number of ACKs
required to acknowledge received segments, and so reduce protocol
overhead. With Delayed ACK, the destination holds back the ACK segment
for up to the value of the delayed ACK timer, about 200 - 500 ms, rather
than immediately acknowledging every single received TCP segment.
Several ACK responses may be combined into a single response. Delayed
ACK is basically a bet by the destination that a new packet will arrive
before the delayed ACK timer expires. Though in
some circumstances, the technique can cause a reduction in the
It is important to understand the performance impact on your
applications when you're deciding which TCP optimization methods to
Nagle's Algorithm and Delayed ACK were created around the same time, but
due to lack of collaboration between the creators, they provided an
incomplete and sometimes conflicting solution. John Nagle himself
expressed frustration about the situation in a Hacker News thread on the
"That still irks me. The real problem is not tinygram prevention. It's
ACK delays, and that stupid fixed timer. They both went into TCP around
the same time, but independently. I did tinygram prevention (the Nagle
algorithm) and Berkeley did delayed ACKs, both in the early 1980s. The
combination of the two is awful."
2. Nagle's Algorithm and Delayed ACK Do Not Play Well Together in a
By default, Nagle's algorithm and Delayed ACK are both enabled in most
TCP implementations, so they are in effect across most networks,
including the internet. Nagle's algorithm effectively allows only one
small packet to be in flight (sent but not yet acknowledged) on a
connection at any given time. This tends to hold back traffic because of
the interaction between Nagle's algorithm and delayed ACKs. Hence
Nagle's algorithm is undesirable in highly interactive environments.
For example: Nagle's algorithm tries to send more data per segment if it
can, and part of it depends on receiving an ACK before sending more
data. Delayed ACK, meanwhile, holds the ACK back in the hope of
combining it with other traffic. Together they create a problem because
the receiver is waiting around to send the ACK while the sender is
waiting around to receive it! This creates stalls of 200-500 ms on
segments that could otherwise be sent immediately and delivered to the
receive-side stack and the apps above it.
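The sender-side half of that stall can be sketched as a simplified
decision rule. This is an illustration only, not how any particular
stack implements it: it ignores the receive window and timers, and the
function and parameter names are made up for this example.

```python
def nagle_can_send(queued_bytes, mss, unacked_bytes, nodelay=False):
    """Return True if the sender may transmit a segment right now.

    Simplified rule: a full-sized segment is always sent; a small one is
    sent only when nothing previously sent is still awaiting an ACK.
    """
    if queued_bytes <= 0:
        return False          # nothing to send
    if nodelay:
        return True           # Nagle's algorithm disabled: send at once
    if queued_bytes >= mss:
        return True           # a full segment never waits
    return unacked_bytes == 0  # small segment: wait for the outstanding ACK


if __name__ == "__main__":
    # One byte queued while an earlier small segment is still unacknowledged:
    # the sender holds it back until the (possibly delayed) ACK arrives.
    print(nagle_can_send(queued_bytes=1, mss=1460, unacked_bytes=1))
    # The same situation with Nagle's algorithm turned off.
    print(nagle_can_send(queued_bytes=1, mss=1460, unacked_bytes=1, nodelay=True))
```

In the first call the small segment has to wait. If the receiver is also
delaying its ACK, that wait lasts until the delayed ACK timer fires.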
In situations where you need your data to be transmitted immediately and
one-way latency matters, such as when transmitting user interactions
like keypresses or mouse movements from a client to a central server
using Telnet, turning off Nagle's algorithm can make for a better user
experience. But for almost everything else, where only round-trip time
matters rather than one-way latency, turning off Nagle's algorithm may
not help.
Delayed ACKs can help in certain circumstances, such as when using the
character echo option in Telnet. If the ACKs are tiny and don't use much
bandwidth then Delayed ACK is not of much help. These intricacies make
it tough to tell when to use Nagle's algorithm, Delayed ACK, and other
TCP optimization options.
There's nothing in TCP to automatically turn Nagle's algorithm or
Delayed ACKs off, so you have to understand your network well enough to
choose the options that will provide the best performance.
3. What Are TCP_NODELAY and TCP_QUICKACK, And What Do They Do?
It is very important to understand the interactions between Nagle's
algorithm and Delayed ACKs. The TCP_NODELAY socket option lets a
connection bypass Nagle Delays by disabling Nagle's algorithm, so that
data is sent as soon as it is available, whatever the packet size. To
disable Delayed ACKs, use the TCP_QUICKACK socket option.
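As a minimal sketch, this is how the two options could be set from
Python's standard socket module. The helper names are made up for this
example. TCP_QUICKACK is a Linux-specific option, so the code checks for
it first. The Linux tcp(7) man page also describes the flag as not
permanent, so an application that relies on it may need to set it again
after reads.

```python
import socket


def make_low_latency_socket():
    """Create a TCP socket with Nagle's algorithm disabled (TCP_NODELAY)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def request_quick_ack(sock):
    """Ask for immediate ACKs where the platform supports TCP_QUICKACK.

    Returns True if the option was set, False if it is unavailable here.
    """
    if not hasattr(socket, "TCP_QUICKACK"):
        return False          # not a Linux build of Python
    try:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_QUICKACK, 1)
    except OSError:
        return False          # the kernel refused the option
    return True


if __name__ == "__main__":
    s = make_low_latency_socket()
    print("TCP_NODELAY:", s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
    print("TCP_QUICKACK set:", request_quick_ack(s))
    s.close()
```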
Enabling the TCP_NODELAY option turns Nagle's algorithm off. In the case
of interactive applications or chatty protocols with a lot of
handshakes, such as SSL, Citrix and Telnet, Nagle's algorithm can cause
a drop in performance, whereas enabling TCP_NODELAY can improve
performance. In any request-response application protocol where request
data can be larger than a packet, Nagle's algorithm can artificially
impose a few hundred milliseconds of latency between the requester and
the responder, even if the requester has properly buffered the request
data. In this case the requester should disable Nagle's algorithm by
enabling TCP_NODELAY. If the response data can be larger than a packet,
the responder should also enable TCP_NODELAY so the requester can
promptly receive the whole response.
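The request-response case can be sketched over the loopback interface
with both sides enabling TCP_NODELAY. This is only an illustration: the
helper names are made up, the sizes are arbitrary, and loopback will not
show the latency you would see on a real network.

```python
import socket
import threading


def recv_exact(sock, count):
    """Read exactly `count` bytes, or raise if the peer closes early."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed before sending all data")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def serve_once(listener, request_size, response):
    """Responder: accept one connection, read the request, send the response."""
    conn, _ = listener.accept()
    with conn:
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        recv_exact(conn, request_size)
        conn.sendall(response)   # the response may span several segments


def round_trip(request, response):
    """Run one request/response exchange over loopback and return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        listener.listen(1)
        worker = threading.Thread(
            target=serve_once, args=(listener, len(request), response))
        worker.start()
        with socket.create_connection(listener.getsockname()) as client:
            client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
            client.sendall(request)       # one buffered write for the request
            reply = recv_exact(client, len(response))
        worker.join()
    return reply


if __name__ == "__main__":
    reply = round_trip(b"q" * 4000, b"r" * 3000)
    print(len(reply), "bytes received")
```

The requester still writes the request in a single sendall call.
TCP_NODELAY only keeps the final, partly filled segment from waiting on
an ACK.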
4. More Resources on Nagle Delays, Delayed ACK, Tinygrams, and Silly
To Nagle or Not to Nagle, That is the Question
Nagle delays explained
What Is a Tinygram?
What is Silly Window Syndrome
TCP profile setting for the BIG-IP
5. Should I Enable TCP_NODELAY?
It really depends on your specific workload and the dominant traffic
patterns on a service. Typically, Local Area Networks (LANs) have fewer
issues with traffic congestion than Wide Area Networks (WANs).
If you are dealing with non-interactive traffic or bulk transfers, such
as SOAP, XMLRPC, or HTTP/web traffic, then enabling TCP_NODELAY to
disable Nagle's algorithm is unnecessary.
Some contexts where Nagle's algorithm won't help and TCP_NODELAY should
be enabled are:
- Highly interactive applications that communicate with a central
  server (Citrix, networked video games, etc.)
- Telnet-connected devices
- Applications using chatty protocols
6. How Do I Figure Out if I Should Enable TCP_NODELAY?
There is no simple rule of thumb, as this is very dependent on your
traffic patterns and application mix, but here's a good test you can do
if you have ExtraHop. Leave the ExtraHop Discover appliance (EDA) running
to get some baseline data, then look at the TCP stats under your key
switches. Are you seeing a high number of "tinygrams" (packets that
contain a relatively small payload compared to the overhead associated
with the headers required to transfer the data)?
If you see lots of tinygrams or a high number of Nagle Delays as a
percentage of overall traffic, first disable TCP_NODELAY, which allows
Nagle's algorithm to reduce the tinygrams. Again, leave the ExtraHop
appliance running for some time and then look at the tinygram number. If
it is still very high, Nagle's algorithm is not reducing the tinygrams,
so enable TCP_NODELAY.
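If you do not have an appliance reporting tinygram counts, a rough
equivalent can be computed from payload sizes taken from a packet
capture. The sketch below rests on two assumptions: 40 bytes of headers
per packet, and a tinygram threshold of "payload smaller than the
headers", which is a choice made for illustration rather than an
ExtraHop definition. The function name is likewise made up.

```python
HEADER_BYTES = 40  # assumed: 20-byte IPv4 header + 20-byte TCP header


def tinygram_fraction(payload_sizes, header_bytes=HEADER_BYTES):
    """Return the share of data-carrying packets whose payload is smaller
    than the header overhead (the illustrative tinygram threshold)."""
    data_packets = [size for size in payload_sizes if size > 0]  # skip pure ACKs
    if not data_packets:
        return 0.0
    tiny = sum(1 for size in data_packets if size < header_bytes)
    return tiny / len(data_packets)


if __name__ == "__main__":
    # Hypothetical capture: two keystrokes, one full segment, one pure ACK.
    sizes = [1, 1, 1460, 0]
    print(f"tinygram share: {tinygram_fraction(sizes):.0%}")
```

Comparing this share before and after a settings change gives a simple
before/after number for the tuning loop described above.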
Tuning tends to be an iterative process. It takes some experimentation
to know if you should or should not enable TCP_NODELAY, and your needs
will change over time as your networking stack and applications grow and
7. How Do I Know if TCP_NODELAY Is Helping?
After enabling TCP_NODELAY to disable Nagle's algorithm and going
through the process of tuning, if you see a very low number of Nagle
Delays as a percentage of overall traffic and a very low number of
tinygrams, then you know enabling TCP_NODELAY is helping.
Conversely, if you see a high number of Nagle Delays as a percentage of
overall traffic and a very high number of tinygrams, then enabling
TCP_NODELAY is probably not the best fit for your use case.
8. How can I resolve the issues caused by Nagle's algorithm and Delayed
If you have been through the tuning process and are still seeing network
congestion issues, you may have problems that can't be solved by
tweaking your socket settings. However, there are a few more things to
try before giving up:
- Enable TCP_NODELAY to disable Nagle's algorithm via global socket
  options on the servers.
- Make profile tweaks on proxy servers and load balancers: This is
  especially relevant if you're running applications or environments
  that only sometimes have highly interactive traffic and chatty
  protocols. By dynamically switching Nagle's algorithm and TCP_NODELAY
  on and off at the load balancer level, you can keep even highly
  heterogeneous traffic mixes running optimally.
- Reduce the Delayed ACK timer on your servers and load balancers.
  Sometimes, this kind of optimization is handled in software, at the
  application level, but when that's not the case, you may still be
  able to dynamically manage the ACK timer at the server or load
  balancer level.
As you're making these changes, keep careful watch on your network
traffic and see how each tweak impacts congestion.
At ExtraHop, we get to take a detailed look at plenty of enormous
corporate networks, and you'd be surprised how often a major company has
purchased hundreds of thousands of dollars in additional network gear
unnecessarily because their core protocols, the TCP/IP stack, weren't
optimized for their application traffic mix. It really pays to try
optimizing your current environment before throwing more hardware at the