In this section
we return to our study of TCP. As we learned in Section 3.5, TCP provides
a reliable transport service between two processes running on different
hosts. Another extremely important component of TCP is its congestion-control
mechanism. As we indicated in the previous section, TCP must use end-to-end
congestion control rather than network-assisted congestion control, since
the IP layer provides no explicit feedback to the end systems regarding
network congestion. Before diving into the details of TCP congestion control,
let's first get a high-level view of TCP's congestion-control mechanism,
as well as the overall goal that TCP strives for when multiple TCP connections
must share the bandwidth of a congested link.
A TCP connection
controls its transmission rate by limiting its number of transmitted-but-yet-to-be-acknowledged
segments. Let us denote this number of permissible unacknowledged segments
as w, often referred to as the TCP window size. Ideally,
TCP connections should be allowed to transmit as fast as possible (that
is, to have as large a number of outstanding unacknowledged segments as
possible) as long as segments are not lost (dropped at routers) due to
congestion. In very broad terms, a TCP connection starts with a small value
of w and then "probes" for the existence of additional unused link
bandwidth at the links on its end-to-end path by increasing w. A
TCP connection continues to increase w until a segment loss occurs
(as detected by a timeout or duplicate acknowledgments). When such a loss
occurs, the TCP connection reduces w to a "safe level" and then
begins probing again for unused bandwidth by slowly increasing w.
An important
measure of the performance of a TCP connection is its throughput--the rate
at which it transmits data from the sender to the receiver. Clearly, throughput
will depend on the value of w. If a TCP sender transmits all w
segments back to back, it must then wait for one round-trip time (RTT)
until it receives acknowledgments for these segments, at which point it
can send w additional segments. If a connection transmits w
segments of size MSS bytes every RTT seconds, then the connection's
throughput, or transmission rate, is (w · MSS)/RTT
bytes per second.
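To make the formula concrete, here is a minimal sketch (in Python, with made-up values for w, MSS, and RTT) that evaluates (w · MSS)/RTT:

```python
# Throughput of a window-limited TCP connection: (w * MSS) / RTT.
# The numbers below are illustrative only, not taken from the text.
w = 10        # permissible unacknowledged segments
mss = 1460    # bytes per segment
rtt = 0.100   # round-trip time in seconds

throughput = (w * mss) / rtt          # bytes per second
print(f"{throughput:.0f} bytes/sec")  # 146000 bytes/sec, about 1.17 Mbps
```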
Suppose now
that K TCP connections are traversing a link of capacity R.
Suppose also that there are no UDP packets flowing over this link, that
each TCP connection is transferring a very large amount of data and that
none of these TCP connections traverse any other congested link. Ideally,
the window sizes in the TCP connections traversing this link should be
such that each connection achieves a throughput of R/K. More
generally, if a connection passes through N links, with link n
having transmission rate R_n and supporting a total of
K_n TCP connections, then ideally this connection should
achieve a rate of R_n/K_n on the nth
link. However, this connection's end-to-end average rate cannot exceed
the minimum rate achieved at all of the links along the end-to-end path.
That is, the end-to-end transmission rate for this connection is r
= min{R_1/K_1, . . ., R_N/K_N}.
We could think of the goal of TCP as providing this connection with this
end-to-end rate, r. (In actuality, the formula for r is more
complicated, as we should take into account the fact that one or more of
the intervening connections may be bottlenecked at some other link that
is not on this end-to-end path and hence cannot use their bandwidth share,
R_n/K_n. In this case, the value of r
would be higher than min{R_1/K_1, . . ., R_N/K_N}. See [Bertsekas
1991].)
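As a quick illustration of this minimum, the following sketch computes r for a hypothetical path; the per-link rates and connection counts are invented for the example.

```python
# Ideal end-to-end rate r = min over links n of R_n / K_n.
# Each tuple is (R_n in bps, K_n connections); the values are made up.
links = [(10_000_000, 4), (45_000_000, 100), (1_500_000, 3)]

r = min(R_n / K_n for R_n, K_n in links)
print(f"ideal end-to-end rate r = {r:.0f} bps")  # 450000 bps: the 45 Mbps link shared by 100 connections
```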
3.7.1: Overview
of TCP Congestion Control
In Section 3.5
we saw that each side of a TCP connection consists of a receive buffer,
a send buffer, and several variables (LastByteRead, RcvWin,
and so on). The TCP congestion-control mechanism has each side of the connection
keep track of two additional variables: the congestion window and
the threshold. The congestion window, denoted CongWin,
imposes an additional constraint on how much traffic a host can send into
a connection. Specifically, the amount of unacknowledged data that a host
can have within a TCP connection may not exceed the minimum of CongWin
and RcvWin, that is:
LastByteSent - LastByteAcked ≤ min{CongWin, RcvWin}
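A sender-side check corresponding to this constraint might look as follows; this is only a sketch, with variable names taken from the text and sample byte counts that are invented.

```python
# May the sender transmit more data? Mirrors
# LastByteSent - LastByteAcked <= min(CongWin, RcvWin).
def may_send_more(last_byte_sent, last_byte_acked, cong_win, rcv_win):
    outstanding = last_byte_sent - last_byte_acked   # unacknowledged bytes in flight
    return outstanding < min(cong_win, rcv_win)      # room left under both windows?

# Illustrative values: 4,380 bytes outstanding against an 8,192-byte congestion window.
print(may_send_more(last_byte_sent=14_600, last_byte_acked=10_220,
                    cong_win=8_192, rcv_win=65_535))  # True
```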
The threshold,
which we discuss in detail below, is a variable that affects how CongWin
grows.
Let us now look
at how the congestion window evolves throughout the lifetime of a TCP connection.
In order to focus on congestion control (as opposed to flow control), let
us assume that the TCP receive buffer is so large that the receive window
constraint can be ignored. In this case, the amount of unacknowledged data
that a host can have within a TCP connection is solely limited by CongWin.
Further let's assume that a sender has a very large amount of data to send
to a receiver.
Once a TCP connection
is established between the two end systems, the application process at
the sender writes bytes to the sender's TCP send buffer. TCP grabs chunks
of size MSS, encapsulates each chunk within a TCP segment, and passes the
segments to the network layer for transmission across the network. The
TCP congestion window regulates the times at which the segments are sent
into the network (that is, passed to the network layer). Initially, the
congestion window is equal to one MSS. TCP sends the first segment into
the network and waits for an acknowledgment. If this segment is acknowledged
before its timer times out, the sender increases the congestion window
by one MSS and sends out two maximum-size segments. If these segments are
acknowledged before their timeouts, the sender increases the congestion
window by one MSS for each of the acknowledged segments, giving a congestion
window of four MSS, and sends out four maximum-sized segments. This procedure
continues as long as (1) the congestion window is below the threshold and
(2) the acknowledgments arrive before their corresponding timeouts.
During this
phase of the congestion-control procedure, the congestion window increases
exponentially fast. The congestion window is initialized to one MSS; after
one RTT, the window is increased to two segments; after two round-trip
times, the window is increased to four segments; after three round-trip
times, the window is increased to eight segments, and so forth. This phase
of the algorithm is called slow start because it begins with a small
congestion window equal to one MSS. (The transmission rate of the connection
starts slowly but accelerates rapidly.)
The slow-start
phase ends when the window size exceeds the value of threshold.
Once the congestion window is larger than the current value of threshold,
the congestion window grows linearly rather than exponentially. Specifically,
if w is the current value of the congestion window, and w
is larger than threshold, then after w acknowledgments
have arrived, TCP replaces w with w + 1. This has the effect
of increasing the congestion window by 1 in each RTT for which an entire
window's worth of acknowledgments arrives. This phase of the algorithm
is called congestion avoidance.
The congestion-avoidance
phase continues as long as the acknowledgments arrive before their corresponding
timeouts. But the window size, and hence the rate at which the TCP sender
can send, cannot increase forever. Eventually, the TCP rate will be such
that one of the links along the path becomes saturated, at which point
loss (and a resulting timeout at the sender) will occur. When a timeout
occurs, the value of threshold is set to half the value of the
current congestion window, and the congestion window is reset to one MSS.
The sender then again grows the congestion window exponentially fast using
the slow-start procedure until the congestion window hits the threshold.
In summary:
- When the congestion window is below the threshold, the congestion window grows exponentially.
- When the congestion window is above the threshold, the congestion window grows linearly.
- Whenever there is a timeout, the threshold is set to one-half of the current congestion window and the congestion window is then set to 1.
If we ignore the
slow-start phase, we see that TCP essentially increases its window size
by 1 each RTT (and thus increases its transmission rate by an additive
factor) when its network path is not congested, and cuts its window size
in half each time it detects (via loss) that the path is congested. For this reason,
TCP is often referred to as an additive-increase, multiplicative-decrease
(AIMD) algorithm.
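The rules summarized above can be captured in a short simulation sketch. This is not an implementation of any real TCP stack; the window is counted in units of MSS, and the RTT at which loss occurs is an arbitrary assumption chosen only to exercise slow start, congestion avoidance, and the timeout reaction.

```python
# Per-RTT evolution of a Tahoe-style congestion window (in units of MSS).
# Loss RTTs are injected arbitrarily, purely for illustration.
def evolve_window(num_rtts, initial_threshold=8, loss_rtts=frozenset({7})):
    cong_win, threshold = 1, initial_threshold
    history = []
    for rtt in range(num_rtts):
        history.append(cong_win)
        if rtt in loss_rtts:                    # a timeout is detected during this RTT
            threshold = max(cong_win // 2, 1)   # threshold := half the current window
            cong_win = 1                        # restart from one MSS (slow start)
        elif cong_win < threshold:              # slow start: exponential growth
            cong_win *= 2
        else:                                   # congestion avoidance: linear growth
            cong_win += 1
    return history

print(evolve_window(12))  # [1, 2, 4, 8, 9, 10, 11, 12, 1, 2, 4, 8]
```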
The evolution
of TCP's congestion window is illustrated in Figure 3.49. In this figure,
the threshold is initially equal to 8 • MSS. The congestion window
climbs exponentially fast during slow start and hits the threshold at the
third transmission. The congestion window then climbs linearly until loss
occurs, just after transmission 7. Note that the congestion window is 12
• MSS when loss occurs. The threshold is then set to 0.5 • CongWin
= 6 • MSS and the congestion window is set to 1. The process then continues.
This congestion-control algorithm is due to V. Jacobson [Jacobson
1988]; a number of modifications to Jacobson's initial algorithm are
described in [Stevens 1994] and [RFC 2581].
Figure 3.49:
Evolution of TCP's congestion window
We note briefly
here that the description of TCP slow start is an idealized one. An initial
window of up to two MSS's is a proposed standard [RFC
2581] and it is actually used in some implementations.
A Trip to
Nevada: Tahoe, Reno, and Vegas
The TCP congestion-control
algorithm just described is often referred to as Tahoe. One problem
with the Tahoe algorithm is that, when a segment is lost, the sending side
of the connection may have to wait a long period of time for the timeout.
For this reason, a variant of Tahoe, called Reno, is implemented
by most operating systems. Like Tahoe, Reno sets its congestion window
to one segment upon the expiration of a timer. However, Reno also includes
the fast retransmit mechanism that we examined in Section 3.5. Recall that
fast retransmission triggers the transmission of a dropped segment if three
duplicate ACKs for a segment are received before the occurrence of the
segment's timeout. Reno also employs a fast-recovery mechanism that
essentially cancels the slow-start phase after a fast retransmission. The
interested reader is encouraged to see [Stevens
1994] and [RFC
2581] for details. [Cela
2000] provides interactive animations of congestion avoidance, slow
start, fast retransmit, and fast recovery in TCP.
Most TCP implementations
currently use the Reno algorithm. There is, however, another algorithm
in the literature, the Vegas algorithm, that can improve Reno's performance.
Whereas Tahoe and Reno react to congestion (that is, to overflowing router
buffers), Vegas attempts to avoid congestion while maintaining good throughput.
The basic idea of Vegas is to (1) detect congestion in the routers between
source and destination before packet loss occurs and (2) lower the
rate linearly when this imminent packet loss is detected. Imminent packet
loss is predicted by observing the round-trip times. The longer the round-trip
times of the packets, the greater the congestion in the routers. The Vegas
algorithm is discussed in detail in [Brakmo
1995]; a study of its performance is given in [Ahn
1995]. As of 1999, Vegas is not a part of the most popular TCP implementations.
We emphasize
that TCP congestion control has evolved over the years, and is still evolving.
What was good for the Internet when the bulk of the TCP connections carried
SMTP, FTP, and Telnet traffic is not necessarily good for today's Web-dominated
Internet or for the Internet of the future, which will support who-knows-what
kinds of services.
Does TCP
Ensure Fairness?
In the above
discussion, we noted that a goal of TCP's congestion-control mechanism
is to share a bottleneck link's bandwidth evenly among the TCP connections
that are bottlenecked at that link. But why should TCP's additive-increase,
multiplicative-decrease algorithm achieve that goal, particularly given
that different TCP connections may start at different times and thus may
have different window sizes at a given point in time? [Chiu
1989] provides an elegant and intuitive explanation of why TCP congestion
control converges to provide an equal share of a bottleneck link's bandwidth
among competing TCP connections.
Let's consider
the simple case of two TCP connections sharing a single link with transmission
rate R, as shown in Figure 3.50. We'll assume that the two connections
have the same MSS and RTT (so that if they have the same congestion window
size, then they have the same throughput), that they have a large amount
of data to send, and that no other TCP connections or UDP datagrams traverse
this shared link. Also, we'll ignore the slow-start phase of TCP and assume
the TCP connections are operating in congestion-avoidance mode (additive-increase,
multiplicative-decrease) at all times.
Figure 3.50:
Two TCP connections sharing a single bottleneck link
Figure 3.51
plots the throughput realized by the two TCP connections. If TCP is to
equally share the link bandwidth between the two connections, then the
realized throughput should fall along the 45-degree arrow ("equal bandwidth
share") emanating from the origin. Ideally, the sum of the two throughputs
should equal R. (Certainly, each connection receiving an equal,
but zero, share of the link capacity is not a desirable situation!) So
the goal should be to have the achieved throughputs fall somewhere near
the intersection of the "equal bandwidth share" line and the "full bandwidth
utilization" line in Figure 3.51.
Figure 3.51:
Throughput realized by TCP connections 1 and 2
Suppose that
the TCP window sizes are such that at a given point in time, connections
1 and 2 realize throughputs indicated by point A in Figure 3.51.
Because the amount of link bandwidth jointly consumed by the two connections
is less than R, no loss will occur, and both connections will increase
their window by 1 per RTT as a result of TCP's congestion-avoidance algorithm.
Thus, the joint throughput of the two connections proceeds along a 45-degree
line (equal increase for both connections) starting from point A.
Eventually, the link bandwidth jointly consumed by the two connections
will be greater than R and eventually packet loss will occur. Suppose
that connections 1 and 2 experience packet loss when they realize throughputs
indicated by point B. Connections 1 and 2 then decrease their windows
by a factor of two. The resulting throughputs realized are thus at point
C, halfway along a vector starting at B and ending at the
origin. Because the joint bandwidth use is less than R at point
C, the two connections again increase their throughputs along a
45-degree line starting from C. Eventually, loss will again occur,
for example, at point D, and the two connections again decrease
their window sizes by a factor of two, and so on. You should convince yourself
that the bandwidth realized by the two connections eventually fluctuates
along the equal bandwidth share line. You should also convince yourself
that the two connections will converge to this behavior regardless of where
they are in the two-dimensional space! Although a number of idealized assumptions
lay behind this scenario, it still provides an intuitive feel for why TCP
results in an equal sharing of bandwidth among connections.
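A numerical version of this argument is easy to sketch. In the snippet below, the link capacity, the two starting throughputs, and the number of iterations are all arbitrary assumptions; the point is only that additive increase plus synchronized multiplicative decrease drives the two rates toward an equal share.

```python
# Two AIMD connections sharing a link of capacity R (throughput in arbitrary units).
def aimd_two_flows(x1, x2, R=100.0, increase=1.0, rounds=200):
    for _ in range(rounds):
        if x1 + x2 > R:              # joint rate exceeds capacity: both connections see loss
            x1, x2 = x1 / 2, x2 / 2  # multiplicative decrease
        else:                        # no loss: additive increase for both
            x1, x2 = x1 + increase, x2 + increase
    return x1, x2

# Start far from the equal-share line; after many rounds the two rates are nearly equal.
print(aimd_two_flows(5.0, 70.0))
```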
In our idealized
scenario, we assumed that only TCP connections traverse the bottleneck
link, and that only a single TCP connection is associated with a host-destination
pair. In practice, these two conditions are typically not met, and client/
server applications can thus obtain very unequal portions of link bandwidth.
Many network
applications run over TCP rather than UDP because they want to make use
of TCP's reliable transport service. But an application developer choosing
TCP gets not only reliable data transfer but also TCP congestion control.
We have just seen how TCP congestion control regulates an application's
transmission rate via the congestion-window mechanism. Many multimedia
applications do not run over TCP for this very reason--they do not want
their transmission rate throttled, even if the network is very congested.
In particular, many Internet telephone and Internet video conferencing
applications typically run over UDP. These applications prefer to pump
their audio and video into the network at a constant rate and occasionally
lose packets, rather than reduce their rates to "fair" levels at times
of congestion and not lose any packets. From the perspective of TCP, the
multimedia applications running over UDP are not being fair--they do not
cooperate with the other connections nor adjust their transmission rates
appropriately. A major challenge in the upcoming years will be to develop
congestion-control mechanisms for the Internet that prevent UDP traffic
from bringing the Internet's throughput to a grinding halt [Floyd
1999].
But even if
we could force UDP traffic to behave fairly, the fairness problem would
still not be completely solved. This is because there is nothing to stop
an application running over TCP from using multiple parallel connections.
For example, Web browsers often use multiple parallel TCP connections to
transfer a Web page. (The number of parallel connections is configurable
in most browsers.) When an application uses multiple parallel connections,
it gets a larger fraction of the bandwidth in a congested link. As an example,
consider a link of rate R supporting nine ongoing client/server
applications, with each of the applications using one TCP connection. If
a new application comes along and also uses one TCP connection, then each
application gets approximately the same transmission rate of R/10.
But if this new application instead uses 11 parallel TCP connections, then
the new application gets an unfair allocation of more than R/2.
Because Web traffic is so pervasive in the Internet, multiple parallel
connections are not uncommon.
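The arithmetic in this example is easy to reproduce. The sketch below simply assumes that a congested link's rate is split evenly across TCP connections, as in the idealized discussion above.

```python
# Share of a link of rate R obtained by a newcomer, assuming an equal per-connection split.
def newcomer_share(R, existing_connections=9, newcomer_connections=1):
    total = existing_connections + newcomer_connections
    return R * newcomer_connections / total

R = 1.0  # normalized link rate
print(newcomer_share(R, newcomer_connections=1))    # 0.10, i.e., R/10
print(newcomer_share(R, newcomer_connections=11))   # 0.55, i.e., more than R/2
```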
Macroscopic
Description of TCP Dynamics
Consider sending
a very large file over a TCP connection. If we take a macroscopic view
of the traffic sent by the source, we can ignore the slow-start phase.
Indeed, the connection is in the slow-start phase for a relatively short
period of time because the connection grows out of the phase exponentially
fast. When we ignore the slow-start phase, the congestion window grows
linearly, gets chopped in half when loss occurs, grows linearly, gets chopped
in half when loss occurs, and so on. This gives rise to the saw-tooth behavior
of TCP [Stevens
1994] shown in Figure 3.49.
Given this saw-tooth
behavior, what is the average throughput of a TCP connection? During a
particular round-trip interval, the rate at which TCP sends data is a function
of the congestion window and the current RTT. When the window size
is w • MSS and the current round-trip time is RTT,
then TCP's transmission rate is (w • MSS)/RTT. During
the congestion-avoidance phase, TCP probes for additional bandwidth by
increasing w by one each RTT until loss occurs. (Denote by
W the value of w at which loss occurs.) Assuming that RTT
and W are approximately constant over the duration of the connection,
the TCP transmission rate ranges from
(W · MSS)/(2 · RTT) to (W · MSS)/RTT.
These assumptions
lead to a highly simplified macroscopic model for the steady-state behavior
of TCP. The network drops a packet from the connection when the connection's
window size increases to W • MSS; the congestion window is
then cut in half and then increases by one MSS per round-trip time until
it again reaches W. This process repeats itself over and over again.
Because the TCP throughput increases linearly between the two extreme values,
we have:
Average throughput of a connection = (0.75 · W · MSS)/RTT
Using this highly
idealized model for the steady-state dynamics of TCP, we can also derive
an interesting expression that relates a connection's loss rate to its
available bandwidth [Mahdavi
1997]. This derivation is outlined in the homework problems.
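A one-line translation of the sawtooth average is shown below; the values of W, MSS, and RTT are hypothetical and serve only to put numbers on the formula.

```python
# Macroscopic steady-state throughput: the window oscillates between W/2 and W,
# so the average rate is 0.75 * W * MSS / RTT. Sample values are illustrative.
W, mss, rtt = 20, 1460, 0.100             # segments, bytes, seconds
avg_throughput = 0.75 * W * mss / rtt     # bytes per second
print(f"{avg_throughput:.0f} bytes/sec")  # 219000 bytes/sec, about 1.75 Mbps
```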
3.7.2: Modeling
Latency: Static Congestion Window
Many TCP connections
transport relatively small files from one host to another. For example,
with HTTP/1.0, each object in a Web page is transported over a separate
TCP connection, and many of these objects are small text files or tiny
icons. When transporting a small file, TCP connection establishment and
slow start may have a significant impact on the latency. In this section
we present an analytical model that quantifies the impact of connection
establishment and slow start on latency. For a given object, we define
the latency as the time from when the client initiates a TCP connection
until the time at which the client receives the requested object in its
entirety.
The analysis
presented here assumes that the network is uncongested, that is, that the
TCP connection transporting the object does not have to share link bandwidth
with other TCP or UDP traffic. (We comment on this assumption below.) Also,
in order to not obscure the central issues, we carry out the analysis in
the context of the simple one-link network as shown in Figure 3.52. (This
link might model a single bottleneck on an end-to-end path. See also the
homework problems for an explicit extension to the case of multiple links.)
Figure 3.52:
A simple one-link network connecting a client and a server
We also make
the following simplifying assumptions:
- The amount of data that the sender can transmit is solely limited by the sender's congestion window. (Thus, the TCP receive buffers are large.)
- Packets are neither lost nor corrupted, so that there are no retransmissions.
- All protocol header overheads--including TCP, IP, and link-layer headers--are negligible and ignored.
- The object (that is, file) to be transferred consists of an integer number of segments of size MSS (maximum segment size).
- The only packets that have non-negligible transmission times are packets that carry maximum-size TCP segments. Request messages, acknowledgments, and TCP connection-establishment segments are small and have negligible transmission times.
- The initial threshold in the TCP congestion-control mechanism is a large value that is never attained by the congestion window.
We also introduce
the following notation:
- The size of the object to be transferred is O bits.
- The MSS (maximum segment size) is S bits (for example, 536 bytes, that is, 4,288 bits).
- The transmission rate of the link from the server to the client is R bps.
- The round-trip time is denoted by RTT.
In this section
we define the RTT to be the time elapsed for a small packet to travel
from client to server and then back to the client, excluding the transmission
time of the packet. It includes the two end-to-end propagation delays
between the two end systems and the processing times at the two end systems.
We shall assume that the RTT is also equal to the round-trip time of a packet
beginning at the server.
Although the
analysis presented in this section assumes an uncongested network with
a single TCP connection, it nevertheless sheds light on the more realistic
case of a congested, multilink network. For a congested network, R
roughly represents the amount of bandwidth received in steady state in
the end-to-end network connection, and RTT represents a round-trip delay
that includes queuing delays at the routers preceding the congested links.
In the congested network case, we model each TCP connection as a constant-bit-rate
connection of rate R bps preceded by a single slow-start phase.
(This is roughly how TCP Reno behaves when losses are detected with triple
duplicate acknowledgments.) In our numerical examples, we use values of
R and RTT that reflect typical values for a congested network.
Before beginning
the formal analysis, let us try to gain some intuition. Let us consider
what would be the latency if there were no congestion-window constraint;
that is, if the server were permitted to send segments back-to-back until
the entire object is sent. To answer this question, first note that one
RTT is required to initiate the TCP connection. After one RTT, the client
sends a request for the object (which is piggybacked onto the third segment
in the three-way TCP handshake). After a total of two RTTs, the client
begins to receive data from the server. The client receives data from the
server for a period of time O/R, the time for the server
to transmit the entire object. Thus, in the case of no congestion-window
constraint, the total latency is 2 RTT + O/R. This
represents a lower bound; the slow-start procedure, with its dynamic congestion
window, will of course elongate this latency.
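This lower bound is simple enough to compute directly. The sketch below uses the same kind of numbers as the later examples (a 100 Kbyte object, a 1 Mbps link, and a 100 ms RTT), treated here only as illustrative inputs.

```python
# Lower-bound latency with no congestion-window constraint: 2*RTT + O/R.
O = 100_000 * 8    # object size in bits (100 Kbytes)
R = 1_000_000      # link transmission rate in bps (1 Mbps)
RTT = 0.100        # round-trip time in seconds

print(2 * RTT + O / R)  # 1.0 second
```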
Static Congestion
Window
Although TCP
uses a dynamic congestion window, it is instructive to first analyze the
case of a static congestion window. Let W, a positive integer, denote
a fixed-size static congestion window. For the static congestion window,
the server is not permitted to have more than W unacknowledged outstanding
segments. When the server receives the request from the client, the server
immediately sends W segments back-to-back to the client. The server
then sends one segment into the network for each acknowledgment it receives
from the client. The server continues to send one segment for each acknowledgment
until all of the segments of the object have been sent. There are two cases
to consider:
- WS/R > RTT + S/R. In this case, the server receives an acknowledgment for the first segment in the first window before the server completes the transmission of the first window.
- WS/R < RTT + S/R. In this case, the server transmits the first window's worth of segments before the server receives an acknowledgment for the first segment in the window.
Let us first consider
case 1, which is illustrated in Figure 3.53. In this figure the window
size is W = 4 segments.
Figure 3.53:
The case that WS/R > RTT + S/R
One RTT is required
to initiate the TCP connection. After one RTT, the client sends a request
for the object (which is piggybacked onto the third segment in the three-way
TCP handshake). After a total of two RTTs, the client begins to receive
data from the server. Segments arrive periodically from the server every
S/R seconds, and the client acknowledges every segment it
receives from the server. Because the server receives the first acknowledgment
before it completes sending a window's worth of segments, the server continues
to transmit segments after having transmitted the first window's worth
of segments. And because the acknowledgments arrive periodically at the
server every S/R seconds from the time when the first acknowledgment
arrives, the server transmits segments continuously until it has transmitted
the entire object. Thus, once the server starts to transmit the object
at rate R, it continues to transmit the object at rate R
until the entire object is transmitted. The latency therefore is 2 RTT
+ O/R.
Now let us consider
case 2, which is illustrated in Figure 3.54. In this figure, the window
size is W = 2 segments.
Figure 3.54:
The case that WS/R < RTT + S/R
Once again,
after a total of two RTTs, the client begins to receive segments from the
server. These segments arrive periodically every S/R seconds,
and the client acknowledges every segment it receives from the server.
But now the server completes the transmission of the first window before
the first acknowledgment arrives from the client. Therefore, after sending
a window, the server must stall and wait for an acknowledgment before resuming
transmission. When an acknowledgment finally arrives, the server sends
a new segment to the client. Once the first acknowledgment arrives, a window's
worth of acknowledgments arrive, with each successive acknowledgment spaced
by S/R seconds. For each of these acknowledgments, the server
sends exactly one segment. Thus, the server alternates between two states:
a transmitting state, during which it transmits W segments, and
a stalled state, during which it transmits nothing and waits for an acknowledgment.
The latency is equal to 2 RTT plus the time required for the server
to transmit the object, O/R, plus the amount of time that
the server is in the stalled state. To determine the amount of time the
server is in the stalled state, let K = O/WS; if O/WS
is not an integer, then round K up to the nearest integer. Note
that K is the number of windows of data there are in the object
of size O. The server is in the stalled state between the transmission
of each of the windows, that is, for K - 1 periods of time, with
each period lasting RTT - (W - 1)S/R (see Figure
3.54). Thus, for case 2,
Latency = 2
RTT + O/R + (K - 1) [S/R + RTT
- WS/R]
Combining the
two cases, we obtain
Latency = 2
RTT + O/R + (K - 1) [S/R + RTT
- WS/R]+
where [x]+
= max(x,0).
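The combined formula translates directly into a short function. The function name and the sample arguments below are ours; only the formula itself comes from the analysis above.

```python
import math

# Latency of an O-bit object with a static congestion window of W segments:
#   Latency = 2*RTT + O/R + (K - 1) * [S/R + RTT - W*S/R]^+,  with K = ceil(O / (W*S)).
def static_window_latency(O, S, R, RTT, W):
    K = math.ceil(O / (W * S))                 # number of windows in the object
    stall = max(S / R + RTT - W * S / R, 0.0)  # per-window stall time, [x]^+
    return 2 * RTT + O / R + (K - 1) * stall

# Illustrative call: 100 Kbyte object, 536-byte segments, 1 Mbps link, 100 ms RTT, W = 4.
print(static_window_latency(O=800_000, S=536 * 8, R=1_000_000, RTT=0.100, W=4))  # about 5.0 sec
```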
This completes
our analysis of static windows. The following analysis for dynamic windows
is more complicated, but parallels that for static windows.
3.7.3: Modeling
Latency: Dynamic Congestion Window
We now investigate
the latency for a file transfer when TCP's dynamic congestion window is
in force. Recall that the server first starts with a congestion window
of one segment and sends one segment to the client. When it receives an
acknowledgment for the segment, it increases its congestion window to two
segments and sends two segments to the client (spaced apart by S/R
seconds). As it receives the acknowledgments for the two segments, it increases
the congestion window to four segments and sends four segments to the client
(again spaced apart by S/R seconds). The process continues,
with the congestion window doubling every RTT. A timing diagram for TCP
is illustrated in Figure 3.55.
Figure 3.55:
TCP timing during slow start
Note that O/S
is the number of segments in the object; in the above diagram, O/S
= 15. Consider the number of segments that are in each of the windows.
The first window contains one segment, the second window contains two segments,
and the third window contains four segments. More generally, the kth
window contains 2^(k-1) segments. Let K be the number
of windows that cover the object; in the preceding diagram, K =
4. In general, we can express K in terms of O/S as
follows:
K = min{k : 2^0 + 2^1 + . . . + 2^(k-1) ≥ O/S} = min{k : 2^k - 1 ≥ O/S} = ⌈log2(O/S + 1)⌉
After transmitting
a window's worth of data, the server may stall (that is, stop transmitting)
while it waits for an acknowledgment. In Figure 3.55, the server stalls
after transmitting the first and second windows, but not after transmitting
the third. Let us now calculate the amount of stall time after transmitting
the kth window. The time from when the server begins to transmit the kth
window until the server receives an acknowledgment for the
first segment in the window is S/R + RTT. The transmission
time of the kth window is 2^(k-1) (S/R).
The stall time is the difference of these two quantities, that is,
[S/R + RTT - 2^(k-1) (S/R)]+.
The server can
potentially stall after the transmission of each of the first K
- 1 windows. (The server is done after the transmission of the Kth
window.) We can now calculate the latency for transferring the file. The
latency has three components: 2 RTT for setting up the TCP connection
and requesting the file, O/R, the transmission time of the
object, and the sum of all the stall times. Thus,
Latency = 2 RTT + O/R + Σ_(k=1 to K-1) [S/R + RTT - 2^(k-1)(S/R)]+
The reader should
compare the above equation for the latency with the equation for static congestion
windows; all the terms are exactly the same except that the term WS/R
for static windows has been replaced by 2^(k-1)(S/R)
for dynamic windows. To obtain a more compact expression for the latency,
let Q be the number of times the server would stall if the object
contained an infinite number of segments:
Q = max{k : 2^(k-1) ≤ 1 + RTT/(S/R)} = ⌊log2(1 + RTT/(S/R))⌋ + 1
The actual number
of times the server stalls is P = min{Q,K-1}. In Figure
3.55, P = Q = 2. Combining the above two equations gives
Latency = 2 RTT + O/R + Σ_(k=1 to P) [S/R + RTT - 2^(k-1)(S/R)]
We can further
simplify the above formula for latency by noting
Σ_(k=1 to P) [S/R + RTT - 2^(k-1)(S/R)] = P [RTT + S/R] - (2^P - 1)(S/R)
Combining the
above two equations gives the following closed-form expression for the
latency:
Latency = 2 RTT + O/R + P [RTT + S/R] - (2^P - 1)(S/R)
Thus to calculate
the latency, we simply must calculate K and Q, set P
= min {Q,K-1}, and plug P into the above formula.
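Putting the pieces together, a compact sketch of this recipe might look as follows. The helper name is ours, and the expressions for K and Q follow the idealized slow-start model developed above; the sample call uses the 5 Kbyte object and 1 Mbps link that appear in the examples below.

```python
import math

# Closed-form slow-start latency for an O-bit object over an R-bps link (idealized model):
#   K = number of windows covering the object, Q = number of stalls for an infinite object,
#   P = min(Q, K - 1), Latency = 2*RTT + O/R + P*(RTT + S/R) - (2**P - 1)*(S/R).
def slow_start_latency(O, S, R, RTT):
    K = math.ceil(math.log2(O / S + 1))                # windows needed to cover O/S segments
    Q = math.floor(math.log2(1 + RTT / (S / R))) + 1   # stalls if the object were infinite
    P = min(Q, K - 1)                                  # actual number of stalls
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * (S / R)

# Illustrative call: 5 Kbyte object, 536-byte segments, 1 Mbps link, 100 ms RTT.
print(slow_start_latency(O=40_000, S=536 * 8, R=1_000_000, RTT=0.100))  # about 0.52 sec
```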
It is interesting
to compare the TCP latency to the latency that would occur if there were
no congestion control (that is, no congestion window constraint). Without
congestion control, the latency is 2 RTT + O/R, which
we define to be the minimum latency. It is a simple exercise to
show that
Latency / Minimum latency ≤ 1 + P / [(O/R)/RTT + 2]
We see from
the above formula that TCP slow start will not significantly increase latency
if RTT << O/R, that is, if the round-trip time
is much less than the transmission time of the object. Thus, if we are
sending a relatively large object over an uncongested high-speed link,
then slow start has an insignificant effect on latency. However, with the
Web, we are often transmitting many small objects over congested links,
in which case slow start can significantly increase latency (as we'll see
in the following subsection).
Let us now take
a look at some example scenarios. In all the scenarios we set S
= 536 bytes, a common default value for TCP. We'll use an RTT of
100 msec, which is not an atypical value for a continental or intercontinental
delay over moderately congested links. First consider sending a rather
large object of size O = 100 Kbytes. The number of windows that
cover this object is K = 8. For a number of transmission rates,
the following table examines the effect of the slow-start mechanism on
the latency.
R | O/R | P | Minimum latency: O/R + 2 RTT | Latency with slow start
28 Kbps | 28.6 sec | 1 | 28.8 sec | 28.9 sec
100 Kbps | 8 sec | 2 | 8.2 sec | 8.4 sec
1 Mbps | 800 msec | 5 | 1 sec | 1.5 sec
10 Mbps | 80 msec | 7 | 0.28 sec | 0.98 sec
We see from
the above chart that for a large object, slow-start adds appreciable delay
only when the transmission rate is high. If the transmission rate is low,
then acknowledgments come back relatively quickly, and TCP quickly ramps
up to its maximum rate. For example, when R = 100 Kbps, the number
of stall periods is P = 2 whereas the number of windows to transmit
is K = 8; thus the server stalls only after the first two of eight
windows. On the other hand, when R = 10 Mbps, the server stalls
between each window, which causes a significant increase in the delay.
Now consider
sending a small object of size O = 5 Kbytes. The number of windows
that cover this object is K = 4. For a number of transmission rates,
the following table examines the effect of the slow-start mechanism.
R | O/R | P | Minimum latency: O/R + 2 RTT | Latency with slow start
28 Kbps | 1.43 sec | 1 | 1.63 sec | 1.73 sec
100 Kbps | 0.4 sec | 2 | 0.6 sec | 0.757 sec
1 Mbps | 40 msec | 3 | 0.24 sec | 0.52 sec
10 Mbps | 4 msec | 3 | 0.20 sec | 0.50 sec
Once again,
slow start adds an appreciable delay when the transmission rate is high.
For example, when R = 1 Mbps, the server stalls between each window,
which causes the latency to be more than twice that of the minimum latency.
For a larger
RTT, the effect of slow start becomes significant for small objects
even at lower transmission rates. The following table examines the effect
of slow start for RTT = 1 second and O = 5 Kbytes (K = 4).
R | O/R | P | Minimum latency: O/R + 2 RTT | Latency with slow start
28 Kbps | 1.43 sec | 3 | 3.4 sec | 5.8 sec
100 Kbps | 0.4 sec | 3 | 2.4 sec | 5.2 sec
1 Mbps | 40 msec | 3 | 2.0 sec | 5.0 sec
10 Mbps | 4 msec | 3 | 2.0 sec | 5.0 sec
In summary, slow start can significantly increase latency
when the object size is relatively small and the RTT is relatively large.
Unfortunately, this is often the scenario when sending objects over the
World Wide Web.
An Example: HTTP
As an application of the latency analysis, let's now calculate
the response time for a Web page sent over nonpersistent HTTP. Suppose
that the page consists of one base HTML page and M referenced images.
To keep things simple, let us assume that each of the M + 1 objects
contains exactly O bits.
With nonpersistent HTTP, each object is transferred independently,
one after the other. The response time of the Web page is therefore the
sum of the latencies for the individual objects. Thus
Response time = (M + 1) [2 RTT + O/R + P (RTT + S/R) - (2^P - 1)(S/R)]
Note that the response time for nonpersistent HTTP takes
the form:
Response time = (M + 1)O/R + 2(M
+ 1)RTT + latency due to TCP slow-start
for each of the M + 1 objects.
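Under this model, the page response time is just M + 1 copies of the per-object latency. A small sketch (reusing the hypothetical slow_start_latency helper from the earlier example; the number of images and the object size are invented):

```python
# Response time of a page with one base HTML file and M images over nonpersistent HTTP,
# assuming each of the M + 1 objects is O bits and is fetched over its own TCP connection.
def nonpersistent_http_response_time(M, O, S, R, RTT):
    return (M + 1) * slow_start_latency(O, S, R, RTT)  # helper sketched earlier

# Illustrative call: 10 images of 5 Kbytes each over a 1 Mbps link with a 100 ms RTT.
print(nonpersistent_http_response_time(M=10, O=40_000, S=536 * 8, R=1_000_000, RTT=0.100))  # about 5.75 sec
```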
Clearly, if there are many objects in the Web page and
if RTT is large, then nonpersistent HTTP will have poor response-time
performance. In the homework problems, we will investigate the response
time for other HTTP transport schemes, including persistent connections
and nonpersistent connections with parallel connections. The reader is
also encouraged to see [Heidemann
1997] for a related analysis.