Reliable Transport Computer Networking Reliable Transport Prof. Andrzej Duda [email protected] http://duda.imag.fr 1 1 Reliable Transport Reliable Transport Reliable data transfer Automatic Repeat reQuest ß Data are received ß Sliding window ordered and error-free ß Error and Loss detection ß Elements of procedure ß Acknowledgements: short usually means the set of control packets following functions ß Retransmission Strategies ß Error detection and ß Stop & Go correction (e.g. ARQ ) ß Go Back N ß Flow Control ß Selective Repeat 2 2 Reliable Transport Principles of reliable data transfer ß important in app., transport, link layers ß data are received ordered and error-free ß characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt) 3 Reliable data transfer means that the data are received ordered and error-free. The problem of implementing reliable data transfer occurs not only at the transport layer, but also at the link layer and the application layer as well. The service abstraction provided to the upper layer entities is that of a reliable channel, where no transferred data bits are corrupted or lost, and all are delivered in the order in which they were sent. This is precisely the service model offered by TCP to the Internet applications that invoke it. It is the responsibility of a reliable data transfer protocol to implement this service abstraction. This task is made difficult by the fact that the layer below the reliable data transfer protocol may be unreliable. For example, TCP is a reliable data transfer protocol that is implemented on top of an unreliable (IP) end-end network layer. 3 Reliable Transport Reliable data transfer Our approach: ß analyze reliable data transfer protocol (rdt) ß use finite state machines (FSM) to specify the model ß introduce gradually the Elements of Procedure (EoP) event causing state transition actions taken on state transition state: when in this state “state” next state state event uniquely determined 1 2 by next event actions 4 We analyze the main elements of a reliable data transfer protocol, using finite state machine to specify the models of the sender and the receiver. The generic term "packet" is seen as more appropriate, and used here rather than "segment" for the protocol data unit, given that the concepts of a reliable data transfer protocol applies to computer networks in general. We talk about Elements of Procedure (EoP) as the elements for obtaining a reliable data transfer that are not limited to transport layer, but exist in general in Communication Networks. 4 Reliable Transport Elements of Procedure ß Elements of Procedure transform a raw, unreliable frame transport mechanism into a reliable data pipe ß ordered delivery of packets ß loss-free as long as the connection is alive ß no buffer overflow ß Elements of procedure usually means the set of following functions ß Error detection and correction (e.g. ARQ - Automatic Repeat reQuest) ß Flow Control ß Connection Management ß Elements of procedure exist in ß reliable transport (TCP for the TCP/IP architecture) (layer 4) ß also: in reliable data link (ex: over radio channels, over modems -layer 2) as HDLC ß Congestion Control 5 The Elements of Procedure (EoP) transform a raw, unreliable frame transport mechanism into a reliable data pipe by delivering the packets in order and without losses, and avoiding buffer overflow. With EoP is usually addressed three following functions: •Error detection and correction Various ways exists to carry out this process. However, techniques based on checks (parity) bits are not generally adopted for data networking, because they need too many bits to ensure the level of error correction required. Instead, ARQ (Automatic Repeat reQuest) protocols - reliable data transfer protocols based on retransmission – has been universally adopted. •Flow-Control to eliminate the possibility of the sender overflowing the receiver's buffer. Flow control is thus a speed matching service--matching the rate at which the sender is sending to the rate at which the receiving application is reading. •Connection Management: reliable data transfer must be connection oriented. In order to control the connection two phases are needed: connection setup and connection release. Applications that use UDP must implement some form of equivalent control, if the application cares about not loosing data. UDP applications (like NFS, DNS) typically send data blocks and expect a response, which can be used as an application layer acknowledgement. Finally, as is the case of TCP, we can also have congestion control to avoid network congestion. 5 Reliable Transport ACK/NAK handling ß ACKs & NAKs: short control packets ß cumulative versus selective ß positive (ACK) versus negative (NAK) ß Stop and wait ß The sender waits for an ACK/NAK ß Only one packet at time can be transmitted ß Go back N ß packets are transmitted without waiting for an ACK ß All following packets are resent on receipt of NAK ß Selective repeat procedure ß Only packets negatively acknowledged are resent 6 There are a number of ways of handling the response to ACK/NAK. The three most commonly identified procedures are the following: •Stop and wait: The sender can send only one packet at time and have to wait for an ACK/NAK. If a NAK is received or a timeout expires the packet is retransmitted. If an ACK arrives the current packet is dropped by the sender buffer. •Go back N: The packets are transmitted without waiting for an ACK. If a NAK is received or a timeout expires the packet in question and all the packets following are retransmitted. •Selective repeat procedure Only packets negatively acknowledged, or for which the timeout has expired without receiving an ACK, are resent. 6 Reliable Transport Retransmission Strategies ß underlying channel can ß Approach: sender waits garble and lose packets “reasonable” amount of (data or ACKs) time for ACK ß checksum, seq. #, ACKs, ß retransmits if no ACK or NAK retransmissions will be of received in this time help, but not enough ß if pkt (or ACK) just delayed ß to deal with loss & (not lost): errors: ß retransmission will be duplicate, but use of seq. ß sender waits until certain #’s already handles this time, then retransmits ß receiver must specify seq # ß duplicate of pkt being ACKed ß requires countdown timer 7 If the underlying channel can lose packets as well, two additional concerns must now be addressed by the protocol: how to detect packet loss and what to do when packet loss occurs. This can be done with the use of checksumming, sequence numbers, ACK packets, and retransmissions. To deal with loss, the sender could wait long enough so that it is certain that a packet has been lost, and then t can simply retransmit the data packet. However, cannot wait forever. The sender must clearly wait at least as long as a round-trip delay between the sender and receiver (which may include buffering at intermediate routers or gateways) plus whatever amount of time is needed to process a packet at the receiver. In many networks, this worst-case maximum delay is very difficult to even estimate, much less know with certainty. Moreover, the protocol should ideally recover from packet loss as soon as possible; waiting for a worst-case delay could mean a long wait until error recovery is initiated. The approach thus adopted in practice is for the sender to "judiciously" choose a time value such that packet loss is likely, although not guaranteed, to have happened. If an ACK is not received within this time, the packet is retransmitted. Note that if a packet experiences a particularly large delay, the sender may retransmit the packet even though neither the data packet nor its ACK have been lost. This introduces the possibility of duplicate data packets in the sender-to-receiver channel, that we already see how to handle. From the sender's viewpoint, retransmission is a panacea. The sender does not know whether a data packet was lost, an ACK was lost, or if the packet or ACK was simply overly delayed. In all cases, the action is the same: retransmit. In order to implement a time-based retransmission mechanism, a countdown timer will be needed that can interrupt the sender after a given amount of timer has expired. The sender will thus need to be able to (1) start the timer each time a packet (either a first-time packet, or a retransmission) is sent, (2) respond to a timer interrupt (taking appropriate actions), and (3) stop the timer. 7 Reliable Transport Send and Wait (simple) ß Reliable transmission pkt n ß Flow control OK ß sender can send the ACK next packet after receiving ACK pkt n+1 ß No losses error NACK pkt n+1 time 8 The send side has two states. In one state, the send-side protocol is waiting for data to be passed down from the upper layer. In the other state, the sender protocol is waiting for an ACK or a NAK packet from the receiver. It is important to note that when the receiver is in the wait-for-ACK-or-NAK state, it cannot get more data from the upper layer; that will only happen after the sender receives an ACK and leaves this state. Thus, the sender will not send a new piece of data until it is sure that the receiver has correctly received the current packet. Because of this behavior, protocols such as this one are known as stop-and-wait protocols. The receiver-side FSM has a single state. On packet arrival, the receiver replies with either an ACK or a NAK, depending on whether or not the received packet is corrupted. There are two limitations with Stop and Wait: •ACK and NAK can get corrupted. A solution is to introduce sequence numbers. •a little efficiency because of idle periods, when the bandwidth delay product is not negligible. A solution is to allow multiple transmissions, which in turn requires a sliding window in order to avoid unlimited buffer at the destination. 8 Reliable Transport Losses ß Timer pkt n ¥ ß if no response within a time interval, T retransmission ß the time interval must pkt n be longer than RTT OK ACK pkt n+1 time 9 Minimally, we will need to add checksum bits to ACK/NAK packets in order to detect such errors. The more difficult question is how the protocol should recover from errors in ACK or NAK packets. The difficulty here is that if an ACK or NAK is corrupted, the sender has no way of knowing whether or not the receiver has correctly received the last piece of transmitted data. If we add a new message (e.g. "What did you say?" ) even this message can get lost, and we are at the same point. If we just retransmit we could introduce duplicate packets into the sender-to-receiver channel. The fundamental difficulty with duplicate packets is that the receiver doesn't know whether the ACK or NAK it last sent was received correctly at the sender. Thus, it cannot know a priori whether an arriving packet contains new data or is a retransmission! A simple solution to this new problem (and one adopted in almost all existing data-transfer protocols including TCP) is to add a new field to the data packet and have the sender number its data packets by putting a flag (sometimes referred to as sequence number) into this field. The receiver then need only check this sequence number to determine whether or not the received packet is a retransmission. The existence of sender-generated duplicate packets and packet (data, ACK) loss also complicates the sender's processing of any ACK packet it receives. If an ACK is received, how is the sender to know if it was sent by the receiver in response to its (sender's) own most recently transmitted packet, or is a delayed ACK sent in response to an earlier transmission of a different data packet? The solution to this dilemma is to augment the ACK packet with an acknowledgment field. When the receiver generates an ACK, it will copy the sequence number of the data packet being acknowledged into this acknowledgment field. By examining the contents of the acknowledgment field, the sender can determine the sequence number of the packet being positively acknowledged. 9 Reliable Transport Problem of duplicates ß Retransmission ß next and retransmitted pkt n packets are confused OK : ß Solution T ¥ ACK n received ß sequence number pkt n accepted as n+1 time 10 10
Description: