| draft-ietf-quic-recovery-01.txt | draft-ietf-quic-recovery-02.txt | |||
|---|---|---|---|---|
| QUIC J. Iyengar, Ed. | QUIC J. Iyengar, Ed. | |||
| Internet-Draft I. Swett, Ed. | Internet-Draft I. Swett, Ed. | |||
| Intended status: Standards Track Google | Intended status: Standards Track Google | |||
| Expires: July 18, 2017 January 14, 2017 | Expires: September 14, 2017 March 13, 2017 | |||
| QUIC Loss Detection and Congestion Control | QUIC Loss Detection and Congestion Control | |||
| draft-ietf-quic-recovery-01 | draft-ietf-quic-recovery-02 | |||
| Abstract | Abstract | |||
| QUIC is a new multiplexed and secure transport atop UDP. QUIC builds | QUIC is a new multiplexed and secure transport atop UDP. QUIC builds | |||
| on decades of transport and security experience, and implements | on decades of transport and security experience, and implements | |||
| mechanisms that make it attractive as a modern general-purpose | mechanisms that make it attractive as a modern general-purpose | |||
| transport. QUIC implements the spirit of known TCP loss detection | transport. QUIC implements the spirit of known TCP loss detection | |||
| mechanisms, described in RFCs, various Internet-drafts, and also | mechanisms, described in RFCs, various Internet-drafts, and also | |||
| those prevalent in the Linux TCP implementation. This document | those prevalent in the Linux TCP implementation. This document | |||
| describes QUIC loss detection and congestion control, and attributes | describes QUIC loss detection and congestion control, and attributes | |||
| skipping to change at page 1, line 48 ¶ | skipping to change at page 1, line 48 ¶ | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on July 18, 2017. | This Internet-Draft will expire on September 14, 2017. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2017 IETF Trust and the persons identified as the | Copyright (c) 2017 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at page 2, line 27 ¶ | skipping to change at page 2, line 27 ¶ | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| described in the Simplified BSD License. | described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 1.1. Notational Conventions . . . . . . . . . . . . . . . . . 3 | 1.1. Notational Conventions . . . . . . . . . . . . . . . . . 3 | |||
| 2. Design of the QUIC Transmission Machinery . . . . . . . . . . 3 | 2. Design of the QUIC Transmission Machinery . . . . . . . . . . 3 | |||
| 2.1. Relevant Differences Between QUIC and TCP . . . . . . . . 4 | 2.1. Relevant Differences Between QUIC and TCP . . . . . . . . 4 | |||
| 2.1.1. Monotonically Increasing Packet Numbers . . . . . . . 4 | 2.1.1. Monotonically Increasing Packet Numbers . . . . . . . 4 | |||
| 2.1.2. No Reneging . . . . . . . . . . . . . . . . . . . . . 5 | 2.1.2. No Reneging . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 2.1.3. More ACK Ranges . . . . . . . . . . . . . . . . . . . 5 | 2.1.3. More ACK Ranges . . . . . . . . . . . . . . . . . . . 5 | |||
| 2.1.4. Explicit Correction For Delayed Acks . . . . . . . . 5 | 2.1.4. Explicit Correction For Delayed Acks . . . . . . . . 5 | |||
| 3. Loss Detection . . . . . . . . . . . . . . . . . . . . . . . 5 | 3. Loss Detection . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 3.1. Constants of interest . . . . . . . . . . . . . . . . . . 5 | 3.1. Constants of interest . . . . . . . . . . . . . . . . . . 5 | |||
| 3.2. Variables of interest . . . . . . . . . . . . . . . . . . 6 | 3.2. Variables of interest . . . . . . . . . . . . . . . . . . 6 | |||
| 3.3. Initialization . . . . . . . . . . . . . . . . . . . . . 6 | 3.3. Initialization . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 3.4. Setting the Loss Detection Alarm . . . . . . . . . . . . 7 | 3.4. On Sending a Packet . . . . . . . . . . . . . . . . . . . 7 | |||
| 3.5. On Sending a Packet . . . . . . . . . . . . . . . . . . . 8 | 3.5. On Ack Receipt . . . . . . . . . . . . . . . . . . . . . 8 | |||
| 3.6. On Ack Receipt . . . . . . . . . . . . . . . . . . . . . 8 | 3.6. On Packet Acknowledgment . . . . . . . . . . . . . . . . 8 | |||
| 3.7. On Packet Acknowledgment . . . . . . . . . . . . . . . . 9 | 3.7. Setting the Loss Detection Alarm . . . . . . . . . . . . 9 | |||
| 3.7.1. Handshake Packets . . . . . . . . . . . . . . . . . . 9 | ||||
| 3.7.2. Tail Loss Probe and Retransmission Timeout . . . . . 9 | ||||
| 3.7.3. Early Retransmit . . . . . . . . . . . . . . . . . . 9 | ||||
| 3.7.4. Pseudocode . . . . . . . . . . . . . . . . . . . . . 10 | ||||
| 3.8. On Alarm Firing . . . . . . . . . . . . . . . . . . . . . 10 | 3.8. On Alarm Firing . . . . . . . . . . . . . . . . . . . . . 10 | |||
| 3.9. Detecting Lost Packets . . . . . . . . . . . . . . . . . 10 | 3.9. Detecting Lost Packets . . . . . . . . . . . . . . . . . 11 | |||
| 4. Congestion Control . . . . . . . . . . . . . . . . . . . . . 10 | 3.9.1. Handshake Packets . . . . . . . . . . . . . . . . . . 11 | |||
| 5. TCP mechanisms in QUIC . . . . . . . . . . . . . . . . . . . 10 | 3.9.2. Pseudocode . . . . . . . . . . . . . . . . . . . . . 11 | |||
| 5.1. RFC 6298 (RTO computation) . . . . . . . . . . . . . . . 11 | 4. Congestion Control . . . . . . . . . . . . . . . . . . . . . 12 | |||
| 5.2. FACK Loss Recovery (paper) . . . . . . . . . . . . . . . 11 | 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12 | |||
| 5.3. RFC 3782, RFC 6582 (NewReno Fast Recovery) . . . . . . . 11 | 6. References . . . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
| 5.4. TLP (draft) . . . . . . . . . . . . . . . . . . . . . . . 11 | 6.1. Normative References . . . . . . . . . . . . . . . . . . 12 | |||
| 5.5. RFC 5827 (Early Retransmit) with Delay Timer . . . . . . 11 | 6.2. Informative References . . . . . . . . . . . . . . . . . 13 | |||
| 5.6. RFC 5827 (F-RTO) . . . . . . . . . . . . . . . . . . . . 12 | ||||
| 5.7. RFC 6937 (Proportional Rate Reduction) . . . . . . . . . 12 | ||||
| 5.8. TCP Cubic (draft) with optional RFC 5681 (Reno) . . . . . 12 | ||||
| 5.9. Hybrid Slow Start (paper) . . . . . . . . . . . . . . . . 12 | ||||
| 5.10. RACK (draft) . . . . . . . . . . . . . . . . . . . . . . 12 | ||||
| 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12 | ||||
| 7. Normative References . . . . . . . . . . . . . . . . . . . . 12 | ||||
| Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 13 | Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 13 | |||
| Appendix B. Change Log . . . . . . . . . . . . . . . . . . . . . 13 | Appendix B. Change Log . . . . . . . . . . . . . . . . . . . . . 13 | |||
| B.1. Since draft-ietf-quic-recovery-00: . . . . . . . . . . . 13 | B.1. Since draft-ietf-quic-recovery-01 . . . . . . . . . . . . 14 | |||
| B.2. Since draft-iyengar-quic-loss-recovery-01: . . . . . . . 13 | B.2. Since draft-ietf-quic-recovery-00: . . . . . . . . . . . 14 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 13 | B.3. Since draft-iyengar-quic-loss-recovery-01: . . . . . . . 14 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 14 | ||||
| 1. Introduction | 1. Introduction | |||
| QUIC is a new multiplexed and secure transport atop UDP. QUIC builds | QUIC is a new multiplexed and secure transport atop UDP. QUIC builds | |||
| on decades of transport and security experience, and implements | on decades of transport and security experience, and implements | |||
| mechanisms that make it attractive as a modern general-purpose | mechanisms that make it attractive as a modern general-purpose | |||
| transport. The QUIC protocol is described in [QUIC-TRANSPORT]. | transport. The QUIC protocol is described in [QUIC-TRANSPORT]. | |||
| QUIC implements the spirit of known TCP loss recovery mechanisms, | QUIC implements the spirit of known TCP loss recovery mechanisms, | |||
| described in RFCs, various Internet-drafts, and also those prevalent | described in RFCs, various Internet-drafts, and also those prevalent | |||
| skipping to change at page 5, line 41 ¶ | skipping to change at page 5, line 34 ¶ | |||
| called on packet transmission, when a packet is acked, and timer | called on packet transmission, when a packet is acked, and timer | |||
| expiration events. | expiration events. | |||
| 3.1. Constants of interest | 3.1. Constants of interest | |||
| Constants used in loss recovery and congestion control are based on a | Constants used in loss recovery and congestion control are based on a | |||
| combination of RFCs, papers, and common practice. Some may need to | combination of RFCs, papers, and common practice. Some may need to | |||
| be changed or negotiated in order to better suit a variety of | be changed or negotiated in order to better suit a variety of | |||
| environments. | environments. | |||
| o kMaxTLPs: 2 Maximum number of tail loss probes before an RTO | kMaxTLPs (default 2): Maximum number of tail loss probes before an | |||
| fires. | RTO fires. | |||
| o kReorderingThreshold: 3 Maximum reordering in packet number space | kReorderingThreshold (default 3): Maximum reordering in packet | |||
| before FACK style loss detection considers a packet lost. | number space before FACK style loss detection considers a packet | |||
| lost. | ||||
| o kTimeReorderingThreshold: 1/8 Maximum reordering in time sapce | kTimeReorderingFraction (default 1/8): Maximum reordering in time | |||
| before time based loss detection considers a packet lost. In | sapce before time based loss detection considers a packet lost. | |||
| fraction of an RTT. | In fraction of an RTT. | |||
| o kMinTLPTimeout: 10ms Minimum time in the future a tail loss probe | kMinTLPTimeout (default 10ms): Minimum time in the future a tail | |||
| loss probe alarm may be set for. | ||||
| kMinRTOTimeout (default 200ms): Minimum time in the future an RTO | ||||
| alarm may be set for. | alarm may be set for. | |||
| o kMinRTOTimeout: 200ms Minimum time in the future an RTO alarm may | kDelayedAckTimeout (default 25ms): The length of the peer's delayed | |||
| be set for. | ack timer. | |||
| o kDelayedAckTimeout: 25ms The length of the peer's delayed ack | kDefaultInitialRtt (default 100ms): The default RTT used before an | |||
| timer. | RTT sample is taken. | |||
| 3.2. Variables of interest | 3.2. Variables of interest | |||
| We first describe the variables required to implement the loss | We first describe the variables required to implement the loss | |||
| detection mechanisms described in this section. | detection mechanisms described in this section. | |||
| o loss_detection_alarm: Multi-modal alarm used for loss detection. | loss_detection_alarm: Multi-modal alarm used for loss detection. | |||
| o alarm_mode: QUIC maintains a single loss detection alarm, which | ||||
| switches between various modes. This mode is used to determine | ||||
| the duration of the alarm. | ||||
| o handshake_count: The number of times the handshake packets have | handshake_count: The number of times the handshake packets have been | |||
| been retransmitted without receiving an ack. | retransmitted without receiving an ack. | |||
| o tlp_count: The number of times a tail loss probe has been sent | tlp_count: The number of times a tail loss probe has been sent | |||
| without receiving an ack. | without receiving an ack. | |||
| o rto_count: The number of times an rto has been sent without | rto_count: The number of times an rto has been sent without | |||
| receiving an ack. | receiving an ack. | |||
| o smoothed_rtt: The smoothed RTT of the connection, computed as | smoothed_rtt: The smoothed RTT of the connection, computed as | |||
| described in [RFC6298] | described in [RFC6298] | |||
| o rttvar: The RTT variance. | rttvar: The RTT variance, computed as described in [RFC6298] | |||
| o reordering_threshold: The largest delta between the largest acked | initial_rtt: The initial RTT used before any RTT measurements have | |||
| been made. | ||||
| reordering_threshold: The largest delta between the largest acked | ||||
| retransmittable packet and a packet containing retransmittable | retransmittable packet and a packet containing retransmittable | |||
| frames before it's declared lost. | frames before it's declared lost. | |||
| o use_time_loss: When true, loss detection operates solely based on | time_reordering_fraction: The reordering window as a fraction of | |||
| reordering threshold in time, rather than in packet number gaps. | max(smoothed_rtt, latest_rtt). | |||
| o sent_packets: An association of packet numbers to information | loss_time: The time at which the next packet will be considered lost | |||
| about them. | based on early transmit or exceeding the reordering window in | |||
| time. | ||||
| sent_packets: An association of packet numbers to information about | ||||
| them, including a number field indicating the packet number, a | ||||
| time field indicating the time a packet was sent, and a bytes | ||||
| field indicating the packet's size. sent_packets is ordered by | ||||
| packet number, and packets remain in sent_packets until | ||||
| acknowledged or lost. | ||||
| 3.3. Initialization | 3.3. Initialization | |||
| At the beginning of the connection, initialize the loss detection | At the beginning of the connection, initialize the loss detection | |||
| variables as follows: | variables as follows: | |||
| loss_detection_alarm.reset(); | loss_detection_alarm.reset() | |||
| handshake_count = 0; | handshake_count = 0 | |||
| tlp_count = 0; | tlp_count = 0 | |||
| rto_count = 0; | rto_count = 0 | |||
| reordering_threshold = kReorderingThreshold; | if (UsingTimeLossDetection()) | |||
| use_time_loss = false; | reordering_threshold = infinite | |||
| smoothed_rtt = 0; | time_reordering_fraction = kTimeReorderingFraction | |||
| rttvar = 0; | ||||
| 3.4. Setting the Loss Detection Alarm | ||||
| QUIC loss detection uses a single alarm for all timer-based loss | ||||
| detection. The duration of the alarm is based on the alarm's mode, | ||||
| which is set in the packet and timer events further below. The | ||||
| function SetLossDetectionAlarm defined below shows how the single | ||||
| timer is set based on the alarm mode. | ||||
| Pseudocode for SetLossDetectionAlarm follows: | ||||
| SetLossDetectionAlarm(): | ||||
| if (retransmittable packets are not outstanding): | ||||
| loss_detection_alarm.cancel(); | ||||
| return; | ||||
| if (handshake packets are outstanding): | ||||
| // Handshake retransmission alarm. | ||||
| alarm_duration = max(1.5 * smoothed_rtt, kMinTLPTimeout) << handshake_count; | ||||
| handshake_count++; | ||||
| else if (largest sent packet is acked): | ||||
| // Early retransmit alarm. | ||||
| alarm_duration = 0.25 x smoothed_rtt; | ||||
| else if (tlp_count < kMaxTLPs): | ||||
| // Tail Loss Probe alarm. | ||||
| if (retransmittable_packets_outstanding = 1): | ||||
| alarm_duration = max(1.5 x smoothed_rtt + kDelayedAckTimeout, | ||||
| 2 x smoothed_rtt); | ||||
| else: | ||||
| alarm_duration = max (kMinTLPTimeout, 2 x smoothed_rtt); | ||||
| tlp_count++; | ||||
| else: | ||||
| // RTO alarm. | ||||
| if (rto_count = 0): | ||||
| alarm_duration = max(kMinRTOTimeout, smoothed_rtt + 4 x rttvar); | ||||
| else: | else: | |||
| alarm_duration = loss_detection_alarm.get_delay() << 1; | reordering_threshold = kReorderingThreshold | |||
| rto_count++; | time_reordering_fraction = infinite | |||
| loss_time = 0 | ||||
| loss_detection_alarm.set(now + alarm_duration); | smoothed_rtt = 0 | |||
| rttvar = 0 | ||||
| initial_rtt = kDefaultInitialRtt | ||||
| 3.5. On Sending a Packet | 3.4. On Sending a Packet | |||
| After any packet is sent, be it a new transmission or a rebundled | After any packet is sent, be it a new transmission or a rebundled | |||
| transmission, the following OnPacketSent function is called. The | transmission, the following OnPacketSent function is called. The | |||
| parameters to OnPacketSent are as follows: | parameters to OnPacketSent are as follows: | |||
| o packet_number: The packet number of the sent packet. | o packet_number: The packet number of the sent packet. | |||
| o is_retransmittble: A boolean that indicates whether the packet | o is_retransmittble: A boolean that indicates whether the packet | |||
| contains at least one frame requiring reliable deliver. The | contains at least one frame requiring reliable deliver. The | |||
| retransmittability of various QUIC frames is described in | retransmittability of various QUIC frames is described in | |||
| [QUIC-TRANSPORT]. If false, it is still acceptable for an ack to | [QUIC-TRANSPORT]. If false, it is still acceptable for an ack to | |||
| be received for this packet. However, a caller MUST NOT set | be received for this packet. However, a caller MUST NOT set | |||
| is_retransmittable to true if an ack is not expected. | is_retransmittable to true if an ack is not expected. | |||
| o sent_bytes: The number of bytes sent in the packet. | ||||
| Pseudocode for OnPacketSent follows: | Pseudocode for OnPacketSent follows: | |||
| OnPacketSent(packet_number, is_retransmittable): | OnPacketSent(packet_number, is_retransmittable, sent_bytes): | |||
| # TODO: Clarify the data in sent_packets. | sent_packets[packet_number].packet_number = packet_number | |||
| sent_packets[packet_number] = {now} | sent_packets[packet_number].time = now | |||
| if is_retransmittable: | if is_retransmittable: | |||
| sent_packets[packet_number].bytes = sent_bytes | ||||
| SetLossDetectionAlarm() | SetLossDetectionAlarm() | |||
| 3.6. On Ack Receipt | 3.5. On Ack Receipt | |||
| When an ack is received, it may acknowledge 0 or more packets. | When an ack is received, it may acknowledge 0 or more packets. | |||
| Pseudocode for OnAckReceived and UpdateRtt follow: | Pseudocode for OnAckReceived and UpdateRtt follow: | |||
| OnAckReceived(ack): | OnAckReceived(ack): | |||
| // If the largest acked is newly acked, update the RTT. | // If the largest acked is newly acked, update the RTT. | |||
| if (sent_packets[ack.largest_acked]): | if (sent_packets[ack.largest_acked]): | |||
| rtt_sample = now - sent_packets[ack.largest_acked] | rtt_sample = now - sent_packets[ack.largest_acked].time | |||
| if (rtt_sample > ack.ack_delay): | if (rtt_sample > ack.ack_delay): | |||
| rtt_sample -= ack.delay; | rtt_sample -= ack.delay | |||
| UpdateRtt(rtt_sample) | UpdateRtt(rtt_sample) | |||
| // Find all newly acked packets. | // Find all newly acked packets. | |||
| for acked_packet in DetermineNewlyAckedPackets(): | for acked_packet_number in DetermineNewlyAckedPackets(): | |||
| OnPacketAcked(acked_packet) | OnPacketAcked(acked_packet_number) | |||
| DetectLostPackets(ack.largest_acked_packet); | DetectLostPackets(ack.largest_acked_packet) | |||
| SetLossDetectionAlarm(); | SetLossDetectionAlarm() | |||
| UpdateRtt(rtt_sample): | UpdateRtt(rtt_sample): | |||
| // Based on {{RFC6298}}. | ||||
| if (smoothed_rtt == 0): | if (smoothed_rtt == 0): | |||
| smoothed_rtt = rtt_sample | smoothed_rtt = rtt_sample | |||
| rttvar = rtt_sample / 2 | rttvar = rtt_sample / 2 | |||
| else: | else: | |||
| rttvar = 3/4 * rttvar + 1/4 * (smoothed_rtt - rtt_sample) | rttvar = 3/4 * rttvar + 1/4 * (smoothed_rtt - rtt_sample) | |||
| smoothed_rtt = 7/8 * smoothed_rtt + 1/8 * rtt_sample | smoothed_rtt = 7/8 * smoothed_rtt + 1/8 * rtt_sample | |||
| 3.7. On Packet Acknowledgment | 3.6. On Packet Acknowledgment | |||
| When a packet is acked for the first time, the following | When a packet is acked for the first time, the following | |||
| OnPacketAcked function is called. Note that a single ACK frame may | OnPacketAcked function is called. Note that a single ACK frame may | |||
| newly acknowledge several packets. OnPacketAcked must be called once | newly acknowledge several packets. OnPacketAcked must be called once | |||
| for each of these newly acked packets. | for each of these newly acked packets. | |||
| OnPacketAcked takes one parameter, acked_packet, which is the packet | OnPacketAcked takes one parameter, acked_packet, which is the packet | |||
| number of the newly acked packet, and returns a list of packet | number of the newly acked packet, and returns a list of packet | |||
| numbers that are detected as lost. | numbers that are detected as lost. | |||
| Pseudocode for OnPacketAcked follows: | Pseudocode for OnPacketAcked follows: | |||
| OnPacketAcked(acked_packet): | OnPacketAcked(acked_packet_number): | |||
| handshake_count = 0; | handshake_count = 0 | |||
| tlp_count = 0; | tlp_count = 0 | |||
| rto_count = 0; | rto_count = 0 | |||
| # TODO: Don't remove packets immediately, since they can be used for | sent_packets.remove(acked_packet_number) | |||
| # detecting spurous retransmits. | ||||
| sent_packets.remove(acked_packet); | ||||
| 3.8. On Alarm Firing | ||||
| QUIC uses one loss recovery alarm, which when set, can be in one of | 3.7. Setting the Loss Detection Alarm | |||
| several modes. When the alarm fires, the mode determines the action | ||||
| to be performed. OnAlarm returns a list of packet numbers that are | ||||
| detected as lost. | ||||
| Pseudocode for OnAlarm follows: | QUIC loss detection uses a single alarm for all timer-based loss | |||
| detection. The duration of the alarm is based on the alarm's mode, | ||||
| which is set in the packet and timer events further below. The | ||||
| function SetLossDetectionAlarm defined below shows how the single | ||||
| timer is set based on the alarm mode. | ||||
| OnAlarm(acked_packet): | 3.7.1. Handshake Packets | |||
| lost_packets = DetectLostPackets(acked_packet); | ||||
| MaybeRetransmitLostPackets(); | ||||
| SetLossDetectionAlarm(); | ||||
| 3.9. Detecting Lost Packets | The initial flight has no prior RTT sample. A client SHOULD remember | |||
| the previous RTT it observed when resumption is attempted and use | ||||
| that for an initial RTT value. If no previous RTT is available, the | ||||
| initial RTT defaults to 200ms. Once an RTT measurement is taken, it | ||||
| MUST replace initial_rtt. | ||||
| Packets in QUIC are only considered lost once a larger packet number | Endpoints MUST retransmit handshake frames if not acknowledged within | |||
| is acknowledged. DetectLostPackets is called every time there is a | a time limit. This time limit will start as the largest of twice the | |||
| new largest packet or if the loss detection alarm fires the previous | rtt value and MinTLPTimeout. Each consecutive handshake | |||
| largest acked packet is supplied. | retransmission doubles the time limit, until an acknowledgement is | |||
| received. | ||||
| DetectLostPackets takes one parameter, acked_packet, which is the | Handshake frames may be cancelled by handshake state transitions. In | |||
| packet number of the largest acked packet, and returns a list of | particular, all non-protected frames SHOULD be no longer be | |||
| packet numbers detected as lost. | transmitted once packet protection is available. | |||
| Pseudocode for DetectLostPackets follows: | When stateless rejects are in use, the connection is considered | |||
| immediately closed once a reject is sent, so no timer is set to | ||||
| retransmit the reject. | ||||
| DetectLostPackets(acked_packet): | Version negotiation packets are always stateless, and MUST be sent | |||
| lost_packets = {}; | once per per handshake packet that uses an unsupported QUIC version, | |||
| foreach (unacked_packet less than acked_packet): | and MAY be sent in response to 0RTT packets. | |||
| if (unacked_packet.time_sent < | ||||
| acked_packet.time_sent - kTimeReorderThreshold * smoothed_rtt): | ||||
| lost_packets.insert(unacked_packet.packet_number); | ||||
| else if (unacked_packet.packet_number < | ||||
| acked_packet.packet_number - reordering_threshold) | ||||
| lost_packets.insert(unacked_packet.packet_number); | ||||
| return lost_packets; | ||||
| 4. Congestion Control | 3.7.2. Tail Loss Probe and Retransmission Timeout | |||
| (describe NewReno-style congestion control for QUIC.) | Tail loss probes [I-D.dukkipati-tcpm-tcp-loss-probe] and | |||
| retransmission timeouts[RFC6298] are an alarm based mechanism to | ||||
| recover from cases when there are outstanding retransmittable | ||||
| packets, but an acknowledgement has not been received in a timely | ||||
| manner. | ||||
| 5. TCP mechanisms in QUIC | 3.7.3. Early Retransmit | |||
| QUIC implements the spirit of a variety of RFCs, Internet drafts, and | Early retransmit [RFC5827] is implemented with a 1/4 RTT timer. It | |||
| other well-known TCP loss recovery mechanisms, though the | is part of QUIC's time based loss detection, but is always enabled, | |||
| implementation details differ from the TCP implementations. | even when only packet reordering loss detection is enabled. | |||
| 5.1. RFC 6298 (RTO computation) | 3.7.4. Pseudocode | |||
| QUIC calculates SRTT and RTTVAR according to the standard formulas. | Pseudocode for SetLossDetectionAlarm follows: | |||
| An RTT sample is only taken if the delayed ack correction is smaller | ||||
| than the measured RTT (otherwise a negative RTT would result), and | ||||
| the ack's contains a new, larger largest observed packet number. | ||||
| min_rtt is only based on the observed RTT, but SRTT uses the delayed | ||||
| ack correction delta. | ||||
| As described above, QUIC implements RTO with the standard timeout and | SetLossDetectionAlarm(): | |||
| CWND reduction. However, QUIC retransmits the earliest outstanding | if (retransmittable packets are not outstanding): | |||
| packets rather than the latest, because QUIC doesn't have | loss_detection_alarm.cancel(); | |||
| retransmission ambiguity. QUIC uses the commonly accepted min RTO of | return | |||
| 200ms instead of the 1s the RFC specifies. | ||||
| 5.2. FACK Loss Recovery (paper) | if (handshake packets are outstanding): | |||
| // Handshake retransmission alarm. | ||||
| if (smoothed_rtt == 0): | ||||
| alarm_duration = 2 * initial_rtt | ||||
| else: | ||||
| alarm_duration = 2 * smoothed_rtt | ||||
| alarm_duration = max(alarm_duration, kMinTLPTimeout) | ||||
| alarm_duration = alarm_duration << handshake_count | ||||
| else if (loss_time != 0): | ||||
| // Early retransmit timer or time loss detection. | ||||
| alarm_duration = loss_time - now | ||||
| else if (tlp_count < kMaxTLPs): | ||||
| // Tail Loss Probe | ||||
| if (retransmittable_packets_outstanding = 1): | ||||
| alarm_duration = 1.5 * smoothed_rtt + kDelayedAckTimeout | ||||
| else: | ||||
| alarm_duration = kMinTLPTimeout | ||||
| alarm_duration = max(alarm_duration, 2 * smoothed_rtt) | ||||
| else: | ||||
| // RTO alarm | ||||
| if (rto_count = 0): | ||||
| alarm_duration = smoothed_rtt + 4 * rttvar | ||||
| alarm_duration = max(alarm_duration, kMinRTOTimeout) | ||||
| else: | ||||
| alarm_duration = loss_detection_alarm.get_delay() << 1 | ||||
| QUIC implements the algorithm for early loss recovery described in | loss_detection_alarm.set(now + alarm_duration) | |||
| the FACK paper (and implemented in the Linux kernel.) QUIC uses the | ||||
| packet number to measure the FACK reordering threshold. Currently | ||||
| QUIC does not implement an adaptive threshold as many TCP | ||||
| implementations (i.e., the Linux kernel) do. | ||||
| 5.3. RFC 3782, RFC 6582 (NewReno Fast Recovery) | 3.8. On Alarm Firing | |||
| QUIC only reduces its CWND once per congestion window, in keeping | QUIC uses one loss recovery alarm, which when set, can be in one of | |||
| with the NewReno RFC. It tracks the largest outstanding packet at | several modes. When the alarm fires, the mode determines the action | |||
| the time the loss is declared and any losses which occur before that | to be performed. | |||
| packet number are considered part of the same loss event. It's worth | ||||
| noting that some TCP implementations may do this on a sequence number | ||||
| basis, and hence consider multiple losses of the same packet a single | ||||
| loss event. | ||||
| 5.4. TLP (draft) | Pseudocode for OnLossDetectionAlarm follows: | |||
| QUIC always sends two tail loss probes before RTO is triggered. QUIC | OnLossDetectionAlarm(): | |||
| invokes tail loss probe even when a loss is outstanding, which is | if (handshake packets are outstanding): | |||
| different than some TCP implementations. | // Handshake retransmission alarm. | |||
| RetransmitAllHandshakePackets(); | ||||
| handshake_count++; | ||||
| // TODO: Clarify early retransmit and time loss. | ||||
| else if (loss_time != 0): | ||||
| // Early retransmit or Time Loss Detection | ||||
| DetectLostPackets(largest_acked_packet) | ||||
| else if (tlp_count < kMaxTLPs): | ||||
| // Tail Loss Probe. | ||||
| if (HasNewDataToSend()): | ||||
| SendOnePacketOfNewData() | ||||
| else: | ||||
| RetransmitOldestPacket() | ||||
| tlp_count++ | ||||
| else: | ||||
| // RTO. | ||||
| RetransmitOldestTwoPackets() | ||||
| rto_count++ | ||||
| 5.5. RFC 5827 (Early Retransmit) with Delay Timer | SetLossDetectionAlarm() | |||
| QUIC implements early retransmit with a timer in order to minimize | 3.9. Detecting Lost Packets | |||
| spurious retransmits. The timer is set to 1/4 SRTT after the final | ||||
| outstanding packet is acked. | ||||
| 5.6. RFC 5827 (F-RTO) | Packets in QUIC are only considered lost once a larger packet number | |||
| is acknowledged. DetectLostPackets is called every time an ack is | ||||
| received. If the loss detection alarm fires and the loss_time is | ||||
| set, the previous largest acked packet is supplied. | ||||
| QUIC implements F-RTO by not reducing the CWND and SSThresh until a | 3.9.1. Handshake Packets | |||
| subsequent ack is received and it's sure the RTO was not spurious. | ||||
| Conceptually this is similar, but it makes for a much cleaner | ||||
| implementation with fewer edge cases. | ||||
| 5.7. RFC 6937 (Proportional Rate Reduction) | The receiver MUST ignore unprotected packets that ack protected | |||
| packets. The receiver MUST trust protected acks for unprotected | ||||
| packets, however. Aside from this, loss detection for handshake | ||||
| packets when an ack is processed is identical to other packets. | ||||
| PRR-SSRB is implemented by QUIC in the epoch when recovering from a | 3.9.2. Pseudocode | |||
| loss. | ||||
| 5.8. TCP Cubic (draft) with optional RFC 5681 (Reno) | DetectLostPackets takes one parameter, acked, which is the largest | |||
| acked packet. | ||||
| TCP Cubic is the default congestion control algorithm in QUIC. Reno | Pseudocode for DetectLostPackets follows: | |||
| is also an easily available option which may be requested via | ||||
| connection options and is fully implemented. | ||||
| 5.9. Hybrid Slow Start (paper) | DetectLostPackets(largest_acked): | |||
| loss_time = 0 | ||||
| lost_packets = {} | ||||
| delay_until_lost = infinite; | ||||
| if (time_reordering_fraction != infinite): | ||||
| delay_until_lost = | ||||
| (1 + time_reordering_fraction) * max(latest_rtt, smoothed_rtt) | ||||
| else if (largest_acked.packet_number == largest_sent_packet): | ||||
| // Early retransmit alarm. | ||||
| delay_until_lost = 9/8 * max(latest_rtt, smoothed_rtt) | ||||
| foreach (unacked less than largest_acked.packet_number): | ||||
| time_since_sent = now() - unacked.time_sent | ||||
| packet_delta = largest_acked.packet_number - unacked.packet_number | ||||
| if (time_since_sent > delay_until_lost): | ||||
| lost_packets.insert(unacked) | ||||
| else if (packet_delta > reordering_threshold) | ||||
| lost_packets.insert(unacked) | ||||
| else if (loss_time == 0 && delay_until_lost != infinite): | ||||
| loss_time = delay_until_lost - time_since_sent | ||||
| QUIC implements hybrid slow start, but disables ack train detection, | // Inform the congestion controller of lost packets and | |||
| because it has shown to falsely trigger when coupled with packet | // lets it decide whether to retransmit immediately. | |||
| pacing, which is also on by default in QUIC. Currently the minimum | OnPacketsLost(lost_packets) | |||
| delay increase is 4ms, the maximum is 16ms, and within that range | foreach (packet in lost_packets) | |||
| QUIC exits slow start if the min_rtt within a round increases by more | sent_packets.remove(packet.packet_number) | |||
| than one eighth of the connection mi | ||||
| 5.10. RACK (draft) | 4. Congestion Control | |||
| QUIC's loss detection is by it's time-ordered nature, very similar to | (describe NewReno-style congestion control [RFC6582] for QUIC.) | |||
| RACK. Though QUIC defaults to loss detection based on reordering | (describe appropriate byte counting.) (define recovery based on | |||
| threshold in packets, it could just as easily be based on fractions | packet numbers.) (describe min_rtt based hystart.) (describe how | |||
| of an rtt, as RACK does. | QUIC's F-RTO [RFC5682] delays reducing CWND.) (describe PRR | |||
| [RFC6937]) | ||||
| 6. IANA Considerations | 5. IANA Considerations | |||
| This document has no IANA actions. Yet. | This document has no IANA actions. Yet. | |||
| 7. Normative References | 6. References | |||
| [QUIC-TLS] | 6.1. Normative References | |||
| Thomson, M., Ed. and S. Turner, Ed, Ed., "Using Transport | ||||
| Layer Security (TLS) to Secure QUIC". | ||||
| [QUIC-TRANSPORT] | [QUIC-TRANSPORT] | |||
| Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based | Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based | |||
| Multiplexed and Secure Transport". | Multiplexed and Secure Transport". | |||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
| DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
| <http://www.rfc-editor.org/info/rfc2119>. | <http://www.rfc-editor.org/info/rfc2119>. | |||
| 6.2. Informative References | ||||
| [I-D.dukkipati-tcpm-tcp-loss-probe] | ||||
| Dukkipati, N., Cardwell, N., Cheng, Y., and M. Mathis, | ||||
| "Tail Loss Probe (TLP): An Algorithm for Fast Recovery of | ||||
| Tail Losses", draft-dukkipati-tcpm-tcp-loss-probe-01 (work | ||||
| in progress), February 2013. | ||||
| [RFC5682] Sarolahti, P., Kojo, M., Yamamoto, K., and M. Hata, | ||||
| "Forward RTO-Recovery (F-RTO): An Algorithm for Detecting | ||||
| Spurious Retransmission Timeouts with TCP", RFC 5682, | ||||
| DOI 10.17487/RFC5682, September 2009, | ||||
| <http://www.rfc-editor.org/info/rfc5682>. | ||||
| [RFC5827] Allman, M., Avrachenkov, K., Ayesta, U., Blanton, J., and | ||||
| P. Hurtig, "Early Retransmit for TCP and Stream Control | ||||
| Transmission Protocol (SCTP)", RFC 5827, | ||||
| DOI 10.17487/RFC5827, May 2010, | ||||
| <http://www.rfc-editor.org/info/rfc5827>. | ||||
| [RFC6298] Paxson, V., Allman, M., Chu, J., and M. Sargent, | [RFC6298] Paxson, V., Allman, M., Chu, J., and M. Sargent, | |||
| "Computing TCP's Retransmission Timer", RFC 6298, | "Computing TCP's Retransmission Timer", RFC 6298, | |||
| DOI 10.17487/RFC6298, June 2011, | DOI 10.17487/RFC6298, June 2011, | |||
| <http://www.rfc-editor.org/info/rfc6298>. | <http://www.rfc-editor.org/info/rfc6298>. | |||
| [RFC6582] Henderson, T., Floyd, S., Gurtov, A., and Y. Nishida, "The | ||||
| NewReno Modification to TCP's Fast Recovery Algorithm", | ||||
| RFC 6582, DOI 10.17487/RFC6582, April 2012, | ||||
| <http://www.rfc-editor.org/info/rfc6582>. | ||||
| [RFC6937] Mathis, M., Dukkipati, N., and Y. Cheng, "Proportional | ||||
| Rate Reduction for TCP", RFC 6937, DOI 10.17487/RFC6937, | ||||
| May 2013, <http://www.rfc-editor.org/info/rfc6937>. | ||||
| Appendix A. Acknowledgments | Appendix A. Acknowledgments | |||
| Appendix B. Change Log | Appendix B. Change Log | |||
| *RFC Editor's Note:* Please remove this section prior to | *RFC Editor's Note:* Please remove this section prior to | |||
| publication of a final version of this document. | publication of a final version of this document. | |||
| B.1. Since draft-ietf-quic-recovery-00: | B.1. Since draft-ietf-quic-recovery-01 | |||
| o Changes initial default RTT to 100ms | ||||
| o Added time-based loss detection and fixes early retransmit | ||||
| o Clarified loss recovery for handshake packets | ||||
| o Fixed references and made TCP references informative | ||||
| B.2. Since draft-ietf-quic-recovery-00: | ||||
| o Improved description of constants and ACK behavior | o Improved description of constants and ACK behavior | |||
| B.2. Since draft-iyengar-quic-loss-recovery-01: | B.3. Since draft-iyengar-quic-loss-recovery-01: | |||
| o Adopted as base for draft-ietf-quic-recovery. | o Adopted as base for draft-ietf-quic-recovery. | |||
| o Updated authors/editors list. | o Updated authors/editors list. | |||
| o Added table of contents. | o Added table of contents. | |||
| Authors' Addresses | Authors' Addresses | |||
| Jana Iyengar (editor) | Jana Iyengar (editor) | |||
| End of changes. 76 change blocks. | ||||
| 222 lines changed or deleted | 280 lines changed or added | |||
This html diff was produced by rfcdiff 1.45. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ | ||||