| draft-ietf-quic-recovery-05.txt | draft-ietf-quic-recovery-06.txt | |||
|---|---|---|---|---|
| QUIC J. Iyengar, Ed. | QUIC J. Iyengar, Ed. | |||
| Internet-Draft I. Swett, Ed. | Internet-Draft I. Swett, Ed. | |||
| Intended status: Standards Track Google | Intended status: Standards Track Google | |||
| Expires: February 16, 2018 August 15, 2017 | Expires: March 26, 2018 September 22, 2017 | |||
| QUIC Loss Detection and Congestion Control | QUIC Loss Detection and Congestion Control | |||
| draft-ietf-quic-recovery-05 | draft-ietf-quic-recovery-06 | |||
| Abstract | Abstract | |||
| This document describes loss detection and congestion control | This document describes loss detection and congestion control | |||
| mechanisms for QUIC. | mechanisms for QUIC. | |||
| Note to Readers | Note to Readers | |||
| Discussion of this draft takes place on the QUIC working group | Discussion of this draft takes place on the QUIC working group | |||
| mailing list (quic@ietf.org), which is archived at | mailing list (quic@ietf.org), which is archived at | |||
| https://mailarchive.ietf.org/arch/search/?email_list=quic. | https://mailarchive.ietf.org/arch/search/?email_list=quic. | |||
| Working Group information can be found at https://github.com/quicwg; | Working Group information can be found at https://github.com/quicwg; | |||
| source code and issues list for this draft can be found at | source code and issues list for this draft can be found at | |||
| https://github.com/quicwg/base-drafts/labels/recovery. | https://github.com/quicwg/base-drafts/labels/recovery. | |||
| Status of This Memo | Status of This Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on February 16, 2018. | This Internet-Draft will expire on March 26, 2018. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2017 IETF Trust and the persons identified as the | Copyright (c) 2017 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at page 2, line 19 | skipping to change at page 2, line 19 | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| described in the Simplified BSD License. | described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 1.1. Notational Conventions . . . . . . . . . . . . . . . . . 3 | 1.1. Notational Conventions . . . . . . . . . . . . . . . . . 3 | |||
| 2. Design of the QUIC Transmission Machinery . . . . . . . . . . 3 | 2. Design of the QUIC Transmission Machinery . . . . . . . . . . 3 | |||
| 2.1. Relevant Differences Between QUIC and TCP . . . . . . . . 4 | 2.1. Relevant Differences Between QUIC and TCP . . . . . . . . 4 | |||
| 2.1.1. Monotonically Increasing Packet Numbers . . . . . . . 4 | 2.1.1. Monotonically Increasing Packet Numbers . . . . . . . 4 | |||
| 2.1.2. No Reneging . . . . . . . . . . . . . . . . . . . . . 4 | 2.1.2. No Reneging . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 2.1.3. More ACK Ranges . . . . . . . . . . . . . . . . . . . 5 | 2.1.3. More ACK Ranges . . . . . . . . . . . . . . . . . . . 5 | |||
| 2.1.4. Explicit Correction For Delayed Acks . . . . . . . . 5 | 2.1.4. Explicit Correction For Delayed Acks . . . . . . . . 5 | |||
| 3. Loss Detection . . . . . . . . . . . . . . . . . . . . . . . 5 | 3. Loss Detection . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 3.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . 5 | 3.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 3.2. Algorithm Details . . . . . . . . . . . . . . . . . . . . 6 | 3.2. Algorithm Details . . . . . . . . . . . . . . . . . . . . 6 | |||
| 3.2.1. Constants of interest . . . . . . . . . . . . . . . . 6 | 3.2.1. Constants of interest . . . . . . . . . . . . . . . . 6 | |||
| 3.2.2. Variables of interest . . . . . . . . . . . . . . . . 6 | 3.2.2. Variables of interest . . . . . . . . . . . . . . . . 7 | |||
| 3.2.3. Initialization . . . . . . . . . . . . . . . . . . . 8 | 3.2.3. Initialization . . . . . . . . . . . . . . . . . . . 8 | |||
| 3.2.4. On Sending a Packet . . . . . . . . . . . . . . . . . 8 | 3.2.4. On Sending a Packet . . . . . . . . . . . . . . . . . 8 | |||
| 3.2.5. On Ack Receipt . . . . . . . . . . . . . . . . . . . 9 | 3.2.5. On Ack Receipt . . . . . . . . . . . . . . . . . . . 9 | |||
| 3.2.6. On Packet Acknowledgment . . . . . . . . . . . . . . 9 | 3.2.6. On Packet Acknowledgment . . . . . . . . . . . . . . 9 | |||
| 3.2.7. Setting the Loss Detection Alarm . . . . . . . . . . 10 | 3.2.7. Setting the Loss Detection Alarm . . . . . . . . . . 10 | |||
| 3.2.8. On Alarm Firing . . . . . . . . . . . . . . . . . . . 12 | 3.2.8. On Alarm Firing . . . . . . . . . . . . . . . . . . . 12 | |||
| 3.2.9. Detecting Lost Packets . . . . . . . . . . . . . . . 13 | 3.2.9. Detecting Lost Packets . . . . . . . . . . . . . . . 13 | |||
| 3.3. Discussion . . . . . . . . . . . . . . . . . . . . . . . 14 | 3.3. Discussion . . . . . . . . . . . . . . . . . . . . . . . 14 | |||
| 4. Congestion Control . . . . . . . . . . . . . . . . . . . . . 14 | 4. Congestion Control . . . . . . . . . . . . . . . . . . . . . 14 | |||
| 4.1. Slow Start . . . . . . . . . . . . . . . . . . . . . . . 15 | 4.1. Slow Start . . . . . . . . . . . . . . . . . . . . . . . 15 | |||
| 4.2. Recovery . . . . . . . . . . . . . . . . . . . . . . . . 15 | 4.2. Congestion Avoidance . . . . . . . . . . . . . . . . . . 15 | |||
| 4.3. Constants of interest . . . . . . . . . . . . . . . . . . 15 | 4.3. Recovery Period . . . . . . . . . . . . . . . . . . . . . 15 | |||
| 4.4. Variables of interest . . . . . . . . . . . . . . . . . . 15 | 4.4. Tail Loss Probe . . . . . . . . . . . . . . . . . . . . . 15 | |||
| 4.5. Initialization . . . . . . . . . . . . . . . . . . . . . 16 | 4.5. Retransmission Timeout . . . . . . . . . . . . . . . . . 15 | |||
| 4.6. On Packet Acknowledgement . . . . . . . . . . . . . . . . 16 | 4.6. Pacing Rate . . . . . . . . . . . . . . . . . . . . . . . 16 | |||
| 4.7. On Packets Lost . . . . . . . . . . . . . . . . . . . . . 16 | 4.7. Pseudocode . . . . . . . . . . . . . . . . . . . . . . . 16 | |||
| 4.8. On Retransmission Timeout Verified . . . . . . . . . . . 17 | 4.7.1. Constants of interest . . . . . . . . . . . . . . . . 16 | |||
| 4.9. Pacing Packets . . . . . . . . . . . . . . . . . . . . . 17 | 4.7.2. Variables of interest . . . . . . . . . . . . . . . . 16 | |||
| 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17 | 4.7.3. Initialization . . . . . . . . . . . . . . . . . . . 17 | |||
| 6. References . . . . . . . . . . . . . . . . . . . . . . . . . 17 | 4.7.4. On Packet Sent . . . . . . . . . . . . . . . . . . . 17 | |||
| 6.1. Normative References . . . . . . . . . . . . . . . . . . 17 | 4.7.5. On Packet Acknowledgement . . . . . . . . . . . . . . 17 | |||
| 6.2. Informative References . . . . . . . . . . . . . . . . . 17 | 4.7.6. On Packets Lost . . . . . . . . . . . . . . . . . . . 17 | |||
| Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 18 | 4.7.7. On Retransmission Timeout Verified . . . . . . . . . 18 | |||
| Appendix B. Change Log . . . . . . . . . . . . . . . . . . . . . 18 | 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18 | |||
| B.1. Since draft-ietf-quic-recovery-04 . . . . . . . . . . . . 18 | 6. References . . . . . . . . . . . . . . . . . . . . . . . . . 18 | |||
| B.2. Since draft-ietf-quic-recovery-03 . . . . . . . . . . . . 18 | 6.1. Normative References . . . . . . . . . . . . . . . . . . 18 | |||
| B.3. Since draft-ietf-quic-recovery-02 . . . . . . . . . . . . 18 | 6.2. Informative References . . . . . . . . . . . . . . . . . 18 | |||
| B.4. Since draft-ietf-quic-recovery-01 . . . . . . . . . . . . 19 | Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 19 | |||
| B.5. Since draft-ietf-quic-recovery-00 . . . . . . . . . . . . 19 | Appendix B. Change Log . . . . . . . . . . . . . . . . . . . . . 19 | |||
| B.6. Since draft-iyengar-quic-loss-recovery-01 . . . . . . . . 19 | B.1. Since draft-ietf-quic-recovery-05 . . . . . . . . . . . . 19 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 19 | B.2. Since draft-ietf-quic-recovery-04 . . . . . . . . . . . . 19 | |||
| | B.3. Since draft-ietf-quic-recovery-03 . . . . . . . . . . . . 19 | |||
| | B.4. Since draft-ietf-quic-recovery-02 . . . . . . . . . . . . 20 | |||
| | B.5. Since draft-ietf-quic-recovery-01 . . . . . . . . . . . . 20 | |||
| | B.6. Since draft-ietf-quic-recovery-00 . . . . . . . . . . . . 20 | |||
| | B.7. Since draft-iyengar-quic-loss-recovery-01 . . . . . . . . 20 | |||
| | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 20 | |||
| 1. Introduction | 1. Introduction | |||
| QUIC is a new multiplexed and secure transport atop UDP. QUIC builds | QUIC is a new multiplexed and secure transport atop UDP. QUIC builds | |||
| on decades of transport and security experience, and implements | on decades of transport and security experience, and implements | |||
| mechanisms that make it attractive as a modern general-purpose | mechanisms that make it attractive as a modern general-purpose | |||
| transport. The QUIC protocol is described in [QUIC-TRANSPORT]. | transport. The QUIC protocol is described in [QUIC-TRANSPORT]. | |||
| QUIC implements the spirit of known TCP loss recovery mechanisms, | QUIC implements the spirit of known TCP loss recovery mechanisms, | |||
| described in RFCs, various Internet-drafts, and also those prevalent | described in RFCs, various Internet-drafts, and also those prevalent | |||
| skipping to change at page 5, line 46 | skipping to change at page 6, line 4 | |||
| kReorderingThreshold packets are sent after it, the sender cannot | kReorderingThreshold packets are sent after it, the sender cannot | |||
| expect to detect loss based on the previous mechanism. In this | expect to detect loss based on the previous mechanism. In this | |||
| case, a sender uses both ack information and an alarm to detect | case, a sender uses both ack information and an alarm to detect | |||
| loss. Specifically, when the last sent packet is acknowledged, | loss. Specifically, when the last sent packet is acknowledged, | |||
| the sender waits a short period of time to allow for reordering | the sender waits a short period of time to allow for reordering | |||
| and then marks any unacknowledged packets as lost. This mechanism | and then marks any unacknowledged packets as lost. This mechanism | |||
| is based on the Linux implementation of TCP Early Retransmit. | is based on the Linux implementation of TCP Early Retransmit. | |||
| o If a packet is sent at the tail, there are no packets sent after | o If a packet is sent at the tail, there are no packets sent after | |||
| it, and the sender cannot use ack information to detect its loss. | it, and the sender cannot use ack information to detect its loss. | |||
| The sender therefore relies on an alarm to detect such tail | The sender therefore relies on an alarm to detect such tail | |||
| losses. This mechanism is based on TCP's Tail Loss Probe. | losses. This mechanism is based on TCP's Tail Loss Probe. | |||
| o If all else fails, a Retransmission Timeout (RTO) alarm is always | o If all else fails, a Retransmission Timeout (RTO) alarm is always | |||
| set when any retransmittable packet is outstanding. When this | set when any retransmittable packet is outstanding. When this | |||
| alarm fires, all unacknowledged packets are marked as lost. | alarm fires, all unacknowledged packets are marked as lost. | |||
| o Instead of a packet threshold to tolerate reordering, a QUIC | o Instead of a packet threshold to tolerate reordering, a QUIC | |||
| sender may use a time threshold. This allows for senders to be | sender may use a time threshold. This allows for senders to be | |||
| tolerant of short periods of significant reordering. In this | tolerant of short periods of significant reordering. In this | |||
| mechanism, a QUIC sender marks a packet as lost when a packet | mechanism, a QUIC sender marks a packet as lost when a larger | |||
| larger than it is acknowledged and a threshold amount of time has | packet number is acknowledged and a threshold amount of time has | |||
| passed since the packet was sent. | passed since the packet was sent. | |||
| o Handshake packets, which contain STREAM frames for stream 0, are | o Handshake packets, which contain STREAM frames for stream 0, are | |||
| critical to QUIC transport and crypto negotiation, so a separate | critical to QUIC transport and crypto negotiation, so a separate | |||
| alarm period is used for them. | alarm period is used for them. | |||
| 3.2. Algorithm Details | 3.2. Algorithm Details | |||
| 3.2.1. Constants of interest | 3.2.1. Constants of interest | |||
| skipping to change at page 8, line 35 | skipping to change at page 8, line 39 | |||
| largest_sent_packet = 0 | largest_sent_packet = 0 | |||
| 3.2.4. On Sending a Packet | 3.2.4. On Sending a Packet | |||
| After any packet is sent, be it a new transmission or a rebundled | After any packet is sent, be it a new transmission or a rebundled | |||
| transmission, the following OnPacketSent function is called. The | transmission, the following OnPacketSent function is called. The | |||
| parameters to OnPacketSent are as follows: | parameters to OnPacketSent are as follows: | |||
| o packet_number: The packet number of the sent packet. | o packet_number: The packet number of the sent packet. | |||
| o is_retransmittable: A boolean that indicates whether the packet | o is_ack_only: A boolean that indicates whether a packet only | |||
| contains at least one frame requiring reliable delivery. The | contains an ACK frame. If true, it is still expected an ack will | |||
| retransmittability of various QUIC frames is described in | be received for this packet, but it is not congestion controlled. | |||
| [QUIC-TRANSPORT]. If false, it is still acceptable for an ack to | ||||
| be received for this packet. However, a caller MUST NOT set | ||||
| is_retransmittable to true if an ack is not expected. | ||||
| o sent_bytes: The number of bytes sent in the packet. | o sent_bytes: The number of bytes sent in the packet, not including | |||
| | UDP or IP overhead, but including QUIC framing overhead. | |||
| Pseudocode for OnPacketSent follows: | Pseudocode for OnPacketSent follows: | |||
| OnPacketSent(packet_number, is_retransmittable, sent_bytes): | OnPacketSent(packet_number, is_ack_only, sent_bytes): | |||
| time_of_last_sent_packet = now | time_of_last_sent_packet = now | |||
| largest_sent_packet = packet_number | largest_sent_packet = packet_number | |||
| sent_packets[packet_number].packet_number = packet_number | sent_packets[packet_number].packet_number = packet_number | |||
| sent_packets[packet_number].time = now | sent_packets[packet_number].time = now | |||
| if is_retransmittable: | if !is_ack_only: | |||
| | OnPacketSentCC(sent_bytes) | |||
| sent_packets[packet_number].bytes = sent_bytes | sent_packets[packet_number].bytes = sent_bytes | |||
| SetLossDetectionAlarm() | SetLossDetectionAlarm() | |||
| 3.2.5. On Ack Receipt | 3.2.5. On Ack Receipt | |||
| When an ack is received, it may acknowledge 0 or more packets. | When an ack is received, it may acknowledge 0 or more packets. | |||
| Pseudocode for OnAckReceived and UpdateRtt follow: | Pseudocode for OnAckReceived and UpdateRtt follow: | |||
| OnAckReceived(ack): | OnAckReceived(ack): | |||
| skipping to change at page 9, line 41 | skipping to change at page 9, line 42 | |||
| DetectLostPackets(ack.largest_acked_packet) | DetectLostPackets(ack.largest_acked_packet) | |||
| SetLossDetectionAlarm() | SetLossDetectionAlarm() | |||
| UpdateRtt(latest_rtt): | UpdateRtt(latest_rtt): | |||
| // Based on {{RFC6298}}. | // Based on {{RFC6298}}. | |||
| if (smoothed_rtt == 0): | if (smoothed_rtt == 0): | |||
| smoothed_rtt = latest_rtt | smoothed_rtt = latest_rtt | |||
| rttvar = latest_rtt / 2 | rttvar = latest_rtt / 2 | |||
| else: | else: | |||
| rttvar = 3/4 * rttvar + 1/4 * (smoothed_rtt - latest_rtt) | rttvar = 3/4 * rttvar + 1/4 * abs(smoothed_rtt - latest_rtt) | |||
| smoothed_rtt = 7/8 * smoothed_rtt + 1/8 * latest_rtt | smoothed_rtt = 7/8 * smoothed_rtt + 1/8 * latest_rtt | |||
| 3.2.6. On Packet Acknowledgment | 3.2.6. On Packet Acknowledgment | |||
| When a packet is acked for the first time, the following | When a packet is acked for the first time, the following | |||
| OnPacketAcked function is called. Note that a single ACK frame may | OnPacketAcked function is called. Note that a single ACK frame may | |||
| newly acknowledge several packets. OnPacketAcked must be called once | newly acknowledge several packets. OnPacketAcked must be called once | |||
| for each of these newly acked packets. | for each of these newly acked packets. | |||
| OnPacketAcked takes one parameter, acked_packet, which is the packet | OnPacketAcked takes one parameter, acked_packet, which is the packet | |||
| skipping to change at page 12, line 23 | skipping to change at page 12, line 23 | |||
| alarm_duration = 2 * kDefaultInitialRtt | alarm_duration = 2 * kDefaultInitialRtt | |||
| else: | else: | |||
| alarm_duration = 2 * smoothed_rtt | alarm_duration = 2 * smoothed_rtt | |||
| alarm_duration = max(alarm_duration, kMinTLPTimeout) | alarm_duration = max(alarm_duration, kMinTLPTimeout) | |||
| alarm_duration = alarm_duration * (2 ^ handshake_count) | alarm_duration = alarm_duration * (2 ^ handshake_count) | |||
| else if (loss_time != 0): | else if (loss_time != 0): | |||
| // Early retransmit timer or time loss detection. | // Early retransmit timer or time loss detection. | |||
| alarm_duration = loss_time - now | alarm_duration = loss_time - now | |||
| else if (tlp_count < kMaxTLPs): | else if (tlp_count < kMaxTLPs): | |||
| // Tail Loss Probe | // Tail Loss Probe | |||
| if (retransmittable_packets_outstanding = 1): | if (retransmittable_packets_outstanding == 1): | |||
| alarm_duration = 1.5 * smoothed_rtt + kDelayedAckTimeout | alarm_duration = 1.5 * smoothed_rtt + kDelayedAckTimeout | |||
| else: | else: | |||
| alarm_duration = kMinTLPTimeout | alarm_duration = kMinTLPTimeout | |||
| alarm_duration = max(alarm_duration, 2 * smoothed_rtt) | alarm_duration = max(alarm_duration, 2 * smoothed_rtt) | |||
| else: | else: | |||
| // RTO alarm | // RTO alarm | |||
| alarm_duration = smoothed_rtt + 4 * rttvar | alarm_duration = smoothed_rtt + 4 * rttvar | |||
| alarm_duration = max(alarm_duration, kMinRTOTimeout) | alarm_duration = max(alarm_duration, kMinRTOTimeout) | |||
| alarm_duration = alarm_duration * (2 ^ rto_count) | alarm_duration = alarm_duration * (2 ^ rto_count) | |||
| skipping to change at page 14, line 50 | skipping to change at page 14, line 50 | |||
| where less than one packet per 25ms is delivered, acking every packet is | where less than one packet per 25ms is delivered, acking every packet is | |||
| beneficial to congestion control and loss recovery. | beneficial to congestion control and loss recovery. | |||
| The default initial RTT of 100ms was chosen because it is slightly | The default initial RTT of 100ms was chosen because it is slightly | |||
| higher than both the median and mean min_rtt typically observed on | higher than both the median and mean min_rtt typically observed on | |||
| the public internet. | the public internet. | |||
| 4. Congestion Control | 4. Congestion Control | |||
| QUIC's congestion control is based on TCP NewReno[RFC6582] congestion | QUIC's congestion control is based on TCP NewReno[RFC6582] congestion | |||
| control to determine the congestion window and pacing rate. | control to determine the congestion window and pacing rate. QUIC | |||
| | congestion control is specified in bytes due to finer control and the | |||
| | ease of appropriate byte counting[RFC3465]. | |||
| 4.1. Slow Start | 4.1. Slow Start | |||
| QUIC begins every connection in slow start and exits slow start upon | QUIC begins every connection in slow start and exits slow start upon | |||
| loss. While in slow start, QUIC increases the congestion window by | loss. QUIC re-enters slow start after a retransmission timeout. | |||
| the number of acknowledged bytes when each ack is processed. | While in slow start, QUIC increases the congestion window by the | |||
| | number of acknowledged bytes when each ack is processed. | |||
| 4.2. Recovery | 4.2. Congestion Avoidance | |||
| | Slow start exits to congestion avoidance. Congestion avoidance in | |||
| | NewReno uses an additive increase multiplicative decrease (AIMD) | |||
| | approach that increases the congestion window by one MSS of bytes per | |||
| | congestion window acknowledged. When a loss is detected, NewReno | |||
| | halves the congestion window and sets the slow start threshold to the | |||
| | new congestion window. | |||
| | 4.3. Recovery Period | |||
| Recovery is a period of time beginning with detection of a lost | Recovery is a period of time beginning with detection of a lost | |||
| packet. It ends when all packets outstanding at the time recovery | packet. Because QUIC retransmits frames, not packets, it defines the | |||
| began have been acknowledged or lost. During recovery, the | end of recovery as all packets outstanding at the start of recovery | |||
| congestion window is not increased or decreased. | being acknowledged or lost. This is slightly different from TCP's | |||
| | definition of recovery ending when the lost packet that started | |||
| | recovery is acknowledged. During recovery, the congestion window is | |||
| | not increased or decreased. As such, multiple lost packets only | |||
| | decrease the congestion window once as long as they're lost before | |||
| | exiting recovery. This causes QUIC to decrease the congestion window | |||
| | multiple times if retransmissions are lost, but limits the reduction | |||
| | to once per round trip. | |||
| 4.3. Constants of interest | 4.4. Tail Loss Probe | |||
| | If recovery sends a tail loss probe, no change is made to the | |||
| | congestion window or pacing rate. Acknowledgement or loss of tail | |||
| | loss probes are treated like any other packet. | |||
| | 4.5. Retransmission Timeout | |||
| | When retransmissions are sent due to a retransmission timeout alarm, | |||
| | no change is made to the congestion window or pacing rate until the | |||
| | next acknowledgement arrives. When an ack arrives, if packets prior | |||
| | to the first retransmission timeout are acknowledged, then the | |||
| | congestion window remains the same. If no packets prior to the first | |||
| | retransmission timeout are acknowledged, the retransmission timeout | |||
| | has been validated and the congestion window must be reduced to the | |||
| | minimum congestion window and slow start is begun. | |||
| | 4.6. Pacing Rate | |||
| | The pacing rate is a function of the mode, the congestion window, and | |||
| | the smoothed RTT. Specifically, the pacing rate is 2 times the | |||
| | congestion window divided by the smoothed RTT during slow start and | |||
| | 1.25 times the congestion window divided by the smoothed RTT during | |||
| | congestion avoidance. In order to fairly compete with flows that are not | |||
| | pacing, it is recommended to not pace the first 10 sent packets when | |||
| | exiting quiescence. | |||
| | 4.7. Pseudocode | |||
| | 4.7.1. Constants of interest | |||
| Constants used in congestion control are based on a combination of | Constants used in congestion control are based on a combination of | |||
| RFCs, papers, and common practice. Some may need to be changed or | RFCs, papers, and common practice. Some may need to be changed or | |||
| negotiated in order to better suit a variety of environments. | negotiated in order to better suit a variety of environments. | |||
| kDefaultMss (default 1460 bytes): The default max packet size used | kDefaultMss (default 1460 bytes): The default max packet size used | |||
| for calculating default and minimum congestion windows. | for calculating default and minimum congestion windows. | |||
| kInitialWindow (default 10 * kDefaultMss): Default limit on the | kInitialWindow (default 10 * kDefaultMss): Default limit on the | |||
| amount of outstanding data in bytes. | amount of outstanding data in bytes. | |||
| kMinimumWindow (default 2 * kDefaultMss): Default minimum congestion | kMinimumWindow (default 2 * kDefaultMss): Default minimum congestion | |||
| window. | window. | |||
| kLossReductionFactor (default 0.5): Reduction in congestion window | kLossReductionFactor (default 0.5): Reduction in congestion window | |||
| when a new loss event is detected. | when a new loss event is detected. | |||
| 4.4. Variables of interest | 4.7.2. Variables of interest | |||
| Variables required to implement the congestion control mechanisms are | Variables required to implement the congestion control mechanisms are | |||
| described in this section. | described in this section. | |||
| bytes_in_flight: The sum of the size in bytes of all sent packets | bytes_in_flight: The sum of the size in bytes of all sent packets | |||
| that contain at least one retransmittable or PADDING frame, and | that contain at least one retransmittable or PADDING frame, and | |||
| have not been acked or declared lost. The size does not include | have not been acked or declared lost. The size does not include | |||
| IP or UDP overhead. Ack only frames do not count towards | IP or UDP overhead. Packets only containing ack frames do not | |||
| bytes_in_flight. | count towards bytes_in_flight to ensure congestion control does not | |||
| | impede congestion feedback. | |||
| congestion_window: Maximum number of bytes in flight that may be | congestion_window: Maximum number of bytes in flight that may be | |||
| sent. | sent. | |||
| end_of_recovery: The packet number after which QUIC will no longer | end_of_recovery: The largest packet number sent when QUIC detects a | |||
| be in recovery. | loss. When a larger packet is acknowledged, QUIC exits recovery. | |||
| ssthresh: Slow start threshold in bytes. When the congestion window | ssthresh: Slow start threshold in bytes. When the congestion window | |||
| is below ssthresh, it grows by the number of bytes acknowledged | is below ssthresh, the mode is slow start and the window grows by | |||
| for each ack. | the number of bytes acknowledged. | |||
| 4.5. Initialization | 4.7.3. Initialization | |||
| At the beginning of the connection, initialize the loss detection | At the beginning of the connection, initialize the congestion control | |||
| variables as follows: | variables as follows: | |||
| congestion_window = kInitialWindow | congestion_window = kInitialWindow | |||
| bytes_in_flight = 0 | bytes_in_flight = 0 | |||
| end_of_recovery = 0 | end_of_recovery = 0 | |||
| ssthresh = infinite | ssthresh = infinite | |||
| 4.6. On Packet Acknowledgement | 4.7.4. On Packet Sent | |||
| | Whenever a packet is sent, and it contains non-ACK frames, the packet | |||
| | increases bytes_in_flight. | |||
| | OnPacketSentCC(bytes_sent): | |||
| | bytes_in_flight += bytes_sent | |||
| | 4.7.5. On Packet Acknowledgement | |||
| Invoked from loss detection's OnPacketAcked and is supplied with | Invoked from loss detection's OnPacketAcked and is supplied with | |||
| acked_packet from sent_packets. | acked_packet from sent_packets. | |||
| | Pseudocode for OnPacketAckedCC follows: | |||
| OnPacketAckedCC(acked_packet): | OnPacketAckedCC(acked_packet): | |||
| | // Remove from bytes_in_flight. | |||
| | bytes_in_flight -= acked_packet.bytes | |||
| if (acked_packet.packet_number < end_of_recovery): | if (acked_packet.packet_number < end_of_recovery): | |||
| | // Do not increase congestion window in recovery period. | |||
| return | return | |||
| if (congestion_window < ssthresh): | if (congestion_window < ssthresh): | |||
| congestion_window += acket_packets.bytes | // Slow start. | |||
| | congestion_window += acked_packet.bytes | |||
| else: | else: | |||
| | // Congestion avoidance. | |||
| congestion_window += | congestion_window += | |||
| acked_packets.bytes / congestion_window | kDefaultMss * acked_packet.bytes / congestion_window | |||
| 4.7. On Packets Lost | 4.7.6. On Packets Lost | |||
| Invoked by loss detection from DetectLostPackets when new packets are | Invoked by loss detection from DetectLostPackets when new packets are | |||
| detected lost. | detected lost. | |||
| OnPacketsLost(lost_packets): | OnPacketsLost(lost_packets): | |||
| | // Remove lost packets from bytes_in_flight. | |||
| | for (lost_packet : lost_packets): | |||
| | bytes_in_flight -= lost_packet.bytes | |||
| largest_lost_packet = lost_packets.last() | largest_lost_packet = lost_packets.last() | |||
| // Start a new recovery epoch if the lost packet is larger | // Start a new recovery epoch if the lost packet is larger | |||
| // than the end of the previous recovery epoch. | // than the end of the previous recovery epoch. | |||
| if (end_of_recovery < largest_lost_packet.packet_number): | if (end_of_recovery < largest_lost_packet.packet_number): | |||
| end_of_recovery = largest_sent_packet | end_of_recovery = largest_sent_packet | |||
| congestion_window *= kLossReductionFactor | congestion_window *= kLossReductionFactor | |||
| congestion_window = max(congestion_window, kMinimumWindow) | congestion_window = max(congestion_window, kMinimumWindow) | |||
| ssthresh = congestion_window | ssthresh = congestion_window | |||
| 4.8. On Retransmission Timeout Verified | 4.7.7. On Retransmission Timeout Verified | |||
| QUIC decreases the congestion window to the minimum value once the | QUIC decreases the congestion window to the minimum value once the | |||
| retransmission timeout has been confirmed to not be spurious when the | retransmission timeout has been verified. | |||
| first post-RTO acknowledgement is processed. | ||||
| OnRetransmissionTimeoutVerified() | OnRetransmissionTimeoutVerified() | |||
| congestion_window = kMinimumWindow | congestion_window = kMinimumWindow | |||
| 4.9. Pacing Packets | ||||
| QUIC sends a packet if there is available congestion window and | ||||
| sending the packet does not exceed the pacing rate. | ||||
| TimeToSend returns infinite if the congestion controller is | ||||
| congestion window limited, a time in the past if the packet can be | ||||
| sent immediately, and a time in the future if sending is pacing | ||||
| limited. | ||||
| TimeToSend(packet_size): | ||||
| if (bytes_in_flight + packet_size > congestion_window) | ||||
| return infinite | ||||
| return time_of_last_sent_packet + | ||||
| (packet_size * smoothed_rtt) / congestion_window | ||||
| 5. IANA Considerations | 5. IANA Considerations | |||
| This document has no IANA actions. Yet. | This document has no IANA actions. Yet. | |||
| 6. References | 6. References | |||
| 6.1. Normative References | 6.1. Normative References | |||
| [QUIC-TRANSPORT] | [QUIC-TRANSPORT] | |||
| Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based | Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based | |||
| Multiplexed and Secure Transport", draft-ietf-quic- | Multiplexed and Secure Transport", draft-ietf-quic- | |||
| transport (work in progress), August 2017. | transport (work in progress), September 2017. | |||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
| DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, <https://www.rfc- | |||
| <http://www.rfc-editor.org/info/rfc2119>. | editor.org/info/rfc2119>. | |||
| 6.2. Informative References | 6.2. Informative References | |||
| [LOSS-PROBE] | [LOSS-PROBE] | |||
| Dukkipati, N., Cardwell, N., Cheng, Y., and M. Mathis, | Dukkipati, N., Cardwell, N., Cheng, Y., and M. Mathis, | |||
| "Tail Loss Probe (TLP): An Algorithm for Fast Recovery of | "Tail Loss Probe (TLP): An Algorithm for Fast Recovery of | |||
| Tail Losses", draft-dukkipati-tcpm-tcp-loss-probe-01 (work | Tail Losses", draft-dukkipati-tcpm-tcp-loss-probe-01 (work | |||
| in progress), February 2013. | in progress), February 2013. | |||
| [RFC3465] Allman, M., "TCP Congestion Control with Appropriate Byte | ||||
| Counting (ABC)", RFC 3465, DOI 10.17487/RFC3465, February | ||||
| 2003, <https://www.rfc-editor.org/info/rfc3465>. | ||||
| [RFC5682] Sarolahti, P., Kojo, M., Yamamoto, K., and M. Hata, | [RFC5682] Sarolahti, P., Kojo, M., Yamamoto, K., and M. Hata, | |||
| "Forward RTO-Recovery (F-RTO): An Algorithm for Detecting | "Forward RTO-Recovery (F-RTO): An Algorithm for Detecting | |||
| Spurious Retransmission Timeouts with TCP", RFC 5682, | Spurious Retransmission Timeouts with TCP", RFC 5682, | |||
| DOI 10.17487/RFC5682, September 2009, | DOI 10.17487/RFC5682, September 2009, <https://www.rfc- | |||
| <http://www.rfc-editor.org/info/rfc5682>. | editor.org/info/rfc5682>. | |||
| [RFC5827] Allman, M., Avrachenkov, K., Ayesta, U., Blanton, J., and | [RFC5827] Allman, M., Avrachenkov, K., Ayesta, U., Blanton, J., and | |||
| P. Hurtig, "Early Retransmit for TCP and Stream Control | P. Hurtig, "Early Retransmit for TCP and Stream Control | |||
| Transmission Protocol (SCTP)", RFC 5827, | Transmission Protocol (SCTP)", RFC 5827, | |||
| DOI 10.17487/RFC5827, May 2010, | DOI 10.17487/RFC5827, May 2010, <https://www.rfc- | |||
| <http://www.rfc-editor.org/info/rfc5827>. | editor.org/info/rfc5827>. | |||
| [RFC6298] Paxson, V., Allman, M., Chu, J., and M. Sargent, | [RFC6298] Paxson, V., Allman, M., Chu, J., and M. Sargent, | |||
| "Computing TCP's Retransmission Timer", RFC 6298, | "Computing TCP's Retransmission Timer", RFC 6298, | |||
| DOI 10.17487/RFC6298, June 2011, | DOI 10.17487/RFC6298, June 2011, <https://www.rfc- | |||
| <http://www.rfc-editor.org/info/rfc6298>. | editor.org/info/rfc6298>. | |||
| [RFC6582] Henderson, T., Floyd, S., Gurtov, A., and Y. Nishida, "The | [RFC6582] Henderson, T., Floyd, S., Gurtov, A., and Y. Nishida, "The | |||
| NewReno Modification to TCP's Fast Recovery Algorithm", | NewReno Modification to TCP's Fast Recovery Algorithm", | |||
| RFC 6582, DOI 10.17487/RFC6582, April 2012, | RFC 6582, DOI 10.17487/RFC6582, April 2012, | |||
| <http://www.rfc-editor.org/info/rfc6582>. | <https://www.rfc-editor.org/info/rfc6582>. | |||
| Appendix A. Acknowledgments | Appendix A. Acknowledgments | |||
| Appendix B. Change Log | Appendix B. Change Log | |||
| *RFC Editor's Note:* Please remove this section prior to | *RFC Editor's Note:* Please remove this section prior to | |||
| publication of a final version of this document. | publication of a final version of this document. | |||
| B.1. Since draft-ietf-quic-recovery-04 | B.1. Since draft-ietf-quic-recovery-05 | |||
| o Add more congestion control text (#776) | ||||
| B.2. Since draft-ietf-quic-recovery-04 | ||||
| No significant changes. | No significant changes. | |||
| B.2. Since draft-ietf-quic-recovery-03 | B.3. Since draft-ietf-quic-recovery-03 | |||
| No significant changes. | No significant changes. | |||
| B.3. Since draft-ietf-quic-recovery-02 | B.4. Since draft-ietf-quic-recovery-02 | |||
| o Integrate F-RTO (#544, #409) | o Integrate F-RTO (#544, #409) | |||
| o Add congestion control (#545, #395) | o Add congestion control (#545, #395) | |||
| o Require connection abort if a skipped packet was acknowledged | o Require connection abort if a skipped packet was acknowledged | |||
| (#415) | (#415) | |||
| o Simplify RTO calculations (#142, #417) | o Simplify RTO calculations (#142, #417) | |||
| B.4. Since draft-ietf-quic-recovery-01 | B.5. Since draft-ietf-quic-recovery-01 | |||
| o Overview added to loss detection | o Overview added to loss detection | |||
| o Changes initial default RTT to 100ms | o Changes initial default RTT to 100ms | |||
| o Added time-based loss detection and fixes early retransmit | o Added time-based loss detection and fixes early retransmit | |||
| o Clarified loss recovery for handshake packets | o Clarified loss recovery for handshake packets | |||
| o Fixed references and made TCP references informative | o Fixed references and made TCP references informative | |||
| B.5. Since draft-ietf-quic-recovery-00 | B.6. Since draft-ietf-quic-recovery-00 | |||
| o Improved description of constants and ACK behavior | o Improved description of constants and ACK behavior | |||
| B.6. Since draft-iyengar-quic-loss-recovery-01 | B.7. Since draft-iyengar-quic-loss-recovery-01 | |||
| o Adopted as base for draft-ietf-quic-recovery | o Adopted as base for draft-ietf-quic-recovery | |||
| o Updated authors/editors list | o Updated authors/editors list | |||
| o Added table of contents | o Added table of contents | |||
| Authors' Addresses | Authors' Addresses | |||
| Jana Iyengar (editor) | Jana Iyengar (editor) | |||
| End of changes. 54 change blocks. | ||||
| 100 lines changed or deleted | 161 lines changed or added | |||
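The congestion control pseudocode compared above (window growth on acknowledgement, window reduction on loss, the reset after a verified retransmission timeout, and the TimeToSend pacing check that appears only in the draft-05 column) can be sketched in Python roughly as follows. This is an illustrative sketch, not the draft's reference implementation. The constant values, the `Packet` type, the `on_packet_sent` helper, the 100 ms starting `smoothed_rtt`, and the in-recovery early return in `on_packet_acked` are assumptions, because this part of the diff names them but does not define them. The congestion avoidance branch follows the draft-06 form, which scales the increase by kDefaultMss.

```python
import math
from dataclasses import dataclass

# Assumed values: the diff excerpt names these constants without defining them.
K_DEFAULT_MSS = 1460
K_INITIAL_WINDOW = 10 * K_DEFAULT_MSS
K_MINIMUM_WINDOW = 2 * K_DEFAULT_MSS
K_LOSS_REDUCTION_FACTOR = 0.5


@dataclass
class Packet:
    """Hypothetical record for a sent packet."""
    packet_number: int
    bytes: int


class NewRenoSketch:
    def __init__(self):
        self.congestion_window = K_INITIAL_WINDOW
        self.ssthresh = math.inf
        self.bytes_in_flight = 0
        self.end_of_recovery = 0
        self.largest_sent_packet = 0
        self.time_of_last_sent_packet = 0.0
        self.smoothed_rtt = 0.1  # seconds; assumed starting value

    def on_packet_sent(self, packet, now):
        # Assumed bookkeeping so the other handlers have state to work on.
        self.bytes_in_flight += packet.bytes
        self.largest_sent_packet = packet.packet_number
        self.time_of_last_sent_packet = now

    def on_packet_acked(self, packet):
        self.bytes_in_flight -= packet.bytes
        # Assumption: no window growth for packets sent before recovery ended.
        if packet.packet_number < self.end_of_recovery:
            return
        if self.congestion_window < self.ssthresh:
            # Slow start.
            self.congestion_window += packet.bytes
        else:
            # Congestion avoidance (draft-06 form).
            self.congestion_window += (
                K_DEFAULT_MSS * packet.bytes / self.congestion_window)

    def on_packets_lost(self, lost_packets):
        # Remove lost packets from bytes_in_flight.
        for lost in lost_packets:
            self.bytes_in_flight -= lost.bytes
        largest_lost = lost_packets[-1]
        # Start a new recovery epoch only if the loss is past the previous one.
        if self.end_of_recovery < largest_lost.packet_number:
            self.end_of_recovery = self.largest_sent_packet
            self.congestion_window = max(
                self.congestion_window * K_LOSS_REDUCTION_FACTOR,
                K_MINIMUM_WINDOW)
            self.ssthresh = self.congestion_window

    def on_retransmission_timeout_verified(self):
        self.congestion_window = K_MINIMUM_WINDOW

    def time_to_send(self, packet_size):
        # draft-05 section 4.9 only: infinite when congestion window limited,
        # otherwise the last send time plus the pacing interval.
        if self.bytes_in_flight + packet_size > self.congestion_window:
            return math.inf
        return (self.time_of_last_sent_packet
                + packet_size * self.smoothed_rtt / self.congestion_window)
```

With these assumed constants, a single full-sized acknowledgement in slow start grows the window by one MSS, and the first loss halves it and sets ssthresh to the reduced window, so later acknowledgements take the congestion avoidance branch.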