| draft-ietf-quic-recovery-23.txt | draft-ietf-quic-recovery-24.txt | |||
|---|---|---|---|---|
| QUIC J. Iyengar, Ed. | QUIC J. Iyengar, Ed. | |||
| Internet-Draft Fastly | Internet-Draft Fastly | |||
| Intended status: Standards Track I. Swett, Ed. | Intended status: Standards Track I. Swett, Ed. | |||
| Expires: March 15, 2020 Google | Expires: May 7, 2020 Google | |||
| September 12, 2019 | November 04, 2019 | |||
| QUIC Loss Detection and Congestion Control | QUIC Loss Detection and Congestion Control | |||
| draft-ietf-quic-recovery-23 | draft-ietf-quic-recovery-24 | |||
| Abstract | Abstract | |||
| This document describes loss detection and congestion control | This document describes loss detection and congestion control | |||
| mechanisms for QUIC. | mechanisms for QUIC. | |||
| Note to Readers | Note to Readers | |||
| Discussion of this draft takes place on the QUIC working group | Discussion of this draft takes place on the QUIC working group | |||
| mailing list (quic@ietf.org), which is archived at | mailing list (quic@ietf.org), which is archived at | |||
| skipping to change at page 1, line 42 ¶ | skipping to change at page 1, line 42 ¶ | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on March 15, 2020. | This Internet-Draft will expire on May 7, 2020. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2019 IETF Trust and the persons identified as the | Copyright (c) 2019 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at page 2, line 40 ¶ | skipping to change at page 2, line 40 ¶ | |||
| 4.3. Estimating smoothed_rtt and rttvar . . . . . . . . . . . 8 | 4.3. Estimating smoothed_rtt and rttvar . . . . . . . . . . . 8 | |||
| 5. Loss Detection . . . . . . . . . . . . . . . . . . . . . . . 9 | 5. Loss Detection . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
| 5.1. Acknowledgement-based Detection . . . . . . . . . . . . . 10 | 5.1. Acknowledgement-based Detection . . . . . . . . . . . . . 10 | |||
| 5.1.1. Packet Threshold . . . . . . . . . . . . . . . . . . 10 | 5.1.1. Packet Threshold . . . . . . . . . . . . . . . . . . 10 | |||
| 5.1.2. Time Threshold . . . . . . . . . . . . . . . . . . . 10 | 5.1.2. Time Threshold . . . . . . . . . . . . . . . . . . . 10 | |||
| 5.2. Probe Timeout . . . . . . . . . . . . . . . . . . . . . . 11 | 5.2. Probe Timeout . . . . . . . . . . . . . . . . . . . . . . 11 | |||
| 5.2.1. Computing PTO . . . . . . . . . . . . . . . . . . . . 11 | 5.2.1. Computing PTO . . . . . . . . . . . . . . . . . . . . 11 | |||
| 5.3. Handshakes and New Paths . . . . . . . . . . . . . . . . 12 | 5.3. Handshakes and New Paths . . . . . . . . . . . . . . . . 12 | |||
| 5.3.1. Sending Probe Packets . . . . . . . . . . . . . . . . 13 | 5.3.1. Sending Probe Packets . . . . . . . . . . . . . . . . 13 | |||
| 5.3.2. Loss Detection . . . . . . . . . . . . . . . . . . . 14 | 5.3.2. Loss Detection . . . . . . . . . . . . . . . . . . . 14 | |||
| 5.4. Retry and Version Negotiation . . . . . . . . . . . . . . 14 | 5.4. Handling Retry Packets . . . . . . . . . . . . . . . . . 14 | |||
| 5.5. Discarding Keys and Packet State . . . . . . . . . . . . 14 | 5.5. Discarding Keys and Packet State . . . . . . . . . . . . 14 | |||
| 5.6. Discussion . . . . . . . . . . . . . . . . . . . . . . . 15 | ||||
| 6. Congestion Control . . . . . . . . . . . . . . . . . . . . . 15 | 6. Congestion Control . . . . . . . . . . . . . . . . . . . . . 15 | |||
| 6.1. Explicit Congestion Notification . . . . . . . . . . . . 15 | 6.1. Explicit Congestion Notification . . . . . . . . . . . . 15 | |||
| 6.2. Slow Start . . . . . . . . . . . . . . . . . . . . . . . 16 | 6.2. Slow Start . . . . . . . . . . . . . . . . . . . . . . . 16 | |||
| 6.3. Congestion Avoidance . . . . . . . . . . . . . . . . . . 16 | 6.3. Congestion Avoidance . . . . . . . . . . . . . . . . . . 16 | |||
| 6.4. Recovery Period . . . . . . . . . . . . . . . . . . . . . 16 | 6.4. Recovery Period . . . . . . . . . . . . . . . . . . . . . 16 | |||
| 6.5. Ignoring Loss of Undecryptable Packets . . . . . . . . . 16 | 6.5. Ignoring Loss of Undecryptable Packets . . . . . . . . . 16 | |||
| 6.6. Probe Timeout . . . . . . . . . . . . . . . . . . . . . . 17 | 6.6. Probe Timeout . . . . . . . . . . . . . . . . . . . . . . 16 | |||
| 6.7. Persistent Congestion . . . . . . . . . . . . . . . . . . 17 | 6.7. Persistent Congestion . . . . . . . . . . . . . . . . . . 17 | |||
| 6.8. Pacing . . . . . . . . . . . . . . . . . . . . . . . . . 18 | 6.8. Pacing . . . . . . . . . . . . . . . . . . . . . . . . . 18 | |||
| 6.9. Under-utilizing the Congestion Window . . . . . . . . . . 18 | 6.9. Under-utilizing the Congestion Window . . . . . . . . . . 18 | |||
| 7. Security Considerations . . . . . . . . . . . . . . . . . . . 19 | 7. Security Considerations . . . . . . . . . . . . . . . . . . . 19 | |||
| 7.1. Congestion Signals . . . . . . . . . . . . . . . . . . . 19 | 7.1. Congestion Signals . . . . . . . . . . . . . . . . . . . 19 | |||
| 7.2. Traffic Analysis . . . . . . . . . . . . . . . . . . . . 19 | 7.2. Traffic Analysis . . . . . . . . . . . . . . . . . . . . 19 | |||
| 7.3. Misreporting ECN Markings . . . . . . . . . . . . . . . . 19 | 7.3. Misreporting ECN Markings . . . . . . . . . . . . . . . . 19 | |||
| 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 | 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 | |||
| 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 20 | 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 20 | |||
| 9.1. Normative References . . . . . . . . . . . . . . . . . . 20 | 9.1. Normative References . . . . . . . . . . . . . . . . . . 20 | |||
| 9.2. Informative References . . . . . . . . . . . . . . . . . 20 | 9.2. Informative References . . . . . . . . . . . . . . . . . 20 | |||
| 9.3. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 22 | 9.3. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 22 | |||
| Appendix A. Loss Recovery Pseudocode . . . . . . . . . . . . . . 22 | Appendix A. Loss Recovery Pseudocode . . . . . . . . . . . . . . 22 | |||
| A.1. Tracking Sent Packets . . . . . . . . . . . . . . . . . . 22 | A.1. Tracking Sent Packets . . . . . . . . . . . . . . . . . . 22 | |||
| A.1.1. Sent Packet Fields . . . . . . . . . . . . . . . . . 22 | A.1.1. Sent Packet Fields . . . . . . . . . . . . . . . . . 22 | |||
| A.2. Constants of interest . . . . . . . . . . . . . . . . . . 23 | A.2. Constants of interest . . . . . . . . . . . . . . . . . . 23 | |||
| A.3. Variables of interest . . . . . . . . . . . . . . . . . . 23 | A.3. Variables of interest . . . . . . . . . . . . . . . . . . 23 | |||
| A.4. Initialization . . . . . . . . . . . . . . . . . . . . . 24 | A.4. Initialization . . . . . . . . . . . . . . . . . . . . . 24 | |||
| A.5. On Sending a Packet . . . . . . . . . . . . . . . . . . . 25 | A.5. On Sending a Packet . . . . . . . . . . . . . . . . . . . 24 | |||
| A.6. On Receiving an Acknowledgment . . . . . . . . . . . . . 25 | A.6. On Receiving an Acknowledgment . . . . . . . . . . . . . 25 | |||
| A.7. On Packet Acknowledgment . . . . . . . . . . . . . . . . 26 | A.7. On Packet Acknowledgment . . . . . . . . . . . . . . . . 26 | |||
| A.8. Setting the Loss Detection Timer . . . . . . . . . . . . 27 | A.8. Setting the Loss Detection Timer . . . . . . . . . . . . 27 | |||
| A.9. On Timeout . . . . . . . . . . . . . . . . . . . . . . . 29 | A.9. On Timeout . . . . . . . . . . . . . . . . . . . . . . . 29 | |||
| A.10. Detecting Lost Packets . . . . . . . . . . . . . . . . . 29 | A.10. Detecting Lost Packets . . . . . . . . . . . . . . . . . 29 | |||
| Appendix B. Congestion Control Pseudocode . . . . . . . . . . . 30 | Appendix B. Congestion Control Pseudocode . . . . . . . . . . . 30 | |||
| B.1. Constants of interest . . . . . . . . . . . . . . . . . . 30 | B.1. Constants of interest . . . . . . . . . . . . . . . . . . 30 | |||
| B.2. Variables of interest . . . . . . . . . . . . . . . . . . 31 | B.2. Variables of interest . . . . . . . . . . . . . . . . . . 31 | |||
| B.3. Initialization . . . . . . . . . . . . . . . . . . . . . 32 | B.3. Initialization . . . . . . . . . . . . . . . . . . . . . 32 | |||
| B.4. On Packet Sent . . . . . . . . . . . . . . . . . . . . . 32 | B.4. On Packet Sent . . . . . . . . . . . . . . . . . . . . . 32 | |||
| B.5. On Packet Acknowledgement . . . . . . . . . . . . . . . . 32 | B.5. On Packet Acknowledgement . . . . . . . . . . . . . . . . 32 | |||
| B.6. On New Congestion Event . . . . . . . . . . . . . . . . . 33 | B.6. On New Congestion Event . . . . . . . . . . . . . . . . . 33 | |||
| B.7. Process ECN Information . . . . . . . . . . . . . . . . . 33 | B.7. Process ECN Information . . . . . . . . . . . . . . . . . 33 | |||
| B.8. On Packets Lost . . . . . . . . . . . . . . . . . . . . . 34 | B.8. On Packets Lost . . . . . . . . . . . . . . . . . . . . . 34 | |||
| Appendix C. Change Log . . . . . . . . . . . . . . . . . . . . . 34 | Appendix C. Change Log . . . . . . . . . . . . . . . . . . . . . 34 | |||
| C.1. Since draft-ietf-quic-recovery-22 . . . . . . . . . . . . 34 | C.1. Since draft-ietf-quic-recovery-23 . . . . . . . . . . . . 34 | |||
| C.2. Since draft-ietf-quic-recovery-21 . . . . . . . . . . . . 34 | C.2. Since draft-ietf-quic-recovery-22 . . . . . . . . . . . . 35 | |||
| C.3. Since draft-ietf-quic-recovery-20 . . . . . . . . . . . . 35 | C.3. Since draft-ietf-quic-recovery-21 . . . . . . . . . . . . 35 | |||
| C.4. Since draft-ietf-quic-recovery-19 . . . . . . . . . . . . 35 | C.4. Since draft-ietf-quic-recovery-20 . . . . . . . . . . . . 35 | |||
| C.5. Since draft-ietf-quic-recovery-18 . . . . . . . . . . . . 35 | C.5. Since draft-ietf-quic-recovery-19 . . . . . . . . . . . . 35 | |||
| C.6. Since draft-ietf-quic-recovery-17 . . . . . . . . . . . . 36 | C.6. Since draft-ietf-quic-recovery-18 . . . . . . . . . . . . 36 | |||
| C.7. Since draft-ietf-quic-recovery-16 . . . . . . . . . . . . 36 | C.7. Since draft-ietf-quic-recovery-17 . . . . . . . . . . . . 36 | |||
| C.8. Since draft-ietf-quic-recovery-14 . . . . . . . . . . . . 37 | C.8. Since draft-ietf-quic-recovery-16 . . . . . . . . . . . . 36 | |||
| C.9. Since draft-ietf-quic-recovery-13 . . . . . . . . . . . . 37 | C.9. Since draft-ietf-quic-recovery-14 . . . . . . . . . . . . 37 | |||
| C.10. Since draft-ietf-quic-recovery-12 . . . . . . . . . . . . 37 | C.10. Since draft-ietf-quic-recovery-13 . . . . . . . . . . . . 37 | |||
| C.11. Since draft-ietf-quic-recovery-11 . . . . . . . . . . . . 37 | C.11. Since draft-ietf-quic-recovery-12 . . . . . . . . . . . . 38 | |||
| C.12. Since draft-ietf-quic-recovery-10 . . . . . . . . . . . . 37 | C.12. Since draft-ietf-quic-recovery-11 . . . . . . . . . . . . 38 | |||
| C.13. Since draft-ietf-quic-recovery-09 . . . . . . . . . . . . 38 | C.13. Since draft-ietf-quic-recovery-10 . . . . . . . . . . . . 38 | |||
| C.14. Since draft-ietf-quic-recovery-08 . . . . . . . . . . . . 38 | C.14. Since draft-ietf-quic-recovery-09 . . . . . . . . . . . . 38 | |||
| C.15. Since draft-ietf-quic-recovery-07 . . . . . . . . . . . . 38 | C.15. Since draft-ietf-quic-recovery-08 . . . . . . . . . . . . 38 | |||
| C.16. Since draft-ietf-quic-recovery-06 . . . . . . . . . . . . 38 | C.16. Since draft-ietf-quic-recovery-07 . . . . . . . . . . . . 38 | |||
| C.17. Since draft-ietf-quic-recovery-05 . . . . . . . . . . . . 38 | C.17. Since draft-ietf-quic-recovery-06 . . . . . . . . . . . . 39 | |||
| C.18. Since draft-ietf-quic-recovery-04 . . . . . . . . . . . . 38 | C.18. Since draft-ietf-quic-recovery-05 . . . . . . . . . . . . 39 | |||
| C.19. Since draft-ietf-quic-recovery-03 . . . . . . . . . . . . 38 | C.19. Since draft-ietf-quic-recovery-04 . . . . . . . . . . . . 39 | |||
| C.20. Since draft-ietf-quic-recovery-02 . . . . . . . . . . . . 38 | C.20. Since draft-ietf-quic-recovery-03 . . . . . . . . . . . . 39 | |||
| C.21. Since draft-ietf-quic-recovery-01 . . . . . . . . . . . . 39 | C.21. Since draft-ietf-quic-recovery-02 . . . . . . . . . . . . 39 | |||
| C.22. Since draft-ietf-quic-recovery-00 . . . . . . . . . . . . 39 | C.22. Since draft-ietf-quic-recovery-01 . . . . . . . . . . . . 39 | |||
| C.23. Since draft-iyengar-quic-loss-recovery-01 . . . . . . . . 39 | C.23. Since draft-ietf-quic-recovery-00 . . . . . . . . . . . . 39 | |||
| Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 39 | C.24. Since draft-iyengar-quic-loss-recovery-01 . . . . . . . . 39 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 39 | Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 40 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 40 | ||||
| 1. Introduction | 1. Introduction | |||
| QUIC is a new multiplexed and secure transport atop UDP. QUIC builds | QUIC is a new multiplexed and secure transport atop UDP. QUIC builds | |||
| on decades of transport and security experience, and implements | on decades of transport and security experience, and implements | |||
| mechanisms that make it attractive as a modern general-purpose | mechanisms that make it attractive as a modern general-purpose | |||
| transport. The QUIC protocol is described in [QUIC-TRANSPORT]. | transport. The QUIC protocol is described in [QUIC-TRANSPORT]. | |||
| QUIC implements the spirit of existing TCP loss recovery mechanisms, | QUIC implements the spirit of existing TCP loss recovery mechanisms, | |||
| described in RFCs, various Internet-drafts, and also those prevalent | described in RFCs, various Internet-drafts, and also those prevalent | |||
| skipping to change at page 4, line 43 ¶ | skipping to change at page 4, line 43 ¶ | |||
| capitals, as shown here. | capitals, as shown here. | |||
| Definitions of terms that are used in this document: | Definitions of terms that are used in this document: | |||
| ACK-only: Any packet containing only one or more ACK frame(s). | ACK-only: Any packet containing only one or more ACK frame(s). | |||
| In-flight: Packets are considered in-flight when they have been sent | In-flight: Packets are considered in-flight when they have been sent | |||
| and are not ACK-only, and they are not acknowledged, declared | and are not ACK-only, and they are not acknowledged, declared | |||
| lost, or abandoned along with old keys. | lost, or abandoned along with old keys. | |||
| Ack-eliciting Frames: All frames besides ACK or PADDING are | Ack-eliciting Frames: All frames other than ACK, PADDING, and | |||
| considered ack-eliciting. | CONNECTION_CLOSE are considered ack-eliciting. | |||
| Ack-eliciting Packets: Packets that contain ack-eliciting frames | Ack-eliciting Packets: Packets that contain ack-eliciting frames | |||
| elicit an ACK from the receiver within the maximum ack delay and | elicit an ACK from the receiver within the maximum ack delay and | |||
| are called ack-eliciting packets. | are called ack-eliciting packets. | |||
| Crypto Packets: Packets containing CRYPTO data sent in Initial or | Crypto Packets: Packets containing CRYPTO data sent in Initial or | |||
| Handshake packets. | Handshake packets. | |||
| Out-of-order Packets: Packets that do not increase the largest | Out-of-order Packets: Packets that do not increase the largest | |||
| received packet number for their packet number space by exactly one. | received packet number for their packet number space by exactly one. | |||
| skipping to change at page 5, line 39 ¶ | skipping to change at page 5, line 39 ¶ | |||
| and congestion control logic: | and congestion control logic: | |||
| o All packets are acknowledged, though packets that contain no ack- | o All packets are acknowledged, though packets that contain no ack- | |||
| eliciting frames are only acknowledged along with ack-eliciting | eliciting frames are only acknowledged along with ack-eliciting | |||
| packets. | packets. | |||
| o Long header packets that contain CRYPTO frames are critical to the | o Long header packets that contain CRYPTO frames are critical to the | |||
| performance of the QUIC handshake and use shorter timers for | performance of the QUIC handshake and use shorter timers for | |||
| acknowledgement. | acknowledgement. | |||
| o Packets that contain only ACK frames do not count toward | o Packets containing frames besides ACK or CONNECTION_CLOSE frames | |||
| congestion control limits and are not considered in-flight. | count toward congestion control limits and are considered in- | |||
| flight. | ||||
| o PADDING frames cause packets to contribute toward bytes in flight | o PADDING frames cause packets to contribute toward bytes in flight | |||
| without directly causing an acknowledgment to be sent. | without directly causing an acknowledgment to be sent. | |||
| 3.1. Relevant Differences Between QUIC and TCP | 3.1. Relevant Differences Between QUIC and TCP | |||
| Readers familiar with TCP's loss detection and congestion control | Readers familiar with TCP's loss detection and congestion control | |||
| will find algorithms here that parallel well-known TCP ones. | will find algorithms here that parallel well-known TCP ones. | |||
| Protocol differences between QUIC and TCP however contribute to | Protocol differences between QUIC and TCP however contribute to | |||
| algorithmic differences. We briefly describe these protocol | algorithmic differences. We briefly describe these protocol | |||
| skipping to change at page 10, line 51 ¶ | skipping to change at page 10, line 51 ¶ | |||
| reordering resilience. | reordering resilience. | |||
| 5.1.2. Time Threshold | 5.1.2. Time Threshold | |||
| Once a later packet within the same packet number space has | Once a later packet within the same packet number space has | |||
| been acknowledged, an endpoint SHOULD declare an earlier packet lost | been acknowledged, an endpoint SHOULD declare an earlier packet lost | |||
| if it was sent a threshold amount of time in the past. To avoid | if it was sent a threshold amount of time in the past. To avoid | |||
| declaring packets as lost too early, this time threshold MUST be set | declaring packets as lost too early, this time threshold MUST be set | |||
| to at least kGranularity. The time threshold is: | to at least kGranularity. The time threshold is: | |||
| kTimeThreshold * max(SRTT, latest_RTT, kGranularity) | kTimeThreshold * max(smoothed_rtt, latest_rtt, kGranularity) | |||
| If packets sent prior to the largest acknowledged packet cannot yet | If packets sent prior to the largest acknowledged packet cannot yet | |||
| be declared lost, then a timer SHOULD be set for the remaining time. | be declared lost, then a timer SHOULD be set for the remaining time. | |||
| Using max(SRTT, latest_RTT) protects from the two following cases: | Using max(smoothed_rtt, latest_rtt) protects from the two following | |||
| cases: | ||||
| o the latest RTT sample is lower than the SRTT, perhaps due to | o the latest RTT sample is lower than the smoothed RTT, perhaps due | |||
| reordering where the acknowledgement encountered a shorter path; | to reordering where the acknowledgement encountered a shorter | |||
| path; | ||||
| o the latest RTT sample is higher than the SRTT, perhaps due to a | o the latest RTT sample is higher than the smoothed RTT, perhaps due | |||
| sustained increase in the actual RTT, but the smoothed SRTT has | to a sustained increase in the actual RTT, but the smoothed RTT | |||
| not yet caught up. | has not yet caught up. | |||
| The RECOMMENDED time threshold (kTimeThreshold), expressed as a | The RECOMMENDED time threshold (kTimeThreshold), expressed as a | |||
| round-trip time multiplier, is 9/8. | round-trip time multiplier, is 9/8. | |||
| Implementations MAY experiment with absolute thresholds, thresholds | Implementations MAY experiment with absolute thresholds, thresholds | |||
| from previous connections, adaptive thresholds, or including RTT | from previous connections, adaptive thresholds, or including RTT | |||
| variance. Smaller thresholds reduce reordering resilience and | variance. Smaller thresholds reduce reordering resilience and | |||
| increase spurious retransmissions, and larger thresholds increase | increase spurious retransmissions, and larger thresholds increase | |||
| loss detection delay. | loss detection delay. | |||
| 5.2. Probe Timeout | 5.2. Probe Timeout | |||
| A Probe Timeout (PTO) triggers sending one or two probe datagrams | A Probe Timeout (PTO) triggers sending one or two probe datagrams | |||
| when ack-eliciting packets are not acknowledged within the expected | when ack-eliciting packets are not acknowledged within the expected | |||
| period of time or the handshake has not been completed. A PTO | period of time or the handshake has not been completed. A PTO | |||
| enables a connection to recover from loss of tail packets or | enables a connection to recover from loss of tail packets or | |||
| acknowledgements. The PTO algorithm used in QUIC implements the | acknowledgements. The PTO algorithm used in QUIC implements the | |||
| reliability functions of Tail Loss Probe [TLP] [RACK], RTO [RFC5681] | reliability functions of Tail Loss Probe [RACK], RTO [RFC5681] and | |||
| and F-RTO algorithms for TCP [RFC5682], and the timeout computation | F-RTO algorithms for TCP [RFC5682], and the timeout computation is | |||
| is based on TCP's retransmission timeout period [RFC6298]. | based on TCP's retransmission timeout period [RFC6298]. | |||
| 5.2.1. Computing PTO | 5.2.1. Computing PTO | |||
| When an ack-eliciting packet is transmitted, the sender schedules a | When an ack-eliciting packet is transmitted, the sender schedules a | |||
| timer for the PTO period as follows: | timer for the PTO period as follows: | |||
| PTO = smoothed_rtt + max(4*rttvar, kGranularity) + max_ack_delay | PTO = smoothed_rtt + max(4*rttvar, kGranularity) + max_ack_delay | |||
| kGranularity, smoothed_rtt, rttvar, and max_ack_delay are defined in | kGranularity, smoothed_rtt, rttvar, and max_ack_delay are defined in | |||
| Appendix A.2 and Appendix A.3. | Appendix A.2 and Appendix A.3. | |||
| skipping to change at page 12, line 39 ¶ | skipping to change at page 12, line 41 ¶ | |||
| network SHOULD use the previous connection's final smoothed RTT value | network SHOULD use the previous connection's final smoothed RTT value | |||
| as the resumed connection's initial RTT. If no previous RTT is | as the resumed connection's initial RTT. If no previous RTT is | |||
| available, the initial RTT SHOULD be set to 500ms, resulting in a 1 | available, the initial RTT SHOULD be set to 500ms, resulting in a 1 | |||
| second initial timeout as recommended in [RFC6298]. | second initial timeout as recommended in [RFC6298]. | |||
| A connection MAY use the delay between sending a PATH_CHALLENGE and | A connection MAY use the delay between sending a PATH_CHALLENGE and | |||
| receiving a PATH_RESPONSE to seed initial_rtt for a new path, but the | receiving a PATH_RESPONSE to seed initial_rtt for a new path, but the | |||
| delay SHOULD NOT be considered an RTT sample. | delay SHOULD NOT be considered an RTT sample. | |||
| Until the server has validated the client's address on the path, the | Until the server has validated the client's address on the path, the | |||
| amount of data it can send is limited, as specified in Section 8.1 of | amount of data it can send is limited to three times the amount of | |||
| [QUIC-TRANSPORT]. Data at Initial encryption MUST be retransmitted | data received, as specified in Section 8.1 of [QUIC-TRANSPORT]. If | |||
| before Handshake data and data at Handshake encryption MUST be | no data can be sent, then the PTO alarm MUST NOT be armed. | |||
| retransmitted before any ApplicationData data. If no data can be | ||||
| sent, then the PTO alarm MUST NOT be armed until data has been | ||||
| received from the client. | ||||
| Since the server could be blocked until more packets are received | Since the server could be blocked until more packets are received | |||
| from the client, it is the client's responsibility to send packets to | from the client, it is the client's responsibility to send packets to | |||
| unblock the server until it is certain that the server has finished | unblock the server until it is certain that the server has finished | |||
| its address validation (see Section 8 of [QUIC-TRANSPORT]). That is, | its address validation (see Section 8 of [QUIC-TRANSPORT]). That is, | |||
| the client MUST set the probe timer if the client has not received an | the client MUST set the probe timer if the client has not received an | |||
| acknowledgement for one of its Handshake or 1-RTT packets. | acknowledgement for one of its Handshake or 1-RTT packets. | |||
| Prior to handshake completion, when few or no RTT samples have been | Prior to handshake completion, when few or no RTT samples have been | |||
| generated, it is possible that the probe timer expiration is due to | generated, it is possible that the probe timer expiration is due to | |||
| skipping to change at page 13, line 25 ¶ | skipping to change at page 13, line 25 ¶ | |||
| keys are discarded. | keys are discarded. | |||
| 5.3.1. Sending Probe Packets | 5.3.1. Sending Probe Packets | |||
| When a PTO timer expires, a sender MUST send at least one ack- | When a PTO timer expires, a sender MUST send at least one ack- | |||
| eliciting packet as a probe, unless there is no data available to | eliciting packet as a probe, unless there is no data available to | |||
| send. An endpoint MAY send up to two full-sized datagrams containing | send. An endpoint MAY send up to two full-sized datagrams containing | |||
| ack-eliciting packets, to avoid an expensive consecutive PTO | ack-eliciting packets, to avoid an expensive consecutive PTO | |||
| expiration due to a single lost datagram. | expiration due to a single lost datagram. | |||
| It is possible that the sender has no new or previously-sent data to | When the PTO timer expires, and there is new or previously sent | |||
| send. As an example, consider the following sequence of events: new | unacknowledged data, it MUST be sent. Data that was previously sent | |||
| with Initial encryption MUST be sent before Handshake data and data | ||||
| previously sent at Handshake encryption MUST be sent before any | ||||
| ApplicationData data. | ||||
| It is possible the sender has no new or previously-sent data to send. | ||||
| As an example, consider the following sequence of events: new | ||||
| application data is sent in a STREAM frame, deemed lost, then | application data is sent in a STREAM frame, deemed lost, then | |||
| retransmitted in a new packet, and then the original transmission is | retransmitted in a new packet, and then the original transmission is | |||
| acknowledged. In the absence of any new application data, a PTO | acknowledged. When there is no data to send, the sender SHOULD send | |||
| timer expiration now would find the sender with no new or previously- | a PING or other ack-eliciting frame in a single packet, re-arming the | |||
| sent data to send. | PTO timer. | |||
| When there is no data to send, the sender SHOULD send a PING or other | ||||
| ack-eliciting frame in a single packet, re-arming the PTO timer. | ||||
| Alternatively, instead of sending an ack-eliciting packet, the sender | Alternatively, instead of sending an ack-eliciting packet, the sender | |||
| MAY mark any packets still in flight as lost. Doing so avoids | MAY mark any packets still in flight as lost. Doing so avoids | |||
| sending an additional packet, but increases the risk that loss is | sending an additional packet, but increases the risk that loss is | |||
| declared too aggressively, resulting in an unnecessary rate reduction | declared too aggressively, resulting in an unnecessary rate reduction | |||
| by the congestion controller. | by the congestion controller. | |||
| Consecutive PTO periods increase exponentially, and as a result, | Consecutive PTO periods increase exponentially, and as a result, | |||
| connection recovery latency increases exponentially as packets | connection recovery latency increases exponentially as packets | |||
| continue to be dropped in the network. Sending two packets on PTO | continue to be dropped in the network. Sending two packets on PTO | |||
| expiration increases resilience to packet drops, thus reducing the | expiration increases resilience to packet drops, thus reducing the | |||
| probability of consecutive PTO events. | probability of consecutive PTO events. | |||
| Probe packets sent on a PTO MUST be ack-eliciting. A probe packet | Probe packets sent on a PTO MUST be ack-eliciting. A probe packet | |||
| SHOULD carry new data when possible. A probe packet MAY carry | SHOULD carry new data when possible. A probe packet MAY carry | |||
| retransmitted unacknowledged data when new data is unavailable, when | retransmitted unacknowledged data when new data is unavailable, when | |||
| flow control does not permit new data to be sent, or to | flow control does not permit new data to be sent, or to | |||
| opportunistically reduce loss recovery delay. Implementations MAY | opportunistically reduce loss recovery delay. Implementations MAY | |||
| use alternate strategies for determining the content of probe | use alternative strategies for determining the content of probe | |||
| packets, including sending new or retransmitted data based on the | packets, including sending new or retransmitted data based on the | |||
| application's priorities. | application's priorities. | |||
| When the PTO timer expires multiple times and new data cannot be | When the PTO timer expires multiple times and new data cannot be | |||
| sent, implementations must choose between sending the same payload | sent, implementations must choose between sending the same payload | |||
| every time or sending different payloads. Sending the same payload | every time or sending different payloads. Sending the same payload | |||
| may be simpler and ensures the highest priority frames arrive first. | may be simpler and ensures the highest priority frames arrive first. | |||
| Sending different payloads each time reduces the chances of spurious | Sending different payloads each time reduces the chances of spurious | |||
| retransmission. | retransmission. | |||
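The probe-content preferences above can be sketched as a small selection routine. This is only an illustration under stated assumptions: `ProbeSource` and `choose_probe_payload` are hypothetical names that do not appear in the draft, and the PING fallback follows the change-log entry about sending a PING when there is nothing else to send.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ProbeSource:
    """Hypothetical view of what a sender could place in a probe packet."""
    new_data: List[bytes] = field(default_factory=list)
    unacked_data: List[bytes] = field(default_factory=list)
    flow_control_blocked: bool = False


def choose_probe_payload(src: ProbeSource) -> Tuple[str, bytes]:
    """Pick the content of one ack-eliciting probe packet."""
    # Prefer new data when it exists and flow control permits sending it.
    if src.new_data and not src.flow_control_blocked:
        return ("new", src.new_data[0])
    # Otherwise fall back to retransmitting unacknowledged data.
    if src.unacked_data:
        return ("retransmit", src.unacked_data[0])
    # Nothing to send: a PING frame keeps the probe ack-eliciting.
    return ("ping", b"")
```

An implementation that follows application priorities instead would replace the first two branches with its own ordering; the only fixed requirement in the text is that the probe be ack-eliciting.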
| skipping to change at page 14, line 26 ¶ | skipping to change at page 14, line 29 ¶ | |||
| Delivery or loss of packets in flight is established when an ACK | Delivery or loss of packets in flight is established when an ACK | |||
| frame is received that newly acknowledges one or more packets. | frame is received that newly acknowledges one or more packets. | |||
| A PTO timer expiration event does not indicate packet loss and MUST | A PTO timer expiration event does not indicate packet loss and MUST | |||
| NOT cause prior unacknowledged packets to be marked as lost. When an | NOT cause prior unacknowledged packets to be marked as lost. When an | |||
| acknowledgement is received that newly acknowledges packets, loss | acknowledgement is received that newly acknowledges packets, loss | |||
| detection proceeds as dictated by packet and time threshold | detection proceeds as dictated by packet and time threshold | |||
| mechanisms; see Section 5.1. | mechanisms; see Section 5.1. | |||
| 5.4. Retry and Version Negotiation | 5.4. Handling Retry Packets | |||
| A Retry or Version Negotiation packet causes a client to send another | A Retry packet causes a client to send another Initial packet, | |||
| Initial packet, effectively restarting the connection process and | effectively restarting the connection process. A Retry packet | |||
| resetting congestion control and loss recovery state, including | indicates that the Initial was received, but not processed. A Retry | |||
| resetting any pending timers. Either packet indicates that the | packet cannot be treated as an acknowledgment, because it does not | |||
| Initial was received but not processed. Neither packet can be | indicate that a packet was processed or specify the packet number. | |||
| treated as an acknowledgment for the Initial. | ||||
| The client MAY however compute an RTT estimate to the server as the | Clients that receive a Retry packet reset congestion control and loss | |||
| time period from when the first Initial was sent to when a Retry or a | recovery state, including resetting any pending timers. Other | |||
| connection state, in particular cryptographic handshake messages, is | ||||
| retained; see Section 17.2.5 of [QUIC-TRANSPORT]. | ||||
| The client MAY compute an RTT estimate to the server as the time | ||||
| period from when the first Initial was sent to when a Retry or a | ||||
| Version Negotiation packet is received. The client MAY use this | Version Negotiation packet is received. The client MAY use this | |||
| value to seed the RTT estimator for a subsequent connection attempt | value in place of its default for the initial RTT estimate. | |||
| to the server. | ||||
| 5.5. Discarding Keys and Packet State | 5.5. Discarding Keys and Packet State | |||
| When packet protection keys are discarded (see Section 4.9 of | When packet protection keys are discarded (see Section 4.9 of | |||
| [QUIC-TLS]), all packets that were sent with those keys can no longer | [QUIC-TLS]), all packets that were sent with those keys can no longer | |||
| be acknowledged because their acknowledgements cannot be processed | be acknowledged because their acknowledgements cannot be processed | |||
| anymore. The sender MUST discard all recovery state associated with | anymore. The sender MUST discard all recovery state associated with | |||
| those packets and MUST remove them from the count of bytes in flight. | those packets and MUST remove them from the count of bytes in flight. | |||
| Endpoints stop sending and receiving Initial packets once they start | Endpoints stop sending and receiving Initial packets once they start | |||
| exchanging Handshake packets (see Section 17.2.2.1 of | exchanging Handshake packets (see Section 17.2.2.1 of | |||
| [QUIC-TRANSPORT]). At this point, recovery state for all in-flight | [QUIC-TRANSPORT]). At this point, recovery state for all in-flight | |||
| Initial packets is discarded. | Initial packets is discarded. | |||
| When 0-RTT is rejected, recovery state for all in-flight 0-RTT | When 0-RTT is rejected, recovery state for all in-flight 0-RTT | |||
| packets is discarded. | packets is discarded. | |||
| If a server accepts 0-RTT, but does not buffer 0-RTT packets that | If a server accepts 0-RTT, but does not buffer 0-RTT packets that | |||
| arrive before Initial packets, early 0-RTT packets will be declared | arrive before Initial packets, early 0-RTT packets will be declared | |||
| lost, but that is expected to be infrequent. | lost, but that is expected to be infrequent. | |||
| It is expected that keys are discarded after packets encrypted with | It is expected that keys are discarded after packets encrypted with | |||
| them would be acknowledged or declared lost. Initial secrets however | them would be acknowledged or declared lost. Initial secrets however | |||
| might be destroyed sooner, as soon as handshake keys are available | might be destroyed sooner, as soon as handshake keys are available | |||
| (see Section 4.9.1 of [QUIC-TLS]). | (see Section 4.9.1 of [QUIC-TLS]). | |||
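As a rough sketch of the discard rule in this section, the following keeps sent-packet state per packet number space and drops a whole space at once when its keys are discarded. The class and method names (`RecoveryState`, `discard_space`) are assumptions made for illustration and are not taken from the draft's pseudocode.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SentPacket:
    size: int
    in_flight: bool = True


@dataclass
class RecoveryState:
    bytes_in_flight: int = 0
    # Packet number space name -> {packet number -> SentPacket}.
    sent_packets: Dict[str, Dict[int, SentPacket]] = field(default_factory=dict)

    def on_packet_sent(self, space: str, packet_number: int, size: int) -> None:
        self.sent_packets.setdefault(space, {})[packet_number] = SentPacket(size)
        self.bytes_in_flight += size

    def discard_space(self, space: str) -> None:
        """Drop all recovery state for a space whose keys were discarded."""
        for packet in self.sent_packets.pop(space, {}).values():
            # These packets can no longer be acknowledged, so they must
            # stop counting towards bytes in flight.
            if packet.in_flight:
                self.bytes_in_flight -= packet.size
```

For example, a client might call `discard_space("initial")` once it starts exchanging Handshake packets.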
| 5.6. Discussion | ||||
| The majority of constants were derived from best common practices | ||||
| among widely deployed TCP implementations on the internet. | ||||
| Exceptions follow. | ||||
| A shorter delayed ack time of 25ms was chosen because longer delayed | ||||
| acks can delay loss recovery and for the small number of connections | ||||
| where less than one packet per 25ms is delivered, acking every packet is | ||||
| beneficial to congestion control and loss recovery. | ||||
| 6. Congestion Control | 6. Congestion Control | |||
| QUIC's congestion control is based on TCP NewReno [RFC6582]. NewReno | QUIC's congestion control is based on TCP NewReno [RFC6582]. NewReno | |||
| is a congestion window based congestion control. QUIC specifies the | is a congestion window based congestion control. QUIC specifies the | |||
| congestion window in bytes rather than packets due to finer control | congestion window in bytes rather than packets due to finer control | |||
| and the ease of appropriate byte counting [RFC3465]. | and the ease of appropriate byte counting [RFC3465]. | |||
| QUIC hosts MUST NOT send packets if they would increase | QUIC hosts MUST NOT send packets if they would increase | |||
| bytes_in_flight (defined in Appendix B.2) beyond the available | bytes_in_flight (defined in Appendix B.2) beyond the available | |||
| congestion window, unless the packet is a probe packet sent after a | congestion window, unless the packet is a probe packet sent after a | |||
| skipping to change at page 17, line 33 ¶ | skipping to change at page 17, line 25 ¶ | |||
| are substantially delayed. This duration is computed as follows: | are substantially delayed. This duration is computed as follows: | |||
| (smoothed_rtt + 4 * rttvar + max_ack_delay) * | (smoothed_rtt + 4 * rttvar + max_ack_delay) * | |||
| kPersistentCongestionThreshold | kPersistentCongestionThreshold | |||
| For example, assume: | For example, assume: | |||
| smoothed_rtt = 1 rttvar = 0 max_ack_delay = 0 | smoothed_rtt = 1 rttvar = 0 max_ack_delay = 0 | |||
| kPersistentCongestionThreshold = 3 | kPersistentCongestionThreshold = 3 | |||
| If an eck-eliciting packet is sent at time = 0, the following | If an ack-eliciting packet is sent at time = 0, the following | |||
| scenario would illustrate persistent congestion: | scenario would illustrate persistent congestion: | |||
| +-----+------------------------+ | +-----+------------------------+ | |||
| | t=0 | Send Pkt #1 (App Data) | | | t=0 | Send Pkt #1 (App Data) | | |||
| +-----+------------------------+ | +-----+------------------------+ | |||
| | t=1 | Send Pkt #2 (PTO 1) | | | t=1 | Send Pkt #2 (PTO 1) | | |||
| | | | | | | | | |||
| | t=3 | Send Pkt #3 (PTO 2) | | | t=3 | Send Pkt #3 (PTO 2) | | |||
| | | | | | | | | |||
| | t=7 | Send Pkt #4 (PTO 3) | | | t=7 | Send Pkt #4 (PTO 3) | | |||
| skipping to change at page 18, line 13 ¶ | skipping to change at page 18, line 6 ¶ | |||
| kPersistentCongestionThreshold) = 3. Because the threshold was | kPersistentCongestionThreshold) = 3. Because the threshold was | |||
| reached and because none of the packets between the oldest and the | reached and because none of the packets between the oldest and the | |||
| newest packets are acknowledged, the network is considered to have | newest packets are acknowledged, the network is considered to have | |||
| experienced persistent congestion. | experienced persistent congestion. | |||
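The duration formula and the worked example above can be checked with a few lines of code. This is a minimal sketch; the function name is hypothetical, and time units are whatever the caller uses for the RTT variables.

```python
K_PERSISTENT_CONGESTION_THRESHOLD = 3  # RECOMMENDED value from Appendix B.1


def persistent_congestion_duration(smoothed_rtt, rttvar, max_ack_delay,
                                   threshold=K_PERSISTENT_CONGESTION_THRESHOLD):
    """Return (smoothed_rtt + 4 * rttvar + max_ack_delay) * threshold."""
    return (smoothed_rtt + 4 * rttvar + max_ack_delay) * threshold


# Values from the example: smoothed_rtt = 1, rttvar = 0, max_ack_delay = 0.
duration = persistent_congestion_duration(1, 0, 0)
```

With the example values, `duration` evaluates to 3, matching the figure quoted in the text.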
| When persistent congestion is established, the sender's congestion | When persistent congestion is established, the sender's congestion | |||
| window MUST be reduced to the minimum congestion window | window MUST be reduced to the minimum congestion window | |||
| (kMinimumWindow). This response of collapsing the congestion window | (kMinimumWindow). This response of collapsing the congestion window | |||
| on persistent congestion is functionally similar to a sender's | on persistent congestion is functionally similar to a sender's | |||
| response on a Retransmission Timeout (RTO) in TCP [RFC5681] after | response on a Retransmission Timeout (RTO) in TCP [RFC5681] after | |||
| Tail Loss Probes (TLP) [TLP]. | Tail Loss Probes (TLP) [RACK]. | |||
| 6.8. Pacing | 6.8. Pacing | |||
| This document does not specify a pacer, but it is RECOMMENDED that a | This document does not specify a pacer, but it is RECOMMENDED that a | |||
| sender pace sending of all in-flight packets based on input from the | sender pace sending of all in-flight packets based on input from the | |||
| congestion controller. For example, a pacer might distribute the | congestion controller. For example, a pacer might distribute the | |||
| congestion window over the SRTT when used with a window-based | congestion window over the smoothed RTT when used with a window-based | |||
| controller, and a pacer might use the rate estimate of a rate-based | controller, and a pacer might use the rate estimate of a rate-based | |||
| controller. | controller. | |||
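One way to read "distribute the congestion window over the smoothed RTT" is to space packets so that a full window is sent in exactly one smoothed RTT. The sketch below assumes that interpretation; `pacing_interval` is a hypothetical helper and not part of the draft.

```python
def pacing_interval(smoothed_rtt: float, congestion_window: int,
                    packet_size: int) -> float:
    """Seconds to wait before sending the next packet of packet_size bytes.

    Spreads congestion_window bytes evenly across one smoothed_rtt.
    """
    if congestion_window <= 0:
        raise ValueError("congestion_window must be positive")
    return smoothed_rtt * packet_size / congestion_window
```

For instance, with a 100 ms smoothed RTT and a 12000-byte window, 1200-byte packets would be spaced roughly 10 ms apart. A rate-based controller would supply the rate directly instead of deriving it from the window.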
| An implementation should take care to architect its congestion | An implementation should take care to architect its congestion | |||
| controller to work well with a pacer. For instance, a pacer might | controller to work well with a pacer. For instance, a pacer might | |||
| wrap the congestion controller and control the availability of the | wrap the congestion controller and control the availability of the | |||
| congestion window, or a pacer might pace out packets handed to it by | congestion window, or a pacer might pace out packets handed to it by | |||
| the congestion controller. Timely delivery of ACK frames is | the congestion controller. Timely delivery of ACK frames is | |||
| important for efficient loss recovery. Packets containing only ACK | important for efficient loss recovery. Packets containing only ACK | |||
| frames should therefore not be paced, to avoid delaying their | frames should therefore not be paced, to avoid delaying their | |||
| delivery to the peer. | delivery to the peer. | |||
| Sending multiple packets into the network without any delay between | ||||
| them creates a packet burst that might cause short-term congestion | ||||
| and losses. Implementations MUST either use pacing or limit such | ||||
| bursts to the initial congestion window, which is recommended to be | ||||
| the minimum of 10 * max_datagram_size and max(2 * max_datagram_size, | ||||
| 14720), where max_datagram_size is the current maximum size of a | ||||
| datagram for the connection, not including UDP or IP overhead. | ||||
| As an example of a well-known and publicly available implementation | As an example of a well-known and publicly available implementation | |||
| of a flow pacer, implementers are referred to the Fair Queue packet | of a flow pacer, implementers are referred to the Fair Queue packet | |||
| scheduler (fq qdisc) in Linux (3.11 onwards). | scheduler (fq qdisc) in Linux (3.11 onwards). | |||
| 6.9. Under-utilizing the Congestion Window | 6.9. Under-utilizing the Congestion Window | |||
| A congestion window that is under-utilized SHOULD NOT be increased in | When bytes in flight is smaller than the congestion window and | |||
| either slow start or congestion avoidance. This can happen due to | sending is not pacing limited, the congestion window is under- | |||
| insufficient application data or flow control credit. | utilized. When this occurs, the congestion window SHOULD NOT be | |||
| increased in either slow start or congestion avoidance. This can | ||||
| happen due to insufficient application data or flow control credit. | ||||
| A sender MAY use the pipeACK method described in section 4.3 of | A sender MAY use the pipeACK method described in section 4.3 of | |||
| [RFC7661] to determine if the congestion window is sufficiently | [RFC7661] to determine if the congestion window is sufficiently | |||
| utilized. | utilized. | |||
| A sender that paces packets (see Section 6.8) might delay sending | A sender that paces packets (see Section 6.8) might delay sending | |||
| packets and not fully utilize the congestion window due to this | packets and not fully utilize the congestion window due to this | |||
| delay. A sender should not consider itself application limited if it | delay. A sender should not consider itself application limited if it | |||
| would have fully utilized the congestion window without pacing delay. | would have fully utilized the congestion window without pacing delay. | |||
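The definition of under-utilization given in this section reduces to a simple predicate. The following is a sketch only; how a sender decides that it is pacing limited is left as an input, and the function name is an assumption.

```python
def is_cwnd_underutilized(bytes_in_flight: int, congestion_window: int,
                          pacing_limited: bool) -> bool:
    """True when the window is not full and pacing is not the reason."""
    return bytes_in_flight < congestion_window and not pacing_limited
```

A sender would skip congestion window growth on an acknowledgement whenever this predicate holds, in both slow start and congestion avoidance.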
| Bursting more than an initial window's worth of data into the network | A sender MAY implement alternative mechanisms to update its | |||
| might cause short-term congestion and losses. Implemementations | congestion window after periods of under-utilization, such as those | |||
| SHOULD either use pacing or reduce their congestion window to limit | proposed for TCP in [RFC7661]. | |||
| such bursts. | ||||
| A sender MAY implement alternate mechanisms to update its congestion | ||||
| window after periods of under-utilization, such as those proposed for | ||||
| TCP in [RFC7661]. | ||||
| 7. Security Considerations | 7. Security Considerations | |||
| 7.1. Congestion Signals | 7.1. Congestion Signals | |||
| Congestion control fundamentally involves the consumption of signals | Congestion control fundamentally involves the consumption of signals | |||
| - both loss and ECN codepoints - from unauthenticated entities. On- | - both loss and ECN codepoints - from unauthenticated entities. On- | |||
| path attackers can spoof or alter these signals. An attacker can | path attackers can spoof or alter these signals. An attacker can | |||
| cause endpoints to reduce their sending rate by dropping packets, or | cause endpoints to reduce their sending rate by dropping packets, or | |||
| alter send rate by changing ECN codepoints. | alter send rate by changing ECN codepoints. | |||
| skipping to change at page 20, line 17 ¶ | skipping to change at page 20, line 15 ¶ | |||
| 8. IANA Considerations | 8. IANA Considerations | |||
| This document has no IANA actions. Yet. | This document has no IANA actions. Yet. | |||
| 9. References | 9. References | |||
| 9.1. Normative References | 9.1. Normative References | |||
| [QUIC-TLS] | [QUIC-TLS] | |||
| Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure | Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure | |||
| QUIC", draft-ietf-quic-tls-23 (work in progress), | QUIC", draft-ietf-quic-tls-24 (work in progress), November | |||
| September 2019. | 2019. | |||
| [QUIC-TRANSPORT] | [QUIC-TRANSPORT] | |||
| Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based | Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based | |||
| Multiplexed and Secure Transport", draft-ietf-quic- | Multiplexed and Secure Transport", draft-ietf-quic- | |||
| transport-23 (work in progress), September 2019. | transport-24 (work in progress), November 2019. | |||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
| DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
| <https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
| [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
| 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
| May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
| skipping to change at page 22, line 10 ¶ | skipping to change at page 22, line 10 ¶ | |||
| [RFC7661] Fairhurst, G., Sathiaseelan, A., and R. Secchi, "Updating | [RFC7661] Fairhurst, G., Sathiaseelan, A., and R. Secchi, "Updating | |||
| TCP to Support Rate-Limited Traffic", RFC 7661, | TCP to Support Rate-Limited Traffic", RFC 7661, | |||
| DOI 10.17487/RFC7661, October 2015, | DOI 10.17487/RFC7661, October 2015, | |||
| <https://www.rfc-editor.org/info/rfc7661>. | <https://www.rfc-editor.org/info/rfc7661>. | |||
| [RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and | [RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and | |||
| R. Scheffenegger, "CUBIC for Fast Long-Distance Networks", | R. Scheffenegger, "CUBIC for Fast Long-Distance Networks", | |||
| RFC 8312, DOI 10.17487/RFC8312, February 2018, | RFC 8312, DOI 10.17487/RFC8312, February 2018, | |||
| <https://www.rfc-editor.org/info/rfc8312>. | <https://www.rfc-editor.org/info/rfc8312>. | |||
| [TLP] Dukkipati, N., Cardwell, N., Cheng, Y., and M. Mathis, | ||||
| "Tail Loss Probe (TLP): An Algorithm for Fast Recovery of | ||||
| Tail Losses", draft-dukkipati-tcpm-tcp-loss-probe-01 (work | ||||
| in progress), February 2013. | ||||
| 9.3. URIs | 9.3. URIs | |||
| [1] https://mailarchive.ietf.org/arch/search/?email_list=quic | [1] https://mailarchive.ietf.org/arch/search/?email_list=quic | |||
| [2] https://github.com/quicwg | [2] https://github.com/quicwg | |||
| [3] https://github.com/quicwg/base-drafts/labels/-recovery | [3] https://github.com/quicwg/base-drafts/labels/-recovery | |||
| Appendix A. Loss Recovery Pseudocode | Appendix A. Loss Recovery Pseudocode | |||
| skipping to change at page 30, line 51 ¶ | skipping to change at page 30, line 51 ¶ | |||
| We now describe an example implementation of the congestion | We now describe an example implementation of the congestion | |||
| controller described in Section 6. | controller described in Section 6. | |||
| B.1. Constants of interest | B.1. Constants of interest | |||
| Constants used in congestion control are based on a combination of | Constants used in congestion control are based on a combination of | |||
| RFCs, papers, and common practice. Some may need to be changed or | RFCs, papers, and common practice. Some may need to be changed or | |||
| negotiated in order to better suit a variety of environments. | negotiated in order to better suit a variety of environments. | |||
| kMaxDatagramSize: The sender's maximum payload size. Does not | ||||
| include UDP or IP overhead. The max packet size is used for | ||||
| calculating initial and minimum congestion windows. The | ||||
| RECOMMENDED value is 1200 bytes. | ||||
| kInitialWindow: Default limit on the initial amount of data in | kInitialWindow: Default limit on the initial amount of data in | |||
| flight, in bytes. Taken from [RFC6928], but increased slightly to | flight, in bytes. Taken from [RFC6928], but increased slightly to | |||
| account for the smaller 8 byte overhead of UDP vs 20 bytes for | account for the smaller 8 byte overhead of UDP vs 20 bytes for | |||
| TCP. The RECOMMENDED value is the minimum of 10 * | TCP. The RECOMMENDED value is the minimum of 10 * | |||
| kMaxDatagramSize and max(2* kMaxDatagramSize, 14720). | max_datagram_size and max(2 * max_datagram_size, 14720). | |||
| kMinimumWindow: Minimum congestion window in bytes. The RECOMMENDED | kMinimumWindow: Minimum congestion window in bytes. The RECOMMENDED | |||
| value is 2 * kMaxDatagramSize. | value is 2 * max_datagram_size. | |||
| kLossReductionFactor: Reduction in congestion window when a new loss | kLossReductionFactor: Reduction in congestion window when a new loss | |||
| event is detected. The RECOMMENDED value is 0.5. | event is detected. The RECOMMENDED value is 0.5. | |||
| kPersistentCongestionThreshold: Period of time for persistent | kPersistentCongestionThreshold: Period of time for persistent | |||
| congestion to be established, specified as a PTO multiplier. The | congestion to be established, specified as a PTO multiplier. The | |||
| rationale for this threshold is to enable a sender to use initial | rationale for this threshold is to enable a sender to use initial | |||
| PTOs for aggressive probing, as TCP does with Tail Loss Probe | PTOs for aggressive probing, as TCP does with Tail Loss Probe | |||
| (TLP) [TLP] [RACK], before establishing persistent congestion, as | (TLP) [RACK], before establishing persistent congestion, as TCP | |||
| TCP does with a Retransmission Timeout (RTO) [RFC5681]. The | does with a Retransmission Timeout (RTO) [RFC5681]. The | |||
| RECOMMENDED value for kPersistentCongestionThreshold is 3, which | RECOMMENDED value for kPersistentCongestionThreshold is 3, which | |||
| is approximately equivalent to having two TLPs before an RTO in | is approximately equivalent to having two TLPs before an RTO in | |||
| TCP. | TCP. | |||
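The recommended window constants can be computed directly from max_datagram_size. The sketch below uses the draft-24 naming; the function names themselves are hypothetical.

```python
def initial_window(max_datagram_size: int) -> int:
    """RECOMMENDED kInitialWindow, in bytes."""
    return min(10 * max_datagram_size, max(2 * max_datagram_size, 14720))


def minimum_window(max_datagram_size: int) -> int:
    """RECOMMENDED kMinimumWindow, in bytes."""
    return 2 * max_datagram_size
```

With a 1200-byte datagram size the initial window is 12000 bytes (ten datagrams). With 1500-byte datagrams the 14720-byte cap applies, and for very large datagrams the window is two datagrams.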
| B.2. Variables of interest | B.2. Variables of interest | |||
| Variables required to implement the congestion control mechanisms are | Variables required to implement the congestion control mechanisms are | |||
| described in this section. | described in this section. | |||
| max_datagram_size: The sender's current maximum payload size. Does | ||||
| not include UDP or IP overhead. The max datagram size is used for | ||||
| congestion window computations. An endpoint sets the value of | ||||
| this variable based on its PMTU (see Section 14.1 of | ||||
| [QUIC-TRANSPORT]), with a minimum value of 1200 bytes. | ||||
| ecn_ce_counters[kPacketNumberSpace]: The highest value reported for | ecn_ce_counters[kPacketNumberSpace]: The highest value reported for | |||
| the ECN-CE counter in the packet number space by the peer in an | the ECN-CE counter in the packet number space by the peer in an | |||
| ACK frame. This value is used to detect increases in the reported | ACK frame. This value is used to detect increases in the reported | |||
| ECN-CE counter. | ECN-CE counter. | |||
| bytes_in_flight: The sum of the size in bytes of all sent packets | bytes_in_flight: The sum of the size in bytes of all sent packets | |||
| that contain at least one ack-eliciting or PADDING frame, and have | that contain at least one ack-eliciting or PADDING frame, and have | |||
| not been acked or declared lost. The size does not include IP or | not been acked or declared lost. The size does not include IP or | |||
| UDP overhead, but does include the QUIC header and AEAD overhead. | UDP overhead, but does include the QUIC header and AEAD overhead. | |||
| Packets only containing ACK frames do not count towards | Packets only containing ACK frames do not count towards | |||
| skipping to change at page 33, line 23 ¶ | skipping to change at page 33, line 23 ¶ | |||
| return | return | |||
| if (IsAppLimited()): | if (IsAppLimited()): | |||
| // Do not increase congestion_window if application | // Do not increase congestion_window if application | |||
| // limited. | // limited. | |||
| return | return | |||
| if (congestion_window < ssthresh): | if (congestion_window < ssthresh): | |||
| // Slow start. | // Slow start. | |||
| congestion_window += acked_packet.size | congestion_window += acked_packet.size | |||
| else: | else: | |||
| // Congestion avoidance. | // Congestion avoidance. | |||
| congestion_window += kMaxDatagramSize * acked_packet.size | congestion_window += max_datagram_size * acked_packet.size | |||
| / congestion_window | / congestion_window | |||
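The visible part of the pseudocode above can be restated as a pure function, which makes the two growth regimes easy to test. This sketch covers only the application-limited check and the two growth branches shown; any checks that precede them in the full pseudocode are not modelled. Integer division is an assumption made here to keep the window a whole number of bytes.

```python
def window_after_ack(congestion_window: int, ssthresh: float,
                     acked_size: int, max_datagram_size: int,
                     app_limited: bool = False) -> int:
    """Return the congestion window after one newly acknowledged packet."""
    if app_limited:
        # Do not increase the window if application limited.
        return congestion_window
    if congestion_window < ssthresh:
        # Slow start: grow by the number of bytes acknowledged.
        return congestion_window + acked_size
    # Congestion avoidance: roughly one datagram per window's worth of acks.
    return congestion_window + max_datagram_size * acked_size // congestion_window
```

For example, a 12000-byte window below ssthresh grows by the full 1200 acknowledged bytes, while the same window in congestion avoidance grows by only 120 bytes.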
| B.6. On New Congestion Event | B.6. On New Congestion Event | |||
| Invoked from ProcessECN and OnPacketsLost when a new congestion event | Invoked from ProcessECN and OnPacketsLost when a new congestion event | |||
| is detected. May start a new recovery period and reduces the | is detected. May start a new recovery period and reduces the | |||
| congestion window. | congestion window. | |||
| CongestionEvent(sent_time): | CongestionEvent(sent_time): | |||
| // Start a new congestion event if packet was sent after the | // Start a new congestion event if packet was sent after the | |||
| skipping to change at page 34, line 37 ¶ | skipping to change at page 34, line 37 ¶ | |||
| if (InPersistentCongestion(largest_lost_packet)): | if (InPersistentCongestion(largest_lost_packet)): | |||
| congestion_window = kMinimumWindow | congestion_window = kMinimumWindow | |||
| Appendix C. Change Log | Appendix C. Change Log | |||
| *RFC Editor's Note:* Please remove this section prior to | *RFC Editor's Note:* Please remove this section prior to | |||
| publication of a final version of this document. | publication of a final version of this document. | |||
| Issue and pull request numbers are listed with a leading octothorp. | Issue and pull request numbers are listed with a leading octothorp. | |||
| C.1. Since draft-ietf-quic-recovery-22 | C.1. Since draft-ietf-quic-recovery-23 | |||
| o Define under-utilizing the congestion window (#2630, #2686, #2675) | ||||
| o PTO MUST send data if possible (#3056, #3057) | ||||
| o Connection Close is not ack-eliciting (#3097, #3098) | ||||
| o MUST limit bursts to the initial congestion window (#3160) | ||||
| o Define the current max_datagram_size for congestion control | ||||
| (#3041, #3167) | ||||
| o Separate PTO by packet number space (#3067, #3074, #3066) | ||||
| C.2. Since draft-ietf-quic-recovery-22 | ||||
| o PTO should always send an ack-eliciting packet (#2895) | o PTO should always send an ack-eliciting packet (#2895) | |||
| o Unify the Handshake Timer with the PTO timer (#2648, #2658, #2886) | o Unify the Handshake Timer with the PTO timer (#2648, #2658, #2886) | |||
| o Move ACK generation text to transport draft (#1860, #2916) | o Move ACK generation text to transport draft (#1860, #2916) | |||
| C.2. Since draft-ietf-quic-recovery-21 | C.3. Since draft-ietf-quic-recovery-21 | |||
| o No changes | o No changes | |||
| C.3. Since draft-ietf-quic-recovery-20 | C.4. Since draft-ietf-quic-recovery-20 | |||
| o Path validation can be used as initial RTT value (#2644, #2687) | o Path validation can be used as initial RTT value (#2644, #2687) | |||
| o max_ack_delay transport parameter defaults to 0 (#2638, #2646) | o max_ack_delay transport parameter defaults to 0 (#2638, #2646) | |||
| o Ack Delay only measures intentional delays induced by the | o Ack Delay only measures intentional delays induced by the | |||
| implementation (#2596, #2786) | implementation (#2596, #2786) | |||
| C.4. Since draft-ietf-quic-recovery-19 | C.5. Since draft-ietf-quic-recovery-19 | |||
| o Change kPersistentThreshold from an exponent to a multiplier | o Change kPersistentThreshold from an exponent to a multiplier | |||
| (#2557) | (#2557) | |||
| o Send a PING if the PTO timer fires and there's nothing to send | o Send a PING if the PTO timer fires and there's nothing to send | |||
| (#2624) | (#2624) | |||
| o Set loss delay to at least kGranularity (#2617) | o Set loss delay to at least kGranularity (#2617) | |||
| o Merge application limited and sending after idle sections. Always | o Merge application limited and sending after idle sections. Always | |||
| skipping to change at page 35, line 39 ¶ | skipping to change at page 36, line 5 ¶ | |||
| packet is ack-eliciting but the largest_acked is not (#2592) | packet is ack-eliciting but the largest_acked is not (#2592) | |||
| o Don't arm the handshake timer if there is no handshake data | o Don't arm the handshake timer if there is no handshake data | |||
| (#2590) | (#2590) | |||
| o Clarify that the time threshold loss alarm takes precedence over | o Clarify that the time threshold loss alarm takes precedence over | |||
| the crypto handshake timer (#2590, #2620) | the crypto handshake timer (#2590, #2620) | |||
| o Change initial RTT to 500ms to align with RFC6298 (#2184) | o Change initial RTT to 500ms to align with RFC6298 (#2184) | |||
| C.5. Since draft-ietf-quic-recovery-18 | C.6. Since draft-ietf-quic-recovery-18 | |||
| o Change IW byte limit to 14720 from 14600 (#2494) | o Change IW byte limit to 14720 from 14600 (#2494) | |||
| o Update PTO calculation to match RFC6298 (#2480, #2489, #2490) | o Update PTO calculation to match RFC6298 (#2480, #2489, #2490) | |||
| o Improve loss detection's description of multiple packet number | o Improve loss detection's description of multiple packet number | |||
| spaces and pseudocode (#2485, #2451, #2417) | spaces and pseudocode (#2485, #2451, #2417) | |||
| o Declare persistent congestion even if non-probe packets are sent | o Declare persistent congestion even if non-probe packets are sent | |||
| and don't make persistent congestion more aggressive than RTO | and don't make persistent congestion more aggressive than RTO | |||
| verified was (#2365, #2244) | verified was (#2365, #2244) | |||
| o Move pseudocode to the appendices (#2408) | o Move pseudocode to the appendices (#2408) | |||
| o What to send on multiple PTOs (#2380) | o What to send on multiple PTOs (#2380) | |||
| C.6. Since draft-ietf-quic-recovery-17 | C.7. Since draft-ietf-quic-recovery-17 | |||
| o After Probe Timeout discard in-flight packets or send another | o After Probe Timeout discard in-flight packets or send another | |||
| (#2212, #1965) | (#2212, #1965) | |||
| o Endpoints discard initial keys as soon as handshake keys are | o Endpoints discard initial keys as soon as handshake keys are | |||
| available (#1951, #2045) | available (#1951, #2045) | |||
| o 0-RTT state is discarded when 0-RTT is rejected (#2300) | o 0-RTT state is discarded when 0-RTT is rejected (#2300) | |||
| o Loss detection timer is cancelled when ack-eliciting frames are in | o Loss detection timer is cancelled when ack-eliciting frames are in | |||
| skipping to change at page 36, line 32 ¶ | skipping to change at page 36, line 48 ¶ | |||
| controller (#2138, 2187) | controller (#2138, 2187) | |||
| o Process ECN counts before marking packets lost (#2142) | o Process ECN counts before marking packets lost (#2142) | |||
| o Mark packets lost before resetting crypto_count and pto_count | o Mark packets lost before resetting crypto_count and pto_count | |||
| (#2208, #2209) | (#2208, #2209) | |||
| o Congestion and loss recovery state are discarded when keys are | o Congestion and loss recovery state are discarded when keys are | |||
| discarded (#2327) | discarded (#2327) | |||
| C.7. Since draft-ietf-quic-recovery-16 | C.8. Since draft-ietf-quic-recovery-16 | |||
| o Unify TLP and RTO into a single PTO; eliminate min RTO, min TLP | o Unify TLP and RTO into a single PTO; eliminate min RTO, min TLP | |||
| and min crypto timeouts; eliminate timeout validation (#2114, | and min crypto timeouts; eliminate timeout validation (#2114, | |||
| #2166, #2168, #1017) | #2166, #2168, #1017) | |||
| o Redefine how congestion avoidance in terms of when the period | o Redefine how congestion avoidance in terms of when the period | |||
| starts (#1928, #1930) | starts (#1928, #1930) | |||
| o Document what needs to be tracked for packets that are in flight | o Document what needs to be tracked for packets that are in flight | |||
| (#765, #1724, #1939) | (#765, #1724, #1939) | |||
| skipping to change at page 37, line 7 ¶ | skipping to change at page 37, line 22 ¶ | |||
| (#1969, #1212, #934, #1974) | (#1969, #1212, #934, #1974) | |||
| o Reduce congestion window after idle, unless pacing is used (#2007, | o Reduce congestion window after idle, unless pacing is used (#2007, | |||
| #2023) | #2023) | |||
| o Disable RTT calculation for packets that don't elicit | o Disable RTT calculation for packets that don't elicit | |||
| acknowledgment (#2060, #2078) | acknowledgment (#2060, #2078) | |||
| o Limit ack_delay by max_ack_delay (#2060, #2099) | o Limit ack_delay by max_ack_delay (#2060, #2099) | |||
| o Initial keys are discarded once Handshake are avaialble (#1951, | o Initial keys are discarded once Handshake keys are available | |||
| #2045) | (#1951, #2045) | |||
| o Reorder ECN and loss detection in pseudocode (#2142) | o Reorder ECN and loss detection in pseudocode (#2142) | |||
| o Only cancel loss detection timer if ack-eliciting packets are in | o Only cancel loss detection timer if ack-eliciting packets are in | |||
| flight (#2093, #2117) | flight (#2093, #2117) | |||
| C.8. Since draft-ietf-quic-recovery-14 | C.9. Since draft-ietf-quic-recovery-14 | |||
| o Used max_ack_delay from transport params (#1796, #1782) | o Used max_ack_delay from transport params (#1796, #1782) | |||
| o Merge ACK and ACK_ECN (#1783) | o Merge ACK and ACK_ECN (#1783) | |||
| C.9. Since draft-ietf-quic-recovery-13 | C.10. Since draft-ietf-quic-recovery-13 | |||
| o Corrected the lack of ssthresh reduction in CongestionEvent | o Corrected the lack of ssthresh reduction in CongestionEvent | |||
| pseudocode (#1598) | pseudocode (#1598) | |||
| o Considerations for ECN spoofing (#1426, #1626) | o Considerations for ECN spoofing (#1426, #1626) | |||
| o Clarifications for PADDING and congestion control (#837, #838, | o Clarifications for PADDING and congestion control (#837, #838, | |||
| #1517, #1531, #1540) | #1517, #1531, #1540) | |||
| o Reduce early retransmission timer to RTT/8 (#945, #1581) | o Reduce early retransmission timer to RTT/8 (#945, #1581) | |||
| o Packets are declared lost after an RTO is verified (#935, #1582) | o Packets are declared lost after an RTO is verified (#935, #1582) | |||
| C.10. Since draft-ietf-quic-recovery-12 | C.11. Since draft-ietf-quic-recovery-12 | |||
| o Changes to manage separate packet number spaces and encryption | o Changes to manage separate packet number spaces and encryption | |||
| levels (#1190, #1242, #1413, #1450) | levels (#1190, #1242, #1413, #1450) | |||
| o Added ECN feedback mechanisms and handling; new ACK_ECN frame | o Added ECN feedback mechanisms and handling; new ACK_ECN frame | |||
| (#804, #805, #1372) | (#804, #805, #1372) | |||
| C.11. Since draft-ietf-quic-recovery-11 | C.12. Since draft-ietf-quic-recovery-11 | |||
| No significant changes. | No significant changes. | |||
| C.12. Since draft-ietf-quic-recovery-10 | C.13. Since draft-ietf-quic-recovery-10 | |||
| o Improved text on ack generation (#1139, #1159) | o Improved text on ack generation (#1139, #1159) | |||
| o Make references to TCP recovery mechanisms informational (#1195) | o Make references to TCP recovery mechanisms informational (#1195) | |||
| o Define time_of_last_sent_handshake_packet (#1171) | o Define time_of_last_sent_handshake_packet (#1171) | |||
| o Added signal from TLS the data it includes needs to be sent in a | o Added signal from TLS the data it includes needs to be sent in a | |||
| Retry packet (#1061, #1199) | Retry packet (#1061, #1199) | |||
| o Minimum RTT (min_rtt) is initialized with an infinite value | o Minimum RTT (min_rtt) is initialized with an infinite value | |||
| (#1169) | (#1169) | |||
| C.13. Since draft-ietf-quic-recovery-09 | C.14. Since draft-ietf-quic-recovery-09 | |||
| No significant changes. | No significant changes. | |||
| C.14. Since draft-ietf-quic-recovery-08 | C.15. Since draft-ietf-quic-recovery-08 | |||
| o Clarified pacing and RTO (#967, #977) | o Clarified pacing and RTO (#967, #977) | |||
| C.15. Since draft-ietf-quic-recovery-07 | C.16. Since draft-ietf-quic-recovery-07 | |||
| o Include Ack Delay in RTO(and TLP) computations (#981) | o Include Ack Delay in RTO(and TLP) computations (#981) | |||
| o Ack Delay in SRTT computation (#961) | o Ack Delay in SRTT computation (#961) | |||
| o Default RTT and Slow Start (#590) | o Default RTT and Slow Start (#590) | |||
| o Many editorial fixes. | o Many editorial fixes. | |||
| C.16. Since draft-ietf-quic-recovery-06 | C.17. Since draft-ietf-quic-recovery-06 | |||
| No significant changes. | No significant changes. | |||
| C.17. Since draft-ietf-quic-recovery-05 | C.18. Since draft-ietf-quic-recovery-05 | |||
| o Add more congestion control text (#776) | o Add more congestion control text (#776) | |||
| C.18. Since draft-ietf-quic-recovery-04 | C.19. Since draft-ietf-quic-recovery-04 | |||
| No significant changes. | No significant changes. | |||
| C.19. Since draft-ietf-quic-recovery-03 | C.20. Since draft-ietf-quic-recovery-03 | |||
| No significant changes. | No significant changes. | |||
| C.20. Since draft-ietf-quic-recovery-02 | C.21. Since draft-ietf-quic-recovery-02 | |||
| o Integrate F-RTO (#544, #409) | o Integrate F-RTO (#544, #409) | |||
| o Add congestion control (#545, #395) | o Add congestion control (#545, #395) | |||
| o Require connection abort if a skipped packet was acknowledged | o Require connection abort if a skipped packet was acknowledged | |||
| (#415) | (#415) | |||
| o Simplify RTO calculations (#142, #417) | o Simplify RTO calculations (#142, #417) | |||
| C.21. Since draft-ietf-quic-recovery-01 | C.22. Since draft-ietf-quic-recovery-01 | |||
| o Overview added to loss detection | o Overview added to loss detection | |||
| o Changes initial default RTT to 100ms | o Changes initial default RTT to 100ms | |||
| o Added time-based loss detection and fixes early retransmit | o Added time-based loss detection and fixes early retransmit | |||
| o Clarified loss recovery for handshake packets | o Clarified loss recovery for handshake packets | |||
| o Fixed references and made TCP references informative | o Fixed references and made TCP references informative | |||
| C.22. Since draft-ietf-quic-recovery-00 | C.23. Since draft-ietf-quic-recovery-00 | |||
| o Improved description of constants and ACK behavior | o Improved description of constants and ACK behavior | |||
| C.23. Since draft-iyengar-quic-loss-recovery-01 | C.24. Since draft-iyengar-quic-loss-recovery-01 | |||
| o Adopted as base for draft-ietf-quic-recovery | o Adopted as base for draft-ietf-quic-recovery | |||
| o Updated authors/editors list | o Updated authors/editors list | |||
| o Added table of contents | o Added table of contents | |||
| Acknowledgments | Acknowledgments | |||
| Authors' Addresses | Authors' Addresses | |||
| Jana Iyengar (editor) | Jana Iyengar (editor) | |||
| Fastly | Fastly | |||
| Email: jri.ietf@gmail.com | Email: jri.ietf@gmail.com | |||
| End of changes. 69 change blocks. | ||||
| 144 lines changed or deleted | 155 lines changed or added | |||
This html diff was produced by rfcdiff 1.45. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ | ||||