Network Working Group                                           Q. Xiong
Internet-Draft                                           ZTE Corporation
Intended status: Informational                                    K. Yao
Expires: 1 March 2025                                       China Mobile
                                                                C. Huang
                                                           China Telecom
                                                                  Z. Han
                                                            China Unicom
                                                          28 August 2024


  Use Cases, Problems and Requirements for High Performance Wide Area
                                Network
                  draft-xiong-uc-problem-req-hp-wan-00

Abstract

   A High Performance Wide Area Network (HP-WAN) is designed for
   applications such as scientific research, education, and other data-
   intensive applications that demand massive data transmission.  It
   needs to ensure data integrity and provide stable and efficient
   transmission services.

   This document describes the use cases, analyses the problems, and
   outlines the requirements for HP-WANs.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 1 March 2025.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.




Xiong, et al.             Expires 1 March 2025                  [Page 1]

Internet-Draft  Use Cases, Problems and Requirements for     August 2024


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
     1.1.  Requirements Language . . . . . . . . . . . . . . . . . .   4
   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   4
   3.  Use Cases . . . . . . . . . . . . . . . . . . . . . . . . . .   5
     3.1.  High Performance Computing (HPC)  . . . . . . . . . . . .   5
     3.2.  Backup and Disaster Recovery  . . . . . . . . . . . . . .   5
     3.3.  Multimedia Content Production . . . . . . . . . . . . . .   6
     3.4.  AI Training . . . . . . . . . . . . . . . . . . . . . . .   6
   4.  Problem Statements  . . . . . . . . . . . . . . . . . . . . .   7
     4.1.  Challenging with Low Bandwidth Utilization of a Single
           Elephant Flow . . . . . . . . . . . . . . . . . . . . . .   7
     4.2.  Challenging with Massive Flows Data with Large Burst  . .   7
     4.3.  Challenging with Long-distance Delay and Slow Feedback  .   8
     4.4.  Challenging with Packet Loss Impacting Transport
           Protocols . . . . . . . . . . . . . . . . . . . . . . . .   8
   5.  Requirements  . . . . . . . . . . . . . . . . . . . . . . . .   9
     5.1.  Service Requirements  . . . . . . . . . . . . . . . . . .   9
     5.2.  Performance Requirements  . . . . . . . . . . . . . . . .  10
   6.  Security Considerations . . . . . . . . . . . . . . . . . . .  10
   7.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  10
   8.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  10
   9.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  11
     9.1.  Normative References  . . . . . . . . . . . . . . . . . .  11
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  11

1.  Introduction

   Data is fundamental to many fields of scientific research, including
   biology, astronomy, and artificial intelligence (AI).  Within these
   fields, many applications generate huge volumes of data by using
   advanced instruments and high-end computing devices.  For data
   sharing and data backup, these applications usually require massive
   data transmission over long distances, for example, sharing data
   between research institutes thousands of kilometers apart.  These
   applications include High Performance Computing (HPC) for scientific
   research, cloud storage and backup of industrial internet data,
   distributed training, and so on.  The network needs to ensure data
   integrity
   and provide stable and efficient transmission services in Wide Area
   Networks (WANs).  These WANs need to connect research institutions,
   universities, and data centers across large geographical areas.

   Traditional data migration solutions include manual transportation of
   hard copies, which not only incurs more labor cost but also lacks
   safety, and high-speed dedicated connectivity (e.g., direct optical
   connection), which is expensive.  Moreover, the applications may
   demand periodic or temporary migration and task-based data
   transmission with low real-time requirements and variable
   transmission frequency, all of which lead to low network utilization
   and low cost-effectiveness.

   The massive data may instead be transmitted over non-dedicated WANs,
   which demands high performance such as high-throughput data
   transmission.  Throughput depends on transport layer protocols such
   as the Transmission Control Protocol (TCP), QUIC, and Remote Direct
   Memory Access (RDMA).  However, the performance of TCP is impacted by
   its packet loss retransmission techniques.  For RDMA, there are three
   main implementations: InfiniBand (IB), which is a high-performance
   dedicated network technology but requires specific InfiniBand
   hardware support; the Internet Wide Area RDMA Protocol (iWARP), which
   is based on TCP/IP, so its transmission performance may be affected
   by the congestion control and flow control of TCP; and RDMA over
   Converged Ethernet (RoCE), which allows RDMA to run over Ethernet but
   has applicability issues over WANs.

   Moreover, long-distance connections and massive data transmission
   between two or more sites have become key factors affecting
   performance.  For instance, long-distance networks may have more
   uncertainties, such as routing changes, network congestion, packet
   loss, and link quality fluctuations, all of which may have a negative
   impact on performance.  The services are massive and concurrent, with
   multiple types and different traffic models.  Elephant flows, for
   example, have short intervals, high speed, and large data scale, and
   may occupy a large amount of network resources and affect
   performance.

   A High Performance Wide Area Network (HP-WAN) is designed
   specifically to meet the high-speed, low-latency, and high-capacity
   needs of applications with massive data sets.  These applications put
   forward higher performance requirements, such as ultra-high goodput,
   high bandwidth utilization, ultra-low packet loss ratio, and
   resilience, to ensure effective high-throughput transmission.

   This document describes the use cases, analyses the problems, and
   outlines the requirements for HP-WANs.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.  Terminology

   This document uses the following terminology.

   High Performance Wide Area Network (HP-WAN): a network designed
   specifically to meet the high-speed, low-latency, and high-capacity
   needs of scientific research, education, and data-intensive
   applications.  The primary goal of an HP-WAN is to achieve massive
   data transmission, which puts forward higher performance requirements
   such as ultra-high goodput, high bandwidth utilization, ultra-low
   packet loss ratio, and resilience to ensure effective high-throughput
   transmission.

   This document also uses the following abbreviations:

   DC:            Data Center

   DCI:           Data Center Interconnection

   HPC:           High Performance Computing

   WAN:           Wide Area Network

   MAN:           Metropolitan Area Network

   PFC:           Priority Flow Control

   ECN:           Explicit Congestion Notification

   ECMP:          Equal-Cost Multipath

   RTT:           Round-Trip Time

   TCP:           Transmission Control Protocol

   RDMA:          Remote Direct Memory Access

   QUIC:          a UDP-based transport protocol (the name is not an
                  acronym)

3.  Use Cases

   Several use cases are documented for scenarios requiring high-
   performance data transmission over WANs.

3.1.  High Performance Computing (HPC)

   High Performance Computing (HPC) uses computing clusters to perform
   complex scientific computing and data analysis tasks.  HPC is
   critical for solving complex problems in various fields such as
   scientific research, engineering, finance, and data analysis.

   For example, the research data of large science and engineering
   projects carried out in cooperation with many research institutions
   requires long-term archiving of about 50 to 300 PB of data every
   year.  The PSII protein process generates 30 to 120 high-resolution
   images per second during experiments.  This results in 60 to 100 GB
   of data every five minutes, which must be transmitted from one
   laboratory to another for analysis.  Another example is astronomical
   data calculation for the Five-hundred-meter Aperture Spherical radio
   Telescope (FAST), with over 200 observations for each project, each
   project generating TB to PB of observation data, and about 15 PB of
   data produced per year.
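
   As a rough illustration of what these volumes imply for the network,
   the following sketch converts a data volume and a time interval into
   a sustained rate.  It assumes decimal units (1 GB = 10^9 bytes, 1 PB
   = 10^15 bytes) and a 365-day year; the function name is hypothetical
   and is not part of any HP-WAN specification.

```python
def sustained_rate_gbps(volume_bytes: float, interval_s: float) -> float:
    """Average rate in Gbps needed to move volume_bytes within interval_s."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return volume_bytes * 8 / interval_s / 1e9


GB = 1e9    # assumption: decimal gigabyte
PB = 1e15   # assumption: decimal petabyte
FIVE_MINUTES = 5 * 60
YEAR = 365 * 24 * 3600

# PSII example from the text: 60 to 100 GB every five minutes.
psii_low = sustained_rate_gbps(60 * GB, FIVE_MINUTES)
psii_high = sustained_rate_gbps(100 * GB, FIVE_MINUTES)

# FAST example from the text: about 15 PB per year, averaged over the year.
fast_avg = sustained_rate_gbps(15 * PB, YEAR)

print(f"PSII: {psii_low:.2f} to {psii_high:.2f} Gbps")
print(f"FAST: {fast_avg:.2f} Gbps on average")
```

   Under these assumptions, the PSII example needs roughly 1.6 to 2.7
   Gbps of sustained goodput and FAST averages about 3.8 Gbps.  Peak
   rates for a single project transfer may be considerably higher than
   these averages.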

   HPC requires a high-bandwidth, high-speed network to facilitate
   rapid data exchange between processing units.  It also requires high-
   capacity and high-throughput storage solutions to handle the vast
   amounts of data generated by simulations and computations.  The
   network needs to support large-scale parallel processing, high-speed
   data transmission, and low-latency communication to achieve effective
   collaboration between computing nodes.

3.2.  Backup and Disaster Recovery

   With the development of the cloud computing industry, cloud data
   centers are bearing a large number of enterprise IT services.  The
   storage, transmission, and protection of the massively growing data
   bring new challenges.

   For instance, disaster recovery of core application data is required
   to ensure enterprise data security and service continuity.  In the
   scenario of disaster recovery of an operator's traffic data, the
   daily data backup volume of a single IT cloud resource pool is at the
   TB level.  The primary and backup data centers are normally built in
   different locations with long data transmission distances.  However,
   they do not have strict requirements for data transmission time.  By
   exploiting the tidal effect of the network, the idle bandwidth at
   night can be used for the transmission, improving data transmission
   efficiency and reducing data transmission cost.
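
   The idea of using idle night-time bandwidth can be sketched as a
   simple feasibility check.  The hourly idle-bandwidth profile and the
   backup volumes below are invented for illustration only, and the
   function name is hypothetical; a real scheduler would work from
   measured link utilization.

```python
def hours_to_finish(volume_bytes, idle_gbps_by_hour):
    """Return the hours needed to send volume_bytes using the idle
    bandwidth available in each successive hour, or None if the
    window is too short.  The final partial hour is fractional."""
    remaining_bits = volume_bytes * 8
    if remaining_bits <= 0:
        return 0.0
    elapsed = 0.0
    for gbps in idle_gbps_by_hour:
        capacity_bits = gbps * 1e9 * 3600
        if capacity_bits > 0 and capacity_bits >= remaining_bits:
            return elapsed + remaining_bits / capacity_bits
        remaining_bits -= capacity_bits
        elapsed += 1.0
    return None


TB = 1e12  # assumption: decimal terabyte

# Hypothetical idle bandwidth (Gbps) for eight night-time hours.
night_profile = [2, 4, 6, 6, 6, 4, 2, 1]

small_backup = hours_to_finish(10 * TB, night_profile)
large_backup = hours_to_finish(100 * TB, night_profile)

print(f"10 TB backup finishes after {small_backup:.2f} hours")
print(f"100 TB backup fits in the window: {large_backup is not None}")
```

   With this hypothetical profile, a 10 TB backup completes in under
   five hours, while a 100 TB backup does not fit into a single night
   and would need more bandwidth or a longer window.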

3.3.  Multimedia Content Production

   Multimedia Content Production refers to the process of creating and
   editing content that combines different media forms such as text,
   audio, images, animations, and video.  This field is characterized by
   the use of digital technology to produce engaging and dynamic content
   for various platforms, including film, television, the internet, and
   mobile devices.  It requires processing a large amount of data,
   including raw video materials, special effects, and rendering
   results.

   For example, in film and video production, the raw material data of
   a large-scale variety show or film and television program is at the
   PB level, with a single transmission of data in the range of 10 TB to
   100 TB.  With the development of new media such as 4K/8K, 5G, AI,
   VR/AR, and short video, large amounts of audio and video data need to
   be transmitted between data centers or different storage sites across
   long distances.  For AR/VR video, terminal output at 1080p image
   quality requires 40M per user.  These applications demand data
   transmission with traffic characteristics such as massive data scale
   and large bursts.

3.4.  AI Training

   With the increasing demand for computing power in large-scale AI
   model training, the scale of a single data center is limited by
   factors such as power supply.  AI training clusters are therefore
   expanding from a single data center to multiple DCs.  Collaborative
   training across multiple DCs refers to distributed machine learning
   training across multiple data centers, which can improve
   computational efficiency, accelerate model training, and utilize
   more data resources.

   For example, the training data for one deep learning training process
   has reached 3.05 TB.  Uploading large-model training templates
   requires uploading TB- to PB-level data to the data center.  Each
   training session has fewer data flows, each with larger bandwidth.
   About 20% of the current network's services account for 80% of the
   traffic, which results in elephant flows.  Compared with traditional
   DCI scenarios, parameter exchange significantly increases the amount
   of data transmitted across DCs, typically from tens to hundreds of
   TB.  The network should provide sufficient bandwidth, low latency,
   and high reliability for communication between data centers.

4.  Problem Statements

   The challenges of effective high-performance transmission in HP-WANs
   come from massive concurrent services, long-distance delays, and
   packet loss.  Existing network technologies have various problems and
   cannot meet these demands.  This section outlines the problems for
   HP-WANs.

4.1.  Challenging with Low Bandwidth Utilization of a Single Elephant
      Flow

   In HP-WAN applications, a large amount of data is transmitted at one
   time; for example, a single flow may carry TB to PB of data.  Such
   elephant flows last for a long time, with short intervals, high
   speed, and large data scale, and may occupy a large amount of network
   resources and affect performance.  Network congestion and load
   imbalance then lead to low bandwidth utilization.

   When transmitting massive data, the traffic is mainly elephant flows
   and network resources in WANs are insufficient.  Uneven network load
   leads to decreased network throughput and low link utilization.
   Load balancing refers to allocating load (traffic) across multiple
   links for forwarding.  For example, when flow-based ECMP hashes
   several elephant flows onto the same link, the resulting hash
   collisions and poor load balancing cause congestion and packet loss.
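
   The hash collision problem can be illustrated with a minimal sketch
   of flow-based ECMP.  SHA-256 stands in for the vendor-specific hash
   used by real routers, the addresses and ports are made up, and the
   collision probability assumes that the hash spreads flows uniformly
   and independently; all function names are hypothetical.

```python
import hashlib
from math import perm


def ecmp_link(five_tuple: tuple, n_links: int) -> int:
    """Pick an outgoing link by hashing the flow's 5-tuple.

    Every packet of a flow maps to the same link, so an elephant
    flow cannot be spread across links by this scheme.
    """
    digest = hashlib.sha256(repr(five_tuple).encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_links


def collision_probability(n_flows: int, n_links: int) -> float:
    """Probability that at least two of n_flows share a link,
    assuming a uniform and independent hash."""
    if n_flows > n_links:
        return 1.0
    return 1.0 - perm(n_links, n_flows) / n_links ** n_flows


# Four hypothetical elephant flows (src, dst, protocol, sport, dport).
flows = [("10.0.0.1", "10.1.0.1", 6, 40000 + i, 443) for i in range(4)]
load = {}
for flow in flows:
    load.setdefault(ecmp_link(flow, 8), []).append(flow)

print(f"links in use: {len(load)} of 8")
print(f"P(collision), 4 flows on 8 links: {collision_probability(4, 8):.2f}")
```

   Even with only four elephant flows on eight equal-cost links, the
   chance that at least two of them land on the same link is about 59%
   under the uniform-hash assumption, while other links stay idle.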

4.2.  Challenging with Massive Flows Data with Large Burst

   Massive data transfers with large bursts may cause instantaneous
   congestion and packet loss within network device queues in WANs.
   Flows aggregate more at the edge of WANs, and bursts may accumulate
   as the flows traverse, join, and separate over hops.  Congestion
   control and bandwidth guarantees are challenging for such bursty
   traffic.

   Moreover, the applications may have multiple concurrent services that
   coexist with existing dynamic flows.  Given multiple services of
   various types with different traffic requirements, the traffic needs
   to be scheduled onto multiple paths and fine-grained network
   resources to achieve high utilization and QoS guarantees.  Traffic
   scheduling is challenging, especially when the Traffic Specification
   (T-SPEC) of the flows is unavailable.
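
   A traffic specification is commonly expressed as a token bucket with
   a sustained rate and a burst size.  The sketch below assumes that
   model to show how a large burst exceeds a declared profile; the
   class, its parameters, and the numbers are hypothetical and are not
   taken from this document.

```python
class TokenBucket:
    """Token bucket with fill rate rate_bps and depth burst_bits."""

    def __init__(self, rate_bps: float, burst_bits: float):
        self.rate_bps = rate_bps
        self.burst_bits = burst_bits
        self.tokens = burst_bits  # the bucket starts full
        self.last_time = 0.0

    def conforms(self, now: float, packet_bits: float) -> bool:
        """Return True and consume tokens if the packet fits the profile."""
        elapsed = now - self.last_time
        self.last_time = now
        refill = elapsed * self.rate_bps
        self.tokens = min(self.burst_bits, self.tokens + refill)
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True
        return False


# Hypothetical profile: 1 Gbps sustained rate, 1 Mbit burst allowance.
bucket = TokenBucket(rate_bps=1e9, burst_bits=1e6)

# A burst of 100 back-to-back 1500-byte packets arriving at t = 0.
in_profile = sum(bucket.conforms(0.0, 1500 * 8) for _ in range(100))
print(f"{in_profile} of 100 packets fit the declared profile")
```

   Only the first 83 packets of the burst fit this profile; the rest
   would have to be queued, marked, or dropped.  When no such
   specification is available for a flow, the network has no declared
   burst size against which to reserve or schedule resources.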

4.3.  Challenging with Long-distance Delay and Slow Feedback

   In HP-WAN scenarios, flow control is challenging because of the
   long-distance links and transmission delay.  Flow control refers to a
   method for ensuring that data is transmitted efficiently and
   reliably by controlling the rate of data transmission, preventing a
   fast sender from overwhelming a slow receiver and preventing packet
   loss under congestion.  Over long-distance links, reasonable
   thresholds need to be configured and buffers increased to achieve
   effective throughput without packet loss.
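
   The relationship between distance and the amount of buffering or
   window needed can be sketched with the bandwidth-delay product.  The
   sketch assumes a signal speed of about 2 x 10^8 m/s in optical fiber
   and ignores queuing and processing delay, so it gives a lower bound
   only; the function names are hypothetical.

```python
FIBER_SPEED_M_PER_S = 2.0e8  # assumption: roughly 2/3 of c in fiber


def propagation_rtt_s(distance_km: float) -> float:
    """Round-trip propagation delay over a fiber path of distance_km."""
    return 2 * distance_km * 1000 / FIBER_SPEED_M_PER_S


def bdp_bytes(rate_bps: float, rtt_s: float) -> float:
    """Bytes in flight needed to keep a path of rate_bps full."""
    return rate_bps * rtt_s / 8


for distance_km, rate_bps in [(1000, 10e9), (3000, 100e9)]:
    rtt = propagation_rtt_s(distance_km)
    window = bdp_bytes(rate_bps, rtt)
    print(f"{distance_km} km at {rate_bps / 1e9:.0f} Gbps: "
          f"RTT >= {rtt * 1e3:.0f} ms, window >= {window / 1e6:.1f} MB")
```

   Under these assumptions, a 1000 km path at 10 Gbps needs at least
   12.5 MB in flight to stay full, and a 3000 km path at 100 Gbps needs
   at least 375 MB.  Windows or buffers sized for short links fall well
   below these values.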

   Congestion control is also challenging in WANs.  Congestion control
   limits the total amount of data entering the network to keep the
   traffic at an acceptable level.  Long-distance transmission over
   thousands of kilometers results in extremely long link transmission
   delays, which delay network state feedback.  For example, as per
   [RFC3168], Explicit Congestion Notification (ECN) defines an end-to-
   end congestion notification mechanism based on the IP and transport
   layers.  When congestion occurs, a network device marks packets, the
   receiver echoes the congestion indication back to the sender, and the
   sender adjusts its transmission rate to achieve congestion control.
   The long distance delays this notification and slows the feedback,
   which results in untimely adjustment.

   Moreover, slow feedback may affect some congestion control
   algorithms.  For example, Bottleneck Bandwidth and Round-trip
   propagation time (BBR) is a congestion-based congestion control
   algorithm for TCP.  It actively measures the bottleneck bandwidth
   (BtlBw) and round-trip propagation time (RTprop), uses this model to
   calculate the bandwidth-delay product (BDP), and then adjusts the
   transmission rate to maximize throughput and minimize latency.
   However, BBR relies on real-time measurement of these parameters,
   which may vary greatly and be fed back slowly in long-distance
   networks, thereby reducing the control precision of BBR.  Likewise,
   Data Center Quantized Congestion Notification (DCQCN) and High
   Precision Congestion Control (HPCC++) do not tolerate a long feedback
   loop.  The stability and adaptability of congestion control
   algorithms may therefore be challenged in HP-WAN scenarios.
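
   The BBR path model described above can be sketched in a much
   simplified form: BtlBw is estimated as the maximum of recent delivery
   rate samples and RTprop as the minimum of recent RTT samples.  This
   is an illustrative sketch only.  The class name, the sample-count
   window, and the sample values are assumptions, and a real BBR
   implementation uses time-based filter windows, pacing gains, and
   state machines that are not modeled here.

```python
from collections import deque


class BbrPathModel:
    """Windowed max/min filters over per-ACK measurements."""

    def __init__(self, window: int = 10):
        self.rates_bps = deque(maxlen=window)
        self.rtts_s = deque(maxlen=window)

    def on_ack(self, delivery_rate_bps: float, rtt_s: float) -> None:
        """Record one delivery rate sample and one RTT sample."""
        self.rates_bps.append(delivery_rate_bps)
        self.rtts_s.append(rtt_s)

    @property
    def btlbw_bps(self) -> float:
        return max(self.rates_bps)

    @property
    def rtprop_s(self) -> float:
        return min(self.rtts_s)

    @property
    def bdp_bits(self) -> float:
        return self.btlbw_bps * self.rtprop_s


model = BbrPathModel()
# Hypothetical samples on a long-distance path (rate in bps, RTT in s).
for rate, rtt in [(9.2e9, 0.0305), (9.8e9, 0.0301), (8.9e9, 0.0340)]:
    model.on_ack(rate, rtt)

print(f"BtlBw = {model.btlbw_bps / 1e9:.1f} Gbps")
print(f"RTprop = {model.rtprop_s * 1e3:.1f} ms")
print(f"BDP = {model.bdp_bits / 8 / 1e6:.1f} MB")
```

   Each sample becomes available only after a round trip, so on a path
   with tens of milliseconds of RTT the model reacts to changes in
   available bandwidth far more slowly than on a short path.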

4.4.  Challenging with Packet Loss Impacting Transport Protocols

   Packet loss has a significant impact on the throughput of some
   transport protocols, especially in HP-WAN scenarios.  For example,
   RDMA is designed for high performance and low latency, which places
   strict requirements on the network: the network should provide ultra-
   low packet loss, otherwise performance degrades significantly.  This
   poses greater challenges to the underlying network hardware and also
   limits the size of RDMA networks.  RDMA relies on a Go-Back-N
   retransmission mechanism; throughput decreases dramatically at packet
   loss rates greater than 0.1%, and a 2% packet loss rate effectively
   reduces throughput to zero.
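
   The sensitivity of Go-Back-N to loss can be illustrated with a
   textbook model in which every lost packet forces the whole window to
   be resent.  This is a simplified sketch, not a model of any specific
   RDMA implementation, and the 10,000-packet window is a hypothetical
   value chosen to represent a large bandwidth-delay product; the
   figures quoted above are not derived from it.

```python
def gbn_efficiency(loss_rate: float, window_packets: int) -> float:
    """Fraction of link capacity carrying useful data under Go-Back-N.

    Assumes independent packet loss with probability loss_rate and
    that each loss wastes one full window of window_packets slots.
    """
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in [0, 1)")
    if window_packets < 1:
        raise ValueError("window_packets must be at least 1")
    return (1.0 - loss_rate) / (1.0 + (window_packets - 1) * loss_rate)


WINDOW = 10_000  # assumption: packets in flight on a long, fast path

for loss in (0.0, 1e-5, 1e-3, 2e-2):
    eff = gbn_efficiency(loss, WINDOW)
    print(f"loss {loss:.5f}: efficiency {eff:.3f}")
```

   In this model, efficiency falls to about 9% at a 0.1% loss rate and
   below 1% at a 2% loss rate, because the cost of each loss grows with
   the number of packets in flight, which is large on long-distance,
   high-bandwidth paths.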

   For TCP and QUIC, CUBIC is a traditional congestion control
   algorithm, as per [RFC9438].  It uses a cubic window increase
   function that is more aggressive than a linear one and is suitable
   for high-speed and long-distance networks.  When packet loss occurs,
   CUBIC reduces the congestion window by its multiplicative window
   decrease factor, which slows convergence.  CUBIC therefore requires
   low network packet loss.  As per Section 5.2 of [RFC9438], a packet
   loss rate of 2.9e-8 is required to achieve a throughput of 10 Gbps.
   Throughput decreases dramatically once the packet loss ratio exceeds
   a threshold value.
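
   The loss rate quoted from [RFC9438] can be reproduced approximately
   from the CUBIC average window formula in that RFC, AVG_W_cubic =
   (C * (3 + beta) / (4 * (1 - beta)))^(1/4) * RTT^(3/4) / p^(3/4),
   with C = 0.4 and beta = 0.7.  The sketch below assumes 1500-byte
   packets and an RTT of 0.1 seconds, which are understood to be the
   parameters behind the RFC's figure; the function name is
   hypothetical.

```python
C = 0.4      # CUBIC scaling constant (RFC 9438)
BETA = 0.7   # CUBIC multiplicative decrease factor (RFC 9438)


def cubic_required_loss(throughput_bps: float, rtt_s: float,
                        packet_bytes: int = 1500) -> float:
    """Loss rate at which CUBIC's average window sustains throughput_bps."""
    # Average window, in packets, needed for the target throughput.
    avg_window = throughput_bps * rtt_s / (8 * packet_bytes)
    coeff = (C * (3 + BETA) / (4 * (1 - BETA))) ** 0.25
    # avg_window = coeff * rtt^(3/4) / p^(3/4), solved for p.
    return (coeff * rtt_s ** 0.75 / avg_window) ** (4 / 3)


p = cubic_required_loss(10e9, 0.1)
print(f"required loss rate for 10 Gbps at 100 ms RTT: {p:.1e}")
```

   The result is close to the 2.9e-8 quoted above.  Because the
   required loss rate scales with the target throughput to the power of
   -4/3 at a fixed RTT, each tenfold increase in throughput requires a
   loss rate roughly 21 times lower.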

5.  Requirements

5.1.  Service Requirements

   The use cases and problems above are characterized by massive
   elephant flows with large bursts, multiple concurrent services
   coexisting with dynamic flows, and long distances between sites.
   This document outlines the following service requirements from
   users.

   *  Massive data transmission, e.g., a single flow carries TB to PB of
      data.

   *  Task-based data transmission with variable frequency, e.g.,
      periodic or temporary migration.

   *  Long-distance transmission between two or more sites or DCs,
      e.g., more than 1000 km.

   *  Instant transmission: the data needs to be transmitted immediately
      or at a specific time.

   *  Timely transmission: the transfer has a completion time but no
      real-time transmission requirements.

   *  Low cost

   *  Data security and integrity

   *  Compatibility and complementation with dedicated networks such as
      Research and Education Networks.  For example, switching with a
      fine-grained mapping between private networks and WANs is
      required to achieve optimal operating and consumption costs.

5.2.  Performance Requirements

   This document outlines the requirements for effective high-throughput
   data transmission in HP-WANs in terms of performance indicators such
   as ultra-low packet loss ratio, ultra-high bandwidth utilization,
   and low latency, as follows.

   *  Ultra-low Packet Loss Ratio: packet loss is a performance
      indicator that negatively correlates with throughput.  The lower
      the packet loss rate, the higher the throughput.  It is important
      to ensure an ultra-low packet loss ratio to achieve high-
      throughput data transmission in HP-WANs.

   *  Ultra-high Bandwidth Utilization: refers to the efficient use of
      available network capacity to maximize data transfer rates and
      minimize latency.  Bandwidth utilization needs to be improved to
      achieve high-throughput data transmission for multiple concurrent
      services in HP-WANs.

   *  Low Latency: RTT is another performance indicator that negatively
      correlates with throughput.  The lower the RTT, the higher the
      throughput.  Low delay over long distances needs to be guaranteed
      to achieve high-throughput data transmission in HP-WANs.
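
   The negative correlation of throughput with both packet loss and RTT
   can be illustrated with the well-known simplified steady-state model
   of a Reno-style (additive-increase, multiplicative-decrease) flow, in
   which the average window is sqrt(3/2) / sqrt(p) packets.  This is a
   sketch under that assumption only; other algorithms such as CUBIC or
   BBR follow different response functions, and the function name and
   the 1460-byte segment size are hypothetical choices.

```python
from math import sqrt


def reno_throughput_bps(loss_rate: float, rtt_s: float,
                        mss_bytes: int = 1460) -> float:
    """Approximate steady-state throughput of a Reno-style flow."""
    if loss_rate <= 0 or rtt_s <= 0:
        raise ValueError("loss_rate and rtt_s must be positive")
    window_packets = sqrt(3.0 / 2.0) / sqrt(loss_rate)
    return window_packets * mss_bytes * 8 / rtt_s


base = reno_throughput_bps(loss_rate=1e-6, rtt_s=0.03)
double_rtt = reno_throughput_bps(loss_rate=1e-6, rtt_s=0.06)
quad_loss = reno_throughput_bps(loss_rate=4e-6, rtt_s=0.03)

print(f"baseline:     {base / 1e6:.0f} Mbps")
print(f"2x RTT:       {double_rtt / 1e6:.0f} Mbps")
print(f"4x loss rate: {quad_loss / 1e6:.0f} Mbps")
```

   In this model, doubling the RTT or quadrupling the loss rate each
   halves the achievable throughput of a single flow, which is why
   both indicators appear as requirements.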

6.  Security Considerations

   This document covers a number of representative applications and
   network scenarios that are expected to make use of HP-WAN
   technologies.  The use cases themselves do not raise new security
   concerns or issues, but each may have security considerations from
   both the use-specific perspective and the technology-specific
   perspective.

7.  IANA Considerations

   This document makes no requests for IANA action.

8.  Acknowledgements

   The authors would like to acknowledge Zheng Zhang, Yao Liu and
   Guangping Huang for their thorough review and very helpful comments.

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, DOI 10.17487/RFC3168, September 2001,
              <https://www.rfc-editor.org/info/rfc3168>.

   [RFC7424]  Krishnan, R., Yong, L., Ghanwani, A., So, N., and B.
              Khasnabish, "Mechanisms for Optimizing Link Aggregation
              Group (LAG) and Equal-Cost Multipath (ECMP) Component Link
              Utilization in Networks", RFC 7424, DOI 10.17487/RFC7424,
              January 2015, <https://www.rfc-editor.org/info/rfc7424>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

   [RFC8664]  Sivabalan, S., Filsfils, C., Tantsura, J., Henderickx, W.,
              and J. Hardwick, "Path Computation Element Communication
              Protocol (PCEP) Extensions for Segment Routing", RFC 8664,
              DOI 10.17487/RFC8664, December 2019,
              <https://www.rfc-editor.org/info/rfc8664>.

   [RFC9232]  Song, H., Qin, F., Martinez-Julia, P., Ciavaglia, L., and
              A. Wang, "Network Telemetry Framework", RFC 9232,
              DOI 10.17487/RFC9232, May 2022,
              <https://www.rfc-editor.org/info/rfc9232>.

   [RFC9438]  Xu, L., Ha, S., Rhee, I., Goel, V., and L. Eggert, Ed.,
              "CUBIC for Fast and Long-Distance Networks", RFC 9438,
              DOI 10.17487/RFC9438, August 2023,
              <https://www.rfc-editor.org/info/rfc9438>.

Authors' Addresses

   Quan Xiong
   ZTE Corporation
   China
   Email: xiong.quan@zte.com.cn


   Kehan Yao
   China Mobile
   China
   Email: yaokehan@chinamobile.com


   Cancan Huang
   China Telecom
   China
   Email: huangcanc@chinatelecom.cn


   Zhengxin Han
   China Unicom
   China
   Email: hanzx21@chinaunicom.cn