1 |
ZHANG Z, CHANG C K, LIN H B, et al. Is network the bottleneck of distributed training?[EB]. arXiv preprint arXiv, 2020: 2006.10103.
|
2 |
LUO L, WEST P, KRISHNAMURTHY A, et al. PLink: discovering and exploiting datacenter network locality for efficient cloud-based distributed training[C]//Proceedings of 2020 MLSys. 2020.
|
3 |
ROTHENBERGER B, TARANOV K, PERRIG A, et al. ReDMArk: bypassing RDMA security mechanisms[C]//30th USENIX Security Symposium (USENIX Security 21). 2021: 4277-4292.
|
4 |
HOEFLER T, ROWETH D, UNDERWOOD K, et al. Datacenter Ethernet and RDMA: issues at hyperscale[EB]. arXiv preprint arXiv, 2023: 2302.03337.
|
5 |
HOPPS C. Analysis of an equal-cost multi-path algorithm[R]. Technical Report, 2000.
|
6 |
QURESHI M A, CHENG Y C, YIN Q W, et al. PLB: congestion signals are simple and effective for network load balancing[C]//Proceedings of the ACM SIGCOMM 2022 Conference. New York: ACM Press, 2022: 207-218.
|
7 |
SONG C H, KHOOI X Z, JOSHI R, et al. Network load balancing with in-network reordering support for RDMA[C]//Proceedings of the ACM SIGCOMM 2023 Conference. 2023: 816-831.
|
8 |
DIXIT A, PRAKASH A, HU Y C, et al. On the impact of packet spraying in data center networks[C]//Proceedings of IEEE INFOCOM 2013. Piscataway: IEEE Press, 2013: 2130-2138.
|
9 |
SCHARF M, KIESEL S. NXG03-5: Head-of-line blocking in TCP and SCTP: analysis and measurements[C]//Proceedings of the 49th IEEE Global Telecommunications Conference (GLOBECOM 2006). Piscataway: IEEE Press, 2006: 1-5.
|
10 |
XUE J C, CHAUDHRY M U, VAMANAN B, et al. Dart: divide and specialize for fast response to congestion in RDMA-based datacenter networks[J]. IEEE/ACM Transactions on Networking, 2020, 28(1): 322-335.
|
11 |
ZHU Y B, ERAN H, FIRESTONE D, et al. Congestion control for large-scale RDMA deployments[J]. ACM SIGCOMM Computer Communication Review, 2015, 45(4): 523-536.
|
12 |
HU S H, ZHU Y B, CHENG P, et al. Deadlocks in datacenter networks: why do they form, and how to avoid them[C]//Proceedings of the 15th ACM Workshop on Hot Topics in Networks. New York: ACM Press, 2016: 92-98.
|
13 |
MITTAL R, LAM V T, DUKKIPATI N, et al. TIMELY: RTT-based congestion control for the datacenter[C]//Proceedings of SIGCOMM 2015. 2015: 537-550.
|
14 |
LI Y, MIAO R, LIU H H, et al. HPCC: high precision congestion control[C]//Proceedings of the ACM Special Interest Group on Data Communication. New York: ACM Press, 2019: 44-58.
|
15 |
PINKERTON J, DELEGANES E. RFC 5042: direct data placement protocol (DDP)/remote direct memory access protocol (RDMAP) security[R]. IETF, 2007.
|
16 |
Google. Google white paper: PSP architecture specification[R]. 2022.
|
17 |
IEEE 802.1AE-2018: media access control (MAC) security[S]. 2018.
|
18 |
HOPPS C. RFC 9347: aggregation and fragmentation mode for encapsulating security payload (ESP) and its use for IP traffic flow security (IP-TFS)[R]. IETF, 2023.
|