Publications

My statement

Many systems and networking conferences have a special operational systems track for papers that describe the design, implementation, and analysis of, and experience with, large-scale operational systems and networks. Unlike regular research papers, such operational papers need not present new ideas or results to be accepted; indeed, new ideas or results do not influence whether the papers are accepted. Because the review criteria are so different, I explicitly mark my operational systems track papers on my website and in my CV. I promise that the systems and networks described in these papers really serve (or have served) customer workloads in production environments.

Note

"*" marks co-primary authors.

Conference Papers

  1. Hardware-assisted RDMA Performance Isolation for Public Clouds (to appear)
    Jiaqi Lou*, Xinhao Kong*, Jinghan Huang, Wei Bai, Nam Sung Kim, Danyang Zhuo
    USENIX NSDI 2024

  2. Reverie: Low Pass Filter-Based Switch Buffer Sharing for Datacenters with RDMA and TCP Traffic (to appear)
    Vamsi Addanki, Wei Bai, Stefan Schmid, Maria Apostolaki
    USENIX NSDI 2024

  3. Towards Domain-Specific Network Transport for Distributed DNN Training (to appear)
    Hao Wang, Han Tian, Jingrong Chen, Xinchen Wan, Jiacheng Xia, Gaoxiong Zeng, Wei Bai, Junchen Jiang, Yong Wang, Kai Chen
    USENIX NSDI 2024

  4. Understanding the Micro-Behaviors of Hardware Offloaded Network Stacks with Lumina [pdf]
    Zhuolong Yu, Bowen Su, Wei Bai, Shachar Raindel, Vladimir Braverman, Xin Jin
    ACM SIGCOMM 2023

  5. FlexPass: A Case for Flexible Credit-based Transport for Datacenter Networks [pdf]
    Hwijoon Lim, Jaehong Kim, Inho Cho, Keon Jang, Wei Bai, Dongsu Han
    ACM EuroSys 2023

  6. Empowering Azure Storage with RDMA [pdf]
    Wei Bai, Shanim Sainul Abdeen, Ankit Agrawal, Krishan Kumar Attre, Paramvir Bahl, Ameya Bhagat, Gowri Bhaskara, Tanya Brokhman, Lei Cao, Ahmad Cheema, Rebecca Chow, Jeff Cohen, Mahmoud Elhaddad, Vivek Ette, Igal Figlin, Daniel Firestone, Mathew George, Ilya German, Lakhmeet Ghai, Eric Green, Albert Greenberg, Manish Gupta, Randy Haagens, Matthew Hendel, Ridwan Howlader, Neetha John, Julia Johnstone, Tom Jolly, Greg Kramer, David Kruse, Ankit Kumar, Erica Lan, Ivan Lee, Avi Levy, Marina Lipshteyn, Xin Liu, Chen Liu, Guohan Lu, Yuemin Lu, Xiakun Lu, Vadim Makhervaks, Ulad Malashanka, David A. Maltz, Ilias Marinos, Rohan Mehta, Sharda Murthi, Anup Namdhari, Aaron Ogus, Jitendra Padhye, Madhav Pandya, Douglas Phillips, Adrian Power, Suraj Puri, Shachar Raindel, Jordan Rhee, Anthony Russo, Maneesh Sah, Ali Sheriff, Chris Sparacino, Ashutosh Srivastava, Weixiang Sun, Nick Swanson, Fuhou Tian, Lukasz Tomczyk, Vamsi Vadlamuri, Alec Wolman, Ying Xie, Joyce Yom, Lihua Yuan, Yanzhao Zhang, Brian Zill
    USENIX NSDI 2023 (Operational Systems Track)

  7. Understanding RDMA Microarchitecture Resources for Performance Isolation [pdf]
    Xinhao Kong, Jingrong Chen, Wei Bai, Yechen Xu, Mahmoud Elhaddad, Shachar Raindel, Jitendra Padhye, Alvin R. Lebeck, Danyang Zhuo
    USENIX NSDI 2023

  8. 1Pipe: Scalable Total Order Communication in Data Center Networks [pdf]
    Bojie Li, Gefei Zuo, Wei Bai, Lintao Zhang
    ACM SIGCOMM 2021

  9. Towards Timeout-less Transport in Commodity Datacenter Networks [pdf]
    Hwijoon Lim, Wei Bai, Yibo Zhu, Youngmok Jung, Dongsu Han
    ACM EuroSys 2021

  10. Aeolus: A Building Block for Proactive Transport in Datacenters [pdf]
    Shuihai Hu, Wei Bai, Gaoxiong Zeng, Zilong Wang, Baochen Qiao, Kai Chen, Kun Tan, Yi Wang
    ACM SIGCOMM 2020

  11. OmniMon: Re-architecting Network Telemetry with Resource Efficiency and Full Accuracy [pdf]
    Qun Huang, Haifeng Sun, Patrick P. C. Lee, Wei Bai, Feng Zhu, Yungang Bao
    ACM SIGCOMM 2020

  12. One More Config is Enough: Saving (DC)TCP for High-speed Extremely Shallow-buffered Datacenters [pdf]
    Wei Bai, Shuihai Hu, Kai Chen, Kun Tan, Yongqiang Xiong
    IEEE INFOCOM 2020

  13. Enabling ECN for Datacenter Networks with RTT Variations [pdf]
    Junxue Zhang, Wei Bai, Kai Chen
    ACM CoNEXT 2019

  14. Congestion Control for Cross-Datacenter Networks [pdf]
    Gaoxiong Zeng, Wei Bai, Ge Chen, Kai Chen, Dongsu Han, Yibo Zhu, Lei Cui
    IEEE ICNP 2019

  15. FlowShader: a Generalized Framework for GPU-accelerated VNF Flow Processing [pdf]
    Xiaodong Yi, Junjie Wang, Jingpu Duan, Wei Bai, Chuan Wu, Yongqiang Xiong, Dongsu Han
    IEEE ICNP 2019

  16. DLBooster: Boosting End-to-End Deep Learning Workflows with Offloading Data Preprocessing Pipelines [pdf]
    Yang Cheng, Dan Li, Zhiyuan Guo, Binyao Jiang, Jiaxin Lin, Xi Fan, Jinkun Geng, Xinyi Yu, Wei Bai, Lei Qu, Ran Shu, Peng Cheng, Yongqiang Xiong, Jianping Wu
    ICPP 2019

  17. SocksDirect: Datacenter Sockets can be Fast and Compatible [pdf]
    Bojie Li, Tianyi Cui, Zibo Wang, Wei Bai, Lintao Zhang
    ACM SIGCOMM 2019

  18. Accelerating Rule-matching Systems with Learned Rankers [pdf]
    Zhao Lucis Li, Mike Chieh-Jan Liang, Wei Bai, Qiming Zheng, Yongqiang Xiong, Guangzhong Sun
    USENIX ATC 2019

  19. Resilient Datacenter Load Balancing in the Wild [pdf]
    Hong Zhang, Junxue Zhang, Wei Bai, Kai Chen, Mosharaf Chowdhury
    ACM SIGCOMM 2017

  20. Rate-Aware Flow Scheduling for Commodity Data Center Networks [pdf]
    Ziyang Li, Wei Bai, Kai Chen, Dongsu Han, Yiming Zhang, Dongsheng Li, Hongfang Yu
    IEEE INFOCOM 2017

  21. Enabling ECN over Generic Packet Scheduling [pdf]
    Wei Bai, Kai Chen, Li Chen, Changhoon Kim, Haitao Wu
    ACM CoNEXT 2016

  22. Scheduling Mix-flows in Commodity Datacenters with Karuna [pdf]
    Li Chen, Kai Chen, Wei Bai, Mohammad Alizadeh
    ACM SIGCOMM 2016

  23. Enabling ECN in Multi-Service Multi-Queue Data Centers [pdf]
    Wei Bai, Li Chen, Kai Chen, Haitao Wu
    USENIX NSDI 2016

  24. Providing Bandwidth Guarantees, Work Conservation and Low Latency Simultaneously in the Cloud [pdf]
    Shuihai Hu, Wei Bai, Kai Chen, Chen Tian, Ying Zhang, Haitao Wu
    IEEE INFOCOM 2016

  25. Guaranteeing Deadlines for Inter-Datacenter Transfers [pdf]
    Hong Zhang, Kai Chen, Wei Bai, Dongsu Han, Chen Tian, Hao Wang, Haibing Guan, Ming Zhang
    ACM EuroSys 2015

  26. Information-Agnostic Flow Scheduling for Commodity Data Centers [pdf]
    Wei Bai, Li Chen, Kai Chen, Dongsu Han, Chen Tian, Hao Wang
    USENIX NSDI 2015

  27. Explicit Path Control in Commodity Data Centers: Design and Applications [pdf]
    Shuihai Hu, Kai Chen, Haitao Wu, Wei Bai, Chang Lan, Hao Wang, Hongze Zhao, Chuanxiong Guo
    USENIX NSDI 2015

  28. RAPIER: Integrating Routing and Scheduling for Coflow-aware Data Center Networks [pdf]
    Yangming Zhao, Kai Chen, Wei Bai, Minlan Yu, Chen Tian, Yanhui Geng, Yiming Zhang, Dan Li, Sheng Wang
    IEEE INFOCOM 2015

  29. PAC: Taming TCP Incast Congestion Using Proactive ACK Control [pdf]
    Wei Bai, Kai Chen, Haitao Wu, Wuwei Lan, Yangming Zhao
    IEEE ICNP 2014

  30. HadoopWatch: A First Step Towards Comprehensive Traffic Forecasting in Cloud Computing [pdf]
    Yang Peng, Kai Chen, Guohui Wang, Wei Bai, Zhiqiang Ma, Lin Gu
    IEEE INFOCOM 2014

Workshop Papers

  1. Towards a Manageable Intra-Host Network [pdf]
    Xinhao Kong, Jiaqi Lou, Wei Bai, Nam Sung Kim, Danyang Zhuo
    HotOS 2023

  2. Rethinking Transport Layer Design for Distributed Machine Learning [pdf]
    Jiacheng Xia, Gaoxiong Zeng, Junxue Zhang, Weiyan Wang, Wei Bai, Junchen Jiang, Kai Chen
    APNet 2019

  3. Augmenting Proactive Congestion Control with Aeolus [pdf]
    Shuihai Hu, Wei Bai, Baochen Qiao, Kai Chen, Kun Tan
    APNet 2018

  4. Combining ECN and RTT for Datacenter Transport [pdf]
    Gaoxiong Zeng, Wei Bai, Ge Chen, Kai Chen, Dongsu Han, Yibo Zhu
    APNet 2017

  5. Congestion Control for High-speed Extremely Shallow-buffered Datacenter Networks [pdf]
    Wei Bai, Kai Chen, Shuihai Hu, Kun Tan, Yongqiang Xiong
    APNet 2017

  6. PIAS: Practical Information-Agnostic Flow Scheduling for Data Center Networks [pdf]
    Wei Bai, Li Chen, Kai Chen, Dongsu Han, Chen Tian, Weicheng Sun
    ACM HotNets 2014

Journal Papers

  1. Congestion Control for Cross-Datacenter Networks [pdf]
    Gaoxiong Zeng, Wei Bai, Ge Chen, Kai Chen, Dongsu Han, Yibo Zhu, Lei Cui
    IEEE/ACM Transactions on Networking, 2022

  2. Aeolus: A Building Block for Proactive Transport in Datacenter Networks [pdf]
    Shuihai Hu, Gaoxiong Zeng, Wei Bai, Zilong Wang, Baochen Qiao, Kai Chen, Kun Tan, Yi Wang
    IEEE/ACM Transactions on Networking, 2021

  3. Accelerating End-to-End Deep Learning Workflow With Codesign of Data Preprocessing and Scheduling [pdf]
    Yang Cheng, Dan Li, Zhiyuan Guo, Binyao Jiang, Jinkun Geng, Wei Bai, Jianping Wu, Yongqiang Xiong
    IEEE Transactions on Parallel and Distributed Systems, 2020

  4. One More Config is Enough: Saving (DC)TCP for High-speed Extremely Shallow-buffered Datacenters [pdf]
    Wei Bai, Shuihai Hu, Kai Chen, Kun Tan, Yongqiang Xiong
    IEEE/ACM Transactions on Networking, 2020

  5. Providing Bandwidth Guarantees, Work Conservation and Low Latency Simultaneously in the Cloud [pdf]
    Shuihai Hu, Wei Bai, Kai Chen, Chen Tian, Ying Zhang, Haitao Wu
    IEEE Transactions on Cloud Computing, 2018

  6. RepNet: Cutting Latency with Flow Replication in Data Center Networks [pdf]
    Shuhao Liu, Hong Xu, Libin Liu, Wei Bai, Kai Chen, Zhiping Cai
    IEEE Transactions on Services Computing, 2018

  7. PIAS: Practical Information-Agnostic Flow Scheduling for Commodity Data Centers [pdf]
    Wei Bai, Li Chen, Kai Chen, Dongsu Han, Chen Tian, Hao Wang
    IEEE/ACM Transactions on Networking, 2017

  8. Guaranteeing Deadlines for Inter-Datacenter Transfers [pdf]
    Hong Zhang, Kai Chen, Wei Bai, Dongsu Han, Chen Tian, Hao Wang, Haibing Guan, Ming Zhang
    IEEE/ACM Transactions on Networking, 2017

  9. Explicit Path Control in Commodity Data Centers: Design and Applications [pdf]
    Shuihai Hu, Kai Chen, Haitao Wu, Wei Bai, Chang Lan, Hao Wang, Hongze Zhao, Chuanxiong Guo
    IEEE/ACM Transactions on Networking, 2016

  10. Towards Comprehensive Traffic Forecasting in Cloud Computing: Design and Application [pdf]
    Yang Peng, Kai Chen, Guohui Wang, Wei Bai, Yangming Zhao, Hao Wang, Yanhui Geng, Zhiqiang Ma, Lin Gu
    IEEE/ACM Transactions on Networking, 2016