Publications

Publications in reverse chronological order. See citations on Google Scholar or Semantic Scholar.

2024

  1. arXiv
    CaraServe: CPU-Assisted and Rank-Aware LoRA Serving for Generative LLM Inference
    Suyi Li, Hanfeng Lu, Tianyuan Wu, Minchen Yu, Qizhen Weng, Xusheng Chen, Yizhou Shan, Binhang Yuan, and Wei Wang
    arXiv preprint arXiv:2401.11240, 2024

2023

  1. ATC
    Beware of Fragmentation: Scheduling GPU-Sharing Workloads with Fragmentation Gradient Descent
    Qizhen Weng, Lingyun Yang, Yinghao Yu, Wei Wang, Xiaochuan Tang, Guodong Yang, and Liping Zhang
    In 2023 USENIX Annual Technical Conference (ATC), 2023

2022

  1. NSDI
    MLaaS in the Wild: Workload Analysis and Scheduling in Large-Scale Heterogeneous GPU Clusters
    Qizhen Weng, Wencong Xiao, Yinghao Yu, Wei Wang, Cheng Wang, Jian He, Yong Li, Liping Zhang, Wei Lin, and Yu Ding
    In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2022
  2. SoCC
    Workload Consolidation in Alibaba Clusters: The Good, the Bad, and the Ugly
    Yongkang Zhang, Yinghao Yu, Wei Wang, Qiukai Chen, Jie Wu, Zuowei Zhang, Jiang Zhong, Tianchen Ding, Qizhen Weng, Lingyun Yang, and 4 more authors
    In 13th ACM Symposium on Cloud Computing (SoCC), 2022

2021

  1. TCC
    Accelerating Distributed Learning in Non-Dedicated Environments
    Chen Chen, Qizhen Weng, Wei Wang, Baochun Li, and Bo Li
    IEEE Transactions on Cloud Computing (TCC), 2021

2020

  1. SC
    Metis: Learning to Schedule Long-Running Applications in Shared Container Clusters at Scale
    Luping Wang, Qizhen Weng, Wei Wang, Chen Chen, and Bo Li
    In International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2020
  2. SoCC
    Semi-Dynamic Load Balancing: Efficient Distributed Learning in Non-Dedicated Environments
    Chen Chen, Qizhen Weng, Wei Wang, Baochun Li, and Bo Li
    In 11th ACM Symposium on Cloud Computing (SoCC), 2020

2019

  1. APSys
    Towards Framework-Independent, Non-Intrusive Performance Characterization for Dataflow Computation
    Huangshi Tian, Qizhen Weng, and Wei Wang
    In Proceedings of the 10th ACM SIGOPS Asia-Pacific Workshop on Systems (APSys), 2019

2018

  1. SoCC
    Fast Distributed Deep Learning via Worker-Adaptive Batch Sizing
    Chen Chen, Qizhen Weng, Wei Wang, Baochun Li, and Bo Li
    In 9th ACM Symposium on Cloud Computing (SoCC), 2018
  2. ICDCS
    Opus: Fair and Efficient Cache Sharing for In-Memory Data Analytics
    Yinghao Yu, Wei Wang, Jun Zhang, Qizhen Weng, and Khaled Ben Letaief
    In 38th IEEE International Conference on Distributed Computing Systems (ICDCS), 2018