Publications

Publications in reverse chronological order. See citations on Google Scholar or Semantic Scholar.

2026

  1. EuroSys
    Suika: Efficient and High-quality Re-scheduling of 3D-parallelized LLM Training Jobs in Shared Clusters
    Yuxuan Wang, Yanbo Wang, Chen Chen, Chunyu Xue, Qizhen Weng, Yin Chen, Zeren Li, Xuqi Zhu, Yongqiang Yang, Quan Chen, and 1 more author
    In 21st ACM European Conference on Computer Systems (EuroSys), Apr 2026
  2. EuroSys
    Efficient Data Passing for Serverless Inference Workflows: A GPU-Centric Approach
    Hao Wu, Yaochen Liu, Minchen Yu, Qizhen Weng, Junxiao Deng, Yue Yu, Hao Fan, Song Wu, Wei Wang, and Hai Jin
    In 21st ACM European Conference on Computer Systems (EuroSys), Apr 2026

2025

  1. arXiv
    TeleWorld: Towards Dynamic Multimodal Synthesis with a 4D World Model
    Yabo Chen, Yuanzhi Liang, Jiepeng Wang, Tingxi Chen, Junfei Cheng, Zixiao Gu, Yuyang Huang, Zicheng Jiang, Wei Li, Tian Li, and 17 more authors
    arXiv preprint arXiv:2601.00051, Dec 2025
    Ranked No. 1 on the WorldScore Leaderboard in December 2025
  2. arXiv
    Computation-Bandwidth-Memory Trade-offs: A Unified Paradigm for AI Infrastructure
    Yuankai Fan, Qizhen Weng, and Xuelong Li
    arXiv preprint arXiv:2601.11577, Dec 2025
  3. arXiv
    Janus: Disaggregating Attention and Experts for Scalable MoE Inference
    Zhexiang Zhang, Ye Wang, Xiangyu Wang, Yumiao Zhao, Jingzhe Jiang, Qizhen Weng, Shaohuai Shi, Yin Chen, and Minchen Yu
    arXiv preprint arXiv:2512.13525, Dec 2025
  4. Vicinagearth
    Rethinking Data in NL2SQL: A Survey of What We Have and What We Expect
    Yuankai Fan, Qizhen Weng, Yin Chen, and X. Sean Wang
    Vicinagearth, Nov 2025
  5. ATC
    Toppings: CPU-Assisted, Rank-Aware Adapter Serving for LLM Inference
    Suyi Li, Hanfeng Lu, Tianyuan Wu, Minchen Yu, Qizhen Weng, Xusheng Chen, Yizhou Shan, Binhang Yuan, and Wei Wang
    In 2025 USENIX Annual Technical Conference (ATC), Jul 2025
  6. AI4Good
    AI for green multi-cluster: Intelligent management towards green and low-carbon, large-scale multi-clusters
    Qizhen Weng and Yuankai Fan
    In AI for Good Innovate for Impact Report, Jul 2025
  7. arXiv
    Efficient Unified Caching for Accelerating Heterogeneous AI Workloads
    Tianze Wang, Yifei Liu, Chen Chen, Pengfei Zuo, Jiawei Zhang, Qizhen Weng, Yin Chen, Zhenhua Han, Jieru Zhao, Quan Chen, and 1 more author
    arXiv preprint arXiv:2506.12370, Jun 2025
  8. NSDI
    GPU-Disaggregated Serving for Deep Learning Recommendation Models at Scale
    Lingyun Yang, Yongchen Wang, Yinghao Yu, Qizhen Weng, Jianbo Dong, Kan Liu, Chi Zhang, Yanyi Zi, Hao Li, Zechao Zhang, and 12 more authors
    In 22nd USENIX Symposium on Networked Systems Design and Implementation (NSDI), Apr 2025

2024

  1. arXiv
    Efficient Training of Large Language Models on Distributed Infrastructures: A Survey
    Jiangfei Duan, Shuo Zhang, Zerui Wang, Lijuan Jiang, Wenwen Qu, Qinghao Hu, Guoteng Wang, Qizhen Weng, Hang Yan, Xingcheng Zhang, and 6 more authors
    arXiv preprint arXiv:2407.20018, Jul 2024
  2. arXiv
    InternLM2 Technical Report
    Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, and 90 more authors
    arXiv preprint arXiv:2403.17297, Jul 2024

2023

  1. ATC
    Beware of Fragmentation: Scheduling GPU-Sharing Workloads with Fragmentation Gradient Descent
    Qizhen Weng, Lingyun Yang, Yinghao Yu, Wei Wang, Xiaochuan Tang, Guodong Yang, and Liping Zhang
    In 2023 USENIX Annual Technical Conference (ATC), Jul 2023

2022

  1. NSDI
    MLaaS in the Wild: Workload Analysis and Scheduling in Large-Scale Heterogeneous GPU Clusters
    Qizhen Weng, Wencong Xiao, Yinghao Yu, Wei Wang, Cheng Wang, Jian He, Yong Li, Liping Zhang, Wei Lin, and Yu Ding
    In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI), Jul 2022
  2. SoCC
    Workload Consolidation in Alibaba Clusters: the Good, the Bad, and the Ugly
    Yongkang Zhang, Yinghao Yu, Wei Wang, Qiukai Chen, Jie Wu, Zuowei Zhang, Jiang Zhong, Tianchen Ding, Qizhen Weng, Lingyun Yang, and 4 more authors
    In 13th ACM Symposium on Cloud Computing (SoCC), Jul 2022

2021

  1. TCC
    Accelerating Distributed Learning in Non-Dedicated Environments
    Chen Chen, Qizhen Weng, Wei Wang, Baochun Li, and Bo Li
    IEEE Transactions on Cloud Computing (TCC), Jul 2021

2020

  1. SC
    Metis: Learning to Schedule Long-Running Applications in Shared Container Clusters at Scale
    Luping Wang, Qizhen Weng, Wei Wang, Chen Chen, and Bo Li
    In International Conference for High Performance Computing, Networking, Storage and Analysis (SC), Jul 2020
  2. SoCC
    Semi-Dynamic Load Balancing: Efficient Distributed Learning in Non-Dedicated Environments
    Chen Chen, Qizhen Weng, Wei Wang, Baochun Li, and Bo Li
    In 11th ACM Symposium on Cloud Computing (SoCC), Jul 2020

2019

  1. APSys
    Towards Framework-Independent, Non-Intrusive Performance Characterization for Dataflow Computation
    Huangshi Tian, Qizhen Weng, and Wei Wang
    In Proceedings of the 10th ACM SIGOPS Asia-Pacific Workshop on Systems (APSys), Jul 2019

2018

  1. SoCC
    Fast Distributed Deep Learning via Worker-Adaptive Batch Sizing
    Chen Chen, Qizhen Weng, Wei Wang, Baochun Li, and Bo Li
    In 9th ACM Symposium on Cloud Computing (SoCC), Jul 2018
  2. ICDCS
    Opus: Fair and Efficient Cache Sharing for In-Memory Data Analytics
    Yinghao Yu, Wei Wang, Jun Zhang, Qizhen Weng, and Khaled Ben Letaief
    In 38th IEEE International Conference on Distributed Computing Systems (ICDCS), Jul 2018