Qizhen Weng 翁祈桢
AI Infra Team Lead. Research Scientist. Ph.D. in CSE from HKUST.

My research interests encompass AI Infrastructure, Machine Learning Systems, and Cloud Computing, with a particular emphasis on enhancing GPU cluster efficiency and optimizing training performance for large-scale generative models, such as large language models (LLMs), multimodal LLMs (MLLMs), and diffusion transformers (DiTs).
(1) Since 2024, I have been leading the AI Infrastructure Research Center at the Institute of Artificial Intelligence (TeleAI), China Telecom, where I oversee initiatives to advance AI system capabilities. (2) Prior to this, I joined the Shanghai AI Laboratory in 2022 as a Systems Researcher, contributing to the systems for large language model training and inference. (3) Earlier, I gained valuable experience as a Research Intern at Alibaba Cloud & Alibaba Group, where I focused on GPU cluster management and AI job scheduling for over two years, beginning in 2020.
I received my Ph.D. in Computer Science and Engineering from The Hong Kong University of Science and Technology in 2022, under the guidance of Prof. Wei Wang. I also hold a B.Eng. degree from Shanghai Jiao Tong University in 2017 and enriched my academic journey with a study period at UC Berkeley in 2015.
Awards
- Young Elite Scientists Sponsorship Program, CAST, 2025: for AI development tools and infrastructure
- Hong Kong PhD Fellowship Scheme, RGC of HK, 2017: awarded to 231 top students worldwide
- Shanghai Outstanding Graduates, SH Gov., 2017: awarded to top 3% students in the college
- Cyber-Security Scholarship, CIDF, 2016: awarded to 1% students in the major
News & Highlights
Jun 15, 2025 | ♻️Invited Keynote Speaker at AI for Good Global Submmit: I will be delivering a Keynote speech on AI Solutions in China Telecom at the AI for Good Global Summit 8-11 July in Geneva, hosted by the ITU of the United Nations. Join us as we discuss how AI can shape a sustainable future! |
---|---|
Apr 1, 2025 | 📜USENIX ATC 2025: Paper “Toppings: CPU-Assisted, Rank-Aware Adapter Serving for LLM Inference” accepeted to USENIX ATC 2025. |
Feb 22, 2025 | 💡Openings: I’m currently recruiting highly motivated students who can intern in Shanghai for 3+ months. If you’re excited about advancing AI through LLM/MLLM/DiT, please drop me an email with your CV. Experience with deep learning frameworks, distributed systems, or CUDA programming is a plus but not required. |