Projects
Contributing
- 🔥 SGLang: PR#18213: “[Bugfix] Fix model output corruption caused by EPLB rebalance (Eager and CUDA Graph modes)”
- 💃 VeRL: PR#2629: “[rollout, trainer] feat: Enabling Request Skewness Scheduler towards near-equal generated token in rollout”
Maintaining
- 🎨 TeleTron: scalable long-context multi-modal Transformer training framework.
- 🔀 Kubernetes Scheduler Simulator: evaluates scheduling policies in GPU-sharing clusters.
- 📊 Alibaba Cluster Trace Program: provides AI workload traces from real production clusters, with accompanying analysis.