Publications

Google Scholar

Work in progress

  • 1-Bit FQT: Pushing the Limit of Fully Quantized Training to 1-bit
    Chang Gao, Jianfei Chen#, Kang Zhao, Jiaqi Wang, Liping Jing#
    (arXiv)

  • Identifying Sensitive Weights via Post-quantization Integral
    Yuezhou Hu, Weiyu Huang, Zichen Liang, Chang Chen, Jintao Zhang, Jun Zhu, Jianfei Chen#
    (arXiv)

  • Accurate INT8 Training Through Dynamic Block-Level Fallback
    Pengle Zhang, Jia Wei, Jintao Zhang, Jun Zhu, Jianfei Chen#
    (arXiv)

  • SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-bit Training
    Jintao Zhang∗, Jia Wei∗, Pengle Zhang, Xiaoming Xu, Haofeng Huang, Haoxu Wang, Kai Jiang, Jun Zhu, Jianfei Chen#
    (arXiv)

  • Sparse VideoGen2: Accelerating Video Diffusion Transformers with Sparse Attention via Semantic-Aware Permutation
    Shuo Yang, Haocheng Xi, Yilong Zhao, Muyang Li, Jintao Zhang, Han Cai, Yujun Lin, Xiuyu Li, Chenfeng Xu, Kelly Peng, Jianfei Chen, Song Han, Kurt Keutzer, Ion Stoica
    (arXiv)

  • LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models
    Fengqi Zhu, Rongzhen Wang, Shen Nie, Xiaolu Zhang, Chunwei Wu, Jun Hu, Jun Zhou, Jianfei Chen, Yankai Lin, Ji-Rong Wen, Chongxuan Li
    (arXiv)

2025

  • Task-Specific Zero-shot Quantization-Aware Training for Object Detection
    ChangHao Li, Xinrui Chen, Ji Wang, Kang Zhao, Jianfei Chen#
    To appear in International Conference on Computer Vision, 2025

  • Maximum Redundancy Pruning: A Principle-Driven Layerwise Sparsity Allocation for LLMs
    Chang Gao, Kang Zhao, Runqi Wang, Jianfei Chen, Liping Jing
    ACM Multimedia (ACMMM), 2025 (arXiv)

  • SparseDM: Toward Sparse Efficient Diffusion Models
    Kafeng Wang, Jianfei Chen#, He Li, Zhenpeng Mi, Jun Zhu#
    International Conference on Multimedia & Expo, 2025 (arXiv, GitHub)

  • Sparse Video-Gen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity
    Haocheng Xi, Shuo Yang, Yilong Zhao, Chenfeng Xu, Muyang Li, Xiuyu Li, Yujun Lin, Han Cai, Jintao Zhang, Dacheng Li, Jianfei Chen, Ion Stoica, Kurt Keutzer, Song Han
    International Conference on Machine Learning, 2025 (pdf, GitHub)

  • Oscillation-Reduced MXFP4 Training for Vision Transformers
    Yuxiang Chen, Haocheng Xi, Jun Zhu, Jianfei Chen#
    International Conference on Machine Learning, 2025 (arXiv, pdf, GitHub)

  • Visual Generation Without Guidance
    Huayu Chen, Kai Jiang, Kaiwen Zheng, Jianfei Chen, Hang Su, Jun Zhu
    International Conference on Machine Learning, 2025 (pdf, GitHub)

  • FrameBridge: Improving Image-to-Video Generation with Bridge Models
    Yuji Wang, Zehua Chen, Chen Xiaoyu, Yixiang Wei, Jun Zhu, Jianfei Chen#
    International Conference on Machine Learning, 2025 (pdf, GitHub)

  • SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization
    Jintao Zhang, Haofeng Huang, Pengle Zhang, Jia Wei, Jun Zhu, Jianfei Chen#
    International Conference on Machine Learning, 2025 (arXiv, pdf, GitHub)

  • SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference
    Jintao Zhang, Chendong Xiang, Haofeng Huang, Jia Wei, Haocheng Xi, Jun Zhu, Jianfei Chen
    International Conference on Machine Learning, 2025 (arXiv, pdf, GitHub)

  • SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
    Jintao Zhang, Jia Wei, Pengle Zhang, Jun Zhu, Jianfei Chen#
    International Conference on Learning Representations, 2025 (arXiv, GitHub)

  • ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing
    Ziteng Wang, Jianfei Chen#, and Jun Zhu
    International Conference on Learning Representations, 2025 (arXiv, GitHub)

  • Diffusion Bridge Implicit Models
    Kaiwen Zheng, Guande He, Jianfei Chen, Fan Bao, Jun Zhu
    International Conference on Learning Representations, 2025 (arXiv, pdf)

  • Elucidating the Preconditioning in Consistency Distillation
    Kaiwen Zheng, Guande He, Jianfei Chen, Fan Bao, Jun Zhu
    International Conference on Learning Representations, 2025 (pdf)

  • COAT: Compressing Optimizer states and Activations for Memory-Efficient FP8 Training
    Haocheng Xi, Han Cai, Ligeng Zhu, Yao Lu, Kurt Keutzer, Jianfei Chen#, Song Han#
    International Conference on Learning Representations, 2025 (pdf, arXiv, GitHub)

  • On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent
    Bingrui Li, Wei Huang, Andi Han, Zhanpeng Zhou, Taiji Suzuki, Jun Zhu, Jianfei Chen#
    International Conference on Learning Representations, 2025 (pdf, arXiv)

  • Pruning Large Language Models with Semi-Structural Adaptive Sparse Training
    Weiyu Huang, Yuezhou Hu, Guohao Jian, Jun Zhu, Jianfei Chen#
    To appear in AAAI Conference on Artificial Intelligence, 2025 (arXiv, GitHub)

2024

  • Calibrating Deep Ensemble through Functional Variational Inference
    Zhijie Deng, Feng Zhou, Jianfei Chen, Guoqiang Wu, Jun Zhu
    Transactions on Machine Learning Research (pdf)

  • C-GAIL: Stabilizing Generative Adversarial Imitation Learning with Control Theory
    Tianjiao Luo, Tim Pearce, Huayu Chen, Jianfei Chen, Jun Zhu
    Neural Information Processing Systems (NeurIPS), 2024 (arXiv, pdf)

  • S-STE: Continuous Pruning Function for Efficient 2:4 Sparse Pre-training
    Yuezhou Hu, Jun Zhu, Jianfei Chen#
    Neural Information Processing Systems (NeurIPS), 2024 (arXiv, pdf, GitHub)

  • Consistency Diffusion Bridge Models
    Guande He, Kaiwen Zheng, Jianfei Chen, Fan Bao, Jun Zhu
    Neural Information Processing Systems (NeurIPS), 2024 (pdf)

  • Accelerating Transformer Pre-training with 2:4 Sparsity
    Yuezhou Hu, Kang Zhao, Weiyu Huang, Jianfei Chen#, Jun Zhu
    International Conference on Machine Learning (ICML), 2024 (arXiv, pdf, GitHub)

  • Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization
    Haocheng Xi, Yuxiang Chen, Kang Zhao, Kaijun Zheng, Jianfei Chen#, Jun Zhu
    International Conference on Machine Learning (ICML), 2024 (Spotlight) (arXiv, pdf, GitHub)

  • Efficient Backpropagation with Variance Controlled Adaptive Sampling.
    Ziteng Wang, Jianfei Chen#, Jun Zhu.
    International Conference on Learning Representations (ICLR), 2024 (pdf, GitHub)

2023

  • DPM-Solver-v3: Improved Diffusion ODE Solvers with Empirical Model Statistics.
    Kaiwen Zheng, Cheng Lu, Jianfei Chen, Jun Zhu.
    Neural Information Processing Systems (NeurIPS), 2023 (pdf, GitHub)

  • Training Transformers with 4-bit Integers.
    Haocheng Xi, ChangHao Li, Jianfei Chen#, Jun Zhu.
    Neural Information Processing Systems (NeurIPS), 2023 (pdf, arXiv, GitHub)

  • Memory Efficient Optimizers with 4-bit States.
    Bingrui Li, Jianfei Chen#, Jun Zhu.
    Neural Information Processing Systems (NeurIPS), 2023 (Spotlight) (pdf, arXiv, GitHub)

  • Stabilizing GANs’ Training with Brownian Motion Controller.
    Tianjiao Luo, Ziyu Zhu, Jianfei Chen#, Jun Zhu.
    International Conference on Machine Learning (ICML), 2023 (pdf, arXiv)

  • Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs.
    Kaiwen Zheng, Cheng Lu, Jianfei Chen, Jun Zhu.
    International Conference on Machine Learning (ICML), 2023 (pdf, arXiv, GitHub)

  • Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning.
    Cheng Lu, Huayu Chen, Jianfei Chen, Hang Su, Chongxuan Li, Jun Zhu.
    International Conference on Machine Learning (ICML), 2023 (pdf, arXiv, GitHub)

  • Preserving Pre-trained Features Helps Calibrate Fine-tuned Language Models.
    Guande He, Jianfei Chen#, and Jun Zhu#.
    International Conference on Learning Representations (ICLR), 2023 (pdf)

  • Parameter-efficient fine-tuning of large-scale pre-trained language models.
    Ning Ding, Yujia Qin, Guang Yang, Fuchao Wei, Zonghan Yang, Yusheng Su, Shengding Hu, Yulin Chen, Chi-Min Chan, Weize Chen, Jing Yi, Weilin Zhao, Xiaozhi Wang, Zhiyuan Liu, Hai-Tao Zheng, Jianfei Chen, Yang Liu, Jie Tang, Juanzi Li, and Maosong Sun
    Nature Machine Intelligence, 2023 (pdf)

2022

  • ZhuSuan: design and implementation of differentiable probabilistic programming libraries (in Chinese).
    Jiaxin Shi, Jianfei Chen, and Jun Zhu.
    Sci Sin Inform, 2022, 52: 804–821, doi: 10.1360/SSI-2021-0005 (paper)

  • DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps
    Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, Jun Zhu
    Neural Information Processing Systems (NeurIPS), 2022 (Oral, Accept rate ~1.7%) (arXiv, GitHub)

  • GACT: Activation Compressed Training for Generic Network Architectures
    Xiaoxuan Liu, Lianmin Zheng, Dequan Wang, Yukuo Cen, Weize Chen, Xu Han, Jianfei Chen#, Zhiyuan Liu, Jie Tang, Joseph E. Gonzalez, Michael W. Mahoney, and Alvin Cheung
    International Conference on Machine Learning (ICML), 2022 (pdf, arXiv, GitHub)

  • Maximum Likelihood Training for Score-based Diffusion ODEs by High Order Denoising Score Matching
    Cheng Lu, Kaiwen Zheng, Fan Bao, Chongxuan Li, Jianfei Chen#, Jun Zhu#
    International Conference on Machine Learning (ICML), 2022 (pdf, arXiv, GitHub)

  • Fast Lossless Neural Compression with Integer-Only Discrete Flows
    Siyu Wang, Jianfei Chen#, Chongxuan Li, Jun Zhu#, and Bo Zhang
    International Conference on Machine Learning (ICML), 2022 (pdf, arXiv)

2021

  • ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
    Jianfei Chen*, Lianmin Zheng*, Zhewei Yao, Dequan Wang, Ion Stoica, Michael W. Mahoney, and Joseph E. Gonzalez
    International Conference on Machine Learning (ICML), 2021 (Long talk, Accept rate ~3%) (pdf, arXiv, GitHub)

  • Implicit Normalizing Flows
    Cheng Lu, Jianfei Chen, Chongxuan Li, Qiuhao Wang, and Jun Zhu
    International Conference on Learning Representations (ICLR), 2021 (Spotlight, Accept rate ~5.5%) (pdf, arXiv)

2020

  • A Statistical Framework for Low-bitwidth Training of Deep Neural Networks
    Jianfei Chen, Yu Gai, Zhewei Yao, Michael W. Mahoney, and Joseph E. Gonzalez
    Neural Information Processing Systems (NeurIPS), 2020 (pdf, arXiv, GitHub)

  • VFlow: More Expressive Generative Flows with Variational Data Augmentation
    Jianfei Chen, Cheng Lu, Biqi Chenli, Jun Zhu, and Tian Tian
    International Conference on Machine Learning (ICML), 2020 (pdf, arXiv, GitHub)

2019

  • Efficient Algorithms for Representation Learning (PhD Dissertation, in Chinese).
    Jianfei Chen.

  • Efficient Learning Algorithm for Maximum Entropy Discrimination Topic Models (in Chinese).
    Jianfei Chen and Jun Zhu.
    Pattern Recognition and Artificial Intelligence, 2019 Vol. 32 (8): 736-745 (pdf)

2018

  • Stochastic Expectation Maximization with Variance Reduction.
    Jianfei Chen, Jun Zhu, Yee Whye Teh, and Tong Zhang.
    Neural Information Processing Systems, Montreal, Canada, 2018 (NIPS 2018) (pdf, GitHub)

  • Stochastic Training of Graph Convolutional Networks with Variance Reduction.
    Jianfei Chen, Jun Zhu, and Le Song.
    International Conference on Machine Learning, Stockholm, Sweden, 2018 (ICML 2018) (pdf, arXiv, GitHub)

  • Towards Training Probabilistic Topic Models on Neuromorphic Multi-chip Systems.
    Zihao Xiao, Jun Zhu, and Jianfei Chen.
    AAAI Conference on Artificial Intelligence (AAAI), New Orleans, USA, 2018. (pdf)

  • Scalable Inference for Hierarchical Topic Models.
    Jianfei Chen, Jun Zhu, Jie Lu and Shixia Liu.
    Very Large Data Bases (VLDB), Rio de Janeiro, Brazil, 2018. (pdf, arXiv)

2017

  • ZhuSuan: A Library for Bayesian Deep Learning.
    Jiaxin Shi, Jianfei Chen, Jun Zhu, Shengyang Sun, Yucen Luo, Yihong Gu, and Yuhao Zhou.
    arXiv:1709.05870. (arXiv, GitHub)

  • Population Matching Discrepancy and Applications in Deep Learning.
    Jianfei Chen, Chongxuan Li, Yizhong Ru and Jun Zhu.
    Advances in Neural Information Processing Systems (NIPS), Long Beach, CA, 2017. (pdf, GitHub)

  • Big Learning with Bayesian Methods.
    Jun Zhu, Jianfei Chen, and Wenbo Hu.
    National Science Review 4.4 (2017): 627-651. (pdf, arXiv)

  • SaberLDA: Sparsity-Aware Learning of Topic Models on GPUs.
    Kaiwei Li, Jianfei Chen, Wenguang Chen, and Jun Zhu.
    Architectural Support for Programming Languages and Operating Systems (ASPLOS), Xi'an, China, 2017. (saberlda, arXiv)

2016

  • WarpLDA: a Cache Efficient O(1) Algorithm for Latent Dirichlet Allocation.
    Jianfei Chen, Kaiwei Li, Jun Zhu, and Wenguang Chen.
    Very Large Data Bases (VLDB), New Delhi, India, 2016. (pdf, arXiv, GitHub)

  • Distributing the Stochastic Gradient Sampler for Large-Scale LDA.
    Yuan Yang, Jianfei Chen, and Jun Zhu.
    In Proc. of SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), San Francisco, 2016. (pdf)

  • TopicPanorama: A Full Picture of Relevant Topics.
    Xiting Wang, Shixia Liu, Junlin Liu, Jianfei Chen, Jun Zhu, and Baining Guo.
    IEEE Transactions on Visualization and Computer Graphics, 2016. (pdf)

  • Scaling up Dynamic Topic Models.
    Arnab Bhadury, Jianfei Chen, Jun Zhu, and Shixia Liu.
    World Wide Web Conference (WWW), Montreal, Canada, 2016. (pdf, arXiv)

2015

  • Dropout Training for SVMs with Data Augmentation.
    Ning Chen, Jun Zhu, Jianfei Chen, and Ting Chen.
    Frontiers of Computer Science (2015): 1-20. (pdf, arXiv)

2014

  • TopicPanorama: a Full Picture of Relevant Topics.
    Shixia Liu, Xiting Wang, Jianfei Chen, Jun Zhu, and Baining Guo.
    Proc. of IEEE Visualization, Paris, France, 2014.

  • Bayesian Max-Margin Multitask Learning with Data Augmentation.
    Chengtao Li, Jun Zhu, and Jianfei Chen.
    In Proc. of International Conference on Machine Learning, Beijing, China, 2014. (pdf)

  • Dropout Training for Support Vector Machines.
    Ning Chen, Jun Zhu, Jianfei Chen and Bo Zhang.
    Association for the Advancement of Artificial Intelligence (AAAI), 2014. (pdf)

2013

  • Scalable Inference for Logistic-Normal Topic Models.
    Jianfei Chen, Jun Zhu, Zi Wang, Xun Zheng and Bo Zhang.
    Advances in Neural Information Processing Systems (NIPS), 2013. (pdf, GitHub, demo)