Publications
Google Scholar
Work in progress
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
Jintao Zhang, Jia Wei, Pengle Zhang, Jun Zhu, Jianfei Chen#
(arXiv, GitHub)
1-Bit FQT: Pushing the Limit of Fully Quantized Training to 1-bit
Chang Gao, Jianfei Chen#, Kang Zhao, Jiaqi Wang, Liping Jing#
(arXiv)
Diffusion Bridge Implicit Models
Kaiwen Zheng, Guande He, Jianfei Chen, Fan Bao, Jun Zhu
(arXiv)
SparseDM: Toward Sparse Efficient Diffusion Models
Kafeng Wang, Jianfei Chen#, He Li, Zhenpeng Mi, Jun Zhu#
(arXiv)
2025
Pruning Large Language Models with Semi-Structural Adaptive Sparse Training
Weiyu Huang, Yuezhou Hu, Guohao Jian, Jun Zhu, Jianfei Chen#
To appear in AAAI Conference on Artificial Intelligence, 2025 (arXiv)
2024
Calibrating Deep Ensemble through Functional Variational Inference
Zhijie Deng, Feng Zhou, Jianfei Chen, Guoqiang Wu, Jun Zhu
Transactions on Machine Learning Research (pdf)
C-GAIL: Stabilizing Generative Adversarial Imitation Learning with Control Theory
Tianjiao Luo, Tim Pearce, Huayu Chen, Jianfei Chen, Jun Zhu
Neural Information Processing Systems (NeurIPS), 2024 (arXiv, pdf)
S-STE: Continuous Pruning Function for Efficient 2:4 Sparse Pre-training
Yuezhou Hu, Jun Zhu, Jianfei Chen#
Neural Information Processing Systems (NeurIPS), 2024 (arXiv, pdf)
Consistency Diffusion Bridge Models
Guande He, Kaiwen Zheng, Jianfei Chen, Fan Bao, Jun Zhu
Neural Information Processing Systems (NeurIPS), 2024 (pdf)
Accelerating Transformer Pre-Training with 2:4 Sparsity
Yuezhou Hu, Kang Zhao, Weiyu Huang, Jianfei Chen#, Jun Zhu
International Conference on Machine Learning (ICML), 2024 (arXiv, pdf, GitHub)
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization
Haocheng Xi, Yuxiang Chen, Kang Zhao, Kaijun Zheng, Jianfei Chen#, Jun Zhu
International Conference on Machine Learning (ICML), 2024 (arXiv, pdf, GitHub)
Efficient Backpropagation with Variance Controlled Adaptive Sampling.
Ziteng Wang, Jianfei Chen#, Jun Zhu.
International Conference on Learning Representations (ICLR), 2024 (pdf, GitHub)
2023
DPM-Solver-v3: Improved Diffusion ODE Solvers with Empirical Model Statistics.
Kaiwen Zheng, Cheng Lu, Jianfei Chen, Jun Zhu.
Neural Information Processing Systems (NeurIPS), 2023 (pdf, GitHub)
Training Transformers with 4-bit Integers.
Haocheng Xi, ChangHao Li, Jianfei Chen#, Jun Zhu.
Neural Information Processing Systems (NeurIPS), 2023 (pdf, arXiv, GitHub)
Memory Efficient Optimizers with 4-bit States.
Bingrui Li, Jianfei Chen#, Jun Zhu.
Neural Information Processing Systems (NeurIPS), 2023 (Spotlight) (pdf, arXiv, GitHub)
Stabilizing GANs’ Training with Brownian Motion Controller.
Tianjiao Luo, Ziyu Zhu, Jianfei Chen#, Jun Zhu.
International Conference on Machine Learning (ICML), 2023 (pdf, arXiv)
Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs.
Kaiwen Zheng, Cheng Lu, Jianfei Chen, Jun Zhu.
International Conference on Machine Learning (ICML), 2023 (pdf, arXiv, GitHub)
Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning.
Cheng Lu, Huayu Chen, Jianfei Chen, Hang Su, Chongxuan Li, Jun Zhu.
International Conference on Machine Learning (ICML), 2023 (pdf, arXiv, GitHub)
Preserving Pre-trained Features Helps Calibrate Fine-tuned Language Models.
Guande He, Jianfei Chen#, and Jun Zhu#.
International Conference on Learning Representations (ICLR), 2023 (pdf)
Parameter-efficient fine-tuning of large-scale pre-trained language models.
Ning Ding, Yujia Qin, Guang Yang, Fuchao Wei, Zonghan Yang, Yusheng Su, Shengding Hu, Yulin Chen, Chi-Min Chan, Weize Chen, Jing Yi, Weilin Zhao, Xiaozhi Wang,
Zhiyuan Liu, Hai-Tao Zheng, Jianfei Chen, Yang Liu, Jie Tang, Juanzi Li, and Maosong Sun
Nature Machine Intelligence, 2023 (pdf)
2022
ZhuSuan: design and implementation of differentiable probabilistic programming libraries (in Chinese).
Jiaxin Shi, Jianfei Chen, and Jun Zhu.
Sci Sin Inform, 2022, 52: 804–821, doi: 10.1360/SSI-2021-0005 (paper)
DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps
Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, Jun Zhu
Neural Information Processing Systems (NeurIPS), 2022 (Oral, Accept rate ~1.7%) (arXiv, GitHub)
GACT: Activation Compressed Training for Generic Network Architectures
Xiaoxuan Liu, Lianmin Zheng, Dequan Wang, Yukuo Cen, Weize Chen, Xu Han,
Jianfei Chen#, Zhiyuan Liu, Jie Tang, Joseph E. Gonzalez, Michael W. Mahoney, and Alvin Cheung
International Conference on Machine Learning (ICML), 2022 (pdf, arXiv, GitHub)
Maximum Likelihood Training for Score-based Diffusion ODEs by High Order Denoising Score Matching
Cheng Lu, Kaiwen Zheng, Fan Bao, Chongxuan Li, Jianfei Chen#, Jun Zhu#
International Conference on Machine Learning (ICML), 2022 (pdf, arXiv, GitHub)
Fast Lossless Neural Compression with Integer-Only Discrete Flows
Siyu Wang, Jianfei Chen#, Chongxuan Li, Jun Zhu#, and Bo Zhang
International Conference on Machine Learning (ICML), 2022 (pdf, arXiv)
2021
ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
Jianfei Chen*, Lianmin Zheng*, Zhewei Yao, Dequan Wang, Ion Stoica, Michael W. Mahoney, and Joseph E. Gonzalez
International Conference on Machine Learning (ICML), 2021 (Long talk, Accept rate ~3%) (pdf, arXiv, GitHub)
Implicit Normalizing Flows
Cheng Lu, Jianfei Chen, Chongxuan Li, Qiuhao Wang, and Jun Zhu
International Conference on Learning Representations (ICLR), 2021 (Spotlight, Accept rate ~5.5%) (pdf, arXiv)
2020
A Statistical Framework for Low-bitwidth Training of Deep Neural Networks
Jianfei Chen, Yu Gai, Zhewei Yao, Michael W. Mahoney, and Joseph E. Gonzalez
Neural Information Processing Systems (NeurIPS), 2020 (pdf, arXiv, GitHub)
VFlow: More Expressive Generative Flows with Variational Data Augmentation
Jianfei Chen, Cheng Lu, Biqi Chenli, Jun Zhu, and Tian Tian
International Conference on Machine Learning (ICML), 2020 (pdf, arXiv, GitHub)
2018
Stochastic Expectation Maximization with Variance Reduction.
Jianfei Chen, Jun Zhu, Yee Whye Teh, and Tong Zhang.
Neural Information Processing Systems, Montreal, Canada, 2018 (NIPS 2018) (pdf, GitHub)
Stochastic Training of Graph Convolutional Networks with Variance Reduction.
Jianfei Chen, Jun Zhu, and Le Song.
International Conference on Machine Learning, Stockholm, Sweden, 2018 (ICML 2018) (pdf, arXiv, GitHub)
Towards Training Probabilistic Topic Models on Neuromorphic Multi-chip Systems.
Zihao Xiao, Jun Zhu, and Jianfei Chen
AAAI Conference on Artificial Intelligence (AAAI), New Orleans, USA, 2018. (pdf)
Scalable Inference for Hierarchical Topic Models.
Jianfei Chen, Jun Zhu, Jie Lu and Shixia Liu.
Very Large Data Bases (VLDB), Rio de Janeiro, Brazil, 2018. (pdf, arXiv)
2017
ZhuSuan: A Library for Bayesian Deep Learning.
Jiaxin Shi, Jianfei Chen, Jun Zhu, Shengyang Sun, Yucen Luo, Yihong Gu, and Yuhao Zhou.
arXiv:1709.05870. (arXiv, GitHub)
Population Matching Discrepancy and Applications in Deep Learning.
Jianfei Chen, Chongxuan Li, Yizhong Ru and Jun Zhu.
Advances in Neural Information Processing Systems (NIPS), Long Beach, CA, 2017. (pdf, GitHub)
Big Learning with Bayesian Methods.
Jun Zhu, Jianfei Chen, and Wenbo Hu.
National Science Review 4.4 (2017): 627-651. (pdf, arXiv)
SaberLDA: Sparsity-Aware Learning of Topic Models on GPUs.
Kaiwei Li, Jianfei Chen, Wenguang Chen, and Jun Zhu.
Architectural Support for Programming Languages and Operating Systems (ASPLOS), Xi'an, China, 2017. (saberlda, arXiv)
2016
WarpLDA: a Cache Efficient O(1) Algorithm for Latent Dirichlet Allocation.
Jianfei Chen, Kaiwei Li, Jun Zhu, and Wenguang Chen.
Very Large Data Bases (VLDB), New Delhi, India, 2016. (pdf, arXiv, GitHub)
Distributing the Stochastic Gradient Sampler for Large-Scale LDA.
Yuan Yang, Jianfei Chen, and Jun Zhu.
In Proc. of SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), San Francisco, 2016. (pdf)
TopicPanorama: A Full Picture of Relevant Topics.
Xiting Wang, Shixia Liu, Junlin Liu, Jianfei Chen, Jun Zhu, and Baining Guo.
IEEE Transactions on Visualization and Computer Graphics, 2016. (pdf)
Scaling up Dynamic Topic Models.
Arnab Bhadury, Jianfei Chen, Jun Zhu, and Shixia Liu.
World Wide Web Conference (WWW), Montreal, Canada, 2016. (pdf, arXiv)
2015
Dropout Training for SVMs with Data Augmentation.
Ning Chen, Jun Zhu, Jianfei Chen, and Ting Chen.
Frontiers of Computer Science (2015): 1-20. (pdf, arXiv)
2014
TopicPanorama: a Full Picture of Relevant Topics.
Shixia Liu, Xiting Wang, Jianfei Chen, Jun Zhu, and Baining Guo.
Proc. of IEEE Visualization, Paris, France, 2014.
Bayesian Max-Margin Multitask Learning with Data Augmentation.
Chengtao Li, Jun Zhu, and Jianfei Chen.
In Proc. of International Conference on Machine Learning, Beijing, China, 2014. (pdf)
Dropout Training for Support Vector Machines.
Ning Chen, Jun Zhu, Jianfei Chen and Bo Zhang.
Association for the Advancement of Artificial Intelligence (AAAI), 2014. (pdf)
2013
Scalable Inference for Logistic-Normal Topic Models.
Jianfei Chen, Jun Zhu, Zi Wang, Xun Zheng and Bo Zhang.
Advances in Neural Information Processing Systems (NIPS), 2013. (pdf, GitHub, demo)