Conference Papers

Scalable Detection of Floating-point Errors via Adaptive Parallel Subdomain Search

In Proceedings of the 25th IEEE International Conference on Software Quality, Reliability and Security (QRS 2025)

Zuoyan Zhang, Shihan Yuan, Hongru Yang, Jie Zhao, Jinchen Xu

Optimizing Deep Learning Inference Efficiency through Block Dependency Analysis

In Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2025)

Zhanyuan Di, Leping Wang, En Shao, Zhaojia Ma, Ziyi Ren, Feng Hua, Lixian Ma, Jie Zhao, Guangming Tan, Ninghui Sun

Magneto: Accelerating Parallel Structures in DNNs via Co-Optimization of Operators (poster)

In Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP 2025)

Zhanyuan Di, Leping Wang, Ziyi Ren, En Shao, Jie Zhao, Siyuan Feng, Dingwen Tao, Guangming Tan, Ninghui Sun

Post-Link Outlining for Code Size Reduction

In Proceedings of the ACM SIGPLAN 2025 International Conference on Compiler Construction (CC 2025)

Shaobai Yuan, Jihong He, Yihui Xie, Feng Wang, Jie Zhao

Arfa: An Agile Regime-based Floating-point Optimization Approach for Rounding Errors

In Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2024)

Jinchen Xu, Mengqi Cui, Fei Li, Zuoyan Zhang, Hongru Yang, Bei Zhou, Jie Zhao

Enabling Tensor Language Model to Assist in Generating High-Performance Tensor Programs for Deep Learning

In Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2024)

Yi Zhai, Sijia Yang, Keyu Pan, Renwei Zhang, Shuo Liu, Chao Liu, Zichun Ye, Jianmin Ji, Jie Zhao, Yu Zhang, Yanyong Zhang

A Holistic Approach to Automatic Mixed-Precision Code Generation and Tuning for Affine Programs

In Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP 2024)

Jinchen Xu, Guanghui Song, Bei Zhou, Fei Li, Jiangwei Hao, Jie Zhao

Apollo: Automatic Partition-based Operator Fusion through Layer by Layer Optimization

In Proceedings of Machine Learning and Systems (MLSys 2022)

Jie Zhao, Xiong Gao, Ruijie Xia, Zhaochuang Zhang, Deshi Chen, Lei Chen, Renwei Zhang, Zhen Geng, Bin Cheng, Xuefeng Jin

Eiffel: Inferring Input Ranges of Significant Floating-point Errors via Polynomial Extrapolation

In Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE 2023)

Zuoyan Zhang, Bei Zhou, Jiangwei Hao, Hongru Yang, Mengqi Cui, Yuchang Zhou, Guanghui Song, Fei Li, Jinchen Xu, Jie Zhao

SIRIUS: Harvesting Whole-Program Optimization Opportunities for DNNs

In Proceedings of Machine Learning and Systems (MLSys 2023)

Yijin Li, Jiacheng Zhao, Qianqi Sun, Haohui Mai, Lei Chen, Wanlu Cao, Yanfan Chen, Zhicheng Li, Ying Liu, Xinyuan Zhang, Xiyu Shi, Jie Zhao, Jingling Xue, Huimin Cui, Xiaobing Feng