[ICS 2024] Chendi Li*, Yufan Xu*, Sina Mahdipour Saravani, Ponnuswamy Sadayappan.
Accelerated Auto-Tuning of GPU Kernels for Tensor Computations.
[TPDS 2024] Cunyang Wei, Haipeng Jia, Yunquan Zhang, Jianyu Yao, Chendi Li, Wenxuan Cao.
IrGEMM: An Input-Aware Tuning Framework for Irregular GEMM on ARM and X86 CPUs.
[ICS 2023] Tun Chen, Haipeng Jia, Yunquan Zhang, Kun Li, Zhihao Li, Xiang Zhao, Jianyu Yao, Chendi Li.
OpenFFT: An Adaptive Tuning Framework for 3D FFT on ARM Multicore CPUs.
[ISPA 2021] Chendi Li, Haipeng Jia, Hang Cao, Jianyu Yao, Boqian Shi, Chunyang Xiang, Jinbo Sun, Pengqi Lu, Yunquan Zhang.
AutoTSMM: An Auto-tuning Framework for Building High-Performance Tall-and-Skinny Matrix-Matrix Multiplication on CPUs.
[HPCC 2021] Tun Chen, Haipeng Jia, Zhihao Li, Chendi Li, Yunquan Zhang.
A Transpose-free Three-dimensional FFT Algorithm on ARM CPUs.
[ICPADS 2021] Jianyu Yao, Boqian Shi, Chunyang Xiang, Haipeng Jia, Chendi Li, Hang Cao, Yunquan Zhang.
IAAT: An Input-Aware Adaptive Tuning Framework for Small GEMM.