PPoPP 2024
Sat 2 - Wed 6 March 2024 Edinburgh, United Kingdom
Tue 5 Mar 2024 11:50 - 12:10 at Moorfoot - ML Workloads Chair(s): Xipeng Shen

Two distinguishing features of state-of-the-art mobile and autonomous systems are that 1) multiple workloads, mainly deep neural network (DNN) inference, often run concurrently and continuously; and 2) they operate on shared-memory systems-on-chip (SoCs) that embed heterogeneous accelerators tailored for specific operations. The state of the art lacks the efficient performance and resource management techniques necessary to either maximize total system throughput or minimize end-to-end workload latency. In this work, we propose HaX-CoNN, a novel scheme that characterizes and maps layers in concurrently executing DNN inference workloads to a diverse set of accelerators within an SoC. Our scheme uniquely takes per-layer execution characteristics, shared memory (SM) contention, and inter-accelerator transitions into account to find optimal schedules. We evaluate HaX-CoNN on NVIDIA Orin, NVIDIA Xavier, and Qualcomm Snapdragon 865 SoCs. Our experimental results indicate that HaX-CoNN reduces memory contention by up to 45% and can improve latency and total throughput by up to 32% and 29%, respectively, compared with state-of-the-art approaches.
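To make the scheduling problem concrete, the following toy sketch searches layer-to-accelerator assignments for two concurrent DNNs by brute force. It is not the HaX-CoNN algorithm, whose formulation the abstract does not give. Everything here is an assumption made for illustration: the accelerator names, the lockstep execution model (layer i of one DNN co-runs with layer i of the other), the linear contention slowdown, the fixed transition cost, and the function names `schedule_cost` and `best_schedule`.

```python
from itertools import product

# Hypothetical accelerator names; a real SoC may expose different units.
ACCELERATORS = ("GPU", "DLA")


def schedule_cost(assign_a, assign_b, times_a, times_b, mem_a, mem_b,
                  transition_cost, contention):
    """Total time of a lockstep schedule under a toy cost model.

    times_x[i][acc] is the standalone time of layer i on accelerator acc.
    mem_x[i] is an assumed memory-intensity score in [0, 1] for layer i.
    """
    total = 0.0
    for i in range(len(assign_a)):
        acc_a, acc_b = assign_a[i], assign_b[i]
        t_a = times_a[i][acc_a]
        t_b = times_b[i][acc_b]
        if acc_a == acc_b:
            # Same accelerator: the two layers are serialized.
            step = t_a + t_b
        else:
            # Different accelerators run in parallel, but each layer is
            # slowed by the other's shared-memory traffic (assumed linear).
            t_a *= 1.0 + contention * mem_b[i]
            t_b *= 1.0 + contention * mem_a[i]
            step = max(t_a, t_b)
        total += step
        if i > 0:
            # Pay a fixed cost whenever a DNN moves to another accelerator.
            switches = (assign_a[i] != assign_a[i - 1]) + \
                       (assign_b[i] != assign_b[i - 1])
            total += transition_cost * switches
    return total


def best_schedule(times_a, times_b, mem_a, mem_b, transition_cost, contention):
    """Exhaustively search all assignments; returns (cost, assign_a, assign_b)."""
    n = len(times_a)
    best = None
    for assign_a in product(ACCELERATORS, repeat=n):
        for assign_b in product(ACCELERATORS, repeat=n):
            cost = schedule_cost(assign_a, assign_b, times_a, times_b,
                                 mem_a, mem_b, transition_cost, contention)
            if best is None or cost < best[0]:
                best = (cost, assign_a, assign_b)
    return best
```

The exhaustive search grows exponentially with the number of layers, so it is only usable for tiny examples; it is meant to show how per-layer times, contention, and transition costs can interact in one objective, not how a practical scheduler would search.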

Tue 5 Mar

Displayed time zone: London

11:30 - 12:30
ML Workloads (Main Conference) at Moorfoot
Chair(s): Xipeng Shen (North Carolina State University)
11:30
20m
Talk
Tetris: Accelerating Sparse Convolution by Exploiting Memory Reuse on GPU
Main Conference
Xiaoyan Liu (Beihang University), Xuegui Zheng (Beihang University), Hailong Yang (Beihang University, China), Zhongzhi Luan (Beihang University), Depei Qian (Beihang University, China)
11:50
20m
Talk
Shared Memory-contention-aware Concurrent DNN Execution for Diversely Heterogeneous System-on-Chips
Main Conference
Ismet Dagli (Colorado School of Mines), Mehmet Belviranli (Colorado School of Mines)
12:10
20m
Talk
Training one DeePMD Model in Minutes: a Step Towards Online Learning
Main Conference
Siyu Hu (Institute of Computing Technology, Chinese Academy of Sciences), Tong Zhao (Institute of Computing Technology, Chinese Academy of Sciences), Qiuchen Sha (Institute of Computing Technology, Chinese Academy of Sciences), Enji Li (Institute of Computing Technology, Chinese Academy of Sciences), Xiangyu Meng (College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum), Liping Liu (Institute of Semiconductors, Chinese Academy of Sciences), Lin-Wang Wang (Institute of Semiconductors, Chinese Academy of Sciences), Guangming Tan (Chinese Academy of Sciences (CAS)), Weile Jia (Institute of Computing Technology, Chinese Academy of Sciences)