PPoPP 2024
Sat 2 - Wed 6 March 2024, Edinburgh, United Kingdom
Tue 5 Mar 2024 12:10 - 12:30 at Moorfoot - ML Workloads Chair(s): Xipeng Shen

Neural Network Molecular Dynamics (NNMD) has become a major approach in material simulations: it can speed up molecular dynamics (MD) simulation by thousands of times while maintaining ab initio accuracy, and thus has the potential to fundamentally change the paradigm of material simulation. However, NNMD development faces two time-consuming bottlenecks. One is accessing the ab initio calculation results used as training data. The other, which is the focus of this work, is the training time of the NNMD model itself. Training an NNMD model differs from most other neural network training because the atomic force (which is related to the gradient of the network output) is an important physical property to be fitted. Tests show that traditional stochastic gradient methods, such as the Adam algorithm, cannot efficiently exploit multi-sample minibatches. As a result, a typical training run (taking Deep Potential Molecular Dynamics (DeePMD) as an example) can take many hours. In this work, we design a heuristic minibatch quasi-Newton optimizer based on the Extended Kalman Filter (EKF) method. An early reduction of gradients and errors is adopted to reduce the memory footprint and communication. The memory footprint, communication cost, and hyper-parameter settings of the new method are analyzed in detail. Computational innovations, such as customized kernels for the symmetry-preserving descriptor, are applied to exploit the computing power of the heterogeneous architecture. Experiments are performed on 8 datasets representing different real-world situations, and the numerical results show that our new method achieves an average speedup of 32.2x over the Reorganized Layer-wised Extended Kalman Filter (RLEKF) on 1 GPU, reducing the absolute training time of one DeePMD model from hours to several minutes and taking one step toward online training.
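The abstract gives no code, but the optimizer it describes follows the general shape of a layer-wise EKF applied to network weights. Below is a minimal NumPy sketch of one such minibatch update; the function name, the scalar-measurement formulation, the forgetting-factor value, and the averaging-based "early reduction" are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def ekf_minibatch_update(w, P, H_batch, err_batch, lam=0.98):
    """One EKF step for one layer's flattened weights w (shape (n,)).

    H_batch   : (b, n) per-sample Jacobians of the scalar model output w.r.t. w
    err_batch : (b,)   per-sample errors (target minus prediction)
    lam       : forgetting factor; the value here is illustrative, not from the paper
    """
    # "Early reduction" (assumed form): average the per-sample Jacobians and
    # errors before the update, so only one (n,)-vector and one scalar need to
    # be stored and communicated rather than b of each.
    H = H_batch.mean(axis=0)           # (n,)
    err = err_batch.mean()             # scalar

    Ph = P @ H                         # (n,)
    a = 1.0 / (lam + H @ Ph)           # scalar innovation term
    K = a * Ph                         # Kalman gain, (n,)
    w = w + K * err                    # quasi-Newton-like weight update
    P = (P - np.outer(K, Ph)) / lam    # covariance update with forgetting
    return w, P

# Toy usage with random data, just to show the shapes involved.
rng = np.random.default_rng(0)
n, b = 8, 4
w, P = rng.normal(size=n), np.eye(n)
w, P = ekf_minibatch_update(w, P, rng.normal(size=(b, n)), rng.normal(size=b))
```

Keeping a per-layer covariance P rather than one global matrix is what makes the memory footprint tractable; the reduction before the update is one way such a scheme can cut communication in a multi-GPU setting.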

Tue 5 Mar

Displayed time zone: London

11:30 - 12:30
ML Workloads (Main Conference) at Moorfoot
Chair(s): Xipeng Shen North Carolina State University
11:30
20m
Talk
Tetris: Accelerating Sparse Convolution by Exploiting Memory Reuse on GPU
Main Conference
Xiaoyan Liu Beihang University, Xuegui Zheng Beihang University, Hailong Yang Beihang University, China, Zhongzhi Luan Beihang University, Depei Qian Beihang University, China
11:50
20m
Talk
Shared Memory-contention-aware Concurrent DNN Execution for Diversely Heterogeneous System-on-Chips
Main Conference
Ismet Dagli Colorado School of Mines, Mehmet Belviranli Colorado School of Mines
12:10
20m
Talk
Training one DeePMD Model in Minutes: a Step Towards Online Learning
Main Conference
Siyu Hu Institute of Computing Technology, Chinese Academy of Sciences, Tong Zhao Institute of Computing Technology, Chinese Academy of Sciences, Qiuchen Sha Institute of Computing Technology, Chinese Academy of Sciences, Enji Li Institute of Computing Technology, Chinese Academy of Sciences, Xiangyu Meng College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Liping Liu Institute of Semiconductors, Chinese Academy of Sciences, Lin-Wang Wang Institute of Semiconductors, Chinese Academy of Sciences, Guangming Tan Chinese Academy of Sciences (CAS), Weile Jia Institute of Computing Technology, Chinese Academy of Sciences