Distributed multi-scale high-throughput intelligent calculations and data management platform and its application in material design

Guanjie Wang1,2, Jian Zhou1 and Zhimei Sun1*

1School of Materials Science and Engineering, Beihang University, Beijing, 100191, China

2School of Integrated Circuit Science and Engineering, Beihang University, Beijing, 100191, China

EXTENDED ABSTRACT: The data-driven fourth scientific paradigm has overturned the traditional trial-and-error model of materials research and development, and has improved the speed and efficiency of new material discovery and design. With the development of supercomputers and machine learning algorithms, it is critical to develop an intelligent computing platform that integrates multi-scale simulations, distributed calculations, high-throughput automatic workflows, materials databases and machine learning. Here, we introduce our self-developed distributed multi-scale high-throughput intelligent computing and data management platform, ALKEMIE2.0, an acronym for Artificial Learning and Knowledge Enhanced Materials Informatics Engineering (https://alkemine.org), and its application in defect structure prediction, efficient thermal conductivity calculation, machine-learning prediction of thermoelectric performance, and neural-network interatomic potentials. ALKEMIE2.0 is based on the AMDIV design concept, which comprises five core elements that serve as the basic software infrastructure of materials genome engineering: automation, modularization, materials database, artificial intelligence and visualization. The platform integrates multi-scale simulation software through a data coupling interface. Further, ALKEMIE2.0 is built on the ALKEMIE-Server middleware, which can automatically start daemon services and exchange information across distributed supercomputers. Its high-throughput calculation workflows, which support concurrency on the order of 10^4 tasks, are implemented by integrating automatic frameworks for model construction, calculation workflows and data analysis. In addition, ALKEMIE2.0 includes shared and private multi-type materials databases, which can exchange data through unique ALKEMIE data identifiers based on the FAIR principles.
Finally, it can realize dynamic interactive visualization of machine learning model training and application.

Keywords: High-throughput calculations; Multi-scale calculations; Materials database; Machine learning; Intelligent computing platform; Materials genome engineering. 

Brief Introduction of Speaker
Guanjie Wang

Guanjie Wang is currently an assistant professor at Beihang University, China. He obtained his PhD degree in Material Physics and Chemistry in 2022 under the supervision of Prof. Zhimei Sun. His research interests are computational materials science (especially first-principles calculations and molecular dynamics), phase-change materials, machine learning potentials, and the development of high-throughput automatic visualization computing platforms. He has published 12 peer-reviewed papers in journals including JACS, JAC, JPCS and CMS, and has been granted 9 software copyrights. He won the Best Poster Award at the 3rd, 4th and 5th Forum of Materials Genome Engineering, and received the 2022 Materials Genome Engineering Young Scientist Award.