Lee, L. K., Yeo, H., Lin, Z., Senthilnath, J., Zhou, B., & Yoon, J. W. (2023). Analysis of error distribution in catalyst-adsorbate binding energies generated by GNN. https://doi.org/10.14293/p2199-8442.1.sop-.pibjc1.v1
Abstract:
Scalable storage of renewable energy, needed to mitigate climate change, requires high-performance catalysts. Artificial Intelligence (AI) can help discover such catalysts, which has prompted data scientists to train models for the task. On the OC20 dataset (IS2RE task), the Graphormer model, a type of Graph Neural Network (GNN), performed the best. We look for trends in the error distribution of the predicted catalyst-adsorbate binding energies in order to explain those errors and suggest improvements.
GNNs build on graph theory, which makes them well suited to graph-structured data. A molecule can be represented as a graph in which nodes stand for atoms and edges stand for bonds. A GNN can therefore capture the relationships and dependencies between the atoms and bonds in a molecule, which is important for predicting and understanding the molecule's properties and behaviour.
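As a minimal sketch of the graph representation described above, the following Python snippet encodes methanol as a list of atoms (nodes) and bonds (edges) and applies one round of neighbour aggregation. The molecule, the use of atomic number as the only node feature, and the function names `build_adjacency` and `message_passing_step` are assumptions chosen for illustration. This is not the Graphormer architecture, only the basic idea of passing information along edges.

```python
from collections import defaultdict

# Toy molecular graph for methanol (CH3OH); atom indices are arbitrary labels.
atoms = ["C", "O", "H", "H", "H", "H"]
bonds = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5)]  # undirected edges
ATOMIC_NUMBER = {"H": 1, "C": 6, "O": 8}


def build_adjacency(n_atoms, bonds):
    """Return {atom index: sorted list of bonded neighbour indices}."""
    adj = defaultdict(list)
    for i, j in bonds:
        adj[i].append(j)
        adj[j].append(i)
    return {i: sorted(adj[i]) for i in range(n_atoms)}


def message_passing_step(features, adjacency):
    """One illustrative update: each node adds its neighbours' features to its own."""
    return [
        features[i] + sum(features[j] for j in adjacency[i])
        for i in range(len(features))
    ]


adjacency = build_adjacency(len(atoms), bonds)
features = [ATOMIC_NUMBER[a] for a in atoms]  # one scalar feature per node
updated = message_passing_step(features, adjacency)
print(updated)
```

After one step, each atom's feature already reflects its bonded neighbours; stacking several such steps lets information travel further across the molecule. Real GNNs replace the plain sum with learned functions.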
We worked in Python and Jupyter Notebook, using packages including matplotlib, pandas, and the Atomic Simulation Environment (ASE). Data exploration, wrangling, and analysis yielded several insights.
Smaller unit cells tended to have larger errors. Across the period, errors show local minima at Group 10. Chlorine, perhaps because it is the only halogen, had an abnormally large error. Adsorbates containing nitrogen had greater errors than those grouped as C1, C2, or O/H only.
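A grouped comparison of this kind can be sketched as a mean absolute error (MAE) per adsorbate class. The snippet below uses only the Python standard library; the same grouping could be written as a pandas `groupby`. The records are invented placeholders, not OC20 data or results from this work, and the function name `mae_by_group` is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Placeholder records, invented for illustration only (NOT OC20 data):
# (adsorbate class, reference energy in eV, predicted energy in eV)
records = [
    ("C1", -1.00, -0.90),
    ("C1", -2.00, -2.30),
    ("N", 0.50, 1.00),
    ("N", -1.50, -0.50),
    ("O/H", -0.40, -0.50),
]


def mae_by_group(records):
    """Return {group: mean absolute error between predicted and reference energy}."""
    errors = defaultdict(list)
    for group, reference, predicted in records:
        errors[group].append(abs(predicted - reference))
    return {group: mean(values) for group, values in errors.items()}


mae = mae_by_group(records)
for group, value in sorted(mae.items()):
    print(f"{group}: MAE = {value:.2f} eV")
```

The same pattern applies to any other grouping key, such as the adsorbed element or a binned unit-cell size, by changing the first field of each record.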
License type:
Attribution 4.0 International (CC BY 4.0)
Funding Info:
This research / project is supported by the A*STAR Accelerated Materials Development for Manufacturing Program
Grant Reference No.: A1898b0043