Explainable decision-making framework concealed in graph statistical models of chemistry
Date
2025-01
Publisher
University of New Brunswick
Abstract
Recent advancements in probabilistic models in chemistry have unlocked ground-breaking potential, yet these innovations call for heightened caution. Decisions made by techniques such as neural network models are seldom fully understood, even by their developers, making it difficult to integrate these models into established scientific discourse. Nevertheless, their use remains widespread and is likely to increase, as they generate predictions that rival or surpass traditional chemical models in efficiency. This great potential, combined with a lack of explainability, has placed these models under increasing scrutiny and given rise to the field of explainable artificial intelligence.
This thesis investigates graph probabilistic models of chemistry, particularly graph neural networks, to develop an explanatory framework of decision-making that can be quantitatively blueprinted and replicated. We probe the cryptic, high-dimensional feature space of these models, compacting its dimensions to elucidate a decision-making framework based on molecular substructures. We then demonstrate that this decision-making framework is organized around the language and syntax of chemical formulas, from which the hidden framework can be replicated, while also providing a novel way of exploring reactions. Finally, we show the completeness of these models by transferring their capabilities to a wide range of chemical problems, from predicting pKa values and NMR data to modeling electron density and solubility.