Persistent Path-Spectral (PPS) Based Machine Learning for Protein–Ligand Binding Affinity Prediction

Abstract

Molecular descriptors are essential to quantitative structure-activity/property relationship (QSAR/QSPR) models and machine learning models. Here we propose, for the first time, the persistent path-spectral (PPS), PPS-based molecular descriptors, and a PPS-based machine learning model for predicting protein–ligand binding affinity. For graph, simplicial complex, and hypergraph representations of molecular structures and interactions, a path-Laplacian can be constructed, and the path-spectral derived from it naturally gives a quantitative description of molecules. Further, by introducing a filtration process on the representation, the persistent path-spectral can be derived, which gives a multiscale characterization of molecules. Molecular descriptors built from persistent path-spectral attributes are then combined with a machine learning model, specifically the gradient boosting tree, to form our PPS-ML model. We test our model on three of the most commonly used data sets, PDBbind-v2007, PDBbind-v2013, and PDBbind-v2016, and it achieves competitive results.
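
As a rough illustration of the pipeline described in the abstract, the sketch below computes persistent spectral features for the graph case only. It is not the published implementation: it assumes the k-path Laplacian is the matrix whose off-diagonal entry is -1 when two atoms are exactly k hops apart and whose diagonal counts such neighbours, it assumes a distance-cutoff filtration, and the function names (`hop_distances`, `path_laplacian`, `pps_features`) and the choice of summary statistics are hypothetical.

```python
import numpy as np


def hop_distances(adj):
    """All-pairs shortest-path hop counts via BFS; unreachable pairs stay np.inf."""
    n = adj.shape[0]
    dist = np.full((n, n), np.inf)
    for s in range(n):
        dist[s, s] = 0
        frontier, d = [s], 0
        while frontier:
            d += 1
            nxt = []
            for u in frontier:
                for v in np.flatnonzero(adj[u]):
                    if dist[s, v] == np.inf:
                        dist[s, v] = d
                        nxt.append(v)
            frontier = nxt
    return dist


def path_laplacian(adj, k):
    """Assumed k-path Laplacian: D_k - A_k, where A_k marks pairs exactly k hops apart."""
    at_k = (hop_distances(adj) == k).astype(float)
    return np.diag(at_k.sum(axis=1)) - at_k


def pps_features(coords, cutoffs, k=1, tol=1e-8):
    """Spectral summaries of the k-path Laplacian at each filtration cutoff.

    For every cutoff r, atoms closer than or equal to r are joined by an edge.
    Per cutoff, four values are recorded (an illustrative choice): the number
    of zero eigenvalues, and the min, max, and mean of the nonzero eigenvalues.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    feats = []
    for r in cutoffs:
        adj = (dists <= r) & ~np.eye(n, dtype=bool)
        eig = np.linalg.eigvalsh(path_laplacian(adj, k))
        nonzero = eig[eig > tol]
        feats.extend([
            float(np.sum(eig <= tol)),
            float(nonzero.min()) if nonzero.size else 0.0,
            float(nonzero.max()) if nonzero.size else 0.0,
            float(nonzero.mean()) if nonzero.size else 0.0,
        ])
    return np.array(feats)


# Toy usage: three collinear points, three filtration cutoffs (in the same units).
features = pps_features([[0, 0, 0], [1, 0, 0], [2, 0, 0]], cutoffs=[0.5, 1.0, 2.0])
```

Feature vectors of this kind, computed per protein–ligand complex, could then be passed to a gradient boosting regressor trained on measured binding affinities. The simplicial complex and hypergraph cases mentioned in the abstract are not covered by this sketch.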

Publication
Journal of Chemical Information and Modeling 63(3), 1066-1075
Xiang Liu
PhD Candidate

My research interests include topological and geometric data analysis and mathematical AI.