This title appears in the Scientific Report: 2018

TopScore: Using Deep Neural Networks and Large Diverse Data Sets for Accurate Protein Model Quality Assessment
Mulnaes, Daniel
Gohlke, Holger (Corresponding author)
John von Neumann - Institut für Computing; NIC
Strukturbiochemie ; ICS-6
Jülich Supercomputing Center; JSC
Journal of Chemical Theory and Computation, 14 (2018) 11, pp. 6117–6126
Washington, DC 2018
Journal Article
Forschergruppe Gohlke
Functional Macromolecules and Complexes
Computational Science and Mathematical Methods
The value of protein models obtained with automated protein structure prediction depends primarily on their accuracy. Protein model quality assessment is thus critical for selecting, from an ensemble of predictions, the model that can best answer biologically relevant questions. However, despite many advances in the field, different methods capture different types of errors, raising the question of which method to use. We introduce TopScore, a meta Model Quality Assessment Program (meta-MQAP) that uses deep neural networks to combine scores from 15 different primary predictors into accurate residue-wise and whole-protein error estimates. The predictions on six large independent data sets are highly correlated with superposition-independent errors in the model, achieving a Pearson's $R^2_{\mathrm{all}}$ of 0.93 and 0.78 for whole-protein and residue-wise error predictions, respectively. This is a significant improvement over any of the investigated primary MQAPs, demonstrating that much can be gained by optimally combining different methods and using different and very large data sets.
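The correlation metric quoted in the abstract can be illustrated with a minimal sketch. It assumes that the subscript "all" means the squared Pearson correlation is computed over predictions pooled from all data sets, which the abstract does not state explicitly. The function name `pearson_r2`, the data set labels, and all numbers below are hypothetical and are not taken from the paper.

```python
import math


def pearson_r2(predicted, observed):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Covariance and variances (unnormalized; the 1/n factors cancel in the ratio).
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov * cov / (var_p * var_o)


# Hypothetical (predicted error, true error) pairs for two made-up data sets.
data_sets = {
    "set_a": ([0.10, 0.35, 0.60], [0.12, 0.30, 0.65]),
    "set_b": ([0.20, 0.80], [0.25, 0.70]),
}

# Assumption: pool every model from every data set before correlating.
pooled_pred = [p for pred, _ in data_sets.values() for p in pred]
pooled_true = [t for _, true in data_sets.values() for t in true]

r2_all = pearson_r2(pooled_pred, pooled_true)
print(f"pooled R^2 = {r2_all:.2f}")
```

The same function applies to whole-protein and residue-wise predictions alike; only the granularity of the paired values changes.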