ESpritz: Accurate and fast prediction of protein disorder

Ian Walsh, Alberto J.M. Martin, Tomàs Di Domenico, Silvio C.E. Tosatto*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

416 Scopus citations

Abstract

Motivation: Intrinsically disordered regions are key for the function of numerous proteins, and the scant available experimental annotations suggest the existence of different disorder flavors. While efficient predictions are required to annotate entire genomes, most existing methods require sequence profiles for disorder prediction, making them cumbersome for high-throughput applications.

Results: In this work, we present an ensemble of protein disorder predictors called ESpritz. These are based on bidirectional recursive neural networks and trained on three different flavors of disorder, including a novel NMR flexibility predictor. ESpritz can produce fast and accurate sequence-only predictions, annotating entire genomes in the order of hours on a single processor core. Alternatively, a slower but slightly more accurate ESpritz variant using sequence profiles can be used for applications requiring maximum performance. Two levels of prediction confidence allow either to maximize reasonable disorder detection or to limit expected false positives to 5%. ESpritz performs consistently well on the recent CASP9 data, reaching an S_w measure of 54.82 and area under the receiver operator curve of 0.856. The fast predictor is four orders of magnitude faster and remains better than most publicly available CASP9 methods, making it ideal for genomic scale predictions.

Conclusions: ESpritz predicts three flavors of disorder at two distinct false positive rates, either with a fast or slower and slightly more accurate approach. Given its state-of-the-art performance, it can be especially useful for high-throughput applications.

Original language: English
Article number: btr682
Pages (from-to): 503-509
Number of pages: 7
Journal: Bioinformatics
Volume: 28
Issue number: 4
DOIs
State: Published - 2012
Externally published: Yes

Bibliographical note

Funding Information:
Funding: University of Padova grants (CPDA098382, CPDR097328); FIRB Futuro in Ricerca grant (RBFR08ZSXY) to S.C.E.T.

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics
