Finding Motifs in Protein Secondary Structure for Use in Function Prediction
pmid: 16706721
This paper presents a novel algorithm for the discovery of biological sequence motifs. Our motivation is the prediction of gene function. We seek to discover motifs and combinations of motifs in the secondary structure of proteins for application to the understanding and prediction of functional classes. The motifs found by our algorithm allow both flexible-length structural elements and flexible-length gaps and can be of arbitrary length. The algorithm is based on neither top-down nor bottom-up search, but rather is dichotomic. It is also "anytime," so that fixed termination of the search is not necessary. We have applied our algorithm to yeast sequence data to discover rules predicting function classes from secondary structure. These resultant rules are informative, consistent with known biology, and a contribution to scientific knowledge. Surprisingly, the rules also demonstrate that secondary structure prediction algorithms are effective for membrane proteins and suggest that the association between secondary structure and function is stronger in membrane proteins than in globular ones. We demonstrate that our algorithm can successfully predict gene function directly from predicted secondary structure; e.g., we correctly predict the gene YGL124c to be involved in the functional class "cytoplasmic and nuclear degradation." Datasets and detailed results (generated motifs, rules, evaluation on test dataset, and predictions on unknown dataset) are available at www.aber.ac.uk/compsci/Research/bio/dss/yeast.ss.mips/ and www.genepredictions.org.
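To make the idea of a motif with flexible-length structural elements and flexible-length gaps more concrete, the following is a minimal sketch of how such a motif might be matched against a secondary structure string. It is not the dichotomic search algorithm described in the abstract, which discovers motifs; it only illustrates what matching an already known motif could look like. The three-letter alphabet (`a` for helix, `b` for strand, `c` for coil), the tuple representation, and the function names `motif_to_regex` and `find_motif` are all assumptions made for illustration.

```python
import re

# Hypothetical motif representation (an assumption, not taken from the paper):
# a list of (symbol, min_len, max_len) tuples.
# Assumed alphabet: 'a' = helix, 'b' = strand, 'c' = coil.
# The symbol '.' denotes a gap that matches any secondary structure state.


def motif_to_regex(motif):
    """Compile a flexible motif into a regular expression."""
    parts = []
    for symbol, min_len, max_len in motif:
        token = "." if symbol == "." else re.escape(symbol)
        # Each element may repeat between min_len and max_len times.
        parts.append(f"{token}{{{min_len},{max_len}}}")
    return re.compile("".join(parts))


def find_motif(motif, structure):
    """Return (start, end) spans of all non-overlapping matches."""
    return [m.span() for m in motif_to_regex(motif).finditer(structure)]


# A helix of 5 to 12 residues, a gap of 0 to 6 residues,
# then a strand of 3 to 8 residues.
example_motif = [("a", 5, 12), (".", 0, 6), ("b", 3, 8)]

# Toy predicted secondary structure, one character per residue.
example_structure = "cc" + "a" * 7 + "ccc" + "b" * 4 + "ccc"

print(find_motif(example_motif, example_structure))
```

A regular expression is used here only because bounded repetition maps directly onto flexible element and gap lengths; a motif discovery method would additionally have to search over the space of such patterns.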
- University of Salford, United Kingdom
- University of Wales, United Kingdom
- University of Rennes 1, France
- University of Rennes, France
Flexible motifs, Saccharomyces cerevisiae Proteins, Amino Acid Motifs, Membrane Proteins, Functional genomics, Dichotomic search algorithm, Structure-Activity Relationship, Predictive Value of Tests, Sequence Analysis, Protein, Protein secondary structure, Algorithms
