Incorporating Deep Learning With Word Embedding to Identify Plant Ubiquitylation Sites
Protein ubiquitylation is an important posttranslational modification (PTM) that is involved in diverse biological processes and plays an essential role in regulating physiological mechanisms and diseases. The Protein Lysine Modifications Database (PLMD) has accumulated abundant ubiquitylated proteins, with their substrate sites, from more than 20 species. Numerous studies have consequently developed a variety of ubiquitylation site prediction tools across all species, relying mainly on predefined sequence features and machine learning algorithms. However, the differences in ubiquitylation patterns between these species remain unclear. In this work, sequence-based characterization of ubiquitylated substrate sites revealed remarkable differences among plants, animals, and fungi. An improved word-embedding scheme based on a transfer learning strategy was then combined with a multilayer convolutional neural network (CNN) to identify protein ubiquitylation sites. For the prediction of plant ubiquitylation sites, the proposed deep learning scheme outperformed the machine learning-based methods, with an accuracy of 75.6%, precision of 73.3%, recall of 76.7%, F-score of 0.7493, and AUC of 0.82 on the independent testing set. Although the specificity of ubiquitylated substrate sites is complicated, this work demonstrates that the word-embedding method can extract informative features and aid the identification of ubiquitylated sites. To accelerate the investigation of protein ubiquitylation, the data sets and source code used in this study are freely available at https://github.com/wang-hong-fei/DL-plant-ubsites-prediction.
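As a quick sanity check on the reported metrics, the F-score can be recomputed from the precision and recall given in the abstract. The sketch below assumes that "F-score" means the standard F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly. The function name `f_score` is illustrative and is not taken from the authors' repository.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1), assumed to be the reported F-score."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values from the abstract (independent testing set).
reported_precision = 0.733
reported_recall = 0.767

f1 = f_score(reported_precision, reported_recall)
print(f"F-score from rounded precision/recall: {f1:.4f}")
```

The recomputed value differs from the reported 0.7493 only in the fourth decimal place, which is consistent with the precision and recall having been rounded to three significant figures before publication.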
- University of Science and Technology of China China (People's Republic of)
- Chinese University of Hong Kong China (People's Republic of)
- University of Missouri United States
Cell and Developmental Biology, QH301-705.5, deep learning, convolutional neural network, plant, transfer learning, Biology (General), ubiquitylation, word embedding
