Abstract:
Proteins play crucial roles in diverse biological functions. Accurately annotating their functions is essential for understanding cellular mechanisms and developing therapies for complex diseases. Computational methods have been proposed as alternatives to labor-intensive and expensive experimental approaches. Existing computational methods have demonstrated that protein evolutionary information and Protein-Protein Interactions (PPIs) are essential for protein function prediction. However, traditional computational approaches for generating evolutionary information are time-consuming, and previous studies ignore proteins that lack interactions. To address these limitations, we propose a novel deep learning framework, named DeepFMB, which incorporates multi-type biological knowledge. DeepFMB leverages a pre-trained protein language model to extract evolutionary information. Moreover, DeepFMB generates PPI-related features and orthology-related features using graph neural networks on the constructed PPI and orthology networks. These multi-type features are then fused adaptively for protein function prediction. DeepFMB outperforms eight state-of-the-art methods in terms of F-max and AUPR. Additionally, when combined with sequence similarity-based inference, our model predicts protein functions more accurately. Experimental results also validate the superior performance of our method in predicting low-frequency GO terms. Ablation studies demonstrate that the multi-type biological knowledge we use is highly relevant to protein functions. The source code can be downloaded from https://github.com/CSUBioGroup/DeepFMB.
Published in: IEEE Transactions on Computational Biology and Bioinformatics (Early Access)
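
The abstract reports results in terms of F-max without defining it. As a rough illustration, the sketch below computes the protein-centric F-max as it is commonly defined in CAFA-style evaluations: for each score threshold, precision is averaged over proteins with at least one predicted term, recall is averaged over all annotated proteins, and the maximum F-score over thresholds is returned. The function name `fmax`, the threshold grid, and the toy inputs are assumptions made for this example. This is not the DeepFMB authors' evaluation code, which may differ in details such as GO term propagation.

```python
import numpy as np


def fmax(y_true, y_score, thresholds=None):
    """Protein-centric F-max (hypothetical helper, CAFA-style definition).

    y_true:  (n_proteins, n_terms) binary annotation matrix.
    y_score: (n_proteins, n_terms) predicted scores in [0, 1].
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    if thresholds is None:
        # Assumed grid: 0.01, 0.02, ..., 1.00
        thresholds = np.linspace(0.01, 1.0, 100)

    # Evaluate only proteins that have at least one true annotation.
    has_label = y_true.sum(axis=1) > 0
    y_true, y_score = y_true[has_label], y_score[has_label]
    n_true = y_true.sum(axis=1)

    best = 0.0
    for t in thresholds:
        y_pred = y_score >= t
        n_pred = y_pred.sum(axis=1)
        covered = n_pred > 0  # proteins with at least one prediction at t
        if not covered.any():
            continue
        tp = (y_pred & y_true).sum(axis=1)
        # Precision is averaged over covered proteins only.
        precision = (tp[covered] / n_pred[covered]).mean()
        # Recall is averaged over all annotated proteins.
        recall = (tp / n_true).mean()
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


# Toy example: two proteins, three GO terms.
labels = [[1, 0, 1], [0, 1, 0]]
scores = [[0.9, 0.1, 0.8], [0.2, 0.7, 0.3]]
print(fmax(labels, scores))
```

In the toy example every true term scores above every false term, so some threshold separates them perfectly and the F-max reaches its upper bound of 1.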