Identifying LncRNA-Encoded Short Peptides Using Optimized Hybrid Features and Ensemble Learning


Abstract:

Long non-coding RNA (lncRNA) contains short open reading frames (sORFs), and sORF-encoded short peptides (SEPs) have become a focus of research because of their crucial role in life activities. Identifying SEPs is vital to further understanding their regulatory functions. Bioinformatics methods can identify SEPs quickly and provide credible candidate sequences for verification by biological experiments. However, methods for identifying SEPs directly are lacking. In this study, a machine learning method to identify SEPs of plant lncRNA (ISPL) is proposed. Hybrid features, including sequence features and physicochemical features, are extracted manually or adaptively to construct different modal features. To keep feature selection stable, the non-linear correction applied in Max-Relevance-Max-Distance (nocRD) feature selection method is proposed, which integrates multiple feature ranking results and uses the iterative random forest to reduce the dimensionality of the different modal features. Classification models with different modal features are constructed, and their outputs are combined for ensemble classification. The experimental results show that the accuracy of ISPL is 89.86% on the independent test set, which will have important implications for further studies of functional genomics.
Published in: IEEE/ACM Transactions on Computational Biology and Bioinformatics (Volume: 19, Issue: 5, 01 Sept.-Oct. 2022)
Page(s): 2873 - 2881
Date of Publication: 12 August 2021

PubMed ID: 34383651

1 Introduction

Long non-coding RNA (lncRNA), a type of non-coding RNA (ncRNA), plays a pivotal role in life activities such as growth, development, and resistance in humans and animals [1], [2], [3] and in plants [4], [5], [6]. Surprisingly, several studies have documented that some lncRNAs contain short open reading frames (sORFs), which are no longer than 300 nt [7], [8], and that these sORFs can be translated into short peptides no longer than 100 aa.
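
To make the length limits above concrete, the following is a minimal sketch of scanning a transcript for candidate sORFs and translating them. It is not the ISPL pipeline described in the paper. The function name `find_sorfs`, the choice of ATG as the only start codon, the forward-strand-only scan, and counting the stop codon toward the 300 nt limit are all illustrative assumptions.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def find_sorfs(seq, max_nt=300):
    """Return (start, end, peptide) for each ORF of at most max_nt nucleotides.

    Assumptions (not taken from the paper): only ATG starts an ORF, only the
    three forward-strand frames are scanned, and the ORF length includes the
    stop codon. Coordinates are 0-based, with end exclusive.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    results = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first in-frame start codon opens the ORF
            elif start is not None and CODON_TABLE.get(codon) == "*":
                end = pos + 3
                if end - start <= max_nt:
                    # Translate up to, but not including, the stop codon;
                    # ambiguous codons (e.g. containing N) become "X".
                    peptide = "".join(
                        CODON_TABLE.get(seq[i:i + 3], "X")
                        for i in range(start, pos, 3)
                    )
                    results.append((start, end, peptide))
                start = None  # look for the next ORF in this frame
    return results
```

For example, `find_sorfs("CCATGGCTTAAGG")` returns `[(2, 11, 'MA')]`: a 9 nt ORF in the third reading frame that encodes a two-residue peptide. With the default `max_nt=300`, every reported peptide has at most 99 residues, which is consistent with the 100 aa bound stated above.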
