Genome assembly is one of the major challenges in metagenomic analysis. Assembly usually assumes that all sequencing reads originate from the same genome, but the mobile elements active in microbial genomes call this assumption into question. This work formulates the issue as a binary classification problem: discriminating mobile elements from chromosomal sequences. Accurate discrimination of this kind could greatly facilitate metagenomic analysis. After the sequencing reads in the metagenome were quantified, binary classification algorithms were combined with feature selection algorithms, including ReliefF, the chi-squared test, and Fisher’s t-test. All selected feature subsets were evaluated with classification algorithms including logistic regression, extreme learning machine, support vector machine, and random forest. Experimental results demonstrate that the model combining the ReliefF algorithm with the random forest algorithm achieves over 95% accuracy with only 100 features, outperforming the model that uses all 690 features.
Citing format: PENG Chao, WANG Pu, GE Ruiquan, et al. Accurate Detection of Mobile Sequence in Metagenome[J]. Journal of Integration Technology, 2016, 5(2): 85-96.
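
The feature selection step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It shows the basic Relief weighting scheme (one nearest hit and one nearest miss per sample), which is a simpler relative of ReliefF, and it assumes binary labels and purely numeric features. The function names `relief_scores` and `top_k_features` and the toy data are hypothetical. The downstream classifier (for example, a random forest trained on the selected columns) is not shown.

```python
def relief_scores(X, y):
    """Basic Relief feature weights for binary labels.

    Simplified illustration: every sample is visited once, and a single
    nearest hit and nearest miss are used (ReliefF averages over k
    neighbours and handles multi-class and missing data).
    """
    n, d = len(X), len(X[0])
    # Per-feature range, used to scale differences into [0, 1].
    lo = [min(row[j] for row in X) for j in range(d)]
    hi = [max(row[j] for row in X) for j in range(d)]
    span = [(hi[j] - lo[j]) or 1.0 for j in range(d)]

    def diff(a, b, j):
        return abs(a[j] - b[j]) / span[j]

    def dist(a, b):
        # Manhattan distance over range-scaled features.
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # cannot score a sample without both neighbour types
        near_hit = min(hits, key=lambda k: dist(X[i], X[k]))
        near_miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            # Reward features that separate classes, penalise features
            # that vary within a class.
            weights[j] += (diff(X[i], X[near_miss], j)
                           - diff(X[i], X[near_hit], j)) / n
    return weights


def top_k_features(scores, k):
    """Indices of the k highest-scoring features, best first."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


# Toy data (made up): feature 0 separates the classes, feature 1 is noise.
X = [[0.10, 0.5], [0.20, 0.1], [0.15, 0.9],
     [0.90, 0.4], [0.80, 0.2], [0.85, 0.8]]
y = [0, 0, 0, 1, 1, 1]

scores = relief_scores(X, y)
selected = top_k_features(scores, 1)
print(scores, selected)
```

In a setting like the one the abstract describes, `k` would be 100 out of 690 features, and the selected column indices would be used to subset the feature matrix before a classifier is trained.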