XGBoost-Based Gene Network Inference Method for Steady-State Data
Author: CHE Dandan, GUO Shun, JIANG Qingshan

Fund Project:

China Postdoctoral Science Foundation (2018M633187); Shenzhen Development and Reform Commission, National and Local Joint Engineering Center for Health Big Data Intelligent Analysis Technology

Abstract:

Inferring gene regulatory networks (GRNs) from steady-state gene expression data remains a challenge in systems biology: many direct and indirect regulatory relationships are hard to identify, and the accuracy and reliability of traditional methods still need improvement. To address this, we propose a method based on a boosting ensemble model (XGBoost) that applies randomization and regularization to counter model overfitting. Because the weights obtained from different subproblems are inconsistent with one another, the initial weights are further processed with normalization and statistical methods. The method is evaluated on the benchmark datasets of the DREAM5 challenge, where it performs better than other state-of-the-art methods. On the in silico simulated dataset, both evaluation metrics, the area under the precision-recall curve (AUPR) and the area under the receiver operating characteristic curve (AUROC), are significantly better than those of existing methods. On the real experimental data of two organisms, E. coli and S. cerevisiae, accuracy is also higher; in particular, the AUROC exceeds that of the best existing methods.
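The abstract does not say how the per-subproblem weights are normalized or how AUROC and AUPR are computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes each subproblem is one target gene with a raw importance weight per candidate regulator, rescales each target's weights to sum to 1 (a hypothetical choice of normalization), and scores the resulting ranked edge list against a gold-standard network. AUPR is approximated by average precision, and tied scores are not handled specially. The names `normalize_per_target` and `auroc_aupr` and the toy genes `g1` to `g4` are illustrative only.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]  # (regulator, target)


def normalize_per_target(raw: Dict[str, Dict[str, float]]) -> Dict[Edge, float]:
    """Rescale each target's regulator weights to sum to 1 (assumed scheme)."""
    scores: Dict[Edge, float] = {}
    for target, weights in raw.items():
        total = sum(weights.values())
        for regulator, w in weights.items():
            scores[(regulator, target)] = w / total if total > 0 else 0.0
    return scores


def auroc_aupr(scores: Dict[Edge, float], gold: Set[Edge]) -> Tuple[float, float]:
    """AUROC and average-precision AUPR of a ranked edge list (ties not handled)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_pos = sum(1 for edge in ranked if edge in gold)
    n_neg = len(ranked) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one true and one false edge")
    tp = fp = 0
    auroc = aupr = 0.0
    for edge in ranked:
        if edge in gold:
            tp += 1
            # Precision at this true edge, averaged over all true edges.
            aupr += (tp / (tp + fp)) / n_pos
        else:
            fp += 1
            # ROC step: width 1/n_neg, height equal to the current recall.
            auroc += (tp / n_pos) / n_neg
    return auroc, aupr


# Toy raw weights from two subproblems (targets g3 and g4) on different scales.
raw_weights = {
    "g3": {"g1": 8.0, "g2": 2.0},
    "g4": {"g1": 1.0, "g2": 3.0},
}
gold_standard = {("g1", "g3"), ("g1", "g4")}

edge_scores = normalize_per_target(raw_weights)
auroc, aupr = auroc_aupr(edge_scores, gold_standard)
print(f"AUROC={auroc:.3f}  AUPR={aupr:.3f}")
```

In this toy case the two targets produce raw weights on different scales, so the edge ranking changes once each target's weights are rescaled. That is the kind of inconsistency the normalization step is meant to address, although the paper's actual procedure may differ.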

Cite this article

Citing format
CHE Dandan, GUO Shun, JIANG Qingshan. XGBoost-Based Gene Network Inference Method for Steady-State Data[J]. Journal of Integration Technology, 2020, 9(2): 50-59

History
  • Published online: 2020-03-25