Research on Machine Learning-Based Performance Optimization of Dynamic Partitioned Parallel File System

Fund Project:

Key Research and Development Projects in Guangdong Province (2019B010137002, 2020B010164003); General Project of the National Natural Science Foundation of China (61672513)

    Abstract:

    With the rapid development of big data and cloud computing, application systems have become increasingly centralized and larger in scale, which makes the performance problems of storage systems increasingly prominent. Parallel file systems have been widely deployed to meet these performance requirements. However, most existing optimization methods for parallel file systems consider only the application system or only the parallel file system itself, and seldom consider the coordination between the two. Because the access pattern of an application system on a parallel file system has a significant impact on storage system performance, this study proposes a parallel file system optimization method based on dynamic partitioning. First, machine learning techniques are used to analyze and mine the relationships between the factors that influence performance and the performance metrics, and to generate an optimization model. Second, the optimization model is used to assist parameter tuning of the parallel file system. Finally, a prototype is implemented on the Ceph storage system, and a three-tier application system is designed for performance testing. Experimental results show that the proposed method achieves an optimization prediction accuracy of 85%, and that with the assistance of the proposed model the throughput of the parallel file system is improved by about 3.6 times.

Cite this article

Citation format
WU Jiashu, WANG Hongbo, DAI Hao, et al. Research on Machine Learning-Based Performance Optimization of Dynamic Partitioned Parallel File System[J]. Journal of Integration Technology, 2020, 9(6): 71-83

History
  • Published online: 2020-11-24