An Approach to Build a Big Data Benchmark Suite
    Abstract:

    Benchmarks are key tools for evaluating the performance of computing systems. The arrival of the big data era, however, makes benchmark development considerably harder: big data is relatively new, researchers who want to understand how big data systems (both hardware and software) behave often lack data, and neither academia nor industry yet has a widely accepted big data benchmark suite. In this paper, we first devise an approach for developing big data benchmarks. We then use a real urban transportation big data system to build SIAT-Bench, a Hadoop-based benchmark suite that contains five representative workloads from the Shenzhen urban transportation system. To this end, we characterize program behavior and quantify the impact of input data sets by observing metrics at multiple levels, including the microarchitecture, OS, and application layers. Statistical techniques such as Principal Component Analysis (PCA) and clustering are then used to analyze the similarity between different workload-input pairs. Finally, we select the representative workloads and their associated input sets for SIAT-Bench according to the clustering results. Experimental results show that SIAT-Bench preserves the diversity of program behavior while eliminating redundancy from the benchmark set.
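
    The selection step summarized in the abstract (characterize each workload-input pair by a metric vector, group similar pairs, keep one representative per group) can be illustrated with a minimal sketch. This is not the authors' implementation. The pair names, the three metrics, and all values below are hypothetical; average-linkage agglomerative clustering is assumed because the abstract does not name the clustering algorithm; and the PCA step is omitted so that the example needs only the Python standard library.

```python
from math import sqrt

def zscore(rows):
    """Standardize each metric (column) to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has std 0; fall back to 1.0 to avoid dividing by zero.
    stds = [sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

def dist(a, b):
    """Euclidean distance between two metric vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster(points, k):
    """Average-linkage agglomerative clustering, merging until k clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(dist(points[a], points[b])
                            for a in clusters[i] for b in clusters[j])
                avg = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or avg < best[0]:
                    best = (avg, i, j)
        _, i, j = best
        merged = clusters.pop(j)      # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters

def representatives(names, rows, k):
    """For each cluster, keep the pair closest to the cluster centroid."""
    pts = zscore(rows)
    chosen = []
    for members in cluster(pts, k):
        centroid = [sum(pts[m][d] for m in members) / len(members)
                    for d in range(len(pts[0]))]
        rep = min(members, key=lambda m: dist(pts[m], centroid))
        chosen.append(names[rep])
    return sorted(chosen)

# Hypothetical workload-input pairs and metrics:
# [instructions per cycle, cache miss rate, disk I/O in MB/s]
names = ["A/small", "A/large", "B/small", "B/large", "C/large", "C/small"]
rows = [
    [1.20, 0.05, 10.0],
    [1.18, 0.06, 12.0],
    [0.60, 0.20, 50.0],
    [0.62, 0.21, 52.0],
    [0.64, 0.22, 54.0],
    [0.90, 0.50, 200.0],
]

reps = representatives(names, rows, k=3)
print(reps)
```

    In this toy data the two inputs of workload A behave almost identically, so only one of them needs to be kept, whereas the two inputs of workload C land in different clusters, which is the kind of input-set effect the abstract says must be quantified. With real measurements one would typically have many more metrics per pair, which is where a dimensionality reduction such as PCA becomes useful before clustering.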

Citing format
XIONG Wen, YU Zhibin, XU Chengzhong. An Approach to Build a Big Data Benchmark Suite[J]. Journal of Integration Technology, 2014, 3(4): 1-9

History
  • Published online: 2014-07-22