A Performance Analysis for Hadoop under Heterogeneous Cloud Computing Environments



    Cloud computing is growing rapidly and brings virtualization technology to traditional datacenters in order to deliver computing resources on demand, as in Amazon’s Elastic Compute Cloud (EC2). Hadoop is an open-source implementation of Google’s MapReduce, a distributed parallel computing model for large-scale datasets, and it is attracting increasing attention in both academia and industry. How to combine cloud computing infrastructures with Hadoop efficiently remains an open question, i.e., how to make full use of the former’s elastic resources and the latter’s scalability, fault tolerance, and ability to run on commodity hardware. In this paper, we carry out a series of experiments to evaluate and analyze the performance of Hadoop on our heterogeneous cloud computing testbed. We demonstrate that the performance of Hadoop degrades in scenarios with high I/O overhead, compared with the traditional scenario in which each node in the cluster is a physical machine. Our work can serve as a basis for improving the performance of Hadoop in cloud computing environments.


LIU Dan-dan, CHEN Jun, LIANG Feng, FAN Xiao-peng. A Performance Analysis for Hadoop under Heterogeneous Cloud Computing Environments[J]. Journal of Integration Technology, 2012, 1(4): 46-51.

Online: April 08, 2013