Golaxy Big Data Engine and Its Applications

Fund Project: National High Technology Research and Development Program of China (863 Program) (2013AA01A213)

    Abstract:

    Big data computing faces three challenges that traditional IT technologies cannot handle: very large data volumes, high-throughput service requests, and heterogeneous data types (volume, velocity, and variety). Thanks to production use and open-source code contributions from major Internet companies in China and abroad, Apache Hadoop has become a mature technology and the de facto standard for petabyte-scale data processing, and a software ecosystem has grown up around the different types of big data processing needs. This paper introduces three research results for big data computing systems, covering storage, indexing, and hardware-accelerated compression and decompression: RCFile, CCIndex, and SwiftFS, respectively. Together they address the storage space and query performance problems of such systems. These results have been developed into key technologies and integrated into the Golaxy big data engine software stack, where they directly support multiple production applications at Taobao and Tencent.

Cite this article

ZHA Li, CHENG Xueqi. Golaxy Big Data Engine and Its Applications[J]. Journal of Integration Technology, 2014, 3(4): 18-30

History
  • Published online: 2014-07-22