Optimization Methods of Pipeline Technique Based on Multi-field Programmable Gate Array Heterogeneous Platform
Author:
  • HU Yanbu

    Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Xidian University, Xi’an 710071, China; CAS Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Shenzhen 518055, China
  • SHAO Cuiping

    Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; CAS Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Shenzhen 518055, China; Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen 518055, China
  • LI Huiyun

    Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; CAS Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Shenzhen 518055, China; Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen 518055, China
Fund Project:

Shenzhen Engineering Laboratory for Autonomous Driving Technology (Y7D004); Shenzhen Key Laboratory of Electric Vehicle Powertrain Platform and Safety Technology

    Abstract:

    This paper presents a pipeline optimization method for a heterogeneous multi-FPGA (field-programmable gate array) platform. First, the task is partitioned using a bisection-based approach so that the workload is distributed as evenly as possible across the FPGA units, which improves the balance of the board-level pipeline. Second, the pipeline structure is optimized for inter-board transmission delay: when that delay is large, treating it as a pipeline stage of its own improves the throughput of the platform. Finally, the modules inside each computing unit are parallelized, and data rearrangement, loop unrolling, and loop pipelining are used to make full use of the FPGA computing resources, improving throughput and energy efficiency. In a validation using the AlexNet network, the optimized pipeline structure achieved 215.6% higher throughput and 105.5% higher energy efficiency than the unoptimized structure, and reduced the running time of a single task by 36.6%.
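
The first two steps in the abstract can be illustrated with a small numerical model. The sketch below is not the authors' implementation, which the abstract does not describe in detail. It assumes that layers must stay in order, that each layer has a known integer cost, and that "bisection" means a binary search on the largest allowed per-board load. The function names (`stages_needed`, `balanced_partition`, `bottleneck_time`) and all cost and delay values are hypothetical.

```python
from typing import List


def stages_needed(costs: List[int], bound: int) -> int:
    """Count the contiguous stages needed if no stage may exceed `bound`."""
    stages, load = 1, 0
    for cost in costs:
        if load + cost > bound:
            stages += 1      # start a new stage on the next board
            load = cost
        else:
            load += cost
    return stages


def balanced_partition(costs: List[int], num_boards: int) -> List[List[int]]:
    """Split `costs` into at most `num_boards` contiguous groups,
    minimizing the largest group sum by bisection on that sum."""
    lo, hi = max(costs), sum(costs)   # the optimal bound lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if stages_needed(costs, mid) <= num_boards:
            hi = mid                  # feasible: try a tighter bound
        else:
            lo = mid + 1              # infeasible: relax the bound
    groups, current, load = [], [], 0
    for cost in costs:                # rebuild the groups for the final bound
        if current and load + cost > lo:
            groups.append(current)
            current, load = [], 0
        current.append(cost)
        load += cost
    groups.append(current)
    return groups


def bottleneck_time(stage_times: List[int], link_delay: int,
                    link_as_stage: bool) -> int:
    """Time of the slowest pipeline stage (simplified model, an assumption).

    If the link is not a stage, each board except the last must finish
    sending its result before it accepts new input, so the transfer time
    adds to its compute time. If the link is a stage, transfer overlaps
    with compute and only competes for the role of slowest stage.
    """
    if link_as_stage:
        stages = list(stage_times) + [link_delay] * (len(stage_times) - 1)
    else:
        stages = [t + link_delay for t in stage_times[:-1]] + [stage_times[-1]]
    return max(stages)


layer_costs = [4, 3, 2, 6, 5]          # made-up per-layer costs
groups = balanced_partition(layer_costs, num_boards=2)
stage_times = [sum(g) for g in groups]
print(groups)                                                # [[4, 3, 2], [6, 5]]
print(bottleneck_time(stage_times, 8, link_as_stage=False))  # 17
print(bottleneck_time(stage_times, 8, link_as_stage=True))   # 11
```

In this toy model, steady-state throughput is the reciprocal of the bottleneck time, so giving the link its own stage raises throughput from 1/17 to 1/11 tasks per time unit. With a link delay of 1 the bottleneck is 11 either way, which is consistent with the abstract's condition that the gain appears when the inter-board delay is large.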

Cite this article

Citing format
HU Yanbu, SHAO Cuiping, LI Huiyun. Optimization Methods of Pipeline Technique Based on Multi-field Programmable Gate Array Heterogeneous Platform[J]. Journal of Integration Technology (集成技术), 2020, 9(5): 81-92

History
  • Received: 2020-05-09
  • Last revised: 2020-07-05
  • Published online: 2020-09-23