Abstract: This paper presents an optimized pipeline processing method for multi-FPGA (field-programmable gate array) heterogeneous platforms. First, the task is partitioned according to a dichotomy scheme so that the workload is distributed across the FPGA units as evenly as possible, which improves the balance of the board-level pipeline. Second, the pipeline structure is optimized to account for inter-board transmission delay: when the inter-board delay is large, it is treated as a separate pipeline stage to improve the throughput of the platform. Finally, the computing units are optimized for parallelism, and FPGA resources are fully utilized by means of data-relation rearrangement, loop unrolling, and loop pipelining. As a result, the throughput and energy efficiency of the data processing system are improved. AlexNet was used in experiments to verify the effectiveness of the proposed method. Experimental results show that, compared with the original pipeline structure, the optimized pipeline structure improves throughput by 215.6%, increases energy efficiency by 105.5%, and reduces the running time of a single task by 36.6%.
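The benefit of treating a large inter-board delay as its own pipeline stage can be illustrated with a simple steady-state model. The sketch below is not the paper's implementation. It assumes that pipeline throughput is the reciprocal of the slowest stage time, and that a board which does not overlap transfer with computation is occupied for the sum of both. The function names and the latency values are hypothetical and chosen only for illustration.

```python
from typing import List


def throughput_merged(compute: List[float], transfer: List[float]) -> float:
    """Throughput when each board is also occupied by its outgoing transfer.

    compute[i]  : assumed processing time of board i
    transfer[i] : assumed delay of the link from board i to board i + 1
                  (so len(transfer) == len(compute) - 1)
    """
    stage_times = []
    for i, c in enumerate(compute):
        # The last board has no outgoing inter-board link.
        t = transfer[i] if i < len(transfer) else 0.0
        stage_times.append(c + t)
    # Steady-state throughput is limited by the slowest stage.
    return 1.0 / max(stage_times)


def throughput_split(compute: List[float], transfer: List[float]) -> float:
    """Throughput when every inter-board transfer is its own pipeline stage."""
    # Compute stages and transfer stages now run concurrently, so the
    # bottleneck is the single longest stage of either kind.
    stage_times = list(compute) + list(transfer)
    return 1.0 / max(stage_times)


# Hypothetical latencies (arbitrary time units) for a three-board pipeline.
compute_times = [4.0, 4.0, 4.0]
transfer_times = [3.0, 3.0]

merged = throughput_merged(compute_times, transfer_times)
split = throughput_split(compute_times, transfer_times)
print(f"merged stages: {merged:.4f} tasks per time unit")
print(f"split stages:  {split:.4f} tasks per time unit")
```

Under this model the single-task latency is unchanged by the restructuring, since each task still passes through every compute and transfer step; only the rate at which tasks complete improves. The model also shows why balanced partitioning matters: the slowest stage alone sets the throughput.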