Efficient Data Transfer Method for Image Filtering Implementation on FPGA Using OpenCL
Abstract
Heterogeneous platforms, which commonly consist of a central processing unit (CPU) and a graphics processing unit (GPU), have received considerable attention as a way to achieve both high performance and low power consumption. Furthermore, modern heterogeneous platforms often employ a field-programmable gate array (FPGA) device in addition to a CPU and a GPU. Open Computing Language (OpenCL) has been developed to fully utilize these heterogeneous hardware accelerators. In this paper, an FPGA implementation of image filtering with effective data transfer using OpenCL is proposed. To utilize the configurable pipelined architecture of the target FPGA, an effective local memory allocation scheme is proposed for a convolution kernel, and a loop-unrolling method is applied to increase the local memory allocation efficiency. With the proposed method, the average local memory access latency is improved significantly for various memory access patterns. Also, the proposed filtering kernel shows better performance ...
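The abstract names two ideas: keeping filter operands in local memory, and unrolling the convolution loop. The following is a minimal host-side C sketch of that general pattern, not the paper's OpenCL kernel; the allocation scheme the authors actually use is not described in this excerpt. The function name `filter3x3`, the 3x3 filter size, the `int` pixel type, and the sliding-window layout are all assumptions made for illustration. A small `window` array stands in for on-chip local memory, so each output pixel needs three new reads from the image instead of nine, and the multiply-accumulate is written out fully unrolled.

```c
#include <assert.h>

#define K 3 /* assumed filter size: 3x3 */

/* Hypothetical sketch: filters the interior pixels of a row-major
 * width x height image. Border pixels of dst are left untouched. */
void filter3x3(const int *src, int *dst, int width, int height,
               const int coeff[K][K])
{
    assert(width >= K && height >= K);

    for (int y = 1; y < height - 1; ++y) {
        /* Stand-in for local memory: the 3x3 neighbourhood of the
         * current pixel, reused as the window slides along the row. */
        int window[K][K];

        /* Preload image columns 0 and 1 into window columns 1 and 2,
         * so the first shift below puts them in columns 0 and 1. */
        for (int r = 0; r < K; ++r) {
            window[r][1] = src[(y - 1 + r) * width + 0];
            window[r][2] = src[(y - 1 + r) * width + 1];
        }

        for (int x = 1; x < width - 1; ++x) {
            /* Shift the window left and fetch one new column:
             * 3 reads from the image per pixel instead of 9. */
            for (int r = 0; r < K; ++r) {
                window[r][0] = window[r][1];
                window[r][1] = window[r][2];
                window[r][2] = src[(y - 1 + r) * width + (x + 1)];
            }

            /* Fully unrolled multiply-accumulate over the window. */
            int acc =
                window[0][0] * coeff[0][0] + window[0][1] * coeff[0][1] +
                window[0][2] * coeff[0][2] + window[1][0] * coeff[1][0] +
                window[1][1] * coeff[1][1] + window[1][2] * coeff[1][2] +
                window[2][0] * coeff[2][0] + window[2][1] * coeff[2][1] +
                window[2][2] * coeff[2][2];

            dst[y * width + x] = acc;
        }
    }
}
```

In an OpenCL kernel targeting an FPGA, the same structure would typically be expressed with a local or private buffer and an unroll directive on the inner loops; the exact form depends on the vendor toolchain and is not shown here.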