Effective Implementation of Edge-Preserving Filtering on CPU Microarchitectures
Related papers
Fast on-chip mean filter requiring only integer operations
Visual Communications and Image Processing 2008, 2008
This paper presents a novel formulation of classical mean filtering, which is shown to stem from the theory of continued fractions as well as from the rules of binomial expansion. This alternative formulation requires only a few primitive operations, namely binary shifts and additions (subtractions), in the integer domain. Consequently, smoothing a digital image with the mean filter involves no floating-point computation and can be implemented in simple hardware. In addition, the formulation can yield an approximate solution using fewer operations, which can bring the hardware cost down further. We have tested our method on various images and report some relevant results to demonstrate its elegance, versatility, and effectiveness, especially when an approximate solution is called for.
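The shift-and-add idea can be illustrated with a minimal sketch. This is not the paper's formulation, which the abstract does not spell out; it assumes a 3x3 window over 8-bit pixels and uses the series 1/9 = (7/64)(1 + 1/64 + ...) to replace the division by 9. The function names `div9_shift_add`, `div9_approx`, and `mean3x3_int` are hypothetical.

```python
def div9_shift_add(s):
    """Return s // 9 using only shifts and adds.

    Multiplies by 7282 (= 7*1024 + 7*16 + 2, the rounded-up value of 2**16 / 9)
    and shifts right by 16. Exact for 0 <= s < 32768, which covers the
    largest 3x3 sum of 8-bit pixels (9 * 255 = 2295).
    """
    t = (s << 3) - s                      # 7 * s
    return ((t << 10) + (t << 4) + (s << 1)) >> 16


def div9_approx(s):
    """Cheaper approximation: keep only the first series term, 7*s / 64."""
    return ((s << 3) - s) >> 6


def mean3x3_int(img):
    """3x3 integer mean of a list-of-lists image; border pixels are copied."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            s = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    s += img[y + dy][x + dx]
            out[y][x] = div9_shift_add(s)
    return out
```

The one-term variant trades accuracy for fewer operations: over the 8-bit 3x3 range it underestimates the exact quotient by at most 4 grey levels.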
Design of Application Specific Instruction-Set Processor for Image and Video Filtering
2000
Two architectures for cost-effective, real-time implementation of non-linear image and video filters are presented in the paper. The first architecture is a traditional VHDL-based ASIC (Application Specific Integrated Circuit) design, while the second is an ADL (Architecture Description Language) based ASIP (Application Specific Instruction Set Processor). A system to improve the visual quality of images, based on
Accelerating Local Laplacian Filters on FPGAs
2020 30th International Conference on Field-Programmable Logic and Applications (FPL), 2020
Images processed with various enhancement techniques often suffer from edge degradation and other unwanted artifacts such as halos. These artifacts pose a major problem for photographic applications, where they can degrade the quality of an image. A plethora of edge-aware techniques has been proposed in the field of image processing. However, these require the application of complex optimization or post-processing methods. Local Laplacian Filtering is an edge-aware image processing technique that involves the construction of simple Gaussian and Laplacian pyramids. This technique can be successfully applied for detail smoothing, detail enhancement, tone mapping and inverse tone mapping of an image while keeping it artifact-free. The problem with this approach, though, is that it is computationally expensive. Hence, parallelization schemes using multi-core CPUs and GPUs have been proposed. As is well known, they are not power-efficient, and a well-designed hardware architecture on an FP...
Hardware-accelerated high-quality filtering on PC hardware
2001
We describe a method for exploiting commodity 3D graphics hardware in order to achieve hardware-accelerated high-quality filtering with arbitrary filter kernels. Our approach is based on reordering the evaluation of the filter convolution sum to accommodate the way the hardware works. We exploit multiple rendering passes together with the capability of current graphics hardware to index into several textures at the same time (multi-texturing). The method we present is applicable in one, two, and three dimensions.
MAC Based an Adaptive Edge Detection Filter for Image Processing Applications on FPGA
2018
An adaptive edge detection filter is introduced for image processing applications. The proposed filter works differently from other existing filters because of its properties. The new approach to edge detection uses a fully parallel and fully pipelined MAC (Multiply and Accumulate) concept that can be implemented on an FPGA tool kit. This provides area, cost, and performance efficiencies with respect to other methods. The filter is designed using Xilinx DSP tools and MATLAB, synthesized with ISE 10.1, and implemented on a Virtex-II Pro based 2V0ffll48-7 FPGA device. Partial results, such as blurring of images, are shown.
IJSRD - International Journal for Scientific Research and Development, 2018
Design of a median filter capable of filtering 36 pixels with the efficiency of a conventional filter of size 9. This is achieved by dividing the 6×6 sliding-window matrix into four 3×3 matrices. The main idea is to synchronize all four 3×3 matrices for the median operation so that together they reproduce a conventional 6×6 sliding window. Four different mean values are replaced at a time, making processing quicker than conventional 6×6 median sorting. The filter size is fixed, and the median operation is done through an adaptive median sorting algorithm to minimize processing time. Data-driven clock gating techniques are used in the system to reduce switching transitions.
Hardware software co-design of a fast bilateral filter in FPGA
Bilateral filters are widely used in computer vision and digital imaging applications such as denoising, video abstraction, demosaicing, and optical-flow estimation. Their smoothing and edge-preserving characteristics suit image and video processing applications perfectly, yet their high computational complexity makes real-time hardware implementation a challenging task. This paper provides an efficient Field Programmable Gate Array (FPGA) based implementation of an edge-preserving fast bilateral filter in a hardware-software co-design environment, using a recent algorithm that preserves boundaries, spikes, and canyons in the presence of noise. Further, the four-stage parallel pipelined architecture greatly improves the speed of operation. Moreover, our separable kernel implementation of the filtering hardware executes almost five times faster than traditional convolution filtering while using fewer hardware resources.
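The general benefit of a separable kernel can be sketched as follows. This is only an illustration on a linear kernel, not the paper's bilateral filter or its hardware architecture; the binomial kernel `[1, 2, 1]` and the helper names `conv_rows`, `separable_filter`, and `full_filter` are assumptions made for the example. When a 2D kernel is the outer product of a 1D kernel with itself, a horizontal pass followed by a vertical pass gives the same result as the full 2D sum, with 2n multiplies per pixel instead of n².

```python
def conv_rows(img, k):
    """Apply the 1D kernel k along each row, keeping only the 'valid' region."""
    n = len(k)
    return [[sum(k[i] * row[x + i] for i in range(n))
             for x in range(len(row) - n + 1)]
            for row in img]


def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def separable_filter(img, k):
    """Horizontal pass, then vertical pass (done as a row pass on the transpose)."""
    tmp = conv_rows(img, k)
    return transpose(conv_rows(transpose(tmp), k))


def full_filter(img, k):
    """Direct 2D filtering with the outer-product kernel k[j] * k[i]."""
    n = len(k)
    h, w = len(img), len(img[0])
    return [[sum(k[j] * k[i] * img[y + j][x + i]
                 for j in range(n) for i in range(n))
             for x in range(w - n + 1)]
            for y in range(h - n + 1)]
```

With integer inputs the two routes agree exactly, so `separable_filter(img, [1, 2, 1]) == full_filter(img, [1, 2, 1])` holds for any rectangular image at least 3 pixels on each side.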
Simulation Acceleration of Image Filtering on CMOS Vision Chips Using Many-Core Processors
2019 Forum for Specification and Design Languages (FDL), 2019
This paper describes an efficient numerical solution to speed up transient simulations of analog circuits on a many-core computer. The technique is based on an explicit integration method, parallelised on a multiprocessor architecture. Although the integration step is smaller than the one required by traditional simulation methods based on Newton-Raphson iterations, explicit methods do not require complex calculations such as matrix factorizations, which lead to long CPU simulation times. The proposed technique has been implemented on an NVIDIA GPU and has been demonstrated by simulating Gaussian filtering operations performed by a CMOS vision chip. These types of devices, which are used to perform computation on the edge, include built-in image processing functions, which make them very complex and time-consuming circuits to simulate during their design. The proposed method is faster than Ngspice for different image sizes, and for a 128 x 128 pixel image it achieves a speed-up of two orders of magnitude.
Analysis of video filtering on the cell processor
2008
In this paper, an analysis of bi-dimensional video filtering on the Cell Broadband Engine Processor is presented. To evaluate the processor, a highly adaptive filtering algorithm was chosen: the Deblocking Filter of the H.264 video compression standard. The baseline version is a scalar implementation extracted from the FFMPEG H.264 decoder. The scalar version was vectorized using the SIMD instructions of the Cell Synergistic Processing Element (SPE) and with AltiVec instructions for the Power Processor Element. Results show that approximately one third of the processing time of the SPE SIMD version is used for transposition and data packing and unpacking. Despite the required SIMD overhead and the high adaptivity of the kernel, the SIMD version of the kernel is 2.6 times faster than the scalar version.