He-Yuan Lin - Academia.edu
Papers by He-Yuan Lin
Multimedia Image and Video Processing, 2017
2008 IEEE Workshop on Signal Processing Systems, 2008
Based on concurrent exploration of both algorithm and architecture, this paper introduces an efficient verification methodology that targets comprehensive functional verification throughout different levels of design granularity for multi-format media SoCs, with applications in MPEG's Reconfigurable Video Coding. We present a verification technique that minimizes the number of test patterns while at the same time covering multiple profiles based ...
2006 IEEE International Conference on Multimedia and Expo, 2006
This paper presents a new spatio-temporal motion estimation algorithm for video coding. The algorithm is based on optimization theory and combines three strategies: 3D spatio-temporal motion vector prediction, a modified one-at-a-time search scheme, and multiple update paths. The simulation results indicate that our algorithm performs better than other recently proposed ones under the same computational budget and is very close to full search. Its low cost and regular demand for computational resources make the algorithm suitable for VLSI implementation. The algorithm also makes a single-chip solution for high-definition coding feasible.
2009 16th IEEE International Conference on Image Processing (ICIP), 2009
Video codec applications are becoming more and more complex to design. To ease the description of such applications, MPEG created a framework called Reconfigurable Video Coding (RVC). All existing MPEG codecs have a similar structure: they are based on a hybrid decoding structure, and some of their parts can be reused in other designs. In RVC, the dataflow is expressed as a network of components, called Functional Units (FUs), interconnected by FIFOs. An FU, programmed in the CAL language, includes the processing and the internal states. This paper focuses on a parallel dataflow description of the most complex MPEG RVC decoder available, the MPEG4-AVC Constrained Baseline Profile (CBP) decoder.
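The FU-and-FIFO structure described in this abstract can be illustrated with a minimal sketch. It is written in Python rather than CAL, and the names `Fifo`, `FunctionalUnit`, and `run`, the one-token firing rule, and the two toy actions are all hypothetical; none of it is taken from the RVC framework or from the paper.

```python
from collections import deque


class Fifo:
    """Unbounded FIFO channel between two functional units (illustrative)."""

    def __init__(self):
        self._tokens = deque()

    def push(self, token):
        self._tokens.append(token)

    def pop(self):
        return self._tokens.popleft()

    def __len__(self):
        return len(self._tokens)


class FunctionalUnit:
    """Hypothetical FU: consumes one input token and produces one output token."""

    def __init__(self, name, action, inp, out):
        self.name = name
        self.action = action
        self.inp = inp
        self.out = out
        self.fired = 0  # internal state kept by the unit

    def can_fire(self):
        return len(self.inp) > 0

    def fire(self):
        self.out.push(self.action(self.inp.pop()))
        self.fired += 1


def run(units):
    """Fire any ready unit until no unit can fire (order does not matter here)."""
    progress = True
    while progress:
        progress = False
        for fu in units:
            if fu.can_fire():
                fu.fire()
                progress = True


# Two toy units chained through FIFOs: source -> scale -> middle -> offset -> sink.
source, middle, sink = Fifo(), Fifo(), Fifo()
scale = FunctionalUnit("scale", lambda x: 2 * x, source, middle)
offset = FunctionalUnit("offset", lambda x: x + 1, middle, sink)
for token in (1, 2, 3):
    source.push(token)

run([offset, scale])  # deliberately listed out of pipeline order
result = [sink.pop() for _ in range(len(sink))]
print(result)  # [3, 5, 7]
```

Because each unit only reads from and writes to its FIFOs, the units could be scheduled in any order or in parallel without changing the output, which is the property a parallel dataflow description relies on.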
2006 IEEE International Symposium on Circuits and Systems
... Gwo Giun Lee, Drew Wei-Chi Su, He-Yuan Lin and Ming-Jiun Wang Department of Electrical Engineering National Cheng Kung University ... TABLE I. PSNR COMPARISONS OF VARIOUS ALGORITHMS Figure 7. (a)foreman and (c)stefan: results from 2-field motion adaptive ...
Lecture Notes in Computer Science, 2008
In this paper, we present a simple but effective algorithm for extracting a perceptual hue feature set used in color image/video segmentation, with emphasis on color textures. Feature extraction, which has a significant impact on the overall image/video analysis process, plays a critical role in classification-based segmentation. Color textures are accurately characterized by the newly introduced feature set, which is invariant to illumination, translation, and rotation. This invariance results from a statistical scheme that explores the distribution of six rudimentary colors and the achromatic component at local positions. The feature set provides characteristic information and enables segmentation that is more meaningful than that of recently published works.
2011 IEEE International Symposium of Circuits and Systems (ISCAS), 2011
In this paper, an area-efficient reconfigurable inverse transformation architecture for multiple standards is proposed. We present a top-down design methodology with complexity analysis, commonality extraction, and dataflow modeling to systematically design a reconfigurable architecture. By extracting and sharing the commonalities, the adder usage of the proposed reconfigurable inverse transform processing element is reduced by 44% compared with the total number of adders needed to perform the target inverse transform types. The reconfigurable architecture is synthesized using a TSMC 0.18 µm library. The working frequency is 108 MHz, which is derived from the dataflow scheduling. The synthesized area is 32k gates, which indicates that the proposed design is more area-efficient than other documented VLSI designs. In addition, the proposed architecture satisfies the accuracy requirement. The proposed design therefore has lower cost and enough flexibility for multi-standard, real-time processing at 1920×1088 resolution, 64 frames per second, and 4:2:0 color format.
2007 IEEE International Symposium on Circuits and Systems, 2007
In this paper, a novel edge pattern recognition (EPR) deinterlacing algorithm with successive 4-field enhanced motion detection is introduced. The EPR algorithm surpasses the performance of ELA-based and other conventional methods, especially in textural scenes. In addition, the 4-field enhanced motion detection scheme overcomes conventional motion-missing artifacts by achieving good motion detection accuracy and efficiently suppressing "motion missing" detection errors. Furthermore, with the incorporation of our new successive 4-field enhanced motion detection, the interpolation technique of the EPR algorithm adapts flexibly and achieves better performance on textural scenes in generic video sequences.
2009 IEEE International Symposium on Circuits and Systems, 2009
This paper presents a motion-compensated deinterlacing algorithm featuring spectrum-adaptive interpolation of interlaced fields. Using motion-compensated reference pictures, the proposed spectrum-adaptive filter identifies the baseband via spectrum analysis and removes the replicas caused by interlaced sampling, making the overall algorithm adapt to diverse video scenes and to different degrees of motion compensation. The experimental results indicate that our proposed algorithm has better objective performance than other motion-compensated and non-motion-compensated algorithms, especially in complex moving textures. The subjective results also support the benefits of our spectrum-adaptive filter.
2010 IEEE Workshop On Signal Processing Systems, 2010
Algorithmic complexity analysis and dataflow models play significant roles in the concurrent optimization of both algorithms and architectures, a new design paradigm referred to as Algorithm/Architecture Co-exploration. One of the essential complexity metrics is parallelism, which reveals the number of operations that can be executed concurrently. Inspired by principal component analysis (PCA), which transforms random variables into uncorrelated ones and hence supports dependency analysis, this paper presents a systematic methodology for identifying independent operations in algorithms and hence quantifying the intrinsic degree of parallelism, based on dataflow modeling and subsequent eigen-decomposition of the dataflow graphs. Our quantified degree of parallelism is platform-independent and provides insight into architectural characteristics in early design stages. Starting from different dataflows derived from signal flow graphs of basic signal processing algorithms, the case study on DCT shows that our proposed method can quantitatively characterize algorithmic parallelism, potentially facilitating design space exploration in early system design stages, especially for parallel processing platforms.
2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011
In this paper, based on the proposed parallelization scheme for binary arithmetic decoding, a parallel AVC/H.264 context-based adaptive binary arithmetic coding (CABAC) decoder with high throughput is proposed. Following the top-down design methodology, algorithm analysis and dataflow modeling at both high and low granularities are performed to arrive at the proposed architecture. The analysis of the algorithm reveals a similarity between the CABAC decoder and the Viterbi decoder, which is used to extend the degree of parallelism of binary arithmetic decoding. The proposed design is specified to support AVC/H.264 High Profile, Level 4.2, and 1920×1088 resolution at 64 frames per second. By increasing the degree of parallelism of bin decoding, the throughput of the proposed architecture is shown experimentally to improve 3.5 times compared with sequential bin decoding, and the number of decoded bins per second can reach 378M at a clock speed of 108 MHz.
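The two throughput figures in this abstract can be cross-checked with a line of arithmetic. The sketch below assumes that the sequential baseline decodes one bin per clock cycle; that baseline rate is an assumption for illustration and is not stated in the abstract.

```python
# Figures quoted in the abstract.
bins_per_second = 378_000_000   # 378M decoded bins per second
clock_hz = 108_000_000          # 108 MHz clock

# Average number of bins decoded per clock cycle.
bins_per_cycle = bins_per_second / clock_hz

# Assumption (not from the abstract): a sequential decoder handles 1 bin per cycle.
sequential_bins_per_cycle = 1.0
speedup = bins_per_cycle / sequential_bins_per_cycle

print(bins_per_cycle)  # 3.5
print(speedup)         # 3.5
```

Under that assumption, 378M bins per second at 108 MHz corresponds to 3.5 bins per cycle, which agrees with the reported 3.5-times improvement over sequential bin decoding.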
2008 IEEE International Symposium on Circuits and Systems, 2008
This paper introduced a spatial-temporal content-adaptive algorithm, which can precisely select an appropriate interpolation technique for high-quality deinterlacing according to the spectral, edge-oriented and statistical features of local video content. Our ...
2008 IEEE International Conference on Multimedia and Expo, 2008
Targeting highly sophisticated visual signal processing, we introduce in this paper complexity metrics, or measures of algorithms, which carry architectural information and are fed back or back-annotated in early design stages to facilitate concurrent exploration ...
2009 IEEE International Symposium on Circuits and Systems, 2009
In this paper we introduce a novel algorithm that can detect local features and choose a proper interpolation method for de-interlacing. An edge is a high-frequency pattern with a certain direction and is a noticeable feature in video sequences. We proposed a ...
2009 IEEE International Symposium on Circuits and Systems, 2009
This paper introduces a low-complexity VLSI hardware architecture for entropy coding with increased throughput, based on the study of the statistical properties of the context-based adaptive variable length coding (CAVLC) in AVC/H.264. These enhanced ...
Thin Solid Films, 2002
Nanostructured and amorphous silicon carbon nitride (SiCxNy) films have been deposited by magnetron sputtering of silicon carbide under a reactive gas environment. Gas mixtures containing methane and nitrogen with various ratios were used for deposition. Auger electron spectroscopy, X-ray photoelectron spectroscopy and micro-Raman spectroscopy were employed to characterize the composition and bonding structures, while scanning electron microscopy and transmission electron microscopy were used to investigate the microstructure of the SiCxNy films. As the methane/nitrogen ratio was increased, the SiCxNy films changed from mirror-like smooth films to column-like and ridge-like C-rich SiCxNy nanostructures. Micro-Raman studies also showed some blueshift and narrowing of the G band at higher methane concentrations, suggesting an increase in the short-range order of the graphite-like phase in the nanostructured films. The sharper geometric features of the nanostructured SiCxNy films and possibly the higher conductivity of the films led to an enhancement in field emission properties. A low turn-on field (<10 V µm⁻¹) and high emission current density (>0.2 mA cm⁻²), as well as good temporal emission stability, have been achieved for the nanostructured SiCxNy films.
IEICE Transactions on Information and Systems, 2007
Summary: This paper introduces a texture analysis mechanism utilizing a multiresolution technique to reduce false motion detection and hence thoroughly improve the interpolation results for high-quality deinterlacing. Conventional motion-adaptive deinterlacing algorithm selects ...
IEEE Transactions on Parallel and Distributed Systems, 2012
Degree of parallelism (DoP) is an essential complexity metric that characterizes the number of independent operation sets (IOSs) that can be concurrently executed within an algorithm. This paper presents a generic framework to identify IOSs and to quantify the DoP based on the rank theorem in linear algebra. This framework is applied to extract algorithmic parallelisms at various granularities, namely, multigrain ...
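One way to see how a rank argument can count independent operation sets is the standard graph-theory identity that an undirected graph with n nodes and Laplacian matrix L has n - rank(L) connected components. The sketch below applies that identity to a toy dataflow graph. It is an illustration only: the abstract does not give the paper's exact matrix formulation, and the helper names `laplacian`, `matrix_rank`, and `degree_of_parallelism` are hypothetical.

```python
from fractions import Fraction


def laplacian(n, edges):
    """Graph Laplacian (degree matrix minus adjacency) of a simple undirected graph."""
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1
    return lap


def matrix_rank(matrix):
    """Rank by exact Gaussian elimination over the rationals."""
    rows = [row[:] for row in matrix]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(rank + 1, len(rows)):
            factor = rows[i][col] / rows[rank][col]
            if factor != 0:
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def degree_of_parallelism(n, edges):
    """Number of independent node sets = n - rank(Laplacian)."""
    return n - matrix_rank(laplacian(n, edges))


# Toy dataflow: nodes 0,1 feed node 2 (one sum); nodes 3,4 feed node 5 (another sum).
# The two sums share no data, so they form two independent operation sets.
independent_sums = [(0, 2), (1, 2), (3, 5), (4, 5)]
dop_parallel = degree_of_parallelism(6, independent_sums)

# A chain 0 -> 1 -> 2 is fully dependent, so only one set exists.
dop_chain = degree_of_parallelism(3, [(0, 1), (1, 2)])

print(dop_parallel, dop_chain)  # 2 1
```

Exact rational arithmetic is used so that the rank is not affected by floating-point tolerance; for large graphs a numerical rank or a union-find component count would be the more practical choice.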
IEEE Transactions on Multimedia, 2007
This paper presents a new spatiotemporal motion estimation algorithm and its VLSI architecture for video coding based on algorithm and architecture co-design methodology. The algorithm consists of the new strategies of spatiotemporal motion vector prediction, ...
EURASIP Journal on Image and Video Processing, 2008
A novel motion-adaptive deinterlacing algorithm with edge-pattern recognition and hybrid motion detection is introduced. The great variety of video content makes it very difficult to process assorted motion, edges, textures, and their combinations with a single algorithm. The edge-pattern recognition algorithm introduced in this paper is flexible enough to process both textures and edges, which previously had to be handled separately by line average and edge-based line average. Moreover, predicting the neighboring pixels for pattern analysis and interpolation further enhances the adaptability of the edge-pattern recognition unit when motion detection is incorporated. Our hybrid motion detection accurately detects fast and slow motion in interlaced video, as well as motion with edges. Using only three fields for detection also yields higher temporal correlation for interpolation. The subjective and objective experimental results on CIF and PAL video sequences show that our deinterlacing algorithm performs better, with higher content adaptability and lower memory cost, than the state-of-the-art 4-field motion detection algorithms.
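For context, the two conventional interpolators named in this abstract, line average and edge-based line average (ELA), can be sketched as follows. This is the classic three-direction ELA, not the paper's edge-pattern recognition algorithm; the function names, the integer averaging, and the tie-breaking toward the vertical direction are illustrative assumptions.

```python
def line_average(upper, lower, x):
    """Interpolate the missing pixel as the mean of the pixels directly above and below."""
    return (upper[x] + lower[x]) // 2


def edge_based_line_average(upper, lower, x):
    """Classic 3-direction ELA: average along the direction with the smallest difference."""
    candidates = []
    for d in (-1, 0, 1):
        xu, xl = x + d, x - d  # mirrored columns in the upper and lower lines
        if 0 <= xu < len(upper) and 0 <= xl < len(lower):
            diff = abs(upper[xu] - lower[xl])
            # abs(d) as second key makes ties prefer the vertical direction.
            candidates.append((diff, abs(d), (upper[xu] + lower[xl]) // 2))
    return min(candidates)[2]


# Two known lines of one field with a diagonal edge between them.
upper = [0, 0, 0, 100, 100]
lower = [0, 100, 100, 100, 100]

la_row = [line_average(upper, lower, x) for x in range(len(upper))]
ela_row = [edge_based_line_average(upper, lower, x) for x in range(len(upper))]

print(la_row)   # [0, 50, 50, 100, 100]
print(ela_row)  # [0, 0, 100, 100, 100]
```

On this toy input, line average blurs the diagonal edge into intermediate values, while ELA follows the edge direction and keeps it sharp. ELA is known to be less reliable on fine textures, which is the trade-off the abstract refers to when it mentions handling textures and edges separately.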