You-Sung Chang - Academia.edu

Papers by You-Sung Chang

Customization of embedded system for low-power application (Korean title: Optimization of embedded systems for low-power applications)

High performance cache design for superscalar microprocessor (Korean title: Design of a high-performance split cache for superscalar microprocessors)

Verification of a microprocessor using real world applications

In this paper, we describe a fast and convenient verification methodology for microprocessors that uses large, real application programs as test vectors. The verification environment is based on automatic consistency checking between the golden behavioral reference model and the target HDL model, which are run in a handshaking fashion. In conjunction with the automatic comparison facility, a new HDL saver is proposed to accelerate the verification process. The proposed saver allows a 'restart' from the nearest checkpoint before the point at which an inconsistency is detected, regardless of whether the source code has been modified. This contrasts with a conventional saver, which does not allow a restart once a design change or debugging fix has been made. We demonstrated the effectiveness of the environment by applying it to a real-world example, the design process of a Pentium-compatible processor. It was shown that HDL verification with the proposed saver can be faster and more flexible than the hardware emulation approach. In short, restartability combined with the ability to modify source code proved very important for achieving a short debugging turnaround time, because it eliminates a large number of redundant simulations.
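The lock-step comparison and checkpoint restart described in this abstract can be illustrated with a toy sketch. Everything here is a hypothetical stand-in rather than the paper's environment: the `Counter` class plays the role of both the reference model and the target HDL model, `save`/`load` stand in for the saver, and the checkpoint interval is arbitrary. State is saved as plain data so that it can be loaded into a modified model, which is the property the abstract emphasizes.

```python
class Counter:
    """Toy model (hypothetical). bug_at injects a wrong increment on that tick."""

    def __init__(self, bug_at=None):
        self.value = 0
        self.ticks = 0
        self.bug_at = bug_at

    def step(self):
        self.ticks += 1
        self.value += 1
        if self.ticks == self.bug_at:
            self.value += 1  # injected design bug

    def save(self):
        # Plain-data snapshot, independent of the model's code.
        return {"value": self.value, "ticks": self.ticks}

    def load(self, snapshot):
        self.value = snapshot["value"]
        self.ticks = snapshot["ticks"]


def lockstep_verify(reference, target, steps, interval=4, start=0):
    """Step both models together and compare their state after every step.

    Returns (mismatch_step, checkpoint), where checkpoint is the most recent
    (step, reference_snapshot, target_snapshot) taken before the mismatch.
    mismatch_step is None if the models agree for the whole run.
    """
    checkpoint = (start, reference.save(), target.save())
    for step in range(start, steps):
        if step % interval == 0:
            checkpoint = (step, reference.save(), target.save())
        reference.step()
        target.step()
        if reference.save() != target.save():
            return step, checkpoint
    return None, checkpoint


# First run: the target misbehaves on its 10th tick (step index 9).
mismatch, (cp_step, ref_snap, tgt_snap) = lockstep_verify(
    Counter(), Counter(bug_at=10), steps=20)

# "Fix" the design, then restart from the checkpoint instead of from step 0.
fixed_ref, fixed_target = Counter(), Counter()
fixed_ref.load(ref_snap)
fixed_target.load(tgt_snap)
rerun_mismatch, _ = lockstep_verify(fixed_ref, fixed_target, steps=20, start=cp_step)
```

In this toy run the restart replays only the steps after the checkpoint, which is where the saving in redundant simulation would come from.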

Customization of a CISC processor core for low-power applications

Conforming block inversion for low power memory

IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Feb 1, 2002

Conforming inverted data store for low power memory

In this paper, we propose a 'conforming inverted data store' scheme for reducing the power consumption in memory components. It reduces the power consumption by conforming memory contents to the precharging value of the memory. It selectively stores normal or inverted data so as to reduce the total number of accessed bits that differ from the precharging value. In this way, bitline toggling during memory access is minimized, which ultimately contributes to a reduction in power consumption. We develop two practical implementations of the proposed method, namely vertical-strip and horizontal-strip inversion schemes. Simulation results indicate that implementing the strip-based inversion schemes contributes to a power reduction of up to 50%.
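The selective-inversion idea in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's vertical-strip or horizontal-strip implementation: the inversion decision is made per 8-bit word, the precharging value is assumed to be all ones, and one extra flag per word records whether the stored data is inverted. The names `encode`, `decode`, and `differing_bits` are hypothetical, and the cost of storing the flag itself is ignored.

```python
WORD_BITS = 8                      # assumed word width
MASK = (1 << WORD_BITS) - 1        # also used as the assumed precharge value (all ones)


def differing_bits(word, precharge=MASK):
    """Count the bits of `word` that differ from the precharge value."""
    return bin((word ^ precharge) & MASK).count("1")


def encode(word, precharge=MASK):
    """Return (stored_word, inverted_flag).

    The word is stored inverted when more than half of its bits
    differ from the precharge value.
    """
    if differing_bits(word, precharge) > WORD_BITS // 2:
        return (~word) & MASK, True
    return word & MASK, False


def decode(stored, inverted):
    """Recover the original word from the stored word and its flag."""
    return (~stored) & MASK if inverted else stored


# 0x01 differs from the all-ones precharge value in 7 bits, so it is stored inverted.
stored, flag = encode(0x01)
original = decode(stored, flag)
```

With this rule, no stored word differs from the precharging value in more than half of its bits, which is the property that limits bitline toggling on access.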

An Efficient Approach to Functional Verification of Complex Processors

System, method and article of manufacture for storing an incoming datagram in a switch matrix in a switch fabric chipset system

The invention relates to a system, a method and an article of manufacture for storing an incoming datagram (3702) in a switch matrix (3706) of a switching device (104). The switch matrix (3706) has a pair of buffers (3708, 3710), each having two portions (3712, 3714). Data of the datagram (3702) is received, and the buffer portions are filled sequentially with the data. Transfer of data from the buffers (3708, 3710) into the switch matrix (3706) is periodically allowed. Each time a transfer is allowed, the data in one of the buffer portions may be transferred into the switch matrix (3706), in the sequence in which the buffer portions (3712, 3714) were filled.

Design and Implementation of Content Switching Network Processor and Scalable Switch Fabric

This paper proposes a network processor especially optimized for content switching. With 2 Gbps port capability, it integrates a packet processor cluster, a content-based classification engine and a traffic manager on a single chip. A switch fabric architecture is also designed to scale up the network processor's capability to over hundreds of gigabits of bandwidth. Applied in real network systems, the network processor shows wire-speed network address translator (NAT) and content-based switching performance.

HDL Saver Allowing Restart after Source Modification

Switch Expansion Architecture Using Local Switching Network

IEEE International Conference on Communications, 2000

We propose a scheme for expanding switch modules to form a larger switch using so-called local switching, obtained by allowing bidirectional connections in the conventional three-stage Clos (1953) network switch. The performance of the proposed expansion architecture was compared to that of the conventional three-stage Clos network. We simulated 32×32 switching systems using two expansion architectures with the same...

Conforming inverted data store for low power memory

International Symposium on Low Power Electronics and Design, 1999

Verification of a microprocessor using real world applications

Proceedings 1999 Design Automation Conference (Cat. No. 99CH36361), 1999

Customization of a CISC Processor Core for Low-Power Applications

This paper describes a core-customization process of a CISC processor core for a given application program. It aims at power reduction in the CISC processor core by fully utilizing the microcode-based control scheme, which is one of the most characteristic features of a CISC processor. The optimization process includes two key techniques, generation of application-specific complex instructions (ASCI) and low-power-oriented microcode-ROM compilation, which operate independently at two different levels of optimization. At the architectural level, application-specific complex instructions are generated so as to reduce the activity of the fetch and decode units; at the physical level, the microcode-ROM is compiled to reduce bitline toggling for each microcode-ROM access. Our experimental results based on transistor-level simulation show the proposed techniques can jointly reduce the total power consumption of the...
