You-Sung Chang - Academia.edu
Papers by You-Sung Chang
In this paper, we describe a fast and convenient verification methodology for microprocessors that uses large, real application programs as test vectors. The verification environment is based on automatic consistency checking between the golden behavioral reference model and the target HDL model, which run in a handshaking fashion. In conjunction with the automatic comparison facility, a new HDL saver is proposed to accelerate the verification process. The proposed saver allows a 'restart' from the nearest checkpoint before the point where an inconsistency is detected, regardless of whether the source code has been modified. This contrasts with a conventional saver, which does not allow a restart after a design change or a debugging fix. We demonstrated the effectiveness of the environment by applying it to a real-world example, the design of a Pentium-compatible processor. HDL verification with the proposed saver was shown to be potentially faster and more flexible than the hardware emulation approach. In short, restartability combined with the ability to modify source code is very important for achieving a short debugging turnaround time, because it eliminates a large number of redundant simulations.
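The checkpoint-and-restart idea in this abstract can be illustrated with a toy sketch. It is not the paper's HDL saver: the two "models" here are plain Python step functions over dictionaries, and the names `run_lockstep`, `nearest_checkpoint` and the `interval` parameter are hypothetical, chosen only for illustration. The sketch steps a reference model and a target model together, compares their state after every step, and records periodic checkpoints so that a rerun can resume from the last checkpoint before a mismatch, even with a modified target model.

```python
import copy

def run_lockstep(ref_step, dut_step, ref_state, dut_state, program,
                 start=0, interval=4, checkpoints=None):
    """Step both models over `program`, comparing state after each step.

    Returns (index of first mismatch or None, dict of checkpoints).
    A checkpoint stored under index i holds both states *before* step i.
    """
    checkpoints = {} if checkpoints is None else checkpoints
    for i in range(start, len(program)):
        if i % interval == 0:
            checkpoints[i] = (copy.deepcopy(ref_state), copy.deepcopy(dut_state))
        ref_step(ref_state, program[i])
        dut_step(dut_state, program[i])
        if ref_state != dut_state:
            return i, checkpoints
    return None, checkpoints

def nearest_checkpoint(checkpoints, mismatch_index):
    """Return the latest checkpoint index at or before the mismatch."""
    return max(i for i in checkpoints if i <= mismatch_index)
```

After a mismatch, a caller would copy the states stored at `nearest_checkpoint(...)`, swap in the corrected step function, and call `run_lockstep` again with `start` set to that checkpoint index, so the steps before the checkpoint are not simulated again.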
IEEE Transactions on Very Large Scale Integration Systems, Feb 1, 2002
In this paper, we propose a 'conforming inverted data store' scheme for reducing the power consumption of memory components. It reduces power consumption by conforming the memory contents to the precharging value of the memory: it selectively stores normal or inverted data so as to reduce the total number of accessed bits that differ from the precharging value. In this way, bitline toggling during memory access is minimized, which ultimately reduces power consumption. We develop two practical implementations of the proposed method: vertical strip and horizontal strip inversion schemes. Simulation results indicate that the strip-based inversion schemes reduce power by up to 50%.
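A minimal sketch of the selective-inversion idea follows. It is an assumption-laden simplification, not the paper's vertical or horizontal strip organization: each word is handled on its own with one extra flag bit, the word width is 8 bits, and the precharging value is assumed to be 1. The names `mismatches`, `encode` and `decode` are hypothetical.

```python
WORD_BITS = 8       # assumed word width
PRECHARGE_BIT = 1   # assumed precharging value of the bitlines

def mismatches(word, width=WORD_BITS, precharge=PRECHARGE_BIT):
    """Count the bits of `word` that differ from the precharging value."""
    mask = (1 << width) - 1
    ones = bin(word & mask).count("1")
    return width - ones if precharge == 1 else ones

def encode(word, width=WORD_BITS, precharge=PRECHARGE_BIT):
    """Return (stored_word, inverted_flag), picking the form closer to precharge."""
    mask = (1 << width) - 1
    normal = word & mask
    inverted = ~word & mask
    if mismatches(inverted, width, precharge) < mismatches(normal, width, precharge):
        return inverted, 1
    return normal, 0

def decode(stored, flag, width=WORD_BITS):
    """Recover the original word from the stored word and its flag bit."""
    mask = (1 << width) - 1
    return (~stored & mask) if flag else (stored & mask)
```

With these assumptions, a stored word never differs from the precharging value in more than half of its bits, at the cost of one flag bit per word. A real design would also have to account for the flag bit's own storage and access cost, which this sketch ignores.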
The invention relates to a system, a method and an article of manufacture for storing an incoming datagram (3702) in a switch fabric (3706) of a switching device (104). The switch fabric (3706) comprises a pair of buffers (3708, 3710), each having two portions (3712, 3714). Data of the datagram (3702) is received, and the buffer portions are filled sequentially with the data. Periodically, a transfer of data is granted by the buffers (3708, 3710) into the switch fabric (3706). Each time a data transfer is granted, in the sequence in which the buffer portions (3712, 3714) were filled, the data in one of the buffer portions may be transferred into the switch fabric (3706).
This paper proposes a network processor specifically optimized for content switching. With 2 Gbps port capability, it integrates a packet processor cluster, a content-based classification engine and a traffic manager on a single chip. A switch fabric architecture is also designed to scale the network processor's capability beyond hundreds of gigabits of bandwidth. Applied in real network systems, the network processor delivers wire-speed network address translator (NAT) and content-based switching performance.
IEEE International Conference on Communications, 2000
We propose a scheme for expanding switch modules to form a larger switch using so-called local switching, obtained by allowing bidirectional connections in the conventional three-stage Clos (1953) network switch. The performance of the proposed expansion architecture was compared with that of the conventional three-stage Clos network. We simulated 32×32 switching systems using two expansion architectures with the same
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
International Symposium on Low Power Electronics and Design, 1999
In this paper, we propose a 'conforming inverted data store' scheme for reducing the power consumption of memory components. It reduces power consumption by conforming the memory contents to the precharging value of the memory. It selectively stores normal or inverted data so as to reduce the total number of accessed bits that differ from the precharging value. In this
Proceedings 1999 Design Automation Conference (Cat. No. 99CH36361), 1999
In this paper, we describe a fast and convenient verification methodology for microprocessors that uses large, real application programs as test vectors. The verification environment is based on automatic consistency checking between the golden behavioral reference model and the target HDL model, which run in a handshaking fashion. In conjunction with the automatic comparison facility, a new HDL saver is proposed to accelerate the verification process. The proposed saver allows a 'restart' from the nearest checkpoint before the point where an inconsistency is detected, regardless of whether the source code has been modified. This contrasts with a conventional saver, which does not allow a restart after a design change or a debugging fix. We demonstrated the effectiveness of the environment by applying it to a real-world example, the design of a Pentium-compatible processor. HDL verification with the proposed saver was shown to be potentially faster and more flexible than the hardware emulation approach. In short, restartability combined with the ability to modify source code is very important for achieving a short debugging turnaround time, because it eliminates a large number of redundant simulations.
This paper describes a core-customization process for a CISC processor core and a given application program. It aims to reduce power in the CISC processor core by fully exploiting the microcode-based control scheme, which is one of the most characteristic features of a CISC processor. The optimization process includes two key techniques, generation of application-specific complex instructions (ASCI) and low-power-oriented microcode-ROM compilation, which operate independently at two different levels of optimization. At the architectural level, application-specific complex instructions are generated to reduce the activity of the fetch and decode units; at the physical level, the microcode-ROM is compiled to reduce the bitline toggling for each microcode-ROM access. Our experimental results, based on transistor-level simulation, show that the proposed techniques can jointly reduce the total power consumption of the...