Mohsen Naderi | Sorbonne University
Papers by Mohsen Naderi
This paper investigates ideological aspects of literary translations by focusing on the analysis of the textual features of a literary source text (namely, Sadeq Hedayat's Buf-e Kur [The Blind Owl], a novel originally published in Persian in 1937) and its corresponding translated target text (namely, The Blind Owl, rendered into English by D. P. Costello in 1957). Inspired mainly by a theoretical framework for the analysis and assessment of translated works, the researchers have analysed the target text by accounting for the sets of constraints relating to genre, discourse, and text as semiotic systems within which the expression of ideology takes place. In order to indicate the translator's ideological orientations, the study has mainly concentrated on three textual features, namely, "transitivity shifts", "nominalizations", and "modality shifts", together with a handful of lexical choices suggesting "domestication". Venuti's proposal (1995), that the present status of English as a "dominant" language serves as a motive for translators to adopt a "domesticating" strategy, has also been drawn on.
In this paper we present a design tool for automatic synthesis of a Verilog behavioral description of an asynchronous circuit into delay-insensitive presynthesized library modules, using syntax-directed techniques. Our design tool can also generate appropriate output to support implementing the circuit on ASICs and LUT-based FPGAs; consequently, rapid prototyping of the asynchronous circuit becomes readily available using the proposed tool.
Although asynchronous circuits are accepted as low-power, low-EMI and high-performance circuits, the roadblock to wide acceptance of asynchronous design methodology is poor CAD support, especially in physical design tools. There are a few academic design tools for asynchronous circuit design and synthesis, but there is neither a published tool nor a published document on the physical design of these circuits.
This paper presents a robust and low-power Viterbi decoder based on an asynchronous architecture. The design is based upon the Quasi Delay Insensitive (QDI) timing model, which leads to robust functionality for the decoder. To lower the power consumption of the decoder further, an optimization technique to reduce the power dissipation is applied to the add-compare-select (ACS) unit of the decoder. The simulation results show a 20% reduction in power consumption for the asynchronous design compared to the synchronous design in 0.35 μm CMOS technology with a power supply of 2.5 V. The throughput of the circuit is 50 MS/s.
This paper focuses on a clock generation scheme for the implementation of GALS circuits on commercial FPGAs, which are mostly synchronous. Previously overlooked timing problems of existing pausible clock generators are explored and a novel clock generator is introduced. To validate the proposed solution we implemented the clock generator and a simple port controller on an FPGA. In addition, general design considerations to successfully implement a GALS circuit on FPGAs are discussed. At the end we present a GALS Reed-Solomon decoder as a practical example. The results show that the GALS approach can improve both power and performance of the system using different partitioning strategies.
Persia: an Asynchronous Synthesis Tool Based on Alain Martin's Method. Arash Saifhashemi, Mohsen Naderi, K. Saleh, M. Salehi, H. Pedram. Email: {saif, naderi, saleh, salehi, pedram}@ce.aut.ac.ir. Amirkabir University of Technology (Tehran Polytechnic) ...
template-based synthesis of asynchronous circuits. The tool, named Template Synthesizer (TSYN), is a part of a complete asynchronous design flow named Persia. This tool transforms a behavioral description of a circuit into a sized transistor netlist. The input behavioral description must fit into some previously known templates. These templates are general enough to allow implementation of almost all circuit blocks. Finally, an asynchronous Reed-Solomon decoder is synthesized using this tool.
Difficulties of synchronous circuits such as clock skew, power consumption, worst-case delay, and physical sensitivity pave the way for asynchronous designs. There are different asynchronous delay models; among them, Delay Insensitive (DI) is one of the most popular. A ...
Power reduction is one of the main reasons for designing asynchronous circuits. Asynchronous circuits are inherently capable of low power consumption. In this paper, we introduce five methods to achieve even lower power consumption. We have applied these methods to some sample circuits, and the experimental results show that a power reduction of 20% to 41% is obtainable.
In this paper we show that it is both possible and easy to use a standard HDL such as Verilog, along with PLI, to model asynchronous circuits at all levels of abstraction, including the behavioral level (CSP level). Our method allows CSP (Communicating Sequential ...
This paper presents a new method to implement a multiplier using the Quasi Delay Insensitive (QDI) approach. QDI circuits allow unbounded delays on wires and gates, and require the difference among the delays in forks to be less than the delays of their terminating gates. To implement the Booth multiplier following the QDI approach, we considered Martin's method. In this method, an asynchronous circuit is considered as a set of cells that communicate through a handshaking protocol, and is synthesized from a high-level definition through different levels of translation. The main problem with the resulting circuits is their considerable overhead due to the implementation of handshaking protocols. In our proposed method, the overhead is reduced by 50% by separating the control and data path units. This solution increases the number of forks and adds complexity to the physical implementation. By applying some of the rules derived from Martin's method, the forks became locally limited, which eases the physical implementation.
This paper presents a fully asynchronous 1k×4-bit memory array with no assumptions regarding gate delays; it is thus suitable for systems with considerable changes in the voltage level and the environmental variables. The nominal voltage level was 3.3 V, and it is shown that the design can still operate properly at voltage levels as low as 1.2 V. The design is based on the QDI timing model and implemented using Martin's method.
This paper introduces a methodology for prototyping globally asynchronous locally synchronous (GALS) circuits on synchronous commercial FPGAs. A library of required elements for implementing GALS circuits is proposed, and general design considerations to successfully implement a GALS circuit on an FPGA are discussed. The library includes clock generators, arbiters, and different port controllers. Different implementations of these circuits and their advantages and disadvantages are explored. At the end we present a GALS Reed-Solomon decoder as a practical example. The results show that the GALS approach improves the performance of the circuit by 11% and reduces the power consumption by 18.7% to 19.6%, depending on the error rate. On the other hand, the area of the circuit is increased by 51%, which is acceptable considering that a purely synchronous circuit including a central controller is decomposed to generate the GALS system, and 29% of this overhead comes from distributing the controller across different modules. Deploying better decomposition methods can reduce this overhead substantially.