Aseem Sayal | Google - Academia.edu

Papers by Aseem Sayal

EDA design for Microscale Modular Assembled ASIC (M2A2) circuits

Heterogeneous integration of components onto compact devices using moire based metrology and vacuum based pick-and-place

COMPAC: Compressed Time-Domain, Pooling-Aware Convolution CNN Engine With Reduced Data Movement for Energy-Efficient AI Computing

In this work, we demonstrate a compressed time-domain, pooling-aware convolution (COMPAC) convolutional neural network (CNN) engine for energy-efficient edge AI computing by performing multi-bit input and multi-bit weight multiply-and-accumulate (MAC) operations in the time domain. The multi-bit inputs are compactly represented as a single pulse-width-encoded input. This translates into reduced switching capacitance ($C_{\text{DYN}}$) compared with the baseline digital implementation and can enable low-power neural network computing in an edge device. The COMPAC CNN engine employs a novel, improved version of the memory delay line (MDL) that supports time residue scaling to perform the signed accumulation of multi-bit input and multi-bit weight products in the time domain. The compressed time-domain (CTD) approach is proposed to improve the throughput in time encoding of the input activations. The simulation results of the proposed CTD approach on the AlexNet CNN over 1000 Im...

Galvanically Isolated, Power and Electromagnetic Side-Channel Attack Resilient Secure AES Core with Integrated Charge Pump based Power Management

2021 IEEE Custom Integrated Circuits Conference (CICC), 2021

A galvanic isolation (GI) technique for cryptographic cores is proposed to mitigate power and electromagnetic (EM) side-channel analysis (SCA) attacks. The design uses deep N-well technology and an integrated charge pump-based power delivery and management to completely isolate VCC, VSS, and substrate nodes from the external supply and ground pins, improving the SCA resilience due to supply as well as ground bounce. Measured results from a 128-bit Advanced Encryption Standard (AES) core implemented in 40 nm CMOS show >600x and >220x improvement against a correlation power analysis (CPA) and coarse-grained EM SCA attack, respectively, while operating at 20% lower frequency, consuming 2.3x more power, and occupying 0.0136 mm² larger area.

14.4 All-Digital Time-Domain CNN Engine Using Bidirectional Memory Delay Lines for Energy-Efficient Edge Computing

2019 IEEE International Solid-State Circuits Conference (ISSCC), 2019

Design of versatile modulator using CDTAs

2013 International Conference on Emerging Trends in Communication, Control, Signal Processing and Computing Applications (C2SPCA), 2013

In this paper, a versatile modulator circuit is proposed that can be used for amplitude modulation, frequency modulation, delta modulation and sigma delta modulation. The circuit is implemented with CDTA blocks, a grounded capacitor and resistors. It consists of an integrator, a Schmitt trigger and an inherent clock which is used as the carrier signal, thereby obviating the need for an external clock. Due to its compact size, it is suitable for IC realisation. The different modulation modes are compared in terms of power dissipation, signal-to-noise ratio (SNR) and output noise voltage.

Realization of DVCCTA Based Versatile Modulator

Active and Passive Electronic Components, 2014

A Differential Voltage Current Conveyor Transconductance Amplifier (DVCCTA) based versatile modulator is proposed which can work as an amplitude modulator, frequency modulator, delta modulator, and sigma delta modulator. The modulator uses a pulse generator as its core, and the pulse generator output serves as the carrier signal. A DVCCTA based pulse generator is proposed first and subsequently configured as the different modulators. Compact realization is the key feature of the proposed circuit: it uses two DVCCTAs, a grounded resistor, and a grounded capacitor, and is hence appropriate for IC realization. The functionality of the proposed circuit is verified through SPICE simulations using TSMC 0.25 μm CMOS process model parameters. Performance parameters such as power dissipation and noise are also obtained for the various modulator schemes.

Realization of CDTA based frequency agile filter

2013 IEEE International Conference on Signal Processing, Computing and Control (ISPCC), 2013

This paper presents a frequency agile filter based on the current difference transconductance amplifier (CDTA). The agile filters used in this work provide high agility, tunability and quality factor, and they are fully integrated configurations rather than discrete systems. The use of grounded capacitors and a grounded resistor makes these structures suitable for integration. The functional verification is exhibited through extensive SPICE simulations using 0.25 μm TSMC CMOS technology model parameters. The performance evaluation is made in terms of power dissipation, signal-to-noise ratio (SNR) and output noise.

Design of CDTA and VDTA Based Frequency Agile Filters

Advances in Electronics, 2014

This paper presents frequency agile filters based on current difference transconductance amplifier (CDTA) and voltage difference transconductance amplifier (VDTA). The proposed agile filter configurations employ grounded passive components and hence are suitable for integration. Extensive SPICE simulations using 0.25 μm TSMC CMOS technology model parameters are carried out for functional verification. The proposed configurations are compared in terms of performance parameters such as power dissipation, signal to noise ratio (SNR), and maximum output noise voltage.

CDTA based semi Gaussian shapers for detector readout front ends

2013 International Conference on Circuits, Power and Computing Technologies (ICCPCT), 2013

This paper presents voltage and current mode semi Gaussian (S-G) shaper circuits based on current difference transconductance amplifier (CDTA). The use of grounded capacitors makes these structures suitable for integration. The functional verification is exhibited through extensive SPICE simulations using 0.25 µm TSMC CMOS technology model parameters. The performance comparison is made in terms of total harmonic distortion, power dissipation, dynamic range, signal to noise ratio (SNR) and output noise.

DDCCTA based semi Gaussian shapers for detector readout front ends

2013 International Conference on Circuits, Power and Computing Technologies (ICCPCT), 2013

16.2 eDRAM-CIM: Compute-In-Memory Design with Reconfigurable Embedded-Dynamic-Memory Array Realizing Adaptive Data Converters and Charge-Domain Computing

2021 IEEE International Solid-State Circuits Conference (ISSCC), 2021

M2A2: Microscale Modular Assembled ASICs for High-Mix, Low-Volume, Heterogeneously Integrated Designs

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

With CMOS process technology scaling, the mask cost for fabricating nano-scale transistors, contacts and interconnects has become prohibitively expensive, especially for low-volume designs. Moreover, higher transistor density has resulted in higher design complexity and large-sized die, which has led to an increase in the design cycle time and degradation in the process yield. These challenges are forcing low-volume ASICs (Application Specific Integrated Circuits) towards highly sub-optimal FPGAs (Field Programmable Gate Arrays). In this paper, we propose a new approach for designing and fabricating high-mix, low-volume heterogeneously integrated ASICs, referred to as Microscale Modular Assembled ASIC (M2A2), consisting of (1) pick-and-place assembly of prefabricated blocks (PFBs), which utilizes the nano-precision placement capabilities developed in Jet-and-Flash Imprint Lithography (J-FIL), and (2) an EDA design methodology utilizing unsupervised learning and graph-matching techniques. The EDA methodology leverages existing CAD tools infrastructure for easy adoption into the current EDA ecosystem. The proposed fabrication technology uses a pick-and-place assembly technique to allow nano-precise assembly of PFBs. The PFBs can be fabricated in advanced process nodes and then knitted together on a wafer substrate. Custom-designed low-cost back-end metal layers can then be created/placed on top of the PFB knitted layer to realize a variety of high-mix, low-volume ASIC designs. M2A2 would allow more flexibility in front-end design by optimal PFB selection and knitting compared to earlier proposed approaches such as structured ASICs (sASICs). In this paper, the performance of M2A2 based designs is compared with different design technologies such as baseline ASICs, FPGAs and sASICs at 16 nm, 40 nm and 130 nm CMOS process nodes.
The post-PNR simulation results achieved over 15 IWLS benchmarks show that the proposed M2A2 designs achieve 27.11x-34.89x reduced Power-Delay-Product (PDP) compared to FPGAs, and incur 1.69x-2.36x larger area compared to the baseline ASICs. The M2A2 designs achieve 15%-68.5% smaller area and 8.5%-52% higher performance compared to the sASIC methodologies. Moreover, the key fabrication steps in the proposed M2A2 technology are presented. The experimental fab results along with the proposed EDA flow simulations show promising results for the proposed M2A2 technology. Design trade-offs and process challenges for large scale deployment of the M2A2 technology are discussed along with their mitigation strategies.

A 12.08-TOPS/W All-Digital Time-Domain CNN Engine Using Bi-Directional Memory Delay Lines for Energy Efficient Edge Computing

IEEE Journal of Solid-State Circuits

In this article, we demonstrate an energy-efficient convolutional neural network (CNN) engine by performing multiply-and-accumulate (MAC) operations in the time domain. The multi-bit inputs are compactly represented as a single pulse-width-encoded input. This translates into reduced switching capacitance (CDYN) compared to the baseline digital implementation and can enable low-power neural network computing in an edge device. The time-domain CNN engine employs a novel bi-directional memory delay line (MDL) unit to perform signed accumulation of input and weight products. The proposed MDL design leverages standard digital circuits and does not require any capacitors or complex analog-to-digital converters (ADCs) to realize the convolution operation, thereby enabling easy scaling across process technology nodes. Four speed-up modes and a configurable MDL length are supported to address the throughput versus accuracy trade-off of the time-domain computing approach. Delay calibration units are included to mitigate the process-variation-induced delay mismatch among concurrently operating MDL units. The proposed time-domain MDL design implements a LeNet-5 CNN engine in a commercial 40-nm CMOS process, achieving an energy efficiency of 12.08 TOPS/W and a throughput of 0.365 GOPS at 537 mV in the 16× speed-up mode. 40-nm CMOS test-chip measurements over 100 MNIST images show 97% classification accuracy. Simulation results over the entire 10000 MNIST validation dataset images, taking into account the circuit non-ideal effects of the MDL-based time-domain approach, show a classification accuracy of 98.42%. The test-chip is operational down to the near-threshold voltage (up to 375 mV) while maintaining the classification accuracy over 90% in the 1× speed-up mode. Furthermore, two methods of scaling MDLs to multi-bit weights are proposed.
Simulation results for 1000-class AlexNet over 50000 ImageNet validation dataset images show classification accuracy loss within 1% when compared with software implementation. The proposed MDL based time-domain approach performing 1-bit/8-bit weight and 8-bit input MAC operations, when compared with the corresponding baseline digital implementations, shows 2.09×-2.32× higher energy efficiency and 2.22×-3.45× smaller area.
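
The time-domain accumulation described in this abstract can be pictured with a small behavioral model. The sketch below is only an illustration under stated assumptions, not the circuit from the paper: the function name `time_domain_mac` is hypothetical, activations are assumed to be non-negative integers encoded as pulse widths in unit delays, and 1-bit weights are assumed to map to +1 or -1 (forward or backward propagation along the delay line). Speed-up modes, finite MDL length, and delay calibration are not modeled.

```python
def time_domain_mac(inputs, weights):
    """Behavioral sketch of signed accumulation on a bidirectional delay line.

    Assumption: each activation x is a pulse x unit delays wide, and each
    weight is +1 (propagate forward) or -1 (propagate backward).
    """
    position = 0  # stage index of the stored edge in the delay line
    for x, w in zip(inputs, weights):
        if x < 0 or w not in (-1, 1):
            raise ValueError("expected x >= 0 and w in {-1, +1}")
        # While the pulse is high, the edge moves one stage per unit delay.
        for _ in range(x):
            position += w
    return position


def digital_mac(inputs, weights):
    """Reference multiply-and-accumulate computed arithmetically."""
    return sum(x * w for x, w in zip(inputs, weights))


activations = [3, 5, 2]
binary_weights = [1, 1, -1]
print(time_domain_mac(activations, binary_weights))
print(digital_mac(activations, binary_weights))
```

In this idealized model the final edge position equals the arithmetic dot product, which is the property the abstract relies on when it replaces multi-bit digital switching with a single pulse per input.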

Realization of S-G Shapers for Detector Readout Front Ends

Signal Processing

This paper presents the realization of semi Gaussian (S-G) shaper circuits for detector readout front ends. The circuits are based on the current difference transconductance amplifier (CDTA) and the differential-difference current conveyor transconductance amplifier (DDCCTA) and can operate in voltage and current mode. The proposed circuits use grounded capacitors and are thus suitable for integration. The theoretical propositions are verified through extensive SPICE simulations using 0.25 μm TSMC CMOS technology model parameters. The performance of the proposed topologies is compared in terms of total harmonic distortion, power dissipation, dynamic range, signal-to-noise ratio (SNR), and output noise. The performance comparison of CDTA, DDCCTA, CCII, and OTA based shapers shows that the proposed CDTA and DDCCTA based shapers have advantageous performance features over existing CCII and OTA based shapers.

A Review on Maximum Power Point Tracking Techniques for Photovoltaic Power Generating Systems

Realization of CDTA Based Frequency Agile Filter

IEEE International Conference on Signal Processing, Computing and Control (ISPCC), 2013

This paper presents a frequency agile filter based on the current difference transconductance amplifier (CDTA). The agile filters used in this work provide high agility, tunability and quality factor, and they are fully integrated configurations rather than discrete systems. The use of grounded capacitors and a grounded resistor makes these structures suitable for integration. The functional verification is exhibited through extensive SPICE simulations using 0.25 µm TSMC CMOS technology model parameters. The performance evaluation is made in terms of power dissipation, signal-to-noise ratio (SNR) and output noise.

Design of Versatile Modulator using CDTAs

IEEE International Conference on Emerging Trends in Communication, Control, Signal Processing and Computing Applications (C2SPCA), 2013

In this paper, a versatile modulator circuit is proposed that can be used for amplitude modulation, frequency modulation, delta modulation and sigma delta modulation. The circuit is implemented with CDTA blocks, a grounded capacitor and resistors. It consists of an integrator, a Schmitt trigger and an inherent clock which is used as the carrier signal, thereby obviating the need for an external clock. Due to its compact size, it is suitable for IC realisation. The different modulation modes are compared in terms of power dissipation, signal-to-noise ratio (SNR) and output noise voltage. Index Terms: Current difference transconductance amplifier (CDTA), Amplitude Modulation, Frequency Modulation, Delta Modulation, Sigma Delta Modulation.

MPPT Techniques for Photovoltaic System under Uniform Insolation and Partial Shading Conditions

IEEE Students Conference on Engineering and Systems, 2012

Photovoltaic energy is one of the renewable energy sources that has attracted the attention of researchers in recent decades. Photovoltaic generators exhibit nonlinear I-V and P-V characteristics, and the maximum power produced varies with solar insolation and temperature. Maximum power point tracking (MPPT) control techniques are therefore required to extract the maximum available power from PV arrays. Under partial shading, the characteristics of a PV system change considerably and often exhibit several local maxima with one global maximum. Conventional MPPT techniques can easily be trapped at a local maximum under partial shading, which significantly reduces the energy yield of the PV system. In this paper, various MPPT algorithms for uniform insolation and partial shading conditions are reviewed with their merits and demerits. Also, some new algorithms are proposed which are shown to be more efficient than the existing ones.
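
The trapping behavior mentioned in this abstract can be reproduced with a toy example. The following is a minimal sketch, not one of the algorithms reviewed or proposed in the paper: `pv_power` is a hypothetical two-peak P-V curve in arbitrary units (not measured data), `perturb_and_observe` is a plain hill-climbing tracker used as a stand-in for a conventional MPPT method, and `global_scan` is a brute-force voltage sweep included only for comparison.

```python
import math


def pv_power(v):
    """Hypothetical P-V curve under partial shading (arbitrary units)."""
    local_peak = 40.0 * math.exp(-((v - 12.0) ** 2) / 8.0)
    global_peak = 60.0 * math.exp(-((v - 28.0) ** 2) / 8.0)
    return local_peak + global_peak


def perturb_and_observe(v_start, step=0.2, iterations=500):
    """Hill climbing: keep perturbing the voltage in the direction that raises power."""
    v = v_start
    p = pv_power(v)
    direction = 1.0
    for _ in range(iterations):
        v_next = v + direction * step
        p_next = pv_power(v_next)
        if p_next < p:
            # Power dropped, so reverse the perturbation direction.
            direction = -direction
        v, p = v_next, p_next
    return v


def global_scan(v_min=0.0, v_max=40.0, points=401):
    """Brute-force sweep over the voltage range; returns the best sampled voltage."""
    candidates = [v_min + i * (v_max - v_min) / (points - 1) for i in range(points)]
    return max(candidates, key=pv_power)


v_trapped = perturb_and_observe(v_start=5.0)
v_best = global_scan()
print(f"hill climbing settles near V = {v_trapped:.1f}, P = {pv_power(v_trapped):.1f}")
print(f"full sweep finds        V = {v_best:.1f}, P = {pv_power(v_best):.1f}")
```

Started on the low-voltage side of this synthetic curve, the hill-climbing tracker oscillates around the smaller peak and never crosses the valley to the larger one, which is the failure mode that motivates MPPT methods designed for partial shading.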

DDCCTA Based Semi Gaussian Shapers for Detector Readout Front Ends

IEEE International Conference on Circuits, Power and Computing Technologies (ICCPCT), 2013

This paper presents semi Gaussian (S-G) shaper circuits based on the differential difference current conveyor transconductance amplifier (DDCCTA). The grounded capacitors make the realization suitable for integration. The proposed circuits are verified for functionality through extensive SPICE simulations using 0.25 μm TSMC CMOS technology model parameters. The performance comparison is made in terms of total harmonic distortion, power dissipation, dynamic range, signal-to-noise ratio (SNR) and output noise. Index Terms: Current mode, Differential difference current conveyor transconductance amplifier (DDCCTA), Semi Gaussian shaper, Voltage mode.

Research paper thumbnail of EDA design for Microscale Modular Assembled ASIC (M2A2) circuits

Bookmarks Related papers MentionsView impact

Research paper thumbnail of Heterogeneous integration of components onto compact devices using moire based metrology and vacuum based pick-and-place

Bookmarks Related papers MentionsView impact

Research paper thumbnail of COMPAC: Compressed Time-Domain, Pooling-Aware Convolution CNN Engine With Reduced Data Movement for Energy-Efficient AI Computing

In this work, we demonstrate a compressed time-domain, pooling-aware convolution (COMPAC) convolu... more In this work, we demonstrate a compressed time-domain, pooling-aware convolution (COMPAC) convolutional neural network (CNN) engine for energy-efficient edge AI computing by performing multi-bit input and multi-bit weight multiply-and-accumulate (MAC) operations in the time domain. The multi-bit inputs are compactly represented as a single pulsewidth encoded input. This translates into reduced switching capacitance ( CtextDYNC_{\text {DYN}}CtextDYN ), compared with the baseline digital implementation, and can enable low-power neural network computing in an edge device. COMPAC CNN engine employs a novel and an improved version of the memory delay line (MDL) supporting the time residue scaling to perform the signed accumulation of multi-bit input and multi-bit weight products in the time domain. The compressed time-domain (CTD) approach is proposed to improve the throughput in time encoding of the input activations. The simulation results of the proposed CTD approach on the AlexNet CNN over 1000 Im...

Bookmarks Related papers MentionsView impact

Research paper thumbnail of Galvanically Isolated, Power and Electromagnetic Side-Channel Attack Resilient Secure AES Core with Integrated Charge Pump based Power Management

2021 IEEE Custom Integrated Circuits Conference (CICC), 2021

A galvanic isolation (GI) technique for cryptographic cores is proposed to mitigate power and ele... more A galvanic isolation (GI) technique for cryptographic cores is proposed to mitigate power and electromagnetic (EM) sidechannel analysis (SCA) attacks. The design uses deep N-well technology and an integrated charge pump-based power delivery and management to completely isolate VCC, VSS, and substrate nodes from the external supply and ground pins, improving the SCA resilience due to supply as well as ground bounce. Measured results from a 128-bit Advanced Encryption Standard (AES) core implemented in a 40nm CMOS show gt600mathrmx\gt600\mathrm{x}gt600mathrmx and gt220mathrmx\gt220\mathrm{x}gt220mathrmx improvement against a correlation power analysis (CPA) and coarse-grained EM SCA attack, respectively, while operating at 20% lower frequency, consuming 2.3x more power, and occupying 0.0136 mm2 larger area.

Bookmarks Related papers MentionsView impact

Research paper thumbnail of 14.4 All-Digital Time-Domain CNN Engine Using Bidirectional Memory Delay Lines for Energy-Efficient Edge Computing

2019 IEEE International Solid- State Circuits Conference - (ISSCC), 2019

Bookmarks Related papers MentionsView impact

Research paper thumbnail of Design of versatile modulator using CDTAs

2013 International Conference on Emerging Trends in Communication, Control, Signal Processing and Computing Applications (C2SPCA), 2013

ABSTRACT In this paper, a versatile modulator circuit that can be used for amplitude modulation, ... more ABSTRACT In this paper, a versatile modulator circuit that can be used for amplitude modulation, frequency modulation, delta modulation and sigma delta modulation has been proposed. The circuit has been implemented with the help of CDTA blocks, grounded capacitor and resistors. The circuit basically consists of an integrator, a Schmitt trigger and an inherent clock which is used as a carrier signal thereby obviating the need for an external clock. Due to its compact size, it is suitable for IC realisation. The performance comparison of different modulation modes is made in terms of power dissipation, signal to noise ratio (SNR) and output noise voltage.

Bookmarks Related papers MentionsView impact

Research paper thumbnail of Realization of DVCCTA Based Versatile Modulator

Active and Passive Electronic Components, 2014

A Differential Voltage Current Conveyor Transconductance Amplifier (DVCCTA) based versatile modul... more A Differential Voltage Current Conveyor Transconductance Amplifier (DVCCTA) based versatile modulator is proposed which can work as an amplitude modulator, frequency modulator, delta modulator, and sigma delta modulator. The modulator operational scheme uses pulse generator as a core and its output is used as carrier signal. A DVCCTA based pulse generator is proposed first and subsequently configured as different modulators. Compact realization is the key feature of the proposed circuit as it uses two DVCCTA; a grounded resistor and a grounded capacitor hence are appropriate for IC realization. The functionality of the proposed circuit is verified through SPICE simulations using TSMC 0.25 μm CMOS process model parameters. The performance parameters such as power dissipation and noise for various modulator schemes are also obtained.

Bookmarks Related papers MentionsView impact

Research paper thumbnail of Realization of CDTA based frequency agile filter

2013 IEEE International Conference on Signal Processing, Computing and Control (ISPCC), 2013

ABSTRACT This paper presents frequency agile filter based on current difference transconductance ... more ABSTRACT This paper presents frequency agile filter based on current difference transconductance amplifier (CDTA). The agile filters used in this work provide high agilty, tunability and quality factor while they are fully integrated configurations and not discrete systems. The use of grounded capacitors and resistor makes these structures suitable for integration. The functional verification is exhibited through extensive SPICE simulations using 0.25μm TSMC CMOS technology model parameters. The performance evaluation is made in terms of power dissipation, signal to noise ratio (SNR) and output noise.

Bookmarks Related papers MentionsView impact

Research paper thumbnail of Design of CDTA and VDTA Based Frequency Agile Filters

Advances in Electronics, 2014

This paper presents frequency agile filters based on current difference transconductance amplifie... more This paper presents frequency agile filters based on current difference transconductance amplifier (CDTA) and voltage difference transconductance amplifier (VDTA). The proposed agile filter configurations employ grounded passive components and hence are suitable for integration. Extensive SPICE simulations using 0.25 μm TSMC CMOS technology model parameters are carried out for functional verification. The proposed configurations are compared in terms of performance parameters such as power dissipation, signal to noise ratio (SNR), and maximum output noise voltage.

Bookmarks Related papers MentionsView impact

Research paper thumbnail of CDTA based semi Gaussian shapers for detector readout front ends

2013 International Conference on Circuits, Power and Computing Technologies (ICCPCT), 2013

ABSTRACT This paper presents voltage and current mode semi Gaussian (S-G) shaper circuits based o... more ABSTRACT This paper presents voltage and current mode semi Gaussian (S-G) shaper circuits based on current difference transconductance amplifier (CDTA). The use of grounded capacitors makes these structures suitable for integration. The functional verification is exhibited through extensive SPICE simulations using 0.25 µm TSMC CMOS technology model parameters. The performance comparison is made in terms of total harmonic distortion, power dissipation, dynamic range, signal to noise ratio (SNR) and output noise.

Bookmarks Related papers MentionsView impact

Research paper thumbnail of DDCCTA based semi Gaussian shapers for detector readout front ends

2013 International Conference on Circuits, Power and Computing Technologies (ICCPCT), 2013

Bookmarks Related papers MentionsView impact

Research paper thumbnail of 16.2 eDRAM-CIM: Compute-In-Memory Design with Reconfigurable Embedded-Dynamic-Memory Array Realizing Adaptive Data Converters and Charge-Domain Computing

2021 IEEE International Solid- State Circuits Conference (ISSCC), 2021

Bookmarks Related papers MentionsView impact

Research paper thumbnail of M2A2: Microscale Modular Assembled ASICs for High-Mix, Low-Volume, Heterogeneously Integrated Designs

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

With CMOS process technology scaling, the mask cost for fabricating nano-scale transistors, conta... more With CMOS process technology scaling, the mask cost for fabricating nano-scale transistors, contacts and interconnects has become prohibitively expensive especially for low volume designs. Moreover, higher transistor density has resulted in higher design complexity and large-sized die, which has led to an increase in the design cycle time and degradation in the process yield. These challenges are forcing low-volume ASICs (Application Specific Integrated Circuits) towards highly sub-optimal FPGAs (Field Programmable Gate Arrays). In this paper, we propose a new approach for designing and fabricating high-mix, low-volume heterogeneously integrated ASICs, referred to as Microscale Modular Assembled ASIC (M2A2), consisting of (1) pick-and-place assembly of prefabricated blocks (PFBs) which utilizes the nano-precision placement capabilities developed in Jet-and-Flash Imprint Lithography (J-FIL) and, (2) EDA design methodology utilizing unsupervised learning and graph-matching techniques. The EDA methodology leverages existing CAD tools infrastructure for easy adoption into the current EDA ecosystem. The proposed fabrication technology makes use of pick-and-place assembly technique to allow nano-precise assembly of PFBs. The PFBs can be fabricated in advanced process nodes and then knitted together on a wafer substrate. Custom-designed low-cost back-end metal layers can then be created/placed on top of the PFB knitted layer to realize a variety of high-mix, low-volume ASIC designs. M2A2 would allow more flexibility in front-end design by optimal PFB selection and knitting compared to the earlier proposed approaches such as structured ASICs (sASICs). In this paper, the performance of M2A2 based designs is compared with different design technologies such as baseline ASICs, FPGAs and sASICs at 16nm, 40nm and 130nm CMOS process nodes. 
The post-PNR simulation results achieved over 15 IWLS benchmarks show that the proposed M2A2 designs achieve 27.11x-34.89x reduced Power-Delay-Product (PDP) compared to FPGAs, and incur 1.69x-2.36x larger area compared to the baseline ASICs. The M2A2 designs achieve 15%-68.5% smaller area and 8.5%-52% higher performance compared to the sASIC methodologies. Moreover, the key fabrication steps in the proposed M2A2 technology are presented. The experimental fab results along with the proposed EDA flow simulations show promising results for the proposed M2A2 technology. Design trade-offs and process challenges for large scale deployment of the M2A2 technology are discussed along with their mitigation strategies.

Bookmarks Related papers MentionsView impact

Research paper thumbnail of A 12.08-TOPS/W All-Digital Time-Domain CNN Engine Using Bi-Directional Memory Delay Lines for Energy Efficient Edge Computing

IEEE Journal of Solid-State Circuits

In this article, we demonstrate an energy-efficient convolutional neural network (CNN) engine by performing multiply-and-accumulate (MAC) operations in the time domain. The multi-bit inputs are compactly represented as a single pulse-width-encoded input. This translates into reduced switching capacitance (CDYN) compared to a baseline digital implementation and can enable low-power neural network computing in an edge device. The time-domain CNN engine employs a novel bi-directional memory delay line (MDL) unit to perform signed accumulation of input and weight products. The proposed MDL design leverages standard digital circuits and does not require any capacitors or complex analog-to-digital converters (ADCs) to realize the convolution operation, thereby enabling easy scaling across process technology nodes. Four speed-up modes and a configurable MDL length are supported to address the throughput-versus-accuracy trade-off of the time-domain computing approach. Delay calibration units are included to mitigate the process-variation-induced delay mismatch among concurrently operating MDL units. The proposed time-domain MDL design implements a LeNet-5 CNN engine in a commercial 40-nm CMOS process, achieving an energy efficiency of 12.08 TOPS/W and a throughput of 0.365 GOPS at 537 mV in the 16× speed-up mode. 40-nm CMOS test-chip measurements over 100 MNIST images show 97% classification accuracy. Simulation results over all 10000 MNIST validation dataset images, taking into account the circuit non-ideal effects of the MDL-based time-domain approach, show a classification accuracy of 98.42%. The test chip is operational down to the near-threshold voltage (up to 375 mV) while maintaining a classification accuracy above 90% in the 1× speed-up mode. Furthermore, two methods of scaling MDLs to multi-bit weights are proposed. Simulation results for 1000-class AlexNet over 50000 ImageNet validation dataset images show a classification accuracy loss within 1% when compared with a software implementation. The proposed MDL-based time-domain approach, performing 1-bit/8-bit weight and 8-bit input MAC operations, shows 2.09×-2.32× higher energy efficiency and 2.22×-3.45× smaller area than the corresponding baseline digital implementations.

Realization of S-G Shapers for Detector Readout Front Ends

Signal Processing

This paper presents the realization of semi-Gaussian (S-G) shaper circuits for detector readout front ends. The circuits are based on the current difference transconductance amplifier (CDTA) and the differential-difference current conveyor transconductance amplifier (DDCCTA) and can operate in voltage and current mode. The proposed circuits use grounded capacitors, which makes them suitable for integration. The theoretical propositions are verified through extensive SPICE simulations using 0.25 μm TSMC CMOS technology model parameters. The performance of the proposed topologies is compared in terms of total harmonic distortion, power dissipation, dynamic range, signal-to-noise ratio (SNR), and output noise. A performance comparison of CDTA, DDCCTA, CCII, and OTA based shapers shows that the proposed CDTA and DDCCTA based shapers have advantageous performance features over existing CCII and OTA based shapers.

A Review on Maximum Power Point Tracking Techniques for Photovoltaic Power Generating Systems

Realization of CDTA Based Frequency Agile Filter

IEEE International Conference on Signal Processing, Computing and Control (ISPCC), 2013

This paper presents a frequency agile filter based on the current difference transconductance amplifier (CDTA). The agile filters used in this work provide high agility, tunability, and quality factor, and they are fully integrated configurations rather than discrete systems. The use of grounded capacitors and resistors makes these structures suitable for integration. The functional verification is carried out through extensive SPICE simulations using 0.25 µm TSMC CMOS technology model parameters. The performance evaluation is made in terms of power dissipation, signal-to-noise ratio (SNR), and output noise.

Design of Versatile Modulator using CDTAs

IEEE International Conference on Emerging Trends in Communication, Control, Signal Processing and Computing Applications (C2SPCA), 2013

In this paper, a versatile modulator circuit that can be used for amplitude modulation, frequency modulation, delta modulation, and sigma delta modulation is proposed. The circuit is implemented with CDTA blocks, a grounded capacitor, and resistors. It consists of an integrator, a Schmitt trigger, and an inherent clock that serves as the carrier signal, thereby obviating the need for an external clock. Due to its compact size, it is suitable for IC realisation. The performance of the different modulation modes is compared in terms of power dissipation, signal-to-noise ratio (SNR), and output noise voltage. Index Terms: Current difference transconductance amplifier (CDTA), Amplitude Modulation, Frequency Modulation, Delta Modulation, Sigma Delta Modulation.

MPPT Techniques for Photovoltaic System under Uniform Insolation and Partial Shading Conditions

IEEE Students Conference on Engineering and Systems, 2012

Photovoltaic energy is one of the renewable energies that has attracted the attention of researchers in recent decades. Photovoltaic generators exhibit nonlinear I-V and P-V characteristics, and the maximum power produced varies with solar insolation and temperature. Maximum power point tracking (MPPT) control techniques are therefore required to extract the maximum available power from PV arrays. Under partial shading, the characteristics of a PV system change considerably and often exhibit several local maxima with one global maximum. Conventional MPPT techniques can easily be trapped at a local maximum under partial shading, which significantly reduces the energy yield of the PV system. In this paper, various MPPT algorithms for uniform insolation and partial shading conditions are reviewed with their merits and demerits. In addition, some new algorithms are proposed which are shown to be more efficient than the existing ones.

DDCCTA Based Semi Gaussian Shapers for Detector Readout Front Ends

IEEE International Conference on Circuits, Power and Computing Technologies (ICCPCT), 2013

This paper presents a semi-Gaussian (S-G) shaper circuit based on the differential difference current conveyor transconductance amplifier (DDCCTA). The grounded capacitors make the realization suitable for integration. The proposed circuits are verified for functionality through extensive SPICE simulations using 0.25 μm TSMC CMOS technology model parameters. The performance comparison is made in terms of total harmonic distortion, power dissipation, dynamic range, signal-to-noise ratio (SNR), and output noise. Index Terms: Current mode, Differential difference current conveyor transconductance amplifier (DDCCTA), Semi Gaussian shaper, Voltage mode.
