Mudit Bhargava - Academia.edu

Papers by Mudit Bhargava

A Compact Model for Scalable MTJ Simulation

This paper presents a physics-based modeling framework for the analysis and transient simulation of circuits containing Spin-Transfer Torque (STT) Magnetic Tunnel Junction (MTJ) devices. The framework provides the tools to analyze the stochastic behavior of MTJs and to generate Verilog-A compact models for their simulation in large VLSI designs, addressing the need for an industry-ready model accounting for real-world reliability and scalability requirements. Device dynamics are described by the stochastic Landau-Lifshitz-Gilbert-Slonczewski (s-LLGS) magnetization equation, considering Voltage-Controlled Magnetic Anisotropy (VCMA) and the non-negligible statistical effects caused by thermal noise. Model behavior is validated against the OOMMF magnetic simulator and its performance is characterized on a 1-Mb 28 nm Magnetoresistive-RAM (MRAM) memory product.

Mai: A High Reliability PUF Using Hot Carrier Injection Based Response Reinforcement

Achieving high reliability across environmental variations and over aging in physical unclonable functions (PUFs) remains a challenge for PUF designers. The conventional method to improve PUF reliability is to use powerful error correction codes (ECC) to correct the errors in the raw response from the PUF core. Unfortunately, these ECC blocks generally have high VLSI overheads, which scale up quickly with the error correction capability. Alternately, researchers have proposed techniques to increase the reliability of the PUF core, and thus significantly reduce the required strength (and complexity) of the ECC. One method of increasing the reliability of the PUF core is to use normally detrimental IC aging effects to reinforce the desired (or “golden”) response of the PUF by altering the PUF circuit characteristics permanently and hence making the PUF more reliable. In this work, we present a PUF response reinforcement technique based on hot carrier injection (HCI) which ...

Low-overhead, digital offset compensated, SRAM sense amplifiers

2009 IEEE Custom Integrated Circuits Conference, 2009

Mudit Bhargava¹, Mark P. McCartney¹, Alexander Hoefler², and Ken Mai¹. ¹Electrical and Computer Engineering Department, Carnegie Mellon University ...

A Technology-Agnostic Simulation Environment (TASE) for iterative custom IC design across processes

2009 IEEE International Conference on Computer Design, 2009

A designer's intent and knowledge about the critical issues and trade-offs underlying a custom circuit design are implicit in the simulations she sets up for design creation and verification. However, this knowledge is tightly conjoined with technology-specific features and decoupled from the final schematic in traditional design flows. As a result, this knowledge is easily lost when the technology specifics change. This paper presents a Technology Agnostic Simulation Environment (TASE), which is a tool that uses simulation templates to capture the designer's knowledge and separate it from the technology-specific components of a simulation. TASE also allows the designer to form groups of related simulations and port them as a unit to a new technology. This allows an actual design schematic to remain tied to the analyses that illuminate the underlying trade-offs and design issues, unlike the case where schematics are ported alone. Giving the designer immediate access to the trade-offs, which are likely to change in new technologies, accelerates the redesign that often must accompany porting of complicated custom circuits. We demonstrate the usefulness of TASE by investigating Read and Write noise margins for a 6T SRAM in predictive technologies down to 16 nm.

Virtual prototyper (ViPro)

Proceedings of the 47th Design Automation Conference (DAC '10), 2010

SRAM design in scaled technologies requires knowledge of phenomena at the process, circuit, and architecture level. Decisions made at various levels of the design hierarchy affect the global figures of merit (FoMs) of an SRAM, such as performance, power, area, and yield. However, the lack of a quick mechanism to understand the impact of changes at various levels of the ...

A high reliability PUF using hot carrier injection based response reinforcement

Achieving high reliability across environmental variations and over aging in physical unclonable functions (PUFs) remains a challenge for PUF designers. The conventional method to improve PUF reliability is to use powerful error correction codes (ECC) to correct the errors in the raw response from the PUF core. Unfortunately, these ECC blocks generally have high VLSI overheads, which scale up quickly with the error correction capability. Alternately, researchers have proposed techniques to increase the reliability of the PUF core, and thus significantly reduce the required strength (and complexity) of the ECC. One method of increasing the reliability of the PUF core is to use normally detrimental IC aging effects to reinforce the desired (or "golden") response of the PUF by altering the PUF circuit characteristics permanently and hence making the PUF more reliable. In this work, we present a PUF response reinforcement technique based on hot carrier injection (HCI) which can reinforce the PUF golden response in short stress times (i.e., tens of seconds), without impacting the surrounding circuits, and that has high permanence (i.e., does not degrade significantly over aging). We present a self-contained HCI-reinforcement-enabled PUF circuit based on sense amplifiers (SA) which autonomously self-reinforces with minimal external intervention. We have fabricated a custom ASIC testchip in 65nm bulk CMOS with the proposed PUF design. Measured results show high reliability across environmental variations and accelerated aging, as well as good uniqueness and randomness. For example, 1600 SA elements, after being HCI stressed for 125 s, show 100% reliability (zero errors) across ±20% voltage variations and a temperature range of -20 °C to 85 °C.

Variation-Tolerant SRAM Sense-Amplifier Timing Using Configurable Replica Bitlines

IEEE 2008 Custom Integrated Circuits Conference (CICC)

A configurable replica bitline (cRBL) technique for controlling sense-amplifier enable (SAE) timing for small-swing bitline SRAMs is described. Post-silicon selection of a subset of replica bitline driver cells from a statistically designed pool of cells facilitates precise SAE timing. An exponential reduction in timing variation is enabled by statistical selection of driver cells, which can provide 14x reduction in SAE timing uncertainty with 200x less area and power than a conventional RBL with equivalent variation control. We describe the post-silicon test and configuration methodology necessary for cRBLs. To demonstrate the efficacy of the proposed cRBL technique, we present measured results from a 90nm bulk CMOS 64kb SRAM testchip.

A 4GHz 16nm SRAM Architecture with Low-Power Features for Heterogeneous Computing Platforms

2019 Symposium on VLSI Circuits, 2019

We present a high-performance 6T SRAM architecture equipped with low-power features of late cancel, left-right enable, input-gating, and power-gating. Measurements show that these SRAMs can support CPUs running at 4GHz while offering dynamic power savings of 17% and 6% for the caches and the system, respectively, and up to 21X static power system savings for the low-power implementation.

A Fokker-Planck Solver to Model MTJ Stochasticity

arXiv, 2021

Magnetic Tunnel Junctions (MTJs) constitute the novel memory element in STT-MRAM, which is ramping to production at major foundries as an eFlash replacement. MTJ switching exhibits a stochastic behaviour due to thermal fluctuations, which is modelled by s-LLGS and Fokker-Planck (FP) equations. This work implements and benchmarks Finite Volume Method (FVM) and analytical solvers for the FP equation. To deploy an MTJ model for circuit design, it must be calibrated against silicon data. To address this challenge, this work presents a regression scheme to fit MTJ parameters to a given set of measured current, switching time and error rate data points, yielding a silicon-calibrated model suitable for MRAM macro transient simulation.

An SRAM Prototyping Tool for Rapid Sub-32 nm Design Exploration and Optimization

SRAM design in scaled technologies increasingly requires circuit innovations such as read/write assist techniques or alternative bitcells to ensure even basic functionality. However, the lack of a quick mechanism for understanding the impact of these circuit-level changes on system-level metrics makes accurate assessments of new circuit techniques difficult. Thus, we introduce Virtual Prototyper (ViPro), a tool that helps circuit designers explore this large design space by rapidly generating optimized virtual prototypes of complete SRAM macros. ViPro does this by allowing SRAM component specification with varying levels of detail, from 'black-box' descriptions to complete netlists, and by incorporating those components into a hierarchical model that captures circuit and architectural features of the SRAM to optimize a complete prototype. SRAM designers can use ViPro to generate base-case prototypes, which provide starting points for design space exploration, or to assess the impact of a low-level circuit innovation on the overall SRAM design.

Stack up your chips: Betting on 3D integration to augment Moore's Law scaling

3D integration, i.e., stacking of integrated circuit layers using parallel or sequential processing, is gaining rapid industry adoption with the slowdown of Moore's law scaling. 3D stacking promises potential gains in performance, power and cost, but the actual magnitude of gains varies depending on end-application, technology choices and design. In this talk, we will discuss some key challenges associated with 3D design and how design-for-3D will require us to break traditional silos of micro-architecture, circuit/physical design and manufacturing technology to work across abstractions to enable the gains promised by 3D technologies.

Enhanced 3D Implementation of an Arm® Cortex®-A Microprocessor

2019 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED)

On-Chip Memory Technology Design Space Explorations for Mobile Deep Neural Network Accelerators

Proceedings of the 56th Annual Design Automation Conference 2019

Secure hardware-entangled field programmable gate arrays

Journal of Parallel and Distributed Computing

Sense amplifier providing low capacitance with reduced resolution time

Deeply hardware-entangled reconfigurable logic and interconnect

2015 International Conference on ReConFigurable Computing and FPGAs (ReConFig), 2015

Robust true random number generator using hot-carrier injection balanced metastable sense amplifiers

2015 IEEE International Symposium on Hardware Oriented Security and Trust (HOST), 2015

FPGA-based NAND flash memory error characterization and solid-state drive prototyping platform (abstract only)

Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays - FPGA '11, 2011

NAND Flash memory has been widely used for data storage due to its high density, high throughput, low cost, and low power. However, as the storage cells become smaller and with more bits programmed per cell, they are expected to suffer from reduced reliability and ...

SRAM and DRAM Memory Retention and Remanence Characterization
