Mudit Bhargava - Academia.edu
Papers by Mudit Bhargava
This paper presents a physics-based modeling framework for the analysis and transient simulation of circuits containing Spin-Transfer Torque (STT) Magnetic Tunnel Junction (MTJ) devices. The framework provides the tools to analyze the stochastic behavior of MTJs and to generate Verilog-A compact models for their simulation in large VLSI designs, addressing the need for an industry-ready model accounting for real-world reliability and scalability requirements. Device dynamics are described by the stochastic Landau-Lifshitz-Gilbert-Slonczewski (s-LLGS) magnetization equation, considering Voltage-Controlled Magnetic Anisotropy (VCMA) and the non-negligible statistical effects caused by thermal noise. Model behavior is validated against the OOMMF magnetic simulator and its performance is characterized on a 1-Mb 28 nm Magnetoresistive-RAM (MRAM) memory product.
2009 IEEE Custom Integrated Circuits Conference, 2009
Low-Overhead, Digital Offset Compensated, SRAM Sense Amplifiers. Mudit Bhargava¹, Mark P. McCartney¹, Alexander Hoefler², and Ken Mai¹. ¹Electrical and Computer Engineering Department, Carnegie Mellon University ...
2009 IEEE International Conference on Computer Design, 2009
A designer's intent and knowledge about the critical issues and trade-offs underlying a custom circuit design are implicit in the simulations she sets up for design creation and verification. However, this knowledge is tightly conjoined with technology-specific features and decoupled from the final schematic in traditional design flows. As a result, this knowledge is easily lost when the technology specifics change. This paper presents a Technology Agnostic Simulation Environment (TASE), which is a tool that uses simulation templates to capture the designer's knowledge and separate it from the technology-specific components of a simulation. TASE also allows the designer to form groups of related simulations and port them as a unit to a new technology. This allows an actual design schematic to remain tied to the analyses that illuminate the underlying trade-offs and design issues, unlike the case where schematics are ported alone. Giving the designer immediate access to the trade-offs, which are likely to change in new technologies, accelerates the redesign that often must accompany porting of complicated custom circuits. We demonstrate the usefulness of TASE by investigating Read and Write noise margins for a 6T SRAM in predictive technologies down to 16 nm.
Proceedings of the 47th Design Automation Conference on - DAC '10, 2010
SRAM design in scaled technologies requires knowledge of phenomena at the process, circuit, and architecture level. Decisions made at various levels of the design hierarchy affect the global figures of merit (FoMs) of an SRAM, such as performance, power, area, and yield. However, the lack of a quick mechanism to understand the impact of changes at various levels of the
Achieving high reliability across environmental variations and over aging in physical unclonable functions (PUFs) remains a challenge for PUF designers. The conventional method to improve PUF reliability is to use powerful error correction codes (ECC) to correct the errors in the raw response from the PUF core. Unfortunately, these ECC blocks generally have high VLSI overheads, which scale up quickly with the error correction capability. Alternatively, researchers have proposed techniques to increase the reliability of the PUF core, and thus significantly reduce the required strength (and complexity) of the ECC. One method of increasing the reliability of the PUF core is to use normally detrimental IC aging effects to reinforce the desired (or "golden") response of the PUF by altering the PUF circuit characteristics permanently and hence making the PUF more reliable. In this work, we present a PUF response reinforcement technique based on hot carrier injection (HCI) which can reinforce the PUF golden response in short stress times (i.e., tens of seconds), without impacting the surrounding circuits, and that has high permanence (i.e., does not degrade significantly over aging). We present a self-contained HCI-reinforcement-enabled PUF circuit based on sense amplifiers (SA) which autonomously self-reinforces with minimal external intervention. We have fabricated a custom ASIC testchip in 65 nm bulk CMOS with the proposed PUF design. Measured results show high reliability across environmental variations and accelerated aging, as well as good uniqueness and randomness. For example, 1600 SA elements, after being HCI stressed for 125 s, show 100% reliability (zero errors) across ±20% voltage variations and a temperature range of -20 °C to 85 °C.
A configurable replica bitline (cRBL) technique for controlling sense-amplifier enable (SAE) timing for small-swing bitline SRAMs is described. Post-silicon selection of a subset of replica bitline driver cells from a statistically designed pool of cells facilitates precise SAE timing. An exponential reduction in timing variation is enabled by statistical selection of driver cells, which can provide 14x reduction in SAE timing uncertainty with 200x less area and power than a conventional RBL with equivalent variation control. We describe the post-silicon test and configuration methodology necessary for cRBLs. To demonstrate the efficacy of the proposed cRBL technique, we present measured results from a 90 nm bulk CMOS 64 kb SRAM testchip.
2019 Symposium on VLSI Circuits, 2019
We present a high-performance 6T SRAM architecture equipped with low-power features of late cancel, left-right enable, input-gating, and power-gating. Measurements show that these SRAMs can support CPUs running at 4 GHz while offering dynamic power savings of 17% and 6% for the caches and the system, respectively, and up to 21X static power system savings for the low-power implementation.
ArXiv, 2021
Magnetic Tunnel Junctions (MTJs) constitute the novel memory element in STT-MRAM, which is ramping to production at major foundries as an eFlash replacement. MTJ switching exhibits a stochastic behaviour due to thermal fluctuations, which is modelled by s-LLGS and Fokker-Planck (FP) equations. This work implements and benchmarks Finite Volume Method (FVM) and analytical solvers for the FP equation. To deploy an MTJ model for circuit design, it must be calibrated against silicon data. To address this challenge, this work presents a regression scheme to fit MTJ parameters to a given set of measured current, switching time and error rate data points, yielding a silicon-calibrated model suitable for MRAM macro transient simulation.
SRAM design in scaled technologies increasingly requires circuit innovations such as read/write assist techniques or alternative bitcells to ensure even basic functionality. However, the lack of a quick mechanism for understanding the impact of these circuit-level changes on system-level metrics makes accurate assessments of new circuit techniques difficult. Thus, we introduce Virtual Prototyper (ViPro), a tool that helps circuit designers explore this large design space by rapidly generating optimized virtual prototypes of complete SRAM macros. ViPro does this by allowing SRAM component specification with varying levels of detail, from 'black-box' descriptions to complete netlists, and by incorporating those components into a hierarchical model that captures circuit and architectural features of the SRAM to optimize a complete prototype. SRAM designers can use ViPro to generate base-case prototypes, which provide starting points for design space exploration, or to assess the impact of a low-level circuit innovation on the overall SRAM design.
3D integration, i.e., stacking of integrated circuit layers using parallel or sequential processing, is gaining rapid industry adoption with the slowdown of Moore's law scaling. 3D stacking promises potential gains in performance, power and cost, but the actual magnitude of gains varies depending on end-application, technology choices and design. In this talk, we will discuss some key challenges associated with 3D design and how design-for-3D will require us to break traditional silos of micro-architecture, circuit/physical design and manufacturing technology to work across abstractions to enable the gains promised by 3D technologies.
2019 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED)
Proceedings of the 56th Annual Design Automation Conference 2019
Journal of Parallel and Distributed Computing
2015 International Conference on ReConFigurable Computing and FPGAs (ReConFig), 2015
2015 IEEE International Symposium on Hardware Oriented Security and Trust (HOST), 2015
Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays - FPGA '11, 2011
NAND Flash memory has been widely used for data storage due to its high density, high throughput, low cost, and low power. However, as the storage cells become smaller and with more bits programmed per cell, they are expected to suffer from reduced reliability and ...