Kanwaldeep Sobti | Intel Corporation

Papers by Kanwaldeep Sobti

The scan-DFT features of AMD's next-generation microprocessor core

2010 IEEE International Test Conference, 2010

There is an ever-increasing demand for higher-performance microprocessors within a given power budget. This demand forces design choices that were once seen only in high-speed custom blocks to spread throughout the microprocessor core. These unique design ...

Accurate Models for Estimating Area and Power of FPGA Implementations

This paper presents accurate area and power estimation models for implementations using FPGAs from the Xilinx Virtex-2Pro family. These models are designed to facilitate efficient design space exploration in an automated algorithm-architecture codesign framework. Detailed models for accurately estimating the number of slices, block RAMs and 18x18-bit multipliers for fixed-point and floating-point IP cores have been developed. These models are also utilized to develop accurate power models that consider the effect of logic power, signal power, clock power and I/O power. In all cases, the model coefficients have been derived by using curve fitting or regression analysis. The modeling error for the IP cores is very small (average 0.95%). The error for fairly large examples such as floating-point implementations of 8-point FFTs is also quite small: 1.87% for estimating the number of slices and 3.48% for estimating power consumption.
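As a rough illustration of deriving model coefficients by regression, the sketch below fits a straight line to resource counts by ordinary least squares and reports the mean absolute percentage error. The linear model form, the function names, and the data points are all assumptions made up for this example; they are not the models or measurements from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def mean_abs_pct_error(xs, ys, a, b):
    """Mean absolute percentage error of the fitted line on (xs, ys)."""
    errors = [abs((a + b * x) - y) / y for x, y in zip(xs, ys)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical data (NOT from the paper): operand bit width versus
# slice count for an imaginary IP core, constructed as 5 + 2 * width.
widths = [8, 16, 24, 32]
slices = [21, 37, 53, 69]

a, b = fit_line(widths, slices)
error_pct = mean_abs_pct_error(widths, slices, a, b)
print(f"slices ~= {a:.2f} + {b:.2f} * width (error {error_pct:.2f}%)")
```

With real synthesis reports, the same fit would be repeated per IP core and per resource type, and a higher-order or piecewise model could be substituted where a line fits poorly.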

An Automated Framework for Accelerating Numerical Algorithms on Reconfigurable Platforms Using Algorithmic/Architectural Optimization

This paper describes TANOR, an automated framework for designing hardware accelerators for numerical computation on reconfigurable platforms. Applications utilizing numerical algorithms on large data sets require high-throughput computation platforms. The focus is on N-body interaction problems, which have a wide range of applications spanning from astrophysics to molecular dynamics. The TANOR design flow starts with a MATLAB description of a particular interaction function, its parameters, and certain architectural constraints specified through a graphical user interface. Subsequently, TANOR automatically generates a configuration bitstream for a target FPGA along with associated drivers and control software necessary to direct the application from a host PC. Architectural exploration is facilitated through support for fully custom fixed-point and floating-point representations in addition to standard number representations such as single-precision floating point. More...

TANOR: A Tool for Accelerating N-Body Simulations on Reconfigurable Platform

Algorithm-architecture co-exploration is hindered by the lack of efficient tools. As a consequence, designers are currently able to explore only a limited set of points in the whole design space. Therefore, a tool that can allow fast exploration of algorithmic and architectural tradeoffs in an automated manner is highly desired. In this paper, we

Geometric Tiling for Reducing Power Consumption in Structured Matrix Operations

2006 IEEE International SOC Conference, 2006

This work focuses on reducing power consumption while maintaining the efficiency and accuracy of matrix computations using both algorithmic and architectural means. We transform the algorithms, in adaptation to application specifics, to translate the matrix structures into power saving potential via geometric tiling. Instead of using blind tiling, we index and partition matrix elements according to the underlying geometry to claim a better estimate and control of numerical range within and across geometric tiles, which can then be exploited for power saving.

TANOR: A Tool for Accelerating N-Body Simulations on Reconfigurable Platform

2007 International Conference on Field Programmable Logic and Applications, 2007

Algorithm-architecture co-exploration is hindered by the lack of efficient tools. As a consequence, designers are currently able to explore only a limited set of points in the whole design space. Therefore, a tool that can allow fast exploration of algorithmic and architectural tradeoffs in an automated manner is highly desired. In this paper, we describe TANOR, an automated tool for designing hardware accelerators for the class of N-body interaction problems. The design flow, starting from a high-level (MATLAB) description, configures the entire system automatically. We describe the design of TANOR and demonstrate the effectiveness and adaptability of our tool using three different target applications, namely, the gravitational kernel used in astrophysics, the Gaussian kernel common in image processing applications, and a force calculation kernel applied in molecular dynamics. Our results demonstrate that TANOR generates hardware accelerators that are competitive with existing custom accelerators.

Efficient Function Evaluations with Lookup Tables for Structured Matrix Operations

2007 IEEE Workshop on Signal Processing Systems, 2007

A hardware-efficient approach is introduced for elementary function evaluations in certain structured matrix computations. It is a comprehensive approach that utilizes lookup tables for compactness, employs interpolations with adders and multipliers for their adaptivity to non-tabulated values and, more distinctively, exploits the function properties and the matrix structures to claim better control over numerical dynamic ranges. We demonstrate the

Structural tests of slave clock gating in low-power flip-flop

29th VLSI Test Symposium, 2011

A novel slave clock-gating technique in [5] is designed to save power when the master and slave latches of a low-power flip-flop reach certain correlated states (e.g., both latches are at logic 0 or 1). Testing this clock-gating circuit is essential for power-sensitive ...

Accurate models for estimating area and power of FPGA implementations

2008 IEEE International Conference on Acoustics, Speech and Signal Processing, 2008

This paper presents accurate area and power estimation models for implementations using FPGAs from the Xilinx Virtex-2Pro family. These models are designed to facilitate efficient design space exploration in an automated algorithm-architecture codesign framework. ...

Structural test of power-only defects: ATPG or ad-hoc?

2009 International Test Conference, 2009

A power-only defect (POD) never causes functional failures but only increases power consumption. In other words, a device with PODs works as expected in terms of functionality, and increased power consumption in mission modes is the only external indication. The ...

Accurate Area, Time and Power Models for FPGA-Based Implementations

Journal of Signal Processing Systems, 2011

This paper presents accurate area, time, and power estimation models for implementations using FPGAs from the Xilinx Virtex-2Pro family (Deng et al. 2008). These models are designed to facilitate efficient design space exploration in an automated algorithm-architecture codesign framework. Detailed models for estimating the number of slices, block RAMs and 18×18-bit multipliers for fixed-point and floating-point IP cores have

An Automated Framework for Accelerating Numerical Algorithms on Reconfigurable Platforms Using Algorithmic/Architectural Optimization

IEEE Transactions on Computers, 2000

This paper describes TANOR, an automated framework for designing hardware accelerators for numerical computation on reconfigurable platforms. Applications utilizing numerical algorithms on large data sets require high-throughput computation platforms. The focus is on N-body interaction problems, which have a wide range of applications spanning from astrophysics to molecular dynamics. The TANOR design flow starts with a MATLAB description of a particular interaction function, its parameters, and certain architectural constraints specified through a graphical user interface. Subsequently, TANOR automatically generates a configuration bitstream for a target FPGA along with associated drivers and control software necessary to direct the application from a host PC. Architectural exploration is facilitated through support for fully custom fixed-point and floating-point representations in addition to standard number representations such as single-precision floating point. Moreover, TANOR enables joint exploration of algorithmic and architectural variations in realizing efficient hardware accelerators. TANOR's capabilities have been demonstrated for three different N-body interaction applications: the calculation of gravitational potential in astrophysics, the diffusion or convolution with a Gaussian kernel common in image processing applications, and the force calculation with a vector-valued kernel function in molecular dynamics simulation. Experimental results show that TANOR-generated hardware accelerators achieve lower resource utilization without compromising numerical accuracy, in comparison to other existing custom accelerators.
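For readers unfamiliar with the N-body interaction pattern, here is a minimal software sketch of the first kernel named above, the gravitational potential, computed by direct summation over all pairs. It is a plain reference loop written for illustration, not TANOR's generated hardware or its MATLAB input format; the function name and the unit choice G = 1 are assumptions.

```python
import math


def gravitational_potential(positions, masses, G=1.0):
    """Direct-sum potential: phi_i = -G * sum_{j != i} m_j / |r_i - r_j|.

    positions is a list of coordinate tuples, masses a list of floats.
    The double loop costs O(N^2) kernel evaluations.
    """
    n = len(positions)
    phi = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # skip self-interaction (zero distance)
            r = math.dist(positions[i], positions[j])
            phi[i] -= G * masses[j] / r
    return phi


# Two hypothetical bodies, 2 units apart, with masses 1 and 4.
phi = gravitational_potential([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, 4.0])
print(phi)
```

Swapping the expression inside the inner loop for a Gaussian or a force kernel gives the other two interaction types, which is why the pairwise loop structure lends itself to a shared accelerator template.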

The scan-DFT features of AMD's next-generation microprocessor core

… (ITC), 2010 IEEE …, 2010

There is an ever-increasing demand for higher-performance microprocessors within a given power budget. This demand forces design choices that were once seen only in high-speed custom blocks to spread throughout the microprocessor core. These unique design ...
