Kanwaldeep Sobti | Intel Corporation
Papers by Kanwaldeep Sobti
2010 IEEE International Test Conference, 2010
There is an ever-increasing demand for higher-performance microprocessors within a given power budget. This demand forces design choices that were once seen only in high-speed custom blocks to spread throughout the microprocessor core. These unique design ...
This paper presents accurate area and power estimation models for implementations using FPGAs from the Xilinx Virtex-2Pro family. These models are designed to facilitate efficient design space exploration in an automated algorithm-architecture codesign framework. Detailed models for accurately estimating the number of slices, block RAMs and 18x18-bit multipliers for fixed-point and floating-point IP cores have been developed. These models are also used to develop accurate power models that consider the effect of logic power, signal power, clock power and I/O power. In all cases, the model coefficients have been derived by curve fitting or regression analysis. The modeling error for the IP cores is very small (average 0.95%). The error for fairly large examples, such as a floating-point implementation of 8-point FFTs, is also quite small: 1.87% for estimating the number of slices and 3.48% for estimating power consumption.
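As an illustration of the regression step mentioned in this abstract, the following is a minimal sketch of fitting a resource model by ordinary least squares. It is not the authors' model: the linear form, the function names, and the data points are all hypothetical, and the data is generated from a known line purely so the fit can be checked.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_abs_pct_error(xs, ys, slope, intercept):
    """Average modeling error of the fitted line, in percent."""
    errors = [abs((slope * x + intercept) - y) / y for x, y in zip(xs, ys)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical data: operand bit widths and synthetic "slice counts"
# generated from slices = 3 * width + 10 (NOT measured FPGA results).
widths = [8, 16, 24, 32]
slices = [3 * w + 10 for w in widths]

slope, intercept = fit_line(widths, slices)
error_pct = mean_abs_pct_error(widths, slices, slope, intercept)
```

A real model would be fitted to slice counts reported by synthesis runs, and the percentage error would then be computed on designs held out from the fit.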
This paper describes TANOR, an automated framework for designing hardware accelerators for numerical computation on reconfigurable platforms. Applications utilizing numerical algorithms on large-size data sets require high-throughput computation platforms. The focus is on N-body interaction problems, which have a wide range of applications spanning from astrophysics to molecular dynamics. The TANOR design flow starts with a MATLAB description of a particular interaction function, its parameters, and certain architectural constraints specified through a graphical user interface. Subsequently, TANOR automatically generates a configuration bitstream for a target FPGA along with associated drivers and control software necessary to direct the application from a host PC. Architectural exploration is facilitated through support for fully custom fixed-point and floating-point representations in addition to standard number representations such as single-precision floating point. ...
Algorithm-architecture co-exploration is hindered by the lack of efficient tools. As a consequence, designers are currently able to explore only a limited set of points in the whole design space. Therefore, a tool that can allow fast exploration of algorithmic and architectural tradeoffs in an automated manner is highly desired. In this paper, we ...
2006 IEEE International SOC Conference, 2006
This work focuses on reducing power consumption while maintaining the efficiency and accuracy of matrix computations using both algorithmic and architectural means. We transform the algorithms, in adaptation to application specifics, to translate the matrix structures into power-saving potential via geometric tiling. Instead of using blind tiling, we index and partition matrix elements according to the underlying geometry to obtain a better estimate and control of the numerical range within and across geometric tiles, which can then be exploited for power saving.
2007 International Conference on Field Programmable Logic and Applications, 2007
Algorithm-architecture co-exploration is hindered by the lack of efficient tools. As a consequence, designers are currently able to explore only a limited set of points in the whole design space. Therefore, a tool that can allow fast exploration of algorithmic and architectural tradeoffs in an automated manner is highly desired. In this paper, we describe TANOR, an automated tool for designing hardware accelerators for the class of N-body interaction problems. The design flow, starting from a high-level (MATLAB) description, configures the entire system automatically. We describe the design of TANOR and demonstrate the effectiveness and adaptability of our tool using three different target applications, namely, the gravitational kernel used in astrophysics, the Gaussian kernel common in image processing applications, and a force calculation kernel applied in molecular dynamics. Our results demonstrate that TANOR generates hardware accelerators that are competitive with existing custom accelerators.
2007 IEEE Workshop on Signal Processing Systems, 2007
A hardware-efficient approach is introduced for elementary function evaluations in certain structured matrix computations. It is a comprehensive approach that utilizes lookup tables for compactness, employs interpolations with adders and multipliers for their adaptivity to non-tabulated values and, more distinctively, exploits the function properties and the matrix structures to claim better control over numerical dynamic ranges. We demonstrate the ...
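The combination of a lookup table with interpolation described in this abstract can be sketched as follows. This is a generic software illustration, not the paper's hardware design: the uniform table spacing, the use of linear interpolation, the choice of exp(-x) on [0, 4], and all names are assumptions made for the example.

```python
import math


def build_table(func, lo, hi, entries):
    """Tabulate func at `entries` uniformly spaced points on [lo, hi]."""
    step = (hi - lo) / (entries - 1)
    table = [func(lo + i * step) for i in range(entries)]
    return table, step


def lookup_interp(table, lo, step, x):
    """Evaluate at a non-tabulated x by linear interpolation.

    Costs one multiply and two add/subtract operations on the two
    neighbouring table entries.
    """
    pos = (x - lo) / step
    # Clamp the index so that x == hi uses the last table interval.
    i = min(max(int(pos), 0), len(table) - 2)
    frac = pos - i
    return table[i] + frac * (table[i + 1] - table[i])


# Assumed example function and range: exp(-x) on [0, 4], 65 entries.
table, step = build_table(lambda x: math.exp(-x), 0.0, 4.0, 65)
approx = lookup_interp(table, 0.0, step, 1.3)
abs_error = abs(approx - math.exp(-1.3))
```

For a function with a bounded second derivative, the linear-interpolation error shrinks with the square of the table spacing, so restricting the input range (as the abstract's emphasis on controlling numerical dynamic range suggests) allows a smaller table for the same accuracy.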
29th VLSI Test Symposium, 2011
A novel slave clock-gating technique in [5] is designed to save power when the master and slave latches of a low-power flip-flop reach certain correlated states (e.g., both latches are at logic 0 or 1). Testing this clock-gating circuit is essential for power-sensitive ...
2008 IEEE International Conference on Acoustics, Speech and Signal Processing, 2008
This paper presents accurate area and power estimation models for implementations using FPGAs from the Xilinx Virtex-2Pro family. These models are designed to facilitate efficient design space exploration in an automated algorithm-architecture codesign framework. ...
2009 International Test Conference, 2009
A power-only defect (POD) never causes functional failures but only increases power consumption. In other words, a device with PODs works as expected in terms of functionality, and increased power consumption in mission modes is the only external indication. The ...
Journal of Signal Processing Systems, 2011
This paper presents accurate area, time, and power estimation models for implementations using FPGAs from the Xilinx Virtex-2Pro family (Deng et al. 2008). These models are designed to facilitate efficient design space exploration in an automated algorithm-architecture codesign framework. Detailed models for estimating the number of slices, block RAMs and 18×18-bit multipliers for fixed-point and floating-point IP cores have ...
IEEE Transactions on Computers, 2000
This paper describes TANOR, an automated framework for designing hardware accelerators for numerical computation on reconfigurable platforms. Applications utilizing numerical algorithms on large-size data sets require high-throughput computation platforms. The focus is on N-body interaction problems, which have a wide range of applications spanning from astrophysics to molecular dynamics. The TANOR design flow starts with a MATLAB description of a particular interaction function, its parameters, and certain architectural constraints specified through a graphical user interface. Subsequently, TANOR automatically generates a configuration bitstream for a target FPGA along with associated drivers and control software necessary to direct the application from a host PC. Architectural exploration is facilitated through support for fully custom fixed-point and floating-point representations in addition to standard number representations such as single-precision floating point. Moreover, TANOR enables joint exploration of algorithmic and architectural variations in realizing efficient hardware accelerators. TANOR's capabilities have been demonstrated for three different N-body interaction applications: the calculation of gravitational potential in astrophysics, the diffusion or convolution with a Gaussian kernel common in image processing applications, and the force calculation with a vector-valued kernel function in molecular dynamics simulation. Experimental results show that TANOR-generated hardware accelerators achieve lower resource utilization without compromising numerical accuracy, in comparison to other existing custom accelerators.
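For readers unfamiliar with N-body interaction problems, the gravitational potential case named in this abstract can be sketched as a direct pairwise summation. This is a plain software reference, not TANOR's generated hardware or its MATLAB input format; the function name, the unit convention (G = 1), and the optional softening term are assumptions made for the example.

```python
import math


def gravitational_potential(targets, sources, masses, softening=0.0):
    """Direct-sum potential at each target point, in units where G = 1.

    phi(t) = -sum_j m_j / sqrt(|t - s_j|^2 + softening^2)
    Costs O(len(targets) * len(sources)) kernel evaluations.
    """
    potentials = []
    for tx, ty, tz in targets:
        total = 0.0
        for (sx, sy, sz), mass in zip(sources, masses):
            r2 = (tx - sx) ** 2 + (ty - sy) ** 2 + (tz - sz) ** 2
            r2 += softening ** 2
            if r2 == 0.0:
                continue  # skip the singular self-interaction
            total -= mass / math.sqrt(r2)
        potentials.append(total)
    return potentials


# One target at the origin and two hypothetical point masses.
phi = gravitational_potential(
    targets=[(0.0, 0.0, 0.0)],
    sources=[(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)],
    masses=[2.0, 4.0],
)
```

Swapping the inner kernel for a Gaussian or a vector-valued force term gives the other two application classes the abstract lists; the quadratic number of independent kernel evaluations is what makes this class of problem a natural target for pipelined hardware.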
… (ITC), 2010 IEEE …, 2010
There is an ever-increasing demand for higher-performance microprocessors within a given power budget. This demand forces design choices that were once seen only in high-speed custom blocks to spread throughout the microprocessor core. These unique design ...