Karthikeyan Sankaralingam - Academia.edu
Papers by Karthikeyan Sankaralingam
Proceedings of the 42nd Annual International Symposium on Computer Architecture - ISCA '15, 2015
2003 IEEE International Solid-State Circuits Conference, 2003. Digest of Technical Papers. ISSCC., 2000
... IBM, and Intel. References [1] MS Hrishikesh, NP Jouppi, KI Farkas, D. Burger, SW Keckler, and P. Shivakumar, The Optimal Logic Depth Per Pipeline Stage is 6 to 8 FO4 Inverter Delays, ISCA-29, pp. 14-24, May, 2002. [2] R ...
2008 41st IEEE/ACM International Symposium on Microarchitecture, 2008
Significant improvement to visual quality for real-time 3D graphics requires modeling of complex illumination effects like soft shadows, reflections, and diffuse lighting interactions. The conventional Z-buffer algorithm driven GPU model does not provide sufficient support for this ...
CMOS technology scaling poses challenges in designing dynamically scheduled cores that can sustain both high instruction-level parallelism and aggressive clock frequencies. In this paper, we present a new architecture that maps compiler-scheduled blocks onto a two-dimensional grid of ALUs. For the mapped window of execution, instructions execute in a dataflow-like manner, with each ALU forwarding its result along short wires to the consumers of the result. We describe our studies of program behavior and a preliminary evaluation that show that this architecture has the potential for both high clock speeds and high ILP, and may offer the best of both the VLIW and dynamic superscalar architectures.
Technology constraints and application characteristics are radically changing as we scale to the end of silicon technology. Devices are becoming increasingly brittle, highly varying in their properties, and error-prone, leading to a fundamentally unpredictable hardware substrate. Applications are also changing, and emerging new classes of applications are increasingly relying on probabilistic methods. They have an inherent tolerance for uncertainty and can tolerate hardware errors.
Abstract: Modern processors rely heavily on broadcast networks to bypass instruction results to dependent instructions in the pipeline. However, as architectures get wider and pipelines get deeper, broadcasting becomes more complex, slower, and more difficult to implement.
Deep packet inspection is becoming prevalent for modern network processing systems. They inspect packet payloads for a variety of reasons, including intrusion detection, traffic policing, and load balancing. The focus of this paper is deep packet inspection in intrusion detection/prevention systems (IPSes). The performance-critical operation in these systems is signature matching: matching payloads against signatures of vulnerabilities. Increasing network speeds of today's networks and the transition from simple string-based signatures to complex regular expressions have rapidly increased the performance requirement of signature matching. To meet these requirements, solutions range from hardware-centric ASIC/FPGA implementations to software implementations using high-performance microprocessors. In this paper, we propose a programmable SIMD architecture design for IPSes and develop a prototype implementation on an Nvidia G80 GPU. We first present a detailed archi...
Proceedings of the 42nd Annual International Symposium on Computer Architecture - ISCA '15, 2015
Synthesis Lectures on Computer Architecture, 2013
ACM Transactions on Architecture and Code Optimization, 2015