Guido Araujo - Academia.edu

Guido Araujo

Papers by Guido Araujo

Modeling and Simulating Memory Hierarchies in a Platform-Based Design Methodology

Instruction set design and optimizations for address computation in DSP processors

Efficient datapath merging for partially reconfigurable architectures

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2005

Challenges in code generation for embedded processors

Code Generation and Optimization Techniques for Embedded Digital Signal Processors

Improving Offset Assignment through Simultaneous Variable Coalescing

Efficient address code optimization is a central problem in code generation for processors with restricted addressing modes, such as Digital Signal Processors (DSPs). This paper proposes a new heuristic for the Simple Offset Assignment (SOA) problem: allocating scalar variables to memory so as to minimize addressing code. The new approach, called Coalescing SOA (CSOA), coalesces variable memory slots at the same time as it computes the offset assignment. Experimental results, obtained by compiling MediaBench benchmark programs with the LANCE compiler, show a significant improvement over previous solutions to SOA. On average, CSOA produces 37.3% fewer update instructions than the prior solution that performs memory slot coalescing before applying SOA, and 66.2% fewer update instructions than the best traditional SOA solution.
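To make the SOA objective concrete, the following is a minimal sketch of the cost model usually associated with the problem, assuming a single address register with auto-increment and auto-decrement by one. It is not the CSOA heuristic from the paper. The function names, the access sequence, and the exhaustive search are illustrative assumptions, and the search is only practical for a handful of variables.

```python
from itertools import permutations

def soa_cost(access_sequence, layout):
    """Count explicit address-register updates for one memory layout.

    Assumption: moving between slots whose offsets differ by at most 1 is
    free (auto-increment/decrement); any larger jump costs one update.
    """
    offset = {var: i for i, var in enumerate(layout)}
    cost = 0
    for prev, curr in zip(access_sequence, access_sequence[1:]):
        if abs(offset[prev] - offset[curr]) > 1:
            cost += 1
    return cost

def best_layout(access_sequence):
    """Exhaustively search all layouts (illustration only, factorial time)."""
    variables = sorted(set(access_sequence))
    return min(permutations(variables),
               key=lambda layout: soa_cost(access_sequence, layout))

# Hypothetical access sequence of four scalar variables.
sequence = ["a", "c", "a", "d", "b", "d"]
naive_cost = soa_cost(sequence, ("a", "b", "c", "d"))
optimal_cost = soa_cost(sequence, best_layout(sequence))
print(naive_cost, optimal_cost)
```

For this sequence the declaration-order layout needs an update on every transition, while a layout that places frequently adjacent variables next to each other needs none. Practical SOA heuristics approximate this search on an access graph instead of enumerating layouts.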

Instruction set design and optimization for address computation in DSP architectures

Optimal Live Range Merge for Address Register Allocation in Embedded Programs

The increasing demand for wireless devices running mobile applications has renewed interest in research on high-performance, low-power processors that can be programmed using very compact code. One way to achieve this goal is to design specialized processors with short instruction formats and shallow pipelines. Because it enables such architectural features, indirect addressing is the most used addressing mode in embedded programs. This paper analyzes the problem of allocating address registers to array references in loops using the auto-increment addressing mode. It builds on previous work based on a heuristic that merges address register live ranges. We prove, for the first time, that the merge operation is NP-hard in general, and show the existence of an optimal linear-time algorithm, based on dynamic programming, for a special case of the problem.

Code generation for fixed-point DSPs

ACM Transactions on Design Automation of Electronic Systems, 1998

Using register-transfer paths in code generation for heterogeneous memory-register architectures

In this paper we address the problem of code generation for basic blocks in heterogeneous memory-register DSP processors. We propose a new technique, based on register-transfer paths, that can be used to efficiently dismantle basic block DAGs (Directed Acyclic Graphs) into expression trees. This approach builds on recent results that report an optimal code generation algorithm for expression trees on these architectures. The technique has been implemented and experimentally validated for the TMS320C25, a popular fixed-point DSP processor. The results show that good code quality can be obtained using the proposed technique. An analysis of the types of DAGs found in the DSPstone benchmark programs reveals that the majority of basic blocks in this benchmark set are expression trees and leaf DAGs. This leads to our claim that tree-based algorithms, like the one described in this paper, should be the technique of choice for basic block code generation on heterogeneous memory-register architectures.
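The distinction the abstract draws between expression trees and leaf DAGs can be sketched as a small classifier. This is an illustration under an assumed definition (a leaf DAG is a DAG in which only leaf nodes are shared by more than one use), not the paper's dismantling technique. The graph encoding and the names `classify_block` and `operands` are hypothetical.

```python
def classify_block(operands):
    """Classify a basic-block DAG.

    `operands` maps each operation node to the list of nodes it uses.
    Leaves (variables, constants) may be omitted from the keys.
    """
    # Count how many times each node is used as an operand.
    uses = {node: 0 for node in operands}
    for ops in operands.values():
        for op in ops:
            uses[op] = uses.get(op, 0) + 1

    shared = [node for node, count in uses.items() if count > 1]
    if not shared:
        return "expression tree"
    # A shared node with no operands of its own is a leaf.
    if all(not operands.get(node) for node in shared):
        return "leaf DAG"
    return "full DAG"

# Hypothetical blocks: node names are placeholders.
tree = {"root": ["a", "b"]}
leaf_dag = {"t1": ["a", "b"], "t2": ["a", "c"], "root": ["t1", "t2"]}
full_dag = {"t1": ["a", "b"], "t2": ["t1", "c"], "root": ["t1", "t2"]}
print(classify_block(tree), classify_block(leaf_dag), classify_block(full_dag))
```

In the third block the intermediate result `t1` is reused, so it would have to be stored or recomputed before tree-based code generation could apply.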

Optimal code generation for embedded memory non-homogeneous register architectures

Datapath merging and interconnection sharing for reconfigurable architectures

The design of dynamically reconfigurable datapath coprocessors

ACM Transactions on Embedded Computing Systems, 2004

Expression-tree-based algorithms for code compression on embedded RISC architectures

IEEE Transactions on Very Large Scale Integration Systems, 2000

An efficient framework for high-level power exploration

Compressed Code Execution on DSP Architectures

Code compression based on operand factorization

Teaching computer architecture using an architecture description language

Tailoring pipeline bypassing and functional unit mapping to application in clustered VLIW architectures
