Abbas Kiasari - Academia.edu

Papers by Abbas Kiasari

Power-efficient deterministic and adaptive routing in torus networks-on-chip

Microprocessors and Microsystems, 2012

Modern SoC architectures use NoCs for high-speed inter-IP communication. For NoC architectures, high-performance, efficient routing algorithms with low power consumption are essential for real-time applications. NoCs with mesh and torus interconnection topologies are now popular due to their simple structures. A torus NoC is very similar to a mesh NoC, but has a somewhat smaller diameter. For a routing algorithm to be deadlock-free in a torus, at least two virtual channels per physical channel must be used to avoid cyclic channel dependencies due to the wrap-around links; in a mesh network, however, deadlock freedom can be ensured using only one virtual channel. The number of virtual channels employed is important since it has a direct effect on the power consumption of NoCs. In this paper, we propose a novel systematic approach for designing deadlock-free routing algorithms for torus NoCs. Using this method, a new deterministic routing algorithm (called TRANC) is proposed that uses only one virtual channel per physical channel in torus NoCs. We also propose an algorithmic mapping that enables extracting TRANC-based routing algorithms from existing routing algorithms, which can be both deterministic and adaptive. The simulation results show power consumption and performance improvements when using the proposed algorithms.
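
The diameter comparison in the abstract above can be checked on small networks. The following is a minimal sketch, not taken from the paper: it assumes a square k x k network with bidirectional links, and the function names `neighbors` and `diameter` are illustrative. It measures the diameter by breadth-first search, with and without wrap-around links.

```python
from collections import deque
from itertools import product


def neighbors(node, k, wrap):
    """Yield the grid neighbours of node in a k x k mesh (wrap=False) or torus (wrap=True)."""
    x, y = node
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if wrap:
            # Torus: wrap-around links connect opposite edges.
            yield (nx % k, ny % k)
        elif 0 <= nx < k and 0 <= ny < k:
            # Mesh: links stop at the boundary.
            yield (nx, ny)


def diameter(k, wrap):
    """Longest shortest-path hop count between any two nodes, found by BFS from every node."""
    best = 0
    for src in product(range(k), repeat=2):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in neighbors(u, k, wrap):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best


if __name__ == "__main__":
    for k in (4, 5, 8):
        print(k, "mesh:", diameter(k, wrap=False), "torus:", diameter(k, wrap=True))
```

For these sizes the measured values agree with the usual closed forms, 2(k - 1) for the mesh and 2 * floor(k / 2) for the torus. The sketch says nothing about deadlock freedom or virtual channels, which are the actual subject of the paper.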

An Optimal Single-Path Routing Algorithm in the Datacenter Network DPillar

IEEE Transactions on Parallel and Distributed Systems, 2016

A comparative performance analysis of n-cubes and star graphs

Proceedings of the 20th International Conference on Parallel and Distributed Processing, 2006

Many theory-based comparison studies, relying on graph-theoretical viewpoints and using structural and algorithmic properties, have been conducted for the hypercube and the star graph. None of these studies, however, considered real working conditions and implementation limits. We have compared the performance of the star and hypercube networks for different message lengths and numbers of virtual channels and considered two implementation constraints, namely constant bisection bandwidth and constant node pin-out. We use two accurate analytical models already proposed for the star graph and hypercube and implement the parameter changes imposed by technological implementation constraints. The comparison results reveal that the star graph performs better than the equivalent hypercube under light traffic loads, while the opposite conclusion is reached for heavy traffic loads. The hypercube, having more channels than its equivalent star graph, saturates later, showing that it can bear heavier traffic loads.
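
For orientation, the structural properties that earlier graph-theoretical comparisons rely on can be tabulated directly. This is a minimal sketch using the standard textbook formulas for the n-star graph and the n-dimensional hypercube; the helper names `star_graph` and `hypercube` are illustrative, and the sketch does not reproduce the paper's analytical models or its notion of an "equivalent" network under implementation constraints.

```python
from math import factorial


def star_graph(n):
    """Return (nodes, degree, diameter) of the n-star graph S_n."""
    # S_n has n! nodes, each of degree n - 1, and diameter floor(3(n - 1) / 2).
    return factorial(n), n - 1, (3 * (n - 1)) // 2


def hypercube(n):
    """Return (nodes, degree, diameter) of the n-dimensional hypercube Q_n."""
    # Q_n has 2^n nodes, each of degree n, and diameter n.
    return 2 ** n, n, n


if __name__ == "__main__":
    print("network  nodes  degree  diameter")
    for n in (4, 5, 6):
        print(f"S_{n}", *star_graph(n))
    for n in (5, 7, 10):
        print(f"Q_{n}", *hypercube(n))
```

Comparing networks of similar size, for example S_5 with 120 nodes against Q_7 with 128 nodes, shows the lower degree and diameter that make the star graph attractive on paper; the abstract's point is that such figures alone ignore implementation constraints.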

Power-Efficient Routing Algorithm for Torus NoCs

... for high-speed inter-node communication. NoC with torus interconnection topology is now popular due to its low dimension and simple structure. A torus NoC is very similar to a mesh NoC from a structural point of view, but has a somewhat smaller diameter, which makes it a suitable choice for NoCs. For a routing algorithm to be deadlock-free in a torus NoC, at least two virtual channels should be used to avoid cyclic channel dependencies, while a mesh NoC can achieve deadlock freedom using only one virtual channel. In this paper, we propose a novel approach to designing routing algorithms for mesh and torus NoCs. We also propose a deadlock-free routing algorithm for torus NoCs that uses only one virtual channel per physical channel, resulting in lower power consumption because of reduced hardware complexity, with no significant performance degradation. The algorithm works within a dimension and is applied to all dimensions individually for XY routing and various turn-based deterministic routing algorithms such as west-first, north-last and negative-first. We demonstrate the efficiency of the algorithm using simulation results obtained from synthesis of our implemented VHDL Register Transfer Level (RTL) model of the NoC.

Routing Algorithms for Recursively-Defined Data Centre Networks

2015 IEEE Trustcom/BigDataSE/ISPA, 2015

An Efficient Shortest-Path Routing Algorithm in the Data Centre Network DPillar

Lecture Notes in Computer Science, 2015

Star-Replaced Networks: A Generalised Class of Dual-Port Server-Centric Data Centre Networks

On Routing Algorithms for the DPillar Data Centre Networks

Lecture Notes in Computer Science, 2015

Caspian: A Tunable Performance Model for Multi-core Systems

Euro-Par 2008–Parallel …, Jan 1, 2008

Performance evaluation is an important engineering tool that provides valuable feedback on design choices in the implementation of multi-core systems such as parallel systems, multicomputers, and Systems-on-Chip (SoCs). The significant advantage of analytical ...

A Performance and Power Analysis of WK-Recursive and Mesh Networks for Network-on-Chips

2006 International Conference on Computer Design, 2006

Network-on-Chip (NoC) has been proposed as an attractive alternative to traditional dedicated wires to achieve high performance and modularity. Power efficiency is one of the most important concerns in NoC architecture design. The choice of network topology is important in designing a low-power and high-performance NoC. In this paper, we propose the use of WK-recursive networks as the underlying topology in NoCs. We have implemented VHDL hardware models of mesh and WK-recursive topologies and measured their latency by simulating these implementations. We also propose a novel approach to high-level power modeling based on latency for these topologies and show that the power consumption of the WK-recursive topology is less than that of the equivalent mesh on a chip.

Analytical performance modelling of adaptive wormhole routing in the star interconnection network

Proceedings 20th IEEE International Parallel & Distributed Processing Symposium, 2006

The star graph was introduced as an attractive alternative to the well-known hypercube and its properties have been well studied in the past. Most of these studies have focused on topological properties and algorithmic aspects of this network. Although several analytical models have been proposed in the literature for different interconnection networks, none of them have dealt with star graphs. This paper proposes the first analytical model to predict message latency in wormhole-switched star interconnection networks with fully adaptive routing. The analysis focuses on a fully adaptive routing algorithm which has been shown to be the most effective for star graphs. The results obtained from simulation experiments confirm that the proposed model exhibits good accuracy under different operating conditions.

A framework for designing congestion-aware deterministic routing

Proceedings of the Third International Workshop on Network on Chip Architectures - NoCArc '10, 2010

A Heuristic Framework for Designing and Exploring Deterministic Routing Algorithm for NoCs

Routing Algorithms in Networks-on-Chip, 2013

A Markovian Performance Model for Networks-on-Chip

16th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP 2008), 2008

Network-on-chip (NoC) has been proposed as a solution for addressing the design challenges of future high-performance nanoscale architectures. Thus, it is of crucial importance for a designer to have access to fast methods for evaluating the performance of on-chip networks. To this end, we present a Markovian model for evaluating the latency and energy consumption of on-chip networks. We compute ...

Analytical approaches for performance evaluation of networks-on-chip

Proceedings of the 2012 international conference on Compilers, architectures and synthesis for embedded systems - CASES '12, 2012

This tutorial reviews four popular mathematical formalisms (dataflow analysis, schedulability analysis, network calculus, and queueing theory) and how they have been applied to the analysis of Network-on-Chip (NoC) performance. We review the basic concepts and results of each formalism and provide examples of how they have been used in on-chip communication performance analysis. The tutorial also discusses the respective strengths and weaknesses of each formalism, their suitability for a specific purpose, and the attempts that have been made to bridge these analytical approaches. Finally, we conclude the tutorial by discussing open research issues.

PERMAP: A performance-aware mapping for application-specific SoCs

2008 International Conference on Application-Specific Systems, Architectures and Processors, 2008

Future system-on-chip (SoC) designs will need efficient on-chip communication architectures that can provide efficient and scalable data transport among the intellectual properties (IPs). Designing and optimizing SoCs is an increasingly difficult task due to the size and complexity of the SoC design space, high cost of detailed simulation, and several constraints that the design must satisfy. For efficient design of ...

A comparative performance analysis of n-cubes and star graphs

Proceedings 20th IEEE International Parallel & Distributed Processing Symposium, 2006


Performance comparison of adaptive routing algorithms in the star interconnection network

Eighth International Conference on High-Performance Computing in Asia-Pacific Region (HPCASIA'05), 2005

The star graph was introduced as an attractive alternative to the well-known hypercube and its properties have been well studied in the past. Most of these studies have focused on topological properties and algorithmic aspects of this network. In this paper, the performance of nine ...

Analytic performance comparison of hypercubes and star graphs with implementation constraints

Journal of Computer and System Sciences, 2008

An Analytical Latency Model for Networks-on-Chip

IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2000

We propose an analytical model based on queueing theory for delay analysis in a wormhole-switched network-on-chip (NoC). The proposed model takes as input an application communication graph, a topology graph, a mapping vector, and a routing matrix, and estimates average packet latency and router blocking time. It works for arbitrary network topology with deterministic routing under arbitrary traffic patterns. This model can estimate per-flow average latency accurately and quickly, thus enabling fast design space exploration of various design parameters in NoC designs. Experimental results show that the proposed analytical model can predict the average packet latency more than four orders of magnitude faster than an accurate simulation, while the computation error is less than 10% in non-saturated networks for different system-on-chip platforms.
