Abbas Kiasari - Academia.edu
Papers by Abbas Kiasari
Microprocessors and Microsystems, 2012
Modern SoC architectures use NoCs for high-speed inter-IP communication. For NoC architectures, high-performance, efficient routing algorithms with low power consumption are essential for real-time applications. NoCs with mesh and torus interconnection topologies are now popular due to their simple structures. A torus NoC is very similar to a mesh NoC but has a somewhat smaller diameter. For a routing algorithm to be deadlock-free in a torus, at least two virtual channels per physical channel must be used to avoid cyclic channel dependencies due to the wrap-around links; in a mesh network, however, deadlock freedom can be ensured using only one virtual channel. The number of virtual channels employed is important since it has a direct effect on the power consumption of NoCs. In this paper, we propose a novel systematic approach for designing deadlock-free routing algorithms for torus NoCs. Using this method, a new deterministic routing algorithm (called TRANC) is proposed that uses only one virtual channel per physical channel in torus NoCs. We also propose an algorithmic mapping that enables extracting TRANC-based routing algorithms from existing routing algorithms, which can be both deterministic and adaptive. The simulation results show power consumption and performance improvements when using the proposed algorithms.
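The diameter comparison in the abstract above can be illustrated with a minimal sketch. It assumes a k-ary n-dimensional network with the same radix k in every dimension; the function names are hypothetical and are not taken from the paper, and the sketch does not implement TRANC or any routing algorithm.

```python
def mesh_diameter(k: int, n: int = 2) -> int:
    """Diameter of a k-ary n-dimensional mesh: up to k - 1 hops per dimension."""
    return n * (k - 1)


def torus_diameter(k: int, n: int = 2) -> int:
    """Diameter of a k-ary n-dimensional torus: wrap-around links cap each
    dimension at floor(k / 2) hops."""
    return n * (k // 2)


def ring_hops(src: int, dst: int, k: int) -> int:
    """Minimal hop count between two positions on one ring of k nodes,
    taking the wrap-around link when it is shorter."""
    direct = abs(src - dst)
    return min(direct, k - direct)


if __name__ == "__main__":
    for k in (4, 8, 16):
        print(f"k={k}: mesh diameter={mesh_diameter(k)}, "
              f"torus diameter={torus_diameter(k)}")
```

The same wrap-around links that shorten paths also close each row and column into a ring, which is where the cyclic channel dependencies mentioned in the abstract come from.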
IEEE Transactions on Parallel and Distributed Systems, 2016
Proceedings of the 20th International Conference on Parallel and Distributed Processing, 2006
Many theoretical comparison studies, relying on graph-theoretical viewpoints and using structural and algorithmic properties, have been conducted for the hypercube and the star graph. None of these studies, however, considered real working conditions and implementation limits. We have compared the performance of the star and hypercube networks for different message lengths and numbers of virtual channels and considered two implementation constraints, namely constant bisection bandwidth and constant node pin-out. We use two accurate analytical models already proposed for the star graph and hypercube and implement the parameter changes imposed by technological implementation constraints. The comparison results reveal that the star graph performs better than the equivalent hypercube under light traffic loads, while the opposite conclusion is reached for heavy traffic loads. The hypercube, which has more channels than its equivalent star graph, saturates later, showing that it can bear heavier traffic loads.
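For context on the two topologies being compared, the following sketch tabulates their standard textbook properties (node count, node degree, diameter). It is only an illustration of why "equivalent" networks have different channel counts; it is not the analytical model used in the paper, and the function names are hypothetical.

```python
from math import factorial


def hypercube_properties(n: int) -> dict:
    """n-dimensional hypercube: 2^n nodes, degree n, diameter n."""
    return {"nodes": 2 ** n, "degree": n, "diameter": n}


def star_graph_properties(n: int) -> dict:
    """n-star graph: n! nodes, degree n - 1, diameter floor(3(n - 1) / 2)."""
    return {"nodes": factorial(n), "degree": n - 1,
            "diameter": (3 * (n - 1)) // 2}


if __name__ == "__main__":
    # Two networks of roughly comparable size: 120 versus 128 nodes.
    print("S5:", star_graph_properties(5))
    print("Q7:", hypercube_properties(7))
```

At similar sizes the star graph has a lower node degree than the hypercube, so it has fewer channels in total, which is consistent with the abstract's remark that the hypercube has more channels than its equivalent star graph.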
for high-speed inter-node communication. NoCs with torus interconnection topology are now popular due to their low dimension and simple structure. A torus NoC is structurally very similar to a mesh NoC but has a somewhat smaller diameter, which makes it a suitable choice for NoCs. For a routing algorithm to be deadlock-free in a torus NoC, at least two virtual channels should be used to avoid channel dependency, while a mesh NoC can achieve deadlock freedom using only one virtual channel. In this paper, we propose a novel approach to designing routing algorithms for mesh and torus NoCs. We also propose a deadlock-free routing algorithm for torus NoCs that uses only one virtual channel per physical channel, resulting in lower power consumption because of reduced hardware complexity, with no significant performance degradation. The algorithm works within a dimension and is applied to all dimensions individually for XY routing and for various turn-based deterministic routing algorithms such as west-first, north-last, and negative-first. We demonstrate the efficiency of the algorithm using simulation results obtained from synthesis of our VHDL Register Transfer Level (RTL) model of the NoC.
2015 IEEE Trustcom/BigDataSE/ISPA, 2015
Lecture Notes in Computer Science, 2015
Lecture Notes in Computer Science, 2015
Euro-Par 2008 Parallel …, Jan 1, 2008
Performance evaluation is an important engineering tool that provides valuable feedback on design choices in the implementation of multi-core systems such as parallel systems, multicomputers, and Systems-on-Chip (SoCs). The significant advantage of analytical ...
2006 International Conference on Computer Design, 2006
Network-on-Chip (NoC) has been proposed as an attractive alternative to traditional dedicated wires to achieve high performance and modularity. Power efficiency is one of the most important concerns in NoC architecture design. The choice of network topology is important in designing a low-power and high-performance NoC. In this paper, we propose WK-recursive networks as the underlying topology for NoCs. We have implemented VHDL hardware models of the mesh and WK-recursive topologies and measured latency by simulating these implementations. We also propose a novel approach to high-level power modeling based on latency for these topologies and show that the power consumption of the WK-recursive topology is lower than that of the equivalent mesh on a chip.
Proceedings 20th IEEE International Parallel & Distributed Processing Symposium, 2006
The star graph was introduced as an attractive alternative to the well-known hypercube, and its properties have been well studied in the past. Most of these studies have focused on topological properties and algorithmic aspects of this network. Although several analytical models have been proposed in the literature for different interconnection networks, none of them has dealt with star graphs. This paper proposes the first analytical model to predict message latency in wormhole-switched star interconnection networks with fully adaptive routing. The analysis focuses on a fully adaptive routing algorithm that has been shown to be the most effective for star graphs. The results obtained from simulation experiments confirm that the proposed model exhibits good accuracy under different operating conditions.
Proceedings of the Third International Workshop on Network on Chip Architectures - NoCArc '10, 2010
Routing Algorithms in Networks-on-Chip, 2013
16th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP 2008), 2008
Network-on-chip (NoC) has been proposed as a solution for addressing the design challenges of future high-performance nanoscale architectures. Thus, it is of crucial importance for a designer to have access to fast methods for evaluating the performance of on-chip networks. To this end, we present a Markovian model for evaluating the latency and energy consumption of on-chip networks. We compute
Proceedings of the 2012 international conference on Compilers, architectures and synthesis for embedded systems - CASES '12, 2012
This tutorial reviews four popular mathematical formalisms (dataflow analysis, schedulability analysis, network calculus, and queueing theory) and how they have been applied to the analysis of Network-on-Chip (NoC) performance. We review the basic concepts and results of each formalism and provide examples of how they have been used in on-chip communication performance analysis. The tutorial also discusses the respective strengths and weaknesses of each formalism, their suitability for a specific purpose, and the attempts that have been made to bridge these analytical approaches. Finally, we conclude the tutorial by discussing open research issues.
2008 International Conference on Application-Specific Systems, Architectures and Processors, 2008
Future system-on-chip (SoC) designs will need on-chip communication architectures that can provide efficient and scalable data transport among the intellectual properties (IPs). Designing and optimizing SoCs is an increasingly difficult task due to the size and complexity of the SoC design space, the high cost of detailed simulation, and several constraints that the design must satisfy. For efficient design of
Proceedings 20th IEEE International Parallel & Distributed Processing Symposium, 2006
Many theoretical comparison studies, relying on graph-theoretical viewpoints and using structural and algorithmic properties, have been conducted for the hypercube and the star graph. None of these studies, however, considered real working conditions and implementation limits. We have compared the performance of the star and hypercube networks for different message lengths and numbers of virtual channels and considered two implementation constraints, namely constant bisection bandwidth and constant node pin-out. We use two accurate analytical models already proposed for the star graph and hypercube and implement the parameter changes imposed by technological implementation constraints. The comparison results reveal that the star graph performs better than the equivalent hypercube under light traffic loads, while the opposite conclusion is reached for heavy traffic loads. The hypercube, which has more channels than its equivalent star graph, saturates later, showing that it can bear heavier traffic loads.
Eighth International Conference on High-Performance Computing in Asia-Pacific Region (HPCASIA'05), 2005
The star graph was introduced as an attractive alternative to the well-known hypercube, and its properties have been well studied in the past. Most of these studies have focused on topological properties and algorithmic aspects of this network. In this paper, the performance of nine ...
Journal of Computer and System Sciences, 2008
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2000
We propose an analytical model based on queueing theory for delay analysis in a wormhole-switched network-on-chip (NoC). The proposed model takes as input an application communication graph, a topology graph, a mapping vector, and a routing matrix, and estimates average packet latency and router blocking time. It works for arbitrary network topologies with deterministic routing under arbitrary traffic patterns. The model can estimate per-flow average latency accurately and quickly, thus enabling fast design space exploration of various design parameters in NoC designs. Experimental results show that the proposed analytical model can predict the average packet latency more than four orders of magnitude faster than an accurate simulation, while the computation error is less than 10% in non-saturated networks for different system-on-chip platforms.