Performance Comparison of CQ Selection Policies
Related papers
Impact of scheduling algorithms on performance of crosspoint-queued switch
annals of telecommunications - annales des télécommunications, 2011
This paper presents a performance analysis of a 32x32 crosspoint queued switch. Switches with small buffers in crosspoints were evaluated in the late 1980s, but mostly for uniform traffic. Due to the technological limitations of that time, it was impractical to implement large buffers together with the switching fabric. The crosspoint queued switch architecture has recently been brought back into focus, since modern technology enables an easy implementation of large buffers in crosspoints. An advantage of this solution is the absence of control communication between linecards and schedulers. In this paper, the performance of four algorithms (longest queue first, round robin, exhaustive round robin, and frame-based round-robin matching) is analyzed and compared. The results obtained for the crosspoint queued switch are compared with those for the output queued switch. Throughput, average cell latency, and instantaneous packet delay variance are evaluated under uniform and nonuniform traffic patterns.
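As a rough illustration of two of the selection policies named above, the following minimal sketch models the column of crosspoint buffers that feeds a single output and picks one buffer per time slot. It is not the paper's simulator: the function names, the lowest-index tie-break in longest queue first, and the pointer update rule in round robin are assumptions made for the example.

```python
from collections import deque


def select_lqf(queues):
    """Longest queue first: index of the longest non-empty queue, or None.

    Ties are broken in favor of the lowest index (an assumption).
    """
    best = None
    for i, q in enumerate(queues):
        if len(q) > 0 and (best is None or len(q) > len(queues[best])):
            best = i
    return best


def select_rr(queues, pointer):
    """Round robin: first non-empty queue at or after `pointer`, or None."""
    n = len(queues)
    for step in range(n):
        i = (pointer + step) % n
        if queues[i]:
            return i
    return None


def serve_one_slot(queues, policy, pointer=0):
    """Serve one cell for one output; return (served_input, new_pointer)."""
    if policy == "lqf":
        chosen = select_lqf(queues)
    else:
        chosen = select_rr(queues, pointer)
    if chosen is None:
        return None, pointer  # all crosspoint buffers for this output are empty
    queues[chosen].popleft()
    if policy == "rr":
        # Assumed pointer rule: move just past the input that was served.
        pointer = (chosen + 1) % len(queues)
    return chosen, pointer
```

Because each output inspects only its own crosspoint buffers, the selection needs no information from the linecards, which is consistent with the absence of control communication noted in the abstract.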
Performance evaluation of crosspoint queued switch under the heavy traffic
In this paper, a performance analysis of a 2x2 crosspoint queued switch is presented. The analysis is performed under a non-admissible traffic pattern for three scheduling algorithms: round robin, frame-based round-robin matching, and longest queue first. Throughput, average cell latency, and memory requirements for buffer implementation are observed. In addition to these parameters, which are usually used for switch performance evaluation, inter-flow fairness is also analyzed. The results show that very long buffers are required to achieve good performance under traffic overload, even for small switches. The longest-queue-first algorithm showed higher throughput and lower memory requirements, but worse latency and fairness, than the other two observed algorithms.
Performance analysis of variable packet size crosspoint-queued switch
Eurocon 2013, 2013
This paper presents a performance analysis of the variable packet size crosspoint queued switch. Packet switch throughput and average latency are evaluated under an Interrupted Bernoulli Process incoming traffic pattern. It is shown that, among the observed algorithms, the longest queue first algorithm has the highest throughput with short crosspoint buffers, but also the highest average latency. We also establish that the choice of scheduling algorithm does not play a significant role in switch performance if the buffers are long enough. Therefore, the round robin algorithm becomes the best choice for implementation due to its simplicity.
International Journal of Parallel, Emergent and Distributed Systems, 2006
The combined input-crosspoint-queued (CICQ) switch structure decouples input and output matching and enables fully distributed arbitration. A CICQ switch cannot achieve 100% throughput under nonuniform traffic if the Round-Robin (RR-RR) algorithm is used. The other existing schemes require considerable hardware and time complexity. In this paper, we theoretically prove that RR-RR can achieve 100% throughput under uniform traffic, but is unstable under nonuniform traffic. Our quantitative analysis also points out that the throughput will be approximately 91.7% under the weak diagonal traffic model with f = 0.5 and k = 2, where f is the packet arrival rate along the main diagonal of the traffic matrix and k is the number of buffers used in each crosspoint. Moreover, our formula shows that k must be at least 4 and 24, respectively, if we wish to improve the throughput to 95% and 99%. Based on our theoretical study, we propose the Differential Round-Robin (DRR) algorithm. Simulations have demonstrated that DRR can achieve 100% throughput under arbitrary traffic using only one buffer cell in each crosspoint. The DRR algorithm keeps the simplicity and efficiency of RR-RR, with O(1) complexity, while overcoming the instability problem of RR-RR.
Prioritized Queue with Round Robin Scheduler for Buffered Crossbar Switches
ICTACT Journal on Communication Technology, 2014
Research in high-speed switching systems is in growing demand as internet traffic increases rapidly. Designing an efficient scheduling algorithm with high throughput and low delay is an open challenge. Most algorithms achieve 100% throughput under uniform traffic but fail to attain the same performance under nonuniform traffic. Moreover, these algorithms also suffer from starvation, which leads to extended waiting times in the VOQs. In this paper, a Prioritized Queue with Round Robin Scheduler (PQRS) is proposed for buffered crossbar switches. We prove that the proposed scheduler can achieve 85% throughput under any nonuniform traffic without starvation.
Analysis of WRR scheduling algorithm frame size impact on CQ switch performance
MELECON 2014 - 2014 17th IEEE Mediterranean Electrotechnical Conference, 2014
In this paper, we study the influence of the Weighted Round Robin (WRR) scheduling algorithm frame size on the performance of the Crosspoint Queued (CQ) crossbar switch. To show that throughput and cell delay can be adjusted with an appropriate WRR frame size, we analyze the switch for different frame sizes under unbalanced bursty traffic. We show that the WRR scheduling algorithm achieves throughput similar to that of an Output Queued switch when the frame size is low. Also, with an increase of the frame size, the delay decreases.
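To make the notion of a WRR frame concrete, here is a minimal sketch of a frame-based weighted round robin arbiter for one output. It is an illustrative assumption, not the scheme evaluated in the paper: the function name `wrr_schedule`, the quota rule (share of the frame proportional to weight, rounded down, at least one slot), and the decision to leave a slot idle when all remaining credits are spent are choices made for this example.

```python
from collections import deque


def wrr_schedule(queues, weights, frame_size, slots):
    """Return the input served in each slot (None for an idle slot)."""
    n = len(queues)
    total = sum(weights)
    # Assumed quota rule: proportional share of the frame, at least one slot.
    quotas = [max(1, frame_size * w // total) for w in weights]
    credits = list(quotas)
    pointer = 0
    served = []
    for slot in range(slots):
        if slot % frame_size == 0:
            credits = list(quotas)  # a new frame starts: refill all credits
        chosen = None
        for step in range(n):
            i = (pointer + step) % n
            if queues[i] and credits[i] > 0:
                chosen = i
                break
        if chosen is None:
            served.append(None)  # nothing eligible in this slot
            continue
        queues[chosen].popleft()
        credits[chosen] -= 1
        pointer = (chosen + 1) % n
        served.append(chosen)
    return served
```

With weights 3 and 1 and a frame of four slots, the first input receives three of every four service opportunities while both queues stay backlogged. In this sketch the frame size sets how often the credits are refilled, and so how finely the weights can be expressed.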
CQ switch analysis under traffic overload
Telfor Journal, 2011
An analysis of a 2x2 crossbar packet switch with buffers at the crosspoints and a round robin scheduling algorithm is presented in this paper. The analysis is performed for a non-admissible traffic pattern, where output ports are overloaded. The case of full offered load is observed, and output ports are loaded with packets that have different arrival probabilities. In addition to the parameters commonly observed in such an analysis (throughput and average packet delay), the memory requirements for implementing the buffers and the fairness with which the buffers are served are also analyzed. The results show that, even for a switch with a small number of ports, very large buffers should be implemented to achieve satisfactory performance under traffic overload.
Threshold-based Exhaustive Round-Robin for the CICQ Switch with Virtual Crosspoint Queues
2007 IEEE International Conference on Communications, 2007
A multi-cabinet implementation of a combined input and crosspoint queued (CICQ) switch introduces a large RTT latency between the line cards and the switch fabric, making the crosspoint (CP) buffer requirement impractical. Virtual crosspoint queues (VCQs), proposed in the literature, are shared among a set of VOQs and CP buffers for the same input port, reducing the minimum memory required inside the switch fabric. In this paper, a threshold-based exhaustive round-robin (T-ERR) is employed to improve the throughput of the CICQ switch with VCQs. T-ERR at the VCQ and CP arbiters serves packets residing in a longer queue more aggressively than packets residing in a shorter queue. T-ERR is simple yet drastically increases throughput for the CICQ switch with a small VCQ size. Simulation experiments with unbalanced traffic show that its throughput improves from 80% to 94% for a CP size of 4 cells and from 73% to 83% for a CP size of 2 cells, for RTT = 64 cell times. Furthermore, its throughput is independent of switch size and RTT. Thus, the proposed scheme makes the scalable implementation of a distributed CICQ switch practical.
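The abstract does not spell out the T-ERR rule, so the following is only one possible reading, sketched to show how a threshold can make a round-robin arbiter favor longer queues. In this hypothetical version the arbiter keeps its pointer on a queue after serving it as long as that queue still holds at least `threshold` cells, and otherwise advances as in plain round robin. The function name and this exact condition are assumptions.

```python
from collections import deque


def t_err_select(queues, pointer, threshold):
    """Serve one cell; return (served_queue, new_pointer).

    Assumed rule: after serving queue i, stay on i while it still holds
    at least `threshold` cells; otherwise move the pointer to i + 1.
    """
    n = len(queues)
    for step in range(n):
        i = (pointer + step) % n
        if queues[i]:
            queues[i].popleft()
            if len(queues[i]) >= threshold:
                return i, i  # long queue: keep serving it (exhaustive mode)
            return i, (i + 1) % n  # short queue: plain round-robin advance
    return None, pointer  # every queue is empty
```

Under this reading, a threshold of 0 gives exhaustive round robin (the pointer leaves a queue only once it is empty), and a threshold larger than any buffer size gives plain round robin.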
Buffer length impact to crosspoint queued crossbar switch performance
A performance analysis of two crosspoint queued switches, with four and sixteen ports, is presented in this paper. The crosspoint queued switch architecture has recently regained relevance, since modern technology makes it straightforward to implement large buffers in crosspoints. An advantage of this type of solution is the absence of control communication between linecards and schedulers. Four algorithms are used to implement the scheduler: longest queue first, round robin, frame-based round-robin matching, and random. Throughput, average cell latency, and loss probability are evaluated under uniform and various nonuniform traffic patterns. The results show that the longest queue first algorithm has the best performance in most simulated cases. It is also shown that the implemented algorithm has no influence on switch performance if the buffers are long enough.
Performance evaluation of new scheduling methods for the RR/RR CICQ switch
Computer Communications, 2005
Increasing link speeds and port counts in packet switches demand that methods for minimizing internal speed-up and implementing fast scheduling be developed. Combined input and cross point queued (CICQ) switches with round-robin (RR) polling of virtual output queues (VOQ) and of cross point buffers can natively forward variable-length packets without a required internal segmentation into cells. However, native switching of variable-length packets results in unfairness between ports. To eliminate this unfairness, we propose a block transfer mechanism that transfers up to a predefined number of bytes of packet data from a selected VOQ. This mechanism does not require internal speed-up. We also propose an overlapped RR (ORR) arbiter design that fully overlaps RR polling and scheduling. Using simulation and both synthetic and traced packet traffic as input, we show that the RR/RR CICQ switch with the block transfer mechanism has a lower delay than an input queued (IQ) switch that internally uses cells. We also show that the ORR arbiter is scalable, work conserving, and fair.