Scalability of delays in input queued switches
Paolo Giaccone
Notes for the class on “Router and Switch Architectures”
Politecnico di Torino, December 2013

Scalability of delays

Consider an $N \times N$ switch.

Key question: how does the average delay $W$ scale with $N$ when $N \to \infty$?

Assumptions:
- Bernoulli i.i.d. arrivals with rate $\lambda_{ij} \in [0, 1]$ cell/slot at input $i$ for output $j$
- Admissible traffic: $\rho \in (0, 1)$ and

$$\sum_k \lambda_{kj} \le \rho \quad \forall j, \qquad \sum_k \lambda_{ik} \le \rho \quad \forall i$$

P. Giaccone (Politecnico di Torino) Delay and frame scheduling Dec. 2013

Output Queued (OQ) switch

Assume uniform traffic: $\lambda_{ij} = \rho/N$.

Delay of an OQ switch:

$$W^{OQ} = 1 + \left(1 - \frac{1}{N}\right)\frac{\rho}{2(1-\rho)} \approx 1 + \frac{\rho}{2(1-\rho)} = O(1) \qquad (1)$$

Proof: each output queue is a slotted M/D/1 queue with binomial $(N, \rho/N)$ arrivals per slot and service time equal to one slot.

Scheduling in Input Queued (IQ) switches

Queue-independent policies
- Arrival rates are known
- Frame scheduling
  - Random Frame scheduler
  - Periodic Frame scheduler

Queue-aware policies
- Arrival rates are unknown
- Slot-by-slot schedulers
  - e.g.: MWM, iSLIP
- Queue-aware frame scheduler

Random Frame scheduling

Assume non-uniform traffic such that $\lambda_{ij} = \rho\mu_{ij}$ for some $\rho < 1$, where $[\mu_{ij}]$ is a doubly stochastic matrix: $\sum_k \mu_{ik} = \sum_k \mu_{kj} = 1$.

Thanks to the BvN theorem, $\mu_{ij} = \sum_k p_k M^k_{ij}$ with $p_k \ge 0$, $\sum_k p_k = 1$, and each $M^k = [M^k_{ij}]$ a matching matrix ($1 \le k \le N!$).

At each timeslot, the scheduler selects $M^k$ at random with probability $p_k$.

Delay of Random Frame (RF) scheduler:

$$W^{RF} = \frac{N-1}{1-\rho} = O(N)$$
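The random frame scheduler and its delay can be illustrated with a small numerical sketch. It assumes a hypothetical decomposition made only of cyclic-shift matchings (a general doubly stochastic matrix would need a full BvN decomposition, not shown), and it evaluates each VOQ with the slotted Geom/Geom/1 mean-delay expression that the proof relies on. All function names and the weights `[0.5, 0.3, 0.2]` are illustrative assumptions.

```python
import random

def cyclic_decomposition(weights):
    """Hypothetical BvN decomposition using only cyclic shifts.

    weights[k] is p_k; matching k connects input i to output (i + k) mod n.
    """
    n = len(weights)
    return [(p, [(i + k) % n for i in range(n)]) for k, p in enumerate(weights)]

def service_probabilities(decomposition):
    """mu_ij = sum_k p_k * M^k_ij."""
    n = len(decomposition[0][1])
    mu = [[0.0] * n for _ in range(n)]
    for p, matching in decomposition:
        for i, j in enumerate(matching):
            mu[i][j] += p
    return mu

def pick_matching(decomposition, rng):
    """Select one matching for the current slot, matching k with probability p_k."""
    weights = [p for p, _ in decomposition]
    matchings = [m for _, m in decomposition]
    return rng.choices(matchings, weights=weights, k=1)[0]

def geom_geom_1_delay(lam, mu):
    """Mean delay of a slotted Geom/Geom/1 queue (arrival prob. lam, service prob. mu)."""
    eta = lam * (1.0 - mu) / (mu * (1.0 - lam))
    return eta / (lam * (1.0 - eta))

def average_delay(mu, rho):
    """Traffic-weighted average of per-VOQ delays, with lambda_ij = rho * mu_ij."""
    total_rate = 0.0
    weighted = 0.0
    for row in mu:
        for m in row:
            if m == 0.0:
                continue  # this VOQ receives no traffic
            lam = rho * m
            total_rate += lam
            weighted += lam * geom_geom_1_delay(lam, m)
    return weighted / total_rate

decomposition = cyclic_decomposition([0.5, 0.3, 0.2])
mu = service_probabilities(decomposition)
matching = pick_matching(decomposition, random.Random(0))  # one slot's matching
rho = 0.8
w_rf = average_delay(mu, rho)
closed_form = (len(mu) - 1) / (1.0 - rho)
print(w_rf, closed_form)
```

The two printed values should agree up to rounding, whatever weights are chosen, since the average depends only on $N$ and $\rho$.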
Proof - I

For a slotted Geom/Geom/1 queue with arrival probability $\lambda$ and service probability $\mu$, the average delay is

$$W^{\text{Geom/Geom/1}} = \frac{\eta}{\lambda(1-\eta)} \quad \text{with} \quad \eta = \frac{\lambda(1-\mu)}{\mu(1-\lambda)} \qquad (2)$$

In the random frame scheduler, VOQ$_{ij}$ is a Geom/Geom/1 queue with service probability

$$1 - \Pr(\text{VOQ}_{ij} \text{ is not served}) = 1 - \prod_k \left(1 - p_k M^k_{ij}\right) \approx 1 - \left(1 - \sum_k p_k M^k_{ij}\right) = \sum_k p_k M^k_{ij} = \mu_{ij}$$

and arrival probability $\lambda_{ij} = \rho\mu_{ij}$.

Proof - II

Now

$$\eta = \rho\,\frac{1-\mu_{ij}}{1-\rho\mu_{ij}} \qquad 1-\eta = \frac{1-\rho}{1-\rho\mu_{ij}}$$

Recalling (2),

$$W_{ij} = \frac{1}{\rho\mu_{ij}} \cdot \frac{\rho(1-\mu_{ij})}{1-\rho\mu_{ij}} \cdot \frac{1-\rho\mu_{ij}}{1-\rho} = \frac{1-\mu_{ij}}{\mu_{ij}(1-\rho)}$$

The total arrival traffic is

$$\lambda_{tot} = \sum_{i,j} \lambda_{ij} = \rho N$$

and the overall average delay is

$$W^{RF} = \frac{1}{\lambda_{tot}} \sum_{i,j} \lambda_{ij} W_{ij} = \frac{1}{N\rho} \sum_{i,j} \frac{\lambda_{ij}(1-\mu_{ij})}{\mu_{ij}(1-\rho)} = \frac{1}{N\rho} \sum_{i,j} \frac{\rho(1-\mu_{ij})}{1-\rho} = \frac{1}{N\rho} \sum_{i,j} \frac{\rho-\lambda_{ij}}{1-\rho} = \frac{N^2\rho - N\rho}{N\rho(1-\rho)} = \frac{N-1}{1-\rho}$$

Periodic Frame scheduling

Assume uniform traffic: $\lambda_{ij} = \rho/N$.

The scheduler serves each VOQ exactly every $N$ timeslots:
- fixed frame of $N$ timeslots
- during timeslot $t$, input $i$ is connected to output $(i + t) \bmod N$
- e.g. for $N = 3$ the frame is $(M_1, M_2, M_3)$ with

$$M_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad M_2 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} \qquad M_3 = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$$

Delay of Periodic Frame (PF) scheduling:

$$W^{PF} = 1 + \frac{N}{2(1-\rho)} = O(N)$$

Proof - I

Each VOQ is a slotted, single-server queue with arrival probability $\rho/N$ and actual/tentative services every $N$ slots.

Now sample the state of the queue every $N$ slots, at each service opportunity. The sampling period is $N$ slots.

The VOQ appears as a slotted M/D/1 queue with binomial $(N, \rho/N)$ arrivals and service equal to one sampling period. For such a queue, we know (see (1)):

$$W^{M/D/1} \approx 1 + \frac{\rho}{2(1-\rho)}$$
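The periodic frame described above can be generated mechanically. The sketch below builds the $N$ matchings from the rule "input $i$ is connected to output $(i + t) \bmod N$", checks that every VOQ is served exactly once per frame, and evaluates the stated delay $W^{PF} = 1 + N/(2(1-\rho))$. The function names are hypothetical, and slots are numbered from $t = 0$ as an assumption, so the first matching is the identity.

```python
def periodic_frame(n):
    """Matchings of one frame: in slot t, input i is connected to output (i + t) mod n."""
    return [[(i + t) % n for i in range(n)] for t in range(n)]

def as_matrix(matching):
    """Permutation matrix of a matching given as a list of outputs indexed by input."""
    n = len(matching)
    return [[1 if matching[i] == j else 0 for j in range(n)] for i in range(n)]

def services_per_frame(frame):
    """Count how many times each VOQ (i, j) is served within one frame."""
    n = len(frame)
    count = [[0] * n for _ in range(n)]
    for matching in frame:
        for i, j in enumerate(matching):
            count[i][j] += 1
    return count

def pf_delay(n, rho):
    """Average delay of the periodic frame scheduler, 1 + N / (2 (1 - rho))."""
    return 1.0 + n / (2.0 * (1.0 - rho))

frame = periodic_frame(3)
matrices = [as_matrix(m) for m in frame]
counts = services_per_frame(frame)
for matrix in matrices:
    print(matrix)
print(counts)
print(pf_delay(16, 0.5))
```

Because the schedule never looks at the queues, it can be computed once and replayed forever; the price is that a packet may wait up to a full frame for its service opportunity even when the switch is lightly loaded.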
Proof - II

The average delay in the original VOQ will be $N$ times the M/D/1 delay. In addition, the delay must include the average waiting time before being served, and must be reduced since the service time is just 1 slot and not $N$ slots as in the sub-sampled system considered.

$$W^{PF} = \frac{N}{2} + N W^{M/D/1} - (N-1) = \frac{N}{2} + N + \frac{N\rho}{2(1-\rho)} - N + 1 = \frac{N}{2} + 1 + \frac{N\rho}{2(1-\rho)} = 1 + \frac{N}{2}\left(1 + \frac{\rho}{1-\rho}\right) = 1 + \frac{N}{2} \cdot \frac{1}{1-\rho}$$

Queue-independent Frame Scheduling

For the previous queue-independent frame schedulers:

$$W^{RF} = O(N) \qquad W^{PF} = O(N)$$

General property (delay of a queue-independent frame scheduler): for any scheduling algorithm that operates independently of the queue size,

$$W^{\text{queue-independent}} = O(N) \qquad (3)$$

as proved in

[1] M. J. Neely, E. Modiano, Y.-S. Cheng, “Logarithmic Delay for N × N Packet Switches Under the Crossbar Constraint”, IEEE/ACM Transactions on Networking, Vol. 15, No. 3, June 2007.

Comparing (3) with (1), queue-independent frame scheduling appears inefficient in terms of delay.

Generic Frame Scheduling

Occupancy matrix $L = [L_{ij}]$, where $L_{ij}$ is the size of VOQ$_{ij}$.

BvN theorem:

$$T^\star = \max_{i=1,\dots,N} \left\{ \sum_{k=1}^{N} L_{ik},\ \sum_{k=1}^{N} L_{ki} \right\}$$

is the minimum clearance time for $L$.

Minimum clearance time and maximal size matchings: any arbitrary sequence of maximal size matchings will be able to serve all packets of $L$ in $\le 2T^\star - 1$ timeslots.

Proof: a given packet can be delayed by at most $T^\star - 1$ packets on the same input and by at most $T^\star - 1$ packets on the same output. In total, each packet can be delayed by at most $2T^\star - 2$ other packets.

Queue-aware frame scheduling

Fix the frame duration $T$.

Let $I_k = [Tk, \dots, T(k+1) - 1]$ be the slots corresponding to the $k$th frame.

At the beginning of the $k$th frame, i.e.
at slot $Tk$, the scheduler computes all the matchings for the future slots in $I_k$ based on just the arrivals in $I_{k-1}$.

Overflow packets are packets that arrived in $I_{k-1}$ and were not served in $I_k$.

We assume (for now) that overflow packets are dropped.

Key idea:
1. Choose $T$ large enough to (almost) avoid overflow packets
2. Delays are $O(T)$

Queue-aware frame scheduling

Let $A(T) = [a_{ij}(T)]$ be the cumulative number of packets arrived during the $k$th frame $I_k$.

By the BvN theorem, it is possible to serve all the packets and avoid overflow packets iff $\sum_k a_{ik} \le T$ for all $i$ and $\sum_k a_{kj} \le T$ for all $j$.

We will show that if $T = \Theta(\log N)$:
- Pr(frame overflow) can become negligible
- delays become $O(\log N)$

Chernoff Bound

Theorem: let $X_1, X_2, \dots, X_n$ be independent binary random variables, with $p_i = \Pr(X_i = 1)$. Let $X = \sum_{i=1}^{n} X_i$ and $\mu = E[X]$. For any $\delta > 0$:

$$\Pr(X > (1+\delta)\mu) < \left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{\mu} \qquad (4)$$

[Figure: $P(X > (1+\delta)\mu)$ versus $\delta$, for $\mu = 1, 10, 100, 1000, 10000$]

Proof from Wikipedia

From http://en.wikipedia.org/wiki/Chernoff_bound

$$\Pr[X > (1+\delta)\mu] \le \inf_{t>0} \frac{E\left[\prod_{i=1}^{n} \exp(tX_i)\right]}{\exp(t(1+\delta)\mu)} = \inf_{t>0} \frac{\prod_{i=1}^{n} E[\exp(tX_i)]}{\exp(t(1+\delta)\mu)} = \inf_{t>0} \frac{\prod_{i=1}^{n} \left[p_i \exp(t) + (1-p_i)\right]}{\exp(t(1+\delta)\mu)}$$

The third expression follows because $e^{tX_i}$ takes the value $e^t$ with probability $p_i$ and the value 1 with probability $1 - p_i$.

Rewriting $p_i e^t + (1 - p_i)$ as $p_i(e^t - 1) + 1$ and recalling that $1 + x \le e^x$ (with strict inequality if $x > 0$), we set $x = p_i(e^t - 1)$.

Thus,

$$\Pr[X > (1+\delta)\mu] < \frac{\prod_{i=1}^{n} \exp(p_i(e^t - 1))}{\exp(t(1+\delta)\mu)} = \frac{\exp\left((e^t - 1)\sum_{i=1}^{n} p_i\right)}{\exp(t(1+\delta)\mu)} = \frac{\exp((e^t - 1)\mu)}{\exp(t(1+\delta)\mu)}$$

If we simply set $t = \log(1+\delta)$, so that $t > 0$ for $\delta > 0$, we can substitute and find

$$\frac{\exp((e^t - 1)\mu)}{\exp(t(1+\delta)\mu)} = \frac{\exp((1+\delta-1)\mu)}{(1+\delta)^{(1+\delta)\mu}} = \left[\frac{\exp(\delta)}{(1+\delta)^{(1+\delta)}}\right]^{\mu}$$

Minimum frame size to avoid overflow

Frame size and overflow: let $\gamma = \rho e^{1-\rho}$. If

$$T \ge \frac{\log(N/\epsilon)}{\log(1/\gamma)}$$

then Pr(frame overflow) $\le \epsilon$.

[Figure: $\gamma$ versus $\rho$]

[Figure: minimum frame size $T$ versus $\rho$, for $N = 16$ and $N = 1024$, with $\epsilon = 0.1, 0.01, 0.001, 0.0001$]

Proof - I

Consider a generic output $j$. Let $C(T)$ be the number of packets arrived during the frame and destined to $j$:

$$C(T) = \sum_{i=1}^{N} a_{ij}(T)$$

$$\Pr(\text{overflow for output } j) = \Pr(C(T) > T)$$

$C(T) = \sum_{i=1}^{N} \sum_{t=1}^{T} X_{it}$, where $X_{it} = 1$ with probability $\lambda_{ij}$ and all $X_{it}$ are independent random variables. We can use the Chernoff bound:

$$\mu = E[C(T)] = \sum_{i=1}^{N} \sum_{t=1}^{T} E[X_{it}] = T \sum_{i=1}^{N} \lambda_{ij} = T\rho'$$

having defined $\rho' = \sum_{i=1}^{N} \lambda_{ij} \le \rho$.

Proof - II

Using (4),

$$\Pr(C(T) > T) = \Pr\left(C(T) > \frac{\mu}{\rho'}\right) < \left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{\mu}$$

with $1 + \delta = 1/\rho'$, i.e. $\delta = 1/\rho' - 1$. Hence

$$\Pr(C(T) > T) < \left(\frac{e^{1/\rho' - 1}}{(1/\rho')^{1/\rho'}}\right)^{\rho' T} = \left(\frac{e^{1-\rho'}}{1/\rho'}\right)^{T} = \left(\rho' e^{1-\rho'}\right)^{T} \le \gamma^{T}$$

since $\gamma$ is an increasing function of $\rho$.

By the union bound:

$$\Pr(\text{overflow for any output}) \le \sum_j \Pr(\text{overflow for output } j) \le N\gamma^{T}$$

Now we can set $N\gamma^{T} \le \epsilon$ and obtain

$$T \log(\gamma) \le \log(\epsilon/N) \quad \Rightarrow \quad T \ge \frac{\log(N/\epsilon)}{\log(1/\gamma)}$$
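The frame-size rule can be evaluated directly. The following sketch computes $\gamma = \rho e^{1-\rho}$, the smallest integer $T$ satisfying $T \ge \log(N/\epsilon)/\log(1/\gamma)$, and the resulting union bound $N\gamma^T$. It also evaluates the Chernoff bound (4) with $\mu = \rho T$ and $1 + \delta = 1/\rho$ to confirm that it collapses to $\gamma^T$. Function names and the example values ($N = 16$, $\epsilon = 0.01$, $\rho = 0.5$) are illustrative assumptions.

```python
import math

def gamma(rho):
    """gamma = rho * e^(1 - rho), increasing in rho on (0, 1)."""
    return rho * math.exp(1.0 - rho)

def chernoff_bound(mu, delta):
    """Right-hand side of (4): (e^delta / (1 + delta)^(1 + delta))^mu."""
    return math.exp(mu * (delta - (1.0 + delta) * math.log(1.0 + delta)))

def min_frame_size(n, eps, rho):
    """Smallest integer T with T >= log(N / eps) / log(1 / gamma)."""
    return math.ceil(math.log(n / eps) / math.log(1.0 / gamma(rho)))

def overflow_bound(n, frame_size, rho):
    """Union bound on Pr(frame overflow): N * gamma^T."""
    return n * gamma(rho) ** frame_size

n, eps, rho = 16, 0.01, 0.5
frame_size = min_frame_size(n, eps, rho)
bound = overflow_bound(n, frame_size, rho)
per_output = chernoff_bound(rho * frame_size, 1.0 / rho - 1.0)
print(frame_size, bound, per_output, gamma(rho) ** frame_size)
```

Since $\log(1/\gamma) \to 0$ as $\rho \to 1$, the required frame size grows quickly at high load, which is the behaviour the minimum-frame-size plots describe.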
Queue-Aware Frame scheduling

Assume $\epsilon$ small enough that frame overflows are negligible. Then all packets are served with a delay $\le 2T$.

Delay for Queue-Aware Frame scheduling:

$$W^{QAF} \le \frac{2\log(N/\epsilon)}{\log(1/\gamma)} = O(\log N)$$

Note that [1] formally proves the property for the average delay $W^{QAF}$ by also considering the delays of the overflow packets, which are not dropped as assumed here.

Conclusions

Take-home messages:
- Output-Queued switch: $W^{OQ} = O(1)$
- Queue-Independent Frame scheduler: $W^{QIF} = O(N)$
  - Random Frame scheduler: $W^{RF} = O(N)$
  - Periodic Frame scheduler: $W^{PF} = O(N)$
- Queue-Aware Frame scheduler: $W^{QAF} = O(\log N)$
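To make the scaling laws concrete, the sketch below tabulates the four delay expressions from these notes for a few switch sizes. The load $\rho = 0.5$ and overflow target $\epsilon = 0.01$ are arbitrary illustrative choices, and the function names are hypothetical; the queue-aware value is the upper bound $2\log(N/\epsilon)/\log(1/\gamma)$, not an exact delay.

```python
import math

def w_oq(n, rho):
    """Output-queued switch under uniform traffic, equation (1)."""
    return 1.0 + (1.0 - 1.0 / n) * rho / (2.0 * (1.0 - rho))

def w_rf(n, rho):
    """Random frame scheduler."""
    return (n - 1) / (1.0 - rho)

def w_pf(n, rho):
    """Periodic frame scheduler under uniform traffic."""
    return 1.0 + n / (2.0 * (1.0 - rho))

def w_qaf_bound(n, rho, eps):
    """Upper bound for the queue-aware frame scheduler."""
    gamma = rho * math.exp(1.0 - rho)
    return 2.0 * math.log(n / eps) / math.log(1.0 / gamma)

rho, eps = 0.5, 0.01
sizes = (16, 64, 256, 1024)
rows = [(n, w_oq(n, rho), w_rf(n, rho), w_pf(n, rho), w_qaf_bound(n, rho, eps))
        for n in sizes]
print(f"{'N':>6} {'OQ':>8} {'RF':>10} {'PF':>10} {'QAF bound':>10}")
for n, oq, rf, pf, qaf in rows:
    print(f"{n:>6} {oq:>8.3f} {rf:>10.1f} {pf:>10.1f} {qaf:>10.1f}")
```

With these parameters the logarithmic bound is not the smallest value at $N = 16$, because its constant factor is large; the advantage of the queue-aware scheduler is asymptotic and shows up as $N$ grows.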