## Improved Meta-Flattened Butterfly Multistage Interconnection Network

#### Kawalpreet Kaur

Department of Computer Science & Engineering, Guru Nanak dev University Amritsar, India

#### Sandeep Sharma

Department of Computer Science & Engineering, Guru Nanak dev University Amritsar, India

#### **ABSTRACT**

Parallel processing is efficient form of information processing which means to implement high performance computing systems. A communication network that links processors and memory modules determines the efficiency of these parallel systems. To provide the required connectivity and performance at reasonable cost, an interconnection network is used. Hence, multistage interconnection networks are a good way for providing communication in these systems. This paper includes two major contributions. Firstly, it describes the flaws of existing Metaflattened Network (MFN). Secondly, a new Fault-tolerant multistage interconnection network named as Improved Meta-Flattened Butterfly Network (IMFN) is proposed that can improve the routing problems of Meta-flattened Network. Performance of the proposed network is analyzed in terms of cost and permutation possibility. The performance comparison of the IMFN with MFN shows that the proposed network IMFN gives much better performance in terms of permutation.

#### **Keywords**

Multistage interconnection network, meta-flattened network, Flattened Butterfly, permutation passabilty

#### 1. INTRODUCTION

Interconnection Networks (INs) are considered as a good communication medium for parallel systems. They limit the paths between different communicating nodes in order to minimize the switch complexity, while giving a certain level of parallelism which is superior to that of a bus. The design of a suitable interconnection network for inter-processor communication is one of the key issues of the system performance. Multistage interconnection networks (MINs) provide cost-effective, high-bandwidth communication between processors and/or memory modules in comparison to bus and crossbar interconnection networks [13]. Multi-stage Interconnection Network plays a vital role on the performance of these multiprocessor systems. Considering availability of paths to establish new connections, MINs are classified into three categories: blocking, non-blocking, and re-arrangeable networks [2, 3, 4].

The main shortcoming of Meta-Flattened Network which was based on Flattened Delta Network [5] is that it is less fault-tolerant. In case of failure of a switching element or a link, the request can't be passed on to all linked destinations. For example, if a 1<sup>st</sup> switching element (SE) of middle stage (2<sup>nd</sup>) fails then the request coming to that SE will not reach 1st four destinations. Hence, in this paper, a novel structure which referred to as Improved Meta-Flattened Network (IMFN) is introduced in order to increase the number of paths between every pair of sources and destinations. By using this network, the fault-tolerance can be improved greatly.

The paper is structured as five major sections. In the first section, MINs are briefly introduced. Then, the existing Meta-flattened network (MFN) is described. The construction of Improved Meta-flattened network (IMFN) is presented in Section 3. In Section 4, the performance of IMF and MF are compared in terms of two

parameters: i.e. permutation possibility, bandwidth and cost. Finally results in Section 5.

#### 2. META FLATTENED DELTA NETWORK

In MFN networks, the structure of the first and the last stages remains constant. Also, similar to a flattened network, the intermediate stages are merged to form a single stage. Two methods can be adopted to flatten intermediate stages. First, the stages can be flattened in groups of two, which is mainly applicable to networks with the even number of stages. Second, all intermediate stages can be implemented as a flattened network. Figure 1 represents the organizations of Meta-Flattened Network with 16 inputs. As shown in this figure, the number of stages is three and the number of inputs and outputs in the intermediate routers increases. [1]

# 3. CONSTRUCTION PROCEDURE OF IMPROVED META FLATTENED NETWORK

Improved meta-flattened Network (IMFN) is a multistage interconnection network, designed from Meta-flattened Network. An N  $\times$  N ( $2^n\times 2^n$ ) network consists of m stages (where m =  $\log_2$  N/2). All the stages contain equal number of switching elements. The switches in the last stage are of size  $2\times 2$  whereas stages from 1 to m-1 are having switches of size  $3\times 3$ . IMFN of size  $16\times 16$  is shown in Figure 2.



Fig 1: MF-Butterfly

Following structural changes have been made in IMFN:

- 1) Switches in the  $1^{st}$  stage are of size  $3\times3$  instead of  $2\times2$ .
- 2) Change in connections.

#### 4. PERFORMANCE MEASURES

The performance of both the Networks has been analysed in terms of cost, permutation passibility and bandwidth.

#### **4.1 Cost**

An assumption is made to estimate the cost of a network that is the cost of a switch is proportional to the number of gates involved, which is roughly proportional to the cross-points within a switch [7],[8], [9]. For example a 2x2 switch has 4 units of hardware cost whereas a 3x3switch has 9units. The estimated cost for IMFN MIN is given below in Table 1.

Cost of 8 2x2 switches i.e. 8x 4= 32 Cost of 16 3x3 switches i.e. 16x 9=144 Total cost of Improved Meta-flattened network (IMFN) = 32+144=

**Table 1: Cost of IMFN** 

| NETWORK                         | COST |
|---------------------------------|------|
| Improved meta-flattened network | 176  |
| (IMFN)                          |      |

There is a little increase in cost but this is negligible taking in account performance improvement using other parameters.



Fig 2: IMF-Butterfly

#### 4.2 Permutation Passibility

A permutation is a full one-to-one mapping between the network inputs and outputs [11], [12]. If number of requests occur simultaneously at the source then the permutation passibility behaviour of a network shows that how many input requests are able to pass through the given network, and how many of them will successfully mature, i.e., reach their destination [6].

Table 2: Identical Permutation Passibility of MFN

| Source | Destination | Route        |  |
|--------|-------------|--------------|--|
| 0      | 0           | 0-A-A'-A''-0 |  |
| 1      | 1           | Blocked      |  |
| 2      | 2           | 2-B-B'-B''-2 |  |
| 3      | 3           | Blocked      |  |
| 4      | 4           | 4-C-C'-C''-4 |  |
| 5      | 5           | Blocked      |  |

| 6  | 6  | 6-D-D'-D''-6  |  |
|----|----|---------------|--|
| 7  | 7  | Blocked       |  |
| 8  | 8  | 8-E-E'-E"-8   |  |
| 9  | 9  | Blocked       |  |
| 10 | 10 | 10-F-F'-F"-10 |  |
| 11 | 11 | Blocked       |  |
| 12 | 12 | 12-G-G'-G"-12 |  |
| 13 | 13 | Blocked       |  |
| 14 | 14 | 14-H-H'-H"-14 |  |
| 15 | 15 | Blocked       |  |

Total no. of requests = 16No. of requests matured successfully = 8Average path length = (3+3+3+3+3+3+3+3+3)/8=3

Table 3: Incremental Permutation Passibility of MFN

| Source | Destination | Route            |  |
|--------|-------------|------------------|--|
| 0      | 2           | 0-A-A'-B"-2      |  |
| 1      | 3           | Blocked          |  |
| 2      | 4           | 2-B-B'-D'-C"-4   |  |
| 3      | 5           | Blocked          |  |
| 4      | 6           | 4-C-C'-D"-6      |  |
| 5      | 7           | Blocked          |  |
| 6      | 8           | 6-D-H'-F'-E"-8   |  |
| 7      | 9           | Blocked          |  |
| 8      | 10          | 8-E-E'-F"-10     |  |
| 9      | 11          | Blocked          |  |
| 10     | 12          | 10-F-F'-H'-G"-12 |  |
| 11     | 13          | Blocked          |  |
| 12     | 14          | 12-G-G'-H"-14    |  |
| 13     | 15          | Blocked          |  |
| 14     | 0           | 14-H-D'-B'-A"-0  |  |
| 15     | 1           | Blocked          |  |

Total no. of requests=16No. of requests matured successfully = 8Average path length = (3+4+3+4+3+4+3+4)/8=3.5

**Table 4: Incremental Permutation Passibility of MFN** 

| Source | Destination | Route            |  |
|--------|-------------|------------------|--|
| 0      | 7           | 0-A-A'-C'-D''-7  |  |
| 1      | 6           | Blocked          |  |
| 2      | 5           | 2-B-B'-D'-C"-5   |  |
| 3      | 4           | Blocked          |  |
| 4      | 3           | 4-C-C'-A'-B"-3   |  |
| 5      | 2           | Blocked          |  |
| 6      | 1           | 6-D-D'-B'-A''-1  |  |
| 7      | 0           | Blocked          |  |
| 8      | 15          | 8-E-E'-G'-H"-15  |  |
| 9      | 14          | Blocked          |  |
| 10     | 13          | 10-F-F'-H'-G"-13 |  |
| 11     | 12          | Blocked          |  |
| 12     | 11          | 12-G-G'-E'-F"-11 |  |
| 13     | 10          | Blocked          |  |
| 14     | 9           | 14-H-H'-F'-E"-9  |  |
| 15     | 8           | Blocked          |  |

Total no. of requests=16No. of requests matured successfully = 8Average path length = (4+4+4+4+4+4+4+4)/8=4

### Permutation of improved meta-flattened network

Table 5: Identical Permutation Passibility of IMFN

| Source | Destination | Route         |  |
|--------|-------------|---------------|--|
| 0      | 0           | 0-A-A'-A''-0  |  |
| 1      | 1           | Blocked       |  |
| 2      | 2           | 2-B-B'-B''-2  |  |
| 3      | 3           | Blocked       |  |
| 4      | 4           | 4-C-C'-C''-4  |  |
| 5      | 5           | Blocked       |  |
| 6      | 6           | 6-D-D'-D''-6  |  |
| 7      | 7           | Blocked       |  |
| 8      | 8           | 8-E-E'-E"-8   |  |
| 9      | 9           | Blocked       |  |
| 10     | 10          | 10-F-F'-F"-10 |  |
| 11     | 11          | Blocked       |  |
| 12     | 12          | 12-G-G'-G"-12 |  |
| 13     | 13          | Blocked       |  |
| 14     | 14          | 14-H-H'-H"-14 |  |
| 15     | 15          | Blocked       |  |

no. of requests=16No. of requests matured successfully = 8Average path length = (3+3+3+3+3+3+3+3+3+3)/8=3. Total

**Table 6: Incremental Permutation Passibility of IMFN** 

| Source | Destination | Route             |  |
|--------|-------------|-------------------|--|
| 0      | 2           | 0-A-A'-B''-2      |  |
| 1      | 3           | 1-A-B-B'-B"-3     |  |
| 2      | 4           | Blocked           |  |
| 3      | 5           | Blocked           |  |
| 4      | 6           | 4-C-C'-D"-6       |  |
| 5      | 7           | 5-C-D-D'-D"-7     |  |
| 6      | 8           | 6-D-H'-F'-E"-8    |  |
| 7      | 9           | 7-D-C-G'-E'-E"-9  |  |
| 8      | 10          | 8-E-E'-F"-10      |  |
| 9      | 11          | 9-E-F-F'-F"-11    |  |
| 10     | 12          | Blocked           |  |
| 11     | 13          | Blocked           |  |
| 12     | 14          | 12-G-G'-H"-14     |  |
| 13     | 15          | 13-G-H-H'-H"-15   |  |
| 14     | 0           | 14-H-D'-B'-A''-0  |  |
| 15     | 1           | 15-H-G-C'-A'-A"-1 |  |

Total no. of requests=16No. of requests matured successfully = 12Average path length = (3+4+3+4+4+5+3+4+3+4+4+5)/12=3.83

#### 4.3 Bandwidth Analysis

#### 4.3.1 Assumptions

Here are the assumptions on the basis of which the probabilistic relations have been carried out [8].

- The IN operates in a synchronous mode, i.e., the requests issued by the processors begin and end simultaneously.
- The requests are random and the request generated by a processor is independent of the request generated by another processor rejected.
- The requests generated in a cycle are independent of the requests generated in the previous cycle
- "p0" is the probability with which a processor generates a request. Thus, p0 is the rate of request of a processor per cycle

Table 7: Incremental Permutation Passibility of IMFN

|        |             | •                |
|--------|-------------|------------------|
| Source | Destination | Route            |
| 0      | 7           | 0-A-A'-C'-D"-7   |
| 1      | 6           | Blocked          |
| 2      | 5           | 2-B-B'-D'-C"-5   |
| 3      | 4           | Blocked          |
| 4      | 3           | 4-C-C'-A'-B"-3   |
| 5      | 2           | Blocked          |
| 6      | 1           | 6-D-D'-B'-A"-1   |
| 7      | 0           | Blocked          |
| 8      | 15          | 8-E-E'-G'-H"-15  |
| 9      | 14          | Blocked          |
| 10     | 13          | 10-F-F'-H'-G"-13 |
| 11     | 12          | Blocked          |
| 12     | 11          | 12-G-G'-E'-F"-11 |
| 13     | 10          | Blocked          |
| 14     | 9           | 14-H-H'-F'-E"-9  |
| 15     | 8           | Blocked          |

Total no. of requests=16No. of requests matured successfully = 8Average path length = (4+4+4+4+4+4+4+4)/8=4

- The probability with which processor Pi addresses memory Mi is zero i.e. there is no favourite memory
- Networks are of same size N x N.

For calculating probabilistic equations, let us suppose a MIN of size  $a^n \times b^n$  with  $a^n$  sources and  $b^n$  destinations. The analysis is based on a x b crossbar switches. It is assumed that all the destinations are independently and identically distributed [10]. Let the Request Generation Probability is p, at each of the 'a' inputs of a x b crossbar switches. The expected number of requests that passes per unit of time is given by b-b  $(1-p/b)^a$ . Dividing it by number of output lines we get rate of requests at any of the output lines. So output rate, which is function of input line is given by  $1-(1-pb^{-1})^a$ . Since output rate at one stage is input rate to next stage, output rate can be recursively evaluated, starting from initial stage. Output rate of final stage will determine the Bandwidth of a MIN [10]. **So Bandwidth BW is** 

BW=  $b^n p_n$  and  $p_0=p$ 

#### **Probability Equations for IMFBN**

p[1]=1-(1-p[0]/3)<sup>3</sup> p[2]=1-(1-p[1]/3)<sup>3</sup> p[3]=1-(1-p[1]/2)<sup>2</sup>

To have an idea that how these equations have been derived, Consider Fig. of IMFBN. There are 3 stages. All the stages have  $N2^{-1}$  switches. As discussed above probability to reach ith stage is given by  $1-(1-p_{i-1}b^{-1})^a$ . In the first and second stage there are 3x3 switches, therefore a = b = 3. In last stage, switches are of 2x2 and the input lines are coming form first and second stages.

The results of the Probability equations for Bandwidth are shown in Table 8. and Fig. 3.

**Table 8: Bandwidth Comparison** 

| p   | MFN      | IMFN     |
|-----|----------|----------|
| 0.1 | 1.474233 | 1.532848 |
| 0.2 | 2.71169  | 2.685833 |
| 0.3 | 3.746    | 3.712176 |
| 0.4 | 4.6067   | 4.60772  |
| 0.5 | 5.461016 | 5.397944 |
| 0.6 | 6.01     | 5.924708 |

| 0.7 | 6.678946 | 6.4581879 |  |  |
|-----|----------|-----------|--|--|
| 0.8 | 7.041766 | 7.010505  |  |  |
| 0.9 | 7.5140   | 7.30278   |  |  |
| 1   | 7.91308  | 7.607873  |  |  |



**Request Generation Probability** 

Fig. 3: Bandwidth comparison of MFN and IMFN

Table 9: Parametric Analysis of MFN and IMFN Networks

| Par                        | ameters                                                     | MFN  | IMFN  |
|----------------------------|-------------------------------------------------------------|------|-------|
| Permutation<br>Passibility | Total No. of<br>requests<br>appearing at the<br>source side | 48   | 48    |
|                            | Total request<br>matured                                    | 24   | 28    |
| Cost Function              |                                                             | 8N+8 | 9N+32 |
| Total Path Length          |                                                             | 84   | 102   |
| Average Path Length        |                                                             | 3.5  | 3.6   |

#### 5. CONCLUSION

The bandwidth analysis shows that newly proposed network IMFN has comparable bandwidth with that of MFN. From the data given in table no. 9, it can be concluded that although the average path length of proposed network IMFN is greater than MFN. But there is a significant improvement in number of requests that has been successfully matured at the destination side in case of IMFN. It provides multiple paths of varying lengths between a source-destination pair. Thus, IMFN is more fault-tolerant than the existing network. Hence, the proposed network IMFN gives better performance in this respect.

#### 6. REFERENCES

- [1] MahsaMoazez, FarshadSafaei and Majid Rezazadeh, "Design and Implementation of Multistage Interconnection Networks for SoC Networks," International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.2, No.5, October 2012.
- [2] J. Duato, S. Yalamanchili, L.M. Ni, Interconnection networks: An engineering approach, Morgan Kaufmann Publishers, 2003.
- [3] J. Kim, W. J. Dally, D. Abts, "Adaptive routing in high radix Clos network," Proceedings of the 2006 ACM/IEEE conference on Supercomputing (SC'06), 2006.
- [4] C. Clos, "A study of non-blocking switching networks," Bell System Tech. Journal, Vol.32, No. 2, 1953, pp. 406-424.
- [5] J. Kim and W.J. Dally, "Flattened butterfly: A cost-efficient topology for high-radix networks," ISCA '07 Proceedings of the 34th annual international symposium on Computer architecture, Vol. 35, Issue 2, May 2007.
- [6] Bhuyan Laxmi N., Yang Qing and Aggrawal P. Dharma, "Performance of MultiProcessor Interconnection Networks", Proceeings of IEEE, 1989,pp. 25-37.
- [7] Aggarwal and P.K. Bansal, "Routing and Path Length Algorithm for Cost-effective Modified Four Tree Network", IEEE Region 10 Conference on Computers, Communications, Control and Power Engineering, 2002, pp. 293-297.
- [8] J.H. Patel," Performance of processor-memory interconnection for Multiprocessor", IEEE, Transaction on computers, Vol. 30,No. 10,pp,771-780,1981.
- [9] Sandeep Sharma, K.S Kahlon, P.K Basnal," Reliability and path length analysis of irregular fault tolerant multistage interconnection network", PAGE 16-23, ACM SIGARCH Computer Architecture News, 2009.
- [10] Rinkle Aggarwal and Lakhwinder Kaur,"Design and Bandwidth Analysis of Fault-Tolerant Multistage Interconnection Networks", Journal of Computer Science 4 (11): 963-966, 2008
- [11] Sandeep Sharma, and P.K.Bansal, "A new fault tolerant Multistage Interconnection Network", IEEE TENCON'02, vol 1,2002, pp 347-350.
- [12] H.J. Siegel, "Interconnection Network for large-Scale Parallel Processing": theory and Case studies, McGraw Hill Second Addition, 1992, pp. 1-4,271-274.
- [13] Sergio D'Angelo, Alderighi Monica, Fabio Casini, Salvi Davide and Giacomo R. Sechi, "A Fault-Tolerant FPGA-based Multi-Stage Interconnection Network for Space Applications", Proceedings of the First IEEE International Workshop ON Electronic Design, Test and Applications (DELTA.02) IEEE, 2002