
Comparison of network topologies

Figure 3 shows saturation network throughput for different sizes of Clos and 2-dimensional grid networks under random and systematic traffic for 64 byte packets. Systematic traffic involves fixed pairs of nodes sending to each other. For random traffic, nodes choose a destination from a uniform distribution.
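The two traffic patterns described above can be sketched as follows. This is only an illustration, not the traffic generator used for the measurements: the function names, the pairing of node i with node i+1, and the exclusion of a node's own address from the random choice are all assumptions.

```python
import random


def systematic_pairs(num_nodes):
    """Return a fixed destination for every node (hypothetical pairing).

    Node 0 is paired with node 1, node 2 with node 3, and so on, so the
    two nodes of each pair send to each other.
    """
    if num_nodes % 2:
        raise ValueError("systematic traffic needs an even number of nodes")
    dest = {}
    for a in range(0, num_nodes, 2):
        dest[a] = a + 1
        dest[a + 1] = a
    return dest


def random_destination(src, num_nodes, rng):
    """Pick a destination uniformly among all nodes other than src.

    Excluding src itself is an assumption; the text only says that the
    destination is drawn from a uniform distribution.
    """
    d = rng.randrange(num_nodes - 1)
    return d if d < src else d + 1
```

Under systematic traffic the destination map is computed once and never changes, whereas under random traffic `random_destination` would be called again for each new packet.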

 
Figure 3: Throughput versus network size for Clos and grid networks under random and systematic traffic with 64 byte packets

The throughput of the Clos networks under random traffic is higher than that of the 2-dimensional grids because the Clos has a larger cross-sectional bandwidth. The maximum cross-sectional bandwidth is defined as the bidirectional data rate that can pass between the two parts of the network when it is divided into two equal halves. The 256 node Clos has a maximum theoretical cross-sectional bandwidth of 2.44 Gbytes/s, whereas the grid of the same size has only 305 Mbytes/s. For the grid networks, the per-node throughput decreases rapidly as the network size increases. For a 64 node grid, which consists of an array of switches, the per-node throughput under random traffic is only 40% (4 Mbytes/s) of the maximum link bandwidth. For a 1024 node grid, the per-node throughput under random traffic is only 10% (1 Mbyte/s) of the maximum link bandwidth.
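The figures quoted above can be cross-checked with a little arithmetic. The sketch below uses only the numbers given in the text; the variable names are illustrative, and treating 1 Gbyte as 1000 Mbytes is an assumption about the units.

```python
# Cross-sectional bandwidth of the 256 node networks, as quoted in the text.
clos_bisection = 2.44 * 1000  # Mbytes/s, assuming 1 Gbyte = 1000 Mbytes
grid_bisection = 305.0        # Mbytes/s

# The Clos offers roughly eight times the cross-sectional bandwidth.
bisection_ratio = clos_bisection / grid_bisection


def implied_link_bandwidth(per_node, fraction_of_link):
    """Link bandwidth implied by a per-node rate and its share of the link."""
    return per_node / fraction_of_link


# Both grid data points imply the same link bandwidth of about 10 Mbytes/s.
link_from_64 = implied_link_bandwidth(4.0, 0.40)
link_from_1024 = implied_link_bandwidth(1.0, 0.10)

# Aggregate random-traffic throughput of the two grids (nodes x per-node rate).
aggregate_64 = 64 * 4.0      # Mbytes/s
aggregate_1024 = 1024 * 1.0  # Mbytes/s
growth = aggregate_1024 / aggregate_64
```

On these numbers, a grid with 16 times as many nodes delivers only about 4 times the aggregate throughput, which is the rapid per-node decrease described above.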

The results show that the network throughput under random traffic is always significantly lower than the maximum theoretical cross-sectional bandwidth. This is because the throughput of the network under random traffic is limited by head-of-line blocking.
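Head-of-line blocking can be illustrated with a much simpler model than the networks measured here. The sketch below simulates a single saturated crossbar with one FIFO queue per input and uniformly random destinations. It is not the simulation or analysis used in this work, and the function name and parameters are hypothetical. For this textbook model the saturation throughput is known to approach 2 - sqrt(2), roughly 0.586 of the link rate, as the number of ports grows.

```python
import random


def hol_saturation_throughput(ports, slots, seed=0):
    """Estimate saturation throughput of an input-queued crossbar.

    Every input always has a packet waiting (saturation). Only the packet
    at the head of each queue can be served, so a head packet that loses
    arbitration blocks everything queued behind it.
    """
    rng = random.Random(seed)
    # Destination output of the packet at the head of each input queue.
    head = [rng.randrange(ports) for _ in range(ports)]
    delivered = 0
    for _ in range(slots):
        # Group inputs by the output their head packet is waiting for.
        contenders = {}
        for inp, out in enumerate(head):
            contenders.setdefault(out, []).append(inp)
        # Each requested output serves one contender chosen at random;
        # the losers keep the same head packet for the next slot.
        for inputs in contenders.values():
            winner = rng.choice(inputs)
            head[winner] = rng.randrange(ports)  # next packet moves up
            delivered += 1
    # Fraction of the total output capacity that was actually used.
    return delivered / (ports * slots)
```

Calling `hol_saturation_throughput(16, 2000)` gives a value well below 1.0 even though every input always has traffic to send, because outputs sit idle whenever no head packet is addressed to them.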

The fall-off in performance from systematic to random traffic is more pronounced for the grid than for the Clos. The degradation of performance as the network size increases agrees with the analytical models presented in [8]. That study predicts that the throughput of Clos networks under sustained random load degrades by approximately 25% from linear when the network size is increased from 64 to 512 nodes. The measurements in figure 3 show a reduction of about 20% under the same conditions.
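To make the comparison concrete, the following sketch works out what the two percentages mean for total throughput. It assumes that "degradation from linear" means the shortfall relative to throughput growing in proportion to the number of nodes, and it normalises the throughput of the 64 node network to 1.0. The function name is illustrative.

```python
def scaled_throughput(base_throughput, size_factor, degradation):
    """Throughput after growing the network by size_factor.

    degradation is the fractional shortfall relative to ideal linear
    scaling (0.0 means throughput grows exactly with network size).
    """
    return base_throughput * size_factor * (1.0 - degradation)


size_factor = 512 / 64  # eight times as many nodes

# Relative to a 64 node Clos normalised to 1.0:
predicted = scaled_throughput(1.0, size_factor, 0.25)  # model in [8]
measured = scaled_throughput(1.0, size_factor, 0.20)   # figure 3
```

Under this reading, the model predicts about 6 times the throughput of the 64 node network at 512 nodes, and the measurements correspond to about 6.4 times, against an ideal of 8.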

The performance of the grid is strongly dependent upon the choice of pairs for systematic traffic. The results shown for the grid in figure 3 use a 'best case' scenario, in which communication takes place between nodes attached to nearest neighbour switches. A 'worst case' scenario would be the pairing of nodes with their mirror image node in the network. The throughput of a 256 node grid under this 'worst case' pattern is only 200 Mbytes/s, as opposed to 1.8 Gbytes/s under the 'best case' pattern. This shows that, on the grid, good performance requires locality. The throughput of the Clos under systematic traffic is independent of the choice of pairs due to its high cross-sectional bandwidth.
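The difference between the two pairings can be seen by counting hops and bisection crossings on a square grid. The sketch below is a simplification under stated assumptions: each grid position stands for one endpoint, "mirror image" is taken to mean reflection about the vertical centre line, and the grid side of 16 used in the usage note is arbitrary rather than the switch arrangement of the measured networks.

```python
def neighbour_partner(x, y):
    """Best case: pair adjacent columns, (0,1), (2,3), ... in the same row."""
    return (x ^ 1, y)


def mirror_partner(x, y, side):
    """Worst case (assumed): reflect about the vertical centre line."""
    return (side - 1 - x, y)


def pattern_stats(side, partner):
    """Return (mean hop count, fraction of traffic crossing the bisection).

    The hop count is the Manhattan distance between a position and its
    partner; the bisection is the cut between the left and right halves.
    """
    half = side // 2
    total_hops = 0
    crossing = 0
    for x in range(side):
        for y in range(side):
            px, py = partner(x, y)
            total_hops += abs(px - x) + abs(py - y)
            if (x < half) != (px < half):
                crossing += 1
    n = side * side
    return total_hops / n, crossing / n


best = pattern_stats(16, neighbour_partner)
worst = pattern_stats(16, lambda x, y: mirror_partner(x, y, 16))
```

With `side = 16`, the nearest neighbour pairing keeps every transfer to a single hop and sends nothing across the bisection, while the mirror pairing sends all traffic across it. This is at least consistent with the measured 'worst case' throughput of 200 Mbytes/s lying below the 305 Mbytes/s cross-sectional bandwidth of the 256 node grid.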

The throughput of the torus is about 20% higher than that of the grid because of the extra wrap-around links.



Roger Heeley
Fri Sep 26 17:00:08 MET DST 1997