DSNIC: Brief Firmware Description.

Marcel Boosten

CERN, EP/ATD, Geneva, Switzerland

December 12 1997

Introduction

The DSNIC firmware has been designed to fulfil two requirements: The DSNIC card contains two C101 DS link interfaces. The first C101 is used to provide register-I/O-based network access meant for network control. The second C101 is used to support the high-speed message communication protocol.

We have designed a high-speed message-passing protocol with the following characteristics:

Firmware design

The design of the firmware is the result of the following arguments:

Architecture

Figure 1 shows a block diagram of the firmware. The functions are described in AHDL and compiled to run on the -3 version of the Altera 10K50.


Figure 1: AHDL functional block diagram.

Detailed design

COMIN takes care of packet reception: it splits packet headers from packet data. The headers are sent to the HEADLENFIFO, and the data is sent to the INFIFO. Apart from splitting, COMIN also counts the packet length, which is known as soon as the end-of-packet character has been received. This length is also sent to the HEADLENFIFO. Data in the HEADLENFIFO is delivered to an AMCC mailbox, which can generate an interrupt. On a header-reception interrupt, the host CPU can set up the receiving DMA.
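The splitting performed by COMIN can be illustrated with a small software model. This is only a sketch: the firmware itself is written in AHDL, and the header size (HEADER_BYTES), the function name comin_receive, the tagged tuples placed in the HEADLENFIFO, and the choice to count only the data bytes as the packet length are all assumptions made for illustration, not details taken from the design.

```python
from collections import deque

HEADER_BYTES = 2  # hypothetical header size; the real value is not stated here


def comin_receive(packet, headlenfifo, infifo):
    """Model COMIN for one packet: split header from data and count the length.

    `packet` holds the bytes received before the end-of-packet character.
    """
    header, data = packet[:HEADER_BYTES], packet[HEADER_BYTES:]
    # The header is forwarded first, so the host can react before the data ends.
    headlenfifo.append(("header", header))
    length = 0
    for byte in data:
        infifo.append(byte)  # data bytes stream into the INFIFO
        length += 1
    # The length is only known once the end of the packet has been reached.
    headlenfifo.append(("length", length))
    return length
```

In this model the header entry always precedes the length entry in the HEADLENFIFO, which mirrors the order in which COMIN learns the two values.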

COMOUT takes care of sending data and acknowledge packets. If data is available in both the ACKFIFO and the OUTFIFO, the data in the ACKFIFO is selected first. This way, acknowledges can overtake queued data packets.
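The selection rule in COMOUT amounts to a fixed-priority choice between two queues. The following sketch models it in software; the function name comout_select and the assumption that arbitration happens once per queued item are illustrative and not taken from the AHDL source.

```python
from collections import deque


def comout_select(ackfifo, outfifo):
    """Return the next item to transmit, giving the ACKFIFO priority."""
    if ackfifo:
        return ackfifo.popleft()  # acknowledges always go first
    if outfifo:
        return outfifo.popleft()  # data is sent only when no acknowledge waits
    return None  # nothing to send
```

With one acknowledge and two data packets queued, the acknowledge is returned first even if the data packets were enqueued earlier.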

To hide the latency of the controlling software, there are FIFOs on both the receiving and the transmitting side. The buffer size of 1 Kbyte was chosen because it fits well into the internal resources of the Altera FPGA and can hide a latency of up to 100 µs, which should be sufficient to hide software latency.
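The relation between buffer size and hidden latency is simple arithmetic: a FIFO of a given size can absorb incoming data for as long as it takes the link to fill it. The sketch below assumes that 1 Kbyte means 1024 bytes and that the link delivers on the order of 10 Mbytes/s in one direction; the link rate is an assumption chosen for illustration, not a figure stated in this section.

```python
def hidden_latency_us(fifo_bytes, rate_mbytes_per_s):
    """Time in microseconds that a FIFO can absorb data arriving at the given rate.

    1 Mbyte/s equals 1 byte per microsecond, so the division yields microseconds.
    """
    return fifo_bytes / rate_mbytes_per_s


# Assumed values: 1024-byte FIFO, 10 Mbytes/s per direction.
latency = hidden_latency_us(1024, 10.0)
```

Under these assumptions the result is about 100 µs, which agrees with the latency quoted above.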

To avoid a store-and-forward system, the receiving DMA must be initialised as soon as the header of the packet is received. However, the length of a DMA transfer must be known beforehand, and this is not the case, since the packet length is only determined by COMIN via counting. Therefore FLUSH pads out each packet to the full packet length. DMAs can then be set up immediately on header reception, using the full packet length.
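The effect of FLUSH can be sketched as padding a received packet up to a fixed size, so that a DMA programmed for that size always completes. The function name flush_pad, the pad value of zero, and the example sizes are assumptions for illustration; this section does not say what the firmware actually writes as padding.

```python
def flush_pad(data, full_packet_length, pad_byte=0):
    """Pad `data` to `full_packet_length` bytes so a fixed-size DMA completes."""
    if len(data) > full_packet_length:
        raise ValueError("packet is longer than the full packet length")
    missing = full_packet_length - len(data)
    # pad_byte is a hypothetical filler value
    return bytes(data) + bytes([pad_byte]) * missing
```

The host then uses the length delivered through the HEADLENFIFO to tell valid data from padding.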

AMCCMUX multiplexes the AMCC to all the FIFOs. PASSTHRU provides access to the registers of the C101 and the reset links.

Resource usage


Figure 2: Resource usage of the firmware.

Figure 2 shows the FPGA resource usage. The Altera 10K50 contains a total of 2880 Logic Cells (LCs). Due to limitations in logic placement, only about 80% of these resources can actually be used. Up to 20% of the LCs in the firmware can be made available for additional purposes by reducing the size of the ACKFIFO and the HEADLENFIFO. Currently, these FIFOs are maximised to just fill the Altera 10K50.

Figure 2 contains more functional blocks than Figure 1. The extra blocks FASTSLOW and SLOWFAST take care of data transfer from one clock domain to the other. Furthermore, there are IO and BUF blocks which ensure that the timing constraints of the C101 interface are met.

Performance

Figure 3 shows the throughput versus message length for bidirectional message exchange (Comm2) for several packet sizes (8, 1024, and 4096). The results show the end-to-end data bandwidth between two Linux processes, each running on a 200 MHz Pentium PC.

Figure 3: Bidirectional throughput versus message length for different packet sizes.

This graph shows that the DSNIC can reach a bidirectional process-to-process throughput of 16.6 Mbytes/s, which is 90% of the theoretical bandwidth.
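The two figures quoted for the peak throughput imply a value for the theoretical bidirectional bandwidth, which can be recovered by a one-line calculation. The function name is hypothetical, and the result is only as precise as the rounded 90% figure allows.

```python
def implied_theoretical_bandwidth(measured_mbytes_per_s, fraction_of_theoretical):
    """Theoretical bandwidth implied by a measured rate and its quoted fraction."""
    return measured_mbytes_per_s / fraction_of_theoretical


# 16.6 Mbytes/s at 90% implies roughly 18.4 Mbytes/s bidirectional.
theoretical = implied_theoretical_bandwidth(16.6, 0.90)
```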