The DSNIC should offer the described API to processes, and implement communication via the described protocol. This functionality must be distributed over the FPGA firmware and the host software.
Measurements have shown that the maximum throughput cannot be reached if the host CPU transfers every single byte of a message via register I/O. The firmware must therefore at least offer a packet transfer interface. Because we could not estimate the development time and resource utilisation, i.e., the required number of Logic Cells (LCs), for complex firmware functionality, we decided to initially implement only the packet transfer functionality in firmware.
A well-known problem in interfacing is memory-to-memory copying, an expensive and often nonessential operation that adds CPU load and latency. Consequently, we choose to avoid it: message data must be DMA-transferred directly to the right address in a process's memory.
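The idea of delivering data directly to its final address can be illustrated with a minimal sketch in C. It is not the DSNIC implementation: the descriptor layout, the names `dma_desc` and `dma_deliver`, and the assumption that each packet carries an offset into the message are all hypothetical, and the `memcpy` call only stands in for the transfer that a DMA engine would perform without involving the host CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor: tells the (simulated) DMA engine where in the
 * receiving process's message buffer a packet payload belongs. */
struct dma_desc {
    uint8_t *msg_base; /* start of the message buffer in process memory */
    size_t   msg_len;  /* total size of the message buffer in bytes */
    size_t   offset;   /* offset of this packet's payload in the message */
    size_t   len;      /* payload length in bytes */
};

/* Stand-in for the DMA engine. In hardware this would be a bus-master
 * transfer; memcpy here only models the single device-to-memory write.
 * Returns 0 on success, -1 if the payload would not fit in the buffer. */
int dma_deliver(const struct dma_desc *d, const uint8_t *payload)
{
    assert(d != NULL && payload != NULL);

    /* Reject transfers that would write outside the message buffer. */
    if (d->offset > d->msg_len || d->len > d->msg_len - d->offset)
        return -1;

    /* The payload lands at its final address: no intermediate staging
     * buffer, hence no later memory-to-memory copy by the host CPU. */
    memcpy(d->msg_base + d->offset, payload, d->len);
    return 0;
}
```

Under these assumptions, packets of one message may arrive in any order: each descriptor places its payload at `msg_base + offset`, so the message is complete in the process's buffer as soon as the last packet has been delivered.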