Implementation of S-Link Using Physical Layer at 2.5 Gbps

Hardware Specification

 Aurelio Ruiz
(Universidad Carlos III de Madrid)

Erik van der Bij
(CERN-Geneva)

CERN, Geneva
28 June 2002


Overview of the HOLA implementation

    1. Forward Channel

        1.1 Normal Frames
            1.1.1 Data Frame
            1.1.2 Control Frame
            1.1.3 HOLA Link Command
            1.1.4 Transmission example

        1.2 Test Frames
            1.2.1 Test Control Frame
            1.2.2 Test Data Frame

    2. Return Channel

    3. Test Function

    4. Reset Protocol

    5. Architecture of the Link Cards

          5.1 Transceiver
          5.2 Link Source Card
          5.3 Link Destination Card

    6. Power consumption estimation.



Main Features

S-Link nomenclature: LSC/LDC-D-40-B-3.3-160


This document is intended as a designer's note on the HOLA S-LINK interface (one of the possible implementations of the S-LINK interface standard) and for anyone who wants more detail than the HOLA data sheet provides.

The HOLA S-LINK interface is a standard S-LINK implementation that uses the TI TLK2501 2.5 Gbps transceiver for both the forward and the return channel (one transceiver per card). Optical transmission uses Small Form Factor Pluggable (SFP) multimode transceivers with LC connectors. Unlike the ODIN S-LINK, the HOLA can transfer data over a single duplex fibre at the full S-LINK bandwidth of 250 MB/s. To achieve this rate, the input clock UCLK can be raised to a frequency of up to 62.5 MHz. The output clock in the LDC can also be set to 62.5 MHz, independently of the UCLK frequency (the default value is 40 MHz).

The design is not intended to be used in an environment with radiation.


1. Forward Channel

Once the physical link is in the Active state, 32-bit words (both data and control information) can be sent from the Link Source Card (LSC) to the Link Destination Card (LDC) via the forward channel. IDLEs must be sent continuously on the forward channel whenever neither data nor control information is transmitted. In addition, 3 IDLEs are sent in the interval between data or control frames. 8 IDLEs are sent after the internal commands RLDN (Remote Link Down) and RRES (Remote Reset) to help the receiver synchronize to the data, and after the internal command TON (Test On) to give the LDC time to reset its FIFO.

IDLEs consist either of the sequence <K28.5,D5.6> or <K28.5,D16.2>.

1.1 Normal Frames

Data is encoded using the 8-bit/10-bit algorithm.

Control words are distinguished from data words by a preceding special HOLA Command (Next word Control Word (NCW) - for further details see 1.1.2).

1.1.1 Data Frame

Data words are 32 bits long. Before being transmitted through channel D, each word is split into two 16-bit words. These carry no additional overhead, in order to avoid loss of bandwidth.

The most significant bits, D[31..16], are sent first, followed by the least significant bits, D[15..0], on the next clock cycle.
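The split described above can be sketched in a few lines of Python. This is only an illustration of the word ordering; the function names are hypothetical and do not come from the HOLA design.

```python
def split_word(word):
    """Split a 32-bit word into (D[31..16], D[15..0]), in transmission order."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("word must fit in 32 bits")
    return (word >> 16) & 0xFFFF, word & 0xFFFF


def join_halves(msb_half, lsb_half):
    """Rebuild the 32-bit word from the two 16-bit halves."""
    return ((msb_half & 0xFFFF) << 16) | (lsb_half & 0xFFFF)


halves = split_word(0x12345678)
print([hex(h) for h in halves])  # ['0x1234', '0x5678']
```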

The maximum payload size of a data frame is 4 KB (1024 data words); for larger payloads the error detection capability of the CRC would degrade. The transmission of a data frame starts when a data word is transferred from the FEMB to the LSC input buffer. The frame is closed with CRC generation in the following cases:

The CRC is not generated when the LSC FIFO is empty, in order to avoid a decrease in performance for clocks near the maximum 62.5 MHz.

1.1.2 Control Frame

An S-LINK control word (28-bit word) is sent preceded by the command NCW (Next word Control Word, see Table 2). Any data transmission in progress is stopped (and closed with the corresponding CRC checksum) in order to transmit the control word received from the FEMB (see the example in Table 3).

Even parity is used as error detection for control words. Even parity is chosen so that a control word containing all zeros will have all parity bits also zero. The 4 least significant bits of the word are used for this purpose, as shown in the table below:

Word bits                  Content
D[31..4]                   Control Word
4 least significant bits   Even parity bits for D[31..25], D[24..18], D[17..11] and D[10..4], in that order
 Table 1. Control word with parity bits
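A minimal sketch of the parity scheme follows, under one assumption that the text does not state explicitly: the four parity bits are placed in D[3]..D[0] in the same order as the groups in Table 1. The function names are illustrative only.

```python
# Bit ranges covered by each parity bit, listed in the order of Table 1.
GROUPS = [(31, 25), (24, 18), (17, 11), (10, 4)]


def even_parity(value):
    """Return 1 if value has an odd number of ones, so the total becomes even."""
    return bin(value).count("1") & 1


def field(word, hi, lo):
    """Extract bits hi..lo of word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def add_parity(word):
    """Fill the 4 least significant bits with the parity of D[31..4].

    Assumption: the parity bits occupy D[3]..D[0] in the order of Table 1.
    """
    word &= 0xFFFFFFF0
    for position, (hi, lo) in zip((3, 2, 1, 0), GROUPS):
        word |= even_parity(field(word, hi, lo)) << position
    return word


def parity_ok(word):
    """Check a received control word against its own parity bits."""
    return add_parity(word) == word


print(hex(add_parity(0x00000000)))  # 0x0
print(hex(add_parity(0x80000000)))  # 0x80000008
```

As the text notes, an all-zero control word yields all-zero parity bits, which the first call illustrates.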

1.1.3 HOLA S-LINK Command

HOLA internal commands are used for one of the following purposes: to send a command with a meaning of its own, or to show that the next word will be either a Control Word or a CRC Checksum.

All of them begin with 2 consecutive K23.7 characters, followed by the corresponding code, as described in the table below:

K23.7 K23.7
Remote reset off
K23.7  K23.7  0000000000000001
Remote reset
K23.7  K23.7  0000000000000010
Next word a CRCC
K23.7  K23.7  0000000000000011 
Next word a control word
K23.7  K23.7  0000000000000100
Test mode off
K23.7  K23.7  0000000000000101
Test mode on
K23.7  K23.7  0000000000000110
Link Down
 Table 2. Internal HOLA commands

No error detection is performed. If the command is unrecognized, the LDC will discard the word.

1.1.4 Transmission example

Here is a simple transmission example, consisting of a first control word (for example, a START word), followed by a certain amount of data (up to 4 KB) and ended with another control word (for example, a STOP word). The data is closed with the CRC checksum, and only then is the control word sent. Note that the CRC checksum is always preceded by the CRC command.

At least 3 IDLEs will be inserted between frames.

n IDLEs | Control Word | 3 IDLEs | < 4KB data | CRCC | CRC Checksum | 3 IDLEs | Control Word | n IDLEs
Table 3. Transmission example
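The ordering in this example can be sketched as a small symbolic generator. The token names ("NCW", "NEXT_IS_CRC", "CRC_CHECKSUM", "IDLE") are placeholders chosen for this sketch, the checksum is not computed because the CRC polynomial is not given here, and splitting payloads longer than 1024 words into several frames is an assumption based on the stated maximum payload size.

```python
MAX_PAYLOAD_WORDS = 1024   # 4 KB of 32-bit words
GAP = ["IDLE"] * 3         # 3 IDLEs between frames


def control_frame(word):
    """A control word is announced by the NCW command."""
    return ["NCW", ("CTRL", word)]


def data_frames(words):
    """Yield one symbol list per data frame, each closed with its checksum."""
    for start in range(0, len(words), MAX_PAYLOAD_WORDS):
        chunk = words[start:start + MAX_PAYLOAD_WORDS]
        # The checksum is always preceded by the CRC command.
        yield [("DATA", w) for w in chunk] + ["NEXT_IS_CRC", "CRC_CHECKSUM"]


def transmission(start_word, words, stop_word):
    """Control word, data closed by CRC, then the final control word."""
    symbols = control_frame(start_word) + GAP
    for frame in data_frames(words):
        symbols += frame + GAP
    return symbols + control_frame(stop_word)


for symbol in transmission(0xB0F00000, [0x11111111, 0x22222222], 0xE0F00000):
    print(symbol)
```

The control word values in the usage line are arbitrary examples, not values defined by the specification.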

2. Return Channel

Two different kinds of frames are sent alternately in the return channel: frames of 7 IDLEs for synchronization (Table 4) and "normal" data frames, which carry the flow control and reset information and the return lines (Table 5).

IDLE (<K28.5,D5.6> or <K28.5,D16.2>)
Table 4. IDLE frame in return channel

Bits Description
[1..0] RL[0] 
[3..2] RL[1] 
[5..4] RL[2] 
[7..6] RL[3] 
[9..8] XOFF# - Flow Control
[11..10] LDC down
[13..12] Remote reset
[15..14] Reserved - Set as 0, ignore on reception
Table 5. Commands in return channel

A third combination, consisting of the sequence <K23.7,K23.7>, is used to indicate that the next frame will be used for test (see section 3).

For error detection every bit is sent twice. If an error is detected in the word, it is discarded.
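The duplication scheme can be illustrated with a short sketch based on the bit layout of Table 5. It assumes that both copies of a bit carry the same (non-inverted) value, which the text implies but does not spell out; the field and function names are hypothetical.

```python
# Field order follows Table 5: field i occupies bits [2i+1 .. 2i].
FIELDS = ["RL0", "RL1", "RL2", "RL3", "XOFF_N", "LDC_DOWN", "REMOTE_RESET"]


def encode_return_word(**bits):
    """Build the 16-bit return word; every field bit is written twice.

    The reserved bits [15..14] are left at 0.
    """
    word = 0
    for index, name in enumerate(FIELDS):
        if bits.get(name, 0):
            word |= 0b11 << (2 * index)
    return word


def decode_return_word(word):
    """Return the field values, or None if the two copies of any bit differ.

    The reserved bits [15..14] are ignored on reception.
    """
    fields = {}
    for index, name in enumerate(FIELDS):
        pair = (word >> (2 * index)) & 0b11
        if pair not in (0b00, 0b11):
            return None  # error detected: the whole word is discarded
        fields[name] = pair & 1
    return fields


word = encode_return_word(RL0=1, XOFF_N=1)
print(hex(word))                     # 0x303
print(decode_return_word(word ^ 1))  # None
```

Flipping a single bit breaks one pair, so the decoder rejects the whole word, as in the second call.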

3. Test function

The HOLA uses the standard S-LINK test pattern, the walking bit pattern, both for the forward and for the return channel. The test mode is started by sending a 'test-mode-on' command -TON- and ended with a 'test-mode-off' command -TOFF- through the forward channel.

In the return channel, test frames are inserted between each normal sequence (4 in this implementation, although any other number would be possible), each preceded by a K23.7 character. Data is still sent during test mode in order to keep both the return lines and all the functions (flow control or Link Down) active. If an error is detected, the data error LED on the LDC is illuminated. This implies that errors detected in the LSC on the return channel must be reported back to the LDC. For that purpose, the LSC is forced to send an invalid pattern (all bits 1) in the next test frame. An example is shown in the table below:

Data Frame | 1st. <K23.7,K23.7> | Test word | ... | 4th. <K23.7,K23.7> | 4th. Test word
Table 6. Transmission example for the test on the return channel

The test cycle in the return channel ends when the LSC gets out of test mode. That means that error detection in the last test words in return channel may be lost. This fact should not be of significance, since test is intended to be performed for long time periods (in comparison to one test cycle - 32 clock cycles long).

When the LDC receives the TON command, FIFOs on both sides are reset, even if the FIFO was requesting flow control (TON command is sent ignoring flow control). Data sent in previous clock cycles that could be still stored there will be lost.

4. Reset protocol

The reset protocol is changed from the S-LINK specified card reset to a link reset. When either side detects a problem with the link, i.e. the receiver(s) do not acquire lock on serial data, a link down command is sent to the other card, making it go down. If URESET# is set low on either side, a Remote reset command (RRES) is sent to the other side, making both cards go up when the error is cleared.

Figure 1. LSC Power-up and Reset State Machine

The LSC powers up in the POWER state. From there it waits for hw_ok, a 21-bit filter signal on , and rlup, which indicates that a command other than link down has been received on the return channel. The counter in the RESET-UP transition is reset after an LSC reset and ensures that link down lasts 4 clock cycles at an LSC reset.

When the link is up and the LDC is reset, the LSC goes into the RESET state to answer the reset command. This is necessary because the CRC and the LDC are reset only on this LSC reset command. When answering a reset command the reset counter is not started, and the LSC stays up during the LDC-initiated reset cycle.

In the POWER and DOWN states, the internal command RLDN is sent. This helps the LDC receivers acquire lock faster and more reliably, as it prevents the improper word alignment that can be caused by a static valid code field embedded within the data field of the command words.

Figure 2. LDC Power-up and Reset state machine

The LDC powers up in the POWER/DOWN state. Here the LDC waits for hw_ok and rlup while continuously sending the link down command to the LSC. This ensures that the LDC comes up before the LSC in the power-up sequence, so that no data written to the LSC is lost. When the LDC is in the UP state, an LSC reset does not affect the state machine, but the CRC error latches and the test mode state are cleared.


Figure 3. Power-up sequence, LSC powered up first


Figure 4. Power-up sequence, LDC powered up first

If the LDC is powered down and powered up again during normal operation, a reset MUST be performed on one of the cards. Otherwise, both cards will stay in the down state. If the LSC is powered down and powered up again, the link recovers by itself.

For software simulation, the user can select a shorter filter length (5 bits) by setting the generic SIMULATE to 1, in order to increase simulation speed. The normal initialization time on each card is 25 ms; for simulation it is reduced to 6 µs.

5. Architecture of the Link Cards

The same hardware module is used for the LSC and the LDC; the only difference is the design of the protocol chip. The ALTERA EP20K30ETC144-2 programmable logic device is used as the protocol chip. The active serial configuration scheme is used, with an EPROM as the data source. The TLK2501 Gbps transceiver is used for 8b/10b encoding/decoding and as the serial interface. A 125 MHz quartz oscillator provides the encoder clock for the ENDEC and the media interface.

The user can select several parameters of the card through the following generics:
Generic name      Description                                                                                    Valid values                                           Included in LSC   Included in LDC
SIMULATE          Simulation mode (shorter initialization time)                                                  1 = ON (6 µs init. time), 0 = OFF (25 ms init. time)
                  LCLK frequency selection                                                                       1 = 62.5 MHz LCLK clock, 0 = 40 MHz LCLK clock         NO                YES
ACTIVITY_LENGTH   Activity_led stays illuminated after write operation for 2^(ACTIVITY_LENGTH) clock cycles      Integer                                                YES               YES
FIFODEPTH         Depth of FIFO                                                                                  Integer (max. 512 for EP20K30ETC144)                   YES               YES
FULLMARGIN        Number of words that must be free in the FIFO before setting flow control                      < FIFODEPTH                                            YES               YES

Table 7: HOLA generics.

For the current implementation, the values are as follows:
Generic name   LSC   LDC
FIFODEPTH      64    512

Table 8: Generic values for current implementation

With these values, the LSC can accept up to 32 words (FIFODEPTH-FULLMARGIN) after LFF# was asserted.

These are the different PCB mounting options for the source and destination cards:
      R16   R17   R18   CP2
LSC   -     200   200   OPEN
LDC   0     -     -     FILL

Table 9: Mounting table

5.1 Transceiver

The TLK2501 transceiver is a member of a multi-gigabit family of transceivers used in ultra high-speed, bidirectional, point-to-point data transmission systems. This SerDes supports an effective serial interface speed of 1.6 Gb/s to 2.5 Gb/s, providing up to 2 Gb/s of data bandwidth. The transmission media can be a printed-circuit board, copper cables, or fiber-optic cable.

Transmitter Interface

The transmitter section validates incoming 16-bit wide data (TXD [0:15]) on the rising edge of the GTX_CLK. That incoming data is then 8-bit/10-bit encoded, serialized, and transmitted sequentially over the differential high-speed I/O channel. The clock multiplier multiplies the reference clock (GTX_CLK) by a factor of 10, creating a bit clock. This internal bit clock is fed to the parallel-to-serial shift register. This register transmits data on both edges of the bit clock, providing serial data at a rate of 20 times the reference clock.
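The rates quoted in this document follow directly from the 125 MHz reference clock used on the link cards. The short calculation below reproduces them; it ignores IDLEs, commands and CRC words, so the payload figures are upper bounds. The variable names are chosen for this sketch only.

```python
GTX_CLK_HZ = 125_000_000  # reference clock on the link cards

line_rate_bps = GTX_CLK_HZ * 20         # x10 multiplier, both bit-clock edges
payload_rate_bps = GTX_CLK_HZ * 16      # 16 data bits accepted per GTX_CLK cycle
word_rate_hz = payload_rate_bps // 32   # 32-bit S-LINK words per second
byte_rate = payload_rate_bps // 8       # payload bytes per second

print(line_rate_bps / 1e9)     # 2.5   (Gb/s on the fibre)
print(payload_rate_bps / 1e9)  # 2.0   (Gb/s of data after 8b/10b)
print(word_rate_hz / 1e6)      # 62.5  (million 32-bit words per second)
print(byte_rate / 1e6)         # 250.0 (MB/s)
```

The 62.5 MHz word rate matches the maximum UCLK frequency, and 250 MB/s matches the bandwidth quoted in the Main Features section.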

An 8-bit/10-bit encoding algorithm is implemented in the transmitter section of the device; it is the same algorithm used by Fibre Channel and Gigabit Ethernet. The encoding is transparent to the user; data is internally encoded such that the user only has to write 16-bit data. The 8-bit/10-bit encoder converts 8-bit wide incoming data to a 10-bit wide encoded data character. This is done to improve its transmission characteristics. The encoding scheme maintains the signal DC balance by keeping the number of ones and zeros the same. This provides good transition density for clock recovery and improves error checking.
Externally, the device has a 16-bit wide interface; internally, the data is split into two 8-bit wide bytes for encoding. Each byte is fed into a separate encoder. The encoding depends on two additional input signals, TX_EN and TX_ER. Table 10 provides the transmit data control decoding.

TX_EN   TX_ER   Transmitted characters
0       0       IDLE (<K28.5,D5.6> or <K28.5,D16.2>)
0       1       Carrier Extend (K23.7,K23.7)
1       0       Normal data character
1       1       Transmit error propagation (K30.7,K30.7)
Table 10: Transmit data controls
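Table 10 amounts to a two-bit lookup. The sketch below assumes that the two columns are TX_EN and TX_ER, in that order, as suggested by the preceding paragraph; the symbol names and the helper function are hypothetical.

```python
# (TX_EN, TX_ER) -> what the TLK2501 puts on the line, following Table 10.
TX_CONTROL = {
    (0, 0): "IDLE",               # <K28.5,D5.6> or <K28.5,D16.2>
    (0, 1): "CARRIER_EXTEND",     # <K23.7,K23.7>
    (1, 0): "DATA",               # normal data character
    (1, 1): "ERROR_PROPAGATION",  # <K30.7,K30.7>
}


def controls_for(symbol):
    """Return the (TX_EN, TX_ER) pair that produces the given symbol."""
    for pins, name in TX_CONTROL.items():
        if name == symbol:
            return pins
    raise ValueError(f"unknown symbol: {symbol}")


# HOLA internal commands begin with <K23.7,K23.7>, i.e. Carrier Extend.
print(controls_for("CARRIER_EXTEND"))  # (0, 1)
```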

Receiver Interface

The receiver section accepts 8-bit/10-bit encoded differential serial data. The interpolator and clock recovery circuit lock to the data stream and extract the bit rate clock. This recovered clock is used to re-time the input data stream. The serial data is then clocked into the serial-to-parallel shift registers. The 10-bit wide parallel data is then multiplexed and fed into two separate 8-bit/10-bit decoders, where the data is synchronized to the word boundary of the incoming data stream by detection of the K28.5 synchronization pattern.

RX_DV   RX_ER   Received characters
0       0       IDLE (<K28.5,D5.6> or <K28.5,D16.2>)
0       1       Carrier Extend (K23.7,K23.7)
1       0       Normal data character
1       1       Transmit error propagation (K30.7,K30.7)
Table 11: Receive status Signals

5.2 Link Source Card (LSC)

On the LSC side, 32-bit data received from the FEMB is written into a FIFO (of size FIFODEPTH x 34) at the positive edges of UCLK (the clock provided by the user). 32-bit words are read from the FIFO, and the necessary logical operations are performed. The most and least significant halves are then alternately multiplexed and delivered to the TLK2501 (both running at 125 MHz), which encodes them using the 8b/10b algorithm and converts the parallel input into the serial data of the forward channel.

The two words that the S-LINK specification allows after LFF# is asserted become 32 words in the current implementation. This value can be set by the user by means of the generics FIFODEPTH and FULLMARGIN described previously.

Architecture of the HOLA LSC Card

Data coming from the return channel is read at the positive edges of RXCLK, the recovered return channel clock. No FIFO is required for them.

The HOLA LSC provides two additional LEDs: ACTIVITY_LED, which is illuminated when a write operation was performed on the FIFO in the previous 2^ACTIVITY_LENGTH cycles (ACTIVITY_LENGTH being the generic described in Tables 7 and 8), and DERRLED_N, which shows that the input FIFO is COMPLETELY full and no more data will be accepted (if this LED is illuminated, there is an error, as the user violated the S-LINK specification by sending more than 32 words after LFF# was asserted).

Clock     Functions                                                        Speed
UCLK      Clock from FEMB -> Write in FIFO                                 Typ. 40 MHz
ICLK      Internal clock -> Read from FIFO, any internal logic             62.5 MHz
GTX_CLK   Clock for the serialization for Forward Channel, multiplexing    125 MHz
RX_CLK    Recovered clock from Return Channel                              125 MHz
Table 12. Clocks on LSC

The timing parameters for the LSC are listed in Table 13:

Symbol Description Min Max Units
tDS Data Set-up time 10 ns
tDH Data Hold time 1 ns
tENS Enable Set-up time 10 ns
tENH Enable Hold time 1 ns
tWFF Write Clock to Full Flag 12 ns
tCLK Clock Cycle time 25 ns
tCH Clock High time 11 ns
tCL Clock Low time 11 ns
 Table 13. LSC timing parameters (S-Link minimum)

5.3 Link Destination Card

The depth of the FIFO in the LDC can be selected by the user by setting the appropriate generics in the core. In an Altera EP20K30E device, up to 512 words can be selected. This value allows the user to use cable lengths of 300 m. The FIFO width is 34 bits.

The serial data received on the LDC is first decoded by the TLK2501 and converted into a 16-bit parallel output (working at 125 MHz). 32-bit data must then be written to the FIFO (this operation is performed at 62.5 MHz). The conversion is made by a 'demultiplexer', which simply latches the MSB half until the next clock cycle, when the LSB half is received. The 32-bit word is then written to the FIFO at 62.5 MHz.
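The latching behaviour described above can be modelled with a small stateful class. This is a behavioural sketch only, not the VHDL of the protocol chip; the class and method names are hypothetical, and one call to push() stands for one 125 MHz clock cycle.

```python
class Demultiplexer:
    """Pair up 16-bit halves (MSB half first) into 32-bit words."""

    def __init__(self):
        self._msb = None  # latched D[31..16], waiting for D[15..0]

    def push(self, half):
        """Accept one 16-bit half; return a 32-bit word when complete, else None."""
        half &= 0xFFFF
        if self._msb is None:
            self._msb = half  # first cycle: latch the MSB half
            return None
        word = (self._msb << 16) | half  # second cycle: combine with the LSB half
        self._msb = None
        return word


demux = Demultiplexer()
for half in (0x1234, 0x5678, 0xCAFE, 0xF00D):
    word = demux.push(half)
    if word is not None:
        print(hex(word))  # 0x12345678, then 0xcafef00d
```

A complete word appears on every second call, which corresponds to the 62.5 MHz write rate into the FIFO.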

At the positive edges of the LCLK clock (typically 40 MHz), a 32-bit word is read from the FIFO and transmitted to the ROMB. Note that in the FIFO 'write' is a 16-bit operation, but 'read' a 32-bit one.

Architecture of the HOLA LDC Card

Words in the return channel are written at the positive edges of XCLK, the internal 125 MHz clock.
Clock Functions Speed
RX_CLK Clock for the serializer/deserializer and demultiplexer 125 MHz
LCLK Clock for LD -> Read from FIFO Typ. 40 MHz
ICLK_2 Clock for the internal logic and write to FIFO 62.5 MHz
GTX_CLK Clock for the return channel 125 MHz

Table 14. Clocks on LDC

The front block writes the data received from the deserialiser to the FIFO. To guarantee the maximum set-up and hold times, data is always registered on the high-to-low transition of the internal clock ICLK_2. The internal 16-bit register datamsb holds the MSB data, which is received first. In case it is necessary to wait one clock cycle before passing the data to the FIFO, the LSB data is kept in the 16-bit register datalsb. A timing diagram is shown in the following figure.

 Table 15 gives the guaranteed timing parameters for LDC. The suffix -40 is used for the parameters of the 40 MHz LCLK version, suffix -625 is used for the 62.5 MHz LCLK version.

Symbol Description Min Max Units
tDS Data Set-up time 10 ns
tDH Data Hold time 1 ns
tENS Enable Set-up time 10 ns
tENH Enable Hold time 1 ns
tWFF Write Clock to Full Flag 12 ns
tCLK-40 Clock Cycle time 25 ns
tCH-40 Clock High time 11 ns
tCL-40 Clock Low time 11 ns
tCLK-625 Clock Cycle time 16 ns
tCH-625 Clock High time 7 ns
tCL-625 Clock Low time 7 ns
Table 15. LDC timing parameters

6. Power consumption estimation

The power consumption of the link has been calculated by measuring the current drawn, by means of a SLIBOX. The results are shown in the table below:
Configuration                                          Min    Typ    Max    Units
LSC - 40 MHz UCLK clock, LDC - 40 MHz LCLK clock       2805   3300   4290   mW
LSC - 40 MHz UCLK clock, LDC - 62.5 MHz LCLK clock     -      -      -      mW
Table 16: HOLA power consumption

The power consumption of the Altera device can be estimated by loading the .pwf files generated after simulation into the Quartus software and running a timing simulation again. In the simulator settings, the .pwf file must be added as "Source of Vector Stimuli", and the option "Estimate power consumption" must also be selected.

The results, and a comparison with the results obtained using the APEX power consumption calculator, are shown below.
                              Quartus estimation                                APEX power calculator estimation
                              Internal power (mW)   IO power (mW)   Total (mW)   Internal power (mW)   IO power (mW)   Total (mW)
LDC - 62.5 MHz LCLK clock     301.2                 0.024           301.24       301.83                34.99           336.82
LDC - 40 MHz LCLK clock       265.66                0.016           265.67       228.09                28.14           256.23
LSC - 62.5 MHz input clock    -                     -               -            307.1                 20.82           327.92
LSC - 40 MHz input clock      301.7                 0.036           301.74       307.1                 19.07           326.16

Table 17: Simulated Altera power consumption

The estimation using the APEX power calculator is less accurate, since the calculation was done with all flip-flops (773 for the LDC, 735 for the LSC) and LEs (111 ESBTON and 763 ESBTOFF for the LDC, 86 and 789 for the LSC) assigned to a 62.5 MHz clock (or to a 40 MHz clock in the case of the LDC with a 40 MHz LCLK clock). The LSC is always calculated at 62.5 MHz because most of its registers work at that speed independently of the input clock. In reality, a number of registers will run at other speeds.

Other sources of power consumption are the SerDes and the optical transceiver. According to their datasheets, the TLK2501 consumes 360 mW at a data rate of 2.5 Gbps, and the optical transceiver consumes 412 mW. The 2.5 V for the SerDes is generated from 3.3 V with a drop-down converter, so its actual power consumption is 475 mW.
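The 475 mW figure can be reproduced with a short calculation, assuming that the drop-down converter behaves like a linear regulator, so that the current drawn from the 3.3 V rail equals the current delivered at 2.5 V. That assumption, and the variable names, are not taken from the design files.

```python
SERDES_MW = 360          # TLK2501 at 2.5 Gbps, from its datasheet
SERDES_SUPPLY_V = 2.5    # SerDes supply voltage
BOARD_SUPPLY_V = 3.3     # rail feeding the converter

# Assumption: linear regulator, so input current equals output current.
serdes_current_ma = SERDES_MW / SERDES_SUPPLY_V            # about 144 mA
serdes_from_3v3_mw = serdes_current_ma * BOARD_SUPPLY_V    # about 475 mW

print(round(serdes_current_ma))   # 144
print(round(serdes_from_3v3_mw))  # 475
```

The difference of roughly 115 mW is dissipated in the converter itself.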

For the moment the S-Link VHDL model only allows input clocks at 40 MHz.