High-speed serial transceivers in PolarFire FPGAs

Introduction

As part of our ongoing series of blog posts on high-speed serial transceivers, we are going to look at how to create a transmitter using a PolarFire FPGA. In a previous post we discussed how high-speed serial links are a good way of sending high-bandwidth data (such as video) between two FPGAs. We also showed how to design a simple high-speed serial receiver. In this blog we are going to show you how to create a high-speed serial transmitter using a PolarFire FPGA.

Key Design Parameters

The PolarFire transmitter we are going to create is based on the following parameters:

Target device: Microchip PolarFire MPF300
Target board: PolarFire Video Kit
Channel type: TX (transmitter only)
Encoding: 8b10b
Comma character: K28.5
Data rate: 2 Gbps
Reference clock speed: 50 MHz
Fabric interface: 32 bits @ 50 MHz
Tool version: Libero SoC 2022.2

High-level Design Structure

The high-level structure of the transmitter design we are going to create is shown below:

Clocking

The PolarFire transceiver is clocked from its own dedicated PLL IP block. The PLL IP block is created from the PolarFire ‘Transmit PLL’ IP wizard. The resulting PLL is connected as follows:

The input REF_CLK is connected to a clock sourced from the FPGA fabric. The source clock is a 50 MHz clock received from an FPGA input pin that is buffered via a global clock buffer.
The output BIT_CLK is the high-speed serial clock. This is connected to the BIT_CLK input of the transceiver.
The output REF_CLK_TO_LANE forwards the reference clock to the transceiver lane. This is connected to the REF_CLK input of the transceiver.
The output LOCK (not PLL_LOCK) is connected to the PLL_LOCK input of the transceiver.

The transceiver output TX_CLK_R is used to clock the FPGA fabric. This clock is sourced from a regional buffer (hence the 'R' suffix).

The reference clock for the PLL can be sourced from a dedicated transceiver clock pin or from the FPGA fabric. If a dedicated transceiver clock pin is used, a block based on the ‘Transceiver Reference Clock’ IP needs to be created. However, for this example the clock is sourced from the general clocking tree as this offers more flexibility.

Reset

The whole design is reset by a single asynchronous, active-low reset sourced from an FPGA pin (it is triggered via a push-button). The TX_CLK_STABLE output from the transceiver is also used to hold the data counter in reset until the fabric clock is stable.
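As a rough illustration of this reset scheme, the sketch below gates the push-button reset with the transceiver's clock-stable flag and uses the result to hold a counter in reset. The entity name 'tx_counter' and its port names are assumptions made for this example rather than names taken from the actual design. The wrap at 255 follows the counter described later in this post.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

--Hypothetical counter block; the names are illustrative only
entity tx_counter is
port (
    i_clk        : in  std_ulogic;  --Fabric clock (LANE0_TX_CLK_R)
    i_aresetn    : in  std_ulogic;  --Async, active-low reset from the push-button
    i_clk_stable : in  std_ulogic;  --LANE0_TX_CLK_STABLE from the transceiver
    o_count      : out std_ulogic_vector(31 downto 0)
);
end tx_counter;

architecture rtl of tx_counter is
    signal s_resetn : std_ulogic;
    signal s_count  : unsigned(31 downto 0);
begin

    --Stay in reset until the button is released AND the transceiver
    --reports that the fabric clock is stable
    s_resetn <= i_aresetn and i_clk_stable;

    p_count : process (i_clk, s_resetn)
    begin
        if s_resetn = '0' then
            s_count <= (others => '0');
        elsif rising_edge(i_clk) then
            if s_count = 255 then
                s_count <= (others => '0');  --Wrap so the count runs 0 to 255
            else
                s_count <= s_count + 1;
            end if;
        end if;
    end process p_count;

    o_count <= std_ulogic_vector(s_count);

end rtl;
```

This sketch releases the reset asynchronously. A production design may prefer to synchronise the release of the reset to the fabric clock.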

Opening the Wizards

The IP wizards are found using the ‘Catalog’ tab in Libero. All of the IP that is required can be found under the ‘PolarFire Features’ section:

Transceiver Interface Wizard

The transceiver interface wizard is set up as follows:

We are only creating a TX channel, so the RX options are disabled. The data rate we require is 2,000 Mbps; however, the PolarFire requires a minimum data rate of 3,200 Mbps. Data rates below 3,200 Mbps are obtained by dividing a higher rate down. The transceiver is therefore set up for 4,000 Mbps with a divide by 2 applied, as this gives us the required 2,000 Mbps (2 Gbps) data rate. The only other setting that needs adjusting is the data width to the fabric (here 32 bits).

Transmit PLL Wizard

The PLL is configured as follows:

A 50 MHz clock will be sourced from the fabric. Note that, because of the divide by 2 option in the transceiver wizard, the data rate selected here needs to be twice the 2 Gbps we are targeting (that is, 4,000 Mbps).

K-Character Insertion

To ensure that the receiver can sync to the transmitter we need to insert commas (K-characters) into the data stream. These commas are used by the receiver to align to the incoming data stream.

The Wikipedia page on 8b10b encoding provides a good overview of what K-characters are. In outline, a K-character is a unique 8b10b code that is not allowed to occur in the normal data stream. The receiver can detect these characters and then use them to align itself to the transmitter. For this example we are going to use the comma code K28.5. This comma code corresponds to a raw data value of 0xBC, which the transmitter encodes into the correct 8b10b value for K28.5.

In our example the transmit data traffic is generated by a free-running counter which counts from 0 to 255. The output of this counter is fed into the transmitter. A small piece of logic replaces the count value with a comma value for the first 4 counts of each 256-count loop. The comma character K28.5 (0xBC) is used so that it matches the receiver's setup. Due to the 4-byte alignment setting on the receiver, the comma character is inserted into the least significant byte of the interface. The logic to do this looks like the following:

--Insert K28.5 (0xBC) into the least significant byte for counts 0 to 3
s_data <= X"000000BC" when (unsigned(s_count) <= 3) else s_count;
--Flag the least significant byte as a K-character for the same counts
s_k_char <= "0001" when (unsigned(s_count) <= 3) else "0000";

The signal 's_count' is the count from the counter. The comma character 0xBC is inserted into the least significant byte when the counter has a value of 3 or less. The K-character flag for the least-significant byte is also set whenever K-characters are being inserted. This will cause the transmitter to encode the comma value as a K-character instead of as a normal data value.
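One way to check this behaviour in isolation is a small self-checking test-bench, sketched below. It steps a count value from 0 to 255 and asserts that the comma and the K-character flag appear only for counts 0 to 3. The entity name 'comma_insert_tb' is hypothetical and is not part of the original design. The insertion logic is written here with a numeric_std unsigned comparison.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

--Hypothetical self-checking test-bench for the comma insertion logic
entity comma_insert_tb is
end comma_insert_tb;

architecture sim of comma_insert_tb is
    signal s_count  : std_ulogic_vector(31 downto 0) := (others => '0');
    signal s_data   : std_ulogic_vector(31 downto 0);
    signal s_k_char : std_ulogic_vector(3 downto 0);
begin

    --Comma insertion logic under test
    s_data   <= X"000000BC" when (unsigned(s_count) <= 3) else s_count;
    s_k_char <= "0001"      when (unsigned(s_count) <= 3) else "0000";

    p_check : process
    begin
        for i in 0 to 255 loop
            s_count <= std_ulogic_vector(to_unsigned(i, 32));
            wait for 20 ns;  --One 50 MHz fabric clock period
            if i <= 3 then
                --Counts 0 to 3 must carry the comma and the K-character flag
                assert s_data = X"000000BC" and s_k_char = "0001"
                    report "Comma not inserted at count " & integer'image(i)
                    severity failure;
            else
                --All other counts must pass through unchanged
                assert s_data = s_count and s_k_char = "0000"
                    report "Unexpected comma at count " & integer'image(i)
                    severity failure;
            end if;
        end loop;
        report "Comma insertion check complete";
        wait;  --Suspend the process for the rest of the simulation
    end process p_check;

end sim;
```

If the logic behaves as described, the simulation runs to the final report without raising a failure.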

Top-level Design

The top-level VHDL of the design looks like the following:

library ieee;
use ieee.std_logic_1164.all;

entity xcvr_tx_top_pf is
port (
    i_clk       : in  std_ulogic;  --Reference clock
    i_aresetn   : in  std_ulogic;  --Async, active-low reset
    o_xcvr_tx_n : out std_ulogic;  --TX output n
    o_xcvr_tx_p : out std_ulogic   --TX output p
);
end xcvr_tx_top_pf;

The VHDL for the transceiver PLL is connected up as follows:

u_pll : transceiver_pll
port map(
    --Inputs
    FAB_REF_CLK => s_clk,  --Reference clock from global buffer
    --Outputs
    BIT_CLK => s_bit_clk,
    LOCK => s_pll_lock,
    PLL_LOCK => open,
    REF_CLK_TO_LANE => s_ref_clk
);

The transceiver VHDL is connected up as follows:

u_xcvr : transceiver
port map(
    --Inputs
    LANE0_8B10B_TX_K => s_k_char,  --From the K-char insertion logic
    LANE0_PCS_ARST_N => i_aresetn,
    LANE0_PMA_ARST_N => i_aresetn,
    LANE0_TX_DATA => s_data,       --From the K-char insertion logic
    LANE0_TX_DISPFNC => c_dispfnc, --Constant set to zero
    TX_BIT_CLK_0 => s_bit_clk,
    TX_PLL_LOCK_0 => s_pll_lock,
    TX_PLL_REF_CLK_0 => s_ref_clk,
    --Outputs
    LANE0_TXD_N => o_xcvr_tx_n,
    LANE0_TXD_P => o_xcvr_tx_p,
    LANE0_TX_CLK_R => s_fabric_clk, --To user logic
    LANE0_TX_CLK_STABLE => s_fabric_clk_stable
);

The input 's_k_char' comes from the comma insertion logic and is the K-character flag. The signal 's_data' is the data value from the comma insertion logic and either contains comma values or count values from the counter.

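The output 'LANE0_TX_CLK_R' from the transceiver is used to clock any user logic. In this example it is used to clock our counter. This clock runs at the frequency the transmitter input requires: each 32-bit word is encoded into 40 bits, so at 2,000 Mbps the transmitter consumes one word every 2,000 / 40 = 50 MHz clock cycle.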

Implementation Details

A single constraints file is generated for the design. This contains the pin mapping and some basic timing constraints. The transmitter pins do not need constraining as this is done automatically by the implementation tool.

Simulating the Design

To prove that the design works it is necessary to simulate the transmitter. A simple test-bench can be created that includes a receiver block. The creation of a receiver is covered in other blogs.
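As a starting point, a test-bench skeleton for the top-level entity shown earlier might look like the sketch below. It only generates the 50 MHz reference clock and the reset, and the receiver is deliberately left out. The test-bench name, the signal names and the 1 us reset duration are assumptions made for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

--Hypothetical test-bench skeleton; the receiver is not included
entity xcvr_tx_top_pf_tb is
end xcvr_tx_top_pf_tb;

architecture sim of xcvr_tx_top_pf_tb is
    constant c_clk_period : time := 20 ns;  --50 MHz reference clock
    signal s_clk     : std_ulogic := '0';
    signal s_aresetn : std_ulogic := '0';
    signal s_tx_n    : std_ulogic;
    signal s_tx_p    : std_ulogic;
begin

    --Free-running 50 MHz reference clock
    s_clk <= not s_clk after c_clk_period / 2;

    --Hold the active-low reset for an arbitrary 1 us, then release it
    s_aresetn <= '0', '1' after 1 us;

    --Transmitter design under test, compiled into the 'work' library
    u_dut : entity work.xcvr_tx_top_pf
    port map (
        i_clk       => s_clk,
        i_aresetn   => s_aresetn,
        o_xcvr_tx_n => s_tx_n,
        o_xcvr_tx_p => s_tx_p
    );

    --A compatible receiver block would be connected to s_tx_p / s_tx_n here

end sim;
```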

To compile the PolarFire transmitter design, two files are needed for each IP: a top-level stub file for the IP and a lower-level file that instantiates the IP blocks. The PolarFire pre-compiled libraries are also required. The following ModelSim DO file snippet shows the files that are required to simulate the PolarFire transceiver design.


# Create a 'polarfire' library and map it to the pre-compiled PolarFire simulation library
vlib polarfire
vmap polarfire $libero_install_path/Designer/lib/modelsimpro/precompiled/vlog/PolarFire

# Transceiver IP: lower-level file first, then the top-level file
vlog -work work $relative_path/component/work/video_transceiver/I_XCVR/video_transceiver_I_XCVR_PF_XCVR_sim.v
vlog -work work $relative_path/component/work/video_transceiver/video_transceiver.v

# Transmit PLL IP: lower-level file first, then the top-level file
vlog -work work $relative_path/component/work/transceiver_pll/transceiver_pll_0/transceiver_pll_transceiver_pll_0_PF_TX_PLL.v
vlog -work work $relative_path/component/work/transceiver_pll/transceiver_pll.v

Once the transmitter has been compiled you can simulate it. In our simple test-bench the transmitter is connected to a compatible receiver block. The transmitter sends the count to the receiver, which then outputs the received count from an output port. Below you can see a screenshot of the transmitter simulation:

Hardware Test

The design can also be tested on hardware. This can be done by connecting the transmitter to a compatible receiver implementation. The output from the receiver can be used to display the received count on a logic analyser:

Summary

High-speed serial transceivers are a very useful tool for sending high-bandwidth data between FPGAs. In this blog we have shown how to create a simple transmitter design. This should give you enough information to start implementing your own high-speed serial transmitter designs.