Implementing a High-Speed Serializer/Deserializer (SerDes) in VHDL
Introduction to High-Speed SerDes in VHDL
Serialiser/deserialiser (SerDes) modules are fundamental building blocks in modern high-speed data communication systems. They convert parallel data streams into serial format for transmission over a single differential pair or optical fibre, and then reconstruct the parallel data at the receiver. Implementing a robust, high-speed SerDes in VHDL demands a deep understanding of digital design principles, clocking architectures, and signal integrity. This article provides a comprehensive guide to designing SerDes in VHDL, covering architecture, timing, coding techniques, simulation, and validation. It is intended for engineers who need to integrate SerDes into FPGA or ASIC designs for applications such as Gigabit Ethernet, PCI Express, SATA, and high-speed backplanes.
SerDes Architecture Fundamentals
A complete SerDes can be divided into four key subsystems: the serializer, the deserializer, clock data recovery (CDR), and alignment/framing circuitry. While the transmitter side serializes parallel data, the receiver side must recover the bit clock from the incoming serial stream and align the data boundaries. Each subsystem imposes distinct constraints on the VHDL design.
Serializer (Parallel-to-Serial Converter)
The serializer takes an N-bit wide parallel word and outputs the bits one at a time at the line rate. The parallel clock (PCLK) is typically N times slower than the serial clock (SCLK). In VHDL, the serializer can be implemented as a shift register with a counter that tracks how many bits remain to be sent. For proper high-speed operation, the shift register must be synthesised using dedicated flip-flops with minimal clock-to-output delay, and the timing between parallel and serial domains must be carefully managed to avoid metastability.
Deserializer (Serial-to-Parallel Converter)
The deserializer performs the reverse operation: it receives a serial bit stream and reconstructs the parallel word. The incoming bits are shifted into a register on each serial clock edge. When the full word is assembled, it is presented on the parallel output bus with an associated valid flag. The deserializer must also handle bit alignment; because a raw bit stream carries no inherent word boundaries, alignment is achieved by searching for a known framing pattern such as a comma character or a packet header.
Clock Data Recovery (CDR)
CDR extracts the serial clock from the incoming data stream. This is critical because the receiver and transmitter are not frequency locked in many systems. In high-speed SerDes, CDR is typically implemented as a phase-locked loop (PLL) or delay-locked loop (DLL) that adjusts a local oscillator to match the incoming data transitions. While CDR is rarely coded in pure VHDL for gate-level synthesis (it uses analog or mixed-signal circuits in ASICs, or hardened blocks in FPGAs), the digital portion such as phase frequency detectors and digital loop filters can be described in VHDL. Understanding the CDR behaviour is essential when designing the deserializer control logic.
Word Alignment and Framing
After CDR, the incoming bits must be aligned to word boundaries. Many protocols embed a distinct pattern (e.g., K28.5 in 8b/10b) that the deserializer uses to locate byte boundaries. The VHDL alignment logic includes a state machine that searches for the pattern and then shifts the bit stream until the pattern is matched. This ensures subsequent parallel words are correctly framed.
Challenges in High-Speed SerDes Design
High-speed SerDes operate at multi-gigabit rates, often exceeding 10 Gbps per lane. At these speeds, physical and digital design challenges become pronounced.
Timing Closure and Metastability
The serializer and deserializer must transfer data between clock domains (parallel and serial). Metastability can occur if setup/hold times are violated. VHDL designers must insert synchronisers and carefully constrain the synthesis tool to meet timing. Using double- or triple-flop synchronisers for control signals is standard practice.
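As a minimal sketch of the multi-flop synchroniser mentioned above, the entity below re-times a single-bit control signal into a destination clock domain. The entity name `Sync_Bit`, its port names, and the `STAGES` generic are hypothetical and chosen only for illustration; it is suitable for single-bit, slowly changing signals, not for multi-bit buses.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical entity: multi-flop synchroniser for one control bit.
entity Sync_Bit is
    Generic (
        STAGES : integer range 2 to 8 := 2  -- flip-flops in the chain (assumed default)
    );
    Port (
        clk_dst  : in  STD_LOGIC;  -- destination-domain clock
        rst      : in  STD_LOGIC;  -- asynchronous reset, active high
        async_in : in  STD_LOGIC;  -- signal launched from another clock domain
        sync_out : out STD_LOGIC   -- async_in re-timed to clk_dst
    );
end Sync_Bit;

architecture RTL of Sync_Bit is
    signal chain : STD_LOGIC_VECTOR(STAGES - 1 downto 0);
begin
    process(clk_dst, rst)
    begin
        if rst = '1' then
            chain <= (others => '0');
        elsif rising_edge(clk_dst) then
            -- The first flop may go metastable; the later flops give it
            -- time to resolve before the value reaches downstream logic.
            chain <= chain(STAGES - 2 downto 0) & async_in;
        end if;
    end process;

    sync_out <= chain(STAGES - 1);
end RTL;
```

Synthesis tools usually need to be told that these flip-flops form a synchroniser (through a vendor attribute or a timing exception) so that they are placed close together and the asynchronous path is excluded from normal timing analysis; the exact mechanism is tool-specific.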
Jitter and Skew
Jitter in the recovered clock can cause bit errors. In the serializer, clock skew between the parallel and serial domains can cause the parallel word to be captured at the wrong moment, corrupting the transmitted data. At the physical layer, high-speed SerDes designs rely on differential signalling, pre-emphasis, and equalisation to preserve signal quality. From a VHDL standpoint, designers must minimise combinatorial logic in the serial data path; retiming (moving flip-flops to balance delays) is a common technique.
Power Consumption
Switching at gigabit frequencies draws significant dynamic power. Power reduction strategies include clock gating, lowering the supply voltage (if the target technology permits), and choosing a wider parallel interface (e.g., 32 bits rather than 8) so that the parallel-side logic runs at a lower clock frequency for a given line rate; the serial clock itself is fixed by the line rate. In VHDL, careful coding, such as registering wide buses behind clock enables, prevents unnecessary toggling.
VHDL Implementation Techniques for High-Speed SerDes
To achieve reliable operation at multi-gigabit rates, the VHDL code must be synthesised into dedicated hardware resources. The following sections present practical coding techniques and examples.
Using Dedicated SerDes Hard Macros
Most modern FPGAs include hardened SerDes transceivers (e.g., Xilinx GTY, Intel E-tile). These blocks contain PLLs, CDR, serializers, deserializers, and equalisation circuits. VHDL code for these devices primarily instantiates vendor primitives and connects them to your core logic. For instance, in Xilinx devices, the GTH or GTY transceiver is configured via attributes and then accessed through a wrapper module. This approach is far more reliable than a full soft SerDes at high speeds. The VHDL code for such wrappers must follow the vendor’s user guide to set correct parameters for line rate, reference clock, and encoding.
Soft SerDes Implementation (Moderate Speeds)
For line rates up to a few hundred megabits per second, a fully soft SerDes in VHDL can be viable. The key is to use shift registers and counters that are implemented with dedicated flip-flops and avoid long combinatorial chains. Below is an expanded example of a serializer that includes a load enable and a busy flag.
```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity Serializer_8to1 is
    Port (
        clk_ser    : in  STD_LOGIC;  -- serial (bit-rate) clock
        clk_par    : in  STD_LOGIC;  -- parallel clock (1/8 rate); not used inside this entity
        rst        : in  STD_LOGIC;  -- asynchronous reset, active high
        data_in    : in  STD_LOGIC_VECTOR(7 downto 0);
        load       : in  STD_LOGIC;  -- must be synchronous to clk_ser
        serial_out : out STD_LOGIC;  -- MSB is transmitted first
        busy       : out STD_LOGIC   -- high while a word is being shifted out
    );
end Serializer_8to1;

architecture Behavioral of Serializer_8to1 is
    signal shift_reg : STD_LOGIC_VECTOR(7 downto 0);
    signal bit_cnt   : integer range 0 to 7;  -- bits still to be shifted after the current one
begin
    process(clk_ser, rst)
    begin
        if rst = '1' then
            shift_reg <= (others => '0');
            bit_cnt   <= 0;
            busy      <= '0';
        elsif rising_edge(clk_ser) then
            if load = '1' then
                -- Capture a new word; bit 7 appears on serial_out immediately.
                shift_reg <= data_in;
                bit_cnt   <= 7;
                busy      <= '1';
            elsif bit_cnt > 0 then
                -- Shift left so the next bit moves into the MSB position.
                shift_reg <= shift_reg(6 downto 0) & '0';
                bit_cnt   <= bit_cnt - 1;
            else
                -- Last bit has been on the line for a full cycle.
                busy <= '0';
            end if;
        end if;
    end process;

    serial_out <= shift_reg(7);
end Behavioral;
```
This serializer uses the serial clock for all operations. The load signal must be synchronous to clk_ser; in practice, a clock domain crossing circuit from the parallel domain is needed.
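One common way to carry a single-cycle load request from the parallel domain into the serial domain is a toggle-based pulse synchroniser, sketched below. The entity name `Load_Pulse_Sync` and the port names `load_par` and `load_ser` are hypothetical. The sketch assumes that `data_in` is held stable by the source until the serializer has captured it, and that load requests are spaced far enough apart for each toggle to be seen in the `clk_ser` domain.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical entity: moves a one-cycle pulse from clk_par to clk_ser.
entity Load_Pulse_Sync is
    Port (
        clk_par  : in  STD_LOGIC;  -- source (parallel) clock
        clk_ser  : in  STD_LOGIC;  -- destination (serial) clock
        rst      : in  STD_LOGIC;  -- asynchronous reset, active high
        load_par : in  STD_LOGIC;  -- one-cycle pulse in the clk_par domain
        load_ser : out STD_LOGIC   -- one-cycle pulse in the clk_ser domain
    );
end Load_Pulse_Sync;

architecture RTL of Load_Pulse_Sync is
    signal toggle_par : STD_LOGIC;
    signal sync_ser   : STD_LOGIC_VECTOR(2 downto 0);
begin
    -- Source domain: convert each pulse into a level change.
    process(clk_par, rst)
    begin
        if rst = '1' then
            toggle_par <= '0';
        elsif rising_edge(clk_par) then
            if load_par = '1' then
                toggle_par <= not toggle_par;
            end if;
        end if;
    end process;

    -- Destination domain: two synchronising flops plus one flop
    -- that remembers the previous synchronised value.
    process(clk_ser, rst)
    begin
        if rst = '1' then
            sync_ser <= (others => '0');
        elsif rising_edge(clk_ser) then
            sync_ser <= sync_ser(1 downto 0) & toggle_par;
        end if;
    end process;

    -- A change between the last two stages marks one load request.
    load_ser <= sync_ser(2) xor sync_ser(1);
end RTL;
```

The synchroniser adds a variable latency of a few `clk_ser` cycles, so this scheme suits occasional word transfers. For continuous streaming, the two clocks are usually derived from the same PLL so that their phase relationship is fixed and the load strobe can be generated directly in the serial domain.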
Deserializer with Word Alignment
The deserializer must extract the parallel word and identify boundaries. The following example includes a simple alignment mechanism that searches for a known 10-bit pattern (here the illustrative value "1111000010", which is a placeholder rather than a real 8b/10b comma).
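```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity Deserializer_8b10b is
    Port (
        clk_ser   : in  STD_LOGIC;  -- recovered serial clock
        rst       : in  STD_LOGIC;  -- asynchronous reset, active high
        serial_in : in  STD_LOGIC;
        data_out  : out STD_LOGIC_VECTOR(9 downto 0);  -- raw 10-bit code group (not decoded)
        aligned   : out STD_LOGIC   -- high once the alignment pattern has been found
    );
end Deserializer_8b10b;

architecture Behavioral of Deserializer_8b10b is
    -- Illustrative alignment pattern; not an actual 8b/10b comma.
    constant ALIGN_PATTERN : STD_LOGIC_VECTOR(9 downto 0) := "1111000010";
    signal shift_reg : STD_LOGIC_VECTOR(9 downto 0);
    -- Bits of the incoming word held in shift_reg, modulo 10;
    -- 0 means a complete word is ready to be output.
    signal bit_cnt   : integer range 0 to 9;
    signal lock      : STD_LOGIC;
begin
    process(clk_ser, rst)
    begin
        if rst = '1' then
            shift_reg <= (others => '0');
            data_out  <= (others => '0');
            bit_cnt   <= 0;
            lock      <= '0';
        elsif rising_edge(clk_ser) then
            shift_reg <= shift_reg(8 downto 0) & serial_in;
            if lock = '0' then
                -- Sliding-window search at every bit position. Reset clears
                -- shift_reg, so the pattern's leading '1' cannot match until
                -- ten real bits have been received.
                if shift_reg = ALIGN_PATTERN then
                    lock     <= '1';
                    data_out <= shift_reg;
                    bit_cnt  <= 1;  -- this edge captures bit 1 of the next word
                end if;
            elsif bit_cnt = 0 then
                -- A full word is in shift_reg: output it every 10 clocks.
                data_out <= shift_reg;
                bit_cnt  <= 1;
            elsif bit_cnt = 9 then
                bit_cnt <= 0;
            else
                bit_cnt <= bit_cnt + 1;
            end if;
        end if;
    end process;

    aligned <= lock;
end Behavioral;
```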
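In a real system, alignment logic must also handle bit slipping, loss of lock, and multiple comma characters per packet. The example assumes that clk_ser is the clock already recovered by a CDR; without one, a PLL or CDR block must supply a sampling clock that is phase-aligned to the incoming data.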
Clock Domain Crossing Strategies
High-speed SerDes designs inevitably cross from the slower parallel clock to the fast serial clock. The safest approach is to use a FIFO (first-in, first-out) buffer to decouple the domains. In VHDL, the FIFO can be implemented as a dual-port RAM whose read and write pointers are Gray-coded before being synchronised across clocks, so that only one pointer bit changes per increment. For minimal latency, handshake-based synchronisers (such as a request-acknowledge scheme) can be used for control signals.
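Asynchronous FIFO pointers are normally converted to Gray code before they cross domains, because a multi-bit binary count sampled mid-transition can yield an arbitrary value. The package below is a small sketch of the two conversions; the package name `Gray_Pkg` and the function names are hypothetical, and the surrounding FIFO (RAM, pointer registers, full/empty logic) is not shown.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical helper package for Gray-coded FIFO pointers.
package Gray_Pkg is
    function bin_to_gray(b : STD_LOGIC_VECTOR) return STD_LOGIC_VECTOR;
    function gray_to_bin(g : STD_LOGIC_VECTOR) return STD_LOGIC_VECTOR;
end package Gray_Pkg;

package body Gray_Pkg is

    -- Each Gray bit is the XOR of two adjacent binary bits;
    -- the MSB is copied unchanged.
    function bin_to_gray(b : STD_LOGIC_VECTOR) return STD_LOGIC_VECTOR is
        variable bn : STD_LOGIC_VECTOR(b'length - 1 downto 0) := b;
        variable g  : STD_LOGIC_VECTOR(b'length - 1 downto 0);
    begin
        g(g'high) := bn(bn'high);
        for i in g'high - 1 downto 0 loop
            g(i) := bn(i + 1) xor bn(i);
        end loop;
        return g;
    end function;

    -- Each binary bit is the XOR of the Gray bit at that position
    -- with the binary bit already computed one position above it.
    function gray_to_bin(g : STD_LOGIC_VECTOR) return STD_LOGIC_VECTOR is
        variable gn : STD_LOGIC_VECTOR(g'length - 1 downto 0) := g;
        variable b  : STD_LOGIC_VECTOR(g'length - 1 downto 0);
    begin
        b(b'high) := gn(gn'high);
        for i in b'high - 1 downto 0 loop
            b(i) := b(i + 1) xor gn(i);
        end loop;
        return b;
    end function;

end package body Gray_Pkg;
```

In a typical arrangement, the write pointer is registered in Gray form in the write domain, passed through a two-flop synchroniser in the read domain, and compared there with the read pointer to derive the empty flag; the full flag is derived symmetrically in the write domain.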
Simulation and Verification
Simulating a SerDes at high speed requires careful testbench design. The testbench should generate data patterns, inject jitter (to assess CDR and timing margins), and verify that transmitted data matches received data after framing. Key aspects include:
- Data generation: Use PRBS (pseudo-random bit sequences) to emulate realistic traffic. PRBS patterns expose error sources like inter-symbol interference and pattern-dependent jitter.
- Clock modelling: Model the serial clock with realistic phase noise. Most digital simulators can approximate jitter by using a variable period in VHDL testbenches (e.g., wait for 125 ps; with small random variations).
- Assertion-based verification: Insert VHDL assertions to check for protocol violations, such as invalid character codes in 8b/10b encoding or misalignment after a defined number of clock cycles.
- Coverage-driven testing: Run long simulations to ensure that the alignment state machine can lock onto the comma pattern under various initial phases. Functional coverage can be recorded using SystemVerilog if available, or via custom VHDL counters.
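The data-generation, clock-modelling, and assertion points above can be combined in a small self-checking testbench, sketched below. It generates a PRBS7 sequence (polynomial x^7 + x^6 + 1) on a clock whose half-period varies randomly, then asserts that the sequence repeats after 127 bits. The entity name, the 400 ps nominal half-period, and the 40 ps peak-to-peak jitter figure are assumptions for illustration, and the uniform per-edge variation is a crude stand-in for real phase noise. The simulator time resolution must be 1 ps or finer.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.MATH_REAL.ALL;

-- Hypothetical testbench: PRBS7 source on a jittered clock.
entity Prbs_Jitter_Tb is
end Prbs_Jitter_Tb;

architecture Sim of Prbs_Jitter_Tb is
    constant HALF_PERIOD  : time := 400 ps;  -- assumed nominal half-period
    constant JITTER_PP_PS : real := 40.0;    -- assumed peak-to-peak variation in ps
    signal clk_ser   : STD_LOGIC := '0';
    signal lfsr      : STD_LOGIC_VECTOR(6 downto 0) := (others => '1');
    signal serial_tx : STD_LOGIC;
    signal done      : boolean := false;
begin
    -- Clock generator: each half-period is offset by a uniformly
    -- distributed random amount within +/- JITTER_PP_PS / 2.
    clk_gen : process
        variable seed1     : positive := 1;
        variable seed2     : positive := 2;
        variable r         : real;
        variable offset_ps : real;
    begin
        if done then
            wait;  -- stop toggling so the simulation can end
        end if;
        uniform(seed1, seed2, r);
        offset_ps := (r - 0.5) * JITTER_PP_PS;
        wait for HALF_PERIOD + offset_ps * 1 ps;
        clk_ser <= not clk_ser;
    end process;

    -- PRBS7 generator: feedback from the two most significant stages.
    prbs_gen : process(clk_ser)
    begin
        if rising_edge(clk_ser) then
            lfsr <= lfsr(5 downto 0) & (lfsr(6) xor lfsr(5));
        end if;
    end process;

    serial_tx <= lfsr(6);  -- would drive the serial input of the design under test

    -- Self-check: a maximal-length 7-bit LFSR returns to its seed
    -- after exactly 127 shifts.
    check : process
    begin
        for i in 1 to 127 loop
            wait until rising_edge(clk_ser);
        end loop;
        wait for 1 ps;  -- let the final register update settle
        assert lfsr = "1111111"
            report "PRBS7 did not repeat after 127 bits" severity error;
        report "PRBS7 period check finished";
        done <= true;
        wait;
    end process;
end Sim;
```

In a fuller testbench, `serial_tx` would feed the deserializer under test, and a second LFSR on the receive side would regenerate the expected sequence for bit-by-bit comparison.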
Vendors like Xilinx and Intel provide simulation libraries for their SerDes primitives. Using these libraries accelerates verification because they provide behavioural models of the transceiver, including aspects of the PLL and equalisation that cannot be described in synthesisable VHDL. Include these models in the simulation testbench whenever vendor primitives are instantiated.
Advanced Topics: Pre-emphasis, Equalisation, and PCB Considerations
While VHDL code cannot alter the physical layer behaviour, the SerDes controller can adjust equalisation settings if the transceiver supports them. Modern FPGA transceivers have programmable transmitter pre-emphasis and receiver equalisation. In VHDL, these are configured by writing to control registers via an internal bus (e.g., AXI-Lite). The designer must understand how pre-emphasis boosts high-frequency components to compensate for channel loss, and how continuous-time linear equaliser (CTLE) and decision-feedback equaliser (DFE) correct inter-symbol interference. This knowledge influences the trade-offs in the digital receive state machine, such as the decision threshold for data sampling.
PCB layout is equally critical. The serial traces must be impedance-controlled (typically 100 Ω differential), and the clock distribution should minimise skew between the reference clock and the transceiver’s dedicated clock input. These physical constraints are outside VHDL, but the digital designer must provide clear guidelines to the PCB engineer, such as the required jitter tolerance and return loss specifications.
Case Study: Implementing a Gigabit Ethernet SerDes in VHDL
Gigabit Ethernet (1000BASE-X) uses an 8b/10b encoding scheme and a line rate of 1.25 Gbps. The transmitter encodes each 8-bit data word, together with a control bit that distinguishes K (control) from D (data) characters, into a 10-bit code group and serializes it. The receiver recovers the clock from the 8b/10b-encoded data, whose guaranteed transition density makes this possible, and aligns word boundaries using the comma contained in /K28.5/ (code group 001111 1010 or 110000 0101, depending on the running disparity). A VHDL implementation would include:
- An 8b/10b encoder (can be implemented as a lookup table) that converts 8-bit data and control signals into 10-bit code groups.
- A serializer that takes the 10-bit code group and outputs it serially at 1.25 Gbps.
- A deserializer that shifts in bits and searches for the comma pattern.
- A CDR block (usually instantiated from a vendor primitive) that generates a 125 MHz recovered clock (for the parallel side) from the incoming 1.25 Gbps stream.
- A FIFO to transfer received data from the recovered clock domain to the system clock domain.
This architecture can be described entirely in VHDL, with the exception of the analog CDR and line driver/receiver. Many open-source projects provide such implementations for educational purposes, but for production, hardened transceivers are strongly recommended.
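For the comma search in the deserializer, a minimal combinational detector might look like the sketch below. It checks the first seven bits of a 10-bit window against the two comma sequences that occur in /K28.5/, one for each running disparity. The entity and port names are hypothetical, and the sketch assumes that the first bit received sits in bit 9 of the window, as it does when bits are shifted in from the right.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical entity: flags a comma in a 10-bit window.
entity Comma_Detect is
    Port (
        code_group : in  STD_LOGIC_VECTOR(9 downto 0);  -- bit 9 = first bit received
        is_comma   : out STD_LOGIC
    );
end Comma_Detect;

architecture RTL of Comma_Detect is
    -- Leading seven bits of K28.5 for each running disparity:
    -- 001111 1010 (current RD negative) and 110000 0101 (current RD positive).
    constant COMMA_RDNEG : STD_LOGIC_VECTOR(6 downto 0) := "0011111";
    constant COMMA_RDPOS : STD_LOGIC_VECTOR(6 downto 0) := "1100000";
begin
    is_comma <= '1' when code_group(9 downto 3) = COMMA_RDNEG
                      or code_group(9 downto 3) = COMMA_RDPOS
                else '0';
end RTL;
```

Applied to a sliding window that advances one bit per serial clock, a '1' on `is_comma` marks the bit position at which the word counter should be re-synchronised.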
Conclusion
Implementing a high-speed SerDes in VHDL is a challenging but rewarding task that bridges digital logic and high-frequency analog design. The VHDL code must be written with synthesis constraints, timing closure, and clock domain crossing in mind. For rates above 1 Gbps, leveraging dedicated transceiver macros is essential; for lower rates, a fully soft SerDes can be built using shift registers and counters. Regardless of the approach, thorough simulation with realistic jitter and data patterns ensures robustness. Understanding the underlying SerDes architecture—serializer, deserializer, CDR, and alignment—enables the designer to make informed trade-offs between area, power, and performance. By following the guidelines in this article, engineers can develop reliable SerDes implementations that power modern communication systems.