Computing is about communicating; some would also say about networking. Digital independence rides the wave of the "Recommendations and Roadmap for European Sovereignty in open source HW, SW and RISC-V Technologies (2021)", which calls for the development of critical open-source IP blocks, such as a PCIE Root Complex (RC). This is the first step in that direction.
Our project aims to open up Artix7 PCIe Gen2 RC IP blocks for use outside of proprietary tool flows. While still reliant on Xilinx Series7 Hard Macros (HMs), it will surround them with open-source soft logic for PIO accesses: the RTL and, even more importantly, the layered software Driver with Demo App.
All that comes with full HW/SW open-source co-simulation, the kind of which is yet to be seen even in proprietary settings. Augmented with a rock-solid openBackplane in the basement of our hardware solution, the geek community will thus get all it takes to build their own, end-to-end openCompute systems.
The project's immediate goal is to empower makers with the ability to drive PCIE-based peripherals from their own soft RISC-V SOCs.
Given that a PCIE End-Point (EP) with DMA is already available as open source, open-source PCIE peripherals do exist for Artix7. Yet they are always, without exception, controlled by a proprietary RC on the motherboard side, typically in the form of a RaspberryPi ASIC or an x86 PC. This project intends to change that status quo.
Our long-term goal is to set the stage for the development of a full open-source PCIE stack, gradually phasing out the Xilinx HMs from the solution. That's a long, ambitious track, especially when it comes to the mixed-signal SerDes and high-quality PLLs. We therefore anticipate a series of follow-on projects that would build on the foundations we hereby set.
This first phase is about implementing an open source PCIE Root Complex (RC) for Artix7 FPGA, utilizing Xilinx Series7 PCIE HM and GTP IP blocks, along with their low-jitter PLL.
- PCIE Primer by Simon Southwell
Almost all consumer PCIE installations have the RC chip soldered down on the motherboard, typically embodied in the CPU or "North Bridge" ASIC, where PCIE connectors are used solely for the EP cards. Similarly, all FPGA boards on the market are designed for EP applications. As such, they expect clock, reset and a few other signals from the infrastructure. It is only the professional and military-grade electronics that may have both RC and EP functions on add-on cards, with a backplane or mid-plane connecting them (see VPX chassis, or VITA 46.4).
This dev activity is about creating the minimal PCIE infrastructure necessary for using a plethora of ready-made FPGA EP cards as a Root Complex. This infrastructure takes the physical form of a mini backplane that provides the necessary PCIE context, similar to what a typical motherboard would give, but without a soldered-down RC chip that would conflict with our own FPGA RC node.
Such an approach is less work and less risk than designing our own PCIE motherboard with a large FPGA on it. It is also a task we did not fully appreciate from the get-go: somewhat to our surprise, half-way through planning we realized that a suitable, ready-made backplane was not available on the market. That initial disappointment then turned into excitement, knowing that this outcome would make the project even more attractive and valuable for the community... especially once Envox.eu agreed to step in and help. They will take on the PCIE backplane PCB development activity.
- Create requirements document.
- Select components. Schematic and PCB layout design.
- Review and iterate design to ensure robust operation at 5GHz, possibly using openEMS for simulation of high-speed traces.
- Manufacture prototype. Debug and bringup, using AMD-proprietary on-chip IBERT IP core to assess Signal Integrity.
- Produce second batch that includes all improvements. Distribute it, and release design files with full documentation.
- Procure FPGA development boards and PCIE accessories.
- Put together a prototype system. Bring it up using proprietary RTL IP, proprietary SW Driver, TestApp and Vivado toolchain.
- HW development of opensource RTL that mimics the functionality of PCIE RC proprietary solution.
- SW development of an open-source driver for the PCIE RC HW function. This may or may not be done within the Linux framework (see the config-access sketch after this list).
- Design SOC based on RISC-V CPU with PCIE RC as its main peripheral.
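To give a feel for what the RC driver ultimately has to provide, below is a minimal sketch of an ECAM-style configuration-space read from the RISC-V SoC. The CFG_BASE window, the helper names and the bus/device numbers are illustrative assumptions only, not part of the actual design.

```c
#include <stdint.h>

/* Illustrative only: assumes the RC exposes an ECAM-style configuration
 * window at CFG_BASE in the RISC-V SoC's address space.                 */
#define CFG_BASE  0x30000000UL   /* hypothetical ECAM window base */

/* Byte offset of a config register: bus[27:20] dev[19:15] fn[14:12] reg[11:0] */
static inline uintptr_t cfg_addr(uint8_t bus, uint8_t dev, uint8_t fn, uint16_t reg)
{
    return CFG_BASE | ((uintptr_t)bus << 20) | ((uintptr_t)dev << 15)
                    | ((uintptr_t)fn  << 12) | (reg & 0xFFC);
}

/* Read a 32-bit config register of a device behind the RC. */
static inline uint32_t cfg_read32(uint8_t bus, uint8_t dev, uint8_t fn, uint16_t reg)
{
    return *(volatile uint32_t *)cfg_addr(bus, dev, fn, reg);
}

/* Example: read the Vendor/Device ID of the EP at bus 1, device 0, function 0. */
/* uint32_t id = cfg_read32(1, 0, 0, 0x00); */
```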
This dev activity is significantly beefed up compared to our original plan, which was to use a much simpler PCIE EP BFM and a non-SOC sim framework. While that would have reduced the time and effort spent on the sim, NLnet's astute questions prompted a rethink, and we're happy to announce that wyvernSemi is now also onboard!
Their VProc can not only faithfully model the RISC-V CPU and its SW interactions with the HW, it also comes with an implementation of a PCIE RC model. The plan is to first convert it into a comprehensive PCIE EP model, then pair it up in sim with our RC RTL design. Moreover, the existence of both RC and EP models paves the way for future plug-and-play, pick-and-choose open-source sims of the entire PCIE subsystem.
With full end-to-end simulation thus in place, we hope that the need for hardware debugging with ChipScope, expensive test equipment and PCIE protocol analyzers will be alleviated.
- Conversion of existing PCIE RC model to EP model.
- Testbench development and build up. Execution and debug of sim testcases.
- Documentation of the EP model, TB and sim environment, with the objective of making it all simple enough to pick up, adapt and deploy in other projects.
- One-by-one replace proprietary design elements from PART2.b with our opensource versions (except for Vivado and TestApp). Test it along the way, fixing problems as they occur.
- Develop our open-source PIO TestApp software and a representative Demo (see the sketch after this list).
- Build design with openXC7, reporting issues and working with developers to fix them, possibly also trying ScalePNR flow.
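For a flavour of the PIO TestApp, the sketch below maps an EP's BAR0 through the Linux sysfs resource file and performs simple register accesses. The device path and register offsets are placeholders; the real TestApp will target our own driver.

```c
/* Minimal PIO sketch, assuming a Linux host and a mappable BAR0;
 * the sysfs path and register offsets are placeholders only.      */
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void)
{
    /* BAR0 of a hypothetical EP at 0000:01:00.0 */
    int fd = open("/sys/bus/pci/devices/0000:01:00.0/resource0", O_RDWR | O_SYNC);
    if (fd < 0) { perror("open"); return 1; }

    volatile uint32_t *bar0 = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                                   MAP_SHARED, fd, 0);
    if (bar0 == MAP_FAILED) { perror("mmap"); return 1; }

    printf("reg[0x0] = 0x%08x\n", bar0[0]);  /* PIO read                 */
    bar0[1] = 0xCAFEBABE;                    /* PIO write to offset 0x4  */

    munmap((void *)bar0, 4096);
    close(fd);
    return 0;
}
```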
Given that PCIE is an advanced, high-speed design, and given our acute awareness of nextpnr-xilinx and openXC7 shortcomings, we expect to run into showstoppers on the timing-closure front. We therefore hope that the upcoming ScalePNR flow will be ready for heavy-duty testing within this project.
- WIP
- Basic PCIE EP for LiteFury
- Regymm PCIE
- LiteX PCIE EP
- PCIE EP DMA - Wupper
- Xilinx UG477 - 7Series Integrated Block PCIe
- Xilinx DS821 - 7series PCIE Datasheet
- Xapp1052 - BusMaster DMA for EP
The openpcie2-rc top level test bench is based around the pcievhost PCIe 2.0 verification co-simulation IP in order to drive the DUT's PCIe link. This is a C model for generating PCIe traffic, connected to the logic simulation using the VProc virtual processor co-simulation element. VProc allows a user program to be compiled natively on the host machine and 'run' on an instantiated HDL component in a logic simulation, and has a generic memory-mapped master bus for generating read and write transactions. A Bus Functional Model (BFM) wrapper encapsulates a VProc component and effectively memory-maps the PCIe ports into the address space, allowing software to drive and read these ports and interface with the PCIe C model. Although originally designed as a root complex model, the pcievhost component has some endpoint features, enabled with a parameter.
The diagram below is a draft block diagram of the proposed top level test bench.
The DUT PCIe link is connected to the pcievhost, configured as an endpoint, running user code to do link training and any transaction generation required, though it will automatically respond to transactions requiring a completion. A pair of PcieDispLink HDL components (supplied as part of pcievhost) can optionally be connected to the up and down links to display the traffic on the PCIe link; they also do some on-the-fly compliance testing. To drive the DUT's memory-mapped slave bus, a VProc component is used with a BFM wrapper for the specific bus protocol used by the device. A user program can then be run on the virtual processor to access the device's registers etc.
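As an illustration, a user program running on that VProc node might look like the sketch below. It assumes VProc's VUserMain entry point and its VWrite/VRead/VTick calls; the register offsets and node number are placeholders for whatever the BFM wrapper actually maps.

```c
/* Illustrative VProc user program; offsets and node are placeholders. */
#include <stdio.h>
#include "VUser.h"

#define NODE        0          /* VProc node driving the DUT slave bus */
#define CTRL_REG    0x0000     /* hypothetical DUT register offsets    */
#define STATUS_REG  0x0004

void VUserMain0(void)
{
    unsigned int status;

    VTick(10, NODE);                      /* wait a few cycles after reset */
    VWrite(CTRL_REG, 0x1, 0, NODE);       /* e.g. enable the RC core       */

    do {                                  /* poll until the DUT is ready   */
        VRead(STATUS_REG, &status, 0, NODE);
        VTick(1, NODE);
    } while ((status & 0x1) == 0);

    printf("DUT ready, status = 0x%08x\n", status);
}
```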
The software to run on the virtual processor is proposed to be a means of connecting to an external QEMU process via a TCP/IP socket, with a (TBD) protocol to instigate read and write transactions and return data (where applicable). It is envisaged that the client software is driven via the DUT's device driver to communicate with the server software on VProc. This is currently TBD.
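Since the protocol is still TBD, the following is a purely hypothetical sketch of what the VProc-side server loop could look like: a fixed-size request record carrying a read/write flag, address and data, echoed back as the reply. None of these names or layouts are decided yet.

```c
/* Purely hypothetical wire format and server loop; the actual
 * protocol is still TBD and shown only to illustrate the flow. */
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

typedef struct {
    uint8_t  is_write;     /* 1 = write, 0 = read               */
    uint32_t addr;         /* DUT register address              */
    uint32_t data;         /* write data, or read data in reply */
} xact_t;

/* Accept one QEMU client and service its read/write requests,
 * forwarding them to the DUT via caller-supplied access hooks.  */
void serve_qemu(int listen_fd,
                uint32_t (*dut_read)(uint32_t),
                void     (*dut_write)(uint32_t, uint32_t))
{
    int fd = accept(listen_fd, NULL, NULL);
    xact_t req;

    while (fd >= 0 && read(fd, &req, sizeof req) == sizeof req) {
        if (req.is_write)
            dut_write(req.addr, req.data);
        else
            req.data = dut_read(req.addr);
        write(fd, &req, sizeof req);       /* echo back as the reply */
    }
    close(fd);
}
```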
More details of the test bench, the pcievhost component and its usage can be found in the 5.sim/README.md file.
- WIP
- PCIE Utils
- Debug PCIE issues using 'lspci' and 'setpci'
- Using busybox (devmem) for register access
- PCIE Sniffing
- Stark 75T Card
- ngpscope
- PCI Leech
- PCI Leech/ZDMA
- LiteX PCIE Screamer
- LiteX PCIE Analyzer
- Wireshark PCIe Dissector
- PCIe Tool Hunt
- PCIe network simulator
- An interesting PCIE tidbit: Peer-to-Peer communication. Also see this
- NetTLP - An invasive method for intercepting PCIE TLPs
We are grateful to NLnet Foundation for their sponsorship of this development activity.
wyvernSemi's wisdom and contribution made a great deal of difference -- thank you, we are honored to have you on the project.
Envox, our next-door buddy, is responsible for the birth of our backplane, which we like to call BB (not to be mistaken for their gorgeous blue beauty, the BB3 🙂).