The Shift Toward Hardware-Accurate Game Preservation

Field-programmable gate arrays (FPGAs) have quietly altered the landscape of retro gaming and hardware design. Unlike a CPU that fetches instructions from memory or a GPU that runs shader programs across hundreds or thousands of cores, an FPGA lets engineers build custom digital logic circuits directly on the chip. This means a single FPGA can reimplement a classic console’s central processor, video display controller, and sound chip, all running in hardware and, when the core is faithful to the original design, with cycle-accurate timing. The result is very low latency, pixel-accurate graphics, and original audio timings that many software emulators only approximate.

The gaming world first took notice when projects like the Analogue Nt Mini demonstrated that an FPGA could play original NES cartridges with a high degree of authenticity on modern displays. Since then, the technology has expanded into handhelds, open-source platforms, and even experimental modern consoles. Understanding the underlying principles and the engineering effort behind these systems reveals why FPGA acceleration is not just a gimmick but a significant shift in how we build and preserve gaming hardware. Tool-assisted speedruns (TAS) have long shown that software emulation can introduce timing variances, since runs recorded in an emulator do not always play back correctly on a real console; accurate FPGA cores aim to eliminate these discrepancies, offering a hardware reference that can be verified against the original silicon.

How FPGAs Work at the Gate Level

An FPGA consists of a grid of configurable logic blocks (CLBs) surrounded by programmable interconnect wiring. Each CLB contains a look-up table (LUT) that can implement any small Boolean function, a flip-flop for storing state, and a multiplexer for routing signals. By loading a bitstream—a binary configuration file—developers define how these blocks are wired together to form complex digital circuits. The same FPGA can later be reprogrammed to become an entirely different system simply by loading a new bitstream.
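To make the look-up table idea concrete, here is a minimal Python sketch of a 4-input LUT. It is a software model only, not vendor code: the names make_lut4 and lut4_read are hypothetical, and a real LUT is a small block of configuration memory rather than an integer. The point it illustrates is that "programming" the LUT just means storing a 16-bit truth table, and evaluating it is a single indexed read.

```python
def make_lut4(func):
    """Build the 16-bit truth table (the "configuration") for a 4-input function."""
    table = 0
    for index in range(16):
        # Unpack the four input bits encoded in the table index.
        a, b, c, d = [(index >> bit) & 1 for bit in range(4)]
        if func(a, b, c, d):
            table |= 1 << index
    return table


def lut4_read(table, a, b, c, d):
    """Evaluate the LUT: the inputs simply select one stored bit."""
    index = a | (b << 1) | (c << 2) | (d << 3)
    return (table >> index) & 1


# Two different configurations for the same physical structure.
XOR4 = make_lut4(lambda a, b, c, d: a ^ b ^ c ^ d)
MAJORITY_ABC = make_lut4(lambda a, b, c, d: (a + b + c) >= 2)  # input d is ignored

if __name__ == "__main__":
    print(f"XOR4 table:         0x{XOR4:04X}")
    print(f"MAJORITY_ABC table: 0x{MAJORITY_ABC:04X}")
    print("XOR4(1, 0, 1, 1) =", lut4_read(XOR4, 1, 0, 1, 1))
```

Loading a different table into the same structure yields a different gate, which is the same principle a bitstream applies across the whole chip.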

Hardware description languages (HDLs) like Verilog or VHDL are used to describe the desired circuit behavior. For example, to create a 6502 processor, you write RTL code that specifies the instruction decode, ALU operations, and register file. The synthesis tool maps this code onto the FPGA’s LUTs and flip-flops. Static timing analysis then checks that every signal path meets the target clock frequency. Unlike software emulation, which models the hardware on top of a host CPU and operating system, the FPGA is configured to behave as the circuit itself. This removes the emulation overhead and makes behavior deterministic from one clock cycle to the next.

Modern FPGAs from AMD Xilinx and Intel also embed hardened blocks such as DSP slices for fast multiply-accumulate, block RAM (BRAM) for on-chip storage, phase-locked loops (PLLs) for clock generation, and high-speed transceivers for HDMI or USB. These hardened blocks accelerate common tasks without consuming general-purpose logic, making it feasible to implement entire system-on-chip designs on a single FPGA die. For instance, the DSP48E2 slices in Xilinx UltraScale devices can carry out the multiply-accumulate arithmetic behind video scaling and color space conversion while using very little of the general-purpose logic fabric.
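The color space conversion mentioned above reduces to a handful of multiply-accumulate steps, which is why it maps well onto DSP slices. The following Python sketch models that arithmetic in fixed point. It assumes full-range BT.601 coefficients scaled by 256 and 8-bit samples; the function names are hypothetical and it does not model any specific DSP slice.

```python
# BT.601 full-range coefficients scaled by 256 (8 fractional bits).
Y_COEFFS = (77, 150, 29)      # 0.299, 0.587, 0.114
CB_COEFFS = (-43, -85, 128)   # -0.168736, -0.331264, 0.5
CR_COEFFS = (128, -107, -21)  # 0.5, -0.418688, -0.081312
ROUND = 128                   # adds 0.5 before the shift, for rounding


def mac(acc, coeff, sample):
    """One multiply-accumulate step, the basic operation of a DSP slice."""
    return acc + coeff * sample


def clamp8(value):
    """Saturate to the 8-bit output range."""
    return max(0, min(255, value))


def weighted_sum(coeffs, rgb):
    """Chain three MAC steps, then drop the 8 fractional bits."""
    acc = ROUND
    for coeff, sample in zip(coeffs, rgb):
        acc = mac(acc, coeff, sample)
    return acc >> 8  # arithmetic shift, also correct for negative sums


def rgb_to_ycbcr(r, g, b):
    rgb = (r, g, b)
    y = clamp8(weighted_sum(Y_COEFFS, rgb))
    cb = clamp8(weighted_sum(CB_COEFFS, rgb) + 128)
    cr = clamp8(weighted_sum(CR_COEFFS, rgb) + 128)
    return y, cb, cr


if __name__ == "__main__":
    for pixel in [(255, 255, 255), (0, 0, 0), (0, 0, 255)]:
        print(pixel, "->", rgb_to_ycbcr(*pixel))
```

In hardware the three weighted sums would run in parallel, one pixel per clock, rather than one after another as in this model.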

The Technical Advantages for Gaming Hardware

Building a console around an FPGA brings concrete benefits that go beyond nostalgia. These advantages explain why both commercial products and open-source communities have embraced the platform so enthusiastically.

Cycle-Accurate Authenticity

Software emulators reproduce a vintage CPU on a host CPU by interpreting each instruction or, for later systems, by translating code with dynamic recompilation. Unless the emulator is written for cycle accuracy, which is computationally expensive, this introduces timing inaccuracies that can break games relying on specific scanline timings or co-processor synchronization. A faithful FPGA core runs the original logic as the silicon would: the same number of clock cycles per instruction, the same DMA bus contention, the same audio sample boundaries. For preservation purposes, this is one of the most direct ways to ensure that a game behaves as it did on the original hardware, provided the core itself has been verified against that hardware. This accuracy matters for systems like the Super Nintendo, where cartridge co-processors such as the Super FX and SA-1 push the limits of what can be emulated in software without detectable glitches.
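A small model shows why timing granularity matters. The Python sketch below compares where a CPU write lands on the video chip's timeline under two schemes: lockstep stepping, as in hardware, and a deliberately simplified emulator that runs a whole instruction before letting the video chip catch up. The 3:1 dot-to-cycle ratio is the NTSC NES ratio; the function names and the simplified emulator are illustrative assumptions, not a description of any particular emulator.

```python
PPU_DOTS_PER_CPU_CYCLE = 3  # NTSC NES: CPU = master clock / 12, PPU = master clock / 4


def write_dot_cycle_stepped(start_dot, write_cycle):
    """PPU dot at which a write lands when CPU and PPU advance in lockstep.

    write_cycle is the zero-based CPU cycle of the instruction on which
    the write occurs, for example 3 for a 4-cycle store that writes on
    its final cycle.
    """
    dot = start_dot
    for _ in range(write_cycle):
        dot += PPU_DOTS_PER_CPU_CYCLE  # the PPU keeps running during every CPU cycle
    return dot


def write_dot_instruction_stepped(start_dot, write_cycle):
    """Simplified emulator: the whole instruction executes first and the PPU
    catches up afterwards, so the write is applied at the dot where the
    instruction began, regardless of write_cycle."""
    return start_dot


def timing_error_dots(start_dot, write_cycle):
    """How many dots early the simplified emulator applies the write."""
    return (write_dot_cycle_stepped(start_dot, write_cycle)
            - write_dot_instruction_stepped(start_dot, write_cycle))


if __name__ == "__main__":
    print("lockstep write dot:   ", write_dot_cycle_stepped(100, 3))
    print("instruction-step dot: ", write_dot_instruction_stepped(100, 3))
    print("error in dots:        ", timing_error_dots(100, 3))
```

A difference of a few dots is enough to shift a mid-scanline effect by several pixels, which is the kind of discrepancy lockstep hardware avoids by construction.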

Deterministic Low Latency

Input lag accumulates through USB polling, buffer queuing, and operating system scheduling. FPGA-based consoles can bypass most of this. Controller input can be sampled directly by the hardware core, and video pixels are generated by a fixed pipeline that outputs to the display at the exact pixel clock. Many designs can also output native analog video to CRTs, achieving response times that can fall under one millisecond. This matters for competitive retro gaming and for modern implementations where real-time interaction is important, such as rhythm games or quick-time events. The RetroTINK 4K scaler, for example, uses an FPGA to achieve very low latency 4K upscaling that is difficult to match in software.
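The way these delays add up can be sketched as a simple worst-case budget. The Python function below is an illustration only: the parameter names are hypothetical, the example numbers are assumptions rather than measurements of any product, and real systems have further contributors such as display processing.

```python
def worst_case_input_latency_ms(poll_hz, buffered_frames, refresh_hz, os_jitter_ms=0.0):
    """Rough worst-case delay from button press to the frame that shows it.

    poll_hz         -- how often the controller is sampled
    buffered_frames -- whole frames queued between rendering and display
    refresh_hz      -- display refresh rate
    os_jitter_ms    -- extra scheduling delay (assumed, system dependent)
    """
    poll_wait_ms = 1000.0 / poll_hz   # a press just after a poll waits a full interval
    frame_ms = 1000.0 / refresh_hz    # each queued frame adds one refresh period
    return poll_wait_ms + buffered_frames * frame_ms + os_jitter_ms


if __name__ == "__main__":
    # Assumed software setup: 125 Hz USB polling, two buffered frames, 60 Hz display.
    print(f"{worst_case_input_latency_ms(125, 2, 60):.2f} ms")
    # Assumed direct-sampled setup: input read once per frame, no frame buffering.
    print(f"{worst_case_input_latency_ms(60, 0, 60):.2f} ms")
```

The model makes the structural point: every buffered frame costs a full refresh period, so removing frame buffers matters more than shaving the polling interval.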

Inherent Parallel Execution

An FPGA inherently executes its entire logic in parallel. Every module (CPU, GPU, audio processor, DMA controller) runs simultaneously on dedicated silicon resources. This is analogous to having separate physical chips for each subsystem, communicating through dedicated wires. In a software emulator, the host CPU must time-slice and prioritize these tasks, which adds overhead and synchronization work. An FPGA largely avoids this bottleneck. The parallel nature also simplifies the integration of complex co-processors. Adding a Yamaha YM2612 sound chip to an FPGA-based Sega Genesis core is largely a matter of instantiating the hardware module and wiring it into the bus architecture.
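The difference between parallel hardware and sequential software shows up even in a two-register example. The Python sketch below models a clock edge the way hardware behaves, with every register computing its next value from the old state before any of them changes, and contrasts it with a naive in-order update. The register names are hypothetical.

```python
def clock_edge(state):
    """Hardware-style update: all next values are computed from the OLD state,
    then committed together, as flip-flops do on a clock edge."""
    return {
        "a": state["b"],                           # a takes b's old value
        "b": state["a"],                           # b takes a's old value
        "counter": (state["counter"] + 1) & 0xFF,  # an unrelated 8-bit counter
    }


def sequential_update(state):
    """Naive software-style update: statements run one after another,
    so later statements see values already overwritten."""
    s = dict(state)
    s["a"] = s["b"]
    s["b"] = s["a"]  # a was already overwritten, so this copies b's old value back
    s["counter"] = (s["counter"] + 1) & 0xFF
    return s


if __name__ == "__main__":
    start = {"a": 1, "b": 2, "counter": 255}
    print("parallel:  ", clock_edge(start))
    print("sequential:", sequential_update(start))
```

The parallel version swaps the two registers while the sequential one loses a value. A software emulator has to reproduce the first behavior deliberately, whereas in an FPGA it is simply how the flip-flops work.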

Energy-Efficient Performance

Recreating a 16-bit console in software on a desktop PC might draw 200 watts. An FPGA implementing the same system can often run on 10 watts or less. The reason is that the FPGA spends power mainly on the logic the design actually uses. There is no operating system overhead, no idle shader pipeline, and no memory hierarchy thrashing. For handheld consoles, this efficiency translates into noticeably longer battery life. Even for modern acceleration tasks like video upscaling or AI inference, an FPGA can deliver higher operations per watt than a general-purpose GPU for targeted workloads.

Field-Upgradeable Hardware

Because the FPGA configuration is typically stored in external flash or on removable storage, updating the console is as easy as loading a new bitstream. This allows developers to fix hardware bugs, add new features, or even support entirely new console systems without replacing a single component. The Analogue Pocket uses this capability to support an openFPGA ecosystem where third-party developers can create and distribute their own cores. The device that started as a Game Boy clone can now run cores for dozens of systems, all through firmware updates and downloadable cores. This upgradability can extend the useful lifespan of the hardware far beyond that of fixed-function ASICs.

The Evolution from Niche to Mainstream

FPGA gaming began with hobbyists. Engineer Kevin “Kevtris” Horton designed a single-FPGA implementation that could replace the entire NES motherboard, outputting 1080p video with no perceptible lag. His work caught the attention of Analogue, which commercialized the concept with the Nt Mini. Analogue later released the Super Nt and Mega Sg, proving there was a market for premium FPGA consoles. These devices sold out quickly, showing that enthusiasts were willing to pay more for accuracy and build quality.

Around the same time, the open-source MiSTer project emerged as a community-driven alternative. Using a Terasic DE10-Nano board (Cyclone V FPGA), volunteers developed cores for classic computers (Amiga, C64, MSX), arcade boards, and consoles up to the PlayStation era. The MiSTer platform now supports over 150 different systems, with new cores appearing regularly. Its modular design allows users to add RAM modules, analog I/O boards, and USB hubs. The project’s success has inspired similar efforts on boards built around cheaper FPGAs such as the Lattice ECP5, and has even led to commercial spin-offs like the MARS FPGA and Pocket MiSTer. The community has navigated the legal challenges of distributing BIOS files by requiring users to dump their own hardware, while distributing the hardware logic freely. Core developers like Jotego have built successful Patreon-backed businesses by releasing arcade cores (CPS1, CPS2) that demand extensive reverse-engineering effort.

Designing a Modern FPGA Console: From Concept to Silicon

Creating a complete FPGA-based console requires expertise in digital design, PCB layout, and system integration. The process can be broken into distinct stages, each with its own challenges.

Choosing the Right FPGA and Board

The selection of the FPGA device dictates what systems can be implemented. For 8-bit and 16-bit consoles, a mid-range FPGA with 50,000–100,000 logic elements is sufficient. Early 3D systems like the PlayStation or Sega Saturn are far more demanding: the MiSTer cores for them fit the roughly 110,000 logic elements of the DE10-Nano’s Cyclone V only with careful optimization, and a device with 200,000 or more LEs, plus ample block RAM and DSP slices, leaves more comfortable headroom. The Intel Cyclone V (used in MiSTer) and Xilinx Artix-7 are popular choices. For cutting-edge projects, higher-end devices like the Xilinx Kintex or Versal provide the resources for experiments such as real-time ray tracing or neural network accelerators. The board must also include high-speed memory (DDR3/DDR4), video output interfaces (HDMI, VGA, analog), audio codecs, and controller ports. Power sequencing is critical: modern FPGAs require multiple voltage rails (core, transceiver, I/O banks) that must ramp up in a specific order to avoid damage and latch-up conditions.

Writing the Core in Verilog or VHDL

The heart of the console is the hardware description that defines each component. Developers often start by reverse-engineering original chip schematics or using logic analyzers to capture signals from known-working hardware. For widely supported systems, existing open-source cores (like those in the MiSTer repository) provide a starting point. The code is structured into modules: CPU core, video processor, audio generator, memory controller, and bus arbiter. Each module is simulated with testbenches to verify functional correctness. Tools like ModelSim or Verilator are used for simulation, and Xilinx Vivado or Intel Quartus for synthesis and place-and-route. Timing closure, which means ensuring that all paths meet the target clock frequency, often requires multiple iterations and careful constraint management. Developers use static timing analysis (STA) to validate that setup and hold times are met across process, voltage, and temperature (PVT) corners.
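The setup check at the center of timing closure is simple arithmetic. The Python sketch below computes setup slack for one register-to-register path. It is a simplified model that ignores clock skew and uncertainty, and the delay values in the example are made-up illustrations rather than figures from any device datasheet.

```python
def path_delay_ns(clk_to_q_ns, logic_ns, routing_ns, setup_ns):
    """Total time a signal needs between two flip-flops on one path."""
    return clk_to_q_ns + logic_ns + routing_ns + setup_ns


def setup_slack_ns(clock_mhz, clk_to_q_ns, logic_ns, routing_ns, setup_ns):
    """Positive slack means the path meets timing; negative means it fails."""
    period_ns = 1000.0 / clock_mhz
    return period_ns - path_delay_ns(clk_to_q_ns, logic_ns, routing_ns, setup_ns)


def max_clock_mhz(clk_to_q_ns, logic_ns, routing_ns, setup_ns):
    """Highest clock frequency this single path could support."""
    return 1000.0 / path_delay_ns(clk_to_q_ns, logic_ns, routing_ns, setup_ns)


if __name__ == "__main__":
    # Hypothetical path: 0.5 ns clock-to-Q, 4.0 ns logic, 3.0 ns routing, 0.5 ns setup.
    path = dict(clk_to_q_ns=0.5, logic_ns=4.0, routing_ns=3.0, setup_ns=0.5)
    print("slack at 100 MHz:", setup_slack_ns(100.0, **path), "ns")
    print("slack at 150 MHz:", round(setup_slack_ns(150.0, **path), 3), "ns")
    print("max clock:       ", max_clock_mhz(**path), "MHz")
```

A timing tool performs this check for every path in the design at every PVT corner, and the worst path sets the achievable clock frequency, which is why a single long combinational chain can force a redesign.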

Verification and Debugging

Debugging an FPGA core is significantly harder than debugging software. You cannot easily set breakpoints or inspect variable values mid-cycle. Developers rely heavily on integrated logic analyzers, such as Vivado’s ILA or Quartus’s Signal Tap, which use spare block RAM to buffer internal signals and read them out over JTAG. Formal verification tools are also gaining traction; they can mathematically prove that specified properties hold, or that an implementation is equivalent to a golden reference model. For console cores, running actual game ROMs through simulation is the ultimate test. This "batch simulation" approach can execute millions of clock cycles overnight to catch elusive bugs. Core developers also commonly validate new releases against test ROMs that are known to exercise tricky hardware behavior.
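The capture mechanism of an integrated logic analyzer can be modeled as a ring buffer plus a trigger. The Python sketch below is a behavioral illustration only: the class and parameter names are hypothetical and it does not correspond to the interface of any vendor tool. It keeps the most recent samples until the trigger fires, then records until the buffer is full.

```python
from collections import deque


class LogicAnalyzer:
    """Behavioral model of trigger-based capture into a fixed-depth buffer."""

    def __init__(self, depth, pre_trigger, trigger):
        if not 0 <= pre_trigger < depth:
            raise ValueError("pre_trigger must be in the range 0..depth-1")
        self.depth = depth                      # total samples stored (the BRAM size)
        self.trigger = trigger                  # predicate evaluated on each sample
        self._history = deque(maxlen=pre_trigger)  # ring buffer of recent samples
        self.capture = None                     # becomes a list once triggered

    def sample(self, value):
        """Call once per clock cycle with the probed signal value."""
        if self.capture is None:
            if self.trigger(value):
                # Freeze the pre-trigger history and start recording.
                self.capture = list(self._history) + [value]
            else:
                self._history.append(value)
        elif len(self.capture) < self.depth:
            self.capture.append(value)

    @property
    def done(self):
        return self.capture is not None and len(self.capture) == self.depth


if __name__ == "__main__":
    # Probe a counter, trigger when it reaches 10, keep 2 samples of history.
    la = LogicAnalyzer(depth=6, pre_trigger=2, trigger=lambda v: v == 10)
    for cycle in range(20):
        la.sample(cycle)
    print(la.capture, la.done)
```

The fixed depth is the practical constraint: because on-chip memory is limited, choosing a precise trigger condition matters more than it does with a software debugger.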

PCB Design and Signal Integrity

The physical board must handle high-speed signals for DDR memory, HDMI (1.65 Gbps or more per TMDS lane), and USB 2.0/3.0. Controlled impedance traces (typically 50 ohms single-ended, 90 ohms differential for USB, 100 ohms differential for HDMI) are essential, as are decoupling capacitors near every power pin. For retro authenticity, many designers include original controller ports with level shifters to handle 5V logic. Analog video outputs (RGB, S-Video, composite) require external DACs and output filters. Audio may be output via I2S to HDMI or through a separate DAC. Ground plane isolation between digital and analog sections prevents noise from degrading video quality. Thermal management is also important; a large FPGA can dissipate several watts, requiring heatsinks and sometimes active cooling. The power delivery network (PDN) must be simulated to ensure that transient currents do not cause voltage droop severe enough to produce logic faults.
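The HDMI figure follows directly from the video timing. TMDS encodes each 8-bit color value as a 10-bit symbol, so each of the three data lanes carries ten bits per pixel clock. The Python sketch below works through that arithmetic for standard 24-bit color; the function names are hypothetical, and the timings used are the standard total frame sizes (including blanking) for 720p60 and 1080p60.

```python
TMDS_BITS_PER_PIXEL_PER_LANE = 10  # 8-bit value encoded as a 10-bit TMDS symbol


def pixel_clock_hz(h_total, v_total, refresh_hz):
    """Pixel clock from total frame dimensions, blanking included."""
    return h_total * v_total * refresh_hz


def tmds_bits_per_second_per_lane(pixel_clock):
    """Serial bit rate on one TMDS data lane for 24-bit color."""
    return pixel_clock * TMDS_BITS_PER_PIXEL_PER_LANE


if __name__ == "__main__":
    modes = {
        "720p60": (1650, 750, 60),
        "1080p60": (2200, 1125, 60),
    }
    for name, (h_total, v_total, refresh) in modes.items():
        clock = pixel_clock_hz(h_total, v_total, refresh)
        rate = tmds_bits_per_second_per_lane(clock)
        print(f"{name}: pixel clock {clock / 1e6:.2f} MHz, {rate / 1e9:.3f} Gbps per lane")
```

A 165 MHz pixel clock gives the 1.65 Gbps per lane quoted above, and 1080p60 sits just below it, which is why even modest FPGA video outputs need controlled-impedance differential routing.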

Software Integration and User Experience

Once the hardware core is proven, a simple operating system or bootloader is needed. On platforms that pair the FPGA with an ARM processor, such as MiSTer, this is often a minimal Linux system that loads the FPGA bitstream and presents a menu for selecting ROMs. The user interface must handle file systems (FAT32, ext4), display scaling, and controller mapping. Some consoles also support save states, cheats, and screen filters. For original game development, developers can use a soft-core processor, for example a RISC-V core, running bare-metal code and offloading graphics and audio to custom FPGA accelerators. High-level synthesis (HLS) tools allow writing parts of the design in C++ and converting them to hardware, accelerating development for programmers unfamiliar with HDLs. The MiSTer ecosystem uses a "Main" binary that handles USB input, OSD menus, and file system access, while the core handles only the hardware emulation.

Real-World Applications and Future Potential

Retro gaming preservation remains the killer application. Devices like the Analogue Pocket and MiSTer allow players to enjoy classic libraries without relying on emulators that may have timing bugs or input lag. The openFPGA initiative makes it possible for the community to port cores across platforms, ensuring that even obscure systems can be experienced authentically. The recent explosion of arcade core development means entire game boards are being preserved in hardware description language form, protecting them from physical decay.

Beyond retro, FPGAs have potential uses in modern game hardware. In principle they can accelerate compute-intensive tasks such as physics simulations, ray tracing, and image upscaling (the kind of work that FSR or DLSS performs on GPUs). FPGAs are also a long-established prototyping platform for custom silicon, and could serve as flexible I/O processors alongside a console’s main chip. In virtual reality, FPGAs can handle sensor fusion and asynchronous timewarp with deterministic latency, which may help reduce motion sickness. The ability to update hardware blocks such as custom codecs after a console has shipped would give OEMs unusual flexibility.

Looking ahead, chiplet architectures may allow a main CPU/GPU die to be paired with a small FPGA tile on the same interposer. This would give developers the ability to reconfigure a portion of the console’s hardware after release, enabling features like custom codecs, security accelerators, or even retro compatibility patches. The open-source tools Yosys and nextpnr are making FPGA development more accessible, reducing reliance on expensive proprietary tools. For example, the Lattice ECP5 family now has solid open-source support through Project Trellis, enabling inexpensive boards like the OrangeCrab to run custom cores. As costs continue to drop and software tools improve, FPGA-accelerated gaming could become a standard part of the console landscape, bridging the gap between fixed-function hardware and the ever-evolving demands of game developers.

Overcoming the Barriers to Adoption

Despite its advantages, FPGA development remains challenging. HDL coding requires a different mindset from software programming; race conditions and timing issues are difficult to debug. Proprietary tools like Vivado and Quartus are large and often require expensive licenses for high-end devices. Open-source alternatives are still maturing and may not support all FPGA families. The cost of a capable FPGA board can be significant (the DE10-Nano costs around $200), and adding RAM, video output, and controllers pushes the price higher. A fully equipped MiSTer setup can exceed $500, putting it out of reach for many casual players. Mass production would favor an ASIC, but at the volumes of the niche market for premium retro hardware, the FPGA approach is easier to justify. Verifying timing and latency on real hardware can also call for oscilloscopes and logic analyzers, which raises the barrier to entry for individual developers.

However, the community has made great strides. Educational resources like Nandland provide accessible tutorials for beginner HDL designers. Websites like the MiSTer Wiki offer extensive documentation and pre-built cores. Open-source toolchains like Yosys and nextpnr have matured enough to handle complex designs on Lattice devices, freeing developers from vendor lock-in. As more developers become familiar with digital design, the pool of talent grows. The next few years may see FPGA consoles that not only preserve the past but also enable new forms of interactive entertainment that are impractical with fixed hardware, such as custom geometry pipelines or real-time procedural generation engines implemented entirely in logic.