How to migrate a VeriFire Emulator design from F1 to F2 Instances

This post was contributed by Irfan Waheed (SilverLining EDA), Babar Sohail (SilverLining EDA), and Mark Azadpour (AWS).

Artificial Intelligence (AI) is set to change the way we live, and the engines behind this transformation are large, complex Application-Specific Integrated Circuits (ASICs) produced by semiconductor vendors. Because of this complexity, design and verification cycles take years to complete. The verification process involves running simulations that are very compute intensive and require HPC infrastructure. To accelerate these simulations, teams deploy accelerators such as Field Programmable Gate Arrays (FPGAs). These hardware-assisted simulations are called emulations. AWS has offered FPGA-accelerated Amazon Elastic Compute Cloud (Amazon EC2) instances since 2017, when it introduced F1 instances. In late 2024, AWS introduced F2 instances, which can deliver up to 60% better price performance than first-generation F1 instances.

SilverLining EDA’s VeriFire emulator takes advantage of the FPGAs in F-instances to offer cloud-based emulation. VeriFire Verification-as-a-Service (VaaS) combines the best capabilities of hardware simulation (such as debug, breakpoints, and single stepping) with the speed of hardware (typically a 10x speedup over simulation), at scale. As a result, you can verify a design with a suite of tests managed by an orchestrator such as AWS Batch, running thousands of jobs in an AWS environment.

VeriFire is the semiconductor industry’s first Verification-as-a-Service solution. VeriFire is deployed entirely on the AWS cloud and accelerates verification by reducing testbench development time, increasing simulation speed and standardizing debugging. VeriFire is available via the AWS Partner Network (APN).

With the introduction of F2, FPGA build times have improved drastically compared to F1: a build that took 2.5 hours on F1 completes in 30 minutes on F2. To take advantage of these improvements, VeriFire was ported from F1 to F2 instances. In this post, we’ll cover this journey: how to launch F2 instances, how to transition a design from F1 to F2, and how to run VeriFire on the ported F2 design.

F2 instance Provisioning

F2 instances are easy to program and come with everything you need to develop, simulate, debug, and compile your hardware acceleration code, including an FPGA Developer AMI. They provide development environments for low-level hardware development and for software development in C/C++ and OpenCL (available on our GitHub).

The first step in F2 instance provisioning is to select one of the three F2 instance types available. The VU47P FPGAs on F2 instances are a significant upgrade over the VU9P FPGAs on F1: the number of LUTs is 10% higher and the number of DSPs is 32% higher, which allows more complex designs to fit on a single FPGA. The instance type you choose will depend on the size and complexity of your application. For F1 users, the mapping is straightforward: choose the F2 instance that has the same number of FPGAs as your F1 instance.
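The FPGA-count mapping described above can be written down as a small lookup. This is a minimal sketch: the instance size names and FPGA counts come from the public EC2 instance listings rather than from this post, so treat them as assumptions and confirm them for your Region.

```python
# FPGA counts per instance size. These names and counts are assumptions taken
# from the public EC2 instance listings, not from this post; verify before use.
F1_FPGA_COUNT = {"f1.2xlarge": 1, "f1.4xlarge": 2, "f1.16xlarge": 8}
F2_FPGA_COUNT = {"f2.6xlarge": 1, "f2.12xlarge": 2, "f2.48xlarge": 8}


def f2_equivalent(f1_instance_type: str) -> str:
    """Return the F2 size with the same number of FPGAs as the given F1 size."""
    fpgas = F1_FPGA_COUNT[f1_instance_type]
    for f2_type, count in F2_FPGA_COUNT.items():
        if count == fpgas:
            return f2_type
    raise ValueError(f"no F2 size with {fpgas} FPGA(s)")


if __name__ == "__main__":
    for f1_type in F1_FPGA_COUNT:
        print(f1_type, "->", f2_equivalent(f1_type))
```

An unknown F1 size raises a KeyError, which is deliberate: it is safer to fail than to guess a size.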

The actual steps to provision an F2 instance are no different from those for any other EC2 instance. You need two pieces of information to provision an instance:

The instance name (1 of the 3 types).
The AMI number. As an example, we used ami-0e6383ac30e23cf97.

The following screenshot shows the menus for selecting the AMI and the instance type:

Figure 1 – The AWS Management Console EC2 instance launch dialog showcasing the FPGA Developer AMI and F2 instance type.
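The same launch can also be scripted. The sketch below uses the boto3 `run_instances` call with the two pieces of information listed above. The key pair name and Region are hypothetical placeholders, and AMI IDs are Region specific, so the AMI ID shown in this post may not exist in your Region.

```python
def build_run_instances_args(instance_type: str, ami_id: str, key_name: str) -> dict:
    """Assemble keyword arguments for the EC2 RunInstances call."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "KeyName": key_name,
        "MinCount": 1,
        "MaxCount": 1,
    }


def launch(args: dict, region: str) -> str:
    """Launch the instance and return its ID (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the argument builder works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(**args)
    return response["Instances"][0]["InstanceId"]


if __name__ == "__main__":
    # "my-key-pair" is a hypothetical key pair name; replace it with your own.
    args = build_run_instances_args("f2.6xlarge", "ami-0e6383ac30e23cf97", "my-key-pair")
    print(args)  # call launch(args, "<your-region>") to actually create the instance
```

Keeping the argument builder separate from the API call makes it easy to review what will be launched before any charges are incurred.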

Once you have provisioned an instance, you can connect to it in a variety of ways, including:

EC2 instance connect
Session Manager
SSH
EC2 Serial Console

Now that the F2 instance is provisioned, the next step is to transition an F1 design to F2.

Transitioning from F1 to F2

SilverLining EDA launched VeriFire in August 2024 and originally developed it for F1 instances. SilverLining EDA worked closely with the F2 development team to port VeriFire to F2. We have successfully ported our infrastructure to F2, and VeriFire is now available for use on F2 instances.

The Hardware Development Kit (HDK) and Software Development Kit (SDK) for F2 are fairly similar to those for F1. The HDK design flow enables developers to create RTL-based accelerator designs for F2 instances using Xilinx Vivado. The SDK contains the Amazon FPGA Image (AFI) Management Tools, including both the source code for the tools and a detailed description of the commands to use on an FPGA instance. The SDK is not used to build or register AFIs; it is used only for managing and deploying pre-built AFIs.

VeriFire uses the utility functions provided by the SDK to communicate with the Custom Logic (CL). For a VeriFire based CL, this includes both the Device Under Test (DUT) and VeriFire Engine.

With Amazon EC2 FPGA instances, each FPGA is divided into two partitions:

Shell (SH) – AWS platform logic implementing the FPGA external peripherals, PCIe, DMA, and Interrupts.
Custom Logic (CL) – Custom acceleration logic created by FPGA Developers.

The shell creates a clean, well-defined top-level portlist for the CL. Currently, support is available for a “small” shell. The Small Shell offers 88% usable FPGA resources for the CL. It does not include a built-in Direct Memory Access (DMA) engine.

Documentation indicates that support for another shell, known as the XDMA shell, is in the works. It will allow data to be written to and read from the CL via the sh_cl_pcis_dma bus.

At the end of the development process, combining the Shell and CL creates an Amazon FPGA Image (AFI) that can be loaded onto the Amazon EC2 FPGA Instances.

There are a few key differences between F1 and F2 as detailed in the next paragraphs.

The main difference is that the F2 Shell offers only two clocks to the Custom Logic: clk_main_a0 and clk_hbm_ref. This differs from the F1 Shell, which offers a total of 8 clocks to the CL, as described in the F1 Shell Interface Spec. Offering fewer clocks from the Shell to the CL is beneficial because it does not lock up routing resources for customers who do not require all the clocks from the Shell.

If you were only using clk_main_a0 and clock recipe A1 on your F1 design, then your transition to F2 will be fairly straightforward. All you need to do is create a new CL using CL_TEMPLATE and then instantiate your design modules from F1 in this CL_TEMPLATE and wire them to the new SHELL.

In the context of F1, a clock recipe determines the frequency of each of the clocks in clock groups A, B, and C. This is done by providing the following options to the AWS build script:

-clock_recipe_a [A0,A1,A2] -clock_recipe_b [B0,B1,B2,B3,B4,B5] -clock_recipe_c [C0,C1,C2]

The clock speeds available to the CL will depend on the clock recipe picked for each clock group. If you are using any other clock recipe in F1 besides clock recipe A1, you will need to incorporate AWS_CLK_GEN IP into your CL in order to port your design to F2.
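As an illustration of the rule above, the following sketch validates a recipe choice against the names quoted in the build options and reports whether AWS_CLK_GEN is needed. The function names are hypothetical helpers, not part of the AWS HDK, and the decision rule is only the one stated in this post.

```python
# Recipe names as listed in the F1 build options quoted above.
VALID_RECIPES = {
    "a": ["A0", "A1", "A2"],
    "b": ["B0", "B1", "B2", "B3", "B4", "B5"],
    "c": ["C0", "C1", "C2"],
}


def clock_recipe_options(a: str, b: str, c: str) -> str:
    """Validate the three recipe names and format them as build-script options."""
    chosen = {"a": a, "b": b, "c": c}
    for group, recipe in chosen.items():
        if recipe not in VALID_RECIPES[group]:
            raise ValueError(f"unknown recipe {recipe!r} for clock group {group.upper()}")
    return " ".join(f"-clock_recipe_{g} {r}" for g, r in chosen.items())


def needs_aws_clk_gen(recipe_a: str, uses_only_clk_main_a0: bool) -> bool:
    """Rule from the text: only clk_main_a0 with recipe A1 ports without AWS_CLK_GEN."""
    return not (uses_only_clk_main_a0 and recipe_a == "A1")


if __name__ == "__main__":
    print(clock_recipe_options("A1", "B0", "C0"))
    print("AWS_CLK_GEN needed:", needs_aws_clk_gen("A0", True))
```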

The steps needed to integrate AWS_CLK_GEN into your Custom Logic are described in full detail in the AWS F2 documentation.

Once the AWS_CLK_GEN IP is integrated into your design, the build options are the same as specified above for F1.

Let’s illustrate this with a specific example: a DUT that used two clocks, with frequencies of 62.5 MHz and 125 MHz, on F1. Neither of these clock frequencies is available on F2 by default. While transitioning to F2, we integrated the AWS_CLK_GEN IP into our VeriFire environment as described above, which created the clocks we needed to build our design successfully.

The other issue you might face is a bit more subtle. The signals coming into the Custom Logic from the Shell are synchronized with clk_main_a0. When using F1, the frequency of clk_main_a0 was variable depending on the recipe being used and the data was always synchronized to this clock.

When using F2, clk_main_a0 is always 250 MHz. If you require your data to be synchronized to clk_main_a0 but your design does not support the high speed of 250 MHz, you must use the cl_axi_clock_converter_light IP so that your clock and data remain synchronized.
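The two checks discussed in this section can be summarized in a few lines. This is a sketch under the assumptions stated in this post (clk_main_a0 fixed at 250 MHz on F2); the helper names are hypothetical, and the 62.5 MHz and 125 MHz values are the DUT clocks from the example above.

```python
import math

# clk_main_a0 frequency on F2, as stated in the text.
F2_CLK_MAIN_A0_MHZ = 250.0


def clocks_to_generate(design_clocks_mhz):
    """Return the design clocks that clk_main_a0 cannot supply directly."""
    return [f for f in design_clocks_mhz
            if not math.isclose(f, F2_CLK_MAIN_A0_MHZ)]


def needs_axi_clock_converter(interface_clock_mhz: float) -> bool:
    """True when the logic facing the Shell runs slower than clk_main_a0."""
    if math.isclose(interface_clock_mhz, F2_CLK_MAIN_A0_MHZ):
        return False
    return interface_clock_mhz < F2_CLK_MAIN_A0_MHZ


if __name__ == "__main__":
    dut_clocks = [62.5, 125.0]  # the two DUT clocks from the example
    print("generate with AWS_CLK_GEN:", clocks_to_generate(dut_clocks))
    print("converter needed at 125 MHz:", needs_axi_clock_converter(125.0))
```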

Integrating the cl_axi_clock_converter_light IP is not complex as AWS has provided a pre-built version of this IP. All you need to do is instantiate cl_axi_clock_converter_light in your top-level design file, connect the ports, and source the IP in the synthesis script as follows:

## AXI Conversion IP's
read_ip [ list \
  ${HDK_IP_SRC_DIR}/cl_axi_clock_converter/cl_axi_clock_converter.xci \
  ${HDK_IP_SRC_DIR}/cl_axi_clock_converter_light/cl_axi_clock_converter_light.xci
]

Integrating the AWS_CLK_GEN IP is similar. Instantiate aws_clk_gen in your top-level design file, connect the ports, and source the IP in the synthesis script as follows:

## Clocking IP's
read_ip [ list \
  $HDK_SHELL_DESIGN_DIR/../../ip/cl_ip/cl_ip.srcs/sources_1/ip/clk_mmcm_a/clk_mmcm_a.xci \
  $HDK_SHELL_DESIGN_DIR/../../ip/cl_ip/cl_ip.srcs/sources_1/ip/clk_mmcm_b/clk_mmcm_b.xci \
  $HDK_SHELL_DESIGN_DIR/../../ip/cl_ip/cl_ip.srcs/sources_1/ip/clk_mmcm_c/clk_mmcm_c.xci \
  $HDK_SHELL_DESIGN_DIR/../../ip/cl_ip/cl_ip.srcs/sources_1/ip/clk_mmcm_hbm/clk_mmcm_hbm.xci \
  $HDK_SHELL_DESIGN_DIR/../../ip/cl_ip/cl_ip.srcs/sources_1/ip/cl_clk_axil_xbar/cl_clk_axil_xbar.xci \
  $HDK_SHELL_DESIGN_DIR/../../ip/cl_ip/cl_ip.srcs/sources_1/ip/cl_sda_axil_xbar/cl_sda_axil_xbar.xci
]
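Before starting a long synthesis run, it can help to confirm that the .xci files named in these read_ip lists are actually present. The following sketch checks the two AXI converter paths under HDK_IP_SRC_DIR; the helper is hypothetical and not part of the AWS HDK, and the same function can be pointed at the clocking IP paths.

```python
import os
from pathlib import Path

# Relative .xci paths copied from the AXI conversion read_ip list above.
AXI_CONVERTER_IPS = [
    "cl_axi_clock_converter/cl_axi_clock_converter.xci",
    "cl_axi_clock_converter_light/cl_axi_clock_converter_light.xci",
]


def missing_ip_files(base_dir, relative_paths):
    """Return the relative paths that do not exist as files under base_dir."""
    base = Path(base_dir)
    return [rel for rel in relative_paths if not (base / rel).is_file()]


if __name__ == "__main__":
    base = os.environ.get("HDK_IP_SRC_DIR")
    if base is None:
        print("HDK_IP_SRC_DIR is not set")
    else:
        for rel in missing_ip_files(base, AXI_CONVERTER_IPS):
            print("missing:", rel)
```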

Now that VeriFire has been ported to F2, let’s look at some basic commands used to load and run an emulation in an interactive session.

Running VeriFire on F2

The process of integrating your DUT with VeriFire is the same as it was on F1. More detailed information can be found in a previous blog post.

Once your DUT has been integrated with VeriFire, use firebolt to communicate with your DUT via a Command Line Interface (CLI). The following steps show the commands that need to be executed.

First, load the AGFI by entering the following command:

firebolt --load-image agfi-0b42c67166d5971c5

Once the AGFI is loaded, open a port to view log messages from your DUT via the following command:

firebolt --launch-openocd

Next, build a test specific to the DUT and spawn a debug session with the following two commands:

firebolt build.dut-fpu.gcc.fp32_add_ecu.1 --dry-run
firebolt debug.dut-fpu.gcc.fp32_add_ecu.1 --dry-run

In order to be able to attach to the GNU Debugger (gdb) session, where you have visibility into the design, use the following command:

tmux attach-session -t gdb-session-C0

Use Ctrl+b d to detach from the session. To see prints, attach to the openocd session:

sudo tmux attach-session -t openocd-session-C0

Use Ctrl+b d to detach from the session. When you’re done, clear the sessions with the following command:

sudo tmux kill-server && tmux kill-server

The log messages from your DUT can also be dumped to a file in the directory from which you launched firebolt.
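For repeated runs, the firebolt commands above can be collected into a small driver. This is a sketch only: it reuses exactly the commands and flags shown in this post, prints them by default, and runs them only when asked. Whether each command returns promptly or stays in the foreground is not covered here, so check that before setting `execute=True`.

```python
import shlex
import subprocess


def session_commands(agfi_id: str, test_name: str):
    """Return the firebolt commands from this walkthrough, in order, as argv lists."""
    return [
        ["firebolt", "--load-image", agfi_id],
        ["firebolt", "--launch-openocd"],
        ["firebolt", f"build.{test_name}", "--dry-run"],
        ["firebolt", f"debug.{test_name}", "--dry-run"],
    ]


def run_session(commands, execute=False):
    """Print each command; run it only when execute is True, stopping on the first failure."""
    for argv in commands:
        print("+", shlex.join(argv))
        if execute:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    cmds = session_commands("agfi-0b42c67166d5971c5", "dut-fpu.gcc.fp32_add_ecu.1")
    run_session(cmds)  # prints the commands without running them
```

Passing the commands as argument lists rather than shell strings avoids quoting problems if a test name ever contains unusual characters.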

Conclusion

By using SilverLining EDA’s VeriFire on the AWS cloud, ASIC designers can achieve a multi-fold speedup in their design cycle and therefore bring their products to market faster in this rapidly evolving generative AI environment. The state-of-the-art AMD FPGAs available through AWS F2 instances, combined with AWS orchestrators on large HPC clusters, allow massively parallel processing of jobs to train, design, and verify at scale.

As you face challenges in design verification and model validation in the high-tech industry, you can count on the AWS team to help you explore solutions tailored to your needs. Reach out to your AWS account team to start the conversation about accelerating your design and verification workflows.

AWS HPC Blog
