# RISC-V Core FPGA/ASIC Performance Comparison: A 45nm Case Study

Mohammed El-desouky<sup>#</sup>

<sup>#</sup>Nanotechnology and Nano electronics Engineering Department, UST at Zewail City 12578 Ahmed Zewail Street, October Gardens 6th of October City, Giza, Egypt

<sup>1</sup>s-mohammed.eldesouky@zewailcity.edu.eg

*Abstract*: RISC-V is an open Instruction Set Architecture (ISA) that is expected to dominate the market in the next few years. The market is forecast to consume 62.4 billion RISC-V CPU cores by 2025. In this project a RISC-V core is physically implemented as an ASIC using the Nangate Open Cell Library 45nm PDK, and its performance is compared to an implementation on a 45nm-based Spartan 6 FPGA.

Keywords-FPGA, ASIC, RISC-V

#### I. INTRODUCTION

RISC-V is an open ISA that was introduced in 2010. In recent years it has been commercialized, and it is expected to dominate the market because it is efficient and open, with numerous variants that serve almost every application from the Internet of Things (IoT) to High Performance Computing (HPC). In this project a RISC-V core is implemented as an ASIC using the Nangate Open Cell Library 45nm PDK (ff1p25v0c library), with the highest possible frequency as the design goal, and its performance is then compared to an implementation on a 45nm-based Spartan 6 FPGA.

# II. SYNTHESIS

The synthesis was carried out using Synopsys Design Compiler (DC) with constraints on frequency, input and output capacitance and clock jitter/skew, as described in the following subsection.

#### A. Constraints

First, an estimate of the achievable frequency was looked up in order to suggest a suitable clock period for the target technology and RTL. An initial period of 4 ns with the normal compilation strategy yielded a negative slack of -1.19 ns, so the critical paths were identified, grouped and assigned a 10 times higher priority. Table 1 summarizes the constraints used in the synthesis.

TABLE I CONSTRAINTS USED IN SYNTHESIS

| CONSTRAINT        | VALUE    |
|-------------------|----------|
| CLOCK PERIOD      | 1.8 NS   |
| INPUT DELAY       | 0.018 NS |
| OUTPUT DELAY      | 0.018 NS |
| CLOCK UNCERTAINTY | 0.36     |
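As a quick sanity check on Table I, the sketch below expresses each constraint as a fraction of the clock period and converts the period to a target frequency. The constant and function names are illustrative and are not taken from the project scripts. The clock uncertainty is assumed to be in nanoseconds, because the table does not state its unit.

```python
# Values taken from Table I. The unit of the clock uncertainty is not
# given in the table; nanoseconds is an ASSUMPTION made for this sketch.
CLOCK_PERIOD_NS = 1.8
CONSTRAINTS_NS = {
    "input delay": 0.018,
    "output delay": 0.018,
    "clock uncertainty": 0.36,  # assumed to be ns
}


def period_to_mhz(period_ns: float) -> float:
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    return 1000.0 / period_ns


def fraction_of_period(value_ns: float, period_ns: float) -> float:
    """Express a timing constraint as a fraction of the clock period."""
    return value_ns / period_ns


if __name__ == "__main__":
    print(f"target frequency: {period_to_mhz(CLOCK_PERIOD_NS):.1f} MHz")
    for name, value_ns in CONSTRAINTS_NS.items():
        share = fraction_of_period(value_ns, CLOCK_PERIOD_NS)
        print(f"{name}: {value_ns} ns = {share:.0%} of the period")
```

Under that assumption the input and output delays are each 1% of the period and the clock uncertainty is 20% of it. A 1.8 ns period corresponds to about 555.6 MHz, which the results tables quote as 555 MHz.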

# B. Synthesis Strategy

The design was first flattened to enable cross-boundary optimization, by invoking the ungroup command [1] with the options -all -flatten. A path group was then created for the multiplier nets (the slowest nets in the design) and the priority of these nets was set to 10. Finally, the compile\_ultra command was invoked with the option -timing\_high\_effort\_script [1].

## C. Results

Table 2 summarizes the synthesis results.

TABLE II SYNTHESIS RESULTS SUMMARY

| PARAMETER                  | VALUE                          |
|----------------------------|--------------------------------|
| FREQUENCY / CLOCK PERIOD   | 555 MHz / 1.8 NS               |
| NUMBER OF SETUP VIOLATIONS | 0                              |
| NUMBER OF HOLD VIOLATIONS  | 0                              |
| TOTAL CELLS AREA           | 74540.649100 µm<sup>2</sup>    |

#### **III. PLACEMENT AND ROUTING**

Placement and routing were carried out using Synopsys IC Compiler (ICC). The detailed steps are described below.

# A. Floor planning and Power Network Synthesis

Floor planning was carried out with core utilization as the control factor, by invoking the create\_floorplan command with the option -core\_utilization set to 0.25 [2]. The top four metal layers were assigned to power. The virtual power pads were added manually from the graphical user interface and their locations were then saved to the script for later use. The power network was committed and analysed with a target drop of 2% of the supply voltage, which is 22 mV. Fig 1 shows the result of the IR drop analysis; the maximum drop is 6.8 mV.



Fig 1: Voltage drop map
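The IR drop result can be read against its budget with a short calculation. The sketch below uses only the two figures quoted in the text (a 22 mV budget and a 6.8 mV worst-case drop); the function names are illustrative.

```python
# Figures quoted in the text: a 22 mV drop budget (2% of the supply)
# and a worst-case drop of 6.8 mV from the IR drop analysis.
DROP_BUDGET_MV = 22.0
MAX_DROP_MV = 6.8


def ir_drop_margin_mv(budget_mv: float, worst_mv: float) -> float:
    """Headroom left between the budget and the worst-case drop."""
    return budget_mv - worst_mv


def budget_used(budget_mv: float, worst_mv: float) -> float:
    """Fraction of the IR drop budget consumed by the worst-case drop."""
    return worst_mv / budget_mv


if __name__ == "__main__":
    margin = ir_drop_margin_mv(DROP_BUDGET_MV, MAX_DROP_MV)
    used = budget_used(DROP_BUDGET_MV, MAX_DROP_MV)
    print(f"margin: {margin:.1f} mV, budget used: {used:.0%}")
```

The worst-case drop consumes roughly 31% of the budget, leaving about 15 mV of headroom.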

## B. Placement and Routing

The placement was carried out by invoking the place\_opt command. Clock Tree Synthesis (CTS) was carried out with a target skew of 0.05. After CTS, routing was carried out using the route\_opt command with the options -effort high, -xtalk\_reduction and -stage track [2]. Routing took around 6 hours with this setup. The fully routed chip is shown in Figs 2 and 3.



Fig 2: Fully routed core



Fig 3: Close-up view of the fully routed core

# C. Sign off and Results

The power connections were checked and the filler cells were added. DRC violations were then checked and 2 of them were eliminated by hand, after which the route\_eco command was invoked, followed by route\_opt again with the options -effort high, -only\_design\_rule and -incremental. The DRC violations were reduced and concentrated in only 5 nets. The step was repeated, but the new run violated the setup time in 72 instances, so that run was not saved. Table 3 summarizes the results after signing off the chip.

TABLE III RESULTS SUMMARY

| PARAMETER                  | VALUE                       |
|----------------------------|-----------------------------|
| FREQUENCY / CLOCK PERIOD   | 555 MHz / 1.8 NS            |
| NUMBER OF SETUP VIOLATIONS | 0                           |
| NUMBER OF HOLD VIOLATIONS  | 0                           |
| TOTAL CELLS AREA           | 74540.649100 µm<sup>2</sup> |
| POWER                      | 0.056.53 W                  |
| CHIP AREA                  | 297280 µm<sup>2</sup>       |
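The reported areas can be cross-checked against the 0.25 core utilization requested at floor planning. The sketch below assumes that the chip area and the total cell area are given in the same unit and that the chip area is the area the utilization was defined against; neither is stated explicitly, and the names are illustrative.

```python
# Areas from the sign-off results table. ASSUMPTION: both values share
# the same unit, and the chip area is the area utilization refers to.
TOTAL_CELL_AREA = 74540.6491
CHIP_AREA = 297280.0
TARGET_UTILIZATION = 0.25  # value passed to the floorplanner


def utilization(cell_area: float, chip_area: float) -> float:
    """Ratio of standard-cell area to the available chip area."""
    return cell_area / chip_area


if __name__ == "__main__":
    achieved = utilization(TOTAL_CELL_AREA, CHIP_AREA)
    print(f"achieved utilization: {achieved:.4f}")
    print(f"deviation from target: {achieved - TARGET_UTILIZATION:+.4f}")
```

Under these assumptions the ratio comes out at about 0.251, which agrees with the 0.25 target set at floor planning.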

### **III. STATIC TIMING ANALYSIS**

Finally, static timing analysis was carried out using Synopsys PrimeTime [3]. There were no setup or hold violations; however, the reported slack differed from that of DC and ICC, as shown in Table 4.

TABLE IV TIMING ANALYSIS COMPARISON

| TOOL | SLACK IN MAX PATH | SLACK IN MIN PATH |
|------|-------------------|-------------------|
| DC   | 0                 | 0                 |
| ICC  | 0.19              | 0                 |
| PT   | 0.27              | 0.01              |
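Positive slack on the max (setup) path suggests that the worst path would still meet timing at a somewhat shorter period. The sketch below turns the Table IV max-path slack values into a first-order estimate of the shortest period and the matching frequency. It assumes the slack values are in nanoseconds, which the table does not state, and it ignores any change the tools would make if the design were re-run with a tighter clock. The names are illustrative.

```python
# Max-path (setup) slack per tool from Table IV. ASSUMPTION: values are
# in nanoseconds, measured against the 1.8 ns clock period.
CLOCK_PERIOD_NS = 1.8
SETUP_SLACK_NS = {"DC": 0.0, "ICC": 0.19, "PT": 0.27}


def min_period_ns(period_ns: float, slack_ns: float) -> float:
    """First-order shortest period: the constraint minus the setup slack."""
    return period_ns - slack_ns


def max_frequency_mhz(period_ns: float, slack_ns: float) -> float:
    """Frequency in MHz corresponding to the first-order shortest period."""
    return 1000.0 / min_period_ns(period_ns, slack_ns)


if __name__ == "__main__":
    for tool, slack_ns in SETUP_SLACK_NS.items():
        period = min_period_ns(CLOCK_PERIOD_NS, slack_ns)
        freq = max_frequency_mhz(CLOCK_PERIOD_NS, slack_ns)
        print(f"{tool}: {period:.2f} ns -> {freq:.1f} MHz")
```

Read this way, the PrimeTime slack of 0.27 corresponds to a period of about 1.53 ns, or roughly 654 MHz, although the design was signed off at the constrained 1.8 ns.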

## IV. FPGA IMPLEMENTATION

The core was implemented on a Spartan 6 XC6SLX45T board using Xilinx ISE. The RTL schematic is shown in Fig 4.



Fig 4: RTL schematic of the core

The design was synthesised with the built-in frequency enhancement strategy. Table 5 summarizes the design attributes.

TABLE V FPGA DESIGN ATTRIBUTES

| ATTRIBUTE         | VALUE                                                |
|-------------------|------------------------------------------------------|
| MAXIMUM FREQUENCY | 34.145 MHz                                           |
| POWER             | 0.086.94 W                                           |
| UTILIZATION       | 7380 SLICE REGISTERS (13%)<br>14031 SLICE LUTS (51%) |

Fig 5 illustrates the floorplan of the design on the target FPGA. Fig 6 shows the routed design on the target FPGA.





Fig 6: Routed core

## V. PERFORMANCE COMPARISON

The selected target FPGA, the Spartan 6 XC6SLX45T, is 45nm based. It was selected so that the general performance comparison with the ASIC implementation, which uses the available 45nm PDK, is not skewed by a difference in technology node. Table 6 summarizes the performance of the RISC-V core implemented both as an ASIC and on an FPGA.

The results are consistent with the general performance gap between ASIC and FPGA designs: the ASIC is around 16x faster (555 MHz versus 34.145 MHz) and consumes much less power per unit frequency. However, the FPGA provides much more flexibility with a very short development time.

TABLE VI FPGA AND ASIC PERFORMANCE COMPARISON

|                         | FPGA                                                 | ASIC                                       |
|-------------------------|------------------------------------------------------|--------------------------------------------|
| MAXIMUM FREQUENCY       | 34.145 MHz                                           | 555 MHz / 1.8 ns                           |
| POWER AT THAT FREQUENCY | 0.086.94 W                                           | 0.056.53 W                                 |
| RESOURCES               | 7380 SLICE REGISTERS (13%)<br>14031 SLICE LUTS (51%) | 41139 CELLS WITH AREA 74540 µm<sup>2</sup> |
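The gap in Table VI can be put into numbers with a short calculation. The frequencies are taken directly from the table and give a ratio of roughly 16x. The power entries are printed as 0.086.94 W and 0.056.53 W; the sketch below assumes they read 0.08694 W and 0.05653 W, which is an interpretation and not a confirmed value. The names are illustrative.

```python
# Frequencies from Table VI.
FPGA_FREQ_MHZ = 34.145
ASIC_FREQ_MHZ = 555.0

# ASSUMPTION: the table prints the power values as "0.086.94 W" and
# "0.056.53 W"; they are read here as 0.08694 W and 0.05653 W.
FPGA_POWER_W = 0.08694
ASIC_POWER_W = 0.05653


def speedup(fast_mhz: float, slow_mhz: float) -> float:
    """Ratio of the two maximum clock frequencies."""
    return fast_mhz / slow_mhz


def power_per_mhz_mw(power_w: float, freq_mhz: float) -> float:
    """Power normalized by clock frequency, in mW per MHz."""
    return power_w * 1000.0 / freq_mhz


if __name__ == "__main__":
    fpga = power_per_mhz_mw(FPGA_POWER_W, FPGA_FREQ_MHZ)
    asic = power_per_mhz_mw(ASIC_POWER_W, ASIC_FREQ_MHZ)
    print(f"frequency ratio: {speedup(ASIC_FREQ_MHZ, FPGA_FREQ_MHZ):.1f}x")
    print(f"FPGA: {fpga:.3f} mW/MHz, ASIC: {asic:.3f} mW/MHz")
    print(f"power-per-MHz ratio: {fpga / asic:.1f}x")
```

With that reading of the power figures, the FPGA draws about 2.5 mW per MHz against about 0.1 mW per MHz for the ASIC, a ratio of roughly 25x in favour of the ASIC.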

# VI. CONCLUSIONS

The aim of this project was to compare the performance of FPGA and ASIC implementations of a RISC-V core at the 45nm technology node, with achieving the highest frequency as the main design goal. The results are best summarized in Table 6 and are consistent with the known performance gap between ASICs and FPGAs.

#### ACKNOWLEDGMENT

The author would like to thank Dr Hassan Mustafa and Engineers Bassant Hassan and Mohammed Adel for their dedication and effort throughout the course.

#### References

- [1] Design Compiler User Guide, Synopsys Inc., USA, 2005.
- [2] IC Compiler User Guide, Synopsys Inc., USA, 2010.
- [3] PrimeTime User Guide, Synopsys Inc., USA, 2011.

All the project files/scripts can be found here. Access will be granted based on personal communication with the author of this document.

