How can I repeat the CoreMark score?

(jiangfengxu) #1

Hi all,
I ran CoreMark on my HiFive Unleashed board. I was getting an average frequency of 1001 MHz and a CoreMark score of 2.39 CoreMark/MHz, which is lower than the reported 2.75 CoreMark/MHz. I was wondering if there is any specific reason why the benchmark results aren’t measuring up to the reported figures, such as specific compiler optimizations that I may not be using. Any insight would be helpful, thank you!

The toolchain version is gcc 7.2.0.
Compile options:
-O3 -march=rv64imafdc -mabi=lp64d -funroll-all-loops -fgcse-sm -finline-limit=500 -fno-schedule-insns
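
For reference, the size of the gap in the post above can be worked out with a few lines of C. This is only a sketch: the function names are made up for illustration, and the raw iterations-per-second figure used in the usage note is back-calculated from the quoted 2.39 CoreMark/MHz at 1001 MHz, not a number given in the thread.

```c
/* CoreMark reports a raw score in iterations per second.
 * Dividing by the clock in MHz gives the normalized CoreMark/MHz figure.
 * (Function names here are hypothetical, chosen for this example.) */
double coremark_per_mhz(double iterations_per_sec, double clock_mhz)
{
    return iterations_per_sec / clock_mhz;
}

/* How far a measured score falls below a reported one, in percent. */
double shortfall_percent(double measured, double reported)
{
    return (reported - measured) / reported * 100.0;
}
```

Calling `shortfall_percent(2.39, 2.75)` gives roughly 13, so the measured result is about 13% below the reported figure. Calling `coremark_per_mhz(2392.39, 1001.0)` gives back the 2.39 quoted above.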

(Bruce Hoult) #2

I haven’t run CoreMark on the Unleashed myself, but on the HiFive1 we’re using:

-O2 -fno-common -funroll-loops -finline-functions --param max-inline-insns-auto=20 -falign-functions=4 -falign-jumps=4 -falign-loops=4


I’ve also found that using -msave-restore improves performance slightly if you’re using a RISC-V gcc version that supports it (this replaces long sequences of register save and restore instructions in function entry/exit with calls to small runtime functions). This makes it faster because CoreMark only just barely fits into the 16 KB of L1 instruction cache on the HiFive1. It might not help with the bigger 32 KB icache on the Unleashed.

-O3 and too much loop unrolling might make it slower for the same reason – or might be ok.

The HiFive1 gets very close to the same CoreMark/MHz as the HiFive Unleashed, and certainly more than you are reporting.

(jiangfengxu) #3

Hi Bruce, I have tried the compilation options you provided and got a new score of 2.37 CoreMark/MHz, which is still lower than the reported one. Are there other factors besides the compilation options?
By the way, I got a Dhrystone score of 1.13 DMIPS, which is lower than the reported 1.7 DMIPS.
Can you tell me which compilation options I can use to repeat the score? Thank you!

(Jim Wilson) #4

I think the main trick is defining ee_u32 to be signed int instead of unsigned int, as 32-bit values on a 64-bit RISC-V must always be sign-extended, so this avoids some useless sign extension shifts while still giving the correct result. Also, use the compiler options Bruce mentioned, which you can find in the sifive/freedom-e-sdk project on GitHub.
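
A minimal sketch of what the ee_u32 change amounts to, assuming the typedef lives in the port header (core_portme.h in a typical CoreMark port). The two typedef names and the helper functions below are hypothetical, made up so that both variants can sit side by side; a real port has a single `ee_u32` typedef. This is not the actual SiFive port file.

```c
#include <stdint.h>

/* Hypothetical names so both definitions can coexist in one file.
 * In a real port there is one typedef named ee_u32. */
typedef unsigned int ee_u32_unsigned; /* the usual definition */
typedef signed int   ee_u32_signed;   /* the variant described above */

/* Sum a small buffer using an unsigned 32-bit counter and accumulator. */
uint32_t sum_with_unsigned(const uint8_t *data, ee_u32_unsigned n)
{
    ee_u32_unsigned i, acc = 0;
    for (i = 0; i < n; i++)
        acc += data[i];
    return (uint32_t)acc;
}

/* Same loop with the signed typedef. The result is identical as long as
 * the values stay within the range of int; signed overflow is undefined
 * behavior in C, so this equivalence does not hold past that point. */
uint32_t sum_with_signed(const uint8_t *data, ee_u32_signed n)
{
    ee_u32_signed i, acc = 0;
    for (i = 0; i < n; i++)
        acc += data[i];
    return (uint32_t)acc;
}
```

For a buffer holding 1, 2, 3 and 250, both functions return 256. Whether the signed version actually produces fewer instructions depends on the compiler version and flags, so it is worth comparing the generated assembly on your own toolchain.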

By the way, gcc-7.2 is known to be broken, and in general should not be used. You are probably using riscv-tools. Don’t. That is a mistake. riscv-tools isn’t being maintained. Use riscv-gnu-toolchain instead. Or use upstream FSF toolchain sources.

(jiangfengxu) #5

Thank you very much, I was able to repeat the CoreMark score with this information.
By the way, do you know the compilation options behind the reported Dhrystone score of 1.7 DMIPS?

(Jim Wilson) #6

The compilation options in freedom-e-sdk/software/dhrystone should work, but dhrystone is a terrible benchmark. A lot of the code gets optimized away, and the resulting code is so small that it ends up being very sensitive to the compiler version, and to secondary effects like instruction and data alignment in cache lines which the compiler has little control over. It may take some experimenting to get good dhrystone results, and all of this dhrystone benchmark hackery is pretty useless for real world code.

(jiangfengxu) #7

Thank you, I’ll try it on my HiFive Unleashed board. By the way, was the reported CoreMark score of 2.75 CoreMark/MHz measured with ee_u32 defined as signed int instead of unsigned int? Is it reasonable, after all, to modify the code?

(Jim Wilson) #8

We did not modify the code of the benchmark. We modified the file that is supposed to be ported to each target, and is supposed to be modified to support the target the best way possible.

Apparently nothing in the rules says that ee_u32 has to be an unsigned type.

Note that I’m not the one responsible for this, and I’m not claiming to support it. I’m just pointing out that this was done, is typical of how the industry handles benchmarking, and that you should never really trust any benchmark score published by anyone. You should compile and measure the performance of your own code and make decisions based on that.