Profiling RISC-V with gprof


(Ivan Kvesic) #1

Hello,

I have a problem compiling a C program for RISC-V for profiling purposes. I wanted to run a RISC-V program on spike and get profiling results, but during linking I got this error when I used the -pg flag:
"Undefined reference to _mcount"
If anyone has had a similar issue, I would like to know how they fixed it. I used riscv-tools from the GitHub site to get all the necessary files and folders for building RISC-V on Ubuntu.

If you need more info, I can provide it.

Thanks in advance,

P.S.
Sorry, this is the first time I’m posting a topic on a forum, so I am not sure if I’ve provided enough info.


(Bruce Hoult) #2

_mcount() is the special function that gcc inserts a call to in every function, when compiling with -pg, to gather the statistics. It is a part of glibc.

I’m guessing you used riscv64-unknown-elf-gcc – you really should show exactly what you did when you ask for help. This uses newlib, not glibc, and so doesn’t have the support function for gprof.

The following worked for me (producing a gmon.out):

riscv64-unknown-linux-gnu-gcc -static -pg hello.c -o hello
qemu-riscv64 hello

Note: this doesn’t work with spike and pk because of “bad syscall #103!”. pk’s support of system calls is much more limited than qemu’s.


(Jim Wilson) #3

It would be possible to add some limited profiling support to the RISC-V newlib port. Some other ports have support for this. It just hasn’t been written for RISC-V yet. Full support requires timer interrupts which is not easy to do in an embedded system. But we could provide a count of the number of times a function is called. You also need a way to write the results to a file which is also hard to do in an embedded system, but if spike has support to read/write files it might work. Note that gcov profiling does not need a timer, and does not need any special C library support so is a bit simpler to support. But it does still need a way to write results to a file at the end.
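
To make the call-counting idea above concrete, here is a minimal C sketch of the bookkeeping a count-only hook could do. It is not the glibc or newlib implementation: `record_call`, `calls_to`, `profiled_fn`, and the fixed table size are hypothetical names chosen for illustration. A real `_mcount` is called automatically by compiler-inserted code and also records the caller.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FUNCS 64                 /* assumed capacity, for illustration only */

typedef void (*profiled_fn)(void);   /* generic function-pointer key */

struct call_count {
    profiled_fn func;                /* function being counted */
    unsigned long calls;             /* number of times it was entered */
};

static struct call_count call_table[MAX_FUNCS];
static size_t call_table_used;

/* Hypothetical hook: bump the counter for `func`, adding a slot on first use. */
void record_call(profiled_fn func)
{
    assert(func != NULL);
    for (size_t i = 0; i < call_table_used; i++) {
        if (call_table[i].func == func) {
            call_table[i].calls++;
            return;
        }
    }
    if (call_table_used < MAX_FUNCS) {   /* calls are dropped once the table is full */
        call_table[call_table_used].func = func;
        call_table[call_table_used].calls = 1;
        call_table_used++;
    }
}

/* Return how many times `func` was recorded, or 0 if it was never seen. */
unsigned long calls_to(profiled_fn func)
{
    for (size_t i = 0; i < call_table_used; i++) {
        if (call_table[i].func == func)
            return call_table[i].calls;
    }
    return 0;
}
```

The sketch needs no timer, which matches the point that call counts are the easy part. Writing the table to a file at exit is the piece that remains hard on an embedded target, and it is left out here.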


(Ivan Kvesic) #4

Hi guys,

Thank you for answering at such short notice.
I was finally able to compile and link with the -pg flag, but I couldn’t get timing in gprof. The program’s output was correct after qemu ran it, and gprof showed the number of calls, but its output said “No time accumulated”. Any ideas?

BR,
Ivan


(Jim Wilson) #5

gprof requires OS support. Do you have an OS? Does this OS have gprof support? See “man profil” on any unix system for info on the system call that you need to make this work.
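
For readers who have not looked at “man profil”: the mechanism it describes is a periodic timer tick that samples the program counter and increments a histogram bucket for that address. The C sketch below illustrates only that bookkeeping. It is a simplification, not glibc's code, and it does not reproduce profil's scale arithmetic; `record_sample`, `total_samples`, and the bucket sizes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_BUCKETS 16        /* assumed histogram size */
#define BYTES_PER_BUCKET 4    /* assumed: each bucket covers 4 bytes of code */

static unsigned short histogram[NUM_BUCKETS];

/* Record one timer tick that interrupted the program at address `pc`.
 * Returns 1 if the sample fell inside the profiled range, 0 otherwise. */
int record_sample(uintptr_t text_start, uintptr_t pc)
{
    assert(BYTES_PER_BUCKET > 0);        /* guards the division below */
    if (pc < text_start)
        return 0;
    size_t bucket = (size_t)((pc - text_start) / BYTES_PER_BUCKET);
    if (bucket >= NUM_BUCKETS)
        return 0;
    histogram[bucket]++;
    return 1;
}

/* Sum of all buckets: the number of ticks attributed to the program. */
unsigned long total_samples(void)
{
    unsigned long total = 0;
    for (size_t i = 0; i < NUM_BUCKETS; i++)
        total += histogram[i];
    return total;
}
```

If no ticks are ever delivered, every bucket stays at zero. That would be consistent with gprof showing call counts but reporting that no time was accumulated.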


(Ivan Kvesic) #6

I was able to profile with “normal” gcc and get the timing of each function,
but I can’t with the riscv64-unknown-linux-gnu-gcc compiler.
I hope that answers your question.


(Jim Wilson) #7

You need to precisely describe what you are doing, giving me enough info that I can reproduce. Otherwise I may not be able to help.

gcc -pg does work if run on riscv-linux, running on hardware. There may be issues with using qemu, as it is a simulator, but not a cycle accurate or time accurate simulator, so it has no reasonable way to give you any useful timing info. qemu does support the instruction retired count register, instret, so you could try generating your own “timing” info by counting number of instructions. Of course, that will count add and divide both as one instruction, and won’t give you any info about cache misses, but you just can’t get that kind of info from qemu.
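
A rough C sketch of the instruction-counting suggestion above follows. It assumes a 64-bit RISC-V target where user code is allowed to read the counter. `read_instret`, `instret_delta`, and `instructions_for` are hypothetical helper names, and whether the value returned under a given emulator is a true retired-instruction count is something to verify rather than assume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read the retired-instruction counter (assumes RV64 with user-mode access). */
uint64_t read_instret(void)
{
#if defined(__riscv) && (__riscv_xlen == 64)
    uint64_t n;
    __asm__ volatile ("rdinstret %0" : "=r"(n));
    return n;
#else
    return 0;   /* placeholder so the sketch still compiles on other hosts */
#endif
}

/* Instructions retired between two readings; unsigned subtraction
 * gives the right answer even if the counter wrapped around once. */
uint64_t instret_delta(uint64_t start, uint64_t end)
{
    return end - start;
}

/* Run `work` once and return the counter difference around the call. */
uint64_t instructions_for(void (*work)(void))
{
    assert(work != NULL);
    uint64_t start = read_instret();
    work();
    uint64_t end = read_instret();
    return instret_delta(start, end);
}
```

Wrapping the C version and the assembly version of a routine in `instructions_for` would give two instruction counts to compare. As noted above, this treats an add and a divide as equal cost and says nothing about cache misses.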

If you want accurate timing info, you can’t use qemu. Spike can give accurate timing info, but spike has only a fraction of the features of qemu, and what you are trying to do may not work in spike. The only easy way to do what you want is to get some hardware that can run linux, and then do the work on linux.


(Ivan Kvesic) #8

Yeah, I should explain what I’m trying to achieve.
I am a student and I have a project in which I want to optimize a JPEG program running on RISC-V. For now I just wanted to use qemu as a kind of RISC-V simulator (no hardware yet) to show me the timing difference between C code and code optimized in assembly. That’s why I wanted to use a profiler. Here is what I’ve done:
Yesterday I installed the RISC-V Linux toolchain from riscv-gnu-toolchain. I installed qemu with ./configure --prefix=$RISCV --target-list=riscv64-linux-user, and after that I tried what @bruce suggested. Again I got the program output, but no timing after qemu exited:

  1. ./build.sh from riscv-tools with --enable-linux in riscv-toolchain
  2. ./configure --prefix=$RISCV --target-list=riscv64-linux-user
  3. make
  4. make install
  5. riscv64-unknown-linux-gnu-gcc -static -pg Jpeg.c -o Jpeg
  6. qemu-riscv64 Jpeg
  7. riscv64-unknown-linux-gnu-gprof Jpeg gmon.out


(Bruce Hoult) #9

I think your plan is probably not a very good one.

RISC-V was designed so that the machine code/assembly language contains only exactly what is needed to compile and run C (and similar languages) efficiently. If you use a modern compiler such as gcc or llvm then there will be little or nothing to be gained from hand-writing in assembly language.

Any improvements you do make will probably be so small that running in a simulation such as qemu will be misleading.

Also, running qemu on a modern x86 machine will be misleading because the “Core” series (whether Core 2 Duo or the latest one) microarchitecture is so different from any existing RISC-V CPU.

To detect any difference between the performance of compiled C and hand-written assembly language you’ll need to run on a cycle-accurate simulation, which qemu isn’t.