RISC-V software usage

Hi there,

I have a question about disassembling a simple C program. What I want is a file that shows the C source alongside the compiled assembly. I have figured out how to do this using:

riscv-gcc -g -c test.c
riscv-objdump -d -S test.o

I also want to get a feel for the program counter (PC), so it would be good to know how frequently each assembly instruction gets executed in a program. I know we can do that with the Spike simulator, but how do we combine all the information (C vs. ASM + PC log) into a single file, if possible?

Thanks,
Dong

I do not know of a standard way to combine the two files, but you can use the spike-dasm utility to disassemble your log from Spike or rocket-chip based simulations. It is installed with the rest of riscv-tools:

cat spike.log | $RISCV/bin/spike-dasm  

Hi @mwachs5

Thank you for your reply! But that is for use with the C++ simulator, right? spike-dasm is used to translate the simulation output into readable form.

In this case, I think what I want is to take a C program, translate it to assembly, and know how frequently each instruction gets executed. Is there any way to do that?

Thanks,
Dong

There is no existing tool that I know of that does what you want to do. In order to know how frequently each instruction is executed, you need to run the C program through a simulator. You should be able to write a small script that takes the simulation output and counts the frequency of each instruction.
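As a rough illustration of the "small script" idea, here is a minimal Python sketch that counts how often each PC appears in an instruction trace. It assumes trace lines of the form "core   0: 0x<pc> (0x<insn>) <disassembly>", which is the shape of the Spike log lines quoted in this thread; the name count_pcs and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumed trace line shape (as in Spike logs):
#   core   0: 0x0000000080003f9e (0x30f030ef) jal     pc + 0x3b0e
TRACE_RE = re.compile(r"^core\s+\d+:\s+0x([0-9a-fA-F]+)\s+\(0x[0-9a-fA-F]+\)")


def count_pcs(lines):
    """Return a Counter mapping each PC (as an int) to its execution count."""
    counts = Counter()
    for line in lines:
        m = TRACE_RE.match(line)
        if m:  # lines that are not trace records are ignored
            counts[int(m.group(1), 16)] += 1
    return counts


# Hypothetical sample trace: a two-instruction loop body executed twice.
sample = [
    "core   0: 0x00000000000101b6 (0xfe843707) fld     fa4, -24(s0)",
    "core   0: 0x00000000000101ba (0xfe043787) fld     fa5, -32(s0)",
    "core   0: 0x00000000000101b6 (0xfe843707) fld     fa4, -24(s0)",
    "core   0: 0x00000000000101ba (0xfe043787) fld     fa5, -32(s0)",
]

for pc, n in count_pcs(sample).most_common():
    print(f"{pc:x} {n}")
```

For a real run, the same function can be given an open log file, for example count_pcs(open("spike.log")), where spike.log is whatever file the trace was captured into.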

By simulator, do you mean the Spike simulator? Also, is there a way to find out how frequently each piece of C code (and its corresponding assembly) is executed? The best solution I can find at the moment is to use spike -g pk (binary), where -g is for the PC histogram, but I don’t understand what “PC histogram size” means.

Could you please explain it a bit further? Or do you know of anything that can do the jobs above?

Thanks,
Dong

I meant any simulator that can output an instruction trace, which includes both Spike and the simulators that the rocket-chip project can create.

I was unaware of spike -g, but now that I’ve tried it out myself, it looks like the “PC histogram size” just means the number of unique PCs seen.

Hi @rxia Richard,

Thanks for your quick reply. Yes, in that case we would have the frequency of each instruction (represented by addresses in spike -g output).

But we still don’t know which line of C code each address corresponds to. Is there any way to glue this information into a file which shows C and assembly at the same time?

Thanks,
Dong

You can use objdump to disassemble the .o file and annotate it with the C source file and line numbers:

$ riscv64-unknown-elf-objdump -d --line-numbers -S test.o

...
00000000000101a8 <dmul>:
dmul():
/home/rxia/tmp/simple.c:1
double dmul(double a, double b) {
   101a8:       1101                    addi    sp,sp,-32
   101aa:       ec22                    sd      s0,24(sp)
   101ac:       1000                    addi    s0,sp,32
   101ae:       fea43427                fsd     fa0,-24(s0)
   101b2:       feb43027                fsd     fa1,-32(s0)
/home/rxia/tmp/simple.c:2
  return a * b;
   101b6:       fe843707                fld     fa4,-24(s0)
   101ba:       fe043787                fld     fa5,-32(s0)
   101be:       12f777d3                fmul.d  fa5,fa4,fa5
/home/rxia/tmp/simple.c:3
}
   101c2:       22f78553                fmv.d   fa0,fa5
   101c6:       6462                    ld      s0,24(sp)
   101c8:       6105                    addi    sp,sp,32
   101ca:       8082                    ret
...

From there you should have enough information to link the PC counts to C lines.

Thanks for your reply @rxia Richard.

Yes, you are right, we could do that. Is there a better way to link the PC information to the disassembled file? Or do we have to write, for example, a Perl script to parse the objdump output and combine the two manually?

Thanks,
Dong

Sorry, I don’t know of any way to automatically link the PC counts to the lines in the disassembled file. It hopefully isn’t too difficult to write a short script to parse out the mappings, but keep in mind that GCC is free to rearrange the C code when it generates the assembly code, so the mappings may not be perfectly accurate.
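One possible shape for such a script, as a hedged Python sketch rather than an existing tool: it walks the output of objdump -d --line-numbers -S, remembers the most recent "path:line" marker, and attaches it to every instruction address that follows. It assumes source paths are absolute (start with "/") and that the listing looks like the dmul example in this thread; all function names here are hypothetical, and the counts in the demo are invented.

```python
import re
from collections import Counter

FUNC_RE = re.compile(r"^[0-9a-f]+ <.*>:$")                       # 00000000000101a8 <dmul>:
LOC_RE = re.compile(r"^(/.*):(\d+)(?: \(discriminator \d+\))?$")  # /tmp/simple.c:2
INSN_RE = re.compile(r"^\s*([0-9a-f]+):\s+[0-9a-f]+\s")           #    101be:  12f777d3  fmul.d ...


def map_pcs_to_lines(objdump_lines):
    """Map each instruction address to the (path, line) marker preceding it."""
    pc_to_loc = {}
    loc = None
    for line in objdump_lines:
        line = line.rstrip("\n")
        if FUNC_RE.match(line):
            loc = None  # do not carry a location across function boundaries
            continue
        m = LOC_RE.match(line)
        if m:
            loc = (m.group(1), int(m.group(2)))
            continue
        m = INSN_RE.match(line)
        if m and loc is not None:
            pc_to_loc[int(m.group(1), 16)] = loc
    return pc_to_loc


def counts_per_line(pc_counts, pc_to_loc):
    """Sum per-PC execution counts into per-source-line totals."""
    per_line = Counter()
    for pc, n in pc_counts.items():
        if pc in pc_to_loc:  # PCs without debug info are skipped
            per_line[pc_to_loc[pc]] += n
    return per_line


LISTING = """\
00000000000101a8 <dmul>:
dmul():
/tmp/simple.c:1
double dmul(double a, double b) {
   101a8:       1101                    addi    sp,sp,-32
/tmp/simple.c:2
  return a * b;
   101be:       12f777d3                fmul.d  fa5,fa4,fa5
"""

pc_counts = {0x101A8: 1, 0x101BE: 1}  # invented counts, for illustration only
totals = counts_per_line(pc_counts, map_pcs_to_lines(LISTING.splitlines()))
for (path, lineno), n in totals.most_common():
    print(f"{path}:{lineno} {n}")
```

Because of the compiler reordering mentioned above, the per-line totals are best read as an approximation, especially for optimized builds.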

Thanks @rxia Richard, not a problem at all. I just want to get a rough sense of which particular instructions are executed most of the time (hence slowing down the program) in a particular C program. If this is the best way to go, I will give it a try.

Much appreciate your help!!

Thanks,
Dong

Hi @rxia Richard,

May I ask how you compiled the object file (which command did you use)? I used the -g -c options with RISC-V GCC, but it turns out that the addresses in the objdump output don’t match those in the PC histogram.

Thanks,
Dong

I compiled and linked it with

$ riscv64-unknown-elf-gcc simple.c -g -o simple.out

When you passed in -c, that told GCC not to link the object file, so your ELF file has the unrelocated addresses. If you omit the -c option, GCC will perform the final step of linking and assigning final addresses to the program, producing an executable ELF.

My colleague Palmer has a great writeup on the subject of linker relocations here if you’d like to learn more: https://www.sifive.com/blog/2017/08/21/all-aboard-part-2-relocations/

Thanks @rxia, but above you have a test.o file. Which command did you use?

Also, the PC histogram shows some counts at the address 8000000. What are those? I presume they are program memory?

Thanks,
Dong

Sorry, I changed around some of the file names to match the names that you picked, so I never actually created test.o. What I disassembled was the linked executable I had created. Let me start over from the beginning.

Starting with a simple C file:

// simple.c
double dmul(double a, double b) {
  return a * b;
}

int main(void) {
  return dmul(1.0, 2.0);
}

I create the relocated, executable ELF file (simple.out) and use that to disassemble:

$ riscv64-unknown-elf-gcc simple.c -g -o simple.out

$ riscv64-unknown-elf-objdump -d --line-numbers -S simple.out
...
# Other functions omitted
...
00000000000101a8 <dmul>:
dmul():
/tmp/simple.c:1
double dmul(double a, double b) {
   101a8:       1101                    addi    sp,sp,-32
   101aa:       ec22                    sd      s0,24(sp)
   101ac:       1000                    addi    s0,sp,32
   101ae:       fea43427                fsd     fa0,-24(s0)
   101b2:       feb43027                fsd     fa1,-32(s0)
/tmp/simple.c:2
  return a * b;
   101b6:       fe843707                fld     fa4,-24(s0)
   101ba:       fe043787                fld     fa5,-32(s0)
   101be:       12f777d3                fmul.d  fa5,fa4,fa5
/tmp/simple.c:3
}
   101c2:       22f78553                fmv.d   fa0,fa5
   101c6:       6462                    ld      s0,24(sp)
   101c8:       6105                    addi    sp,sp,32
   101ca:       8082                    ret
...

$ spike -g pk simple.out 2>&1 | head
PC Histogram size:3032
1000 1
1004 1
1008 1
100c 1
1010 1
100b0 1
100b4 1
100b8 1
100bc 1

In my case, the instructions starting at 8000_0000 are from the proxy kernel (pk).
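To work with the histogram programmatically, something like the following Python sketch could read the spike -g output and separate out the addresses at or above 8000_0000. It assumes each record is "<hex address> <decimal count>" as in the output above; the 0x80000000 threshold is taken from the setup described here and may differ elsewhere, and the function names are hypothetical.

```python
PK_BASE = 0x80000000  # assumption: pk code starts here, as in the setup above


def parse_histogram(lines):
    """Parse 'addr count' records; the header and other lines are skipped."""
    hist = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # e.g. "PC Histogram size:3032" splits into three tokens
        try:
            hist[int(parts[0], 16)] = int(parts[1])
        except ValueError:
            continue  # two-token lines that are not histogram records
    return hist


def split_at(hist, base=PK_BASE):
    """Split a histogram into (below base, at or above base)."""
    below = {pc: n for pc, n in hist.items() if pc < base}
    above = {pc: n for pc, n in hist.items() if pc >= base}
    return below, above


# Sample records in the same shape as the output above (counts are invented).
sample = ["PC Histogram size:3", "100b0 1", "101be 4", "80000000 9"]
below, above = split_at(parse_histogram(sample))
for pc, n in sorted(below.items(), key=lambda kv: -kv[1]):
    print(f"{pc:x} {n}")
```

Note that any program output consisting of exactly two tokens that happen to parse as numbers would be mistaken for a record, so it is safer to capture the histogram stream separately from the program's own output.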

This Perl (ugh.) snippet might help. It reads the Spike log file from standard input.

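use strict;
use warnings;

my $pk = "/mnt/raid/scratch/workdir/riscv64-unknown-elf/bin/pk"; # path to exe
while (my $line = <STDIN>) {                 # one Spike log line per iteration
    my @x = split " ", $line;                # "core 0: 0x<pc> (0x<insn>) ..." puts the PC in $x[2]
    unless (defined $x[2] && $x[2] =~ /^0x[0-9a-f]+$/i) { print $line; next; } # no PC: pass through
    my $y = `riscv64-unknown-elf-addr2line -f -i -p -e $pk $x[2]`;
    $y =~ s/\n//g;                           # join addr2line's output onto one line
    print "[$y] $line";
}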

The -e argument to addr2line is the path to pk. It could be any binary.

The output won’t give you line numbers, but it at least gives a trace through function calls.

[fdt_scan_helper at fdt.c:?] core   0: 0x0000000080003f9e (0x30f030ef) jal     pc + 0x3b0e
[strcmp at ??:?] core   0: 0x0000000080007aac (0x00000505) addi    a0, a0, 1
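Running addr2line once per log line can be slow on long traces. As a hedged alternative sketch in Python, the unique PCs can be resolved in a single addr2line call by writing the addresses to its standard input. This leaves out -i so that each address is assumed to yield exactly one output line; the helper names are hypothetical and the tool name is the one used in the snippet above.

```python
import subprocess


def addr2line_command(exe, tool="riscv64-unknown-elf-addr2line"):
    """Build the command; addresses are supplied on stdin instead of argv."""
    return [tool, "-f", "-p", "-e", exe]


def pair_results(pcs, output_text):
    """Pair each PC with its output line (assumes one line per address)."""
    return dict(zip(pcs, output_text.splitlines()))


def resolve(exe, pcs):
    """Resolve a collection of integer PCs to 'function at file:line' strings."""
    pcs = sorted(set(pcs))
    addresses = "".join(f"0x{pc:x}\n" for pc in pcs)
    result = subprocess.run(
        addr2line_command(exe),
        input=addresses,
        capture_output=True,
        text=True,
        check=True,
    )
    return pair_results(pcs, result.stdout)


# Usage (requires the RISC-V toolchain on PATH and a real ELF file):
#   names = resolve("simple.out", [0x101A8, 0x101BE])
#   print(names[0x101BE])
```

The resulting dictionary can then be used to annotate each log line by looking up its PC, instead of spawning a new process per line.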

@edc

Would you please explain this script snippet? Is it a complete script that can be used as is? I tried it, but it didn’t produce any output; it just seems to be waiting for input from the keyboard.

What does this line mean? "@x = split " ", $_;"

What is <STDIN>? Should it be replaced with the log file name, or written as it is?
How should I give the log file as input to this script?

Thanks,
Medo

This gives static information about instruction frequency. Doesn’t this just tell us how the assembly code is stored in memory?
How can we find the frequency of instructions executed at runtime? I tried spike -l (log of execution), but I’m not sure if that is the correct way.