Ok so I spent pretty much the whole day checking how the “hello” example works. I’m probably going to write a detailed blog post about this so that others might find it useful, but here is the overview:
- The RISC-V GNU tools (gcc, ld, libc, etc.) are compiled to produce code for the RISC-V architecture.
- libwrap implements minimal memory handling (malloc/free) and the syscalls required by many C functions (write, fstat, read, etc.) that interface with file descriptors or the underlying kernel. Since there is no OS, most of these syscalls are not actually implemented, but the stubs are enough to build simple C programs.
- The libwrap.mk Makefile passes --wrap to the linker for all of these syscalls, so that the libwrap versions get called instead of any others that might be defined in libc.
- flash.lds maps the sections of the binary ELF image to fit within the SPI EPROM memory map expected by the FE310-G000 processor on the HiFive1 board. It places all sections, including the stack and heap boundaries, .rodata, etc.
- start.S is where execution begins. Here .data is copied from its load address in flash to its run-time (virtual) address at 0x80000000, and .bss is cleared. Other boilerplate operations, including setting up the stack, are executed as well before calling main().
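
To make the --wrap point concrete, here is a minimal sketch of what such stubs can look like. This is not the actual libwrap source: the stub bodies, the `uart_putc` helper and its `uart_log` buffer are hypothetical stand-ins so the sketch can be tried on a host machine. The only real mechanism it relies on is that GNU ld's `--wrap=write` makes references to `write` resolve to `__wrap_write`.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for a UART transmit routine. On real hardware this
 * would write to a device register; here it appends to a buffer so the
 * sketch can be exercised without a board. */
static char uart_log[256];
static size_t uart_len;

static void uart_putc(char c)
{
    if (uart_len < sizeof uart_log - 1) {
        uart_log[uart_len++] = c;
        uart_log[uart_len] = '\0';
    }
}

/* When linking with -Wl,--wrap=write, calls to write() end up here. */
ssize_t __wrap_write(int fd, const void *buf, size_t count)
{
    const char *p = buf;

    if (fd != 1 && fd != 2) {   /* only stdout/stderr are "connected" */
        errno = EBADF;
        return -1;
    }
    for (size_t i = 0; i < count; i++)
        uart_putc(p[i]);
    return (ssize_t)count;
}

/* A syscall that has no meaning without an OS simply reports failure. */
int __wrap_open(const char *path, int flags, ...)
{
    (void)path;
    (void)flags;
    errno = ENOSYS;
    return -1;
}
```

With stubs like these, printf() can still work because it eventually calls write() on fd 1, while anything that needs a real file system fails cleanly.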
There is of course a lot more detail. At any rate, one thing I couldn’t fathom is the location of .rodata.
From the disassembly output, the location is:
2040be20 <blanks.4376-0x1bc>:
2040be20: 6568 flw fa0,76(a0)
2040be22: 6c6c flw fa1,92(s0)
2040be24: 6f77206f j 2047ed1a <__fini_array_end+0x72166>
2040be28: 6c72 flw fs8,28(sp)
2040be2a: 0a64 addi s1,sp,284
The above is “hello world\n” (the disassembler is just decoding the string bytes as if they were instructions). Also, if you check out the disassembly of main, you see the following:
2040006c <main>:
2040006c: 1141 addi sp,sp,-16
2040006e: 0000c517 auipc a0,0xc
20400072: db250513 addi a0,a0,-590 # 2040be20 <__clzsi2+0x44>
20400076: c606 sw ra,12(sp)
20400078: 00000097 auipc ra,0x0
2040007c: 54c080e7 jalr 1356(ra) # 204005c4 <printf>
20400080: 40b2 lw ra,12(sp)
20400082: 4501 li a0,0
20400084: 0141 addi sp,sp,16
20400086: 8082 ret
20400088: 0000 unimp
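
The addresses in the disassembler's comments can be reproduced by hand. A minimal sketch of the arithmetic for RV32 follows; the helper names `auipc_addi_target` and `auipc_jalr_target` are made up for illustration.

```c
#include <stdint.h>

/* auipc rd, imm20    : rd = pc + (imm20 << 12), pc being the address of
 *                      the auipc instruction itself.
 * addi  rd, rs, imm12: rd = rs + sign-extended 12-bit immediate. */
static uint32_t auipc_addi_target(uint32_t pc, uint32_t imm20, int32_t imm12)
{
    uint32_t upper = pc + (imm20 << 12);
    return upper + (uint32_t)imm12;   /* wraps modulo 2^32, like the CPU */
}

/* jalr rd, imm12(rs): jump to (rs + imm12) with the lowest bit cleared. */
static uint32_t auipc_jalr_target(uint32_t pc, uint32_t imm20, int32_t imm12)
{
    return auipc_addi_target(pc, imm20, imm12) & ~(uint32_t)1;
}
```

With the values from the listing, `auipc_addi_target(0x2040006e, 0xc, -590)` gives 0x2040be20 and `auipc_jalr_target(0x20400078, 0x0, 1356)` gives 0x204005c4.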
So it is loading a0 with 0x2040be20, which matches the .rodata address above. However, when running the program on the board, I get a different result:
main:
2040006c: addi sp,sp,-16
5 printf("hello world\n");
2040006e: auipc a0,0xc
20400072: addi a0,a0,1150 # 0x2040c4ec
4 {
20400076: sw ra,12(sp)
5 printf("hello world\n");
20400078: auipc ra,0x0
2040007c: jalr 2046(ra) # 0x20400876 <printf>
7 }
20400080: lw ra,12(sp)
20400082: li a0,0
20400084: addi sp,sp,16
20400086: ret
Where does 0x2040c4ec come from? It looks like .rodata was moved to a higher address (by 0x6cc bytes), but why?
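
For what it's worth, the immediates in both listings are consistent with their targets. The sketch below splits a pc-relative offset into the auipc and addi/jalr immediates; the `pcrel_split` name and struct are made up for illustration. This only checks the arithmetic and does not explain the cause.

```c
#include <stdint.h>

struct pcrel {
    uint32_t hi20;   /* immediate for auipc */
    int32_t  lo12;   /* signed immediate for addi / jalr */
};

/* Split (target - pc) so that (hi20 << 12) + lo12 == target - pc, with
 * lo12 in the signed 12-bit range. Adding 0x800 before shifting rounds
 * hi20 up whenever the low 12 bits would otherwise read as negative. */
static struct pcrel pcrel_split(uint32_t pc, uint32_t target)
{
    uint32_t offset = target - pc;
    struct pcrel r;

    r.hi20 = ((offset + 0x800u) >> 12) & 0xfffffu;
    r.lo12 = (int32_t)(offset - (r.hi20 << 12));
    return r;
}
```

From pc 0x2040006e, the target 0x2040be20 splits into 0xc and -590, while 0x2040c4ec splits into 0xc and 1150, which is exactly what the two listings show. The printf target also moved (0x204005c4 versus 0x20400876), so it is not only .rodata that differs. My guess is that the image on the board was linked differently from the one I disassembled, but I have not confirmed that.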