Translating an existing C++ Project

Hi,
We have an existing C++ project which we developed for ARM. Now we are trying to port this project to RISC-V, in order to make a comparison between the two architectures.

I’m able to build the project using the riscv-gnu-toolchain, but the generated output is about 20 times bigger than the ARM output. I found out that a lot of library code is linked in. While researching a solution I read several times that the libraries in the riscv-gnu-toolchain are not yet ready for bare-metal applications. Is this true? Is this the reason for the huge code overhead?

Thanks for your help!

For embedded you’ll want to use make, not make linux, when you build the toolchain, so that newlib is used instead of glibc. This links only the necessary functions into the image (though printf is still quite big).

Or, if you’re using something pre-built, make sure you’re using the riscv32-unknown-elf-gcc etc variants, not riscv32-unknown-linux-gnu-gcc.
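As a sketch, the build-and-compile flow described above looks roughly like this. The install prefix and the arch/ABI flags are illustrative assumptions, not values from this thread; adjust them to your target.

```shell
# Build the newlib-based (bare-metal) toolchain: 'make', not 'make linux'.
# Prefix, arch and ABI below are examples only.
git clone https://github.com/riscv-collab/riscv-gnu-toolchain
cd riscv-gnu-toolchain
./configure --prefix=/opt/riscv --with-arch=rv32imac --with-abi=ilp32
make                 # builds the newlib/ELF toolchain ('make linux' would build glibc)

# Then compile with the ELF variant, not the linux-gnu one:
/opt/riscv/bin/riscv32-unknown-elf-gcc -Os -o app.elf main.c
```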

If your ARM project used newlib nano, for the comparison to be meaningful, you should also use newlib nano for the RISC-V project. It is available from the GNU MCU Eclipse RISC-V Embedded GCC toolchain.

If you are using printf or FP, and you have a soft-float target, then the RISC-V libraries are bigger than the ARM libraries for two reasons.

  1. Embedded ARM has 64-bit long doubles; RISC-V has 128-bit long doubles. That means ARM programs have only 32/64-bit soft-FP libraries linked in, whereas RISC-V programs have 32/64/128-bit soft-FP libraries linked in. We are planning to fix this with a new ABI for embedded use.
  2. ARM has hand-written assembly soft-FP libraries that provide a subset of IEEE FP support. RISC-V has C code for its soft-FP libraries that attempts to provide full IEEE FP support. That means the RISC-V FP support is bigger, but it also provides more features. It isn’t clear what to do about this yet.

The comment Liviu made about newlib nano is also a good one. It isn’t a fair comparison if you are comparing full newlib support against newlib nano support. The riscv-gnu-toolchain uses full newlib support by default, but you can get newlib nano support by using the gcc option “--specs=nano.specs”.

There may also be other issues. It is helpful if you can provide more specific details about the problems you are seeing.

Thanks for your help!

Yes, I used the make command (not make linux) to build the toolchain, and I also have newlib nano support enabled. Even with the embedded toolchain and newlib nano, the output is about 10 times bigger than the ARM output.

One concrete problem, for example: if I use the supc++ library (with newlib), it links in a lot of code for malloc and free, which also produces system calls (ecall instructions) when trying to allocate heap memory. This happens during initialization of the static objects. To resolve the problem I excluded the library and wrote the new and delete operators manually.

I assume there are more issues like this, which cause the output code to grow.

I understand that the result will be bigger than for ARM because of the points you mentioned. But a factor of ten or more seems hard to explain.

Without giving actual numbers, it’s a meaningless comparison.

If one library is bigger than the other then the difference will look extreme if your program is “hello world”. It will look much less extreme for real-sized programs.

For our freedom-e-sdk example “hello” program using puts() I get 51118 bytes for the code. If I change it to use printf() then it’s 51066 bytes. If I change it to use write() then it’s 51070. If I take out printing “hello world” entirely and just return 0 then the size is 51030.

If I look at the object file before linking (using riscv32-unknown-elf-size), there are 42 bytes (28 for main(), 13 for the string). In the linked program main() is reduced to 22 bytes because of linker relaxations.

So clearly we’ve got some work to do on getting the library to strip better. At the moment it’s giving basically 50 KB of overhead for any program.

How high a priority should it be to put effort into that instead of into other things? What is the use-case where it actually matters?

Maybe the ARM toolchain produces a 5 KB binary for hello world. It’s not because their code is ten times smaller; it’s because they pulled in ten times less of the library. As you look at a real program that uses more of the library, the absolute difference will get smaller. And as you get a bigger program with 50 KB, 100 KB, whatever of your own code, the relative difference will be much less too.

Meanwhile, the HiFive1 has 16 MB of space for the user’s program. 50 KB for library code is utterly in the noise.

How much space will the device you deploy on have?

If you can give us some concrete examples, we can take a look at them. Meanwhile, we are aware that there are some code size issues, and will be doing some work to address that, but this may take some time. We haven’t seen any cases where the RISC-V code is 10 times bigger than the ARM code, but we probably haven’t looked at much embedded C++ code as yet.

There will also be other toolchains eventually. IAR for instance is working on a RISC-V compiler, and they expect to release it next year. I would expect the IAR toolchain to be much better than the GNU toolchain for embedded code size.

Hi buddy,
could you kindly share your RISC-V makefile as a reference for me? I also want to port a C++ project to the HiFive board, but I don’t know how to update the makefile.
Thanks