MCU crashes randomly

Hi everyone. I’m using Freedom Studio (on Windows) to program an FE310-G002 on a LoFive R1 using a J-Link JTAG debugger.

I’m running an algorithm on the MCU and it crashes randomly. When all the variables are float, it completes one loop of the algorithm and then crashes. When I change all the variables to double, it crashes right away in the first loop and stops at early_trap_vector in trap.S.

I also compiled the program with Segger Embedded Studio, where it takes only 6 KB of the MCU’s 16 KB of RAM and a few KB of SPI flash. However, I can’t find a way to see this information in the Freedom Studio build results.

The algorithm has been tested on an ARM-core STM32 and works there without any problems. So I think the code itself is fine and something is wrong on the RISC-V MCU side.

Has anyone here ever encountered this problem? Please let me know what you think.

Thank you!

Hi,

Any chance you can provide a little more information on this “algorithm”? Also, it sounds like you are compiling your code with one IDE (Segger) and then programming/debugging with a different one (FreedomStudio) - is this accurate?

It might be worthwhile to provide your build options and show the contents of the mcause, mepc, and mtval CSRs; together these should help pinpoint the source of your problem. Have you tried single-stepping your program to understand which instruction is causing the exception?
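For reading the mcause value once you have it, here is a minimal sketch in C that maps the standard synchronous exception codes from the RISC-V privileged spec to names. The function name `mcause_name` is made up for illustration, it assumes a 32-bit mcause (as on the FE310), and it is ordinary C you can run on a PC rather than anything from Freedom Metal.

```c
#include <stdint.h>

/* Hypothetical helper: translate an RV32 mcause value into a readable name.
   Codes follow the standard exception table in the RISC-V privileged spec. */
static const char *mcause_name(uint32_t mcause)
{
    if (mcause & 0x80000000u)   /* top bit set means interrupt, not exception */
        return "interrupt";

    switch (mcause) {
    case 0:  return "instruction address misaligned";
    case 1:  return "instruction access fault";
    case 2:  return "illegal instruction";
    case 3:  return "breakpoint";
    case 4:  return "load address misaligned";
    case 5:  return "load access fault";
    case 6:  return "store/AMO address misaligned";
    case 7:  return "store/AMO access fault";
    case 8:  return "environment call from U-mode";
    case 11: return "environment call from M-mode";
    default: return "reserved or not decoded here";
    }
}
```

Pass in the mcause value shown in the debugger's register view. mepc then gives the address of the instruction that trapped, which you can look up in the disassembly.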

Thanks David. I’m running an extended Kalman filter. No, I compiled the code separately with both the Segger IDE and FreedomStudio.

In FreedomStudio, I tried single-stepping as you said, but the MCU crashes in different functions when I change all the variables in my program from float to double.

My build toolchain is SiFive RISC-V GNU GCC Newlib; I think it’s the RISCV64 GNU ELF toolchain. Every function worked fine when I tested them separately. I set the MCU clock to 256 MHz and it reads data from the MPU9250 IMU correctly. Calculations with float variables are precise and everything seems fine.

These are the register values before crashing:
[Screenshot: FreedomStudio, C-EKF-FE310 src/main.c, 2021-07-23 10:06:45]

And these are after crashing:
[Screenshot: FreedomStudio, C-EKF-FE310 src/main.c, 2021-07-23 10:07:27]

Update: it works well in Segger Embedded Studio. I can see the algorithm’s output data continuously through the global live watch, and the MCU doesn’t crash. But I can’t find any library for working with the MCU peripherals such as UART and I2C. Does anyone know of a library similar to Freedom Metal but for the Segger IDE?

It’s often difficult to debug a random crash but you may want to consider not touching the PLL as a first step, and using a simple example and building from there. There is a clue here that different instructions at different times cause the CPU to crash. Are you using printf() anywhere in your code?


Thanks dconn. I did not encounter this problem when using the Segger IDE with the correct package. When I dug into their library, I found this function in their core initialization:

I think this is the reason why it works in the Segger IDE. So setting the mstatus register could help solve this problem. But I can’t find a way to set it up in Freedom Studio.

If you disable ‘Start target execution’ in the Freedom Studio Debug configuration (Startup tab) you can single step all the entry code, and you should see in freedom-metal/gloss/crt0.S a similar init routine:

...
  /* Check RISC-V isa and enable FS bits if Floating Point architecture. */
  li   a4, 0x10028
  and  a5, a5, a4
  beqz a5, 1f
  csrr a5, mstatus
  lui  a4, 0x2
  or   a5, a5, a4
  csrw mstatus, a5
  csrwi fcsr, 0
1:
...

Keep in mind that freedom-metal/src/entry.S holds the entry point, and _start label within crt0.S is called from here.
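To make the logic of that routine easier to follow, here is a rough C model of it that runs on a PC. It assumes a5 holds the misa CSR before the `and` (the load is in the elided part above, so that is a guess based on the comment), and the function and macro names are made up for illustration. The mask 0x10028 selects the D (bit 3), F (bit 5), and Q (bit 16) extension bits, and `lui a4, 0x2` produces 0x2000, which sets mstatus.FS to Initial.

```c
#include <stdint.h>

#define MISA_FP_MASK        0x10028u  /* misa bits D (3), F (5), Q (16) */
#define MSTATUS_FS_INITIAL  0x2000u   /* value of `lui a4, 0x2`: FS = 01 */

/* Hypothetical model of the crt0.S snippet: returns the mstatus value that
   would be in place after the routine, given the misa and mstatus inputs. */
static uint32_t mstatus_after_fp_init(uint32_t misa, uint32_t mstatus)
{
    if ((misa & MISA_FP_MASK) == 0)
        return mstatus;               /* beqz taken: no FP unit, nothing written */

    return mstatus | MSTATUS_FS_INITIAL;  /* csrw mstatus with FS enabled */
}
```

With an illustrative RV32IMAC misa such as 0x40001105 (no F, D, or Q bit), the function returns mstatus unchanged, which matches the branch to label 1 skipping the FS setup. The real routine also clears fcsr when it does enable FS; that is left out of this model.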

Thank you. I tried to run it as you suggested, but it didn’t stop there.
Besides, that routine is placed under “_skip_init”. At the beginning of the file, I saw the following code and I don’t really understand what it means:

  /* If we're not hart 0, skip the initialization work */
  la t0, __metal_boot_hart
  bne a0, t0, _skip_init

Please give me some insight on this if possible.

Make sure you build the firmware with soft floating point support: HiFive1 doesn’t have floating point support in hardware.
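One way to check what the compiler is actually targeting is to look at the predefined macros from the RISC-V C API (`__riscv_flen`, `__riscv_float_abi_soft`, and so on). Below is a minimal sketch; the two function names are made up for illustration. For an RV32IMAC part like the FE310, I would expect a build with something like `-march=rv32imac -mabi=ilp32` to report soft-float, but where those flags are set in your Freedom Studio project is not something I can confirm here.

```c
/* Hypothetical helper: reports the floating-point ABI selected at compile time. */
static const char *float_abi(void)
{
#if !defined(__riscv)
    return "not a RISC-V build";
#elif defined(__riscv_float_abi_soft)
    return "soft-float";
#elif defined(__riscv_float_abi_single)
    return "hard-float (single)";
#elif defined(__riscv_float_abi_double)
    return "hard-float (double)";
#else
    return "unknown";
#endif
}

/* Hypothetical helper: returns 1 if the target ISA includes hardware FP
   registers (F or D in -march), so the compiler may emit FP instructions. */
static int may_emit_fp_instructions(void)
{
#if defined(__riscv_flen)
    return 1;
#else
    return 0;
#endif
}
```

If `may_emit_fp_instructions()` returns 1 in a build meant for this chip, the binary can contain floating-point instructions that the core does not implement, and those would trap as illegal instructions.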
