RISCV-Probe / BBL / Running baremetal code

I’m trying to get some baremetal code running on the SiFive Unleashed board, just enough to provide two-way communication over the virtual COM serial port (SiFive UART) using printf/getchar function calls.

So far I’ve found the RISCV-PROBE project: https://github.com/michaeljclark/riscv-probe

It is BBL + libfemto (a very minimal libc), which provides exactly this. Using QEMU, I can get my own baremetal code running, and I can use the provided printf/getchar functions to pass data to and from my code over the UART.
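For context, the polled UART access that printf/getchar end up relying on looks roughly like the sketch below. This is not libfemto's actual code; the function and constant names are hypothetical. The register offsets and flag bits follow the SiFive UART layout as I understand it from the SiFive manuals (txdata at 0x00, rxdata at 0x04, txctrl at 0x08, rxctrl at 0x0C, div at 0x18), and the base address and input clock are left as parameters because they depend on the target.

```c
#include <stdint.h>

/* Word indices into the SiFive UART register block (byte offset / 4).
 * Assumed from the SiFive UART register map; verify against your chip manual. */
enum {
    UART_TXDATA = 0, /* 0x00: write = byte to send, read bit 31 = TX FIFO full */
    UART_RXDATA = 1, /* 0x04: bits 7:0 = received byte, bit 31 = RX FIFO empty */
    UART_TXCTRL = 2, /* 0x08: bit 0 = transmit enable */
    UART_RXCTRL = 3, /* 0x0C: bit 0 = receive enable */
    UART_DIV    = 6  /* 0x18: baud rate = input clock / (div + 1) */
};

#define UART_TXFULL  0x80000000u
#define UART_RXEMPTY 0x80000000u

/* Program the baud divisor and enable the transmitter and receiver. */
static void uart_init(volatile uint32_t *uart, uint32_t clock_hz, uint32_t baud)
{
    uart[UART_DIV]    = clock_hz / baud - 1;
    uart[UART_TXCTRL] = 1;
    uart[UART_RXCTRL] = 1;
}

/* Busy-wait until the TX FIFO has room, then queue one byte. */
static void uart_putchar(volatile uint32_t *uart, int ch)
{
    while (uart[UART_TXDATA] & UART_TXFULL) {
        /* spin */
    }
    uart[UART_TXDATA] = (uint32_t)(ch & 0xff);
}

/* Non-blocking read: returns the received byte, or -1 if the RX FIFO is empty. */
static int uart_getchar(volatile uint32_t *uart)
{
    uint32_t v = uart[UART_RXDATA];
    return (v & UART_RXEMPTY) ? -1 : (int)(v & 0xff);
}
```

On a real target the `uart` pointer would be the UART's memory-mapped base address cast to `volatile uint32_t *`, and `clock_hz` would be whatever clock actually feeds the UART after reset. Both are things that can differ between the QEMU machine and the physical board, so they are worth checking against the board's manual.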

As for running on actual hardware, the riscv-probe project only supports the SiFive E2 CoreIP on the Arty A7 FPGA. It seems the SiFive U series was never completed for hardware, only for QEMU simulation.

I have been comparing the hardware E2 core and QEMU E2 core folders to see whether I can adapt the QEMU U5 core to work on the actual Unleashed board hardware, but I just can’t seem to get it working. Looking at the memory layout in the E2 hardware default.lds, I see:

flash (rxai!w) : ORIGIN = 0x40400000, LENGTH = 128M
ram (wxa!ri) : ORIGIN = 0x80000000, LENGTH = 64K

I have tried changing it to:
ram (wxa!ri) : ORIGIN = 0x80000000, LENGTH = 128M
flash (rxai!w) : ORIGIN = 0x10000, LENGTH = 64M

but no luck. I’ve also experimented with OpenSBI, but haven’t been able to get anything other than printf statements working with my compiled payload. Does anyone have any idea how I could get my code, which works on the QEMU simulation of the U5, to work on actual hardware? The E2 target already has QEMU-to-hardware parity, so I must be missing something very small.