MT - compiling code for a multithreading architecture

I have created a multi-threaded processor using FGMT (the fine-grained multi-threading technique), where a different thread is executed at each clock cycle. To test it and demonstrate its performance improvement, I would like to compile some C code suitable for it (the compiled code would be aware of the multithreading feature).

Is that possible with Freedom Studio?

So my question is, can I compile the C code for execution on a 2-thread architecture?

My understanding of FGMT is that each thread has its own PC and own set of registers. That means the compiled code is exactly the same as for a non-threaded processor.

You are absolutely correct, in FGMT, each thread has its own PC and set of registers.
However, if the compiler were aware of the number of threads, it could detect parts of the code that are independent and “split” the code according to the number of threads so that they execute in parallel.

int main(void)
{
    int i = 0, j = 0, y = 0, z = 1;

    for( i = 0; i < 20; i = i + 1 ){
        z = z + i;
    }

    for( j = 0; j < 20; j = j + 1 ){
        y = y + 2;
    }
    return 0;
}


Since the two ‘for’ loops are completely independent, the compiler would ensure that thread 1 executes the code for the first loop while thread 2 executes the code for the second loop. That means two jump instructions would send threads 1 and 2 to the right instruction addresses (PC_thread1 and PC_thread2 respectively). In addition, a different sp (stack pointer) should be set for each thread, and the handling of gp (global pointer) should be aware of the multithreading feature so that no erroneous changes occur.
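Without compiler support, a split like this is usually written by hand. A minimal sketch, assuming every hardware thread enters the same function and is told its own thread number; `thread_entry`, `z_result` and `y_result` are hypothetical names, and setting up a separate sp per thread is assumed to happen in startup code before this function runs:

```c
/* Results are volatile so the compiler cannot discard the loops as unused. */
volatile int z_result;
volatile int y_result;

/*
 * Hypothetical entry point, run once by every hardware thread.
 * tid is assumed to be supplied by startup code (0 for the first
 * thread, 1 for the second); how it is obtained is hardware specific.
 */
void thread_entry(unsigned tid)
{
    if (tid == 0) {
        /* Thread 1 of the example: the first loop. */
        int z = 1;
        for (int i = 0; i < 20; i = i + 1) {
            z = z + i;
        }
        z_result = z;
    } else if (tid == 1) {
        /* Thread 2 of the example: the second loop. */
        int y = 0;
        for (int j = 0; j < 20; j = j + 1) {
            y = y + 2;
        }
        y_result = y;
    }
    /* Any other thread number does nothing. */
}
```

Each loop keeps its working variable in a local, so a thread only touches its own registers and stack, and the two threads write to different globals.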

So I am asking whether Freedom Studio has the ability to optimise the compiled code based on a given number of threads.

It is even better!

With -Os or -O2 the gcc in Freedom Studio optimises your code to:

0000000000000000 <main>:
0: 4501 li a0,0
2: 8082 ret

Perhaps you need a better example.
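The loops vanish because their results are never used and every value is known at compile time. A sketch of two independent tasks that cannot be folded into a constant when compiled on their own, since the input comes from the caller and the result is returned; `sum_values` and `max_value` are hypothetical names chosen for illustration:

```c
#include <stddef.h>

/* First independent task: sum of the n values in data. */
int sum_values(const int *data, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i = i + 1) {
        sum = sum + data[i];
    }
    return sum;
}

/* Second independent task: largest of the n values in data (n must be > 0). */
int max_value(const int *data, size_t n)
{
    int max = data[0];
    for (size_t j = 1; j < n; j = j + 1) {
        if (data[j] > max) {
            max = data[j];
        }
    }
    return max;
}
```

Neither function writes to memory the other one reads, so they could run on different threads at the same time.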

There is some support in gcc for auto-parallelization ("-ftree-parallelize-loops=n"). I don’t know how well it works, or whether it is enabled in the RISC-V Linux version. It isn’t enabled in the embedded compiler I use.
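For reference, the kind of loop this option targets is one whose iterations do not depend on each other. A minimal sketch, with `add_arrays` as a hypothetical name; whether a particular gcc build actually parallelises it is not guaranteed. According to the gcc documentation the option implies -pthread, so it needs a threading runtime that a bare-metal toolchain typically does not provide:

```c
#include <stddef.h>

/*
 * Element-wise sum of two arrays. The restrict qualifiers promise the
 * compiler that dst, a and b do not overlap, so every iteration is
 * independent of the others.
 */
void add_arrays(int *restrict dst, const int *restrict a,
                const int *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i = i + 1) {
        dst[i] = a[i] + b[i];
    }
}
```

On a hosted toolchain it could be tried with something like `gcc -O2 -ftree-parallelize-loops=2`.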

Yes, I am familiar with the -Os (optimise size), -O1 (optimise), -O2 (optimise more) and -O3 (optimise most) options of gcc.

Unfortunately, these optimisations won’t help with the compiled code I want to create which I have explained above.

Thanks for your help. I hope that anyone who knows about a thread-specific feature in Freedom Studio will get in touch.