GCC assembling problem

Hi all,

What I have read in the RISC-V standard is that C.MV expands into add rd, x0, rs2, but when I use the GCC assembler it expands into addi.

Is that an error, or am I doing something wrong?

At runtime, the hardware expands c.mv into an add instruction. But that is irrelevant to the assembler, a c.mv is a c.mv. Why do you think that you have an addi?

rohan:2094$ cat tmp.s
	.option rvc
	c.mv a4,a5
rohan:2095$ riscv32-unknown-elf-as -o tmp.o tmp.s
rohan:2096$ riscv32-unknown-elf-objdump -dr tmp.o

tmp.o:     file format elf32-littleriscv


Disassembly of section .text:

00000000 <.text>:
   0:	873e                	mv	a4,a5
	...
rohan:2097$

What I am doing is translating the pseudo-instruction once into a compressed instruction and checking the result, then translating the same pseudo-instruction into an RV32I instruction and checking that result too. Why am I doing that? Because I am building a decompression unit in Verilog that converts a compressed instruction into the ordinary 32-bit instruction.

So I am assembling the same instruction once into 16 bits and once into the equivalent 32 bits, as a kind of verification of my unit and of my understanding of it.

Thanks for your effort

OK, I think I understand what you are doing. You have two assembly files, one has
mv a4,a5
and one has
.option rvc
mv a4,a5

The first assembles to an addi instruction. The second assembles to c.mv, which is implemented in hardware as an add instruction.

Your mistake is assuming that the 4-byte and 2-byte move instructions are the same thing. They aren’t. In 4-byte mode, mv is a macro that expands to an addi. In 2-byte mode, mv is an actual instruction. They are different things, so there should be no expectation that they get handled exactly the same inside the hardware.
The fact that the hardware implements the c.mv instruction by expanding it to add is irrelevant.
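
To make the difference concrete, here is a minimal Python sketch that builds both 32-bit encodings by hand. The function names are hypothetical, and the field layouts (R-type, I-type, and the compressed CR format) are written from my reading of the spec, so treat it as an illustration, not a reference decoder. The halfword 0x873e is the c.mv a4,a5 from the objdump output earlier in the thread.

```python
def encode_add(rd, rs1, rs2):
    """R-type ADD: funct7 = 0, funct3 = 0, opcode = 0x33."""
    return (rs2 << 20) | (rs1 << 15) | (rd << 7) | 0x33


def encode_addi(rd, rs1, imm):
    """I-type ADDI: 12-bit immediate, funct3 = 0, opcode = 0x13."""
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (rd << 7) | 0x13


def expand_c_mv(half):
    """Expand a 16-bit c.mv into the 32-bit 'add rd, x0, rs2' encoding."""
    funct4 = (half >> 12) & 0xF
    rd = (half >> 7) & 0x1F
    rs2 = (half >> 2) & 0x1F
    op = half & 0x3
    # funct4 = 1000 with op = 10 is shared with c.jr (rs2 == 0);
    # rd == 0 is a HINT encoding, which this sketch simply rejects.
    if op != 0b10 or funct4 != 0b1000 or rd == 0 or rs2 == 0:
        raise ValueError("not a c.mv encoding")
    return encode_add(rd, 0, rs2)


A4, A5 = 14, 15  # ABI names a4/a5 are x14/x15

decompressed = expand_c_mv(0x873E)   # what a decompression unit would emit
pseudo_mv = encode_addi(A4, A5, 0)   # what the assembler emits for 'mv a4,a5'

print(f"c.mv a4,a5 -> add  a4,zero,a5 = 0x{decompressed:08x}")
print(f"mv   a4,a5 -> addi a4,a5,0    = 0x{pseudo_mv:08x}")
```

The two words differ (0x00f00733 versus 0x00078713), so a decompression unit should be checked against an explicit add a4,zero,a5 in the 32-bit test file, not against mv a4,a5.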

Of course, one could ask why the hardware and software solutions are different, and that probably has something to do with the fact that it was just easier (least code/fewest gates) to do it that way.

The ISA manual does say in Chapter 20 RISC-V Assembly Programmer’s Handbook that the 4-byte mv macro expands to addi.

Is it even observable whether c.mv a,b is actually implemented internally as add a,zero,b or addi a,b,0 or something else (or, xor, ori, or xori, for example)?

Maybe on some wide superscalar machine with slightly under-provisioned read ports the immediate versions would perform better.

The internal behavior of c.mv is observable if you are reading Verilog, otherwise no. The original poster did mention Verilog.
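
As a rough illustration of why software cannot tell the difference, here is a small Python sketch of the architectural effect of each candidate expansion. The run helper and the list-of-32-integers register model are assumptions made up for this example; it only models the final register state, not timing or port usage.

```python
MASK = 0xFFFFFFFF  # RV32 register width


def run(expansion, regs, rd, rs):
    """Apply one candidate expansion of 'c.mv rd, rs' to a copy of regs."""
    out = list(regs)
    src, zero = regs[rs], 0  # x0 always reads as zero
    if expansion == "add":       # add  rd, x0, rs
        val = zero + src
    elif expansion == "addi":    # addi rd, rs, 0
        val = src + 0
    elif expansion == "or":      # or   rd, x0, rs
        val = zero | src
    elif expansion == "xor":     # xor  rd, x0, rs
        val = zero ^ src
    elif expansion == "ori":     # ori  rd, rs, 0
        val = src | 0
    elif expansion == "xori":    # xori rd, rs, 0
        val = src ^ 0
    else:
        raise ValueError(f"unknown expansion {expansion!r}")
    if rd != 0:                  # writes to x0 are discarded
        out[rd] = val & MASK
    return out


regs = [0] * 32
regs[15] = 0xDEADBEEF  # a5 holds some value
results = [run(e, regs, 14, 15)
           for e in ("add", "addi", "or", "xor", "ori", "xori")]
print(all(r == results[0] for r in results))
```

Every variant leaves the same value in a4, so the choice only shows up in the RTL or possibly in timing, as discussed above.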

Reading x0 doesn’t necessarily require a register port. This special case can be recognized and mapped directly to a zero immediate. One doesn’t even necessarily have to have an actual register for x0. It could just be handled by the decoder, substituting zeros for reads, and disabling the register file write.
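
A minimal Python sketch of that idea, assuming a hypothetical RegFile model with storage only for x1..x31 and a counter standing in for read-port usage:

```python
class RegFile:
    """Toy register file where x0 is handled outside the storage array."""

    def __init__(self):
        self._regs = [0] * 31   # storage for x1..x31 only, no cell for x0
        self.port_reads = 0     # counts reads that actually use a port

    def read(self, idx):
        if idx == 0:
            return 0            # decoder substitutes a zero, no port needed
        self.port_reads += 1
        return self._regs[idx - 1]

    def write(self, idx, value):
        if idx == 0:
            return              # write enable suppressed for x0
        self._regs[idx - 1] = value & 0xFFFFFFFF


rf = RegFile()
rf.write(15, 7)                  # a5 = 7
rf.write(0, 123)                 # silently dropped
total = rf.read(0) + rf.read(15) # add a4, x0, a5 uses a single read port
rf.write(14, total)
print(rf.read(14), rf.port_reads)
```

In this model the add x0 form costs one port read, the same as addi with a zero immediate, which is the point being made about read ports.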

Sorry, but I didn’t get why the hardware wouldn’t handle them the same way. In my case I already have RTL for the 32-bit core, and all I am adding is the decompression unit, which is a simple stage in front of the processor itself. So the unit transforms the 16-bit instruction into the kind of instruction the hardware was actually built for.

Thanks again for your time and effort in answering me.

One is a macro and one is an instruction. They aren’t the same thing. One is handled by the assembler when it does macro expansion. One is handled by the hardware when it uncompresses instructions.

There is a 2-byte move instruction. There is no 4-byte move instruction. The hardware never gets to handle a 4-byte move instruction, because there is no such thing. Asking why the hardware handles it differently is asking the wrong question.

One can ask why the software and hardware don’t agree on how to handle 2 and 4 byte move instructions/macros, but this kind of thing happens often in real world engineering. Sometimes there is an important but obscure reason. Sometimes it is an accident of circumstances.

I wasn’t around when the original decision was made, but I would guess that the 4-byte move macro came long before the compressed instruction support, and at that time they just made a random choice as to whether to use an addi or add x0 for the 4-byte move macro, and picked addi. Later, when the compressed instruction support was added, in the hardware design, they discovered that expanding c.mv to add x0 used fewer gates than expanding it to addi, so they used add x0. This made the 2-byte mv instruction different from the 4-byte move macro, but since the 4-byte move macro decision was made long before, they couldn’t change it without causing problems, so they left it alone.
