16-bit instructions from objdump

Hi all,

I’ve read this post

which shows assembly code that, like what I have seen myself, contains 16-bit instructions in the objdump output.

For example:

4040083a:   0017b793                seqz    a5,a5
4040083e:   8fd9                    or      a5,a5,a4
40400840:   3fc00717                auipc   a4,0x3fc00

I am wondering why this is the case since, according to the spec, all RISC-V instructions are 32 bits wide.

Thanks,
Ron

It depends on whether you enable compressed instructions, which are 16 bits wide.
This is controlled by the -march option of gcc.
If it contains a “c”, for example “rv32imac”, 16-bit instructions will be emitted. A special case is “g”, which stands for “imafd”. Be aware that the link step also needs the -march option. If you omit it, the standard toolchain will default to “rv64gc”; special builds of the toolchain may default to something else.
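
To make the rule above concrete, here is a minimal Python sketch that inspects a -march string and reports whether it enables compressed instructions. The function names are made up for illustration, and the parsing is deliberately simplified: it assumes the string starts with rv32 or rv64, carries no version numbers, and it ignores everything from the first multi-letter extension onward.

```python
def single_letter_extensions(march):
    """Return the single-letter extensions named in an ISA string like 'rv32imac'.

    Simplified sketch: no version numbers, and parsing stops at the first
    multi-letter extension (introduced by '_', 'z', 's' or 'x').
    """
    s = march.lower()
    if not s.startswith(("rv32", "rv64")):
        raise ValueError("unrecognised ISA string: %r" % march)
    letters = set()
    for ch in s[4:]:
        if ch in "_zsx":
            break
        # "g" is shorthand for the general-purpose set "imafd"
        letters.update("imafd" if ch == "g" else ch)
    return letters


def emits_compressed(march):
    """True if the ISA string includes the 'c' (compressed) extension."""
    return "c" in single_letter_extensions(march)


for arch in ("rv32imac", "rv64gc", "rv64i"):
    print(arch, emits_compressed(arch))
```

With rv64i the check comes back false, so a compiler targeting that string should emit only 32-bit instructions.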

Thank you! How should I enable this for the linker as well?

For compilation I can do:

riscv64-unknown-elf-gcc-7.1.1 -Dmarch=rv64i -o store.o store.c

(I am not sure what the ‘D’ is for but it does not work without it.)

However, when I try to do

riscv64-unknown-elf-objdump -Dmarch=rv64i -d -r test.o

I get a

test: file format elf64-littleriscv

riscv64-unknown-elf-objdump: can’t use supplied machine arch=rv64i

error. Do you know why this is happening and if I am trying to execute the right command?

Thanks again,
Ron

The GCC option -D is for defining macros. So -Dmarch=rv64i defines a macro “march” with the value “rv64i”, which is not useful to you. The correct option is -march=rv64i. What problem did you have when you used -march=rv64i? You might also need to use a -mabi= option if the compiler is configured to emit hard-float code by default.

The linker only needs to know whether you have 32-bit or 64-bit code, and it gets that from the object file, so you don’t need any special linker options.

The disassembler currently does not accept any architecture options.

I see, a macro is definitely not that useful. This is the output without the D, so it seems like as you said there is some issue with the ABI. Which -mabi option should I use?

riscv64-unknown-elf-gcc-7.1.1 -march=rv64i -o store store.c

cc1: error: requested ABI requires -march to subsume the ‘D’ extension

Is there documentation where I could read more about these different flags and options?

UPDATE:

I found the lp64 option, this is the resulting error message:

riscv64-unknown-elf-gcc-7.1.1 -march=rv64i -mabi=lp64 -o store store.c

/home/rjokai/riscv_w_gcc4_8/riscv/lib/gcc/riscv64-unknown-elf/7.1.1/…/…/…/…/riscv64-unknown-elf/bin/ld: /tmp/ccVt8CzV.o: can’t link hard-float modules with soft-float modules
/home/rjokai/riscv_w_gcc4_8/riscv/lib/gcc/riscv64-unknown-elf/7.1.1/…/…/…/…/riscv64-unknown-elf/bin/ld: failed to merge target specific data of file /tmp/ccVt8CzV.o
collect2: error: ld returned 1 exit status

Yes, -mabi=lp64 would be the correct option for 64-bit soft-float code.

The linker error indicates that you are trying to link hard-float libraries with soft-float libraries. Why you got this depends on exactly how you built the toolchain. If you built a hard-float toolchain, and did not enable multilibs, then you would have built only one set of libraries which are hard-float, and which will not be link compatible with soft-float code.

You can fix that by configuring a toolchain to emit soft-float code directly, e.g. make -march=rv64i -mabi=lp64 the default. You can do this by specifying --with-arch=rv64i --with-abi=lp64 when configuring gcc.

Alternatively, you can build multiple copies of the libraries, for various combinations of arch/abi. You can do this by specifying --enable-multilib when configuring gcc. You can see the list of libraries supported by gcc by using “gcc --print-multi-lib”.
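
As a rough illustration of why -march=rv64i fails without an explicit -mabi, the sketch below models the consistency check between the two options. The table and function names are hypothetical. The only facts it relies on are that the "f" ABIs pass floats in F registers, the "d" ABIs pass doubles in D registers, and the plain ilp32/lp64 ABIs are soft-float. It models only the compiler-side check, not the linker's hard-float versus soft-float library check.

```python
# Floating-point extension each ABI needs from -march (None = soft-float ABI).
ABI_FLOAT_REQUIREMENT = {
    "ilp32": None, "ilp32f": "f", "ilp32d": "d",
    "lp64": None, "lp64f": "f", "lp64d": "d",
}


def arch_letters(march):
    """Single-letter extensions of an ISA string; simplified, no version numbers."""
    letters = set()
    for ch in march.lower()[4:]:
        if ch in "_zsx":          # multi-letter extensions are ignored here
            break
        letters.update("imafd" if ch == "g" else ch)
    if "d" in letters:
        letters.add("f")          # D depends on F
    return letters


def abi_matches_arch(march, mabi):
    """True if the ABI's register width and float requirement fit the arch."""
    xlen_ok = march.lower().startswith("rv64") == mabi.startswith("lp64")
    needed = ABI_FLOAT_REQUIREMENT[mabi]
    return xlen_ok and (needed is None or needed in arch_letters(march))


print(abi_matches_arch("rv64i", "lp64d"))   # arch lacks D
print(abi_matches_arch("rv64i", "lp64"))    # soft-float ABI
```

If the toolchain's default ABI is lp64d (an assumption about this particular build), the first call corresponds to the "requires -march to subsume the ‘D’ extension" error, and the second to the combination that compiles.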

I am building my toolchain with the following commands:

export TOP=$(pwd)
git clone https://github.com/riscv-software-src/riscv-tools
cd $TOP/riscv-tools
git submodule update --init --recursive

sudo apt-get install autoconf automake autotools-dev curl device-tree-compiler libmpc-dev libmpfr-dev libgmp-dev libusb-1.0-0-dev gawk build-essential bison flex texinfo gperf libtool patchutils bc zlib1g-dev device-tree-compiler pkg-config

export RISCV=$TOP/riscv
export PATH=$PATH:$RISCV/bin
./build.sh

Should I do

./build.sh --with-arch=rv64i --with-abi=lp64

instead of the last command, or is there an additional step for configuring gcc?

Thank you!

riscv-tools is not well maintained. I would suggest using riscv-gnu-toolchain instead. Since riscv-tools includes riscv-gnu-toolchain, you could use what you already have, but just build riscv-gnu-toolchain directly instead of trying to build using the riscv-tools build script.

If you build riscv-gnu-toolchain directly, then something like
./configure --with-arch=rv64i --with-abi=lp64 --prefix=$RISCV
make
should work. You can also configure from a separate build directory instead of in the source tree, which makes managing build trees a little easier.

If you use the riscv-tools build scripts, then you will have to modify them to pass in additional configure arguments.

Just as a late comment …

In the base RISC-V instruction set, all instructions are 32 bits wide, with the two least significant bits always “11” and the next three least significant bits anything except “111”. Instructions with the two least significant bits something other than “11” are used for instruction set extensions with 16-bit encodings. Instructions with the five least significant bits “11111” are reserved for instruction set extensions with encodings of 48 bits, 64 bits, or longer.
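
The length rule can be sketched in a few lines of Python. This follows the instruction-length encoding table in the RISC-V unprivileged spec as I understand it; the function name is made up for illustration, and encodings longer than 64 bits are simply rejected rather than decoded. The three sample values are the encodings from the objdump listing at the top of the thread.

```python
def insn_length_bits(first_parcel):
    """Length in bits of the instruction whose lowest 16 bits are first_parcel."""
    low = first_parcel & 0xFFFF
    if (low & 0b11) != 0b11:
        return 16                      # bits [1:0] != 11: compressed
    if (low & 0b11100) != 0b11100:
        return 32                      # bits [4:2] != 111: standard 32-bit
    if (low & 0b111111) == 0b011111:
        return 48
    if (low & 0b1111111) == 0b0111111:
        return 64
    raise ValueError("encoding longer than 64 bits; not handled in this sketch")


for word in (0x0017B793, 0x8FD9, 0x3FC00717):
    print(hex(word), insn_length_bits(word))
```

Only the low bits of the first 16-bit parcel matter, which is why a decoder can find instruction boundaries without looking at the rest of the instruction.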

At present, it seems likely that the vast majority of RISC-V CPUs that will be shipped will support the “C” standard extension with 16 bit instructions.

The only exceptions are likely to be:

  1. Student projects

  2. Deeply embedded processors (possibly thousands of them) in an ASIC or FPGA running programs only a few hundred bytes (or maybe a couple of KB) in size.

  3. Extremely high-end processors doing very wide superscalar dispatch, and not wanting the overhead of decoding variable-length instructions.

I’m actually dubious about 3). Some people proposing to make such processors say they don’t want to support the “C” extension and are complaining about standard Linux distributions proposing to mandate C. x86 vendors have shown you can do wide superscalar decode of variable length instructions in commodity desktop and laptop processors, and RISC-V is far far simpler. The benefits of getting more code into your L1 cache (or being able to make the L1 cache smaller) are huge.