CLINT and mtvec configuration


(Don A. Bailey) #1

Hello,

I’ve read through all of the RISC-V and E30x documentation and I can’t seem to find a clear definition of how trap/interrupt/exception vectors are actually called. The {m,h,s,u}tvec register provides a base address offset to the vector table, but where are the software/timer interrupts and external interrupt addresses stored? Is it similar to the table in other RISC processors? Or, am I misunderstanding how this happens and is there only one address called for all of these events?

I could attempt to reverse this from the SDK and other dev software, but I’d rather be pointed to a clear definition that I can point to. I’m porting our proprietary operating system kernel to RISC-V in preparation for the arrival of the HiFive Founders Edition board.

Thank you,
D


(Don A. Bailey) #2

To clarify, I’m reading section 3.1.11 of Privileged-1.9.1 in this fashion:

  • All traps in the system cause PC to be loaded with the value of xtvec
  • PC is then executed as the trap handler

But, in paragraph 3:
“Additional trap vector entry points can be defined by implementations to allow more rapid identification and service of certain trap causes.”
…seems to imply that this address could possibly be interpreted as a vector table. It would be useful to get a clear statement on whether this will always be one callable vector that is expected to select a trap-specific function address based on if-else logic and pending interrupt status.

Thank you,
D


(Megan A. Wachs) #3

Don,

Thanks for the question, excited that you’re porting your OS!

You pointed out the statement “Additional trap vector entry points can be defined by implementations to allow more rapid identification and service of certain trap causes.”

The FE310 does not define any additional trap vector entry points, so the default behavior applies. All traps in the system cause the PC to be loaded with the value in mtvec, which is a writable register in this implementation. Generally the routine there would examine the MCAUSE register to determine which interrupt handler to run.
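
As a rough illustration of "examine MCAUSE", here is a minimal C sketch of how a single-entry handler might split the register into its two parts. It assumes a 32-bit mcause (the FE310 is RV32), where the top bit flags an interrupt and the remaining bits hold the cause code. The helper names `mcause_is_interrupt` and `mcause_code` are made up for this example; in real code the value would be read with a `csrr` instruction first.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: RV32, so mcause is 32 bits wide and bit 31 is the
 * Interrupt flag. On RV64 the flag sits in bit 63 instead. */
#define MCAUSE_INTERRUPT_BIT (UINT32_C(1) << 31)

/* True for an asynchronous interrupt, false for a synchronous exception. */
static bool mcause_is_interrupt(uint32_t mcause)
{
    return (mcause & MCAUSE_INTERRUPT_BIT) != 0;
}

/* The cause code with the Interrupt flag masked off. */
static uint32_t mcause_code(uint32_t mcause)
{
    return mcause & ~MCAUSE_INTERRUPT_BIT;
}
```

A handler would typically branch on `mcause_is_interrupt()` first and then use `mcause_code()` to pick the specific interrupt or exception routine.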

Please note that this is part of our Freedom E300 Platform, not part of the RISC-V specification, and other implementations may (as it says) have different trap vector setups as allowed by the RISC-V spec.


(Don A. Bailey) #4

Hi Megan,

Thank you for the clarification!

Yeah, one thing that I have appreciated about the RISC-V documentation is that it’s easy to see which aspects are implementation dependent and which aren’t. I figured I was asking the right team, and I’m glad that is the case. Although, I realized that I probably should have asked this in the E300 Forum, not the HiFive1, though I suppose both are relevant.

One last thing, according to the spec the Reset and NMI vectors are implementation dependent as well. It looks like the E300 uses the boot/reset model defined in sections 5.2 and 5.3 of the E310G-v1.0 document. Will SiFive continue using this boot strategy for the foreseeable future or is this model going to be specific to E3x0? Perhaps that is a question yet to have an answer.

Thanks again, especially for the fast response!

Best,
D


(Megan A. Wachs) #5

Don,

We hope to keep using this strategy, but it may change slightly for other chips with other non-volatile storage options.
We generally expect to continue with the boot scheme described in E310G-v1.0 sections 5.2 and 5.3 for the Freedom E300 systems. I’m not sure it has been well-defined for the U500 systems.

In general, the Reset vector should remain 0x1000 for all SiFive systems.

Note that on the FE310 chip on the HiFive1 boards, only the last row of table 5.1 applies. The pins for configuring the reset vector are not available in the QF48-pin package on the HiFive1, so your chip will boot as follows (we’ll soon be releasing a doc for the HiFive1 with this information):
0x1000 – jump to OTP
0x20000 – execute a program (we will burn into OTP before shipping) which jumps to SPI Flash
0x20000000 – execute a default bootloader that jumps to SPI Flash 0x20400000
0x20400000 – user program

You will be able to overwrite the SPI ‘bootloader’ and user program. You can also in theory change the OTP program but we discourage this as it is a great way to brick the chip. You can’t change the reset vector or its contents.


(Don A. Bailey) #6

Hi,

Ah, excellent! Thanks for clarifying. I’ll keep an eye out for the chip documentation.

Out of curiosity, will the code burned to OTP be open sourced?

Best,
D


(Megan A. Wachs) #7

Yep, of course! We’ll give an exact description of what we burned into OTP and why.


(Don A. Bailey) #8

One last question: I’m also curious about the thought that went into the trap call model. In other architectures, where the hardware logic performs the if-else on behalf of the software/firmware, the hardware is presumably more expensive to manufacture, at a cost greater than the code size (or speed) penalty of the equivalent software. Was this the reason RISC-V chose a single trap vector: a decrease in chip expense, relying on the I-cache to optimize the trap code?

Thanks for your perspective,
D


(Tommy Thorn) #9

I’m not from SiFive, but I have raised the same concerns in the past (and
I’m still concerned). One argument I have read was that it was easier to
virtualize, but I’m unconvinced.

I can think of many applications where interrupt handling performance
matters a great deal.

Unfortunately, I expect a lot of proprietary extensions in the area.

Tommy


(Wesley W. Terpstra) #10

So, I’m no expert in this area, but can’t you just do an indirect jump based on the triggering interrupt? That’s only a few instructions slower than if the hardware did it for you, or am I misunderstanding the concern?


(Don A. Bailey) #11

Sure, and using a jump table is precisely what they expect developers to do. That’s not a bad thing, and in most cases it won’t be a large trade off from a hardware-supported trap table.
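
For readers following along, the jump-table approach might look roughly like the C sketch below. Everything named here (`irq_table`, `dispatch_interrupt`, the table size of 16, the counters) is invented for illustration and is not taken from the SDK; the only detail borrowed from the spec is that cause 7 is the machine timer interrupt.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_IRQ_CAUSES 16u /* hypothetical table size */

typedef void (*irq_handler_t)(void);

static irq_handler_t irq_table[NUM_IRQ_CAUSES];
static unsigned unhandled_count; /* bumped by the fallback handler */
static unsigned timer_ticks;     /* bumped by the example timer handler */

static void default_handler(void) { unhandled_count++; }
static void timer_handler(void) { timer_ticks++; }

/* Point every slot at the fallback, then register the timer handler. */
static void irq_table_init(void)
{
    for (size_t i = 0; i < NUM_IRQ_CAUSES; i++)
        irq_table[i] = default_handler;
    irq_table[7] = timer_handler; /* 7 = machine timer interrupt */
}

/* One bounds check plus one indirect call, regardless of how many
 * sources are registered. */
static void dispatch_interrupt(uint32_t cause_code)
{
    if (cause_code < NUM_IRQ_CAUSES)
        irq_table[cause_code]();
    else
        default_handler();
}
```

The single trap entry point would save registers, read mcause, and then call something like `dispatch_interrupt()` with the cause code.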

The question for me is more related to whether the hardware implementation of a trap table is more expensive than it is worth. I’d rather decrease the cost of hardware fabrication where possible. Also, I don’t know if there are patents relevant here, which may also be an issue.

So, I’m more just curious of the overall cost of engineering a hardware-assisted trap table and whether that cost was related to the RISC-V decision to use a singular trap entry point.

D


(Wesley W. Terpstra) #12

Well, from a bird’s eye perspective, storing a trap table in DRAM is much cheaper than a trap table inside the chip as registers. The same goes for the cost of the dispatch logic. If it can be done with good performance using instructions stored in DRAM, that’s cheaper than doing it using custom logic in the chip. Even SRAM is cheaper than registers and custom logic.

Another consideration might be that RISC-V is intended to support a wide range of processor classes. Anything that interacts with program flow can have a disproportionate impact on the microarchitecture of a processor. Keeping program flow as simple as possible in a future-proof ISA is probably a good idea; see for example the branch-delay-slot vs. out-of-order machines and its interaction with traps.

This is all just from my perspective, which could be incomplete.


(Don A. Bailey) #13

Sure, any reduction in the chip logic will decrease cost. But, I don’t know if it’s a significant enough amount to make it preferable to use instructions. Because RISC-V has an opportunity to be far more cost effective than its competitors, it’s an interesting question where reductions actually equate to substantial amounts of cost decrease in fabrication.

Yeah, I was very happy to read that RISC-V was going to eschew delay slots. That was a smart move where so many ISAs have faltered. That said, I can’t imagine this is a method for future proofing an ISA. Especially as the ISA is already standardizing its 128-bit architecture.

To me this is more of an IP or cost issue.

Best,
D


(Krste Asanovic) #14

Tommy- check out my talk on interrupts from the fourth RISC-V workshop at MIT where I cover rationale. Basically, Unix-like systems want to treat interrupts as data to be scheduled when convenient, in which case, a single handler entry point is preferred (lower instruction footprint is actually probably more important than slightly lower hardware requirements). Embedded systems treat interrupts as control flow, where interrupt controller is task dispatcher, in which case, you might prefer hardware trap vector in hardware (though not much savings in cycles over software dispatch really). The privileged architecture allows both these use cases, plus additional extensions for even more exotic interrupt processing.


(Tommy Thorn) #15

Hi Krste,

my concern is only about the embedded case.

“interrupt controller is task dispatcher, in which case, you might prefer
hardware trap vector in hardware (though not much savings in cycles
over software dispatch really).”

That depends entirely on the structure and depth of the event tree that
the handler has to walk to figure this out. It can easily be [many] dozens of cycles.
Contrast this with a MSI/MSI-X-like vector where you simply register the
exact vector to take upon a given event, a constant cost regardless of the
number of interrupt sources. For a typical embedded-style core, this
will be on the order of 2-3 cycles. If the work done to handle the interrupt
is trivial, then the overall interrupt handling performance will differ by an
order of magnitude.

Tommy


(Krste Asanovic) #16

Agreed. For low-end embedded with slow processors and dumb peripherals, the usual design choice is to use a more complex interrupt dispatcher to save processor cycles. This is all possible within the RISC-V interrupt framework.


(John Fireman) #17

Sorry to bother you, but I have a question about the CLINT vectored mode that I’d really like to ask…
The RISC-V manual, Volume II, says:

When MODE=Vectored, all synchronous exceptions into machine mode cause the pc to be set to the address in the BASE field, whereas interrupts cause the pc to be set to the address in the BASE field plus four times the interrupt cause number.
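
The rule quoted above can be written out as a small C model, which makes the overlap at index 0 easy to see. This is only a sketch of the address arithmetic, assuming the mtvec layout with MODE in bits [1:0] and BASE in the remaining bits; the function name `trap_target_pc` is made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of where the pc goes on a trap into machine mode.
 * mtvec:        raw CSR value (BASE in the upper bits, MODE in bits [1:0])
 * is_interrupt: Interrupt flag from mcause
 * cause_code:   mcause with the Interrupt flag removed */
static uint64_t trap_target_pc(uint64_t mtvec, bool is_interrupt,
                               uint64_t cause_code)
{
    uint64_t base = mtvec & ~UINT64_C(3);
    uint64_t mode = mtvec & UINT64_C(3);

    /* MODE=1 (vectored): only interrupts are offset by 4 * cause. */
    if (mode == 1 && is_interrupt)
        return base + 4 * cause_code;

    /* Direct mode, or any synchronous exception: always BASE. */
    return base;
}
```

With this model, a synchronous exception and an interrupt with cause 0 both land on BASE, so they share the first table slot, while the machine timer interrupt (cause 7) lands on BASE + 28.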

Is the synchronous exception handler the same as the ID=0 interrupt handler (user software interrupt)?
I tried to answer this myself on qemu-system-riscv64 virt. My steps are below; I tested an exception, a user software interrupt, and the machine timer interrupt:

# boot:
la t0, __vector_table
ori t0, t0, 1 # set MODE=1 (vectored)
csrw mtvec, t0

# vector table
__vector_table:
IRQ_0:
        j trap_handler_entry 
IRQ_1:
        j trap_handler_entry
             ...
IRQ_7:
        j timer_interrupt_vector_handler

# handler at vector table index 0
trap_handler_entry:
    SAVE_REGISTER
    call trap_handler # trap_handler is the handler I used to handle
                      # all traps in direct mode; call (not j) so that
                      # control returns here for the restore and mret
    RESTORE_REGISTER
    mret
# timer handler
void __attribute__((interrupt)) timer_interrupt_vector_handler(void) {
    /* write a new value to mtimecmp */
}

At this point I tested ecall and the timer interrupt. When they occur, execution does go to the corresponding entry in the vector table: ecall -> INDEX=0, timer -> INDEX=7.
Then I tried to trigger a user software interrupt, like this:

# Test function
while(1) {
    if (odd round) { ecall }
    else {
        set_csr(mip, USIP);
        /* Test func in M mode, enable USIE in MIE, 
         * and other interrupts works well */
    }
}

But no user software interrupt occurs. The manual says:

Each lower privilege level has a separate software interrupt-pending bit (SSIP, USIP), which can be both read and written by CSR accesses from code running on the local hart at the associated or any higher privilege level.

I’m sure I can get a supervisor software interrupt by setting MIP_SSIP. I also checked the interrupt-raising function in Spike and found that there is no user software interrupt … The manual also says:

If user-level interrupts are not supported, USIP and USIE are hardwired to zero.

So I printed mie and mip to check:

# config in mie and mstatus
write_csr(mie, read_csr(mie) | MIP_MTIP | MIP_MSIP | MIP_SSIP | MIP_USIP);
write_csr(mstatus, (read_csr(mstatus) | MSTATUS_MIE | MSTATUS_SIE | MSTATUS_UIE));
# Set mie and mip in test func
set_csr(mie, MIP_USIP);
set_csr(mip, MIP_USIP);
# Result before and after set
MIE >> 000000000000008a # before
MIP >> 0000000000000000
MIE >> 000000000000008a # after
MIP >> 0000000000000000
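
The MIE value printed above can be decoded bit by bit. The sketch below is a minimal C helper for that, assuming the standard mie bit positions (USIE = bit 0, SSIE = bit 1, MSIE = bit 3, MTIE = bit 7); the macro names and the `write_was_ignored` function are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed mie bit positions for the fields discussed here. */
#define MIE_USIE (UINT64_C(1) << 0) /* user software interrupt enable */
#define MIE_SSIE (UINT64_C(1) << 1) /* supervisor software interrupt enable */
#define MIE_MSIE (UINT64_C(1) << 3) /* machine software interrupt enable */
#define MIE_MTIE (UINT64_C(1) << 7) /* machine timer interrupt enable */

/* After trying to set requested_bits in a CSR and reading it back,
 * returns true if at least one requested bit did not stick. */
static bool write_was_ignored(uint64_t readback, uint64_t requested_bits)
{
    return (readback & requested_bits) != requested_bits;
}
```

Applied to the readback of 0x8a, SSIE, MSIE and MTIE are set, while the attempt to set USIE was ignored.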

USIP and USIE are hardwired to zero! So maybe there is no USI right now?
But based on the first test, I think synchronous exceptions actually use the ID=0 interrupt handler (user software interrupt). So it seems the answer to my question may be yes … and the conflict is avoided because there is no user software interrupt?
Is this right? Is there any conflict?