Timings for drawing VGA graphics

Hi there,

I’ve been thinking about whether to buy a HiFive and play around with it. I’d like to learn RISC-V in a nice and simple constrained environment.

A goal of mine would be to bit-bang to VGA using GPIO and draw some pixels. I’ve spent a while reading the documentation, and unfortunately my knowledge isn’t enough to answer my questions, so I’ll present them:

What’s the bandwidth/throughput/latency for reading from the SPI chip, writing to GPIO ports and reading from GPIO ports? How many clock cycles does it take to send a request then retrieve data from them? How much data do I get for a request?

Could I consistently, say, seek to portions of graphics data and write them out to VGA based on some logic? I’ve seen pong on an Arduino, so I imagine the HiFive has it in it to run pong.

Does the instruction cache cache the scratchpad? The documentation says “The instruction cache will not cache instructions from an instruction scratchpad SRAM or mask ROM placed in the instruction pipeline”. Also, how does the instruction cache work? There’s 16KiB of room, so I assume I could slowly load some code from flash, warm it up into the cache by executing it, and then run my soft-realtime code while using the scratchpad for some tiny bitmaps. But I imagine it could cache or evict some stuff I don’t want evicted.

Some information on the timings and ideas for this project would be interesting. In the end I think I could get away with a black and white terminal display at the very least, but there might be some big blocks that I don’t see. Maybe the HiFive isn’t suited for this kind of work.

Jookia.

Hi Jookia,

There are some other threads on this topic about the GPIO bitbanging frequency, see below.

Your intuition is spot on about how the instruction caching of the scratchpad would work. All instructions are actually cached on the HiFive1 as there is no instruction scratchpad RAM or maskROM for the instruction pipeline (those are all on the data path, shared with the instruction fetch). As long as your code fits in the ICache, it won’t get evicted.

I’m not personally familiar with the specific requirements for driving VGA, but there are lots of LCD / LED screens out there on which one could make a cool Pong game that the HiFive1 would absolutely be capable of driving.

Reading those links raises two questions for me:

Do the QSPI and GPIO share the same IO bus, meaning I’ll have to divide up the speed between them?

Is the CPU clocked at 250MHz by default? Why is this, can it be changed, and what are the risks?

Jookia.

QSPI and GPIO share the same bus, but more importantly, you can only have one outstanding MMIO request at a time in the first stepping of the E31. So your MMIO bottleneck is the processor. You can increase the frequency if you choose, but we only rate it at up to 320MHz. Depending on your luck, there have been chips known to reach 400MHz overclocked, but this is the exception.

A colleague of mine points out that maybe you meant executing from QSPI, in which case instruction fetch can proceed effectively in parallel with data bus access to the GPIO. They are on the same bus, but the TileLink bus allows both requests to be in flight concurrently.

This is a project I got a start on but real life (work, kids etc) forced me to put it aside for a little while. I got as far as setting up the PWM module as an interrupt source to generate the VGA pixel clock and producing some experimental waveforms. It was promising, but I was planning on severely limiting my colour range output and resolution to make it feasible.
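To put rough numbers on why resolution has to be limited, here is a small sketch of the cycle budget. The timing constants are the commonly cited figures for standard 640x480 at 60Hz VGA (25.175MHz pixel clock, 800 pixel clocks per scanline including blanking); they are not from this thread, and the function names are made up for illustration. The `pixel_repeat` parameter is an assumed way of trading horizontal resolution for time by holding each pixel for several pixel clocks.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed standard 640x480@60Hz VGA timing (not taken from this thread). */
#define VGA_PIXEL_CLOCK_HZ 25175000ull
#define VGA_H_TOTAL        800ull   /* pixel clocks per scanline, incl. blanking */

/* CPU cycles available per displayed pixel, scaled by 100 to keep two
 * decimal places in integer arithmetic. pixel_repeat is how many pixel
 * clocks each displayed pixel is held for (1 = full 640 pixel width). */
static uint64_t cycles_per_pixel_x100(uint64_t cpu_hz, unsigned pixel_repeat)
{
    assert(cpu_hz > 0 && pixel_repeat > 0);
    return cpu_hz * 100u * pixel_repeat / VGA_PIXEL_CLOCK_HZ;
}

/* CPU cycles available per full scanline (visible part plus blanking). */
static uint64_t cycles_per_scanline(uint64_t cpu_hz)
{
    assert(cpu_hz > 0);
    return cpu_hz * VGA_H_TOTAL / VGA_PIXEL_CLOCK_HZ;
}
```

At the rated 320MHz mentioned above, `cycles_per_pixel_x100(320000000, 1)` comes to 1271, i.e. about 12.7 CPU cycles per pixel at full width, and `cycles_per_scanline(320000000)` to 10168 cycles per line. Holding each pixel for 4 pixel clocks (160 pixels across) stretches the budget to roughly 50.8 cycles per pixel. These figures ignore MMIO latency and interrupt overhead, so they are an upper bound, not a measurement.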

[quote=“mwachs5, post:2, topic:460”]
I’m not personally familiar with the specific requirements for driving VGA, but there are lots of LCD / LED screens out there on which one could make a cool Pong game that the HiFive1 would absolutely be capable of driving.[/quote]

When I started I had a “how hard can it be?” attitude because VGA is such an old standard. The answer turned out to be “a bit of a pickle.” If one’s goal is to actually drive a display and do something with it, then those screens, which are usually driven over common interfaces like SPI, are a great choice. The VGA option is a purely academic exercise and deeply impractical for uC level stuff. Fun, though.

I’m not planning on displaying some Full HD video. I have no real estimate of how hard this will be, so it’s possible I’ll give up and do something else (maybe just display text graphics over UART!) Whatever I get working is enough for me, it’s more the experience of having an open hardware CPU and making it do things and understanding it as much as possible. All I hope is to have some fun and learn.

Another question: how would I go about finding out the cache eviction algorithm? Is it in VHDL somewhere? I’m guessing it would be either FIFO or LIFO, but which one it is matters less to me than being able to find out at all.

[quote]
All I hope is to have some fun and learn.[/quote]

We hope you will too! That’s definitely one of our goals!

[quote]
Another question: how would I go about finding out the cache eviction algorithm? Is it in VHDL somewhere? I’m guessing it would be either FIFO or LIFO, but which one it is matters less to me than being able to find out at all.[/quote]

The code is available online on GitHub in the sifive/freedom repository (source files for SiFive's Freedom platforms).

The code is written in Chisel, which is a Scala-based HDL. By looking in the ICache code (which is found in the submodule rocket-chip), we can see that the replacement policy is set by some ICacheParams, and by default it is a Random replacement policy:

You can see how the “Random” replacement works here:

Basically, the HW is maintaining an LFSR and using it to select the way to evict.
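As a rough software model of that idea, the sketch below steps a 16-bit LFSR and uses its low bits to choose the way to evict. This is an illustration in C, not the Chisel source: the tap positions (16, 14, 13, 11), the function names, and the use of the low bits as the way index are assumptions, and the real hardware may differ in width, taps, and when it advances the register.

```c
#include <assert.h>
#include <stdint.h>

/* Advance a 16-bit Fibonacci LFSR by one step.
 * Taps 16, 14, 13, 11 are an assumed, commonly used maximal-length choice. */
static uint16_t lfsr16_step(uint16_t state)
{
    uint16_t bit = ((state >> 0) ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1u;
    return (uint16_t)((state >> 1) | (bit << 15));
}

/* Choose the way to evict from the low bits of the LFSR state.
 * num_ways must be a power of two so the mask covers every way. */
static unsigned pick_victim_way(uint16_t state, unsigned num_ways)
{
    assert(num_ways != 0 && (num_ways & (num_ways - 1)) == 0);
    return state & (num_ways - 1);
}

/* Count steps until the LFSR returns to its (nonzero) seed. */
static uint32_t lfsr16_period(uint16_t seed)
{
    assert(seed != 0); /* the all-zero state never changes */
    uint16_t state = seed;
    uint32_t steps = 0;
    do {
        state = lfsr16_step(state);
        steps++;
    } while (state != seed);
    return steps;
}
```

With these taps the register cycles through all 65535 nonzero states before repeating, so the eviction choice is pseudo-random rather than FIFO or LIFO: which line gets replaced depends on the LFSR state at the time of the miss, not on the order in which lines were filled.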