HiFive 1 Arduino performance

I ran a ray tracing demo on my new HiFive1 board and updated the results matrix in the source code as shown:

Rendering times:

AVR@16 320x240x1 1S 293s nodraw 280s/270s
AVR@16 320x240x1 8S 2213s

STM32@72 320x240x1 1S 52s nodraw 49s/51s
STM32@72 320x240x1 8S 403s

ESP8266@80 320x240x1 1S 65s nodraw 61s (72s using slow lib)
ESP8266@160 320x240x1 1S 33s nodraw 30s (37s using slow lib)
ESP8266@80 320x240x1 8S ??? reboot
ESP8266@160 320x240x1 8S 246s

SiFive@16 320x240x1 1S n/a nodraw 1125s
SiFive@256 320x240x1 1S 73s nodraw 62s
SiFive@320 320x240x1 1S 68s nodraw 46s

Had to modify library: Adafruit_ILI9341.h, line 185
#elif defined (FREEDOM_E300_HIFIVE1)
volatile uint32_t *mosiport, *clkport, *dcport, *rsport, *csport;
int32_t _cs, _dc, _rst, _mosi, _miso, _sclk;
uint32_t mosipinmask, clkpinmask, cspinmask, dcpinmask;

Not sure why the HiFive is so slow compared to the other processors.

This is using floating point?

I’ve noticed that the FP libraries that get linked in on the HiFive1 are extremely slow for a 32 bit processor with fast hardware 32x32->64 multiply.
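To illustrate why a hardware 32x32->64 multiply ought to make a software float multiply cheap, here is a minimal sketch. It is not the routine the HiFive1 toolchain links in; `soft_mul_sketch` is a hypothetical name, and the code assumes normal, finite inputs, truncates instead of rounding, and ignores overflow and underflow.

```cpp
#include <stdint.h>
#include <string.h>

// Simplified IEEE 754 single-precision multiply (illustration only).
// Assumptions: both inputs are normal finite numbers, the result stays in
// the normal range, and the mantissa is truncated rather than rounded.
static float soft_mul_sketch(float a, float b) {
    uint32_t ua, ub;
    memcpy(&ua, &a, sizeof ua);            // reinterpret the float bits
    memcpy(&ub, &b, sizeof ub);

    uint32_t sign = (ua ^ ub) & 0x80000000u;
    int32_t exp = (int32_t)((ua >> 23) & 0xFF) + (int32_t)((ub >> 23) & 0xFF) - 127;

    // 24-bit mantissas with the implicit leading 1 restored.
    uint32_t ma = (ua & 0x007FFFFFu) | 0x00800000u;
    uint32_t mb = (ub & 0x007FFFFFu) | 0x00800000u;

    // The single 32x32->64 multiply: a 48-bit product in [2^46, 2^48).
    uint64_t prod = (uint64_t)ma * mb;

    // Normalise so the leading 1 sits at bit 23 again.
    if (prod & (1ull << 47)) {
        prod >>= 24;
        exp += 1;
    } else {
        prod >>= 23;
    }

    uint32_t ur = sign | ((uint32_t)exp << 23) | ((uint32_t)prod & 0x007FFFFFu);
    float r;
    memcpy(&r, &ur, sizeof r);
    return r;
}
```

The core of the operation is one integer multiply plus a handful of shifts and masks, which is why tens of microseconds per multiply at 16 MHz looks slow. A compliant library also has to handle rounding and special values, which this sketch skips.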

At 16 MHz, float add and sub take around 7 us each, and mul around 40.5 us.

I don’t have an AVR Arduino here at the moment, but I’ve always counted them as near enough to 10 us for add/sub/mul at 16 MHz on an 8 bit AVR. Maybe a little less.

Ah … HiFive1 double add/sub are around 5 us, and mul 40 us. (at 16 MHz)

Double is faster than float! So everything is being done in double and converted back and forth from float.

I expect the HiFive1 math library is IEEE compliant. The AVR one is far from it! Speed optimised.

(in an earlier post-midnight version I forgot to measure and take off the loop control overhead)
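For anyone who wants to repeat the measurement, this is roughly the shape of a timing loop that subtracts the loop control overhead. It is a sketch, not the code used for the numbers above: `us_per_mul` is a hypothetical name, and it uses `std::chrono` so it runs on a desktop; on the board you would read `micros()` instead.

```cpp
#include <chrono>

// Returns the approximate time in microseconds for one multiply of type T,
// after subtracting the cost of an otherwise identical empty loop.
template <typename T>
double us_per_mul(unsigned long iters) {
    using clock = std::chrono::steady_clock;
    using usec = std::chrono::duration<double, std::micro>;

    // volatile keeps the compiler from folding or deleting the work.
    volatile T a = T(1.000001), acc = T(1);
    volatile unsigned long sink = 0;

    auto t0 = clock::now();
    for (unsigned long i = 0; i < iters; ++i) {
        sink = i;                       // loop overhead only
    }
    auto t1 = clock::now();
    for (unsigned long i = 0; i < iters; ++i) {
        sink = i;                       // same overhead...
        acc = acc * a;                  // ...plus one multiply
    }
    auto t2 = clock::now();

    double overhead = usec(t1 - t0).count();
    double total = usec(t2 - t1).count();
    return (total - overhead) / static_cast<double>(iters);
}
```

Calling `us_per_mul<float>(100000)` and `us_per_mul<double>(100000)` gives the two figures to compare. The volatile loads and stores are still included in the result, so treat it as an upper bound on the multiply itself.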


Thanks for the insight! The ray tracer is all floating point, so your guess was good.

Abusing cpp to replace the floats with doubles resulted in an execution time of 153 seconds.
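For readers unfamiliar with the trick, this is presumably something like the following, assuming "cpp" here means the C preprocessor. The `Vec3` and `dot` names are made up for illustration and are not taken from raytracer.h.

```cpp
// Redefining a keyword with a macro is formally not allowed by the C++
// standard, but compilers accept it. Keep standard headers above the
// #define so they are not affected.
#define float double   // every later `float` token now means `double`

struct Vec3 {
    float x, y, z;     // actually three doubles after preprocessing
};

static float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

#undef float           // restore the keyword for anything that follows

static_assert(sizeof(Vec3) == 3 * sizeof(double),
              "the macro turned the float members into doubles");
```

Literals with an `f` suffix (such as `1.0f`) are not touched by the macro, so some float-to-double conversions can remain in the code.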

(I’m actually surprised FP works at all on these processors :-))

And at the higher clock speeds?

I was actually very surprised how fast software FP is on the 8 bit AVRs when I first got one a few years back. Someone did a great job on that library.

Sorry to take so long to reply - missed the notification email, I guess.

The 153 seconds was observed at the 320 MHz clock rate.

I wonder how different the gcc FP libraries are for ESP8266 and HiFive - the delta in performance is pretty large (61s @ 80 MHz vs. 62s @ 256 MHz). Have to dig up an ESP8266 vs. HiFive DMIPS comparison.
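To put a number on that delta, one can convert the nodraw times from the results matrix (ESP8266@80: 61s, SiFive@256: 62s) into approximate clock cycles. This is back-of-the-envelope arithmetic only; the `cycles` helper is a name made up here.

```cpp
#include <cstdio>

// Approximate CPU cycles spent = run time in seconds x clock in MHz x 1e6.
constexpr double cycles(double seconds, double mhz) {
    return seconds * mhz * 1e6;
}

constexpr double esp8266_cycles = cycles(61.0, 80.0);    // ESP8266@80, nodraw
constexpr double hifive_cycles  = cycles(62.0, 256.0);   // SiFive@256, nodraw
constexpr double cycle_ratio    = hifive_cycles / esp8266_cycles;

// Prints both cycle counts and their ratio.
void print_cycle_comparison() {
    std::printf("ESP8266: %.2e cycles\n", esp8266_cycles);
    std::printf("HiFive1: %.2e cycles\n", hifive_cycles);
    std::printf("ratio:   %.2f\n", cycle_ratio);
}
```

That works out to about 4.9e9 cycles on the ESP8266 against about 1.6e10 on the HiFive1, so the HiFive1 spends a bit more than three times as many clock cycles on the same frame.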

That’s unbelievable! With the hardware resources available vs an 8 bit AVR, 153 seconds should be about right at 16 MHz. Given a reasonable performance-oriented FP library, of course.

Could you share your code?

It’s been a while since I wrote a raytracer… how is the scene itself generated? Is it generated on the fly (in memory) or stored in flash (program memory)? If the latter, and we have to keep hitting flash, that will also slow it down.

Yes, that’s a good point.

Looking at the video https://www.youtube.com/watch?v=H9uUO-UQtzE it’s a classic simple ray-tracing scene with a chequered plane and a few balls of varying size and reflectivity. The description should only be a few hundred bytes.

It would be important to make sure the scene description is copied into RAM, not accessed directly from flash.

Simply making sure any global variables or arrays are (somewhat perversely) not “const” should be sufficient.
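As a sketch of the difference (the `Sphere` layout and the values are invented here, not taken from the ray tracer's source): a `const` table is typically placed in a read-only section that can stay in flash, while a non-const initialised table is copied into RAM by the startup code. Where each section actually lands depends on the linker script, so treat the comments as the usual case rather than a guarantee.

```cpp
// Hypothetical scene element, for illustration only.
struct Sphere {
    float x, y, z;
    float radius;
    float reflectivity;
};

// const: usually ends up in .rodata, which may be read directly from flash.
static const Sphere scene_in_flash[] = {
    { 0.0f, 1.0f, 3.0f, 1.0f, 0.5f },
    { 2.0f, 0.5f, 4.0f, 0.5f, 0.8f },
};

// Not const: usually ends up in .data, copied to RAM before main() runs.
static Sphere scene_in_ram[] = {
    { 0.0f, 1.0f, 3.0f, 1.0f, 0.5f },
    { 2.0f, 0.5f, 4.0f, 0.5f, 0.8f },
};

// The rendering code reads either table the same way.
static float total_radius(const Sphere* scene, int count) {
    float sum = 0.0f;
    for (int i = 0; i < count; ++i) {
        sum += scene[i].radius;
    }
    return sum;
}
```

Swapping `scene_in_flash` for `scene_in_ram` in the render loop is then the whole experiment; the timings would show whether flash access matters here.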

Again: it would be good to have the source code, so we can try it ourselves.

Code was obtained from: https://drive.google.com/drive/folders/0B_jncvz2HAYCblR3Zkt4LVh5REE?pageId=102377604871671942921&tid=0B_jncvz2HAYCfkVDd2taYlFrdnR1N0M2WUg5NC1sSS15ckJ6enJnXzk4YnNzZkw4ZHR0NTQ

Diffs to make work with HiFive board:
diff -Naur …/…/esp8266/projects/TFT22_raytrace/TFT22_raytrace.ino TFT22_raytrace/TFT22_raytrace.ino
--- …/…/esp8266/projects/TFT22_raytrace/TFT22_raytrace.ino 2016-04-03 08:57:26.000000000 -0700
+++ TFT22_raytrace/TFT22_raytrace.ino 2017-02-19 11:14:13.460868690 -0800
@@ -7,6 +7,9 @@
Source code for my YouTube videos:
not_https://www.youtube.com/watch?v=RD5VO8o9bD4
https://www.youtube.com/watch?v=H9uUO-UQtzE
+
+Added SiFive HiFive Freedom 300 board 19-Feb-2017 RS
+
*/

/*
@@ -55,6 +58,21 @@
CS D1 (GPIO5)
GND GND
VCC +3.3V
+
+SiFive:
+Board: HiFive Freedom 300, 16, 256, or 320MHz
+
+TFT2.2 ILI9341 from top left:
+MISO 12 (SPI1:SD1/MISO)
+LED +3.3V
+SCK 13 (SPI1:SCK)
+MOSI 11 (SPI1:SD0/MOSI)
+DC 6
+RST 4
+CS 5
+GND GND
+VCC +3.3V
+
*/

/*
@@ -70,6 +88,10 @@
ESP8266@160 320x240x1 1S 33s nodraw 30s (37s using slow lib)
ESP8266@80 320x240x1 8S ??? reboot
ESP8266@160 320x240x1 8S 246s
+
+SiFive@16 320x240x1 1S n/a nodraw 1125s
+SiFive@256 320x240x1 1S 73s nodraw 62s
+SiFive@320 320x240x1 1S 68s nodraw 46s
*/

#include "SPI.h"
@@ -98,20 +120,29 @@
//#endif

// ESP8266
-#ifdef ESP8266
-#include "Adafruit_GFX.h"
-#include "Adafruit_ILI9341.h"
-#define TFT_DC 2
-#define TFT_CS 5
-Adafruit_ILI9341 display = Adafruit_ILI9341(TFT_CS, TFT_DC);
-#endif
+//#ifdef ESP8266
+//#include "Adafruit_GFX.h"
+//#include "Adafruit_ILI9341.h"
+//#define TFT_DC 2
+//#define TFT_CS 5
+//Adafruit_ILI9341 display = Adafruit_ILI9341(TFT_CS, TFT_DC);
+//#endif
+
+// SiFive HiFive 1
+#include <Adafruit_GFX.h>
+#include <Adafruit_ILI9341.h>
+#define TFT_CS 5
+#define TFT_DC 6
+#define TFT_RST 4
+Adafruit_ILI9341 display = Adafruit_ILI9341(TFT_CS, TFT_DC, TFT_RST);
+

#define RGBTO565(_r, _g, _b) ((((_r) & B11111000)<<8) | (((_g) & B11111100)<<3) | ((_b) >>3))
#include "raytracer.h"

void setup() {
-// Serial.begin(115200);
-  Serial.begin(9600);
+  Serial.begin(115200);
+// Serial.begin(9600);
Serial.println("ILI9341 raytracing");

display.begin();
@@ -150,4 +181,4 @@

void loop(void) {
}

Will need to modify Adafruit’s library as outlined above.