Hi,
While working on a project, I deciding to make execution time measurements by reading the mcycle register. The result was mediocre and after investigation it seems that function calls are the problem.
I created a small project to test that out :
/* Copyright 2019 SiFive, Inc */
/* SPDX-License-Identifier: Apache-2.0 */
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h> /* for uinptr_t */
#include <time.h>
#include <metal/cpu.h>
#include <metal/machine.h>
#include <metal/csr.h>
uintptr_t cycle_s, cycle_e;
int main() {
int x =0;
cycle_s=( unsigned int *)malloc(sizeof( unsigned int));
cycle_e=( unsigned int *)malloc(sizeof( unsigned int));
METAL_CPU_GET_CSR(mcycle, cycle_s);
int b =fun(x);
METAL_CPU_GET_CSR(mcycle, cycle_e);
printf(" Elapsed time = %lu clock cycles or %lu microseconds \r\n",(unsigned )(cycle_e-cycle_s),(unsigned )(cycle_e-cycle_s)/16);
free(cycle_s);
free(cycle_e);
return 0;
}
with fun a trivial function described in fun.c :
/*
* fun.c
*
* Created on: 21 Jul 2023
* Author: baf
*/
#include <stdio.h>
#include <stdint.h>
#include <time.h>
#include <metal/cpu.h>
#include <metal/machine.h>
#include <metal/timer.h>
#include <metal/led.h>
#include <metal/csr.h>
extern uintptr_t cycle_s, cycle_e;
int fun(int a ){
// METAL_CPU_GET_CSR(mcycle, cycle_s);
a ++;
//METAL_CPU_GET_CSR(mcycle, cycle_e);
return a;
}
When measuring the elapsed time inside the fun(), it of course only take 4 clock cycles.
However measuring the elapsed time inside main gives 2414 clock cycles.
I did not see any extra instructions in disassembly nor a jump to the interrupt handler or anything when going step by step with the debugger. I have no idea what happens during these thousands of clock cycle
If anyone has already experimented this issue or has any ideas, I would be grateful.