GPIO speed depending on how I call printf before the loop?

I have a very small assembly program that toggles a GPIO pin in a tight loop. Before the loop there is a single call to printf. Depending on how I write that call, the frequency of the GPIO toggling changes.

When I call printf with call printf I get a frequency of 615 kHz, and when I call printf with jal ra, printf I get a frequency of 571 kHz.

Can someone please explain this to me?

This is my code (called from main.c):

.equ	GPIO,		0x10012000
.equ	OUTPUT_EN,	0x08
.equ	OUTPUT_VAL,	0x0c
.equ	PIN,		22		# the red led


assembler:
	la		a0, hello
	jal		ra, printf

	# setup register to blink
	la		t0, GPIO
	li		t1, 1
	li		t3, PIN

	# enable the specified GPIO pin
	lw		t4, OUTPUT_EN(t0) 		# get the current value of the enable bits
	sll		t2, t1, t3
	or	 	t4, t4, t2
	sw		t4, OUTPUT_EN(t0)

	sll 	t2, t1, t3
100:
	# toggle the GPIO pin
	lw 		t4, OUTPUT_VAL(t0)
	xor 	t4, t4, t2
	sw		t4, OUTPUT_VAL(t0)
	j		100b

	ret

		.section	.rodata

	hello:
		.string	"This is the clockspeed demo\r\n"

Does nobody have an answer to this question?

Regards,
Bengt

It could be an alignment issue with the jump target, i.e. the address at which the loop label ends up.

Thank you for your kind answer.

Could you please explain how that would affect the loop, given that the call to printf happens before the loop? I don’t understand this.
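
A possible way to see the connection, offered as a sketch rather than a confirmed diagnosis: call is a pseudo-instruction that the assembler normally emits as an auipc/jalr pair (8 bytes, which the linker may later relax to something shorter), while jal ra, printf is normally a single 4-byte instruction. If the two forms end up with different sizes in the final binary, every instruction after the call, including the loop label 100:, lands at a different address. On cores where instruction fetch timing depends on where the branch target sits, the loop can then take a different number of cycles per iteration. Forcing the loop start onto a fixed boundary should make both variants run at the same frequency if this is the cause. The 8-byte boundary below is an assumption, not a documented requirement of this chip; the rest follows the code from the question, lightly trimmed.

```asm
.equ	GPIO,		0x10012000
.equ	OUTPUT_EN,	0x08
.equ	OUTPUT_VAL,	0x0c
.equ	PIN,		22		# the red led

	.text
	.globl	assembler
assembler:
	la		a0, hello
	call	printf			# swap in "jal ra, printf" to compare

	# setup registers to blink
	la		t0, GPIO
	li		t1, 1
	li		t3, PIN

	# enable the specified GPIO pin
	lw		t4, OUTPUT_EN(t0)	# get the current value of the enable bits
	sll		t2, t1, t3		# t2 = bit mask for PIN
	or		t4, t4, t2
	sw		t4, OUTPUT_EN(t0)

	# assumption: pad with nops so the loop starts on an 8-byte boundary,
	# no matter how many bytes the call above occupies
	.balign	8
100:
	# toggle the GPIO pin
	lw		t4, OUTPUT_VAL(t0)
	xor		t4, t4, t2
	sw		t4, OUTPUT_VAL(t0)
	j		100b

	.section	.rodata
hello:
	.string	"This is the clockspeed demo\r\n"
```

Comparing the address of the loop in the disassembly of both builds (for example with objdump -d from the RISC-V toolchain) would show whether the label actually moves between the two variants.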