There doesn’t seem to be anything wrong with the board’s RAM: running a (single-threaded) memory tester for a few days didn’t uncover a single problem.
# memtester 15G
memtester version 4.5.0 (64-bit)
Copyright (C) 2001-2020 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 15360MB (16106127360 bytes)
got 15360MB (16106127360 bytes), trying mlock ...locked.
Loop 1:
Stuck Address : ok
Random Value : ok
Compare XOR : ok
Compare SUB : ok
Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Solid Bits : ok
Block Sequential : ok
Checkerboard : ok
Bit Spread : ok
Bit Flip : setting 216
I can’t really think of anything else to try. I could try removing the NVMe drive, but that would be inconvenient, and it might stop the problem from triggering simply because I/O would be a lot slower, rather than helping narrow down the underlying issue. Downclocking the CPU might be another option.