Why do CSR reads need three cycles?

Hi,
The U540 manual states:

CSR reads have a three-cycle result latency.

So why does it take three cycles to read a CSR register? Thank you.

Because … the CSR space is closer in size and distance from the pipeline to the L1 cache or DTIM than to the CPU registers, so a similar access time makes sense? Because … CSRs are read so seldom that it’s not worth spending a lot of design effort and silicon on making them faster?

Thanks for your reply.

In Rocket’s pipeline, CSR reads and writes are both executed in the writeback (WB) stage. So I don’t know whether CSRs are read in the WB stage because of the three-cycle latency, or simply so that reads and writes are handled in the same place.
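
For what it’s worth, here is a rough way to see how a WB-stage read and a three-cycle result latency could be two sides of the same thing. It is only a toy model: it assumes a classic five-stage in-order pipeline (IF, ID, EX, MEM, WB) in which a dependent instruction needs its operand when it enters EX, and in which a result is bypassed as soon as the stage that produces it finishes. The stage list and the `result_latency` helper are made up for illustration and are not taken from the Rocket source.

```python
# Toy model only; this is NOT Rocket's actual implementation.
# Assumed stage order of a classic five-stage in-order pipeline.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def result_latency(produce_stage, consume_stage="EX"):
    """Cycles between a producer and the earliest dependent instruction.

    Assumes the result is forwarded as soon as `produce_stage` completes
    and that the consumer needs it on entry to `consume_stage`.
    """
    return STAGES.index(produce_stage) - STAGES.index(consume_stage) + 1


latencies = {
    "result ready in EX": result_latency("EX"),
    "result ready in MEM": result_latency("MEM"),
    "result ready in WB (e.g. a CSR read)": result_latency("WB"),
}

for name, cycles in latencies.items():
    print(f"{name}: {cycles} cycle(s)")
```

Under those assumptions, anything that only becomes available in WB ends up with a three-cycle result latency, so the latency may simply follow from where the read happens rather than being a separate design goal.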