Problem identified! All my fault, sorry for being so stupid.
Peak detection settles just fine.
I assumed that using the same attack time = 20 ms for both peak and RMS would make them settle at roughly the same time, but of course peak detection is 25x slower (yes, every audio engineer would've known that, but I didn't).
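For anyone who wants to see the effect without the plots, here is a rough sketch. It is not the detector discussed in this thread: it assumes a plain one-pole mean-square smoother for RMS and a generic branching attack/release follower for peak, and the 200 ms release, the 1 kHz sine and the 1 dB tolerance are my own picks for illustration. It should show the peak detector taking noticeably longer to settle than RMS with the same 20 ms time constant, but it will not reproduce the 25x figure, which depends on the actual detector.

```python
import math

def settle_times(fs=48000, attack_ms=20.0, release_ms=200.0,
                 freq=1000.0, seconds=1.0, tol_db=1.0):
    """Return (rms_time, peak_time) in seconds until each detector stays
    within tol_db of its own final value. All defaults except the 20 ms
    attack time are assumptions made for this illustration."""
    a_att = math.exp(-1.0 / (fs * attack_ms / 1000.0))
    a_rel = math.exp(-1.0 / (fs * release_ms / 1000.0))
    mean_sq = 0.0   # state of the RMS detector (smoothed x^2)
    env = 0.0       # state of the peak detector
    rms_trace, peak_trace = [], []
    for i in range(int(fs * seconds)):
        x = math.sin(2.0 * math.pi * freq * i / fs)
        # RMS: one-pole smoothing of the squared signal, then square root
        mean_sq = a_att * mean_sq + (1.0 - a_att) * x * x
        rms_trace.append(math.sqrt(mean_sq))
        # Peak: attack coefficient while the rectified input is above the
        # envelope, release coefficient otherwise
        r = abs(x)
        c = a_att if r > env else a_rel
        env = c * env + (1.0 - c) * r
        peak_trace.append(env)

    def settle(trace):
        # Final value: average over the last 10 ms of the trace
        tail = trace[-int(0.01 * fs):]
        final = sum(tail) / len(tail)
        lo = final * 10.0 ** (-tol_db / 20.0)
        hi = final * 10.0 ** (tol_db / 20.0)
        last_outside = -1
        for i, v in enumerate(trace):
            if not (lo <= v <= hi):
                last_outside = i
        return (last_outside + 1) / fs

    return settle(rms_trace), settle(peak_trace)

if __name__ == "__main__":
    rms_t, peak_t = settle_times()
    print(f"RMS settles after  {rms_t * 1000:.1f} ms")
    print(f"Peak settles after {peak_t * 1000:.1f} ms")
```

The peak follower only charges during the short parts of each cycle where the rectified signal is above the current envelope, so it slows down more and more as it approaches the peak, while the RMS smoother sees the full signal all the time.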
In the left plot below, RMS (blue) settles early, so I concluded that the same should've happened for peaks (red). However, if you look at the right plot, which shows the complete picture (waiting 30x longer), you can see that peak detection settles just fine, only much later.
This can be fixed by using (much) higher shape values than 1, so everything is fine and case closed. So sorry again for the trouble!