On 08/06/2021 13:37, Bruno Piazera Larsen wrote:
>
>
> On 08/06/2021 12:35, Richard Henderson wrote:
>> On 6/8/21 7:39 AM, Bruno Piazera Larsen wrote:
>>>> That's odd.  We already have more arguments than the number of
>>>> argument registers...  A 5x slowdown is distinctly odd.
>>> I did some more digging and the problem is not with
>>> ppc_radix64_check_prot, the problem is ppc_radix64_xlate, which
>>> currently has 7 arguments and we're increasing to 8. 7 feels like
>>> the correct number, but I couldn't find docs supporting it, so I
>>> could be wrong.
>>
>> According to tcg/ppc/tcg-target.c.inc, there are 8 argument registers
>> for ppc hosts.  But now I see you didn't actually say on which host
>> you observed the problem...  It's 6 argument registers for x86_64 host.
>
> Oh, yes, sorry. I'm experiencing it in a POWER9 machine (ppc64le
> architecture). According to tcg this shouldn't be the issue, then, so
> idk if that's the real reason or not. All I know is that as soon as
> gcc can't optimize an argument away it happens (fprintf in
> radix64_xlate, using one of the mmuidx_* functions, defining those as
> macros).
>
> I'll test it in my x86_64 machine and see if such a slowdown happens.
> It's not conclusive evidence, but the function is too complex for me
> to follow the disassembly if I can avoid it...
>

I ran the test: the slowdown also happens on the x86_64 machine
(although without the change it already takes 360s there, so I don't
know whether the slowdown is as dramatic), so it's _probably_ not
caused by going over the argument register count. I have no clue what
it could be. I'm still working on the struct version to see if
anything changes.

--
Bruno Piazera Larsen
Instituto de Pesquisas ELDORADO
Embedded Computing Department
Trainee Software Analyst
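
For reference, a minimal sketch of what a "struct version" could look
like. Every name here (xlate_args, xlate_many, xlate_packed) and the
arithmetic in the bodies are hypothetical stand-ins chosen for
illustration; this is not the actual ppc_radix64_xlate code, only the
shape of the change from eight scalar arguments to one pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in: a routine taking 8 scalar arguments. */
static uint64_t xlate_many(uint64_t eaddr, int access_type, uint64_t lpid,
                           uint64_t pid, bool relocation, int mmu_idx,
                           bool guest_visible, uint64_t base)
{
    uint64_t r = base + (eaddr & 0xfffu);
    r ^= (lpid << 8) ^ (pid << 4);
    r += (uint64_t)access_type + (uint64_t)mmu_idx;
    if (relocation) {
        r |= 1;
    }
    if (guest_visible) {
        r |= 2;
    }
    return r;
}

/* Hypothetical bundle holding the same 8 values. */
struct xlate_args {
    uint64_t eaddr;
    int access_type;
    uint64_t lpid;
    uint64_t pid;
    bool relocation;
    int mmu_idx;
    bool guest_visible;
    uint64_t base;
};

/* Same computation, but the caller passes a single pointer. */
static uint64_t xlate_packed(const struct xlate_args *a)
{
    uint64_t r = a->base + (a->eaddr & 0xfffu);
    r ^= (a->lpid << 8) ^ (a->pid << 4);
    r += (uint64_t)a->access_type + (uint64_t)a->mmu_idx;
    if (a->relocation) {
        r |= 1;
    }
    if (a->guest_visible) {
        r |= 2;
    }
    return r;
}

/* Sanity check that both calling styles compute the same value. */
static void xlate_selfcheck(void)
{
    struct xlate_args a = {
        .eaddr = 0x1234, .access_type = 1, .lpid = 2, .pid = 3,
        .relocation = true, .mmu_idx = 4, .guest_visible = false,
        .base = 0x1000,
    };
    assert(xlate_many(a.eaddr, a.access_type, a.lpid, a.pid, a.relocation,
                      a.mmu_idx, a.guest_visible, a.base)
           == xlate_packed(&a));
}
```

With the struct, only one pointer is passed no matter how many fields
are added later, so the call itself stays within the argument
registers on both hosts. Whether that changes the slowdown would still
have to be measured.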