On 07/06/2021 18:06, Richard Henderson wrote: > On 6/7/21 12:29 PM, Bruno Piazera Larsen wrote: >> I just tried sending mmu_idx all the way down, but I ran into a very >> weird bug of gcc. If we have to add one more parameter that GCC can't >> just optimize away we get at least a slow down of 5x for the first >> test of check-acceptance (could be more, but the test times out after >> 900 seconds, so I'm not sure). > > That's odd.  We already have more arguments than the number of > argument registers...  A 5x slowdown is distinctly odd. I did some more digging and the problem is not with ppc_radix64_check_prot, the problem is ppc_radix64_xlate, which currently has 7 arguments and we're increasing to 8. 7 feels like the correct number, but I couldn't find docs supporting it, so I could be wrong. > > >> One way that I managed to get around that is saving the current MSR, >> setting it to 5, and restoring after the xlate call. The code ended >> up something like: >> >> int new_idx = (5<> int clr = (7<> int old_idx = env->msr & clr; >> clr = ~clr; >> /* set new msr so we don't need to send the mmu_idx */ >> env->msr = (env->msr & clr) | new_idx; >> ret = ppc_radix64_partition_scoped_xlate(...); >> /* restore old mmu_idx */ >> env->msr = (env->msr & clr) | old_idx; > > No, this is silly. > > We need to do one of two things: >   - make sure everything is inlined, >   - reduce the number of arguments. > > We're currently passing in 9 arguments, which really is too many > already.  We should be using something akin to mmu_ctx_t, but probably > specific to radix64 without the random stuff collected for random > other mmu models. That means we'd have to define radix_ctx_t (or however we call it) in radix64.h, setup the struct on ppc_xlate, then pass it to ppc_radix64_xlate. From looking at the code, it seems the most useful bits to put in the struct are: eaddr, g_addr, h_addr, {h,g}_prot, {g,h}_page_size, mmu_idx and guest_visible. They all seem reasonable to me, but I might be missing something again. > > > r~ Another question: I know hash mmus don't have this problem, but since ppc_jumbo_xlate also uses mmu_idx, should we make those xlate user mmu_idxs as well? I tested and it doesn't make a time difference. -- Bruno Piazera Larsen Instituto de Pesquisas ELDORADO Departamento Computação Embarcada Analista de Software Trainee Aviso Legal - Disclaimer