* benh performance problem
@ 2004-02-03 17:58 Mikolaj Krzewicki
  2004-02-03 22:24 ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 7+ messages in thread
From: Mikolaj Krzewicki @ 2004-02-03 17:58 UTC
  To: linuxppc-dev


hi all,
starting 2.6.0-test9-benh3 i noticed an increased system overhead on my
ibook 500. Each process eats much more cpu time than in 2.4 or even
vanilla 2.6.
gkrellm takes 1% instead of 0.1, xmms 5% instead of ~1% and so on.
I benchmarked the lot with octave and it indeed runs ~10% slower on the
benh kernels.
I would use vanilla 2.6.1 but it won't sleep/wake correctly (at all).
All pre-2.6.0-test9-benh3 kernels worked fast but wouldn't sleep/wake either.
I tried to isolate the patch responsible but failed.
Has anybody come across similar behaviour?
greetings, M.


** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/


* Re: benh performance problem
  2004-02-03 17:58 benh performance problem Mikolaj Krzewicki
@ 2004-02-03 22:24 ` Benjamin Herrenschmidt
  2004-02-05 21:50   ` Mikolaj Krzewicki
                     ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Benjamin Herrenschmidt @ 2004-02-03 22:24 UTC
  To: Mikolaj Krzewicki; +Cc: linuxppc-dev list


On Wed, 2004-02-04 at 04:58, Mikolaj Krzewicki wrote:
> hi all,
> starting 2.6.0-test9-benh3 i noticed an increased system overhead on my
> ibook 500. Each process eats much more cpu time than in 2.4 or even
> vanilla 2.6.
> gkrellm takes 1% instead of 0.1, xmms 5% instead of ~1% and so on.
> I benchmarked the lot with octave and it indeed runs ~10% slower on the
> benh kernels.
> I would use vanilla 2.6.1 but it won't sleep/wake correctly (at all).
> All pre-2.6.0-test9-benh3 kernels worked fast but wouldn't sleep/wake either.
> I tried to isolate the patch responsible but failed.
> Has anybody come across similar behaviour?
> greetings, M

I switched to HZ=1000, the userland procps utilities (like ps and top)
tend to not properly deal with that, at least earlier versions, afaik.

I suspect it's just display crap.
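
To illustrate the kind of display error described above, here is a minimal sketch. It assumes a hypothetical monitoring tool that converts tick counts to seconds with a hard-coded HZ of 100 while the kernel actually counts at HZ=1000. The function name and the sample numbers are made up for illustration; this is not how any particular procps version is implemented.

```python
def reported_cpu_percent(delta_ticks, elapsed_s, assumed_hz):
    """CPU share a tool would display if it believes one second is assumed_hz ticks."""
    cpu_seconds = delta_ticks / assumed_hz
    return 100.0 * cpu_seconds / elapsed_s

# Hypothetical sample: a process was charged 10 ticks over a 10 second window
# on a kernel that really ticks 1000 times per second (true share: 0.1%).
delta_ticks = 10
elapsed_s = 10.0

correct = reported_cpu_percent(delta_ticks, elapsed_s, assumed_hz=1000)
inflated = reported_cpu_percent(delta_ticks, elapsed_s, assumed_hz=100)

print(f"tool that knows HZ=1000: {correct:.2f}%")
print(f"tool that assumes HZ=100: {inflated:.2f}%")
```

Under that assumption every displayed percentage is inflated by a factor of ten, which would only change what is shown, not how fast anything runs.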

Also, you can find a more recent kernel than that :) My bk tree is
currently at 2.6.2-rc3-ben1

Ben.



* Re: benh performance problem
  2004-02-03 22:24 ` Benjamin Herrenschmidt
@ 2004-02-05 21:50   ` Mikolaj Krzewicki
       [not found]   ` <40221664.70208@lycos.nl>
  2004-02-07  9:17   ` Gabriel Paubert
  2 siblings, 0 replies; 7+ messages in thread
From: Mikolaj Krzewicki @ 2004-02-05 21:50 UTC
  To: linuxppc-dev; +Cc: benh


> On Wed, 2004-02-04 at 04:58, Mikolaj Krzewicki wrote:
>
>>> hi all,
>>> starting 2.6.0-test9-benh3 i noticed an increased system overhead on my
>>> ibook 500. Each process eats much more cpu time than in 2.4 or even
>>> vanilla 2.6.
>>> gkrellm takes 1% instead of 0.1, xmms 5% instead of ~1% and so on.
>>> I benchmarked the lot with octave and it indeed runs ~10% slower on the
>>> benh kernels.
>>> I would use vanilla 2.6.1 but it won't sleep/wake correctly (at all).
>>> All pre-2.6.0-test9-benh3 kernels worked fast but wouldn't sleep/wake either.
>>> I tried to isolate the patch responsible but failed.
>>> Has anybody come across similar behaviour?
>>> greetings, M
>
>
> I switched to HZ=1000, the userland procps utilities (like ps and top)
> tend to not properly deal with that, at least earlier versions, afaik.
>
> I suspect it's just display crap.
>
> Also, you can find a more recent kernel than that :) My bk tree is
> currently at 2.6.2-rc3-ben1
>
> Ben.

i tested 2.6.1-benh1 now with HZ=100
the benchmark i used shows a speedup back to the values i'm used to.
The benchmark itself is an octave script the execution time of which i
tested with different kernels under similar circumstances, so here are
the details:

2.6.1-benh1, HZ=1000: timing=36.5s
2.6.1-benh1, HZ=100 : timing=32s
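
For reference, the relative difference between the two timings above can be worked out as follows. This is just arithmetic on the two reported numbers; the variable names are illustrative.

```python
# Timings reported above for the same octave script on 2.6.1-benh1.
t_hz1000 = 36.5  # seconds with HZ=1000
t_hz100 = 32.0   # seconds with HZ=100

# Slowdown of HZ=1000 relative to the HZ=100 run.
slowdown = (t_hz1000 - t_hz100) / t_hz100

# Time saved by going back to HZ=100, relative to the HZ=1000 run.
saving = (t_hz1000 - t_hz100) / t_hz1000

print(f"HZ=1000 is {slowdown:.1%} slower than HZ=100")
print(f"HZ=100 saves {saving:.1%} of the HZ=1000 run time")
```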

this is with X running and a lot more processes (not running).
the weird thing is it executes slightly faster (on average) with
pbbuttonsd off.
The machine is g3 500 ibook, the octave script is:

   tic;
     a = abs(randn(1500, 1500)/10);
     b = a';
     c= a*b;
     a = reshape(b, 750, 3000);
     b = a';
   timing=toc;

so lots of system calls and cache flushing is in order.

Mikolaj.



* Re: benh performance problem
       [not found]   ` <40221664.70208@lycos.nl>
@ 2004-02-06  5:14     ` Benjamin Herrenschmidt
  2004-02-06 15:25       ` Hollis Blanchard
  0 siblings, 1 reply; 7+ messages in thread
From: Benjamin Herrenschmidt @ 2004-02-06  5:14 UTC
  To: Mikolaj Krzewicki; +Cc: linuxppc-dev list


On Thu, 2004-02-05 at 21:09, Mikolaj Krzewicki wrote:

> the benchmark i used shows a speedup back to the values i'm used to.
> The benchmark itself is an octave script the execution time of which i
> tested with different kernels under similar circumstances, so here are
> the details:
>
> 2.6.1-benh1, HZ=1000: timing=36.5s
> 2.6.1-benh1, HZ=100 : timing=32s
>
> this is with X running and a lot more processes(not running).
> the weird thing is it executes slightly faster (on average) with
> pbbuttonsd off.
> The machine is g3 500 ibook, the octave script is:
>
>    tic;
>      a = abs(randn(1500, 1500)/10);
>      b = a';
>      c= a*b;
>      a = reshape(b, 750, 3000);
>      b = a';
>    timing=toc;
>
> so lots of system calls and cache flushing is in order.
>
> Mikolaj.

Well, I don't know what the above means, I don't talk that language
anyway :)

The fact that pbbuttons makes a difference makes me think the
interrupt handling is taking way too much time on your setup,
and pbbuttons is loading the machine with PMU interrupts...

Not sure if I can fix any of this at this point without doing a
major rewrite of the exception handling code, I suspect those
CPUs don't like running in real mode and our exception handling
happens mostly in that mode in ppc32...

Ben.



* Re: benh performance problem
  2004-02-06  5:14     ` Benjamin Herrenschmidt
@ 2004-02-06 15:25       ` Hollis Blanchard
  2004-02-07 21:16         ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 7+ messages in thread
From: Hollis Blanchard @ 2004-02-06 15:25 UTC
  To: Benjamin Herrenschmidt; +Cc: Mikolaj Krzewicki, linuxppc-dev list


On Feb 5, 2004, at 11:14 PM, Benjamin Herrenschmidt wrote:
>
> On Thu, 2004-02-05 at 21:09, Mikolaj Krzewicki wrote:
>>
>> 2.6.1-benh1, HZ=1000: timing=36.5s
>> 2.6.1-benh1, HZ=100 : timing=32s
> The fact that pbbuttons makes a difference makes me think the
> interrupt handling is taking way too much time on your setup,
> and pbbuttons is loading the machine with PMU interrupts...

pbbuttons is loading the machine? What about all those extra
decrementer ticks? I assume benchmarks showed no performance
degradation on some class of ppc32; what was the slowest system tested?
For that matter, couldn't this hurt 4xx, 8xx(x), as well?
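
As a rough back-of-the-envelope check on the decrementer question, the sketch below estimates what each extra tick would have to cost if the whole 4.5 s difference between the quoted timings came from the 900 additional ticks per second. That attribution is an assumption made purely for illustration, as is the 500 MHz clock taken from the "g3 500 ibook" description; nothing in the thread establishes either.

```python
# Figures quoted above.
t_hz1000 = 36.5   # seconds, HZ=1000 run
t_hz100 = 32.0    # seconds, HZ=100 run

# Assumptions (not established in the thread):
# - the entire difference is caused by the extra timer ticks
# - the CPU runs at 500 MHz
CPU_HZ = 500e6

extra_ticks_per_s = 1000 - 100
extra_ticks = extra_ticks_per_s * t_hz1000      # ticks added during the HZ=1000 run
extra_time_s = t_hz1000 - t_hz100

cost_per_tick_s = extra_time_s / extra_ticks
cost_per_tick_cycles = cost_per_tick_s * CPU_HZ

print(f"extra ticks during the run: {extra_ticks:.0f}")
print(f"implied cost per tick: {cost_per_tick_s * 1e6:.0f} us "
      f"(~{cost_per_tick_cycles:.0f} cycles)")
```

An implied cost in the tens of thousands of cycles per tick would be very high for a bare timer interrupt, which suggests that either the per-tick path is unusually expensive on this machine or that other effects (cache and TLB disturbance, other interrupt sources) contribute to the difference.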

Perhaps HZ should be user-configurable? Although that doesn't help
distributions who'd like one kernel to work everywhere... How does
Linux handle increased HZ on old i386 anyways?

> Not sure if I can fix any of this at this point without doing a
> major rewrite of the exception handling code, I suspect those
> CPUs don't like running in real mode and our exception handling
> happens mostly in that mode in ppc32...

I didn't realize CPUs go slower in real mode..?

--
Hollis Blanchard
IBM Linux Technology Center



* Re: benh performance problem
  2004-02-03 22:24 ` Benjamin Herrenschmidt
  2004-02-05 21:50   ` Mikolaj Krzewicki
       [not found]   ` <40221664.70208@lycos.nl>
@ 2004-02-07  9:17   ` Gabriel Paubert
  2 siblings, 0 replies; 7+ messages in thread
From: Gabriel Paubert @ 2004-02-07  9:17 UTC
  To: Benjamin Herrenschmidt; +Cc: Mikolaj Krzewicki, linuxppc-dev list


On Wed, Feb 04, 2004 at 09:24:44AM +1100, Benjamin Herrenschmidt wrote:
>
> On Wed, 2004-02-04 at 04:58, Mikolaj Krzewicki wrote:
> > hi all,
> > starting 2.6.0-test9-benh3 i noticed an increased system overhead on my
> > ibook 500. Each process eats much more cpu time than in 2.4 or even
> > vanilla 2.6.
> > gkrellm takes 1% instead of 0.1, xmms 5% instead of ~1% and so on.
> > I benchmarked the lot with octave and it indeed runs ~10% slower on the
> > benh kernels.
> > I would use vanilla 2.6.1 but it won't sleep/wake correctly (at all).
> > All pre-2.6.0-test9-benh3 kernels worked fast but wouldn't sleep/wake either.
> > I tried to isolate the patch responsible but failed.
> > Has anybody come across similar behaviour?
> > greetings, M
>
> I switched to HZ=1000, the userland procps utilities (like ps and top)
> tend to not properly deal with that, at least earlier versions, afaik.
>
> I suspect it's just display crap.

The other problem is that measuring CPU time just by sampling which
process is running at timer interrupt is a crappy method bound to be
inaccurate, especially on very bursty workloads, or ones which depend
on other frequencies which beat with the timer.
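
The beating effect can be shown with a small deterministic simulation. It is a sketch under simplifying assumptions: a single process that runs in fixed-length bursts at a fixed period, and an accounting scheme that charges a whole tick to whatever is running at the instant of the timer interrupt. The function and parameter names are hypothetical.

```python
def sampled_share(tick_period_us, burst_period_us, burst_len_us, phase_us, duration_us):
    """Fraction of timer ticks that land while the bursty process is running."""
    ticks = range(0, duration_us, tick_period_us)
    hits = sum(
        1 for t in ticks
        if (t - phase_us) % burst_period_us < burst_len_us
    )
    return hits / len(ticks)

DURATION_US = 1_000_000   # simulate one second
TICK_US = 1000            # HZ=1000
BURST_US = 100            # process runs 100 us per burst

# Bursts exactly in step with the timer: true share is 10% in both cases.
in_phase = sampled_share(TICK_US, 1000, BURST_US, phase_us=0, duration_us=DURATION_US)
out_of_phase = sampled_share(TICK_US, 1000, BURST_US, phase_us=500, duration_us=DURATION_US)

# Burst period slightly off the tick period: samples drift across the burst.
drifting = sampled_share(TICK_US, 1030, BURST_US, phase_us=0, duration_us=DURATION_US)

print(f"in phase:     {in_phase:.1%} (true share 10.0%)")
print(f"out of phase: {out_of_phase:.1%} (true share 10.0%)")
print(f"drifting:     {drifting:.1%} (true share {BURST_US / 1030:.1%})")
```

When the burst period matches the tick period, the sampled figure depends entirely on the phase and can be anywhere from 0% to 100%; only when the two frequencies drift against each other does sampling converge towards the true share.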

Didn't somebody implement CPU process accounting based on the
timebase? Or was it for another architecture (maybe s390)?


	Regards,
	Gabriel


* Re: benh performance problem
  2004-02-06 15:25       ` Hollis Blanchard
@ 2004-02-07 21:16         ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 7+ messages in thread
From: Benjamin Herrenschmidt @ 2004-02-07 21:16 UTC
  To: Hollis Blanchard; +Cc: Mikolaj Krzewicki, linuxppc-dev list


> pbbuttons is loading the machine? What about all those extra
> decrementer ticks? I assume benchmarks showed no performance
> degradation on some class of ppc32; what was the slowest system tested?
> For that matter, couldn't this hurt 4xx, 8xx(x), as well?

Heh, well, I didn't see any obvious degradation on my powerbook but
I didn't benchmark much actually. Maybe I should, or make kernel HZ
a config option.

> Perhaps HZ should be user-configurable? Although that doesn't help
> distributions who'd like one kernel to work everywhere... How does
> Linux handle increased HZ on old i386 anyways?

I don't know :)

> > Not sure if I can fix any of this at this point without doing a
> > major rewrite of the exception handling code, I suspect those
> > CPUs don't like running in real mode and our exception handling
> > happens mostly in that mode in ppc32...
>
> I didn't realize CPUs go slower in real mode..?

It is the case with 970's and I think with IBM G3s, I'm not sure
about Motorola G4s and other CPUs. In real mode, all memory is
treated like it has the G bit set, preventing speculation (among
others, I suppose prefetch gets killed too).

It's very visible on lmbench with a G5: The null syscall overhead
of the ppc64 kernel is lower than the one of the ppc32 kernel
despite actually running more code and saving more & bigger registers.

Ben.


