All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
@ 2013-10-18  7:38 Wolfgang Denk
  2013-10-18 16:38 ` Scott Wood
  2013-10-23  5:07 ` Kumar Gala
  0 siblings, 2 replies; 21+ messages in thread
From: Wolfgang Denk @ 2013-10-18  7:38 UTC (permalink / raw
  To: linuxppc-dev; +Cc: Scott Wood, Wolfgang Denk

Default Debian PowerPC doesn't work on e500 because the code contains
"lwsync" instructions, which are unsupported on this core.  As a
result, applications using this will crash with an "unhandled signal 4"
"Illegal instruction" error.

As a work around we add code to emulate this insn.  This is expensive
performance-wise, but allows to run standard user land code.

Signed-off-by: Wolfgang Denk <wd@denx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Scott Wood <scottwood@freescale.com>
---
I am aware that the clean solution to the problem is to build user
space with compiler options that match the target architecture.
However, sometimes this is just too much effort.

Also, of course the performance of such an emulation sucks. But the
the occurrence of such instructions is so rare that no significant
slowdown can be oserved.

I'm not sure if this should / could go into mainline.  I'm posting it
primarily so it can be found should anybody else need this.
- wd

 arch/powerpc/kernel/traps.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index f783c93..f330374 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -986,6 +986,13 @@ static int emulate_instruction(struct pt_regs *regs)
 		return 0;
 	}
 
+	/* Emulating the lwsync insn as a sync insn */
+	if (instword == PPC_INST_LWSYNC) {
+		PPC_WARN_EMULATED(lwsync, regs);
+		asm volatile("sync" : : : "memory");
+		return 0;
+	}
+
 	/* Emulate the mcrxr insn.  */
 	if ((instword & PPC_INST_MCRXR_MASK) == PPC_INST_MCRXR) {
 		int shift = (instword >> 21) & 0x1c;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-18  7:38 [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores Wolfgang Denk
@ 2013-10-18 16:38 ` Scott Wood
  2013-10-18 18:50   ` Wolfgang Denk
  2013-10-23  5:07 ` Kumar Gala
  1 sibling, 1 reply; 21+ messages in thread
From: Scott Wood @ 2013-10-18 16:38 UTC (permalink / raw
  To: Wolfgang Denk; +Cc: linuxppc-dev

On Fri, 2013-10-18 at 09:38 +0200, Wolfgang Denk wrote:
> Default Debian PowerPC doesn't work on e500 because the code contains
> "lwsync" instructions, which are unsupported on this core.  As a
> result, applications using this will crash with an "unhandled signal 4"
> "Illegal instruction" error.
> 
> As a work around we add code to emulate this insn.  This is expensive
> performance-wise, but allows to run standard user land code.
> 
> Signed-off-by: Wolfgang Denk <wd@denx.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Scott Wood <scottwood@freescale.com>
> ---
> I am aware that the clean solution to the problem is to build user
> space with compiler options that match the target architecture.
> However, sometimes this is just too much effort.
> 
> Also, of course the performance of such an emulation sucks. But the
> the occurrence of such instructions is so rare that no significant
> slowdown can be oserved.
> 
> I'm not sure if this should / could go into mainline.  I'm posting it
> primarily so it can be found should anybody else need this.
> - wd
> 
>  arch/powerpc/kernel/traps.c | 7 +++++++
>  1 file changed, 7 insertions(+)

There's already been a patch posted for this:
http://patchwork.ozlabs.org/patch/256747/

I plan to apply it for my next pull request.

-Scott

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-18 16:38 ` Scott Wood
@ 2013-10-18 18:50   ` Wolfgang Denk
  0 siblings, 0 replies; 21+ messages in thread
From: Wolfgang Denk @ 2013-10-18 18:50 UTC (permalink / raw
  To: Scott Wood; +Cc: linuxppc-dev

Dear Scott,

In message <1382114321.7979.840.camel@snotra.buserror.net> you wrote:
>
> There's already been a patch posted for this:
> http://patchwork.ozlabs.org/patch/256747/
> 
> I plan to apply it for my next pull request.

Ah, cool.  Thanks!

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
"It is better to have tried and failed than to have  failed  to  try,
but the result's the same."                           - Mike Dennison

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-18  7:38 [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores Wolfgang Denk
  2013-10-18 16:38 ` Scott Wood
@ 2013-10-23  5:07 ` Kumar Gala
  2013-10-23 10:15   ` Scott Wood
  1 sibling, 1 reply; 21+ messages in thread
From: Kumar Gala @ 2013-10-23  5:07 UTC (permalink / raw
  To: Wolfgang Denk; +Cc: Scott Wood, linuxppc-dev


On Oct 18, 2013, at 2:38 AM, Wolfgang Denk wrote:

> Default Debian PowerPC doesn't work on e500 because the code contains
> "lwsync" instructions, which are unsupported on this core.  As a
> result, applications using this will crash with an "unhandled signal =
4"
> "Illegal instruction" error.
>=20
> As a work around we add code to emulate this insn.  This is expensive
> performance-wise, but allows to run standard user land code.
>=20
> Signed-off-by: Wolfgang Denk <wd@denx.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Scott Wood <scottwood@freescale.com>
> ---
> I am aware that the clean solution to the problem is to build user
> space with compiler options that match the target architecture.
> However, sometimes this is just too much effort.
>=20
> Also, of course the performance of such an emulation sucks. But the
> the occurrence of such instructions is so rare that no significant
> slowdown can be oserved.
>=20
> I'm not sure if this should / could go into mainline.  I'm posting it
> primarily so it can be found should anybody else need this.
> - wd
>=20
> arch/powerpc/kernel/traps.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>=20
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index f783c93..f330374 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -986,6 +986,13 @@ static int emulate_instruction(struct pt_regs =
*regs)
> 		return 0;
> 	}
>=20
> +	/* Emulating the lwsync insn as a sync insn */
> +	if (instword =3D=3D PPC_INST_LWSYNC) {
> +		PPC_WARN_EMULATED(lwsync, regs);
> +		asm volatile("sync" : : : "memory");

Do we really need the inline asm?  Doesn't the fact of just taking an =
exception and returning from it equate to a sync.

> +		return 0;
> +	}
> +
> 	/* Emulate the mcrxr insn.  */
> 	if ((instword & PPC_INST_MCRXR_MASK) =3D=3D PPC_INST_MCRXR) {
> 		int shift =3D (instword >> 21) & 0x1c;
> --=20
> 1.8.3.1
>=20
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-23  5:07 ` Kumar Gala
@ 2013-10-23 10:15   ` Scott Wood
  2013-10-24  4:06     ` Kumar Gala
  0 siblings, 1 reply; 21+ messages in thread
From: Scott Wood @ 2013-10-23 10:15 UTC (permalink / raw
  To: Kumar Gala; +Cc: linuxppc-dev, Wolfgang Denk

On Wed, 2013-10-23 at 00:07 -0500, Kumar Gala wrote:
> On Oct 18, 2013, at 2:38 AM, Wolfgang Denk wrote:
> > diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> > index f783c93..f330374 100644
> > --- a/arch/powerpc/kernel/traps.c
> > +++ b/arch/powerpc/kernel/traps.c
> > @@ -986,6 +986,13 @@ static int emulate_instruction(struct pt_regs *regs)
> > 		return 0;
> > 	}
> > 
> > +	/* Emulating the lwsync insn as a sync insn */
> > +	if (instword == PPC_INST_LWSYNC) {
> > +		PPC_WARN_EMULATED(lwsync, regs);
> > +		asm volatile("sync" : : : "memory");
> 
> Do we really need the inline asm?  Doesn't the fact of just taking an exception and returning from it equate to a sync.

No, it doesn't equate to a sync.  See the discussion here:
http://patchwork.ozlabs.org/patch/256747/

-Scott

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-23 10:15   ` Scott Wood
@ 2013-10-24  4:06     ` Kumar Gala
  2013-10-24  9:45       ` Benjamin Herrenschmidt
  2013-10-24 10:18       ` David Laight
  0 siblings, 2 replies; 21+ messages in thread
From: Kumar Gala @ 2013-10-24  4:06 UTC (permalink / raw
  To: Scott Wood; +Cc: linuxppc-dev, Wolfgang Denk


On Oct 23, 2013, at 5:15 AM, Scott Wood wrote:

> On Wed, 2013-10-23 at 00:07 -0500, Kumar Gala wrote:
>> On Oct 18, 2013, at 2:38 AM, Wolfgang Denk wrote:
>>> diff --git a/arch/powerpc/kernel/traps.c =
b/arch/powerpc/kernel/traps.c
>>> index f783c93..f330374 100644
>>> --- a/arch/powerpc/kernel/traps.c
>>> +++ b/arch/powerpc/kernel/traps.c
>>> @@ -986,6 +986,13 @@ static int emulate_instruction(struct pt_regs =
*regs)
>>> 		return 0;
>>> 	}
>>>=20
>>> +	/* Emulating the lwsync insn as a sync insn */
>>> +	if (instword =3D=3D PPC_INST_LWSYNC) {
>>> +		PPC_WARN_EMULATED(lwsync, regs);
>>> +		asm volatile("sync" : : : "memory");
>>=20
>> Do we really need the inline asm?  Doesn't the fact of just taking an =
exception and returning from it equate to a sync.
>=20
> No, it doesn't equate to a sync.  See the discussion here:
> http://patchwork.ozlabs.org/patch/256747/
>=20

Thanks.=20

I'm not sure I'm a fan of doing this as it silently hides a significant =
performance impact.

Could we possible re-write the userspace instruction to be a 'sync' when =
we hit this?

- k

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-24  4:06     ` Kumar Gala
@ 2013-10-24  9:45       ` Benjamin Herrenschmidt
  2013-10-24  9:55         ` Kumar Gala
  2013-10-24 10:18       ` David Laight
  1 sibling, 1 reply; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2013-10-24  9:45 UTC (permalink / raw
  To: Kumar Gala; +Cc: Scott Wood, linuxppc-dev, Wolfgang Denk

On Wed, 2013-10-23 at 23:06 -0500, Kumar Gala wrote:
> On Oct 23, 2013, at 5:15 AM, Scott Wood wrote:
> 
> > On Wed, 2013-10-23 at 00:07 -0500, Kumar Gala wrote:
> >> On Oct 18, 2013, at 2:38 AM, Wolfgang Denk wrote:
> >>> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> >>> index f783c93..f330374 100644
> >>> --- a/arch/powerpc/kernel/traps.c
> >>> +++ b/arch/powerpc/kernel/traps.c
> >>> @@ -986,6 +986,13 @@ static int emulate_instruction(struct pt_regs *regs)
> >>> 		return 0;
> >>> 	}
> >>> 
> >>> +	/* Emulating the lwsync insn as a sync insn */
> >>> +	if (instword == PPC_INST_LWSYNC) {
> >>> +		PPC_WARN_EMULATED(lwsync, regs);
> >>> +		asm volatile("sync" : : : "memory");
> >> 
> >> Do we really need the inline asm?  Doesn't the fact of just taking an exception and returning from it equate to a sync.
> > 
> > No, it doesn't equate to a sync.  See the discussion here:
> > http://patchwork.ozlabs.org/patch/256747/
> > 
> 
> Thanks. 
> 
> I'm not sure I'm a fan of doing this as it silently hides a significant performance impact.
> 
> Could we possible re-write the userspace instruction to be a 'sync' when we hit this?

Rewriting user space is a can of worms I wouldn't get into ... is any
other arch doing it ?

I'm not too worried as long as we warn and account them.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-24  9:45       ` Benjamin Herrenschmidt
@ 2013-10-24  9:55         ` Kumar Gala
  2013-10-24 21:05           ` James Yang
                             ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Kumar Gala @ 2013-10-24  9:55 UTC (permalink / raw
  To: Benjamin Herrenschmidt; +Cc: Scott Wood, linuxppc-dev, Wolfgang Denk


On Oct 24, 2013, at 4:45 AM, Benjamin Herrenschmidt wrote:

> On Wed, 2013-10-23 at 23:06 -0500, Kumar Gala wrote:
>> On Oct 23, 2013, at 5:15 AM, Scott Wood wrote:
>>=20
>>> On Wed, 2013-10-23 at 00:07 -0500, Kumar Gala wrote:
>>>> On Oct 18, 2013, at 2:38 AM, Wolfgang Denk wrote:
>>>>> diff --git a/arch/powerpc/kernel/traps.c =
b/arch/powerpc/kernel/traps.c
>>>>> index f783c93..f330374 100644
>>>>> --- a/arch/powerpc/kernel/traps.c
>>>>> +++ b/arch/powerpc/kernel/traps.c
>>>>> @@ -986,6 +986,13 @@ static int emulate_instruction(struct pt_regs =
*regs)
>>>>> 		return 0;
>>>>> 	}
>>>>>=20
>>>>> +	/* Emulating the lwsync insn as a sync insn */
>>>>> +	if (instword =3D=3D PPC_INST_LWSYNC) {
>>>>> +		PPC_WARN_EMULATED(lwsync, regs);
>>>>> +		asm volatile("sync" : : : "memory");
>>>>=20
>>>> Do we really need the inline asm?  Doesn't the fact of just taking =
an exception and returning from it equate to a sync.
>>>=20
>>> No, it doesn't equate to a sync.  See the discussion here:
>>> http://patchwork.ozlabs.org/patch/256747/
>>>=20
>>=20
>> Thanks.=20
>>=20
>> I'm not sure I'm a fan of doing this as it silently hides a =
significant performance impact.
>>=20
>> Could we possible re-write the userspace instruction to be a 'sync' =
when we hit this?
>=20
> Rewriting user space is a can of worms I wouldn't get into ... is any
> other arch doing it ?

Fair enough
>=20
> I'm not too worried as long as we warn and account them.

Than, I'd ask this be under a Kconfig option that is disabled by =
default.  Users should have to explicitly enable this so they know what =
they are doing.

- k

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-24  4:06     ` Kumar Gala
  2013-10-24  9:45       ` Benjamin Herrenschmidt
@ 2013-10-24 10:18       ` David Laight
  1 sibling, 0 replies; 21+ messages in thread
From: David Laight @ 2013-10-24 10:18 UTC (permalink / raw
  To: Kumar Gala, Scott Wood; +Cc: linuxppc-dev, Wolfgang Denk

> Subject: Re: [PATCH] [RFC] Emulate "lwsync" to run standard user land =
on e500 cores
>=20
>=20
> On Oct 23, 2013, at 5:15 AM, Scott Wood wrote:
>=20
> > On Wed, 2013-10-23 at 00:07 -0500, Kumar Gala wrote:
> >> On Oct 18, 2013, at 2:38 AM, Wolfgang Denk wrote:
> >>> diff --git a/arch/powerpc/kernel/traps.c =
b/arch/powerpc/kernel/traps.c
> >>> index f783c93..f330374 100644
> >>> --- a/arch/powerpc/kernel/traps.c
> >>> +++ b/arch/powerpc/kernel/traps.c
> >>> @@ -986,6 +986,13 @@ static int emulate_instruction(struct pt_regs =
*regs)
> >>> 		return 0;
> >>> 	}
> >>>
> >>> +	/* Emulating the lwsync insn as a sync insn */
> >>> +	if (instword =3D=3D PPC_INST_LWSYNC) {
> >>> +		PPC_WARN_EMULATED(lwsync, regs);
> >>> +		asm volatile("sync" : : : "memory");
...
> I'm not sure I'm a fan of doing this as it silently hides a =
significant performance impact.
>=20
> Could we possible re-write the userspace instruction to be a 'sync' =
when we hit this?

You'd need to modify the memory without marking the page dirty.
(Which might be interesting for other cpu-dependant optimisations.)

	David

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-24  9:55         ` Kumar Gala
@ 2013-10-24 21:05           ` James Yang
  2013-10-25  4:12             ` Kumar Gala
  2013-10-25 10:36           ` Scott Wood
  2013-10-27 10:25           ` Wolfgang Denk
  2 siblings, 1 reply; 21+ messages in thread
From: James Yang @ 2013-10-24 21:05 UTC (permalink / raw
  To: Kumar Gala; +Cc: linuxppc-dev

On Thu, 24 Oct 2013, Kumar Gala wrote:

> On Oct 24, 2013, at 4:45 AM, Benjamin Herrenschmidt wrote:
> 
> > On Wed, 2013-10-23 at 23:06 -0500, Kumar Gala wrote:
> >> On Oct 23, 2013, at 5:15 AM, Scott Wood wrote:
> >> 
> >>> On Wed, 2013-10-23 at 00:07 -0500, Kumar Gala wrote:
> >>>> On Oct 18, 2013, at 2:38 AM, Wolfgang Denk wrote:
> >>>>> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> >>>>> index f783c93..f330374 100644
> >>>>> --- a/arch/powerpc/kernel/traps.c
> >>>>> +++ b/arch/powerpc/kernel/traps.c
> >>>>> @@ -986,6 +986,13 @@ static int emulate_instruction(struct pt_regs *regs)
> >>>>> 		return 0;
> >>>>> 	}
> >>>>> 
> >>>>> +	/* Emulating the lwsync insn as a sync insn */
> >>>>> +	if (instword == PPC_INST_LWSYNC) {
> >>>>> +		PPC_WARN_EMULATED(lwsync, regs);
> >>>>> +		asm volatile("sync" : : : "memory");
> >>>> 
> >>>> Do we really need the inline asm?  Doesn't the fact of just taking an exception and returning from it equate to a sync.
> >>> 
> >>> No, it doesn't equate to a sync.  See the discussion here:
> >>> http://patchwork.ozlabs.org/patch/256747/
> >>> 
> >> 
> >> Thanks. 
> >> 
> >> I'm not sure I'm a fan of doing this as it silently hides a 
> >> significant performance impact.
> >> 
> >> Could we possible re-write the userspace instruction to be a 
> >> 'sync' when we hit this?
> > 
> > Rewriting user space is a can of worms I wouldn't get into ... is 
> > any other arch doing it ?
> 
> Fair enough
> > 
> > I'm not too worried as long as we warn and account them.
> 
> Than, I'd ask this be under a Kconfig option that is disabled by 
> default.  Users should have to explicitly enable this so they know 
> what they are doing.


I think it should be enabled by default, rather than disabled, so that 
users would actually see a warning rather than get a sig 4.  Or, let 
it not be Kconfig-able so that this doesn't become a problem any more. 
(It's been 4 years since I sent to you an earlier version of this 
patch.)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-24 21:05           ` James Yang
@ 2013-10-25  4:12             ` Kumar Gala
  2013-10-25  4:49               ` Yang James-RA8135
  0 siblings, 1 reply; 21+ messages in thread
From: Kumar Gala @ 2013-10-25  4:12 UTC (permalink / raw
  To: James Yang; +Cc: linuxppc-dev


On Oct 24, 2013, at 4:05 PM, James Yang wrote:

> On Thu, 24 Oct 2013, Kumar Gala wrote:
>=20
>> On Oct 24, 2013, at 4:45 AM, Benjamin Herrenschmidt wrote:
>>=20
>>> On Wed, 2013-10-23 at 23:06 -0500, Kumar Gala wrote:
>>>> On Oct 23, 2013, at 5:15 AM, Scott Wood wrote:
>>>>=20
>>>>> On Wed, 2013-10-23 at 00:07 -0500, Kumar Gala wrote:
>>>>>> On Oct 18, 2013, at 2:38 AM, Wolfgang Denk wrote:
>>>>>>> diff --git a/arch/powerpc/kernel/traps.c =
b/arch/powerpc/kernel/traps.c
>>>>>>> index f783c93..f330374 100644
>>>>>>> --- a/arch/powerpc/kernel/traps.c
>>>>>>> +++ b/arch/powerpc/kernel/traps.c
>>>>>>> @@ -986,6 +986,13 @@ static int emulate_instruction(struct =
pt_regs *regs)
>>>>>>> 		return 0;
>>>>>>> 	}
>>>>>>>=20
>>>>>>> +	/* Emulating the lwsync insn as a sync insn */
>>>>>>> +	if (instword =3D=3D PPC_INST_LWSYNC) {
>>>>>>> +		PPC_WARN_EMULATED(lwsync, regs);
>>>>>>> +		asm volatile("sync" : : : "memory");
>>>>>>=20
>>>>>> Do we really need the inline asm?  Doesn't the fact of just =
taking an exception and returning from it equate to a sync.
>>>>>=20
>>>>> No, it doesn't equate to a sync.  See the discussion here:
>>>>> http://patchwork.ozlabs.org/patch/256747/
>>>>>=20
>>>>=20
>>>> Thanks.=20
>>>>=20
>>>> I'm not sure I'm a fan of doing this as it silently hides a=20
>>>> significant performance impact.
>>>>=20
>>>> Could we possible re-write the userspace instruction to be a=20
>>>> 'sync' when we hit this?
>>>=20
>>> Rewriting user space is a can of worms I wouldn't get into ... is=20
>>> any other arch doing it ?
>>=20
>> Fair enough
>>>=20
>>> I'm not too worried as long as we warn and account them.
>>=20
>> Than, I'd ask this be under a Kconfig option that is disabled by=20
>> default.  Users should have to explicitly enable this so they know=20
>> what they are doing.
>=20
>=20
> I think it should be enabled by default, rather than disabled, so that=20=

> users would actually see a warning rather than get a sig 4.  Or, let=20=

> it not be Kconfig-able so that this doesn't become a problem any more.=20=

> (It's been 4 years since I sent to you an earlier version of this=20
> patch.)

And clearly most users don't seem terrible annoyed enough about this =
issue to have concerns.  I don't see why making it a Kconfig option =
impacts the small handful of people that happen to try and run a more =
generic distro on e500 cores.

- k=

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-25  4:12             ` Kumar Gala
@ 2013-10-25  4:49               ` Yang James-RA8135
  2013-10-25  9:58                 ` David Laight
  0 siblings, 1 reply; 21+ messages in thread
From: Yang James-RA8135 @ 2013-10-25  4:49 UTC (permalink / raw
  To: Kumar Gala; +Cc: linuxppc-dev@lists.ozlabs.org

Sorry for the formatting.  I'm not at my usual mailer.=0A=
=0A=
From: Kumar Gala:=0A=
>On Oct 24, 2013, at 4:05 PM, James Yang wrote:=0A=
>> On Thu, 24 Oct 2013, Kumar Gala wrote:=0A=
>>> Than, I'd ask this be under a Kconfig option that is disabled by=0A=
>>> default.  Users should have to explicitly enable this so they know=0A=
>>> what they are doing.=0A=
>>=0A=
>>=0A=
>> I think it should be enabled by default, rather than disabled, so that=
=0A=
>> users would actually see a warning rather than get a sig 4.  Or, let=0A=
>> it not be Kconfig-able so that this doesn't become a problem any more.=
=0A=
>> (It's been 4 years since I sent to you an earlier version of this=0A=
>> patch.)=0A=
>=0A=
>And clearly most users don't seem terrible annoyed enough about =0A=
>this issue to have concerns.  I don't see why making it a Kconfig =0A=
>option impacts the small handful of people that happen to try =0A=
> and run a more generic distro on e500 cores.=0A=
=0A=
=0A=
I would have to dispute that qualification of "most".  =0A=
=0A=
This is not a distro issue. It's a libstdc++ portability issue. libstdc++ =
=0A=
hardcodes lwsync unless __NO_LWSYNC__ is explicitly defined, =0A=
which you only get with -mcpu=3D8540/-mcpu=3D8548.  When compiled =0A=
for any powerpc target other than -mcpu=3D8540/-mcpu=3D8548, including=0A=
the default -mcpu=3Dcommon,  libstdc++ will end up containing lwsync.  =0A=
There is no way to explicitly request libstdc++ to be built without lwsync=
=0A=
with an -mcpu target other than 8540/8548.=0A=
=0A=
The issue is easily demonstrated by running a program that throws a =0A=
C++ exception: __cxa_throw() is called, which has an lwsync.  This=0A=
results in an illegal instruction exception when run on an e500v1/e500v2.=
=0A=
=0A=
Those users who insist on using a generic target for their compiler=0A=
will hit this problem, in particular, those who need to generate =0A=
binary code that targets a wide range of powerpc targets, such as=0A=
generic distro.  It's unreasonable to require a user to recompile =0A=
libstdc++ just to get functionality for a C++ program, since they would=0A=
have to provide two libstdc++.so along with some dynamic-linking hacks to=
=0A=
determine at runtime which one should be used.  Emulating lwsync =0A=
does not prevent an e500v1/e500v2-targeted libstdc++ from providing=0A=
the optimal performance.=0A=
=0A=
=0A=
citation:=0A=
=0A=
> > lwsync is embedded into the library unless __NO_LWSYNC__ is defined:=0A=
> > http://gcc.gnu.org/viewcvs/gcc/trunk/libstdc%2B%2B-v3/config/cpu/powerp=
c/atomic_word.h?revision=3D195701&view=3Dmarkup#l30=0A=
> > -----------------------------------------------=0A=
> > #define _GLIBCXX_READ_MEM_BARRIER __asm __volatile=0A=
> > ("isync":::"memory")=0A=
> > #ifdef __NO_LWSYNC__=0A=
> > #define _GLIBCXX_WRITE_MEM_BARRIER __asm __volatile=0A=
> > ("sync":::"memory")=0A=
> > #else=0A=
> > #define _GLIBCXX_WRITE_MEM_BARRIER __asm __volatile=0A=
> > ("lwsync":::"memory")=0A=
> > #endif=0A=
> > -----------------------------------------------=0A=
> > =0A=
> > =0A=
=0A=

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-25  4:49               ` Yang James-RA8135
@ 2013-10-25  9:58                 ` David Laight
  2013-10-25 13:02                   ` Benjamin Herrenschmidt
  2013-10-25 15:13                   ` James Yang
  0 siblings, 2 replies; 21+ messages in thread
From: David Laight @ 2013-10-25  9:58 UTC (permalink / raw
  To: Yang James-RA8135, Kumar Gala; +Cc: linuxppc-dev

> This is not a distro issue. It's a libstdc++ portability issue. =
libstdc++
> hardcodes lwsync unless __NO_LWSYNC__ is explicitly defined,
> which you only get with -mcpu=3D8540/-mcpu=3D8548.  When compiled
> for any powerpc target other than -mcpu=3D8540/-mcpu=3D8548, including
> the default -mcpu=3Dcommon,  libstdc++ will end up containing lwsync.
> There is no way to explicitly request libstdc++ to be built without =
lwsync
> with an -mcpu target other than 8540/8548.
>=20
> The issue is easily demonstrated by running a program that throws a
> C++ exception: __cxa_throw() is called, which has an lwsync.  This
> results in an illegal instruction exception when run on an =
e500v1/e500v2.

Perhaps libstc++ should be working out at run time whether lwsync is =
valid?

	David

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-24  9:55         ` Kumar Gala
  2013-10-24 21:05           ` James Yang
@ 2013-10-25 10:36           ` Scott Wood
  2013-10-25 15:25             ` James Yang
  2013-10-27 10:29             ` Wolfgang Denk
  2013-10-27 10:25           ` Wolfgang Denk
  2 siblings, 2 replies; 21+ messages in thread
From: Scott Wood @ 2013-10-25 10:36 UTC (permalink / raw
  To: Kumar Gala; +Cc: linuxppc-dev, Wolfgang Denk

On Thu, 2013-10-24 at 04:55 -0500, Kumar Gala wrote:
> On Oct 24, 2013, at 4:45 AM, Benjamin Herrenschmidt wrote:
> 
> > On Wed, 2013-10-23 at 23:06 -0500, Kumar Gala wrote:
> >> On Oct 23, 2013, at 5:15 AM, Scott Wood wrote:
> >> 
> >>> On Wed, 2013-10-23 at 00:07 -0500, Kumar Gala wrote:
> >>>> On Oct 18, 2013, at 2:38 AM, Wolfgang Denk wrote:
> >>>>> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> >>>>> index f783c93..f330374 100644
> >>>>> --- a/arch/powerpc/kernel/traps.c
> >>>>> +++ b/arch/powerpc/kernel/traps.c
> >>>>> @@ -986,6 +986,13 @@ static int emulate_instruction(struct pt_regs *regs)
> >>>>> 		return 0;
> >>>>> 	}
> >>>>> 
> >>>>> +	/* Emulating the lwsync insn as a sync insn */
> >>>>> +	if (instword == PPC_INST_LWSYNC) {
> >>>>> +		PPC_WARN_EMULATED(lwsync, regs);
> >>>>> +		asm volatile("sync" : : : "memory");
> >>>> 
> >>>> Do we really need the inline asm?  Doesn't the fact of just taking an exception and returning from it equate to a sync.
> >>> 
> >>> No, it doesn't equate to a sync.  See the discussion here:
> >>> http://patchwork.ozlabs.org/patch/256747/
> >>> 
> >> 
> >> Thanks. 
> >> 
> >> I'm not sure I'm a fan of doing this as it silently hides a significant performance impact.
> >> 
> >> Could we possible re-write the userspace instruction to be a 'sync' when we hit this?
> > 
> > Rewriting user space is a can of worms I wouldn't get into ... is any
> > other arch doing it ?
> 
> Fair enough
> > 
> > I'm not too worried as long as we warn and account them.
> 
> Than, I'd ask this be under a Kconfig option that is disabled by
> default.  Users should have to explicitly enable this so they know what
> they are doing.

Why should this be any different than the other emulated instructions,
which are generally either not kconfigized, or on by default in some
configs (like fp emu)?

Making sure users are aware of this is what PPC_WARN_EMULATED is for.

Has anyone measured how much this slows things down with a typical
userspace?

-Scott

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-25  9:58                 ` David Laight
@ 2013-10-25 13:02                   ` Benjamin Herrenschmidt
  2013-10-26  7:26                     ` Kumar Gala
  2013-10-25 15:13                   ` James Yang
  1 sibling, 1 reply; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2013-10-25 13:02 UTC (permalink / raw
  To: David Laight; +Cc: Yang James-RA8135, linuxppc-dev

On Fri, 2013-10-25 at 10:58 +0100, David Laight wrote:
> > This is not a distro issue. It's a libstdc++ portability issue. libstdc++
> > hardcodes lwsync unless __NO_LWSYNC__ is explicitly defined,
> > which you only get with -mcpu=8540/-mcpu=8548.  When compiled
> > for any powerpc target other than -mcpu=8540/-mcpu=8548, including
> > the default -mcpu=common,  libstdc++ will end up containing lwsync.
> > There is no way to explicitly request libstdc++ to be built without lwsync
> > with an -mcpu target other than 8540/8548.
> > 
> > The issue is easily demonstrated by running a program that throws a
> > C++ exception: __cxa_throw() is called, which has an lwsync.  This
> > results in an illegal instruction exception when run on an e500v1/e500v2.
> 
> Perhaps libstc++ should be working out at run time whether lwsync is valid?

Do we have enough coats of paint on this bike shed yet ? :-)

I'm personally tempted to take Scott's approach since that's what we do
for other things as well, it just works and is simple.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-25  9:58                 ` David Laight
  2013-10-25 13:02                   ` Benjamin Herrenschmidt
@ 2013-10-25 15:13                   ` James Yang
  1 sibling, 0 replies; 21+ messages in thread
From: James Yang @ 2013-10-25 15:13 UTC (permalink / raw
  To: David Laight; +Cc: linuxppc-dev

On Fri, 25 Oct 2013, David Laight wrote:

> > This is not a distro issue. It's a libstdc++ portability issue. libstdc++
> > hardcodes lwsync unless __NO_LWSYNC__ is explicitly defined,
> > which you only get with -mcpu=8540/-mcpu=8548.  When compiled
> > for any powerpc target other than -mcpu=8540/-mcpu=8548, including
> > the default -mcpu=common,  libstdc++ will end up containing lwsync.
> > There is no way to explicitly request libstdc++ to be built without lwsync
> > with an -mcpu target other than 8540/8548.
> > 
> > The issue is easily demonstrated by running a program that throws a
> > C++ exception: __cxa_throw() is called, which has an lwsync.  This
> > results in an illegal instruction exception when run on an e500v1/e500v2.
> 
> Perhaps libstc++ should be working out at run time whether lwsync is 
> valid?

lwsync has been in libstdc++ for over 8 years so there's a substantial 
amount of legacy binary code that exists such that changing libstdc++ 
now won't solve the problem.

This isn't a performance issue, it's a functional issue -- libstdc++ 
for -mcpu=common targets doesn't work on e500v1/e500v2. (QorIQ P4080 
that have e500mc and newer cores support lwsync so this is no longer 
an issue.)

If one /really/ cared about losing performance while running common 
target binaries, emulating lwsync is an inconsequential performance 
loss compared to the kernel doing floating point emulation.

Recompiling libstdc++ (and all your other userland) with -mcpu=8548 is 
what you'd have to do to avoid classic FP and use Embedded FP or SPE, 
and if you are doing that you'll get sync instead of lwsync in 
libstdc++.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-25 10:36           ` Scott Wood
@ 2013-10-25 15:25             ` James Yang
  2013-10-28 17:52               ` Scott Wood
  2013-10-27 10:29             ` Wolfgang Denk
  1 sibling, 1 reply; 21+ messages in thread
From: James Yang @ 2013-10-25 15:25 UTC (permalink / raw
  To: Scott Wood; +Cc: linuxppc-dev, Wolfgang Denk

On Fri, 25 Oct 2013, Scott Wood wrote:

> Has anyone measured how much this slows things down with a typical
> userspace?

Not a measurement of the patch in question but an older similar patch 
on 3.0.51 (8572 running Debian 5.0.3):

$ ./test-lwsync 
cycles per emulated lwsync = 371
cycles per sync = 36

----------------------------------------------------------
#include <stdio.h>

int main (void) {
        unsigned long atb_start, atb_stop;
        unsigned long i;

        asm volatile ("mfspr %0,526" : "=r" (atb_start));
        for (i = 0; i < 100; i++) {
                asm volatile ("lwsync");
        }
        asm volatile ("mfspr %0,526" : "=r" (atb_stop));
        printf("cycles per emulated lwsync = %lu\n", (atb_stop - 
atb_start) / 100);

        asm volatile ("mfspr %0,526" : "=r" (atb_start));
        for (i = 0; i < 100; i++) {
                asm volatile ("sync");
        }
        asm volatile ("mfspr %0,526" : "=r" (atb_stop));

        printf("cycles per sync = %lu\n", (atb_stop - atb_start) / 
100);

        return 0;
}
----------------------------------------------------------

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-25 13:02                   ` Benjamin Herrenschmidt
@ 2013-10-26  7:26                     ` Kumar Gala
  0 siblings, 0 replies; 21+ messages in thread
From: Kumar Gala @ 2013-10-26  7:26 UTC (permalink / raw
  To: Benjamin Herrenschmidt; +Cc: Yang James-RA8135, David Laight, linuxppc-dev


On Oct 25, 2013, at 8:02 AM, Benjamin Herrenschmidt wrote:

> On Fri, 2013-10-25 at 10:58 +0100, David Laight wrote:
>>> This is not a distro issue. It's a libstdc++ portability issue. =
libstdc++
>>> hardcodes lwsync unless __NO_LWSYNC__ is explicitly defined,
>>> which you only get with -mcpu=3D8540/-mcpu=3D8548.  When compiled
>>> for any powerpc target other than -mcpu=3D8540/-mcpu=3D8548, =
including
>>> the default -mcpu=3Dcommon,  libstdc++ will end up containing =
lwsync.
>>> There is no way to explicitly request libstdc++ to be built without =
lwsync
>>> with an -mcpu target other than 8540/8548.
>>>=20
>>> The issue is easily demonstrated by running a program that throws a
>>> C++ exception: __cxa_throw() is called, which has an lwsync.  This
>>> results in an illegal instruction exception when run on an =
e500v1/e500v2.
>>=20
>> Perhaps libstc++ should be working out at run time whether lwsync is =
valid?
>=20
> Do we have enough coats of paint on this bike shed yet ? :-)
>=20
> I'm personally tempted to take Scott's approach since that's what we =
do
> for other things as well, it just works and is simple.
>=20
> Cheers,
> Ben.
>=20

I give in, however some should test with CONFIG_PPC_EMULATED_STATS and =
fix what I'm guessing is a build breakage with either patch.

- k

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-24  9:55         ` Kumar Gala
  2013-10-24 21:05           ` James Yang
  2013-10-25 10:36           ` Scott Wood
@ 2013-10-27 10:25           ` Wolfgang Denk
  2 siblings, 0 replies; 21+ messages in thread
From: Wolfgang Denk @ 2013-10-27 10:25 UTC (permalink / raw
  To: Kumar Gala; +Cc: Scott Wood, linuxppc-dev

Dear Kumar,

In message <6BFC8EB0-1A75-41C3-985A-E3ED14846710@kernel.crashing.org> you wrote:
> 
> Fair enough
> > 
> > I'm not too worried as long as we warn and account them.
>
> Than, I'd ask this be under a Kconfig option that is disabled by
> default.  Users should have to explicitly enable this so they know what
> they are doing.

Is this really worth the effort?  Under normal situations (users are
using a user space environment that has been properly buiult for the
processor variant they are using) nobody should ever run into this
situation.  It happens only if you are already doing something wrong -
like using user space that has not been built for an E500 core on such
a machine.

In this situation, it seems more useful to me if a "standard" kernel
just works with a "standard" user space environment, even if this
includes some performance penalty - which actually should be
neglibale.  In my tests (when running standard Debian for PPC on a
E500 only very few programs actually triggered this situation, and
none of them in a time-critical way.  I doubt I would even be able to
measure the performance impact.

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
"...all the  good  computer  designs  are  bootlegged;  the  formally
planned  products,  if  they  are built at all, are dogs!" - David E.
Lundstrom, "A Few Good Men From Univac", MIT Press, 1987

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-25 10:36           ` Scott Wood
  2013-10-25 15:25             ` James Yang
@ 2013-10-27 10:29             ` Wolfgang Denk
  1 sibling, 0 replies; 21+ messages in thread
From: Wolfgang Denk @ 2013-10-27 10:29 UTC (permalink / raw
  To: Scott Wood; +Cc: linuxppc-dev

Dear Scott,

In message <1382697373.3926.36.camel@aoeu.buserror.net> you wrote:
>
> Has anyone measured how much this slows things down with a typical
> userspace?

In the applications I found to trigger this issue the number of traps
is so small that I can't even reliably measure any difference.

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
When it is incorrect, it is, at least *authoritatively* incorrect.
                                    - Hitchiker's Guide To The Galaxy

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores
  2013-10-25 15:25             ` James Yang
@ 2013-10-28 17:52               ` Scott Wood
  0 siblings, 0 replies; 21+ messages in thread
From: Scott Wood @ 2013-10-28 17:52 UTC (permalink / raw
  To: James Yang; +Cc: linuxppc-dev, Wolfgang Denk

On Fri, 2013-10-25 at 10:25 -0500, James Yang wrote:
> On Fri, 25 Oct 2013, Scott Wood wrote:
> 
> > Has anyone measured how much this slows things down with a typical
> > userspace?
> 
> Not a measurement of the patch in question but an older similar patch 
> on 3.0.51 (8572 running Debian 5.0.3):
> 
> $ ./test-lwsync 
> cycles per emulated lwsync = 371
> cycles per sync = 36

My point was more to find out how often a typical userspace executes
these instructions.  From Wolfgang's e-mail it seems the answer is not
very often.

-Scott

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2013-10-28 17:52 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-10-18  7:38 [PATCH] [RFC] Emulate "lwsync" to run standard user land on e500 cores Wolfgang Denk
2013-10-18 16:38 ` Scott Wood
2013-10-18 18:50   ` Wolfgang Denk
2013-10-23  5:07 ` Kumar Gala
2013-10-23 10:15   ` Scott Wood
2013-10-24  4:06     ` Kumar Gala
2013-10-24  9:45       ` Benjamin Herrenschmidt
2013-10-24  9:55         ` Kumar Gala
2013-10-24 21:05           ` James Yang
2013-10-25  4:12             ` Kumar Gala
2013-10-25  4:49               ` Yang James-RA8135
2013-10-25  9:58                 ` David Laight
2013-10-25 13:02                   ` Benjamin Herrenschmidt
2013-10-26  7:26                     ` Kumar Gala
2013-10-25 15:13                   ` James Yang
2013-10-25 10:36           ` Scott Wood
2013-10-25 15:25             ` James Yang
2013-10-28 17:52               ` Scott Wood
2013-10-27 10:29             ` Wolfgang Denk
2013-10-27 10:25           ` Wolfgang Denk
2013-10-24 10:18       ` David Laight

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.