LKML Archive mirror
* [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use
@ 2013-10-01 20:35 Helge Deller
  2013-10-01 20:43 ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Helge Deller @ 2013-10-01 20:35 UTC
  To: Tejun Heo, Libin, linux-kernel, linux-parisc, James Bottomley

print_worker_info() includes no validity check on the pwq and wq
pointers before handing them over to the probe_kernel_read() functions.

It seems that most architectures don't care about that, but at least on
the parisc architecture this leads to a kernel crash since accesses to
page zero are protected by the kernel for security reasons.

Fix this problem by verifying the contents of pwq and wq before usage.
Even if probe_kernel_read() usually prevents such crashes by disabling
page faults, clean code should always include such checks. 

Without this fix issuing "echo t > /proc/sysrq-trigger" will immediately
crash the Linux kernel on the parisc architecture.

CC: Tejun Heo <tj@kernel.org>
CC: Libin <huawei.libin@huawei.com>
CC: linux-parisc@vger.kernel.org
CC: James.Bottomley@HansenPartnership.com
Signed-off-by: Helge Deller <deller@gmx.de>

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 987293d..c03b47f 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4512,8 +4512,10 @@ void print_worker_info(const char *log_lvl, struct task_struct *task)
 	 */
 	probe_kernel_read(&fn, &worker->current_func, sizeof(fn));
 	probe_kernel_read(&pwq, &worker->current_pwq, sizeof(pwq));
-	probe_kernel_read(&wq, &pwq->wq, sizeof(wq));
-	probe_kernel_read(name, wq->name, sizeof(name) - 1);
+	if (pwq)
+		probe_kernel_read(&wq, &pwq->wq, sizeof(wq));
+	if (wq)
+		probe_kernel_read(name, wq->name, sizeof(name) - 1);
 
 	/* copy worker description */
 	probe_kernel_read(&desc_valid, &worker->desc_valid, sizeof(desc_valid));

* Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use
  2013-10-01 20:35 [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use Helge Deller
@ 2013-10-01 20:43 ` Tejun Heo
  2013-10-01 20:53   ` Helge Deller
  2013-10-01 21:40   ` James Bottomley
  0 siblings, 2 replies; 14+ messages in thread
From: Tejun Heo @ 2013-10-01 20:43 UTC
  To: Helge Deller; +Cc: Libin, linux-kernel, linux-parisc, James Bottomley

Hello,

On Tue, Oct 01, 2013 at 10:35:20PM +0200, Helge Deller wrote:
> print_worker_info() includes no validity check on the pwq and wq
> pointers before handing them over to the probe_kernel_read() functions.
> 
> It seems that most architectures don't care about that, but at least on
> the parisc architecture this leads to a kernel crash since accesses to
> page zero are protected by the kernel for security reasons.
> 
> Fix this problem by verifying the contents of pwq and wq before usage.
> Even if probe_kernel_read() usually prevents such crashes by disabling
> page faults, clean code should always include such checks. 
> 
> Without this fix issuing "echo t > /proc/sysrq-trigger" will immediately
> crash the Linux kernel on the parisc architecture.

Hmm... um had similar problem but the root cause here is that the arch
isn't implementing probe_kernel_read() properly.  We really have no
idea what the pointer value may be at the dump point and that's why we
use probe_kernel_read().  If something like the above is necessary for
the time being, the correct place would be the arch
probe_kernel_read() implementation.  James, would it be difficult
to implement a proper probe_kernel_read() on parisc?

Thanks.

-- 
tejun

* Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use
  2013-10-01 20:43 ` Tejun Heo
@ 2013-10-01 20:53   ` Helge Deller
  2013-10-01 21:03     ` Tejun Heo
  2013-10-01 21:40   ` James Bottomley
  1 sibling, 1 reply; 14+ messages in thread
From: Helge Deller @ 2013-10-01 20:53 UTC
  To: Tejun Heo; +Cc: Libin, linux-kernel, linux-parisc, James Bottomley

On 10/01/2013 10:43 PM, Tejun Heo wrote:
> Hello,
> 
> On Tue, Oct 01, 2013 at 10:35:20PM +0200, Helge Deller wrote:
>> print_worker_info() includes no validity check on the pwq and wq
>> pointers before handing them over to the probe_kernel_read() functions.
>>
>> It seems that most architectures don't care about that, but at least on
>> the parisc architecture this leads to a kernel crash since accesses to
>> page zero are protected by the kernel for security reasons.
>>
>> Fix this problem by verifying the contents of pwq and wq before usage.
>> Even if probe_kernel_read() usually prevents such crashes by disabling
>> page faults, clean code should always include such checks. 
>>
>> Without this fix issuing "echo t > /proc/sysrq-trigger" will immediately
>> crash the Linux kernel on the parisc architecture.
> 
> Hmm... um had similar problem but the root cause here is that the arch
> isn't implementing probe_kernel_read() properly.  We really have no
> idea what the pointer value may be at the dump point and that's why we
> use probe_kernel_read().  If something like the above is necessary for
> the time being, the correct place would be the arch
> probe_kernel_read() implementation.  James, would it be difficult
> to implement a proper probe_kernel_read() on parisc?

No, it's not really complicated.
That was my initial way to work around that problem.

But is this really necessary? Isn't a pointer which points to mem zero most
likely wrong on any architecture?

In addition I wrote another patch to work around that problem in the parisc
page fault handler (which is needed anyway) too:
https://patchwork.kernel.org/patch/2971701/

So, in summary my patch here is not really necessary, but for the sake of
clean code I think it doesn't hurt either and as such it would be nice if
you could apply it.

Helge

* Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use
  2013-10-01 20:53   ` Helge Deller
@ 2013-10-01 21:03     ` Tejun Heo
  2013-10-01 21:07       ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2013-10-01 21:03 UTC
  To: Helge Deller; +Cc: Libin, linux-kernel, linux-parisc, James Bottomley

On Tue, Oct 01, 2013 at 10:53:31PM +0200, Helge Deller wrote:
> So, in summary my patch here is not really necessary, but for the sake of
> clean code I think it doesn't hurt either and as such it would be nice if
> you could apply it.

What?  The function *must* take any value and try to access it and not
cause failure.  That's the *whole* purpose of that interface.  How is
having incomplete spurious checks around it "clean code" in any sense
of the word?  That doesn't make any sense.

 Nacked-by: Tejun Heo <tj@kernel.org>

and *please* don't add any checks like that anywhere else in the
kernel.

Thanks.

-- 
tejun

* Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use
  2013-10-01 21:03     ` Tejun Heo
@ 2013-10-01 21:07       ` Tejun Heo
  2013-10-01 22:34         ` Helge Deller
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2013-10-01 21:07 UTC
  To: Helge Deller; +Cc: Libin, linux-kernel, linux-parisc, James Bottomley

On Tue, Oct 01, 2013 at 05:03:48PM -0400, Tejun Heo wrote:
> On Tue, Oct 01, 2013 at 10:53:31PM +0200, Helge Deller wrote:
> > So, in summary my patch here is not really necessary, but for the sake of
> > clean code I think it doesn't hurt either and as such it would be nice if
> > you could apply it.
> 
> What?  The function *must* take any value and try to access it and not
> cause failure.  That's the *whole* purpose of that interface.  How is
> having incomplete spurious checks around it "clean code" in any sense
> of the word?  That doesn't make any sense.

Just in case you didn't know already.  probe_kernel_read()'s role is
to take any ulong value and dereference it if it can.  If not, it can
return any value, but it shouldn't crash in any case.  If you're just
adding NULL test in probe_kernel_read(), you're just masking a common
failure pattern and the kernel still *will* panic while dumping the
states.  If a specific arch doesn't have proper probe_kernel_read()
implementation, adding if (!NULL) test there could be a temporary
workaround, but it should be clearly marked as such.

-- 
tejun
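
The contract described above (take any address, copy from it if possible, never crash) can be sketched in userspace. This is only an analogue, not kernel code: probe_user_read(), fetch_wq_name(), struct fake_pwq and struct fake_wq are hypothetical names, and the sketch assumes Linux, where write(2) on a pipe fails with EFAULT for an unreadable source buffer instead of faulting the caller.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical userspace stand-in for probe_kernel_read(): copy size bytes
 * from src to dst and return a negative errno instead of faulting when src
 * is not readable.  write(2) makes the kernel validate src for us.
 */
static int probe_user_read(void *dst, const void *src, size_t size)
{
	int fds[2];
	ssize_t n;
	int ret = 0;

	if (size > 4096)		/* small enough not to block on an empty pipe */
		return -EINVAL;
	if (pipe(fds) < 0)
		return -errno;

	n = write(fds[1], src, size);
	if (n < 0)
		ret = -errno;		/* EFAULT for an unreadable src */
	else if ((size_t)n != size)
		ret = -EFAULT;		/* src became unreadable part-way */
	else if (read(fds[0], dst, size) != n)
		ret = -EIO;

	close(fds[0]);
	close(fds[1]);
	return ret;
}

/* Hypothetical miniature versions of the structures being chased. */
struct fake_wq  { char name[24]; };
struct fake_pwq { struct fake_wq *wq; };

/*
 * Chase slot -> pwq -> wq -> name without testing any pointer first.
 * Every hop may start from garbage; a failed hop just leaves its output
 * at the initial value, so name ends up empty.
 */
static void fetch_wq_name(const void *pwq_slot, char *name, size_t len)
{
	struct fake_pwq *pwq = NULL;
	struct fake_wq *wq = NULL;

	if (len == 0)
		return;
	memset(name, 0, len);

	probe_user_read(&pwq, pwq_slot, sizeof(pwq));
	probe_user_read(&wq,
			(const void *)((uintptr_t)pwq + offsetof(struct fake_pwq, wq)),
			sizeof(wq));
	probe_user_read(name,
			(const void *)((uintptr_t)wq + offsetof(struct fake_wq, name)),
			len - 1);
}
```

Called with a slot that holds NULL or a small bogus value such as 8, fetch_wq_name() leaves name empty and returns normally, with no if (pwq) or if (wq) test in the caller.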

* Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use
  2013-10-01 20:43 ` Tejun Heo
  2013-10-01 20:53   ` Helge Deller
@ 2013-10-01 21:40   ` James Bottomley
  2013-10-01 22:07     ` Helge Deller
  1 sibling, 1 reply; 14+ messages in thread
From: James Bottomley @ 2013-10-01 21:40 UTC
  To: Tejun Heo; +Cc: Helge Deller, Libin, linux-kernel, linux-parisc

On Tue, 2013-10-01 at 16:43 -0400, Tejun Heo wrote:
> Hello,
> 
> On Tue, Oct 01, 2013 at 10:35:20PM +0200, Helge Deller wrote:
> > print_worker_info() includes no validity check on the pwq and wq
> > pointers before handing them over to the probe_kernel_read() functions.
> > 
> > It seems that most architectures don't care about that, but at least on
> > the parisc architecture this leads to a kernel crash since accesses to
> > page zero are protected by the kernel for security reasons.
> > 
> > Fix this problem by verifying the contents of pwq and wq before usage.
> > Even if probe_kernel_read() usually prevents such crashes by disabling
> > page faults, clean code should always include such checks. 
> > 
> > Without this fix issuing "echo t > /proc/sysrq-trigger" will immediately
> > crash the Linux kernel on the parisc architecture.
> 
> Hmm... um had similar problem but the root cause here is that the arch
> isn't implementing probe_kernel_read() properly.  We really have no
> idea what the pointer value may be at the dump point and that's why we
> use probe_kernel_read().  If something like the above is necessary for
> the time being, the correct place would be the arch
> probe_kernel_read() implementation.  James, would it be difficult
> to implement a proper probe_kernel_read() on parisc?

The problem seems to be that some traps bypass our exception table
handling.  Helge, do you have the actual stack trace for this?  That
should show where the exception handling is missing.

Thanks,

James



* Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use
  2013-10-01 21:40   ` James Bottomley
@ 2013-10-01 22:07     ` Helge Deller
  2013-10-01 22:50       ` James Bottomley
  0 siblings, 1 reply; 14+ messages in thread
From: Helge Deller @ 2013-10-01 22:07 UTC
  To: James Bottomley; +Cc: Tejun Heo, Libin, linux-kernel, linux-parisc

On 10/01/2013 11:40 PM, James Bottomley wrote:
> On Tue, 2013-10-01 at 16:43 -0400, Tejun Heo wrote:
>> Hello,
>>
>> On Tue, Oct 01, 2013 at 10:35:20PM +0200, Helge Deller wrote:
>>> print_worker_info() includes no validity check on the pwq and wq
>>> pointers before handing them over to the probe_kernel_read() functions.
>>>
>>> It seems that most architectures don't care about that, but at least on
>>> the parisc architecture this leads to a kernel crash since accesses to
>>> page zero are protected by the kernel for security reasons.
>>>
>>> Fix this problem by verifying the contents of pwq and wq before usage.
>>> Even if probe_kernel_read() usually prevents such crashes by disabling
>>> page faults, clean code should always include such checks. 
>>>
>>> Without this fix issuing "echo t > /proc/sysrq-trigger" will immediately
>>> crash the Linux kernel on the parisc architecture.
>>
>> Hmm... um had similar problem but the root cause here is that the arch
>> isn't implementing probe_kernel_read() properly.  We really have no
>> idea what the pointer value may be at the dump point and that's why we
>> use probe_kernel_read().  If something like the above is necessary for
>> the time being, the correct place would be the arch
>> probe_kernel_read() implementation.  James, would it be difficult
>> to implement a proper probe_kernel_read() on parisc?
> 
> The problem seems to be that some traps bypass our exception table
> handling.  

Yes, that's correct.
It's trap #26 and we directly call parisc_terminate() for fault_space==0
without checking the exception table.
See my patch I posted a few hours ago which fixes this:
https://patchwork.kernel.org/patch/2971701/

> Helge, do you have the actual stack trace for this?  That
> should show where the exception handling is missing.

Here it is:
[47072.976000] ksoftirqd/0     R  running task        0     3      2 0x00000000
[47072.976000] Backtrace:
[47072.976000]  [<0000000040113a54>] __schedule+0x62c/0x808
[47072.976000]
[47072.976000] kworker/0:0H    S 00000000401040c0     0     5      2 0x00000000
[47073.468000] Backtrace:
[47073.468000]  [<0000000040464264>] pa_memcpy+0x44/0xb0
[47073.468000]  [<00000000404643e0>] __copy_from_user+0x60/0x90
[47073.468000]  [<00000000401d99bc>] __probe_kernel_read+0x54/0x90
[47073.468000]  [<000000004016cc70>] print_worker_info+0x158/0x2c0
[47073.468000]  [<0000000040185a60>] sched_show_task+0x1c8/0x210
[47073.468000]  [<0000000040185b64>] show_state_filter+0xbc/0x138
[47073.468000]  [<00000000404e85c4>] sysrq_handle_showstate+0x34/0x48
[47073.468000]  [<00000000404e9154>] __handle_sysrq+0x174/0x2f0
[47073.468000]  [<00000000404e933c>] write_sysrq_trigger+0x6c/0x90
[47073.468000]  [<00000000402ca2fc>] proc_reg_write+0xbc/0x130
[47073.468000]  [<0000000040236d44>] vfs_write+0x114/0x268
[47073.468000]  [<00000000402373a4>] SyS_write+0x94/0xf8
[47073.468000]  [<0000000040105fc0>] syscall_exit+0x0/0x14
[47073.468000]
[47073.468000]
[47073.468000] Kernel Fault: Code=26 regs=00000000958a09b0 (Addr=0000000000000008)
[47073.468000] CPU: 0 PID: 30189 Comm: bash Not tainted 3.12.0-rc3-64bit+ #1
[47073.468000] task: 000000007ba64100 ti: 00000000958a0000 task.ti: 00000000958a0000
[47073.468000]
[47073.468000]      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
[47073.468000] PSW: 00001000000001001111111100001110 Not tainted
[47073.468000] r00-03  000000ff0804ff0e 00000000958a08c0 0000000040464264 00000000958a0960
[47073.468000] r04-07  0000000040d73db0 0000000000000008 0000000000000008 00000000958a06f8
[47073.468000] r08-11  00000000958a0600 0000000040c49d18 00000000af535494 00000000958a0370
[47073.468000] r12-15  0000000000000000 0000000000000000 000000000010e7e8 00000000000fde28
[47073.468000] r16-19  0000000000000000 00000000000c7800 0000000000000000 0000000000000000
[47073.468000] r20-23  00000000958a06e0 0000000000000018 0000000000000018 0000000000000003
[47073.468000] r24-27  0000000000000008 0000000000000008 00000000958a06f8 0000000040d73db0
[47073.468000] r28-31  00000000958a06f8 00000000958a0930 00000000958a09b0 0000000000000008
[47073.468000] sr00-03  0000000005dc5000 0000000000000000 0000000000000000 0000000005dc5000
[47073.468000] sr04-07  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[47073.468000]
[47073.468000] IASQ: 0000000000000000 0000000000000000 IAOQ: 0000000040463fdc 0000000040463fe0
[47073.468000]  IIR: 0fe25033    ISR: 0000000000000000  IOR: 0000000000000008
[47073.468000]  CPU:        0   CR30: 00000000958a0000 CR31: 0000000011111111
[47073.468000]  ORIG_R28: 00000000958a0b40
[47073.468000]  IAOQ[0]: pa_memcpy_internal+0xec/0x2b4
[47073.468000]  IAOQ[1]: pa_memcpy_internal+0xf0/0x2b4
[47073.468000]  RP(r2): pa_memcpy+0x44/0xb0
[47073.468000] Backtrace:
[47073.468000]  [<0000000040464264>] pa_memcpy+0x44/0xb0
[47073.468000]  [<00000000404643e0>] __copy_from_user+0x60/0x90
[47073.468000]  [<00000000401d99bc>] __probe_kernel_read+0x54/0x90
[47073.468000]  [<000000004016cc70>] print_worker_info+0x158/0x2c0
[47073.468000]  [<0000000040185a60>] sched_show_task+0x1c8/0x210
[47073.468000]  [<0000000040185b64>] show_state_filter+0xbc/0x138
[47073.468000]  [<00000000404e85c4>] sysrq_handle_showstate+0x34/0x48
[47073.468000]  [<00000000404e9154>] __handle_sysrq+0x174/0x2f0
[47073.468000]  [<00000000404e933c>] write_sysrq_trigger+0x6c/0x90
[47073.468000]  [<00000000402ca2fc>] proc_reg_write+0xbc/0x130
[47073.468000]  [<0000000040236d44>] vfs_write+0x114/0x268
[47073.468000]  [<00000000402373a4>] SyS_write+0x94/0xf8
[47073.468000]  [<0000000040105fc0>] syscall_exit+0x0/0x14
[47073.468000]
[47073.468000] Kernel panic - not syncing: Kernel Fault


* Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use
  2013-10-01 21:07       ` Tejun Heo
@ 2013-10-01 22:34         ` Helge Deller
  2013-10-01 22:40           ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Helge Deller @ 2013-10-01 22:34 UTC
  To: Tejun Heo; +Cc: Libin, linux-kernel, linux-parisc, James Bottomley

On 10/01/2013 11:07 PM, Tejun Heo wrote:
> On Tue, Oct 01, 2013 at 05:03:48PM -0400, Tejun Heo wrote:
>> On Tue, Oct 01, 2013 at 10:53:31PM +0200, Helge Deller wrote:
>>> So, in summary my patch here is not really necessary, but for the sake of
>>> clean code I think it doesn't hurt either and as such it would be nice if
>>> you could apply it.
>>
>> What?  The function *must* take any value and try to access it and not
>> cause failure.  That's the *whole* purpose of that interface.  How is
>> having incomplete spurious checks around it "clean code" in any sense
>> of the word?  That doesn't make any sense.
> 
> Just in case you didn't know already.  probe_kernel_read()'s role is
> to take any ulong value and dereference it if it can.  If not, it can
> return any value, but it shouldn't crash in any case.  If you're just
> adding NULL test in probe_kernel_read(), you're just masking a common
> failure pattern and the kernel still *will* panic while dumping the
> states.  If a specific arch doesn't have proper probe_kernel_read()
> implementation, adding if (!NULL) test there could be a temporary
> workaround, but it should be clearly marked as such.

Sure, probe_kernel_read() takes care that no segfaults will happen.
Nevertheless, if we know that "pwq" might become NULL, why access pwq->wq at all?
  struct pool_workqueue *pwq = NULL;
  probe_kernel_read(&wq, &pwq->wq, sizeof(wq));

If you hadn't used probe_kernel_read() you would never have coded it
like that. That's what I meant when I wrote "clean coding" (aka "similar
to what you would have done without probe_kernel_read()").

Helge

* Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use
  2013-10-01 22:34         ` Helge Deller
@ 2013-10-01 22:40           ` Tejun Heo
  2013-10-01 22:47             ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2013-10-01 22:40 UTC
  To: Helge Deller; +Cc: Libin, linux-kernel, linux-parisc, James Bottomley

Hello,

On Wed, Oct 02, 2013 at 12:34:53AM +0200, Helge Deller wrote:
> Sure, probe_kernel_read() takes care that no segfaults will happen.
> Nevertheless, if we know that "pwq" might become NULL, why access pwq->wq at all?
>   struct pool_workqueue *pwq = NULL;
>   probe_kernel_read(&wq, &pwq->wq, sizeof(wq));
> 
> If you hadn't used probe_kernel_read() you would never have coded it
> like that. That's what I meant when I wrote "clean coding" (aka "similar
> to what you would have done without probe_kernel_read()").

Because it is using probe_kernel_read() and such test wouldn't mean
anything?  It may be NULL, it may be 1 or full Fs.  NULL is just one
of many illegal pointers which may happen.  Why add code which doesn't
achieve anything when you're explicitly trying to access pointers
which you know could be invalid?  Why is that "clean"?  Is "if (p)
kfree(p)" cleaner than "kfree(p)"?

Thanks.

-- 
tejun
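
The kfree() comparison above has a direct userspace counterpart: ISO C defines free(NULL) as a no-op, so a guard in front of it adds nothing. A minimal sketch, assuming a hypothetical struct buf and buf_release() helper that are not part of the thread:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical owner of an optional heap buffer. */
struct buf {
	char *data;	/* may legitimately be NULL */
	size_t len;
};

/*
 * Release the buffer.  No "if (b->data)" guard is needed: free(NULL)
 * is defined to do nothing, so the unconditional call covers both cases.
 */
static void buf_release(struct buf *b)
{
	free(b->data);
	b->data = NULL;
	b->len = 0;
}
```

The unguarded form is the one every caller can agree on, which is the point being made about probe_kernel_read(): the callee already handles the case the extra test would filter out.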

* Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use
  2013-10-01 22:40           ` Tejun Heo
@ 2013-10-01 22:47             ` Tejun Heo
  0 siblings, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2013-10-01 22:47 UTC
  To: Helge Deller; +Cc: Libin, linux-kernel, linux-parisc, James Bottomley

On Tue, Oct 01, 2013 at 06:40:23PM -0400, Tejun Heo wrote:
> Because it is using probe_kernel_read() and such test wouldn't mean
> anything?  It may be NULL, it may be 1 or full Fs.  NULL is just one
> of many illegal pointers which may happen.  Why add code which doesn't
> achieve anything when you're explicitly trying to access pointers
> which you know could be invalid?  Why is that "clean"?  Is "if (p)
> kfree(p)" cleaner than "kfree(p)"?

Here's one general rule of thumb for "cleanliness" - try to do the
minimal because that's something many people can agree on.  If people
do stuff which isn't necessary, naturally different people would have
different opinions on what's cleaner / better and inevitably end up
with different choices.  As the choices made are functionally
superfluous, none would fail and we'll end up with various variants for
the same thing for no good reason, which is messy.  Adding if (p) in
front of probe_kernel_read(p) is inherently superfluous and you wouldn't
have any way to enforce or even encourage such practice and the end
result would inevitably be if (p) being sprayed randomly, which is the
opposite of cleanliness.

So, no, please don't add random tests which aren't essential.  It is
an inherently messy thing to do.

Thanks.

-- 
tejun

* Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use
  2013-10-01 22:07     ` Helge Deller
@ 2013-10-01 22:50       ` James Bottomley
  2013-10-02  0:41         ` John David Anglin
                           ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: James Bottomley @ 2013-10-01 22:50 UTC
  To: Helge Deller; +Cc: Tejun Heo, Libin, linux-kernel, linux-parisc

On Wed, 2013-10-02 at 00:07 +0200, Helge Deller wrote:
> On 10/01/2013 11:40 PM, James Bottomley wrote:
> > On Tue, 2013-10-01 at 16:43 -0400, Tejun Heo wrote:
> >> Hello,
> >>
> >> On Tue, Oct 01, 2013 at 10:35:20PM +0200, Helge Deller wrote:
> >>> print_worker_info() includes no validity check on the pwq and wq
> >>> pointers before handing them over to the probe_kernel_read() functions.
> >>>
> >>> It seems that most architectures don't care about that, but at least on
> >>> the parisc architecture this leads to a kernel crash since accesses to
> >>> page zero are protected by the kernel for security reasons.
> >>>
> >>> Fix this problem by verifying the contents of pwq and wq before usage.
> >>> Even if probe_kernel_read() usually prevents such crashes by disabling
> >>> page faults, clean code should always include such checks. 
> >>>
> >>> Without this fix issuing "echo t > /proc/sysrq-trigger" will immediately
> >>> crash the Linux kernel on the parisc architecture.
> >>
> >> Hmm... um had similar problem but the root cause here is that the arch
> >> isn't implementing probe_kernel_read() properly.  We really have no
> >> idea what the pointer value may be at the dump point and that's why we
> >> use probe_kernel_read().  If something like the above is necessary for
> >> the time being, the correct place would be the arch
> >> probe_kernel_read() implementation.  James, would it be difficult
> >> to implement a proper probe_kernel_read() on parisc?
> > 
> > The problem seems to be that some traps bypass our exception table
> > handling.  
> 
> Yes, that's correct.
> It's trap #26 and we directly call parisc_terminate() for fault_space==0
> without checking the exception table.
> See my patch I posted a few hours ago which fixes this:
> https://patchwork.kernel.org/patch/2971701/

That doesn't quite look right ... I guessed it was probably access
rights, so we should do an exception table fixup, so isn't this the fix?
because we shouldn't call do_page_fault if there's no exception table.

James

---
diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c
index 04e47c6..25a088a 100644
--- a/arch/parisc/kernel/traps.c
+++ b/arch/parisc/kernel/traps.c
@@ -684,6 +684,8 @@ void notrace handle_interruption(int code, struct pt_regs *regs)
 		/* Fall Through */
 	case 26: 
 		/* PCXL: Data memory access rights trap */
+		if (!user_mode(regs) && fixup_exception(regs))
+			return;
 		fault_address = regs->ior;
 		fault_space   = regs->isr;
 		break;
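
The idea behind the fixup_exception() call in the patch above can be modelled in plain C. This is a simplified sketch under assumptions, not the parisc implementation: struct ex_entry, struct fake_regs, model_fixup_exception() and model_trap26() are hypothetical names, and the table is assumed to be sorted by faulting instruction address.

```c
#include <stddef.h>
#include <stdlib.h>

/* One exception table entry: "if the instruction at insn faults, resume at fixup". */
struct ex_entry {
	unsigned long insn;
	unsigned long fixup;
};

/* Hypothetical, heavily reduced register state. */
struct fake_regs {
	unsigned long pc;	/* address of the faulting instruction */
	int user_mode;		/* non-zero if the trap came from user space */
};

static int cmp_ex(const void *key, const void *elem)
{
	unsigned long pc = *(const unsigned long *)key;
	const struct ex_entry *e = elem;

	return (pc > e->insn) - (pc < e->insn);
}

/* Return 1 and redirect pc if the faulting instruction has a fixup, else 0. */
static int model_fixup_exception(struct fake_regs *regs,
				 const struct ex_entry *tbl, size_t n)
{
	const struct ex_entry *e;

	e = bsearch(&regs->pc, tbl, n, sizeof(*tbl), cmp_ex);
	if (!e)
		return 0;
	regs->pc = e->fixup;
	return 1;
}

/*
 * Model of the patched trap 26 path: a kernel-mode fault with a table
 * entry is resolved right away (return 1); anything else falls through
 * to the normal fault handling (return 0).
 */
static int model_trap26(struct fake_regs *regs,
			const struct ex_entry *tbl, size_t n)
{
	if (!regs->user_mode && model_fixup_exception(regs, tbl, n))
		return 1;
	return 0;
}
```

In this model a guarded copy routine that faults in kernel mode resumes at its fixup address, which is what lets the caller see an error return instead of a panic.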



* Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use
  2013-10-01 22:50       ` James Bottomley
@ 2013-10-02  0:41         ` John David Anglin
  2013-10-02  1:58         ` John David Anglin
  2013-10-02  8:28         ` Helge Deller
  2 siblings, 0 replies; 14+ messages in thread
From: John David Anglin @ 2013-10-02  0:41 UTC
  To: James Bottomley
  Cc: Helge Deller, Tejun Heo, Libin, linux-kernel, linux-parisc

[-- Attachment #1: Type: text/plain, Size: 3148 bytes --]

On 1-Oct-13, at 6:50 PM, James Bottomley wrote:

> On Wed, 2013-10-02 at 00:07 +0200, Helge Deller wrote:
>> On 10/01/2013 11:40 PM, James Bottomley wrote:
>>> On Tue, 2013-10-01 at 16:43 -0400, Tejun Heo wrote:
>>>> Hello,
>>>>
>>>> On Tue, Oct 01, 2013 at 10:35:20PM +0200, Helge Deller wrote:
>>>>> print_worker_info() includes no validity check on the pwq and wq
>>>>> pointers before handing them over to the probe_kernel_read() functions.
>>>>>
>>>>> It seems that most architectures don't care about that, but at least on
>>>>> the parisc architecture this leads to a kernel crash since accesses to
>>>>> page zero are protected by the kernel for security reasons.
>>>>>
>>>>> Fix this problem by verifying the contents of pwq and wq before usage.
>>>>> Even if probe_kernel_read() usually prevents such crashes by disabling
>>>>> page faults, clean code should always include such checks.
>>>>>
>>>>> Without this fix issuing "echo t > /proc/sysrq-trigger" will immediately
>>>>> crash the Linux kernel on the parisc architecture.
>>>>
>>>> Hmm... um had similar problem but the root cause here is that the arch
>>>> isn't implementing probe_kernel_read() properly.  We really have no
>>>> idea what the pointer value may be at the dump point and that's why we
>>>> use probe_kernel_read().  If something like the above is necessary for
>>>> the time being, the correct place would be the arch
>>>> probe_kernel_read() implementation.  James, would it be difficult
>>>> to implement a proper probe_kernel_read() on parisc?
>>>
>>> The problem seems to be that some traps bypass our exception table
>>> handling.
>>
>> Yes, that's correct.
>> It's trap #26 and we directly call parisc_terminate() for fault_space==0
>> without checking the exception table.
>> See my patch I posted a few hours ago which fixes this:
>> https://patchwork.kernel.org/patch/2971701/
>
> That doesn't quite look right ... I guessed it was probably access
> rights, so we should do an exception table fixup, so isn't this the fix?
> because we shouldn't call do_page_fault if there's no exception table.

What about trap #18?  It appears the same problem can occur on PCXS.

I have the strong feeling that __copy_from_user still won't be bullet  
proof.
See attached fault.  As far as I know, we don't have an OS HPMC handler.

>
> James
>
> ---
> diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c
> index 04e47c6..25a088a 100644
> --- a/arch/parisc/kernel/traps.c
> +++ b/arch/parisc/kernel/traps.c
> @@ -684,6 +684,8 @@ void notrace handle_interruption(int code, struct pt_regs *regs)
> 		/* Fall Through */
> 	case 26:
> 		/* PCXL: Data memory access rights trap */
> +		if (!user_mode(regs) && fixup_exception(regs))
> +			return;
> 		fault_address = regs->ior;
> 		fault_space   = regs->isr;
> 		break;

--
John David Anglin	dave.anglin@bell.net



[-- Attachment #2: hpmc-20130929.txt --]
[-- Type: text/plain, Size: 6356 bytes --]

Service Menu: Enter command > pim


PROCESSOR PIM INFORMATION

Original Product Number:  A7136A
Current Product Number:   A7136A


-----------------  Processor 0 HPMC Information - PDC Version: 46.34  ------ 

Timestamp =   Sun Sep  29 14:40:29 GMT 2013    (20:13:09:29:14:40:29)

HPMC Chassis Codes

       Chassis Code        Extension
       ------------        ---------
       0xe800035c00e00000 0x0000000000000000


General Registers 0 - 31
00-03  0000000000000000  00000000406143a0  0000000000000000  0000000000000000
04-07  0000000000000000  0000000000000000  0000000000000000  0000000000000000
08-11  000000000000001a  00047dbc422040a0  0000000000000000  0000000000000000
12-15  0000000000000000  0000000000000000  0000000000000000  0000000000000000
16-19  0000000000000000  00000000ffffffff  0000000000000000  0000000000000000
20-23  0000000000000000  0000000000000000  0000000000000000  0000000000000000
24-27  000000000000d000  0000000000000000  0000000000000000  0000000000000000
28-31  0000000000000000  0000000000000000  160012bc00e00000  0000000000000000


                                                                 
Control Registers 0 - 31
00-03  0000000000000000  0000000000000000  0000000000000000  0000000000000000
04-07  0000000000000000  0000000000000000  0000000000000000  0000000000000000
08-11  00000000000025ac  0000000000000000  00000000000000c0  0000000000000000
12-15  0000000000000000  0000000000000000  0000000000103000  ffe0000000000000
16-19  00000065514c48b9  0000000000000000  0000000000000000  0000000000000000
20-23  0000000000000000  0000000000000000  000000f008008200  0000000000000000
24-27  00000000006b4000  00000001fbc7a000  ffffffffffffffff  ffffffffffffffff
28-31  ffffffffffffffff  ffffffffffffffff  0000000040614000  a001011408940009

                                                                 
Space Registers 0 - 7
00-03  000000000096b000  000000000096b000  0000000000000000  000000000096b000
04-07  0000000000000000  0000000000000000  0000000000000000  0000000000000000


IIA Space (back entry)       = 0x0000000000000000
IIA Offset (back entry)      = 0x0000000000000000
Check Type                   = 0xe0000000
Cpu State                    = 0x1e000000
Cache Check                  = 0xc0000000
TLB Check                    = 0x40000000
Bus Check                    = 0x00000000
Assists Check                = 0x0096b000
Assist State                 = 0x00000000
Path Info                    = 0x00000000
System Responder Address     = 0x0000000000000000
System Requestor Address     = 0x0000000000000000


                                                                 
Floating Point Registers 0 - 31
00-03  0c15580000000000  0000000000000000  0000000000000000  0000000000000000
04-07  0000000a8b7ff33a  a000000000000000  0000000640000000  0000000000000000
08-11  0000000000000000  0000000000000000  0000000000000000  0000000000000000
12-15  0000000000000000  0000000000000000  0000000000000000  0000000000000000
16-19  0000000000000000  0000000000000000  0000000000000000  0000000000000000
20-23  0000000000000000  0000000000000000  0000000000000000  0000000000000000
24-27  0000000000000000  0000000000000000  0000000000000000  0000000000000000
28-31  0000000000000000  0000000000000000  0000000000000000  0000000000000000


PIM Revision                 = 0x0000000000000001                
CPU ID                       = 0x0000000000000014
CPU Revision                 = 0x0000000000000031
Cpu Serial Number            = 0x46100b89e43f0503
Check Summary                = 0xc0400040c2730000
SAL Timestamp                = 0x0000000052483bdd
System Firmware Rev.         = 0x00000ba20000121a
PDC Relocation Address       = 0xfffffff0f0c00000
Available Memory             = 0x00000001ffe00000
CPU Diagnose Register 2      = 0x311202200004200a
MIB_STAT                     = 0x0040000000200000
MIB_LOG1                     = 0x0000000000555500
MIB_LOG2                     = 0x0000000000000000
MIB_ECC_DATA                 = 0x286caf8e14000000
ICache Info                  = 0x0070000000000000
DCache Info                  = 0x0000000000000000
Sharedcache Info1            = 0x0000000000000000
Sharedcache Info2            = 0x0000000000000000
MIB_RSLOG1                   = 0x4930408847b60466
MIB_RSLOG2                   = 0x0a00010000000000
MIB_RQLOG                    = 0xc050408847b69400
MIB_REQLOGa                  = 0xa498204423db2a80
MIB_REQLOGb                  = 0x198280000e008000
Reserved                     = 0x0000000000000000
Cache Repair Detail          = 0x0000000000000000

PIM Detail Text:

                                                                 
--------------  Memory Error Log Information  --------------
Timestamp =   Sun Sep  29 14:40:30 GMT 2013    (20:13:09:29:14:40:30)

  OV  RQ  RS      ESTAT      A  C  D  corr  unc  fe  cw  pf
  --  --  --      -----      -  -  -  ----  ---  --  --  --
          X    ERR_TIMEOUT                               


  General Bus Logs: 
    REQUESTOR_ID               = 0x0000000000000000
    RESPONDER_ID               = 0x0000000000000000
    TARGET_ID                  = 0x00014dbc42204190
    BUS_SPECIFIC_DATA          = 0x0000000000189000
    ERROR_LOG_EN               = 0x0000000000001dff
    ERROR_SIG_EN               = 0x0000000000000157
    ERROR_STATUS               = 0x0000000000000008
    ERROR_OVFL                 = 0x0000000000000000
    ERROR_FIRST                = 0x0000000000000000
                                                                 
  Detailed Bus Logs:  
    AP_ADDRa      = 0x0000000000000000
    AP_ADDRb      = 0x0000000000000000
    ST_ADDRa      = 0x0000000000000000
    ST_ADDRb      = 0x0000000000000000
    RT_ADDRa      = 0x00494dbc42204190
    RT_ADDRb      = 0x0030000700001418
    RP_ADDRa      = 0x0000000000000000
    RP_ADDRb      = 0x0000000000000000
    LE_ADDRa      = 0x0000000000000000
    LE_ADDRb      = 0x0000000000000000
    ST_TO         = 0x0000000000011001
    PT_TO         = 0x000000000007a120
    RT_TO         = 0x0000000000010003


                                                                 
------------  I/O Module Error Log Information  ------------

  No IO subsystem errors recorded



* Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use
  2013-10-01 22:50       ` James Bottomley
  2013-10-02  0:41         ` John David Anglin
@ 2013-10-02  1:58         ` John David Anglin
  2013-10-02  8:28         ` Helge Deller
  2 siblings, 0 replies; 14+ messages in thread
From: John David Anglin @ 2013-10-02  1:58 UTC (permalink / raw
  To: James Bottomley
  Cc: Helge Deller, Tejun Heo, Libin, linux-kernel, linux-parisc

On 1-Oct-13, at 6:50 PM, James Bottomley wrote:

> On Wed, 2013-10-02 at 00:07 +0200, Helge Deller wrote:
>> On 10/01/2013 11:40 PM, James Bottomley wrote:
>>> On Tue, 2013-10-01 at 16:43 -0400, Tejun Heo wrote:
>>>> Hello,
>>>>
>>>> On Tue, Oct 01, 2013 at 10:35:20PM +0200, Helge Deller wrote:
>>>>> print_worker_info() includes no validity check on the pwq and wq
>>>>> pointers before handing them over to the probe_kernel_read() functions.
>>>>>
>>>>> It seems that most architectures don't care about that, but at least on
>>>>> the parisc architecture this leads to a kernel crash since accesses to
>>>>> page zero are protected by the kernel for security reasons.
>>>>>
>>>>> Fix this problem by verifying the contents of pwq and wq before usage.
>>>>> Even if probe_kernel_read() usually prevents such crashes by disabling
>>>>> page faults, clean code should always include such checks.
>>>>>
>>>>> Without this fix issuing "echo t > /proc/sysrq-trigger" will immediately
>>>>> crash the Linux kernel on the parisc architecture.
>>>>
>>>> Hmm... um had similar problem but the root cause here is that the arch
>>>> isn't implementing probe_kernel_read() properly.  We really have no
>>>> idea what the pointer value may be at the dump point and that's why we
>>>> use probe_kernel_read().  If something like the above is necessary for
>>>> the time being, the correct place would be the arch
>>>> probe_kernel_read() implementation.  James, would it be difficult
>>>> implement proper probe_kernel_read() on parisc?
>>>
>>> The problem seems to be that some traps bypass our exception table
>>> handling.
>>
>> Yes, that's correct.
>> It's trap #26 and we directly call parisc_terminate() for fault_space==0
>> without checking the exception table.
>> See my patch I posted a few hours ago which fixes this:
>> https://patchwork.kernel.org/patch/2971701/
>
> That doesn't quite look right ... I guessed it was probably access
> rights, so we should do an exception table fixup, so isn't this the fix?
> because we shouldn't call do_page_fault if there's no exception table.
>
> James
>
> ---
> diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c
> index 04e47c6..25a088a 100644
> --- a/arch/parisc/kernel/traps.c
> +++ b/arch/parisc/kernel/traps.c
> @@ -684,6 +684,8 @@ void notrace handle_interruption(int code, struct pt_regs *regs)
> 		/* Fall Through */
> 	case 26:
> 		/* PCXL: Data memory access rights trap */
> +		if (!user_mode(regs) && fixup_exception(regs))
> +			return;
> 		fault_address = regs->ior;
> 		fault_space   = regs->isr;
> 		break;


With this change, boot on rp3440 hangs here:

Freeing unused kernel memory: 256K (000000004079c000 - 00000000407dc000)
Loading, please wait...

Dave
--
John David Anglin	dave.anglin@bell.net





* Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use
  2013-10-01 22:50       ` James Bottomley
  2013-10-02  0:41         ` John David Anglin
  2013-10-02  1:58         ` John David Anglin
@ 2013-10-02  8:28         ` Helge Deller
  2 siblings, 0 replies; 14+ messages in thread
From: Helge Deller @ 2013-10-02  8:28 UTC (permalink / raw
  To: James Bottomley; +Cc: Tejun Heo, Libin, linux-kernel, linux-parisc

On 10/02/2013 12:50 AM, James Bottomley wrote:
> On Wed, 2013-10-02 at 00:07 +0200, Helge Deller wrote:
>> On 10/01/2013 11:40 PM, James Bottomley wrote:
>>> On Tue, 2013-10-01 at 16:43 -0400, Tejun Heo wrote:
>>>> Hello,
>>>>
>>>> On Tue, Oct 01, 2013 at 10:35:20PM +0200, Helge Deller wrote:
>>>>> print_worker_info() includes no validity check on the pwq and wq
>>>>> pointers before handing them over to the probe_kernel_read() functions.
>>>>>
>>>>> It seems that most architectures don't care about that, but at least on
>>>>> the parisc architecture this leads to a kernel crash since accesses to
>>>>> page zero are protected by the kernel for security reasons.
>>>>>
>>>>> Fix this problem by verifying the contents of pwq and wq before usage.
>>>>> Even if probe_kernel_read() usually prevents such crashes by disabling
>>>>> page faults, clean code should always include such checks. 
>>>>>
>>>>> Without this fix issuing "echo t > /proc/sysrq-trigger" will immediately
>>>>> crash the Linux kernel on the parisc architecture.
>>>>
>>>> Hmm... um had similar problem but the root cause here is that the arch
>>>> isn't implementing probe_kernel_read() properly.  We really have no
>>>> idea what the pointer value may be at the dump point and that's why we
>>>> use probe_kernel_read().  If something like the above is necessary for
>>>> the time being, the correct place would be the arch
>>>> probe_kernel_read() implementation.  James, would it be difficult
>>>> implement proper probe_kernel_read() on parisc?
>>>
>>> The problem seems to be that some traps bypass our exception table
>>> handling.  
>>
>> Yes, that's correct.
>> It's trap #26 and we directly call parisc_terminate() for fault_space==0
>> without checking the exception table.
>> See my patch I posted a few hours ago which fixes this:
>> https://patchwork.kernel.org/patch/2971701/
> 
> That doesn't quite look right ... I guessed it was probably access
> rights, so we should do an exception table fixup, so isn't this the fix?
> because we shouldn't call do_page_fault if there's no exception table.
>
> diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c
> index 04e47c6..25a088a 100644
> --- a/arch/parisc/kernel/traps.c
> +++ b/arch/parisc/kernel/traps.c
> @@ -684,6 +684,8 @@ void notrace handle_interruption(int code, struct pt_regs *regs)
>  		/* Fall Through */
>  	case 26: 
>  		/* PCXL: Data memory access rights trap */
> +		if (!user_mode(regs) && fixup_exception(regs))
> +			return;

You need to check for preempt_count() != 0 too, which has been increased by pagefault_disable() inside of probe_kernel_read().
Otherwise every simple memcpy(dest, NULL, count) (*) will successfully be handled here and we won't trap
on generic invalid memory accesses inside the kernel.

But basically your patch does exactly the same as mine.

Helge

(*) memcpy() internally uses pa_memcpy(), which defines the fixup tables.
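
The difference between the two variants discussed above can be modelled in plain
userspace C. This is only a sketch, not kernel code: struct fault_ctx, its fields
and both function names are hypothetical stand-ins for user_mode(regs),
preempt_count() and a successful fixup_exception(regs) lookup. It assumes that
pagefault_disable() is the only thing raising the count in the probe path.

```c
#include <stdbool.h>

/* Hypothetical model of the state seen by the trap 26 handler.
 * None of these names exist in the kernel; they mirror the checks
 * discussed in this thread. */
struct fault_ctx {
	bool user_mode;     /* models user_mode(regs) */
	int  preempt_count; /* models preempt_count(); raised by pagefault_disable() */
	bool has_fixup;     /* models an exception table entry for the faulting insn */
};

enum fault_action {
	FAULT_FIXUP,   /* resume at the fixup handler, fault is swallowed */
	FAULT_DELIVER  /* fall through to the normal fault path (may terminate) */
};

/* Variant without the preempt_count() test: every kernel-mode fault
 * that has a fixup entry is recovered, including a stray
 * memcpy(dest, NULL, count). */
enum fault_action trap26_fixup_only(const struct fault_ctx *c)
{
	if (!c->user_mode && c->has_fixup)
		return FAULT_FIXUP;
	return FAULT_DELIVER;
}

/* Variant with the additional test: the fixup is only taken while
 * page faults are disabled, i.e. inside probe_kernel_read(). */
enum fault_action trap26_checked(const struct fault_ctx *c)
{
	if (!c->user_mode && c->preempt_count != 0 && c->has_fixup)
		return FAULT_FIXUP;
	return FAULT_DELIVER;
}
```

In this model a fault with preempt_count == 0 and a fixup entry is silently
recovered by the first variant but still delivered by the second, which is the
behaviour the extra check is meant to preserve.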



Thread overview: 14+ messages
2013-10-01 20:35 [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use Helge Deller
2013-10-01 20:43 ` Tejun Heo
2013-10-01 20:53   ` Helge Deller
2013-10-01 21:03     ` Tejun Heo
2013-10-01 21:07       ` Tejun Heo
2013-10-01 22:34         ` Helge Deller
2013-10-01 22:40           ` Tejun Heo
2013-10-01 22:47             ` Tejun Heo
2013-10-01 21:40   ` James Bottomley
2013-10-01 22:07     ` Helge Deller
2013-10-01 22:50       ` James Bottomley
2013-10-02  0:41         ` John David Anglin
2013-10-02  1:58         ` John David Anglin
2013-10-02  8:28         ` Helge Deller
