* [PATCH] vfs: Fix RCU path walk failiures due to uninitialized nameidata seq number for root directory
@ 2011-04-15 18:39 Tim Chen
  2011-04-15 21:09 ` Andi Kleen
  2011-04-16 10:27 ` Shi, Alex
  0 siblings, 2 replies; 7+ messages in thread
From: Tim Chen @ 2011-04-15 18:39 UTC
  To: Alexander Viro, Nick Piggin
  Cc: Andi Kleen, linux-fsdevel, linux-kernel, shaohua.li, alex.shi

During RCU walk in path_lookupat and path_openat, the RCU lookup
frequently failed because the seq number in nameidata was not properly
set when the root directory was looked up.  We dropped out of RCU walk in
nameidata_drop_rcu due to a mismatch in the directory entry's seq number,
and reverted to the slow path walk, which needs to take references.
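
To illustrate the failure mode described above, here is a minimal
userspace sketch in C.  It is not kernel code: the toy_* types and
functions are hypothetical stand-ins for seqcount_t, struct dentry,
struct nameidata, __read_seqcount_begin and read_seqcount_retry.  It
assumes a single thread, so memory barriers and the wait for an even
count are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's seqcount_t. */
struct toy_seqcount {
	unsigned sequence;
};

/* Hypothetical stand-in for struct dentry: only d_seq is modeled. */
struct toy_dentry {
	struct toy_seqcount d_seq;
};

/* Hypothetical stand-in for struct nameidata: the dentry being walked
 * and the seq value the walker believes it sampled for that dentry. */
struct toy_nameidata {
	struct toy_dentry *dentry;
	unsigned seq;
};

/* Sample the counter (simplified: no barriers, no wait for even). */
static inline unsigned toy_read_begin(const struct toy_seqcount *s)
{
	return s->sequence;
}

/* Nonzero if the counter moved since 'start' was sampled. */
static inline int toy_read_retry(const struct toy_seqcount *s, unsigned start)
{
	return s->sequence != start;
}

/* Returns 1 if the walk may stay in RCU mode, 0 if it has to drop to
 * the reference-taking slow path because the seq numbers mismatch. */
static inline int toy_rcu_walk_ok(const struct toy_nameidata *nd)
{
	assert(nd != NULL && nd->dentry != NULL);
	return !toy_read_retry(&nd->dentry->d_seq, nd->seq);
}
```

In this model, a toy_nameidata whose seq was never sampled from the
root's d_seq fails toy_rcu_walk_ok() whenever the two values differ,
even though nothing modified the dentry; one filled in through
toy_read_begin() passes.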

With the following patch, I saw a 50% increase in an exim mail server
benchmark throughput on a 4-socket Nehalem-EX system.

Thanks.

Tim

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
diff --git a/fs/namei.c b/fs/namei.c
index 3cb616d..e4b27a6 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -697,6 +697,7 @@ static __always_inline void set_root_rcu(struct nameidata *nd)
 		do {
 			seq = read_seqcount_begin(&fs->seq);
 			nd->root = fs->root;
+			nd->seq = __read_seqcount_begin(&nd->root.dentry->d_seq);
 		} while (read_seqcount_retry(&fs->seq, seq));
 	}
 }




* Re: [PATCH] vfs: Fix RCU path walk failiures due to uninitialized nameidata seq number for root directory
  2011-04-15 18:39 [PATCH] vfs: Fix RCU path walk failiures due to uninitialized nameidata seq number for root directory Tim Chen
@ 2011-04-15 21:09 ` Andi Kleen
  2011-04-15 21:54     ` Linus Torvalds
  2011-04-16 10:27 ` Shi, Alex
  1 sibling, 1 reply; 7+ messages in thread
From: Andi Kleen @ 2011-04-15 21:09 UTC
  To: Tim Chen
  Cc: Alexander Viro, Nick Piggin, linux-fsdevel, linux-kernel,
	shaohua.li, alex.shi, torvalds, akpm

On 4/15/2011 11:39 AM, Tim Chen wrote:
> During RCU walk in path_lookupat and path_openat, the rcu lookup
> frequently failed because when root directory was looked up, seq number
> was not properly set in nameidata.  We dropped out of RCU walk in
> nameidata_drop_rcu due to mismatch in directory entry's seq number.  We
> reverted to slow path walk that need to take references.

Thanks Tim. Adding Andrew and Linus too. IMHO this fix is quite important to
actually make the fabled RCU dcache work -- without it, it's just slower
because it will fall back nearly always.

And it's a correctness fix, because with the bogus sequence number you could
fail to detect a race on root's dentry, leading to very subtle malfunction.
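
The missed-race concern can be sketched with a small single-threaded
model in C.  This is an illustration under stated assumptions, not
kernel code: the demo_* names are hypothetical, a write is modeled as
the counter advancing by two (one increment at write begin, one at write
end), and barriers are omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's seqcount_t. */
struct demo_seqcount {
	unsigned sequence;
};

/* Sample the counter (simplified: no barriers). */
static inline unsigned demo_read_begin(const struct demo_seqcount *s)
{
	return s->sequence;
}

/* True if the counter moved since 'start' was sampled. */
static inline bool demo_read_retry(const struct demo_seqcount *s, unsigned start)
{
	return s->sequence != start;
}

/* Model one complete write: begin and end each bump the counter. */
static inline void demo_write(struct demo_seqcount *s)
{
	s->sequence += 2;
}

/* A write happens after the reader obtained 'reader_seq'.  Returns true
 * if the reader's later validation notices that write. */
static inline bool demo_reader_notices_write(struct demo_seqcount *s,
					     unsigned reader_seq)
{
	demo_write(s);
	return demo_read_retry(s, reader_seq);
}
```

With a counter at 4, a reader that sampled 4 via demo_read_begin()
notices the write.  A reader holding a bogus value that happens to be 6
does not, because the counter lands on 6 after the write and the
comparison passes.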

Could it be merged ASAP please?
Also should be a stable candidate for .38 (whoever merges it please
add a Cc: stable@kernel.org # .38)

Reviewed-by: Andi Kleen <ak@linux.intel.com>

-Andi

> With the following patch, I saw a 50% increase in an exim mail server
> benchmark throughput on a 4-socket Nehalem-EX system.
>
> Thanks.
>
> Tim
>
> Signed-off-by: Tim Chen<tim.c.chen@linux.intel.com>
> diff --git a/fs/namei.c b/fs/namei.c
> index 3cb616d..e4b27a6 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -697,6 +697,7 @@ static __always_inline void set_root_rcu(struct nameidata *nd)
>   		do {
>   			seq = read_seqcount_begin(&fs->seq);
>   			nd->root = fs->root;
> +			nd->seq = __read_seqcount_begin(&nd->root.dentry->d_seq);
>   		} while (read_seqcount_retry(&fs->seq, seq));
>   	}
>   }
>
>



* Re: [PATCH] vfs: Fix RCU path walk failiures due to uninitialized nameidata seq number for root directory
  2011-04-15 21:09 ` Andi Kleen
@ 2011-04-15 21:54     ` Linus Torvalds
  0 siblings, 0 replies; 7+ messages in thread
From: Linus Torvalds @ 2011-04-15 21:54 UTC
  To: Andi Kleen
  Cc: Tim Chen, Alexander Viro, Nick Piggin, linux-fsdevel,
	linux-kernel, shaohua.li, alex.shi, akpm

On Fri, Apr 15, 2011 at 2:09 PM, Andi Kleen <ak@linux.intel.com> wrote:
> On 4/15/2011 11:39 AM, Tim Chen wrote:
>>
>> During RCU walk in path_lookupat and path_openat, the rcu lookup
>> frequently failed because when root directory was looked up, seq number
>> was not properly set in nameidata.  We dropped out of RCU walk in
>> nameidata_drop_rcu due to mismatch in directory entry's seq number.  We
>> reverted to slow path walk that need to take references.
>
> Thanks Tim. Adding Andrew, Linus too. IMHO this fix is quite important to
> actually make the fabled RCU dcache work -- without it it's just slower
> because
> it will fallback nearly allways.

Well, only for absolute paths, but yes.

I think all my benchmarking was for things like "git diff", which are
all about the relative paths and wouldn't have triggered this case.

So this patch does look correct, and yes, should also be stable material for 38.

And it's pretty fundamental. But I'd like to get a few more acks
exactly because it's so fundamental. Al and Nick sadly are both gone.
Anybody else feel like they know something about the path lookup code?

                             Linus

* Re: [PATCH] vfs: Fix RCU path walk failiures due to uninitialized nameidata seq number for root directory
  2011-04-15 21:54     ` Linus Torvalds
@ 2011-04-16  5:43       ` Sedat Dilek
  -1 siblings, 0 replies; 7+ messages in thread
From: Sedat Dilek @ 2011-04-16  5:43 UTC
  To: Linus Torvalds
  Cc: Andi Kleen, Tim Chen, Alexander Viro, Nick Piggin, linux-fsdevel,
	linux-kernel, shaohua.li, alex.shi, akpm

On Fri, Apr 15, 2011 at 11:54 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, Apr 15, 2011 at 2:09 PM, Andi Kleen <ak@linux.intel.com> wrote:
>> On 4/15/2011 11:39 AM, Tim Chen wrote:
>>>
>>> During RCU walk in path_lookupat and path_openat, the rcu lookup
>>> frequently failed because when root directory was looked up, seq number
>>> was not properly set in nameidata.  We dropped out of RCU walk in
>>> nameidata_drop_rcu due to mismatch in directory entry's seq number.  We
>>> reverted to slow path walk that need to take references.
>>
>> Thanks Tim. Adding Andrew, Linus too. IMHO this fix is quite important to
>> actually make the fabled RCU dcache work -- without it it's just slower
>> because
>> it will fallback nearly allways.
>
> Well, only for absolute paths, but yes.
>
> I think all my benchmarking was for thing like "git diff", which are
> all about the relative paths and wouldn't have triggered this case.
>
> So this patch does look correct, and yes, should also be stable material for 38.
>
> And it's pretty fundamental. But I'd like to get a few more acks
> exactly because it's so fundamental. Al and Nick sadly are both gone.
> Anybody else feel like they know something about the path lookup code?
>
>                             Linus
>

I see this patch was pushed.

What's with fs-synchronize_rcu-when-unregister_filesystem-success-not-failure.patch
from [1]?

- Sedat -

[1] https://patchwork.kernel.org/patch/707322/


* RE: [PATCH] vfs: Fix RCU path walk failiures due to uninitialized nameidata seq number for root directory
  2011-04-15 18:39 [PATCH] vfs: Fix RCU path walk failiures due to uninitialized nameidata seq number for root directory Tim Chen
  2011-04-15 21:09 ` Andi Kleen
@ 2011-04-16 10:27 ` Shi, Alex
  1 sibling, 0 replies; 7+ messages in thread
From: Shi, Alex @ 2011-04-16 10:27 UTC
  To: Tim Chen, Alexander Viro, Nick Piggin
  Cc: Andi Kleen, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, Li, Shaohua

This can fix the dbenchthreads/aim7 regressions in the 39-rc1 kernel that were caused by the path_init_rcu/path_init merge.


Regards! 
Alex  
>-----Original Message-----
>From: Tim Chen [mailto:tim.c.chen@linux.intel.com]
>Sent: Saturday, April 16, 2011 2:39 AM
>To: Alexander Viro; Nick Piggin
>Cc: Andi Kleen; linux-fsdevel@vger.kernel.org; linux-kernel@vger.kernel.org; Li, Shaohua; Shi, Alex
>Subject: [PATCH] vfs: Fix RCU path walk failiures due to uninitialized nameidata seq number for root directory
>
>During RCU walk in path_lookupat and path_openat, the rcu lookup
>frequently failed because when root directory was looked up, seq number
>was not properly set in nameidata.  We dropped out of RCU walk in
>nameidata_drop_rcu due to mismatch in directory entry's seq number.  We
>reverted to slow path walk that need to take references.
>
>With the following patch, I saw a 50% increase in an exim mail server
>benchmark throughput on a 4-socket Nehalem-EX system.
>
>Thanks.
>
>Tim
>
>Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
>diff --git a/fs/namei.c b/fs/namei.c
>index 3cb616d..e4b27a6 100644
>--- a/fs/namei.c
>+++ b/fs/namei.c
>@@ -697,6 +697,7 @@ static __always_inline void set_root_rcu(struct nameidata *nd)
> 		do {
> 			seq = read_seqcount_begin(&fs->seq);
> 			nd->root = fs->root;
>+			nd->seq = __read_seqcount_begin(&nd->root.dentry->d_seq);
> 		} while (read_seqcount_retry(&fs->seq, seq));
> 	}
> }
>

