All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] bug in read_buf
@ 2010-04-20  2:16 Neil Brown
  2010-04-20 16:51 ` J. Bruce Fields
  0 siblings, 1 reply; 8+ messages in thread
From: Neil Brown @ 2010-04-20  2:16 UTC (permalink / raw
  To: J. Bruce Fields; +Cc: linux-nfs


Surely this can never have worked... which implies that the code has
never been used?

When read_buf is called to move over to the next page in the pagelist
of an NFSv4 request, it sets argp->end to essentially a random
number, certainly not an address within the page which argp->p now
points to.  So subsequent calls to READ_BUF will think there is much
more than a page of spare space (the cast to u32 ensures an unsigned
comparison) so we can expect to fall off the end of the second
page.

I guess we never ever receive requests with any operation starting
beyond the first page!

[[
I found this while looking at why fsstress over NFS over RDMA caused
a bad memory dereference in READ32, suggesting that 'p' had a bad
value.  However it was ffff8801299188f0, which is not an "I've fallen
off the end of the page" sort of value.  So I think it must be a
different bug :-(  It is as if the page is being unmapped underneath
us...
]]
NeilBrown




diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index e170317..34ccf81 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -161,10 +161,10 @@ static __be32 *read_buf(struct nfsd4_compoundargs *argp, u32 nbytes)
 	argp->p = page_address(argp->pagelist[0]);
 	argp->pagelist++;
 	if (argp->pagelen < PAGE_SIZE) {
-		argp->end = p + (argp->pagelen>>2);
+		argp->end = argp->p + (argp->pagelen>>2);
 		argp->pagelen = 0;
 	} else {
-		argp->end = p + (PAGE_SIZE>>2);
+		argp->end = argp->p + (PAGE_SIZE>>2);
 		argp->pagelen -= PAGE_SIZE;
 	}
 	memcpy(((char*)p)+avail, argp->p, (nbytes - avail));
@@ -1426,10 +1426,10 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp)
 			argp->p = page_address(argp->pagelist[0]);
 			argp->pagelist++;
 			if (argp->pagelen < PAGE_SIZE) {
-				argp->end = p + (argp->pagelen>>2);
+				argp->end = argp->p + (argp->pagelen>>2);
 				argp->pagelen = 0;
 			} else {
-				argp->end = p + (PAGE_SIZE>>2);
+				argp->end = argp->p + (PAGE_SIZE>>2);
 				argp->pagelen -= PAGE_SIZE;
 			}
 		}



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] bug in read_buf
  2010-04-20  2:16 [PATCH] bug in read_buf Neil Brown
@ 2010-04-20 16:51 ` J. Bruce Fields
  2010-04-20 19:24   ` William A. (Andy) Adamson
  0 siblings, 1 reply; 8+ messages in thread
From: J. Bruce Fields @ 2010-04-20 16:51 UTC (permalink / raw
  To: Neil Brown; +Cc: linux-nfs

On Tue, Apr 20, 2010 at 12:16:52PM +1000, Neil Brown wrote:
> 
> Surely this can never have worked... which implies that the code has
> never been used?
> 
> When read_buf is called to move over to the next page in the pagelist
> of an NFSv4 request, it sets argp->end to essentially a random
> number, certainly not an address within the page which argp->p now
> points to.  So subsequent calls to READ_BUF will think there is much
> more than a page of spare space (the cast to u32 ensures an unsigned
> comparison) so we can expect to fall off the end of the second
> page.

Yipes, thanks.

> I guess we never ever receive requests with any operation starting
> beyond the first page!

putfh-write-getattr, for example, is common enough.  The write decoding
should leave arg->end set correctly.  But there are two read_buf()'s in
decode_getattr(), and I can't see why we don't hit this bug on a write
that leaves that final getattr exactly straddling a page boundary.

--b.

> [[
> I found this while looking at why fsstress over NFS over RDMA caused
> a bad memory dereference in READ32, suggesting that 'p' had a bad
> value.  However it was ffff8801299188f0, which is not an "I've fallen
> off the end of the page" sort of value.  So I think it must be a
> different bug :-(  It is as if the page is being unmapped underneath
> us...
> ]]
> NeilBrown
> 
> 
> 
> 
> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> index e170317..34ccf81 100644
> --- a/fs/nfsd/nfs4xdr.c
> +++ b/fs/nfsd/nfs4xdr.c
> @@ -161,10 +161,10 @@ static __be32 *read_buf(struct nfsd4_compoundargs *argp, u32 nbytes)
>  	argp->p = page_address(argp->pagelist[0]);
>  	argp->pagelist++;
>  	if (argp->pagelen < PAGE_SIZE) {
> -		argp->end = p + (argp->pagelen>>2);
> +		argp->end = argp->p + (argp->pagelen>>2);
>  		argp->pagelen = 0;
>  	} else {
> -		argp->end = p + (PAGE_SIZE>>2);
> +		argp->end = argp->p + (PAGE_SIZE>>2);
>  		argp->pagelen -= PAGE_SIZE;
>  	}
>  	memcpy(((char*)p)+avail, argp->p, (nbytes - avail));
> @@ -1426,10 +1426,10 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp)
>  			argp->p = page_address(argp->pagelist[0]);
>  			argp->pagelist++;
>  			if (argp->pagelen < PAGE_SIZE) {
> -				argp->end = p + (argp->pagelen>>2);
> +				argp->end = argp->p + (argp->pagelen>>2);
>  				argp->pagelen = 0;
>  			} else {
> -				argp->end = p + (PAGE_SIZE>>2);
> +				argp->end = argp->p + (PAGE_SIZE>>2);
>  				argp->pagelen -= PAGE_SIZE;
>  			}
>  		}
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] bug in read_buf
  2010-04-20 16:51 ` J. Bruce Fields
@ 2010-04-20 19:24   ` William A. (Andy) Adamson
       [not found]     ` <g2k89c397151004201224wb35ae389g961523bbef23f452-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: William A. (Andy) Adamson @ 2010-04-20 19:24 UTC (permalink / raw
  To: J. Bruce Fields; +Cc: Neil Brown, linux-nfs

On Tue, Apr 20, 2010 at 12:51 PM, J. Bruce Fields <bfields@fieldses.org=
> wrote:
> On Tue, Apr 20, 2010 at 12:16:52PM +1000, Neil Brown wrote:
>>
>> Surely this can never have worked... which implies that the code has
>> never been used?
>>
>> When read_buf is called to move over to the next page in the pagelis=
t
>> of an NFSv4 request, it sets argp->end to essentially a random
>> number, certainly not an address within the page which argp->p now
>> points to. =A0So subsequent calls to READ_BUF will think there is mu=
ch
>> more than a page of spare space (the cast to u32 ensures an unsigned
>> comparison) so we can expect to fall off the end of the second
>> page.
>
> Yipes, thanks.
>
>> I guess we never ever receive requests with any operation starting
>> beyond the first page!
>
> putfh-write-getattr, for example, is common enough. =A0The write deco=
ding
> should leave arg->end set correctly. =A0But there are two read_buf()'=
s in
> decode_getattr(), and I can't see why we don't hit this bug on a writ=
e
> that leaves that final getattr exactly straddling a page boundary.

The write data is dumped into the rq_vec which has non-contiguous
pages. So the xdr_buf head only holds the putfh result, the short
write response header (v4 stateid, offset, how, length, etc), and then
the getattr. so there is plenty of space.

-->Andy

>
> --b.
>
>> [[
>> I found this while looking at why fsstress over NFS over RDMA caused
>> a bad memory dereference in READ32, suggesting that 'p' had a bad
>> value. =A0However it was ffff8801299188f0, which is not an "I've fal=
len
>> off the end of the page" sort of value. =A0So I think it must be a
>> different bug :-( =A0It is as if the page is being unmapped undernea=
th
>> us...
>> ]]
>> NeilBrown
>>
>>
>>
>>
>> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
>> index e170317..34ccf81 100644
>> --- a/fs/nfsd/nfs4xdr.c
>> +++ b/fs/nfsd/nfs4xdr.c
>> @@ -161,10 +161,10 @@ static __be32 *read_buf(struct nfsd4_compounda=
rgs *argp, u32 nbytes)
>> =A0 =A0 =A0 argp->p =3D page_address(argp->pagelist[0]);
>> =A0 =A0 =A0 argp->pagelist++;
>> =A0 =A0 =A0 if (argp->pagelen < PAGE_SIZE) {
>> - =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D p + (argp->pagelen>>2);
>> + =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D argp->p + (argp->pagelen>>2)=
;
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pagelen =3D 0;
>> =A0 =A0 =A0 } else {
>> - =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D p + (PAGE_SIZE>>2);
>> + =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D argp->p + (PAGE_SIZE>>2);
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pagelen -=3D PAGE_SIZE;
>> =A0 =A0 =A0 }
>> =A0 =A0 =A0 memcpy(((char*)p)+avail, argp->p, (nbytes - avail));
>> @@ -1426,10 +1426,10 @@ nfsd4_decode_compound(struct nfsd4_compounda=
rgs *argp)
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->p =3D page_address=
(argp->pagelist[0]);
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pagelist++;
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 if (argp->pagelen < PAGE=
_SIZE) {
>> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->end =
=3D p + (argp->pagelen>>2);
>> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->end =
=3D argp->p + (argp->pagelen>>2);
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pa=
gelen =3D 0;
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 } else {
>> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->end =
=3D p + (PAGE_SIZE>>2);
>> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->end =
=3D argp->p + (PAGE_SIZE>>2);
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pa=
gelen -=3D PAGE_SIZE;
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 }
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 }
>>
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" =
in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at =A0http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] bug in read_buf
       [not found]     ` <g2k89c397151004201224wb35ae389g961523bbef23f452-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-04-20 19:39       ` J. Bruce Fields
  2010-04-21 22:35         ` J. Bruce Fields
  2010-04-22 15:41         ` William A. (Andy) Adamson
  0 siblings, 2 replies; 8+ messages in thread
From: J. Bruce Fields @ 2010-04-20 19:39 UTC (permalink / raw
  To: William A. (Andy) Adamson; +Cc: Neil Brown, linux-nfs

On Tue, Apr 20, 2010 at 03:24:59PM -0400, William A. (Andy) Adamson wro=
te:
> On Tue, Apr 20, 2010 at 12:51 PM, J. Bruce Fields <bfields@fieldses.o=
rg> wrote:
> > On Tue, Apr 20, 2010 at 12:16:52PM +1000, Neil Brown wrote:
> >>
> >> Surely this can never have worked... which implies that the code h=
as
> >> never been used?
> >>
> >> When read_buf is called to move over to the next page in the pagel=
ist
> >> of an NFSv4 request, it sets argp->end to essentially a random
> >> number, certainly not an address within the page which argp->p now
> >> points to. =C2=A0So subsequent calls to READ_BUF will think there =
is much
> >> more than a page of spare space (the cast to u32 ensures an unsign=
ed
> >> comparison) so we can expect to fall off the end of the second
> >> page.
> >
> > Yipes, thanks.
> >
> >> I guess we never ever receive requests with any operation starting
> >> beyond the first page!
> >
> > putfh-write-getattr, for example, is common enough. =C2=A0The write=
 decoding
> > should leave arg->end set correctly. =C2=A0But there are two read_b=
uf()'s in
> > decode_getattr(), and I can't see why we don't hit this bug on a wr=
ite
> > that leaves that final getattr exactly straddling a page boundary.
>=20
> The write data is dumped into the rq_vec which has non-contiguous
> pages. So the xdr_buf head only holds the putfh result, the short
> write response header (v4 stateid, offset, how, length, etc), and the=
n
> the getattr. so there is plenty of space.

This is the server-side write-decoding, so you could see:


	rpc header | putfh | write ... data ... | getattr
						     ^
						     |
						page boundary here

--b.

>=20
> -->Andy
>=20
> >
> > --b.
> >
> >> [[
> >> I found this while looking at why fsstress over NFS over RDMA caus=
ed
> >> a bad memory dereference in READ32, suggesting that 'p' had a bad
> >> value. =C2=A0However it was ffff8801299188f0, which is not an "I'v=
e fallen
> >> off the end of the page" sort of value. =C2=A0So I think it must b=
e a
> >> different bug :-( =C2=A0It is as if the page is being unmapped und=
erneath
> >> us...
> >> ]]
> >> NeilBrown
> >>
> >>
> >>
> >>
> >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> >> index e170317..34ccf81 100644
> >> --- a/fs/nfsd/nfs4xdr.c
> >> +++ b/fs/nfsd/nfs4xdr.c
> >> @@ -161,10 +161,10 @@ static __be32 *read_buf(struct nfsd4_compoun=
dargs *argp, u32 nbytes)
> >> =C2=A0 =C2=A0 =C2=A0 argp->p =3D page_address(argp->pagelist[0]);
> >> =C2=A0 =C2=A0 =C2=A0 argp->pagelist++;
> >> =C2=A0 =C2=A0 =C2=A0 if (argp->pagelen < PAGE_SIZE) {
> >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (arg=
p->pagelen>>2);
> >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->p =
+ (argp->pagelen>>2);
> >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen =3D=
 0;
> >> =C2=A0 =C2=A0 =C2=A0 } else {
> >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (PAG=
E_SIZE>>2);
> >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->p =
+ (PAGE_SIZE>>2);
> >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen -=3D=
 PAGE_SIZE;
> >> =C2=A0 =C2=A0 =C2=A0 }
> >> =C2=A0 =C2=A0 =C2=A0 memcpy(((char*)p)+avail, argp->p, (nbytes - a=
vail));
> >> @@ -1426,10 +1426,10 @@ nfsd4_decode_compound(struct nfsd4_compoun=
dargs *argp)
> >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 argp->p =3D page_address(argp->pagelist[0]);
> >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 argp->pagelist++;
> >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 if (argp->pagelen < PAGE_SIZE) {
> >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (argp->pagelen>>2);
> >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->p + (argp->pagelen>=
>2);
> >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen =3D 0;
> >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 } else {
> >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (PAGE_SIZE>>2);
> >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->p + (PAGE_SIZE>>2);
> >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen -=3D PAGE_SIZE;
> >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 }
> >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 }
> >>
> >>
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-nfs=
" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at =C2=A0http://vger.kernel.org/majordomo-info.=
html
> >

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] bug in read_buf
  2010-04-20 19:39       ` J. Bruce Fields
@ 2010-04-21 22:35         ` J. Bruce Fields
  2010-04-21 22:36           ` J. Bruce Fields
  2010-04-22 15:41         ` William A. (Andy) Adamson
  1 sibling, 1 reply; 8+ messages in thread
From: J. Bruce Fields @ 2010-04-21 22:35 UTC (permalink / raw
  To: William A. (Andy) Adamson; +Cc: Neil Brown, linux-nfs

On Tue, Apr 20, 2010 at 03:39:44PM -0400, J. Bruce Fields wrote:
> On Tue, Apr 20, 2010 at 03:24:59PM -0400, William A. (Andy) Adamson w=
rote:
> > On Tue, Apr 20, 2010 at 12:51 PM, J. Bruce Fields <bfields@fieldses=
=2Eorg> wrote:
> > > On Tue, Apr 20, 2010 at 12:16:52PM +1000, Neil Brown wrote:
> > >>
> > >> Surely this can never have worked... which implies that the code=
 has
> > >> never been used?
> > >>
> > >> When read_buf is called to move over to the next page in the pag=
elist
> > >> of an NFSv4 request, it sets argp->end to essentially a random
> > >> number, certainly not an address within the page which argp->p n=
ow
> > >> points to. =C2=A0So subsequent calls to READ_BUF will think ther=
e is much
> > >> more than a page of spare space (the cast to u32 ensures an unsi=
gned
> > >> comparison) so we can expect to fall off the end of the second
> > >> page.
> > >
> > > Yipes, thanks.
> > >
> > >> I guess we never ever receive requests with any operation starti=
ng
> > >> beyond the first page!
> > >
> > > putfh-write-getattr, for example, is common enough. =C2=A0The wri=
te decoding
> > > should leave arg->end set correctly. =C2=A0But there are two read=
_buf()'s in
> > > decode_getattr(), and I can't see why we don't hit this bug on a =
write
> > > that leaves that final getattr exactly straddling a page boundary=
=2E
> >=20
> > The write data is dumped into the rq_vec which has non-contiguous
> > pages. So the xdr_buf head only holds the putfh result, the short
> > write response header (v4 stateid, offset, how, length, etc), and t=
hen
> > the getattr. so there is plenty of space.
>=20
> This is the server-side write-decoding, so you could see:
>=20
>=20
> 	rpc header | putfh | write ... data ... | getattr
> 						     ^
> 						     |
> 						page boundary here

Hm, I guess even when argp->end is wrong, argp->p is always set to
something sane; so on the next READ_BUF(), when you hit the

	nbytes <=3D (u32)((char *)argp->end - (char *)argp->p

case, you do

	p =3D argp->p;
	argp->p +=3D XDR_QUADLEN(nbytes);

and p is something reasonable.  "end" stays wrong, but that won't be a
problem until you run past the end of the *next* page, which it would
take a very unusual compound to do.

--b.

>=20
> --b.
>=20
> >=20
> > -->Andy
> >=20
> > >
> > > --b.
> > >
> > >> [[
> > >> I found this while looking at why fsstress over NFS over RDMA ca=
used
> > >> a bad memory dereference in READ32, suggesting that 'p' had a ba=
d
> > >> value. =C2=A0However it was ffff8801299188f0, which is not an "I=
've fallen
> > >> off the end of the page" sort of value. =C2=A0So I think it must=
 be a
> > >> different bug :-( =C2=A0It is as if the page is being unmapped u=
nderneath
> > >> us...
> > >> ]]
> > >> NeilBrown
> > >>
> > >>
> > >>
> > >>
> > >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> > >> index e170317..34ccf81 100644
> > >> --- a/fs/nfsd/nfs4xdr.c
> > >> +++ b/fs/nfsd/nfs4xdr.c
> > >> @@ -161,10 +161,10 @@ static __be32 *read_buf(struct nfsd4_compo=
undargs *argp, u32 nbytes)
> > >> =C2=A0 =C2=A0 =C2=A0 argp->p =3D page_address(argp->pagelist[0])=
;
> > >> =C2=A0 =C2=A0 =C2=A0 argp->pagelist++;
> > >> =C2=A0 =C2=A0 =C2=A0 if (argp->pagelen < PAGE_SIZE) {
> > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (a=
rgp->pagelen>>2);
> > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->=
p + (argp->pagelen>>2);
> > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen =3D=
 0;
> > >> =C2=A0 =C2=A0 =C2=A0 } else {
> > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (P=
AGE_SIZE>>2);
> > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->=
p + (PAGE_SIZE>>2);
> > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen -=
=3D PAGE_SIZE;
> > >> =C2=A0 =C2=A0 =C2=A0 }
> > >> =C2=A0 =C2=A0 =C2=A0 memcpy(((char*)p)+avail, argp->p, (nbytes -=
 avail));
> > >> @@ -1426,10 +1426,10 @@ nfsd4_decode_compound(struct nfsd4_compo=
undargs *argp)
> > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 argp->p =3D page_address(argp->pagelist[0]);
> > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 argp->pagelist++;
> > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 if (argp->pagelen < PAGE_SIZE) {
> > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (argp->pagelen>>2=
);
> > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->p + (argp->page=
len>>2);
> > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen =3D 0;
> > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 } else {
> > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (PAGE_SIZE>>2);
> > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->p + (PAGE_SIZE>=
>2);
> > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen -=3D PAGE_SIZE;
> > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 }
> > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 }
> > >>
> > >>
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-n=
fs" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at =C2=A0http://vger.kernel.org/majordomo-inf=
o.html
> > >

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] bug in read_buf
  2010-04-21 22:35         ` J. Bruce Fields
@ 2010-04-21 22:36           ` J. Bruce Fields
  2010-04-21 23:08             ` Neil Brown
  0 siblings, 1 reply; 8+ messages in thread
From: J. Bruce Fields @ 2010-04-21 22:36 UTC (permalink / raw
  To: William A. (Andy) Adamson; +Cc: Neil Brown, linux-nfs

On Wed, Apr 21, 2010 at 06:35:27PM -0400, J. Bruce Fields wrote:
> On Tue, Apr 20, 2010 at 03:39:44PM -0400, J. Bruce Fields wrote:
> > On Tue, Apr 20, 2010 at 03:24:59PM -0400, William A. (Andy) Adamson=
 wrote:
> > > On Tue, Apr 20, 2010 at 12:51 PM, J. Bruce Fields <bfields@fields=
es.org> wrote:
> > > > On Tue, Apr 20, 2010 at 12:16:52PM +1000, Neil Brown wrote:
> > > >>
> > > >> Surely this can never have worked... which implies that the co=
de has
> > > >> never been used?
> > > >>
> > > >> When read_buf is called to move over to the next page in the p=
agelist
> > > >> of an NFSv4 request, it sets argp->end to essentially a random
> > > >> number, certainly not an address within the page which argp->p=
 now
> > > >> points to. =C2=A0So subsequent calls to READ_BUF will think th=
ere is much
> > > >> more than a page of spare space (the cast to u32 ensures an un=
signed
> > > >> comparison) so we can expect to fall off the end of the second
> > > >> page.
> > > >
> > > > Yipes, thanks.
> > > >
> > > >> I guess we never ever receive requests with any operation star=
ting
> > > >> beyond the first page!
> > > >
> > > > putfh-write-getattr, for example, is common enough. =C2=A0The w=
rite decoding
> > > > should leave arg->end set correctly. =C2=A0But there are two re=
ad_buf()'s in
> > > > decode_getattr(), and I can't see why we don't hit this bug on =
a write
> > > > that leaves that final getattr exactly straddling a page bounda=
ry.
> > >=20
> > > The write data is dumped into the rq_vec which has non-contiguous
> > > pages. So the xdr_buf head only holds the putfh result, the short
> > > write response header (v4 stateid, offset, how, length, etc), and=
 then
> > > the getattr. so there is plenty of space.
> >=20
> > This is the server-side write-decoding, so you could see:
> >=20
> >=20
> > 	rpc header | putfh | write ... data ... | getattr
> > 						     ^
> > 						     |
> > 						page boundary here
>=20
> Hm, I guess even when argp->end is wrong, argp->p is always set to
> something sane; so on the next READ_BUF(), when you hit the
>=20
> 	nbytes <=3D (u32)((char *)argp->end - (char *)argp->p
>=20
> case, you do
>=20
> 	p =3D argp->p;
> 	argp->p +=3D XDR_QUADLEN(nbytes);
>=20
> and p is something reasonable.  "end" stays wrong, but that won't be =
a
> problem until you run past the end of the *next* page, which it would
> take a very unusual compound to do.

(Nevertheless: applied, for 2.6.34 and stable.)

--b.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] bug in read_buf
  2010-04-21 22:36           ` J. Bruce Fields
@ 2010-04-21 23:08             ` Neil Brown
  0 siblings, 0 replies; 8+ messages in thread
From: Neil Brown @ 2010-04-21 23:08 UTC (permalink / raw
  To: J. Bruce Fields; +Cc: William A. (Andy) Adamson, linux-nfs

On Wed, 21 Apr 2010 18:36:05 -0400
"J. Bruce Fields" <bfields@fieldses.org> wrote:


> > Hm, I guess even when argp->end is wrong, argp->p is always set to
> > something sane; so on the next READ_BUF(), when you hit the
> > 
> > 	nbytes <= (u32)((char *)argp->end - (char *)argp->p
> > 
> > case, you do
> > 
> > 	p = argp->p;
> > 	argp->p += XDR_QUADLEN(nbytes);
> > 
> > and p is something reasonable.  "end" stays wrong, but that won't be a
> > problem until you run past the end of the *next* page, which it would
> > take a very unusual compound to do.

Yes, it would not be an easy bug to trigger ... it takes away some of the
thrill of finding a bug when you discover that it only affects a corner case
that never ever happens :-(

> 
> (Nevertheless: applied, for 2.6.34 and stable.)

Thanks.

NeilBrown

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] bug in read_buf
  2010-04-20 19:39       ` J. Bruce Fields
  2010-04-21 22:35         ` J. Bruce Fields
@ 2010-04-22 15:41         ` William A. (Andy) Adamson
  1 sibling, 0 replies; 8+ messages in thread
From: William A. (Andy) Adamson @ 2010-04-22 15:41 UTC (permalink / raw
  To: J. Bruce Fields; +Cc: Neil Brown, linux-nfs

On Tue, Apr 20, 2010 at 3:39 PM, J. Bruce Fields <bfields@fieldses.org>=
 wrote:
> On Tue, Apr 20, 2010 at 03:24:59PM -0400, William A. (Andy) Adamson w=
rote:
>> On Tue, Apr 20, 2010 at 12:51 PM, J. Bruce Fields <bfields@fieldses.=
org> wrote:
>> > On Tue, Apr 20, 2010 at 12:16:52PM +1000, Neil Brown wrote:
>> >>
>> >> Surely this can never have worked... which implies that the code =
has
>> >> never been used?
>> >>
>> >> When read_buf is called to move over to the next page in the page=
list
>> >> of an NFSv4 request, it sets argp->end to essentially a random
>> >> number, certainly not an address within the page which argp->p no=
w
>> >> points to. =A0So subsequent calls to READ_BUF will think there is=
 much
>> >> more than a page of spare space (the cast to u32 ensures an unsig=
ned
>> >> comparison) so we can expect to fall off the end of the second
>> >> page.
>> >
>> > Yipes, thanks.
>> >
>> >> I guess we never ever receive requests with any operation startin=
g
>> >> beyond the first page!
>> >
>> > putfh-write-getattr, for example, is common enough. =A0The write d=
ecoding
>> > should leave arg->end set correctly. =A0But there are two read_buf=
()'s in
>> > decode_getattr(), and I can't see why we don't hit this bug on a w=
rite
>> > that leaves that final getattr exactly straddling a page boundary.
>>
>> The write data is dumped into the rq_vec which has non-contiguous
>> pages. So the xdr_buf head only holds the putfh result, the short
>> write response header (v4 stateid, offset, how, length, etc), and th=
en
>> the getattr. so there is plenty of space.
>
> This is the server-side write-decoding, so you could see:
>
>
> =A0 =A0 =A0 =A0rpc header | putfh | write ... data ... | getattr
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0=
 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 ^
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0=
 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 |
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0=
 =A0 =A0 =A0 =A0 =A0 =A0page boundary here


ulp - you're right.

-->Andy

>
> --b.
>
>>
>> -->Andy
>>
>> >
>> > --b.
>> >
>> >> [[
>> >> I found this while looking at why fsstress over NFS over RDMA cau=
sed
>> >> a bad memory dereference in READ32, suggesting that 'p' had a bad
>> >> value. =A0However it was ffff8801299188f0, which is not an "I've =
fallen
>> >> off the end of the page" sort of value. =A0So I think it must be =
a
>> >> different bug :-( =A0It is as if the page is being unmapped under=
neath
>> >> us...
>> >> ]]
>> >> NeilBrown
>> >>
>> >>
>> >>
>> >>
>> >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
>> >> index e170317..34ccf81 100644
>> >> --- a/fs/nfsd/nfs4xdr.c
>> >> +++ b/fs/nfsd/nfs4xdr.c
>> >> @@ -161,10 +161,10 @@ static __be32 *read_buf(struct nfsd4_compou=
ndargs *argp, u32 nbytes)
>> >> =A0 =A0 =A0 argp->p =3D page_address(argp->pagelist[0]);
>> >> =A0 =A0 =A0 argp->pagelist++;
>> >> =A0 =A0 =A0 if (argp->pagelen < PAGE_SIZE) {
>> >> - =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D p + (argp->pagelen>>2);
>> >> + =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D argp->p + (argp->pagelen>=
>2);
>> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pagelen =3D 0;
>> >> =A0 =A0 =A0 } else {
>> >> - =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D p + (PAGE_SIZE>>2);
>> >> + =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D argp->p + (PAGE_SIZE>>2);
>> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pagelen -=3D PAGE_SIZE;
>> >> =A0 =A0 =A0 }
>> >> =A0 =A0 =A0 memcpy(((char*)p)+avail, argp->p, (nbytes - avail));
>> >> @@ -1426,10 +1426,10 @@ nfsd4_decode_compound(struct nfsd4_compou=
ndargs *argp)
>> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->p =3D page_addr=
ess(argp->pagelist[0]);
>> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pagelist++;
>> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 if (argp->pagelen < P=
AGE_SIZE) {
>> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->e=
nd =3D p + (argp->pagelen>>2);
>> >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->e=
nd =3D argp->p + (argp->pagelen>>2);
>> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp-=
>pagelen =3D 0;
>> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 } else {
>> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->e=
nd =3D p + (PAGE_SIZE>>2);
>> >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->e=
nd =3D argp->p + (PAGE_SIZE>>2);
>> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp-=
>pagelen -=3D PAGE_SIZE;
>> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 }
>> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 }
>> >>
>> >>
>> > --
>> > To unsubscribe from this list: send the line "unsubscribe linux-nf=
s" in
>> > the body of a message to majordomo@vger.kernel.org
>> > More majordomo info at =A0http://vger.kernel.org/majordomo-info.ht=
ml
>> >
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2010-04-22 15:41 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-04-20  2:16 [PATCH] bug in read_buf Neil Brown
2010-04-20 16:51 ` J. Bruce Fields
2010-04-20 19:24   ` William A. (Andy) Adamson
     [not found]     ` <g2k89c397151004201224wb35ae389g961523bbef23f452-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-04-20 19:39       ` J. Bruce Fields
2010-04-21 22:35         ` J. Bruce Fields
2010-04-21 22:36           ` J. Bruce Fields
2010-04-21 23:08             ` Neil Brown
2010-04-22 15:41         ` William A. (Andy) Adamson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.