All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] mld, igmp: Fix reserved tailroom calculation
@ 2016-02-27 19:57 Benjamin Poirier
  2016-02-29 14:57 ` Daniel Borkmann
  0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Poirier @ 2016-02-27 19:57 UTC (permalink / raw
  To: netdev
  Cc: Daniel Borkmann, Eric Dumazet, Hannes Frederic Sowa,
	Hideaki YOSHIFUJI

The current reserved_tailroom calculation fails to take hlen and tlen into
account.

skb:
[__hlen__|__data____________|__tlen___|__extra__]
^                                               ^
head                                            skb_end_offset

In this representation, hlen + data + tlen is the size passed to alloc_skb.
"extra" is the extra space made available in __alloc_skb because of
rounding up by kmalloc. We can reorder the representation like so:

[__hlen__|__data____________|__extra__|__tlen___]
^                                               ^
head                                            skb_end_offset

The maximum space available for ip headers and payload without
fragmentation is min(mtu, data + extra). Therefore,
reserved_tailroom
= data + extra + tlen - min(mtu, data + extra)
= skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
= skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)

Compare the second line to the current expression:
reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset)
and we can see that hlen and tlen are not taken into account.

Depending on hlen, tlen, mtu and the number of multicast address records,
the current code may output skbs that have less tailroom than
dev->needed_tailroom or it may output more skbs than needed because not all
space available is used.

Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs")
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
---
 net/ipv4/igmp.c  | 4 ++--
 net/ipv6/mcast.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 05e4cba..b5d28a4 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -356,9 +356,9 @@ static struct sk_buff *igmpv3_newpack(struct net_device *dev, unsigned int mtu)
 	skb_dst_set(skb, &rt->dst);
 	skb->dev = dev;
 
-	skb->reserved_tailroom = skb_end_offset(skb) -
-				 min(mtu, skb_end_offset(skb));
 	skb_reserve(skb, hlen);
+	skb->reserved_tailroom = skb_tailroom(skb) -
+		min_t(int, mtu, skb_tailroom(skb) - tlen);
 
 	skb_reset_network_header(skb);
 	pip = ip_hdr(skb);
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 5ee56d0..c157edc 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -1574,9 +1574,9 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu)
 		return NULL;
 
 	skb->priority = TC_PRIO_CONTROL;
-	skb->reserved_tailroom = skb_end_offset(skb) -
-				 min(mtu, skb_end_offset(skb));
 	skb_reserve(skb, hlen);
+	skb->reserved_tailroom = skb_tailroom(skb) -
+		min_t(int, mtu, skb_tailroom(skb) - tlen);
 
 	if (__ipv6_get_lladdr(idev, &addr_buf, IFA_F_TENTATIVE)) {
 		/* <draft-ietf-magma-mld-source-05.txt>:
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] mld, igmp: Fix reserved tailroom calculation
  2016-02-27 19:57 [PATCH] mld, igmp: Fix reserved tailroom calculation Benjamin Poirier
@ 2016-02-29 14:57 ` Daniel Borkmann
  2016-02-29 15:19   ` Benjamin Poirier
  0 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2016-02-29 14:57 UTC (permalink / raw
  To: Benjamin Poirier, netdev
  Cc: Eric Dumazet, Hannes Frederic Sowa, Hideaki YOSHIFUJI

Hi Benjamin,

[ -Cc my old RH address ]

On 02/27/2016 08:57 PM, Benjamin Poirier wrote:
> The current reserved_tailroom calculation fails to take hlen and tlen into
> account.
>
> skb:
> [__hlen__|__data____________|__tlen___|__extra__]
> ^                                               ^
> head                                            skb_end_offset
>
> In this representation, hlen + data + tlen is the size passed to alloc_skb.
> "extra" is the extra space made available in __alloc_skb because of
> rounding up by kmalloc. We can reorder the representation like so:
>
> [__hlen__|__data____________|__extra__|__tlen___]
> ^                                               ^
> head                                            skb_end_offset
>
> The maximum space available for ip headers and payload without
> fragmentation is min(mtu, data + extra). Therefore,
> reserved_tailroom
> = data + extra + tlen - min(mtu, data + extra)
> = skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
> = skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)
>
> Compare the second line to the current expression:
> reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset)
> and we can see that hlen and tlen are not taken into account.
>
> Depending on hlen, tlen, mtu and the number of multicast address records,
> the current code may output skbs that have less tailroom than
> dev->needed_tailroom or it may output more skbs than needed because not all
> space available is used.
>
> Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs")
> Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
> ---
>   net/ipv4/igmp.c  | 4 ++--
>   net/ipv6/mcast.c | 4 ++--
>   2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
> index 05e4cba..b5d28a4 100644
> --- a/net/ipv4/igmp.c
> +++ b/net/ipv4/igmp.c
[...]

[ cutting the IPv4 part off as diff is the same ]

> diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
> index 5ee56d0..c157edc 100644
> --- a/net/ipv6/mcast.c
> +++ b/net/ipv6/mcast.c
> @@ -1574,9 +1574,9 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu)
>   		return NULL;
>
>   	skb->priority = TC_PRIO_CONTROL;
> -	skb->reserved_tailroom = skb_end_offset(skb) -
> -				 min(mtu, skb_end_offset(skb));
>   	skb_reserve(skb, hlen);
> +	skb->reserved_tailroom = skb_tailroom(skb) -
> +		min_t(int, mtu, skb_tailroom(skb) - tlen);

Are you sure this is correct? Wouldn't that mean (assuming we allocated
enough space), that I could now fill a larger than MTU frame?

>   	if (__ipv6_get_lladdr(idev, &addr_buf, IFA_F_TENTATIVE)) {
>   		/* <draft-ietf-magma-mld-source-05.txt>:

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mld, igmp: Fix reserved tailroom calculation
  2016-02-29 14:57 ` Daniel Borkmann
@ 2016-02-29 15:19   ` Benjamin Poirier
  2016-02-29 15:38     ` Daniel Borkmann
  2016-02-29 15:43     ` Hannes Frederic Sowa
  0 siblings, 2 replies; 12+ messages in thread
From: Benjamin Poirier @ 2016-02-29 15:19 UTC (permalink / raw
  To: Daniel Borkmann
  Cc: netdev, Eric Dumazet, Hannes Frederic Sowa, Hideaki YOSHIFUJI

On 2016/02/29 15:57, Daniel Borkmann wrote:
[...]
> 
> [ cutting the IPv4 part off as diff is the same ]
> 
> >diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
> >index 5ee56d0..c157edc 100644
> >--- a/net/ipv6/mcast.c
> >+++ b/net/ipv6/mcast.c
> >@@ -1574,9 +1574,9 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu)
> >  		return NULL;
> >
> >  	skb->priority = TC_PRIO_CONTROL;
> >-	skb->reserved_tailroom = skb_end_offset(skb) -
> >-				 min(mtu, skb_end_offset(skb));
> >  	skb_reserve(skb, hlen);
> >+	skb->reserved_tailroom = skb_tailroom(skb) -
> >+		min_t(int, mtu, skb_tailroom(skb) - tlen);
> 
> Are you sure this is correct? Wouldn't that mean (assuming we allocated
> enough space), that I could now fill a larger than MTU frame?

Quoting back a part of the log:

> >The maximum space available for ip headers and payload without
> >fragmentation is min(mtu, data + extra). Therefore,
> >reserved_tailroom
> >= data + extra + tlen - min(mtu, data + extra)
> >= skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
> >= skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)

The min() takes care of the situation you describe, ie. if the allocated
space is large, reserved_tailroom will be large enough that we do not
use more space than the mtu.

I tested the mld and igmp code with different driver parameters, mtu
values, number of multicast address records and even allocation
failures. If you think the formula is wrong, please provide a
counter-example with hlen, tlen, mtu and size values.

Regards,
-Benjamin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mld, igmp: Fix reserved tailroom calculation
  2016-02-29 15:19   ` Benjamin Poirier
@ 2016-02-29 15:38     ` Daniel Borkmann
  2016-02-29 15:43     ` Hannes Frederic Sowa
  1 sibling, 0 replies; 12+ messages in thread
From: Daniel Borkmann @ 2016-02-29 15:38 UTC (permalink / raw
  To: Benjamin Poirier
  Cc: netdev, Eric Dumazet, Hannes Frederic Sowa, Hideaki YOSHIFUJI

On 02/29/2016 04:19 PM, Benjamin Poirier wrote:
> On 2016/02/29 15:57, Daniel Borkmann wrote:
> [...]
>>
>> [ cutting the IPv4 part off as diff is the same ]
>>
>>> diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
>>> index 5ee56d0..c157edc 100644
>>> --- a/net/ipv6/mcast.c
>>> +++ b/net/ipv6/mcast.c
>>> @@ -1574,9 +1574,9 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu)
>>>   		return NULL;
>>>
>>>   	skb->priority = TC_PRIO_CONTROL;
>>> -	skb->reserved_tailroom = skb_end_offset(skb) -
>>> -				 min(mtu, skb_end_offset(skb));
>>>   	skb_reserve(skb, hlen);
>>> +	skb->reserved_tailroom = skb_tailroom(skb) -
>>> +		min_t(int, mtu, skb_tailroom(skb) - tlen);
>>
>> Are you sure this is correct? Wouldn't that mean (assuming we allocated
>> enough space), that I could now fill a larger than MTU frame?
>
> Quoting back a part of the log:
>
>>> The maximum space available for ip headers and payload without
>>> fragmentation is min(mtu, data + extra). Therefore,
>>> reserved_tailroom
>>> = data + extra + tlen - min(mtu, data + extra)
>>> = skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
>>> = skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)
>
> The min() takes care of the situation you describe, ie. if the allocated
> space is large, reserved_tailroom will be large enough that we do not
> use more space than the mtu.

Hmm, sorry, you are right, I had a bug in my thought process wrt the
skb_reserve() that is now done first.

Code is fine, patch would be against -net tree:

Acked-by: Daniel Borkmann <daniel@iogearbox.net>

Thanks, Benjamin!

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mld, igmp: Fix reserved tailroom calculation
  2016-02-29 15:19   ` Benjamin Poirier
  2016-02-29 15:38     ` Daniel Borkmann
@ 2016-02-29 15:43     ` Hannes Frederic Sowa
  2016-02-29 18:08       ` Benjamin Poirier
  1 sibling, 1 reply; 12+ messages in thread
From: Hannes Frederic Sowa @ 2016-02-29 15:43 UTC (permalink / raw
  To: Benjamin Poirier, Daniel Borkmann; +Cc: netdev, Eric Dumazet, Hideaki YOSHIFUJI

On 29.02.2016 16:19, Benjamin Poirier wrote:
> On 2016/02/29 15:57, Daniel Borkmann wrote:
> [...]
>>
>> [ cutting the IPv4 part off as diff is the same ]
>>
>>> diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
>>> index 5ee56d0..c157edc 100644
>>> --- a/net/ipv6/mcast.c
>>> +++ b/net/ipv6/mcast.c
>>> @@ -1574,9 +1574,9 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu)
>>>   		return NULL;
>>>
>>>   	skb->priority = TC_PRIO_CONTROL;
>>> -	skb->reserved_tailroom = skb_end_offset(skb) -
>>> -				 min(mtu, skb_end_offset(skb));
>>>   	skb_reserve(skb, hlen);
>>> +	skb->reserved_tailroom = skb_tailroom(skb) -
>>> +		min_t(int, mtu, skb_tailroom(skb) - tlen);
>>
>> Are you sure this is correct? Wouldn't that mean (assuming we allocated
>> enough space), that I could now fill a larger than MTU frame?
>
> Quoting back a part of the log:
>
>>> The maximum space available for ip headers and payload without
>>> fragmentation is min(mtu, data + extra). Therefore,
>>> reserved_tailroom
>>> = data + extra + tlen - min(mtu, data + extra)
>>> = skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
>>> = skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)
>
> The min() takes care of the situation you describe, ie. if the allocated
> space is large, reserved_tailroom will be large enough that we do not
> use more space than the mtu.
>
> I tested the mld and igmp code with different driver parameters, mtu
> values, number of multicast address records and even allocation
> failures. If you think the formula is wrong, please provide a
> counter-example with hlen, tlen, mtu and size values.

I think the code is fine albeit I think we should remove the min macro 
and just do something:

if (skb_tailroom(skb) > mtu)
	skb->reserved_tailroom = skb_tailroom(skb) - mtu;

Does that make sense? I think it is much more readable.

Thanks,
Hannes

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mld, igmp: Fix reserved tailroom calculation
  2016-02-29 15:43     ` Hannes Frederic Sowa
@ 2016-02-29 18:08       ` Benjamin Poirier
  2016-02-29 18:28         ` Hannes Frederic Sowa
  0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Poirier @ 2016-02-29 18:08 UTC (permalink / raw
  To: Hannes Frederic Sowa
  Cc: Daniel Borkmann, netdev, Eric Dumazet, Hideaki YOSHIFUJI

On 2016/02/29 16:43, Hannes Frederic Sowa wrote:
> On 29.02.2016 16:19, Benjamin Poirier wrote:
> >On 2016/02/29 15:57, Daniel Borkmann wrote:
> >[...]
> >>
> >>[ cutting the IPv4 part off as diff is the same ]
> >>
> >>>diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
> >>>index 5ee56d0..c157edc 100644
> >>>--- a/net/ipv6/mcast.c
> >>>+++ b/net/ipv6/mcast.c
> >>>@@ -1574,9 +1574,9 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu)
> >>>  		return NULL;
> >>>
> >>>  	skb->priority = TC_PRIO_CONTROL;
> >>>-	skb->reserved_tailroom = skb_end_offset(skb) -
> >>>-				 min(mtu, skb_end_offset(skb));
> >>>  	skb_reserve(skb, hlen);
> >>>+	skb->reserved_tailroom = skb_tailroom(skb) -
> >>>+		min_t(int, mtu, skb_tailroom(skb) - tlen);
> >>
> >>Are you sure this is correct? Wouldn't that mean (assuming we allocated
> >>enough space), that I could now fill a larger than MTU frame?
> >
> >Quoting back a part of the log:
> >
> >>>The maximum space available for ip headers and payload without
> >>>fragmentation is min(mtu, data + extra). Therefore,
> >>>reserved_tailroom
> >>>= data + extra + tlen - min(mtu, data + extra)
> >>>= skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
> >>>= skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)
> >
> >The min() takes care of the situation you describe, ie. if the allocated
> >space is large, reserved_tailroom will be large enough that we do not
> >use more space than the mtu.
> >
> >I tested the mld and igmp code with different driver parameters, mtu
> >values, number of multicast address records and even allocation
> >failures. If you think the formula is wrong, please provide a
> >counter-example with hlen, tlen, mtu and size values.
> 
> I think the code is fine albeit I think we should remove the min macro and
> just do something:
> 
> if (skb_tailroom(skb) > mtu)
> 	skb->reserved_tailroom = skb_tailroom(skb) - mtu;
> 
> Does that make sense? I think it is much more readable.

That is not equivalent. It fails to take tlen into account.

For igmp, consider this case:
with hlen = 16, mtu = 9000, tlen = 8,
additionally, suppose that the first iteration of the allocation loop
(alloc_skb(9000 + 16 + 8, ...) which requires 4 pages) fails and the
second iteration (alloc_skb((9000 >> 1) + 16 + 8, ...) which requires 2
pages) succeeds:
	size = (9000 >> 1) + 16 + 8 = 4524
	skb_end_offset = 8192 - 320 = 7872
	tailroom = 7872 - 16 = 7856

	data = 9000 >> 1 = 4500
	extra = 7872 - 4524 = 3348

	reserved tailroom (patch version)
		= 4500 + 3348 + 8 - min(9000, 4500 + 3348)
		= 8
	reserved tailroom (your version)
		= 0

	Headers are ipv4 + igmpv3 = 24 + 8 = 32, records are 8 bytes
	With 978 igmpv3 records, with your version, we would output an
	skb that has less tailroom (0) than dev->needed_tailroom (8).

For mld, consider this case:
with hlen = 16, mtu = 9000, tlen = 8:
	size = 3776 (SKB_MAX_ORDER case)
	skb_end_offset = 3776
	tailroom = 3776 - 16 = 3760

	data = 3776 - 16 - 8 = 3752
	extra = 0

	reserved tailroom (patch version)
		= 3752 + 0 + 8 - min(9000, 3752 + 0)
		= 8
	reserved tailroom (your version)
		= 0

	Headers are ipv6 + icmpv6 = 48 + 8 = 56, records are 20 bytes
	With 185 mld records, with your formula, we would output an skb that
	has less tailroom (4) than dev->needed_tailroom (8).

If you think we should write the expression with "if" instead of "min",
instead of the current

+	skb->reserved_tailroom = skb_tailroom(skb) -
+		min_t(int, mtu, skb_tailroom(skb) - tlen);

it should be:

+	if (mtu < skb_tailroom(skb) - tlen)
+		skb->reserved_tailroom = skb_tailroom(skb) - mtu;
+	else
+		skb->reserved_tailroom = tlen;

The second alternative does not look more readable to me but I have been
looking at that expression for a while. If you think that it is more
readable, I will resend the patch expressed that way. Please let me
know.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mld, igmp: Fix reserved tailroom calculation
  2016-02-29 18:08       ` Benjamin Poirier
@ 2016-02-29 18:28         ` Hannes Frederic Sowa
  2016-02-29 23:03           ` [PATCH net v2] " Benjamin Poirier
  0 siblings, 1 reply; 12+ messages in thread
From: Hannes Frederic Sowa @ 2016-02-29 18:28 UTC (permalink / raw
  To: Benjamin Poirier; +Cc: Daniel Borkmann, netdev, Eric Dumazet, Hideaki YOSHIFUJI

On 29.02.2016 19:08, Benjamin Poirier wrote:
> If you think we should write the expression with "if" instead of "min",
> instead of the current
>
> +	skb->reserved_tailroom = skb_tailroom(skb) -
> +		min_t(int, mtu, skb_tailroom(skb) - tlen);
>
> it should be:
>
> +	if (mtu < skb_tailroom(skb) - tlen)
> +		skb->reserved_tailroom = skb_tailroom(skb) - mtu;
> +	else
> +		skb->reserved_tailroom = tlen;
>
> The second alternative does not look more readable to me but I have been
> looking at that expression for a while. If you think that it is more
> readable, I will resend the patch expressed that way. Please let me
> know.

I would still find it more readable actually, but no strong opinion, I 
would leave it up to you.

Could it make sense to put this code into a static inline helper and 
reuse it for both, igmp and mld?

Thanks,
Hannes

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net v2] mld, igmp: Fix reserved tailroom calculation
  2016-02-29 18:28         ` Hannes Frederic Sowa
@ 2016-02-29 23:03           ` Benjamin Poirier
  2016-03-01 10:09             ` Hannes Frederic Sowa
                               ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Benjamin Poirier @ 2016-02-29 23:03 UTC (permalink / raw
  To: netdev
  Cc: Daniel Borkmann, Eric Dumazet, Hannes Frederic Sowa,
	Hideaki YOSHIFUJI

The current reserved_tailroom calculation fails to take hlen and tlen into
account.

skb:
[__hlen__|__data____________|__tlen___|__extra__]
^                                               ^
head                                            skb_end_offset

In this representation, hlen + data + tlen is the size passed to alloc_skb.
"extra" is the extra space made available in __alloc_skb because of
rounding up by kmalloc. We can reorder the representation like so:

[__hlen__|__data____________|__extra__|__tlen___]
^                                               ^
head                                            skb_end_offset

The maximum space available for ip headers and payload without
fragmentation is min(mtu, data + extra). Therefore,
reserved_tailroom
= data + extra + tlen - min(mtu, data + extra)
= skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
= skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)

Compare the second line to the current expression:
reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset)
and we can see that hlen and tlen are not taken into account.

The min() in the third line can be expanded into:
if mtu < skb_tailroom - tlen:
	reserved_tailroom = skb_tailroom - mtu
else:
	reserved_tailroom = tlen

Depending on hlen, tlen, mtu and the number of multicast address records,
the current code may output skbs that have less tailroom than
dev->needed_tailroom or it may output more skbs than needed because not all
space available is used.

Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs")
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
---

Notes:
    Changes v1->v2
    As suggested by Hannes, move the code to an inline helper and express it
    using "if" rather than "min".

 include/linux/skbuff.h | 24 ++++++++++++++++++++++++
 net/ipv4/igmp.c        |  3 +--
 net/ipv6/mcast.c       |  3 +--
 3 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4ce9ff7..d3fcd45 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1985,6 +1985,30 @@ static inline void skb_reserve(struct sk_buff *skb, int len)
 	skb->tail += len;
 }
 
+/**
+ *	skb_tailroom_reserve - adjust reserved_tailroom
+ *	@skb: buffer to alter
+ *	@mtu: maximum amount of headlen permitted
+ *	@needed_tailroom: minimum amount of reserved_tailroom
+ *
+ *	Set reserved_tailroom so that headlen can be as large as possible but
+ *	not larger than mtu and tailroom cannot be smaller than
+ *	needed_tailroom.
+ *	The required headroom should already have been reserved before using
+ *	this function.
+ */
+static inline void skb_tailroom_reserve(struct sk_buff *skb, unsigned int mtu,
+					unsigned int needed_tailroom)
+{
+	SKB_LINEAR_ASSERT(skb);
+	if (mtu < skb_tailroom(skb) - needed_tailroom)
+		/* use at most mtu */
+		skb->reserved_tailroom = skb_tailroom(skb) - mtu;
+	else
+		/* use up to all available space */
+		skb->reserved_tailroom = needed_tailroom;
+}
+
 #define ENCAP_TYPE_ETHER	0
 #define ENCAP_TYPE_IPPROTO	1
 
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 05e4cba..b3086cf 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -356,9 +356,8 @@ static struct sk_buff *igmpv3_newpack(struct net_device *dev, unsigned int mtu)
 	skb_dst_set(skb, &rt->dst);
 	skb->dev = dev;
 
-	skb->reserved_tailroom = skb_end_offset(skb) -
-				 min(mtu, skb_end_offset(skb));
 	skb_reserve(skb, hlen);
+	skb_tailroom_reserve(skb, mtu, tlen);
 
 	skb_reset_network_header(skb);
 	pip = ip_hdr(skb);
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 5ee56d0..d64ee7e 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -1574,9 +1574,8 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu)
 		return NULL;
 
 	skb->priority = TC_PRIO_CONTROL;
-	skb->reserved_tailroom = skb_end_offset(skb) -
-				 min(mtu, skb_end_offset(skb));
 	skb_reserve(skb, hlen);
+	skb_tailroom_reserve(skb, mtu, tlen);
 
 	if (__ipv6_get_lladdr(idev, &addr_buf, IFA_F_TENTATIVE)) {
 		/* <draft-ietf-magma-mld-source-05.txt>:
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2] mld, igmp: Fix reserved tailroom calculation
  2016-02-29 23:03           ` [PATCH net v2] " Benjamin Poirier
@ 2016-03-01 10:09             ` Hannes Frederic Sowa
  2016-03-01 10:18             ` Daniel Borkmann
  2016-03-03 20:42             ` David Miller
  2 siblings, 0 replies; 12+ messages in thread
From: Hannes Frederic Sowa @ 2016-03-01 10:09 UTC (permalink / raw
  To: Benjamin Poirier, netdev; +Cc: Daniel Borkmann, Eric Dumazet, Hideaki YOSHIFUJI

On 01.03.2016 00:03, Benjamin Poirier wrote:
> The current reserved_tailroom calculation fails to take hlen and tlen into
> account.
>
> skb:
> [__hlen__|__data____________|__tlen___|__extra__]
> ^                                               ^
> head                                            skb_end_offset
>
> In this representation, hlen + data + tlen is the size passed to alloc_skb.
> "extra" is the extra space made available in __alloc_skb because of
> rounding up by kmalloc. We can reorder the representation like so:
>
> [__hlen__|__data____________|__extra__|__tlen___]
> ^                                               ^
> head                                            skb_end_offset
>
> The maximum space available for ip headers and payload without
> fragmentation is min(mtu, data + extra). Therefore,
> reserved_tailroom
> = data + extra + tlen - min(mtu, data + extra)
> = skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
> = skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)
>
> Compare the second line to the current expression:
> reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset)
> and we can see that hlen and tlen are not taken into account.
>
> The min() in the third line can be expanded into:
> if mtu < skb_tailroom - tlen:
> 	reserved_tailroom = skb_tailroom - mtu
> else:
> 	reserved_tailroom = tlen
>
> Depending on hlen, tlen, mtu and the number of multicast address records,
> the current code may output skbs that have less tailroom than
> dev->needed_tailroom or it may output more skbs than needed because not all
> space available is used.
>
> Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs")
> Signed-off-by: Benjamin Poirier <bpoirier@suse.com>

I like it, thanks a lot!

Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2] mld, igmp: Fix reserved tailroom calculation
  2016-02-29 23:03           ` [PATCH net v2] " Benjamin Poirier
  2016-03-01 10:09             ` Hannes Frederic Sowa
@ 2016-03-01 10:18             ` Daniel Borkmann
  2016-03-01 16:00               ` Hannes Frederic Sowa
  2016-03-03 20:42             ` David Miller
  2 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2016-03-01 10:18 UTC (permalink / raw
  To: Benjamin Poirier, netdev
  Cc: Eric Dumazet, Hannes Frederic Sowa, Hideaki YOSHIFUJI

On 03/01/2016 12:03 AM, Benjamin Poirier wrote:
[...]
> Notes:
>      Changes v1->v2
>      As suggested by Hannes, move the code to an inline helper and express it
>      using "if" rather than "min".

The code is correct, thanks!

Therefore:

Acked-by: Daniel Borkmann <daniel@iogearbox.net>

However, I actually think that v1 was much better/easier as a fix though. :/

Meaning 1) it's likely easier to backport, and 2) that we now need a comment
above each skb->reserved_tailroom assignment probably says that min() might
perhaps have been more self-documenting ...

skb_tailroom_reserve() looks quite generic, but it only makes sense to use in
combination with skb_availroom(), which would have been good to put a note to
the header comment. Also "the required headroom should already have been
reserved before using this function", places one more requirement for usage.

If we really want to go that path, maybe rather a skb_setroom() that is coupled
with skb_availroom() like:

static inline int __skb_tailroom(const struct sk_buff *skb)
{
	return skb->end - skb->tail;
}

static inline void skb_setroom(struct sk_buff *skb,
                                unsigned int needed_headroom,
                                unsigned int size,
                                unsigned int needed_tailroom)
{
         SKB_LINEAR_ASSERT(skb);

         skb_reserve(skb, needed_headroom);
         skb->reserved_tailroom = needed_tailroom;

         if (size < __skb_tailroom(skb) - needed_tailroom)
                 skb->reserved_tailroom = __skb_tailroom(skb) - size;
}

Then, skb_tailroom() would internally use __skb_tailroom(), too. And we can also
spare us the two unneeded skb_is_nonlinear() checks in our helper which will
currently always evaluate to false anyway.

... just a thought.

Thanks again,
Daniel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2] mld, igmp: Fix reserved tailroom calculation
  2016-03-01 10:18             ` Daniel Borkmann
@ 2016-03-01 16:00               ` Hannes Frederic Sowa
  0 siblings, 0 replies; 12+ messages in thread
From: Hannes Frederic Sowa @ 2016-03-01 16:00 UTC (permalink / raw
  To: Daniel Borkmann, Benjamin Poirier, netdev; +Cc: Eric Dumazet, Hideaki YOSHIFUJI

On 01.03.2016 11:18, Daniel Borkmann wrote:
> On 03/01/2016 12:03 AM, Benjamin Poirier wrote:
> [...]
>> Notes:
>>      Changes v1->v2
>>      As suggested by Hannes, move the code to an inline helper and
>> express it
>>      using "if" rather than "min".
>
> The code is correct, thanks!
>
> Therefore:
>
> Acked-by: Daniel Borkmann <daniel@iogearbox.net>
>
> However, I actually think that v1 was much better/easier as a fix
> though. :/
>
> Meaning 1) it's likely easier to backport, and 2) that we now need a
> comment
> above each skb->reserved_tailroom assignment probably says that min() might
> perhaps have been more self-documenting ...
>
> skb_tailroom_reserve() looks quite generic, but it only makes sense to
> use in
> combination with skb_availroom(), which would have been good to put a
> note to
> the header comment. Also "the required headroom should already have been
> reserved before using this function", places one more requirement for
> usage.
>
> If we really want to go that path, maybe rather a skb_setroom() that is
> coupled
> with skb_availroom() like:
>
> static inline int __skb_tailroom(const struct sk_buff *skb)
> {
>      return skb->end - skb->tail;
> }
>
> static inline void skb_setroom(struct sk_buff *skb,
>                                 unsigned int needed_headroom,
>                                 unsigned int size,
>                                 unsigned int needed_tailroom)
> {
>          SKB_LINEAR_ASSERT(skb);
>
>          skb_reserve(skb, needed_headroom);
>          skb->reserved_tailroom = needed_tailroom;
>
>          if (size < __skb_tailroom(skb) - needed_tailroom)
>                  skb->reserved_tailroom = __skb_tailroom(skb) - size;
> }
>
> Then, skb_tailroom() would internally use __skb_tailroom(), too. And we
> can also
> spare us the two unneeded skb_is_nonlinear() checks in our helper which
> will
> currently always evaluate to false anyway.
>
> ... just a thought.

I think it is fine. The code will be inlined anyway and probably skb 
linear assertions will be optimized away.

I like the current code in its form:

skb_reserve(skb, header);
skb_tailroom_reserve(skb, mtu, tailroom);

Combined form has too many arguments, so one has to look up the header 
file again.

Bye,
Hannes

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2] mld, igmp: Fix reserved tailroom calculation
  2016-02-29 23:03           ` [PATCH net v2] " Benjamin Poirier
  2016-03-01 10:09             ` Hannes Frederic Sowa
  2016-03-01 10:18             ` Daniel Borkmann
@ 2016-03-03 20:42             ` David Miller
  2 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2016-03-03 20:42 UTC (permalink / raw
  To: bpoirier; +Cc: netdev, daniel, eric.dumazet, hannes, yoshfuji

From: Benjamin Poirier <bpoirier@suse.com>
Date: Mon, 29 Feb 2016 15:03:33 -0800

> The current reserved_tailroom calculation fails to take hlen and tlen into
> account.
 ...
> Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs")
> Signed-off-by: Benjamin Poirier <bpoirier@suse.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2016-03-03 20:42 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-02-27 19:57 [PATCH] mld, igmp: Fix reserved tailroom calculation Benjamin Poirier
2016-02-29 14:57 ` Daniel Borkmann
2016-02-29 15:19   ` Benjamin Poirier
2016-02-29 15:38     ` Daniel Borkmann
2016-02-29 15:43     ` Hannes Frederic Sowa
2016-02-29 18:08       ` Benjamin Poirier
2016-02-29 18:28         ` Hannes Frederic Sowa
2016-02-29 23:03           ` [PATCH net v2] " Benjamin Poirier
2016-03-01 10:09             ` Hannes Frederic Sowa
2016-03-01 10:18             ` Daniel Borkmann
2016-03-01 16:00               ` Hannes Frederic Sowa
2016-03-03 20:42             ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.