* [PATCH] mld, igmp: Fix reserved tailroom calculation
@ 2016-02-27 19:57 Benjamin Poirier
2016-02-29 14:57 ` Daniel Borkmann
0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Poirier @ 2016-02-27 19:57 UTC (permalink / raw
To: netdev
Cc: Daniel Borkmann, Eric Dumazet, Hannes Frederic Sowa,
Hideaki YOSHIFUJI
The current reserved_tailroom calculation fails to take hlen and tlen into
account.
skb:
[__hlen__|__data____________|__tlen___|__extra__]
^ ^
head skb_end_offset
In this representation, hlen + data + tlen is the size passed to alloc_skb.
"extra" is the extra space made available in __alloc_skb because of
rounding up by kmalloc. We can reorder the representation like so:
[__hlen__|__data____________|__extra__|__tlen___]
^ ^
head skb_end_offset
The maximum space available for ip headers and payload without
fragmentation is min(mtu, data + extra). Therefore,
reserved_tailroom
= data + extra + tlen - min(mtu, data + extra)
= skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
= skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)
Compare the second line to the current expression:
reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset)
and we can see that hlen and tlen are not taken into account.
Depending on hlen, tlen, mtu and the number of multicast address records,
the current code may output skbs that have less tailroom than
dev->needed_tailroom or it may output more skbs than needed because not all
space available is used.
Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs")
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
---
net/ipv4/igmp.c | 4 ++--
net/ipv6/mcast.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 05e4cba..b5d28a4 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -356,9 +356,9 @@ static struct sk_buff *igmpv3_newpack(struct net_device *dev, unsigned int mtu)
skb_dst_set(skb, &rt->dst);
skb->dev = dev;
- skb->reserved_tailroom = skb_end_offset(skb) -
- min(mtu, skb_end_offset(skb));
skb_reserve(skb, hlen);
+ skb->reserved_tailroom = skb_tailroom(skb) -
+ min_t(int, mtu, skb_tailroom(skb) - tlen);
skb_reset_network_header(skb);
pip = ip_hdr(skb);
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 5ee56d0..c157edc 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -1574,9 +1574,9 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu)
return NULL;
skb->priority = TC_PRIO_CONTROL;
- skb->reserved_tailroom = skb_end_offset(skb) -
- min(mtu, skb_end_offset(skb));
skb_reserve(skb, hlen);
+ skb->reserved_tailroom = skb_tailroom(skb) -
+ min_t(int, mtu, skb_tailroom(skb) - tlen);
if (__ipv6_get_lladdr(idev, &addr_buf, IFA_F_TENTATIVE)) {
/* <draft-ietf-magma-mld-source-05.txt>:
--
2.7.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] mld, igmp: Fix reserved tailroom calculation
2016-02-27 19:57 [PATCH] mld, igmp: Fix reserved tailroom calculation Benjamin Poirier
@ 2016-02-29 14:57 ` Daniel Borkmann
2016-02-29 15:19 ` Benjamin Poirier
0 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2016-02-29 14:57 UTC (permalink / raw
To: Benjamin Poirier, netdev
Cc: Eric Dumazet, Hannes Frederic Sowa, Hideaki YOSHIFUJI
Hi Benjamin,
[ -Cc my old RH address ]
On 02/27/2016 08:57 PM, Benjamin Poirier wrote:
> The current reserved_tailroom calculation fails to take hlen and tlen into
> account.
>
> skb:
> [__hlen__|__data____________|__tlen___|__extra__]
> ^ ^
> head skb_end_offset
>
> In this representation, hlen + data + tlen is the size passed to alloc_skb.
> "extra" is the extra space made available in __alloc_skb because of
> rounding up by kmalloc. We can reorder the representation like so:
>
> [__hlen__|__data____________|__extra__|__tlen___]
> ^ ^
> head skb_end_offset
>
> The maximum space available for ip headers and payload without
> fragmentation is min(mtu, data + extra). Therefore,
> reserved_tailroom
> = data + extra + tlen - min(mtu, data + extra)
> = skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
> = skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)
>
> Compare the second line to the current expression:
> reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset)
> and we can see that hlen and tlen are not taken into account.
>
> Depending on hlen, tlen, mtu and the number of multicast address records,
> the current code may output skbs that have less tailroom than
> dev->needed_tailroom or it may output more skbs than needed because not all
> space available is used.
>
> Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs")
> Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
> ---
> net/ipv4/igmp.c | 4 ++--
> net/ipv6/mcast.c | 4 ++--
> 2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
> index 05e4cba..b5d28a4 100644
> --- a/net/ipv4/igmp.c
> +++ b/net/ipv4/igmp.c
[...]
[ cutting the IPv4 part off as diff is the same ]
> diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
> index 5ee56d0..c157edc 100644
> --- a/net/ipv6/mcast.c
> +++ b/net/ipv6/mcast.c
> @@ -1574,9 +1574,9 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu)
> return NULL;
>
> skb->priority = TC_PRIO_CONTROL;
> - skb->reserved_tailroom = skb_end_offset(skb) -
> - min(mtu, skb_end_offset(skb));
> skb_reserve(skb, hlen);
> + skb->reserved_tailroom = skb_tailroom(skb) -
> + min_t(int, mtu, skb_tailroom(skb) - tlen);
Are you sure this is correct? Wouldn't that mean (assuming we allocated
enough space), that I could now fill a larger than MTU frame?
> if (__ipv6_get_lladdr(idev, &addr_buf, IFA_F_TENTATIVE)) {
> /* <draft-ietf-magma-mld-source-05.txt>:
Thanks,
Daniel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mld, igmp: Fix reserved tailroom calculation
2016-02-29 14:57 ` Daniel Borkmann
@ 2016-02-29 15:19 ` Benjamin Poirier
2016-02-29 15:38 ` Daniel Borkmann
2016-02-29 15:43 ` Hannes Frederic Sowa
0 siblings, 2 replies; 12+ messages in thread
From: Benjamin Poirier @ 2016-02-29 15:19 UTC (permalink / raw
To: Daniel Borkmann
Cc: netdev, Eric Dumazet, Hannes Frederic Sowa, Hideaki YOSHIFUJI
On 2016/02/29 15:57, Daniel Borkmann wrote:
[...]
>
> [ cutting the IPv4 part off as diff is the same ]
>
> >diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
> >index 5ee56d0..c157edc 100644
> >--- a/net/ipv6/mcast.c
> >+++ b/net/ipv6/mcast.c
> >@@ -1574,9 +1574,9 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu)
> > return NULL;
> >
> > skb->priority = TC_PRIO_CONTROL;
> >- skb->reserved_tailroom = skb_end_offset(skb) -
> >- min(mtu, skb_end_offset(skb));
> > skb_reserve(skb, hlen);
> >+ skb->reserved_tailroom = skb_tailroom(skb) -
> >+ min_t(int, mtu, skb_tailroom(skb) - tlen);
>
> Are you sure this is correct? Wouldn't that mean (assuming we allocated
> enough space), that I could now fill a larger than MTU frame?
Quoting back a part of the log:
> >The maximum space available for ip headers and payload without
> >fragmentation is min(mtu, data + extra). Therefore,
> >reserved_tailroom
> >= data + extra + tlen - min(mtu, data + extra)
> >= skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
> >= skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)
The min() takes care of the situation you describe, ie. if the allocated
space is large, reserved_tailroom will be large enough that we do not
use more space than the mtu.
I tested the mld and igmp code with different driver parameters, mtu
values, number of multicast address records and even allocation
failures. If you think the formula is wrong, please provide a
counter-example with hlen, tlen, mtu and size values.
Regards,
-Benjamin
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mld, igmp: Fix reserved tailroom calculation
2016-02-29 15:19 ` Benjamin Poirier
@ 2016-02-29 15:38 ` Daniel Borkmann
2016-02-29 15:43 ` Hannes Frederic Sowa
1 sibling, 0 replies; 12+ messages in thread
From: Daniel Borkmann @ 2016-02-29 15:38 UTC (permalink / raw
To: Benjamin Poirier
Cc: netdev, Eric Dumazet, Hannes Frederic Sowa, Hideaki YOSHIFUJI
On 02/29/2016 04:19 PM, Benjamin Poirier wrote:
> On 2016/02/29 15:57, Daniel Borkmann wrote:
> [...]
>>
>> [ cutting the IPv4 part off as diff is the same ]
>>
>>> diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
>>> index 5ee56d0..c157edc 100644
>>> --- a/net/ipv6/mcast.c
>>> +++ b/net/ipv6/mcast.c
>>> @@ -1574,9 +1574,9 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu)
>>> return NULL;
>>>
>>> skb->priority = TC_PRIO_CONTROL;
>>> - skb->reserved_tailroom = skb_end_offset(skb) -
>>> - min(mtu, skb_end_offset(skb));
>>> skb_reserve(skb, hlen);
>>> + skb->reserved_tailroom = skb_tailroom(skb) -
>>> + min_t(int, mtu, skb_tailroom(skb) - tlen);
>>
>> Are you sure this is correct? Wouldn't that mean (assuming we allocated
>> enough space), that I could now fill a larger than MTU frame?
>
> Quoting back a part of the log:
>
>>> The maximum space available for ip headers and payload without
>>> fragmentation is min(mtu, data + extra). Therefore,
>>> reserved_tailroom
>>> = data + extra + tlen - min(mtu, data + extra)
>>> = skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
>>> = skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)
>
> The min() takes care of the situation you describe, ie. if the allocated
> space is large, reserved_tailroom will be large enough that we do not
> use more space than the mtu.
Hmm, sorry, you are right, I had a bug in my thought process wrt the
skb_reserve() that is now done first.
Code is fine, patch would be against -net tree:
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Thanks, Benjamin!
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mld, igmp: Fix reserved tailroom calculation
2016-02-29 15:19 ` Benjamin Poirier
2016-02-29 15:38 ` Daniel Borkmann
@ 2016-02-29 15:43 ` Hannes Frederic Sowa
2016-02-29 18:08 ` Benjamin Poirier
1 sibling, 1 reply; 12+ messages in thread
From: Hannes Frederic Sowa @ 2016-02-29 15:43 UTC (permalink / raw
To: Benjamin Poirier, Daniel Borkmann; +Cc: netdev, Eric Dumazet, Hideaki YOSHIFUJI
On 29.02.2016 16:19, Benjamin Poirier wrote:
> On 2016/02/29 15:57, Daniel Borkmann wrote:
> [...]
>>
>> [ cutting the IPv4 part off as diff is the same ]
>>
>>> diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
>>> index 5ee56d0..c157edc 100644
>>> --- a/net/ipv6/mcast.c
>>> +++ b/net/ipv6/mcast.c
>>> @@ -1574,9 +1574,9 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu)
>>> return NULL;
>>>
>>> skb->priority = TC_PRIO_CONTROL;
>>> - skb->reserved_tailroom = skb_end_offset(skb) -
>>> - min(mtu, skb_end_offset(skb));
>>> skb_reserve(skb, hlen);
>>> + skb->reserved_tailroom = skb_tailroom(skb) -
>>> + min_t(int, mtu, skb_tailroom(skb) - tlen);
>>
>> Are you sure this is correct? Wouldn't that mean (assuming we allocated
>> enough space), that I could now fill a larger than MTU frame?
>
> Quoting back a part of the log:
>
>>> The maximum space available for ip headers and payload without
>>> fragmentation is min(mtu, data + extra). Therefore,
>>> reserved_tailroom
>>> = data + extra + tlen - min(mtu, data + extra)
>>> = skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
>>> = skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)
>
> The min() takes care of the situation you describe, ie. if the allocated
> space is large, reserved_tailroom will be large enough that we do not
> use more space than the mtu.
>
> I tested the mld and igmp code with different driver parameters, mtu
> values, number of multicast address records and even allocation
> failures. If you think the formula is wrong, please provide a
> counter-example with hlen, tlen, mtu and size values.
I think the code is fine albeit I think we should remove the min macro
and just do something:
if (skb_tailroom(skb) > mtu)
skb->reserved_tailroom = skb_tailroom(skb) - mtu;
Does that make sense? I think it is much more readable.
Thanks,
Hannes
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mld, igmp: Fix reserved tailroom calculation
2016-02-29 15:43 ` Hannes Frederic Sowa
@ 2016-02-29 18:08 ` Benjamin Poirier
2016-02-29 18:28 ` Hannes Frederic Sowa
0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Poirier @ 2016-02-29 18:08 UTC (permalink / raw
To: Hannes Frederic Sowa
Cc: Daniel Borkmann, netdev, Eric Dumazet, Hideaki YOSHIFUJI
On 2016/02/29 16:43, Hannes Frederic Sowa wrote:
> On 29.02.2016 16:19, Benjamin Poirier wrote:
> >On 2016/02/29 15:57, Daniel Borkmann wrote:
> >[...]
> >>
> >>[ cutting the IPv4 part off as diff is the same ]
> >>
> >>>diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
> >>>index 5ee56d0..c157edc 100644
> >>>--- a/net/ipv6/mcast.c
> >>>+++ b/net/ipv6/mcast.c
> >>>@@ -1574,9 +1574,9 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu)
> >>> return NULL;
> >>>
> >>> skb->priority = TC_PRIO_CONTROL;
> >>>- skb->reserved_tailroom = skb_end_offset(skb) -
> >>>- min(mtu, skb_end_offset(skb));
> >>> skb_reserve(skb, hlen);
> >>>+ skb->reserved_tailroom = skb_tailroom(skb) -
> >>>+ min_t(int, mtu, skb_tailroom(skb) - tlen);
> >>
> >>Are you sure this is correct? Wouldn't that mean (assuming we allocated
> >>enough space), that I could now fill a larger than MTU frame?
> >
> >Quoting back a part of the log:
> >
> >>>The maximum space available for ip headers and payload without
> >>>fragmentation is min(mtu, data + extra). Therefore,
> >>>reserved_tailroom
> >>>= data + extra + tlen - min(mtu, data + extra)
> >>>= skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
> >>>= skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)
> >
> >The min() takes care of the situation you describe, ie. if the allocated
> >space is large, reserved_tailroom will be large enough that we do not
> >use more space than the mtu.
> >
> >I tested the mld and igmp code with different driver parameters, mtu
> >values, number of multicast address records and even allocation
> >failures. If you think the formula is wrong, please provide a
> >counter-example with hlen, tlen, mtu and size values.
>
> I think the code is fine albeit I think we should remove the min macro and
> just do something:
>
> if (skb_tailroom(skb) > mtu)
> skb->reserved_tailroom = skb_tailroom(skb) - mtu;
>
> Does that make sense? I think it is much more readable.
That is not equivalent. It fails to take tlen into account.
For igmp, consider this case:
with hlen = 16, mtu = 9000, tlen = 8,
additionally, suppose that the first iteration of the allocation loop
(alloc_skb(9000 + 16 + 8, ...) which requires 4 pages) fails and the
second iteration (alloc_skb((9000 >> 1) + 16 + 8, ...) which requires 2
pages) succeeds:
size = (9000 >> 1) + 16 + 8 = 4524
skb_end_offset = 8192 - 320 = 7872
tailroom = 7872 - 16 = 7856
data = 9000 >> 1 = 4500
extra = 7872 - 4524 = 3348
reserved tailroom (patch version)
= 4500 + 3348 + 8 - min(9000, 4500 + 3348)
= 8
reserved tailroom (your version)
= 0
Headers are ipv4 + igmpv3 = 24 + 8 = 32, records are 8 bytes
With 978 igmpv3 records, with your version, we would output an
skb that has less tailroom (0) than dev->needed_tailroom (8).
For mld, consider this case:
with hlen = 16, mtu = 9000, tlen = 8:
size = 3776 (SKB_MAX_ORDER case)
skb_end_offset = 3776
tailroom = 3776 - 16 = 3760
data = 3776 - 16 - 8 = 3752
extra = 0
reserved tailroom (patch version)
= 3752 + 0 + 8 - min(9000, 3752 + 0)
= 8
reserved tailroom (your version)
= 0
Headers are ipv6 + icmpv6 = 48 + 8 = 56, records are 20 bytes
With 185 mld records, with your formula, we would output an skb that
has less tailroom (4) than dev->needed_tailroom (8).
If you think we should write the expression with "if" instead of "min",
instead of the current
+ skb->reserved_tailroom = skb_tailroom(skb) -
+ min_t(int, mtu, skb_tailroom(skb) - tlen);
it should be:
+ if (mtu < skb_tailroom(skb) - tlen)
+ skb->reserved_tailroom = skb_tailroom(skb) - mtu;
+ else
+ skb->reserved_tailroom = tlen;
The second alternative does not look more readable to me but I have been
looking at that expression for a while. If you think that it is more
readable, I will resend the patch expressed that way. Please let me
know.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mld, igmp: Fix reserved tailroom calculation
2016-02-29 18:08 ` Benjamin Poirier
@ 2016-02-29 18:28 ` Hannes Frederic Sowa
2016-02-29 23:03 ` [PATCH net v2] " Benjamin Poirier
0 siblings, 1 reply; 12+ messages in thread
From: Hannes Frederic Sowa @ 2016-02-29 18:28 UTC (permalink / raw
To: Benjamin Poirier; +Cc: Daniel Borkmann, netdev, Eric Dumazet, Hideaki YOSHIFUJI
On 29.02.2016 19:08, Benjamin Poirier wrote:
> If you think we should write the expression with "if" instead of "min",
> instead of the current
>
> + skb->reserved_tailroom = skb_tailroom(skb) -
> + min_t(int, mtu, skb_tailroom(skb) - tlen);
>
> it should be:
>
> + if (mtu < skb_tailroom(skb) - tlen)
> + skb->reserved_tailroom = skb_tailroom(skb) - mtu;
> + else
> + skb->reserved_tailroom = tlen;
>
> The second alternative does not look more readable to me but I have been
> looking at that expression for a while. If you think that it is more
> readable, I will resend the patch expressed that way. Please let me
> know.
I would still find it more readable actually, but no strong opinion, I
would leave it up to you.
Could it make sense to put this code into a static inline helper and
reuse it for both, igmp and mld?
Thanks,
Hannes
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net v2] mld, igmp: Fix reserved tailroom calculation
2016-02-29 18:28 ` Hannes Frederic Sowa
@ 2016-02-29 23:03 ` Benjamin Poirier
2016-03-01 10:09 ` Hannes Frederic Sowa
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Benjamin Poirier @ 2016-02-29 23:03 UTC (permalink / raw
To: netdev
Cc: Daniel Borkmann, Eric Dumazet, Hannes Frederic Sowa,
Hideaki YOSHIFUJI
The current reserved_tailroom calculation fails to take hlen and tlen into
account.
skb:
[__hlen__|__data____________|__tlen___|__extra__]
^ ^
head skb_end_offset
In this representation, hlen + data + tlen is the size passed to alloc_skb.
"extra" is the extra space made available in __alloc_skb because of
rounding up by kmalloc. We can reorder the representation like so:
[__hlen__|__data____________|__extra__|__tlen___]
^ ^
head skb_end_offset
The maximum space available for ip headers and payload without
fragmentation is min(mtu, data + extra). Therefore,
reserved_tailroom
= data + extra + tlen - min(mtu, data + extra)
= skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
= skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)
Compare the second line to the current expression:
reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset)
and we can see that hlen and tlen are not taken into account.
The min() in the third line can be expanded into:
if mtu < skb_tailroom - tlen:
reserved_tailroom = skb_tailroom - mtu
else:
reserved_tailroom = tlen
Depending on hlen, tlen, mtu and the number of multicast address records,
the current code may output skbs that have less tailroom than
dev->needed_tailroom or it may output more skbs than needed because not all
space available is used.
Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs")
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
---
Notes:
Changes v1->v2
As suggested by Hannes, move the code to an inline helper and express it
using "if" rather than "min".
include/linux/skbuff.h | 24 ++++++++++++++++++++++++
net/ipv4/igmp.c | 3 +--
net/ipv6/mcast.c | 3 +--
3 files changed, 26 insertions(+), 4 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4ce9ff7..d3fcd45 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1985,6 +1985,30 @@ static inline void skb_reserve(struct sk_buff *skb, int len)
skb->tail += len;
}
+/**
+ * skb_tailroom_reserve - adjust reserved_tailroom
+ * @skb: buffer to alter
+ * @mtu: maximum amount of headlen permitted
+ * @needed_tailroom: minimum amount of reserved_tailroom
+ *
+ * Set reserved_tailroom so that headlen can be as large as possible but
+ * not larger than mtu and tailroom cannot be smaller than
+ * needed_tailroom.
+ * The required headroom should already have been reserved before using
+ * this function.
+ */
+static inline void skb_tailroom_reserve(struct sk_buff *skb, unsigned int mtu,
+ unsigned int needed_tailroom)
+{
+ SKB_LINEAR_ASSERT(skb);
+ if (mtu < skb_tailroom(skb) - needed_tailroom)
+ /* use at most mtu */
+ skb->reserved_tailroom = skb_tailroom(skb) - mtu;
+ else
+ /* use up to all available space */
+ skb->reserved_tailroom = needed_tailroom;
+}
+
#define ENCAP_TYPE_ETHER 0
#define ENCAP_TYPE_IPPROTO 1
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 05e4cba..b3086cf 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -356,9 +356,8 @@ static struct sk_buff *igmpv3_newpack(struct net_device *dev, unsigned int mtu)
skb_dst_set(skb, &rt->dst);
skb->dev = dev;
- skb->reserved_tailroom = skb_end_offset(skb) -
- min(mtu, skb_end_offset(skb));
skb_reserve(skb, hlen);
+ skb_tailroom_reserve(skb, mtu, tlen);
skb_reset_network_header(skb);
pip = ip_hdr(skb);
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 5ee56d0..d64ee7e 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -1574,9 +1574,8 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu)
return NULL;
skb->priority = TC_PRIO_CONTROL;
- skb->reserved_tailroom = skb_end_offset(skb) -
- min(mtu, skb_end_offset(skb));
skb_reserve(skb, hlen);
+ skb_tailroom_reserve(skb, mtu, tlen);
if (__ipv6_get_lladdr(idev, &addr_buf, IFA_F_TENTATIVE)) {
/* <draft-ietf-magma-mld-source-05.txt>:
--
2.7.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] mld, igmp: Fix reserved tailroom calculation
2016-02-29 23:03 ` [PATCH net v2] " Benjamin Poirier
@ 2016-03-01 10:09 ` Hannes Frederic Sowa
2016-03-01 10:18 ` Daniel Borkmann
2016-03-03 20:42 ` David Miller
2 siblings, 0 replies; 12+ messages in thread
From: Hannes Frederic Sowa @ 2016-03-01 10:09 UTC (permalink / raw
To: Benjamin Poirier, netdev; +Cc: Daniel Borkmann, Eric Dumazet, Hideaki YOSHIFUJI
On 01.03.2016 00:03, Benjamin Poirier wrote:
> The current reserved_tailroom calculation fails to take hlen and tlen into
> account.
>
> skb:
> [__hlen__|__data____________|__tlen___|__extra__]
> ^ ^
> head skb_end_offset
>
> In this representation, hlen + data + tlen is the size passed to alloc_skb.
> "extra" is the extra space made available in __alloc_skb because of
> rounding up by kmalloc. We can reorder the representation like so:
>
> [__hlen__|__data____________|__extra__|__tlen___]
> ^ ^
> head skb_end_offset
>
> The maximum space available for ip headers and payload without
> fragmentation is min(mtu, data + extra). Therefore,
> reserved_tailroom
> = data + extra + tlen - min(mtu, data + extra)
> = skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
> = skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)
>
> Compare the second line to the current expression:
> reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset)
> and we can see that hlen and tlen are not taken into account.
>
> The min() in the third line can be expanded into:
> if mtu < skb_tailroom - tlen:
> reserved_tailroom = skb_tailroom - mtu
> else:
> reserved_tailroom = tlen
>
> Depending on hlen, tlen, mtu and the number of multicast address records,
> the current code may output skbs that have less tailroom than
> dev->needed_tailroom or it may output more skbs than needed because not all
> space available is used.
>
> Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs")
> Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
I like it, thanks a lot!
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] mld, igmp: Fix reserved tailroom calculation
2016-02-29 23:03 ` [PATCH net v2] " Benjamin Poirier
2016-03-01 10:09 ` Hannes Frederic Sowa
@ 2016-03-01 10:18 ` Daniel Borkmann
2016-03-01 16:00 ` Hannes Frederic Sowa
2016-03-03 20:42 ` David Miller
2 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2016-03-01 10:18 UTC (permalink / raw
To: Benjamin Poirier, netdev
Cc: Eric Dumazet, Hannes Frederic Sowa, Hideaki YOSHIFUJI
On 03/01/2016 12:03 AM, Benjamin Poirier wrote:
[...]
> Notes:
> Changes v1->v2
> As suggested by Hannes, move the code to an inline helper and express it
> using "if" rather than "min".
The code is correct, thanks!
Therefore:
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
However, I actually think that v1 was much better/easier as a fix though. :/
Meaning 1) it's likely easier to backport, and 2) that we now need a comment
above each skb->reserved_tailroom assignment probably says that min() might
perhaps have been more self-documenting ...
skb_tailroom_reserve() looks quite generic, but it only makes sense to use in
combination with skb_availroom(), which would have been good to put a note to
the header comment. Also "the required headroom should already have been
reserved before using this function", places one more requirement for usage.
If we really want to go that path, maybe rather a skb_setroom() that is coupled
with skb_availroom() like:
static inline int __skb_tailroom(const struct sk_buff *skb)
{
return skb->end - skb->tail;
}
static inline void skb_setroom(struct sk_buff *skb,
unsigned int needed_headroom,
unsigned int size,
unsigned int needed_tailroom)
{
SKB_LINEAR_ASSERT(skb);
skb_reserve(skb, needed_headroom);
skb->reserved_tailroom = needed_tailroom;
if (size < __skb_tailroom(skb) - needed_tailroom)
skb->reserved_tailroom = __skb_tailroom(skb) - size;
}
Then, skb_tailroom() would internally use __skb_tailroom(), too. And we can also
spare us the two unneeded skb_is_nonlinear() checks in our helper which will
currently always evaluate to false anyway.
... just a thought.
Thanks again,
Daniel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] mld, igmp: Fix reserved tailroom calculation
2016-03-01 10:18 ` Daniel Borkmann
@ 2016-03-01 16:00 ` Hannes Frederic Sowa
0 siblings, 0 replies; 12+ messages in thread
From: Hannes Frederic Sowa @ 2016-03-01 16:00 UTC (permalink / raw
To: Daniel Borkmann, Benjamin Poirier, netdev; +Cc: Eric Dumazet, Hideaki YOSHIFUJI
On 01.03.2016 11:18, Daniel Borkmann wrote:
> On 03/01/2016 12:03 AM, Benjamin Poirier wrote:
> [...]
>> Notes:
>> Changes v1->v2
>> As suggested by Hannes, move the code to an inline helper and
>> express it
>> using "if" rather than "min".
>
> The code is correct, thanks!
>
> Therefore:
>
> Acked-by: Daniel Borkmann <daniel@iogearbox.net>
>
> However, I actually think that v1 was much better/easier as a fix
> though. :/
>
> Meaning 1) it's likely easier to backport, and 2) that we now need a
> comment
> above each skb->reserved_tailroom assignment probably says that min() might
> perhaps have been more self-documenting ...
>
> skb_tailroom_reserve() looks quite generic, but it only makes sense to
> use in
> combination with skb_availroom(), which would have been good to put a
> note to
> the header comment. Also "the required headroom should already have been
> reserved before using this function", places one more requirement for
> usage.
>
> If we really want to go that path, maybe rather a skb_setroom() that is
> coupled
> with skb_availroom() like:
>
> static inline int __skb_tailroom(const struct sk_buff *skb)
> {
> return skb->end - skb->tail;
> }
>
> static inline void skb_setroom(struct sk_buff *skb,
> unsigned int needed_headroom,
> unsigned int size,
> unsigned int needed_tailroom)
> {
> SKB_LINEAR_ASSERT(skb);
>
> skb_reserve(skb, needed_headroom);
> skb->reserved_tailroom = needed_tailroom;
>
> if (size < __skb_tailroom(skb) - needed_tailroom)
> skb->reserved_tailroom = __skb_tailroom(skb) - size;
> }
>
> Then, skb_tailroom() would internally use __skb_tailroom(), too. And we
> can also
> spare us the two unneeded skb_is_nonlinear() checks in our helper which
> will
> currently always evaluate to false anyway.
>
> ... just a thought.
I think it is fine. The code will be inlined anyway and probably skb
linear assertions will be optimized away.
I like the current code in its form:
skb_reserve(skb, header);
skb_tailroom_reserve(skb, mtu, tailroom);
Combined form has too many arguments, so one has to look up the header
file again.
Bye,
Hannes
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] mld, igmp: Fix reserved tailroom calculation
2016-02-29 23:03 ` [PATCH net v2] " Benjamin Poirier
2016-03-01 10:09 ` Hannes Frederic Sowa
2016-03-01 10:18 ` Daniel Borkmann
@ 2016-03-03 20:42 ` David Miller
2 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2016-03-03 20:42 UTC (permalink / raw
To: bpoirier; +Cc: netdev, daniel, eric.dumazet, hannes, yoshfuji
From: Benjamin Poirier <bpoirier@suse.com>
Date: Mon, 29 Feb 2016 15:03:33 -0800
> The current reserved_tailroom calculation fails to take hlen and tlen into
> account.
...
> Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs")
> Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Applied and queued up for -stable, thanks.
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2016-03-03 20:42 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-02-27 19:57 [PATCH] mld, igmp: Fix reserved tailroom calculation Benjamin Poirier
2016-02-29 14:57 ` Daniel Borkmann
2016-02-29 15:19 ` Benjamin Poirier
2016-02-29 15:38 ` Daniel Borkmann
2016-02-29 15:43 ` Hannes Frederic Sowa
2016-02-29 18:08 ` Benjamin Poirier
2016-02-29 18:28 ` Hannes Frederic Sowa
2016-02-29 23:03 ` [PATCH net v2] " Benjamin Poirier
2016-03-01 10:09 ` Hannes Frederic Sowa
2016-03-01 10:18 ` Daniel Borkmann
2016-03-01 16:00 ` Hannes Frederic Sowa
2016-03-03 20:42 ` David Miller
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.