All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] bonding: 3ad: fix a crash in agg_device_up()
@ 2021-06-04  2:14 zhudi
  2021-06-04  4:47 ` Jay Vosburgh
  0 siblings, 1 reply; 3+ messages in thread
From: zhudi @ 2021-06-04  2:14 UTC (permalink / raw
  To: j.vosburgh, vfalico, kuba, davem; +Cc: netdev, zhudi21, rose.chen

From: Di Zhu <zhudi21@huawei.com>

When doing the test of restarting the network card, the system is
broken because the port->slave is null pointer in agg_device_up().
After in-depth investigation, we found the real cause: in
bond_3ad_unbind_slave()  if there are no active ports in the
aggregator to be deleted, the ad_clear_agg() will be called to
set "aggregator->lag_ports = NULL", but the ports originally
belonging to the aggregator are still linked together.

Before bond_3ad_unbind_slave():
	aggregator4->lag_ports = port1->port2->port3
After bond_3ad_unbind_slave():
	aggregator4->lag_ports = NULL
	port1->port2->port3

After the port2 is deleted, the port is still  remain in the linked
list: because the port does not belong to any agg, so unbind do
nothing for this port.

After a while, bond_3ad_state_machine_handler() will run and
traverse each existing port, trying to bind each port to the
newly selected agg, such as:
	if (!found) {
		if (free_aggregator) {
			...
			port->aggregator->lag_ports = port;
			...
		}
	}
After this operation, the link list looks like this:
	 aggregator1->lag_ports = port1->port2(has been deleted)->port3

After that, just traverse the linked list of agg1 and access the
port2->slave, the crash will happen.

The easiest way to fix it is: if a port does not belong to any agg, delete
it from the list and wait for the state machine to select the agg again.

Fixes: 0622cab0341c ("bonding: fix 802.3ad aggregator reselection")
Signed-off-by: Di Zhu <zhudi21@huawei.com>
---
/* v2 */
-David Miller <davem@davemloft.net>
 -add Fixes: tag
---
 drivers/net/bonding/bond_3ad.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index 6908822d9773..1d6ff4e1ed28 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -1793,6 +1793,8 @@ static void ad_agg_selection_logic(struct aggregator *agg,
 static void ad_clear_agg(struct aggregator *aggregator)
 {
 	if (aggregator) {
+		struct port *port, *next;
+
 		aggregator->is_individual = false;
 		aggregator->actor_admin_aggregator_key = 0;
 		aggregator->actor_oper_aggregator_key = 0;
@@ -1801,6 +1803,11 @@ static void ad_clear_agg(struct aggregator *aggregator)
 		aggregator->partner_oper_aggregator_key = 0;
 		aggregator->receive_state = 0;
 		aggregator->transmit_state = 0;
+		for (port = aggregator->lag_ports; port; port = next) {
+			next = port->next_port_in_aggregator;
+			if (port->aggregator == aggregator)
+				port->next_port_in_aggregator = NULL;
+		}
 		aggregator->lag_ports = NULL;
 		aggregator->is_active = 0;
 		aggregator->num_of_ports = 0;
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] bonding: 3ad: fix a crash in agg_device_up()
  2021-06-04  2:14 [PATCH v2] bonding: 3ad: fix a crash in agg_device_up() zhudi
@ 2021-06-04  4:47 ` Jay Vosburgh
  0 siblings, 0 replies; 3+ messages in thread
From: Jay Vosburgh @ 2021-06-04  4:47 UTC (permalink / raw
  To: zhudi; +Cc: vfalico, kuba, davem, netdev, rose.chen

zhudi <zhudi21@huawei.com> wrote:

>From: Di Zhu <zhudi21@huawei.com>
>
>When doing the test of restarting the network card, the system is
>broken because the port->slave is null pointer in agg_device_up().
>After in-depth investigation, we found the real cause: in
>bond_3ad_unbind_slave()  if there are no active ports in the
>aggregator to be deleted, the ad_clear_agg() will be called to
>set "aggregator->lag_ports = NULL", but the ports originally
>belonging to the aggregator are still linked together.

	Presumably by "no active ports" you refer to the following block
near the end of bond_3ad_unbind_slave() (where a port being deleted is
removed from an aggregator):

				if (prev_port)
					prev_port->next_port_in_aggregator = temp_port->next_port_in_aggregator;
				else
					temp_aggregator->lag_ports = temp_port->next_port_in_aggregator;
				temp_aggregator->num_of_ports--;
				if (__agg_active_ports(temp_aggregator) == 0) {
					select_new_active_agg = temp_aggregator->is_active;
					ad_clear_agg(temp_aggregator);
					if (select_new_active_agg) {
						slave_info(bond->dev, slave->dev, "Removing an active aggregator\n");
						/* select new active aggregator */
						ad_agg_selection_logic(__get_first_agg(port),
							               &dummy_slave_update);


	by what mechanism are you causing the ports within a particular
aggregator to all be inactive (port->is_enabled == false)?  Am I correct
in speculating that you're removing carrier from "port1" and "port3"
while simultaneously removing "port2" from the bond?  I believe this
would all need to occur within a single gap between passes of the state
machine logic.

	The logic in bond_3ad_handle_link_change() that sets
port->is_enabled = false happens on link state change, and is
immediately followed (under the same acquisition of bond->mode_lock) by
a call to ad_agg_selection_logic().  So, in principle, when the port is
set to inactive (is_enabled == false), it should be reselected to a
different aggregator (likely as an individual port) on the next run of
the state machine.

>Before bond_3ad_unbind_slave():
>	aggregator4->lag_ports = port1->port2->port3
>After bond_3ad_unbind_slave():
>	aggregator4->lag_ports = NULL
>	port1->port2->port3
>
>After the port2 is deleted, the port is still  remain in the linked
>list: because the port does not belong to any agg, so unbind do
>nothing for this port.
>
>After a while, bond_3ad_state_machine_handler() will run and
>traverse each existing port, trying to bind each port to the
>newly selected agg, such as:
>	if (!found) {
>		if (free_aggregator) {
>			...
>			port->aggregator->lag_ports = port;
>			...
>		}
>	}
>After this operation, the link list looks like this:
>	 aggregator1->lag_ports = port1->port2(has been deleted)->port3
>
>After that, just traverse the linked list of agg1 and access the
>port2->slave, the crash will happen.
>
>The easiest way to fix it is: if a port does not belong to any agg, delete
>it from the list and wait for the state machine to select the agg again.

	As I understand the above, the bug causes a deleted (freed)
object to be left on the linked list (port1 -> port2 -> port3 in the
above).  I'm not quite sure how the code block above allows this to
happen, as the port being deleted is unlinked (at the top of the code
block).

	In any event, assuming that I'm missing something and it does
leave the freed port in the list, this fix may be the easiest, but I
don't believe it's correct, as it leaves opportunity for something else
to traverse the linked list and dereference the freed element.

	The code sample I included above occurs immediately after
removing a port from the aggregator's list of ports, and, in light of
your analysis (and assuming I've got the correct code block above), I
believe that the "__agg_active_ports(temp_aggregator) == 0" test should
really test to determine if there are no ports at all in the aggregator
(i.e., temp_aggregator->num_of_ports == 0), not only test for active
ports.  This way, if any ports remain in the aggregator, then
ad_clear_agg() would not be called, and the linked list would not end up
detached.

	Comments?

	-J

>Fixes: 0622cab0341c ("bonding: fix 802.3ad aggregator reselection")
>Signed-off-by: Di Zhu <zhudi21@huawei.com>
>---
>/* v2 */
>-David Miller <davem@davemloft.net>
> -add Fixes: tag
>---
> drivers/net/bonding/bond_3ad.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
>diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>index 6908822d9773..1d6ff4e1ed28 100644
>--- a/drivers/net/bonding/bond_3ad.c
>+++ b/drivers/net/bonding/bond_3ad.c
>@@ -1793,6 +1793,8 @@ static void ad_agg_selection_logic(struct aggregator *agg,
> static void ad_clear_agg(struct aggregator *aggregator)
> {
> 	if (aggregator) {
>+		struct port *port, *next;
>+
> 		aggregator->is_individual = false;
> 		aggregator->actor_admin_aggregator_key = 0;
> 		aggregator->actor_oper_aggregator_key = 0;
>@@ -1801,6 +1803,11 @@ static void ad_clear_agg(struct aggregator *aggregator)
> 		aggregator->partner_oper_aggregator_key = 0;
> 		aggregator->receive_state = 0;
> 		aggregator->transmit_state = 0;
>+		for (port = aggregator->lag_ports; port; port = next) {
>+			next = port->next_port_in_aggregator;
>+			if (port->aggregator == aggregator)
>+				port->next_port_in_aggregator = NULL;
>+		}
> 		aggregator->lag_ports = NULL;
> 		aggregator->is_active = 0;
> 		aggregator->num_of_ports = 0;
>-- 
>2.23.0
>

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] bonding: 3ad: fix a crash in agg_device_up()
@ 2021-06-05  8:56 zhudi (J)
  0 siblings, 0 replies; 3+ messages in thread
From: zhudi (J) @ 2021-06-05  8:56 UTC (permalink / raw
  To: Jay Vosburgh
  Cc: vfalico@gmail.com, kuba@kernel.org, davem@davemloft.net,
	netdev@vger.kernel.org, Chenxiang (EulerOS)

> zhudi <zhudi21@huawei.com> wrote:
> 
> >From: Di Zhu <zhudi21@huawei.com>
> >
> >When doing the test of restarting the network card, the system is
> >broken because the port->slave is null pointer in agg_device_up().
> >After in-depth investigation, we found the real cause: in
> >bond_3ad_unbind_slave()  if there are no active ports in the
> >aggregator to be deleted, the ad_clear_agg() will be called to
> >set "aggregator->lag_ports = NULL", but the ports originally
> >belonging to the aggregator are still linked together.
> 
> 	Presumably by "no active ports" you refer to the following block
> near the end of bond_3ad_unbind_slave() (where a port being deleted is
> removed from an aggregator):
> 
> 				if (prev_port)
> 					prev_port->next_port_in_aggregator
> = temp_port->next_port_in_aggregator;
> 				else
> 					temp_aggregator->lag_ports =
> temp_port->next_port_in_aggregator;
> 				temp_aggregator->num_of_ports--;
> 				if (__agg_active_ports(temp_aggregator) ==
> 0) {
> 					select_new_active_agg =
> temp_aggregator->is_active;
> 					ad_clear_agg(temp_aggregator);
> 					if (select_new_active_agg) {
> 						slave_info(bond->dev, slave-
> >dev, "Removing an active aggregator\n");
> 						/* select new active
> aggregator */
> 
> 	ad_agg_selection_logic(__get_first_agg(port),
> 
> &dummy_slave_update);
> 
	Yes

> 
> 	by what mechanism are you causing the ports within a particular
> aggregator to all be inactive (port->is_enabled == false)?  Am I correct
> in speculating that you're removing carrier from "port1" and "port3"
> while simultaneously removing "port2" from the bond?  I believe this
> would all need to occur within a single gap between passes of the state
> machine logic.
> 
> 	The logic in bond_3ad_handle_link_change() that sets
> port->is_enabled = false happens on link state change, and is
> immediately followed (under the same acquisition of bond->mode_lock) by
> a call to ad_agg_selection_logic().  So, in principle, when the port is
> set to inactive (is_enabled == false), it should be reselected to a
> different aggregator (likely as an individual port) on the next run of
> the state machine.
> 

 	We have four physical network(enp3s0, enp4s0, enp5s0, enp6s0) cards 
bound to the bonding network card and test script looks like this:
	while [1]; do
	"echo 1 > /sys/bus/pci/devices/${pci_slot[0]}/remove" &
	"echo 1 > /sys/bus/pci/devices/${pci_slot[1]}/remove" &
        	"echo 1 > /sys/bus/pci/devices/${pci_slot[2]}/remove" &
        	"echo 1 > /sys/bus/pci/devices/${pci_slot[3]}/remove" &
       	 "echo 1 > /sys/bus/pci/rescan"
	done

According to dmesg,  the sequence of events is as follows:
	1)bond_mii_monitor():print the status of the four physical network cards as link down
		hinic network card driver firstly call netif_carrier_off() changing the dev->state, 
		so mii thread can see state change.
	2)enp3s0 is unbind and release
		Then hinic network card driver call unregister_netdev()
		   ->call_netdevice_notifiers(NETDEV_UNREGISTER, dev);
		      -> bond_netdev_event()
		         -> __bond_release_one()->bond_3ad_unbind_slave()

		firstly, the link list look like this:
			Aggregator3->lag_ports = enp3s0->enp5s0-> enp4s0-> enp6s0
		Unbind process first selects a new agg4 and the link list look like this:
			Aggregator4->lag_ports = enp3s0->enp5s0-> enp4s0-> enp6s0
			Aggregator3->lag_ports = NULL
		Unbind process then run code block you described above, remove enp3s0 from port list and 
		call ad_clear_agg()(all ports are inactive) :
			Aggregator4->lag_ports = NULL
			enp5s0-> enp4s0-> enp6s0 (This patch is to break the linked list to ensure that no one can access it after deletion)
	3)enp4s0 is unbind and release
		The process is the same as enp3s0 and now the linked list looks like this:
			enp5s0-> enp4s0(has been deleted)-> enp6s0
	4)enp5s0 up
		enp5s0 network cards were reactivated(pci rescan operation) ,without completing the pci remove operation.
		State machine select the enp5s0 to join agg5(this is a new LAG), so the link list looks like this:
			Aggregator5->lag_ports = enp5s0-> enp4s0(has been deleted)-> enp6s0
	5)enp6s0 up
		The process is the same as enp6s0 and now the linked list looks like this:
			Aggregator5->lag_ports = enp5s0-> enp4s0(has been deleted)-> enp6s0
			Aggregator6->lag_ports = enp6s0 (joined new LAG)

		This is the final state that remain in vmcore and yes, enp6s0 exists in two linked lists at the same time.

	The above event sequence will cause the problem of patch description


> >Before bond_3ad_unbind_slave():
> >	aggregator4->lag_ports = port1->port2->port3
> >After bond_3ad_unbind_slave():
> >	aggregator4->lag_ports = NULL
> >	port1->port2->port3
> >
> >After the port2 is deleted, the port is still  remain in the linked
> >list: because the port does not belong to any agg, so unbind do
> >nothing for this port.
> >
> >After a while, bond_3ad_state_machine_handler() will run and
> >traverse each existing port, trying to bind each port to the
> >newly selected agg, such as:
> >	if (!found) {
> >		if (free_aggregator) {
> >			...
> >			port->aggregator->lag_ports = port;
> >			...
> >		}
> >	}
> >After this operation, the link list looks like this:
> >	 aggregator1->lag_ports = port1->port2(has been deleted)->port3
> >
> >After that, just traverse the linked list of agg1 and access the
> >port2->slave, the crash will happen.
> >
> >The easiest way to fix it is: if a port does not belong to any agg, delete
> >it from the list and wait for the state machine to select the agg again.
> 
> 	As I understand the above, the bug causes a deleted (freed)
> object to be left on the linked list (port1 -> port2 -> port3 in the
> above).  I'm not quite sure how the code block above allows this to
> happen, as the port being deleted is unlinked (at the top of the code
> block).
> 
> 	In any event, assuming that I'm missing something and it does
> leave the freed port in the list, this fix may be the easiest, but I
> don't believe it's correct, as it leaves opportunity for something else
> to traverse the linked list and dereference the freed element.

	First, all port list operations are protected by bond->mode_lock
and in ad_clear_agg(), we break the link list immediately, It's guaranteed that
 no one will be able to access it after release.

	
> 
> 	The code sample I included above occurs immediately after
> removing a port from the aggregator's list of ports, and, in light of
> your analysis (and assuming I've got the correct code block above), I
> believe that the "__agg_active_ports(temp_aggregator) == 0" test should
> really test to determine if there are no ports at all in the aggregator
> (i.e., temp_aggregator->num_of_ports == 0), not only test for active
> ports.  This way, if any ports remain in the aggregator, then
> ad_clear_agg() would not be called, and the linked list would not end up
> detached.

	There is a time gap between the change of link state and the
operation of state machine, during which unbinding events occurs, it will
see "__agg_active_ports(temp_aggregator) == 0" but temp_aggregator->num_of_ports != 0.

	In addition, the link state change does not necessarily cause the state machine to reselect the aggregator,
because ad_agg_selection_logic() not always possible to choose a best aggregator(such as, 
all NIC devices on bond are all in the down state)

> 
> 	Comments?
> 
> 	-J
> 
> >Fixes: 0622cab0341c ("bonding: fix 802.3ad aggregator reselection")
> >Signed-off-by: Di Zhu <zhudi21@huawei.com>
> >---
> >/* v2 */
> >-David Miller <davem@davemloft.net>
> > -add Fixes: tag
> >---
> > drivers/net/bonding/bond_3ad.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> >diff --git a/drivers/net/bonding/bond_3ad.c
> b/drivers/net/bonding/bond_3ad.c
> >index 6908822d9773..1d6ff4e1ed28 100644
> >--- a/drivers/net/bonding/bond_3ad.c
> >+++ b/drivers/net/bonding/bond_3ad.c
> >@@ -1793,6 +1793,8 @@ static void ad_agg_selection_logic(struct
> aggregator *agg,
> > static void ad_clear_agg(struct aggregator *aggregator)
> > {
> > 	if (aggregator) {
> >+		struct port *port, *next;
> >+
> > 		aggregator->is_individual = false;
> > 		aggregator->actor_admin_aggregator_key = 0;
> > 		aggregator->actor_oper_aggregator_key = 0;
> >@@ -1801,6 +1803,11 @@ static void ad_clear_agg(struct aggregator
> *aggregator)
> > 		aggregator->partner_oper_aggregator_key = 0;
> > 		aggregator->receive_state = 0;
> > 		aggregator->transmit_state = 0;
> >+		for (port = aggregator->lag_ports; port; port = next) {
> >+			next = port->next_port_in_aggregator;
> >+			if (port->aggregator == aggregator)
> >+				port->next_port_in_aggregator = NULL;
> >+		}
> > 		aggregator->lag_ports = NULL;
> > 		aggregator->is_active = 0;
> > 		aggregator->num_of_ports = 0;
> >--
> >2.23.0
> >
> 
> ---
> 	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-06-05  8:56 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-06-04  2:14 [PATCH v2] bonding: 3ad: fix a crash in agg_device_up() zhudi
2021-06-04  4:47 ` Jay Vosburgh
  -- strict thread matches above, loose matches on Subject: below --
2021-06-05  8:56 zhudi (J)

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.