linux-hams.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: duoming@zju.edu.cn
To: Thomas Osterried <thomas@osterried.de>
Cc: Dan Carpenter <dan.carpenter@oracle.com>,
	linux-hams@vger.kernel.org,
	"David S. Miller" <davem@davemloft.net>,
	Paolo Abeni <pabeni@redhat.com>
Subject: Re: Regression in "ax25: Fix UAF bugs in ax25 timers", patch 82e31755e55fbcea6a9dfaae5fe4860ade17cbc0
Date: Fri, 27 May 2022 12:48:19 +0800 (GMT+08:00)	[thread overview]
Message-ID: <52d4c593.3891f.18103d80b22.Coremail.duoming@zju.edu.cn> (raw)
In-Reply-To: <CE99195A-25DD-4715-BABF-A83B2E572635@osterried.de>


Hello,

On Fri, 27 May 2022 00:56:37 +0200 thomas wrote:

> I'm not so happy with adding new states to the ax25_dev struct.
> When I asked "If we like to delete the timers here because we are in ax25_release due
>   to the ax25_kill_by_device/ax25_dev_device_down event, do we have another option to
>   see what called us, and for being able to handle the case correctly?",
> I had in mind if we could see this situation in the current state inside
> the ax25_cb, like ax25->state or ax25->sk->..
> 
> But in case of ax25->state, ax25_connect() changes the state when executed in
> a parallel thread after ax25_kill_by_device() -> that's no option

The "ax25->state" or "ax25->sk->.." inside ax25_cb could not be used, because these flags could
be changed in many places such as ax25_destroy_socket(), ax25_create_cb() or ax25_disconnect().
We need a flag that only represents the ax25_kill_by_device() has been executed.

> Another one I thought about was an assurance in ax25_start_xxx_timer() functions,
> that timer should not start if ax25_cb->ax25_dev == NULL,
> but your examples showed, that
>   mod_timer(&ax25->t1timer,..)
> is also used directly to modify running timer.
> 
> 
> An other option may be assurances in ax25_xxx_expiry() to immediately return if
> ax25_cb->ax25_dev == NULL.
> 
> Indeed, I see in ax25_heartbeat_expiry(struct timer_list *t) this assurance:
>           if (ax25->ax25_dev)
>                 proto = ax25->ax25_dev->values[AX25_VALUES_PROTOCOL];
> Unfortunately, those checks are missing in the other timer_expiry functions and
> one of them is ax25_t1timer_expiry, for which you showed in your kernel panic
> backtrace.

The "if(ax25->ax25_dev)" check in ax25_heartbeat_expiry() is useless, because there
is not any lock that could synchronize with ax25_kill_by_device(), ax25_dev_device_down()
or ax25_release(). Even if we add "if(ax25->ax25_dev)" check in other timer_expiry functions,
the race condition will also happen.

> If I see it correctly, this would solve the issue you had with the running
> timers, without forced-stopping them in ax25_release().
> But on the other hand: if the interface is really down, there's no need
> for the timers to keep running, and keeping things running where parts
> may already be freed is dangerous. The session has to been cleaned up immediately.
> -> This clearly speaks for your latest approach.

The timers will start when we call ax25_bind()-->ax25_connect() to establish a session or
during the period of session cleanup. 

If we only use ax25_dev_device_up() to turn the device on instead of using ax25_bind()-->ax25_connect()
to establish a session, no timers will start. So if the interface is down, there are no timers running.

In some cases, the timers could not be stopped in ax25_kill_by_device(). If you call ax25_kill_by_device()
before ax25_bind(), which is shown below. You could not stop timers in ax25_kill_by_device(), because it 
has already been executed.

      (Thread 1)                       |      (Thread 2)
ax25_dev_device_up()                   |
...                                    | ax25_kill_by_device()
ax25_bind()                            |
ax25_connect()  //! timers start       | ...
 ax25_std_establish_data_link()        |
  ax25_start_t1timer()                 | ax25_dev_device_down()
   mod_timer(&ax25->t1timer,..)        |
                                       | ax25_release()
   (wait a time)                       |  ...
                                       |  ax25_dev_put(ax25_dev) //FREE
   ax25_t1timer_expiry()               |
    ax25->ax25_dev->values[..] //USE   |  ...
     ...                               |

We have to use del_timer_sync() in ax25_release() to stop timers, because the device could
only be removed once in the real world. What's more, If we use ax25_bind()-->ax25_connect()
to establish a session, we need to call ax25_release() to close the session in normal case.
So putting the del_timer_sync() in ax25_release() is reasonable.

> Perhaps it's a good idea anyway to add the assurance (like ax25_heartbeat_expiry()
> does) to the other ax25_xxx_expiry() functions. 

I disagree, there is not any lock that could synchronize with other functions in 
ax25_xxx_expiry() functions.

> An other approach I had in mind was a variable in struct ax25_cb, that the cb
> is in progress of being cleaned up.
> This way, we could prohibit starting or modifying timers.
> Unfortunately, there might be many checks in various functions necessary.
> -> This clearly speaks for your latest approach.
> 
> Perhaps it's a good idea to also test in
> ax25_setsockopt SO_BINDTODEVICE
> and in
> ax25_bind_to_device
> if ax25_ dev->device_up flag is 0, and if so, return in  -ENODEV;

It is unnecessary to add ax25_dev->device_up flag in ax25_setsockopt SO_BINDTODEVICE
or ax25_bind(). Because there is no bug and we need to add extra locks in order to 
synchronize with ax25_dev->device_up in ax25_kill_by_device(), which is tedious.
 
> Off-topic-question: about the quite-new dev_hold_track():
> while looking to 
> ax25_setsockopt() and ax25_bind() for SO_BINDTODEVICE,
> I se only that dev_hold_track() is called in ax25_bind(),
> but not in ax25_setsockopt() for SO_BINDTODEVICE.
> Is it correct that ax25_setsockopt() for SO_BINDTODEVICE
> does not need to trigger the tracker?

There is no need to add refcount of net_device in SO_BINDTODEVICE, because there is
rtnl_lock() which is well synchronized in it.

Best regards,
Duoming Zhou

  reply	other threads:[~2022-05-27  4:48 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-23  5:46 Regression in "ax25: Fix UAF bugs in ax25 timers", patch 82e31755e55fbcea6a9dfaae5fe4860ade17cbc0 Thomas Osterried
2022-05-23 13:52 ` Dan Carpenter
2022-05-24  9:25   ` Thomas Osterried
2022-05-24 10:47     ` duoming
2022-05-24 21:52       ` Thomas Osterried
2022-05-25  2:34         ` duoming
2022-05-26 22:56           ` Thomas Osterried
2022-05-27  4:48             ` duoming [this message]
2022-05-23 14:07 ` duoming
2022-05-23 16:05   ` Roland Schwarz
2022-05-23 18:53     ` Thomas Osterried
2022-05-23 19:14   ` Thomas Osterried

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52d4c593.3891f.18103d80b22.Coremail.duoming@zju.edu.cn \
    --to=duoming@zju.edu.cn \
    --cc=dan.carpenter@oracle.com \
    --cc=davem@davemloft.net \
    --cc=linux-hams@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=thomas@osterried.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).