Virtualization Archive mirror
 help / color / mirror / Atom feed
From: David Laight <David.Laight@ACULAB.COM>
To: "'linux-kernel@vger.kernel.org'" <linux-kernel@vger.kernel.org>,
	"'peterz@infradead.org'" <peterz@infradead.org>,
	"'longman@redhat.com'" <longman@redhat.com>
Cc: "'mingo@redhat.com'" <mingo@redhat.com>,
	"'will@kernel.org'" <will@kernel.org>,
	"'boqun.feng@gmail.com'" <boqun.feng@gmail.com>,
	"'Linus Torvalds'" <torvalds@linux-foundation.org>,
	"'virtualization@lists.linux-foundation.org'"
	<virtualization@lists.linux-foundation.org>,
	'Zeng Heng' <zengheng4@huawei.com>
Subject: [PATCH next v2 0/5] locking/osq_lock: Optimisations to osq_lock code.
Date: Sun, 31 Dec 2023 21:49:53 +0000	[thread overview]
Message-ID: <2b4e8a5816a742d2bd23fdbaa8498e80@AcuMS.aculab.com> (raw)

This is an updated series of optimisations to osq_lock.c
Patches #1 and #3 from v1 have been applied by Linus.
Some of the generated code issues I was getting were caused by
CONFIG_DEBUG_PREEMPT being set. No idea why, it isn't any more.

Patch #1 is the node->locked part of the old #2.

Patch #2 removes the pretty much guaranteed cache line reload getting
the cpu number (from node->prev) for the vcpu_is_preempted() check.
It is (basically) the old #5 with the addition of a READ_ONCE()
and leaving the '+ 1' offset (for patch 3).

Patch #3 ends up removing both node->cpu and node->prev.
This saves issues initialising node->cpu.
Basically node->cpu was only ever read as node->prev->cpu in the unqueue code.
Most of the time it is the value read from lock->tail that was used to
obtain 'prev' in the first place.
The only time it is different is in the unlock race path where 'prev'
is re-read from node->prev - updated right at the bottom of osq_lock().
So the updated node->prev_cpu can used (and prev obtained from it) without
worrying about only one of node->prev and node->prev-cpu being updated.

Linus did suggest just saving the cpu numbers instead of pointers.
It actually works for 'prev' but not 'next'.

Patch #4 removes the 'should be unnecessary' node->next = NULL
assignment from the top of osq_lock().
Since longman was worried about race conditions, I've added a
WARN_ON_ONCE() check that ensures it is NULL.
This saves dirtying the 'node' cache line in the fast path, but the
check still requires the cache line be loaded.

Patch #5 just stops gcc using two separate instructions to decrement
the offset cpu number and then convert it to 64 bits.
Linus got annoyed with it, and I'd spotted it as well.
I don't seem to be able to get gcc to convert __per_cpu_offset[cpu - 1]
to (__per_cpu_offset - 1)[cpu] (cpu is offset by one) but, in any case,
it would still need zero extending in the common case.

David Laight (5):
  1) Defer clearing node->locked until the slow osq_lock() path.
  2) Optimise vcpu_is_preempted() check.
  3) Use node->prev_cpu instead of saving node->prev.
  4) Avoid writing to node->next in the osq_lock() fast path.
  5) Optimise decode_cpu() and per_cpu_ptr().

 kernel/locking/osq_lock.c | 59 +++++++++++++++++++++------------------
 1 file changed, 32 insertions(+), 27 deletions(-)

-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


             reply	other threads:[~2023-12-31 21:50 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-31 21:49 David Laight [this message]
2023-12-31 21:51 ` [PATCH next v2 1/5] locking/osq_lock: Defer clearing node->locked until the slow osq_lock() path David Laight
2024-01-01  4:08   ` Waiman Long
2023-12-31 21:52 ` [PATCH next v2 2/5] locking/osq_lock: Optimise the vcpu_is_preempted() check David Laight
2024-01-01  4:09   ` Waiman Long
2024-01-08  7:42   ` kernel test robot
2023-12-31 21:54 ` [PATCH next v2 3/5] locking/osq_lock: Use node->prev_cpu instead of saving node->prev David Laight
2024-01-01  4:09   ` Waiman Long
2023-12-31 21:54 ` [PATCH next v2 4/5] locking/osq_lock: Avoid writing to node->next in the osq_lock() fast path David Laight
2024-01-01  4:13   ` Waiman Long
2024-01-02  9:47   ` Ingo Molnar
2023-12-31 21:55 ` [PATCH next v2 5/5] locking/osq_lock: Optimise decode_cpu() and per_cpu_ptr() David Laight
2024-01-01  4:14   ` Waiman Long
2024-01-01  8:47     ` David Laight
2024-05-03 15:59     ` Waiman Long
2024-05-03 16:16       ` David Laight
2024-05-03 21:10       ` David Laight
2024-05-03 22:13         ` Waiman Long
2024-05-04 20:26           ` David Laight
2024-01-02  9:54   ` Ingo Molnar
2024-01-02 10:20     ` David Laight

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2b4e8a5816a742d2bd23fdbaa8498e80@AcuMS.aculab.com \
    --to=david.laight@aculab.com \
    --cc=boqun.feng@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=torvalds@linux-foundation.org \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=will@kernel.org \
    --cc=zengheng4@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).