From: Leo Yan <leo.yan@linaro.org>
To: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Mark Rutland <mark.rutland@arm.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Namhyung Kim <namhyung@kernel.org>,
Andi Kleen <ak@linux.intel.com>,
linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 2/2] perf auxtrace: Optimize barriers with load-acquire and store-release
Date: Tue, 1 Jun 2021 14:33:42 +0800
Message-ID: <20210601063342.GB10026@leoy-ThinkPad-X240s>
In-Reply-To: <cc3810cd-5edc-26d3-9c77-8bb6479152c1@intel.com>
On Mon, May 31, 2021 at 10:03:33PM +0300, Adrian Hunter wrote:
> On 31/05/21 6:10 pm, Leo Yan wrote:
> > Hi Peter, Adrian,
> >
> > On Wed, May 19, 2021 at 10:03:19PM +0800, Leo Yan wrote:
> >> Load-acquire and store-release are one-way permeable barriers, which can
> >> be used to guarantee the memory ordering between accessing the buffer
> >> data and the buffer's head / tail.
> >>
> >> This patch optimizes the memory ordering with the load-acquire and
> >> store-release barriers.
> >
> > Is this patch okay for you?
> >
> > Besides this patch, I have an extra question. You could see for
> > accessing the AUX buffer's head and tail, it also support to use
> > compiler build-in functions for atomicity accessing:
> >
> > __sync_val_compare_and_swap()
> > __sync_bool_compare_and_swap()
> >
> > Since now we have READ_ONCE()/WRITE_ONCE(), do you think we still need
> > to support __sync_xxx_compare_and_swap() atomicity?
>
> I don't remember, but it seems to me atomicity is needed only
> for a 32-bit perf running with a 64-bit kernel.
A 32-bit perf needs to read the 64-bit value atomically; I think the
atomic access is there to avoid the issue caused by this scenario:

        CPU0 (64-bit kernel)           CPU1 (32-bit user)

                                         read head_lo
        WRITE_ONCE(head)
                                         read head_hi

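
To make the interleaving concrete, here is a minimal single-threaded
sketch that replays it deterministically. It is not perf code: the
struct and the helper names (split_head, store_head, torn_read) are
hypothetical and only model a 64-bit head that a 32-bit reader fetches
as two halves.

```c
#include <stdint.h>

/* Hypothetical model: a 64-bit head seen as two 32-bit halves. */
struct split_head {
	uint32_t lo;
	uint32_t hi;
};

/* Models the 64-bit writer publishing a whole new head value. */
static void store_head(struct split_head *h, uint64_t v)
{
	h->lo = (uint32_t)v;
	h->hi = (uint32_t)(v >> 32);
}

/*
 * Replays the scenario in program order: the reader fetches the low
 * half, the writer then updates the head, and the reader fetches the
 * high half afterwards.
 */
static uint64_t torn_read(struct split_head *h, uint64_t new_head)
{
	uint32_t lo = h->lo;		/* CPU1: read head_lo (old value) */

	store_head(h, new_head);	/* CPU0: WRITE_ONCE(head) */

	uint32_t hi = h->hi;		/* CPU1: read head_hi (new value) */

	return ((uint64_t)hi << 32) | lo;
}
```

With an old head of 0xffffffff and a new head of 0x100000000, the
reader combines the old low half with the new high half and sees
0x1ffffffff, a value the writer never stored.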
I dumped the disassembly for reading a 64-bit value in an Arm32 build
of perf and got the results below:
perf Arm32 for READ_ONCE():
case 8: *(__u64_alias_t *) res = *(volatile __u64_alias_t *) p; break;
84a: 68fb ldr r3, [r7, #12]
84c: e9d3 2300 ldrd r2, r3, [r3]
850: 6939 ldr r1, [r7, #16]
852: e9c1 2300 strd r2, r3, [r1]
856: e007 b.n 868 <auxtrace_mmap__read_head+0xb0>
It uses the ldrd instruction, "Load Register Dual (register)", but
that does not mean the access is atomic. Based on the comment in the
kernel header include/asm-generic/rwonce.h, I think ldrd/strd will be
"atomic in some cases (namely Armv7 + LPAE), but for others we rely on
the access being split into 2x32-bit accesses".
perf Arm32 for __sync_val_compare_and_swap():
u64 head = __sync_val_compare_and_swap(&pc->aux_head, 0, 0);
7d6: 68fb ldr r3, [r7, #12]
7d8: f503 6484 add.w r4, r3, #1056 ; 0x420
7dc: f04f 0000 mov.w r0, #0
7e0: f04f 0100 mov.w r1, #0
7e4: f3bf 8f5b dmb ish
7e8: e8d4 237f ldrexd r2, r3, [r4]
7ec: ea52 0c03 orrs.w ip, r2, r3
7f0: d106 bne.n 800 <auxtrace_mmap__read_head+0x48>
7f2: e8c4 017c strexd ip, r0, r1, [r4]
7f6: f1bc 0f00 cmp.w ip, #0
7fa: f1bc 0f00 cmp.w ip, #0
7fe: d1f3 bne.n 7e8 <auxtrace_mmap__read_head+0x30>
800: f3bf 8f5b dmb ish
804: e9c7 2304 strd r2, r3, [r7, #16]
__sync_val_compare_and_swap() uses the ldrexd/strexd instructions.
These two instructions rely on the exclusive monitor when accessing
the 64-bit value, so it seems to me this is the more reliable way to
access a 64-bit value in the CPU's 32-bit mode.
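
As a rough sketch of the pattern (the helper names read_u64_cas and
write_u64_cas are hypothetical, and this is not the exact perf code),
a 64-bit value can be read and written through the compiler builtins
like this:

```c
#include <stdint.h>

/*
 * Hypothetical helper: read a 64-bit value with a compare-and-swap.
 * Comparing against 0 and swapping in 0 never changes the stored
 * value, and the builtin returns whatever was in *p.
 */
static inline uint64_t read_u64_cas(uint64_t *p)
{
	return __sync_val_compare_and_swap(p, 0, 0);
}

/*
 * Hypothetical helper: write a 64-bit value with a compare-and-swap.
 * The loop retries until the value fetched by read_u64_cas() is still
 * the current one when the swap is attempted.
 */
static inline void write_u64_cas(uint64_t *p, uint64_t v)
{
	uint64_t old;

	do {
		old = read_u64_cas(p);
	} while (!__sync_bool_compare_and_swap(p, old, v));
}
```

One consequence of reading this way is that the memory has to be
writable, because the compare-and-swap may store to it even when the
value does not change.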
Conclusion: it seems to me __sync_xxx_compare_and_swap() should be
kept for 32-bit builds rather than switching to READ_ONCE(). Or do you
have any other suggestions? Thanks!
Leo