From: Justin Tobler <jltobler@gmail.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v2 7/7] reftable/block: avoid decoding keys when searching restart points
Date: Tue, 2 Apr 2024 11:47:16 -0500
Message-ID: <eiyd2nmwxjaetkux4prwm6adcx7z77ry3wc62art6gnfklvgmw@hox32vwuu5sj>
In-Reply-To: <e751b3c536ace78f975b7d2553c22dbf6845a8d4.1711361340.git.ps@pks.im>
On 24/03/25 11:11AM, Patrick Steinhardt wrote:
> When searching over restart points in a block we decode the key of each
> of the records, which results in a memory allocation. This is quite
> pointless though given that records at restart points will never use
> prefix compression and thus store their keys verbatim in the block.
>
> Refactor the code so that we can avoid decoding the keys, which saves us
> some allocations.
Out of curiosity, do you have any benchmarks for this change? If so,
would that be something we would want to add to the commit message?
-Justin
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> reftable/block.c | 29 +++++++++++++++++++----------
> 1 file changed, 19 insertions(+), 10 deletions(-)
>
> diff --git a/reftable/block.c b/reftable/block.c
> index ca80a05e21..8bb4e43cec 100644
> --- a/reftable/block.c
> +++ b/reftable/block.c
> @@ -287,23 +287,32 @@ static int restart_needle_less(size_t idx, void *_args)
>  		.buf = args->reader->block.data + off,
>  		.len = args->reader->block_len - off,
>  	};
> -	struct strbuf kth_restart_key = STRBUF_INIT;
> -	uint8_t unused_extra;
> -	int result, n;
> +	uint64_t prefix_len, suffix_len;
> +	uint8_t extra;
> +	int n;
>
>  	/*
> -	 * TODO: The restart key is verbatim in the block, so we can in theory
> -	 * avoid decoding the key and thus save some allocations.
> +	 * Records at restart points are stored without prefix compression, so
> +	 * there is no need to fully decode the record key here. This removes
> +	 * the need for allocating memory.
>  	 */
> -	n = reftable_decode_key(&kth_restart_key, &unused_extra, in);
> -	if (n < 0) {
> +	n = reftable_decode_keylen(in, &prefix_len, &suffix_len, &extra);
> +	if (n < 0 || prefix_len) {
>  		args->error = 1;
>  		return -1;
>  	}
>
> -	result = strbuf_cmp(&args->needle, &kth_restart_key);
> -	strbuf_release(&kth_restart_key);
> -	return result < 0;
> +	string_view_consume(&in, n);
> +	if (suffix_len > in.len) {
> +		args->error = 1;
> +		return -1;
> +	}
> +
> +	n = memcmp(args->needle.buf, in.buf,
> +		   args->needle.len < suffix_len ? args->needle.len : suffix_len);
> +	if (n)
> +		return n < 0;
> +	return args->needle.len < suffix_len;
>  }
>
> void block_iter_copy_from(struct block_iter *dest, struct block_iter *src)
> --
> 2.44.GIT
>
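For anyone following along, the comparison that replaces strbuf_cmp() in
the hunk above can be sketched in isolation. This is only a minimal
standalone sketch under stated assumptions, not the reftable code:
`struct key_view` and `needle_less()` are hypothetical names made up for
illustration, and the key is assumed to already point at the verbatim
bytes in the block (that is, the length header has been consumed and the
prefix length was zero). The idea is to compare the common prefix of both
byte ranges with memcmp() and to fall back to the lengths only when those
bytes are equal, so nothing has to be copied or allocated.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, non-owning view of a key. For a restart point the bytes
 * are assumed to live directly in the block buffer.
 */
struct key_view {
	const unsigned char *buf;
	size_t len;
};

/*
 * Return 1 if `needle` sorts strictly before `key`, 0 otherwise.
 *
 * Only the bytes both keys have in common are compared. If they are all
 * equal, the shorter key sorts first, which matches a plain bytewise
 * lexicographic ordering.
 */
static int needle_less(const struct key_view *needle,
		       const struct key_view *key)
{
	size_t min_len = needle->len < key->len ? needle->len : key->len;
	int cmp = memcmp(needle->buf, key->buf, min_len);

	if (cmp)
		return cmp < 0;
	return needle->len < key->len;
}
```

With this ordering a needle that is a strict prefix of the restart key
(say "refs/heads" against "refs/heads/a") sorts before it, and two equal
keys do not count as "less".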