From: Teng Long <dyroneteng@gmail.com>
To: peff@peff.net
Cc: avarab@gmail.com, derrickstolee@github.com, dyroneteng@gmail.com,
git@vger.kernel.org, gitster@pobox.com, me@ttaylorr.com,
tenglong.tl@alibaba-inc.com
Subject: Re: [PATCH 0/1] pack-bitmap.c: avoid exposing absolute paths
Date: Wed, 2 Nov 2022 21:52:45 +0800
Message-ID: <20221102135245.97998-1-tenglong.tl@alibaba-inc.com>
In-Reply-To: <Y2IiSU1L+bJPUioV@coredump.intra.peff.net>
Jeff King <peff@peff.net> writes:
> I mean that later in the process, if we need to find an object we may
> open the .idx file to look for it. So by opening them all up front, we
> _might_ just be doing work that would get done later.
>
> But it's not guaranteed. Imagine you have 10,000 small packs, and one
> big bitmapped pack. If you can serve the request from just the big pack,
> then you'd never need to open those other .idx files at all. However,
> the current code will open them anyway.
>
> I care less about mmap space, and more that it's work (syscalls, and
> examining the contents of the idx) to open each one. It's probably not
> even measurable unless you have a ton of packs, though.
>
> > > So it may not be worth worrying about. It does seem like it would be
> > > easy to reorder open_pack_bitmap_1() to look for a bitmap file first and
> > > only open the idx if it finds something.
> >
> > I think it may be worthy if we have lots of packs and the bitmap is refer to
> > an older one, but I didn't make the test. At least, the scenario is common, I
> > agree with that, so maybe we could shuffle the sort order in "open_pack_bitmap()".
>
> I don't mean the order in which we look at packs. I mean the order of
> operations in open_pack_bitmap_1(), something like:
Thank you for the explanation. That makes sense.
I ran a test in a repo with 3 packs and no bitmaps. It seems the current code
opens every idx and only fails at the very end:
➜ pack git:(master) git rev-list --test-bitmap HEAD
pack: /Users/tenglong.tl/Downloads/trace-test/.git/objects/pack/pack-c9fe9d2dc5d002d4a4b622626ffa282bcbccb7ee.pack
pack: /Users/tenglong.tl/Downloads/trace-test/.git/objects/pack/pack-08841c0c4c1fd176c354bdbd25c5a1b152ea95d0.pack
pack: /Users/tenglong.tl/Downloads/trace-test/.git/objects/pack/pack-3cea516b416961285fd8f519e12102b19bcf257e.pack
fatal: failed to load bitmap indexes
So we currently loop over the packs first, then try to find the corresponding
bitmap for each one. In that case, why can't we start the search from the
bitmap files instead? If that is possible, once we find the first bitmap file,
or an appropriate one chosen by some rule (biggest or newest, maybe? I haven't
looked deeply into it yet), we can break out of the loop and open it.
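To make the idea concrete, here is a minimal sketch in plain C. It is not a
patch against pack-bitmap.c: the helper names ends_with() and
find_first_bitmap() are hypothetical, and the directory listing is passed in as
an array of names instead of being read from objects/pack. It stops at the
first "*.bitmap" entry and derives the matching ".pack" name from it:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if "name" ends with "suffix". */
static int ends_with(const char *name, const char *suffix)
{
	size_t n = strlen(name), s = strlen(suffix);
	return n >= s && !strcmp(name + n - s, suffix);
}

/*
 * Hypothetical helper: walk the entry names in order and stop at the
 * first one ending in ".bitmap". The name of the pack it belongs to
 * (same stem, ".pack" suffix) is written to "out".
 *
 * Returns the index of the bitmap entry, or -1 if there is no bitmap
 * or "out" is too small.
 */
static int find_first_bitmap(const char **names, size_t nr,
			     char *out, size_t outsz)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		size_t stem;

		if (!ends_with(names[i], ".bitmap"))
			continue;

		stem = strlen(names[i]) - strlen(".bitmap");
		if (stem + strlen(".pack") + 1 > outsz)
			return -1;

		memcpy(out, names[i], stem);
		strcpy(out + stem, ".pack");
		return (int)i; /* break out early: one bitmap is enough */
	}
	return -1;
}
```

"First found" is only the simplest rule; picking the biggest or newest bitmap
would need an extra stat() per candidate, which this sketch leaves out.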
> diff --git a/pack-bitmap.c b/pack-bitmap.c
> index 440407f1be..1df2f6c8b6 100644
> --- a/pack-bitmap.c
> +++ b/pack-bitmap.c
> @@ -411,9 +411,6 @@ static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git
> struct stat st;
> char *bitmap_name;
>
> - if (open_pack_index(packfile))
> - return -1;
> -
> bitmap_name = pack_bitmap_filename(packfile);
> fd = git_open(bitmap_name);
>
> @@ -438,6 +435,10 @@ static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git
> return -1;
> }
>
> + /* now we know we have a plausible bitmap; make sure the idx is OK, too */
> + if (open_pack_index(packfile))
> + return -1;
> +
> if (!is_pack_valid(packfile)) {
> close(fd);
> return -1;
>
> But we can further observe that the first thing is_pack_valid() will do
> is open the idx file. :) So we can really just drop this line entirely,
> I'd think.
I agree, and I think this could be added to patch v3, maybe.
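As a rough illustration of why the order matters, here is a toy model in plain
C. It is only a sketch under my own assumptions: "struct toy_pack" and the
functions below are hypothetical stand-ins, not the real "struct packed_git" or
open_pack_bitmap_1(). It counts how many idx files get opened when every pack
is visited, once with the idx opened first and once with the bitmap checked
first:

```c
#include <stddef.h>

/* Hypothetical stand-in for a pack; not git's struct packed_git. */
struct toy_pack {
	int has_bitmap; /* models "git_open(bitmap_name) succeeded" */
	int idx_opened; /* set to 1 once the idx would have been opened */
};

/* Old order: the idx is opened before we know whether a bitmap exists. */
static int open_idx_first(struct toy_pack *p)
{
	p->idx_opened = 1;
	return p->has_bitmap ? 0 : -1;
}

/* New order: look for the bitmap first, touch the idx only if found. */
static int open_bitmap_first(struct toy_pack *p)
{
	if (!p->has_bitmap)
		return -1; /* no bitmap, so the idx is never opened */
	p->idx_opened = 1;
	return 0;
}

/* Visit every pack with the given strategy and count idx opens. */
static size_t count_idx_opens(struct toy_pack *packs, size_t nr,
			      int (*open_fn)(struct toy_pack *))
{
	size_t i, opened = 0;

	for (i = 0; i < nr; i++) {
		packs[i].idx_opened = 0;
		open_fn(&packs[i]);
		opened += packs[i].idx_opened;
	}
	return opened;
}
```

For the repo in my test above (3 packs, no bitmaps), this model opens 3 idx
files with the old order and none with the new one.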
> BTW, another oddity I noticed in this function. We check:
>
> if (bitmap_git->pack || bitmap_git->midx) {
> /* ignore extra bitmap file; we can only handle one */
> ...
> }
>
> but it's impossible for bitmap_git->midx to be set here. If we opened
> the midx bitmap, we'll skip calling open_pack_bitmap() entirely.
Oh, I remember that. It was mentioned in another patch set on Tue, 29 Mar 2022:
https://public-inbox.org/git/20220329024949.62091-1-dyroneteng@gmail.com/
I agree with what Taylor said in:
https://public-inbox.org/git/YkPGq0mDL4NG6D1o@nand.local/
But I'm OK with changing it if you think it should be fixed.
Thank you very much for your help.