From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>,
Derrick Stolee <derrickstolee@github.com>,
Junio C Hamano <gitster@pobox.com>
Subject: [PATCH 0/3] pack-bitmap: boundary-based bitmap traversal
Date: Mon, 24 Apr 2023 20:00:17 -0400 [thread overview]
Message-ID: <cover.1682380788.git.me@ttaylorr.com> (raw)
Here is a short (but dense) series that I worked on towards the end of
2021 with Peff.
Its goal is to improve the bitmap traversal we implement in
prepare_bitmap_walk(). Specifically, it avoids building up a complete
bitmap of the "haves" side, and instead uses a combination of (a)
UNINTERESTING commits between the tips and boundary, and (b) the
boundary itself.
The gory details are laid out in 3/3, but the high-level overview of the
new algorithm to compute the set of objects between "haves" and "wants"
using bitmaps is:
1. Build a (partial) bitmap of the haves side by first OR-ing any
bitmap(s) that already exist for UNINTERESTING commits between the
haves and the boundary.
2. For each commit along the boundary, add it as a fill-in traversal
tip (where the traversal terminates once an existing bitmap is
found), and perform fill-in traversal.
3. Build up a complete bitmap of the wants side as usual, stopping any
time we intersect the (partial) haves side.
This improves many cases where using bitmaps was significantly *slower*
than regular, non-bitmap traversal. In some instances, using bitmaps is
still slower than the non-bitmap case, but it is a substantial
improvement over the classic bitmap walk.
$ ours="$(git branch --show-current)"
argv="--count --objects $ours --not --exclude=$ours --branches"
hyperfine \
-n 'no bitmaps' "git.compile rev-list $argv" \
-n 'existing bitmap traversal' "git rev-list --use-bitmap-index $argv" \
-n 'this commit' "git.compile rev-list --use-bitmap-index $argv"
Benchmark 1: no bitmaps
Time (mean ± σ): 15.1 ms ± 4.1 ms [User: 8.1 ms, System: 6.9 ms]
Range (min … max): 7.4 ms … 21.8 ms 131 runs
Benchmark 2: existing bitmap traversal
Time (mean ± σ): 82.6 ms ± 9.2 ms [User: 63.6 ms, System: 19.0 ms]
Range (min … max): 73.8 ms … 105.4 ms 28 runs
Benchmark 3: this commit
Time (mean ± σ): 19.8 ms ± 3.1 ms [User: 13.0 ms, System: 6.8 ms]
Range (min … max): 17.7 ms … 38.6 ms 158 runs
Summary
'no bitmaps' ran
1.31 ± 0.41 times faster than 'this commit'
5.49 ± 1.62 times faster than 'existing bitmap traversal'
In a large repository on GitHub.com, the timings to compute the objects
unique to "master", as in:
$ git rev-list --count --objects master --not --exclude=master --branches
improve as follows:
- 1.012 sec (without bitmaps)
- 29.571 sec (with the existing bitmap walk)
- 2.279 sec (with these patches)
The first couple of patches are preparatory, and the third patch is the
substantive one.
Please have a look and poke any holes you can find in the new approach
:-). Thanks in advance for your review.
Taylor Blau (3):
revision: support tracking uninteresting commits
pack-bitmap.c: extract `fill_in_bitmap()`
pack-bitmap.c: use commit boundary during bitmap traversal
pack-bitmap.c | 252 ++++++++++++++++++++++++++++++++++++++------------
revision.c | 10 +-
revision.h | 5 +
3 files changed, 205 insertions(+), 62 deletions(-)
--
2.40.0.380.gda896aa358.dirty
next reply other threads:[~2023-04-25 0:00 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-25 0:00 Taylor Blau [this message]
2023-04-25 0:00 ` [PATCH 1/3] revision: support tracking uninteresting commits Taylor Blau
2023-04-25 18:15 ` Derrick Stolee
2023-05-03 21:48 ` Taylor Blau
2023-05-04 13:46 ` Derrick Stolee
2023-05-03 22:08 ` Taylor Blau
2023-05-04 13:59 ` Derrick Stolee
2023-05-05 17:30 ` Taylor Blau
2023-04-25 18:22 ` Junio C Hamano
2023-04-25 18:48 ` Taylor Blau
2023-04-25 0:00 ` [PATCH 2/3] pack-bitmap.c: extract `fill_in_bitmap()` Taylor Blau
2023-04-25 18:32 ` Junio C Hamano
2023-04-25 18:51 ` [PATCH 2/3] pack-bitmap.c: extract `fill_in_bitmap()`t Taylor Blau
2023-04-25 0:00 ` [PATCH 3/3] pack-bitmap.c: use commit boundary during bitmap traversal Taylor Blau
2023-04-25 18:52 ` Junio C Hamano
2023-05-02 21:31 ` Felipe Contreras
2023-05-03 21:42 ` Taylor Blau
2023-05-03 21:58 ` Junio C Hamano
2023-04-25 18:53 ` Derrick Stolee
2023-04-25 18:02 ` [PATCH 0/3] pack-bitmap: boundary-based " Junio C Hamano
2023-04-25 18:30 ` Derrick Stolee
2023-04-25 18:57 ` Taylor Blau
2023-04-25 19:52 ` Derrick Stolee
2023-05-03 21:43 ` Taylor Blau
2023-04-25 18:06 ` Derrick Stolee
2023-04-25 19:01 ` Taylor Blau
2023-04-25 20:27 ` Derrick Stolee
2023-05-01 22:08 ` Junio C Hamano
2023-05-02 23:52 ` Taylor Blau
2023-05-05 17:27 ` [PATCH v2 0/2] " Taylor Blau
2023-05-05 17:27 ` [PATCH v2 1/2] pack-bitmap.c: extract `fill_in_bitmap()` Taylor Blau
2023-05-05 17:27 ` [PATCH v2 2/2] pack-bitmap.c: use commit boundary during bitmap traversal Taylor Blau
2023-05-05 18:13 ` Derrick Stolee
2023-05-05 18:43 ` Taylor Blau
2023-05-05 17:33 ` [PATCH v2 0/2] pack-bitmap: boundary-based " Junio C Hamano
2023-05-05 17:59 ` Derrick Stolee
2023-05-05 18:46 ` [PATCH v2 0/2] pack-bitmap: boundary-based bitmap traversalt Taylor Blau
2023-05-05 20:45 ` Derrick Stolee
2023-05-08 17:38 ` [PATCH v3 0/3] pack-bitmap: boundary-based bitmap traversal Taylor Blau
2023-05-08 17:38 ` [PATCH v3 1/3] object: add object_array initializer helper function Taylor Blau
2023-05-08 17:38 ` [PATCH v3 2/3] pack-bitmap.c: extract `fill_in_bitmap()` Taylor Blau
2023-05-08 17:38 ` [PATCH v3 3/3] pack-bitmap.c: use commit boundary during bitmap traversal Taylor Blau
2023-05-08 20:53 ` Derrick Stolee
2023-05-08 22:12 ` Taylor Blau
2023-05-10 22:55 ` [PATCH v3 0/3] pack-bitmap: boundary-based " Junio C Hamano
2023-05-10 23:10 ` Taylor Blau
2023-05-11 15:23 ` Derrick Stolee
2023-06-08 16:25 ` [PATCH v4 " Taylor Blau
2023-06-08 16:25 ` [PATCH v4 1/3] object: add object_array initializer helper function Taylor Blau
2023-06-08 16:25 ` [PATCH v4 2/3] pack-bitmap.c: extract `fill_in_bitmap()` Taylor Blau
2023-06-08 16:25 ` [PATCH v4 3/3] pack-bitmap.c: use commit boundary during bitmap traversal Taylor Blau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1682380788.git.me@ttaylorr.com \
--to=me@ttaylorr.com \
--cc=derrickstolee@github.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).