From: "Justin Tobler via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Patrick Steinhardt <ps@pks.im>, Justin Tobler <jltobler@gmail.com>
Subject: [PATCH v2 0/3] reftable/stack: use geometric table compaction
Date: Thu, 21 Mar 2024 22:40:16 +0000 [thread overview]
Message-ID: <pull.1683.v2.git.1711060819.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1683.git.1709669025722.gitgitgadget@gmail.com>
Hello again,
This is the second version my patch series that refactors the reftable
compaction strategy to instead follow a geometric sequence. Changes compared
to v1:
* Added GIT_TEST_REFTABLE_NO_AUTOCOMPACTION environment variable to disable
reftable compaction when testing.
* Refactored worktree tests in t0610-reftable-basics.sh to properly assert
git-pack-refs(1) works as expected.
* Added test to validate that alternating table sizes are compacted.
* Added benchmark to compare compaction strategies.
* Moved change that made compaction segment end inclusive to its own
commit.
* Added additional explanation in commits and comments and fixed typos.
Thanks for taking a look!
Justin
Justin Tobler (3):
reftable/stack: add env to disable autocompaction
reftable/stack: use geometric table compaction
reftable/segment: make segment end inclusive
reftable/stack.c | 113 ++++++++++++++++---------------------
reftable/stack.h | 3 -
reftable/stack_test.c | 66 +++++-----------------
reftable/system.h | 1 +
t/t0610-reftable-basics.sh | 43 +++++++++-----
5 files changed, 94 insertions(+), 132 deletions(-)
base-commit: 3bd955d26919e149552f34aacf8a4e6368c26cec
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1683%2Fjltobler%2Fjt%2Freftable-geometric-compaction-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1683/jltobler/jt/reftable-geometric-compaction-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1683
Range-diff vs v1:
-: ----------- > 1: cb6b152e5c8 reftable/stack: add env to disable autocompaction
1: 7a518853a10 ! 2: def70084523 reftable/stack: use geometric table compaction
@@ Commit message
occurring until a separate operation produces a table matching the
previous table log value.
- To avoid unbounded growth of the table list, walk through each table and
- evaluate if it needs to be included in the compaction segment to restore
- a geometric sequence.
+ Instead, to avoid unbounded growth of the table list, the compaction
+ strategy is updated to ensure tables follow a geometric sequence after
+ each operation. This is done by walking the table list in reverse index
+ order to identify the compaction segment start and end. The compaction
+ segment end is found by identifying the first table which has a
+ preceding table size less than twice the current table. Next, the
+ compaction segment start is found iterating through the remaining tables
+ in the list checking if the previous table size is less than twice the
+ cumulative of tables from the segment end. This ensures the correct
+ segment start is found and that the newly compacted table does not
+ violate the geometric sequence.
+
+ When creating 10 thousand references, the new strategy has no
+ performance impact:
+
+ Benchmark 1: update-ref: create refs sequentially (revision = HEAD~)
+ Time (mean ± σ): 26.516 s ± 0.047 s [User: 17.864 s, System: 8.491 s]
+ Range (min … max): 26.447 s … 26.569 s 10 runs
+
+ Benchmark 2: update-ref: create refs sequentially (revision = HEAD)
+ Time (mean ± σ): 26.417 s ± 0.028 s [User: 17.738 s, System: 8.500 s]
+ Range (min … max): 26.366 s … 26.444 s 10 runs
+
+ Summary
+ update-ref: create refs sequentially (revision = HEAD) ran
+ 1.00 ± 0.00 times faster than update-ref: create refs sequentially (revision = HEAD~)
Some tests in `t0610-reftable-basics.sh` assert the on-disk state of
tables and are therefore updated to specify the correct new table count.
@@ reftable/stack.c: static int segment_size(struct segment *s)
+ * until a valid segment end is found. If the preceding table is smaller
+ * than the current table multiplied by the geometric factor (2), the
+ * current table is set as the compaction segment end.
++ *
++ * Tables after the ending point are not added to the byte count because
++ * they are already valid members of the geometric sequence. Due to the
++ * properties of a geometric sequence, it is not possible for the sum of
++ * these tables to exceed the value of the ending point table.
+ */
+ for (i = n - 1; i > 0; i--) {
+ if (sizes[i - 1] < sizes[i] * 2) {
-+ seg.end = i;
++ seg.end = i + 1;
+ bytes = sizes[i];
break;
+ }
@@ reftable/stack.c: static int segment_size(struct segment *s)
+
+ /*
+ * Find the starting table of the compaction segment by iterating
-+ * through the remaing tables and keeping track of the accumulated size
-+ * of all tables seen from the segment end table.
++ * through the remaining tables and keeping track of the accumulated
++ * size of all tables seen from the segment end table.
+ *
+ * Note that we keep iterating even after we have found the first
-+ * first starting point. This is because there may be tables in the
-+ * stack preceding that first starting point which violate the geometric
++ * starting point. This is because there may be tables in the stack
++ * preceding that first starting point which violate the geometric
+ * sequence.
+ */
+ for (; i > 0; i--) {
@@ reftable/stack.c: static int segment_size(struct segment *s)
}
static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st)
-@@ reftable/stack.c: int reftable_stack_auto_compact(struct reftable_stack *st)
- suggest_compaction_segment(sizes, st->merged->stack_len);
- reftable_free(sizes);
- if (segment_size(&seg) > 0)
-- return stack_compact_range_stats(st, seg.start, seg.end - 1,
-+ return stack_compact_range_stats(st, seg.start, seg.end,
- NULL);
-
- return 0;
## reftable/stack.h ##
@@ reftable/stack.h: int read_lines(const char *filename, char ***lines);
@@ reftable/stack_test.c: static void test_reftable_stack_hash_id(void)
- EXPECT(min.start == 2);
- EXPECT(min.end == 7);
+ EXPECT(min.start == 1);
-+ EXPECT(min.end == 9);
++ EXPECT(min.end == 10);
}
static void test_suggest_compaction_segment_nothing(void)
@@ t/t0610-reftable-basics.sh: test_expect_success 'ref transaction: writes cause a
test_commit -C repo --no-tag B &&
test_line_count = 1 repo/.git/reftable/tables.list
+ '
+
++test_expect_success 'ref transaction: alternating table sizes are compacted' '
++ test_when_finished "rm -rf repo" &&
++ git init repo &&
++ test_commit -C repo A &&
++ for i in $(test_seq 20)
++ do
++ git -C repo branch -f foo &&
++ git -C repo branch -d foo || return 1
++ done &&
++ test_line_count = 2 repo/.git/reftable/tables.list
++'
++
+ check_fsync_events () {
+ local trace="$1" &&
+ shift &&
@@ t/t0610-reftable-basics.sh: test_expect_success 'ref transaction: writes are synced' '
git -C repo -c core.fsync=reference \
-c core.fsyncMethod=fsync update-ref refs/heads/branch HEAD &&
@@ t/t0610-reftable-basics.sh: do
git -C repo pack-refs &&
test_expect_perms "-rw-rw-r--" repo/.git/reftable/tables.list &&
@@ t/t0610-reftable-basics.sh: test_expect_success 'worktree: pack-refs in main repo packs main refs' '
+ test_when_finished "rm -rf repo worktree" &&
+ git init repo &&
test_commit -C repo A &&
- git -C repo worktree add ../worktree &&
+- git -C repo worktree add ../worktree &&
++ GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C repo worktree add ../worktree &&
++ GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C worktree update-ref refs/worktree/per-worktree HEAD &&
- test_line_count = 3 repo/.git/worktrees/worktree/reftable/tables.list &&
- test_line_count = 4 repo/.git/reftable/tables.list &&
-+ test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
-+ test_line_count = 1 repo/.git/reftable/tables.list &&
++ test_line_count = 4 repo/.git/worktrees/worktree/reftable/tables.list &&
++ test_line_count = 3 repo/.git/reftable/tables.list &&
git -C repo pack-refs &&
- test_line_count = 3 repo/.git/worktrees/worktree/reftable/tables.list &&
-+ test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
++ test_line_count = 4 repo/.git/worktrees/worktree/reftable/tables.list &&
test_line_count = 1 repo/.git/reftable/tables.list
'
@@ t/t0610-reftable-basics.sh: test_expect_success 'worktree: pack-refs in worktree packs worktree refs' '
+ test_when_finished "rm -rf repo worktree" &&
+ git init repo &&
test_commit -C repo A &&
- git -C repo worktree add ../worktree &&
+- git -C repo worktree add ../worktree &&
++ GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C repo worktree add ../worktree &&
++ GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C worktree update-ref refs/worktree/per-worktree HEAD &&
- test_line_count = 3 repo/.git/worktrees/worktree/reftable/tables.list &&
- test_line_count = 4 repo/.git/reftable/tables.list &&
-+ test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
-+ test_line_count = 1 repo/.git/reftable/tables.list &&
++ test_line_count = 4 repo/.git/worktrees/worktree/reftable/tables.list &&
++ test_line_count = 3 repo/.git/reftable/tables.list &&
git -C worktree pack-refs &&
test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
- test_line_count = 4 repo/.git/reftable/tables.list
-+ test_line_count = 1 repo/.git/reftable/tables.list
++ test_line_count = 3 repo/.git/reftable/tables.list
'
test_expect_success 'worktree: creating shared ref updates main stack' '
+ test_when_finished "rm -rf repo worktree" &&
+ git init repo &&
+ test_commit -C repo A &&
++ test_commit -C repo B &&
+
+ git -C repo worktree add ../worktree &&
+ git -C repo pack-refs &&
@@ t/t0610-reftable-basics.sh: test_expect_success 'worktree: creating shared ref updates main stack' '
+ test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
+ test_line_count = 1 repo/.git/reftable/tables.list &&
- git -C worktree update-ref refs/heads/shared HEAD &&
+- git -C worktree update-ref refs/heads/shared HEAD &&
++ GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C worktree update-ref refs/heads/shared HEAD &&
test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
-- test_line_count = 2 repo/.git/reftable/tables.list
-+ test_line_count = 1 repo/.git/reftable/tables.list
+ test_line_count = 2 repo/.git/reftable/tables.list
'
-
- test_expect_success 'worktree: creating per-worktree ref updates worktree stack' '
-: ----------- > 3: a23e3fc6972 reftable/segment: make segment end inclusive
--
gitgitgadget
next prev parent reply other threads:[~2024-03-21 22:40 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-05 20:03 [PATCH] reftable/stack: use geometric table compaction Justin Tobler via GitGitGadget
2024-03-06 12:30 ` Patrick Steinhardt
2024-03-06 12:37 ` Patrick Steinhardt
2024-03-21 22:48 ` Justin Tobler
2024-03-21 22:40 ` Justin Tobler via GitGitGadget [this message]
2024-03-21 22:40 ` [PATCH v2 1/3] reftable/stack: add env to disable autocompaction Justin Tobler via GitGitGadget
2024-03-22 1:25 ` Patrick Steinhardt
2024-03-21 22:40 ` [PATCH v2 2/3] reftable/stack: use geometric table compaction Justin Tobler via GitGitGadget
2024-03-22 1:25 ` Patrick Steinhardt
2024-03-27 13:24 ` Karthik Nayak
2024-03-21 22:40 ` [PATCH v2 3/3] reftable/segment: make segment end inclusive Justin Tobler via GitGitGadget
2024-03-22 1:25 ` [PATCH v2 0/3] reftable/stack: use geometric table compaction Patrick Steinhardt
2024-04-03 10:13 ` Han-Wen Nienhuys
2024-04-03 10:18 ` Patrick Steinhardt
2024-04-03 15:14 ` Justin Tobler
2024-04-03 16:40 ` Junio C Hamano
2024-03-29 4:16 ` [PATCH v3 " Justin Tobler via GitGitGadget
2024-03-29 4:16 ` [PATCH v3 1/3] reftable/stack: add env to disable autocompaction Justin Tobler via GitGitGadget
2024-03-29 18:25 ` Junio C Hamano
2024-03-29 21:56 ` Junio C Hamano
2024-04-02 7:23 ` Patrick Steinhardt
2024-04-02 17:23 ` Junio C Hamano
2024-03-29 4:16 ` [PATCH v3 2/3] reftable/stack: use geometric table compaction Justin Tobler via GitGitGadget
2024-04-02 7:23 ` Patrick Steinhardt
2024-03-29 4:16 ` [PATCH v3 3/3] reftable/stack: make segment end inclusive Justin Tobler via GitGitGadget
2024-03-29 18:36 ` Junio C Hamano
2024-04-02 7:23 ` Patrick Steinhardt
2024-04-03 0:20 ` [PATCH v4 0/2] reftable/stack: use geometric table compaction Justin Tobler via GitGitGadget
2024-04-03 0:20 ` [PATCH v4 1/2] reftable/stack: add env to disable autocompaction Justin Tobler via GitGitGadget
2024-04-03 0:20 ` [PATCH v4 2/2] reftable/stack: use geometric table compaction Justin Tobler via GitGitGadget
2024-04-03 4:47 ` [PATCH v4 0/2] " Patrick Steinhardt
2024-04-03 11:12 ` Karthik Nayak
2024-04-03 16:56 ` Junio C Hamano
2024-04-04 18:29 ` [PATCH v5 0/3] " Justin Tobler via GitGitGadget
2024-04-04 18:29 ` [PATCH v5 1/3] reftable/stack: allow disabling of auto-compaction Justin Tobler via GitGitGadget
2024-04-08 6:12 ` Patrick Steinhardt
2024-04-04 18:29 ` [PATCH v5 2/3] reftable/stack: add env to disable autocompaction Justin Tobler via GitGitGadget
2024-04-08 6:12 ` Patrick Steinhardt
2024-04-08 16:18 ` Junio C Hamano
2024-04-04 18:29 ` [PATCH v5 3/3] reftable/stack: use geometric table compaction Justin Tobler via GitGitGadget
2024-04-08 6:12 ` [PATCH v5 0/3] " Patrick Steinhardt
2024-04-08 16:17 ` Justin Tobler
2024-04-08 16:16 ` [PATCH v6 " Justin Tobler via GitGitGadget
2024-04-08 16:16 ` [PATCH v6 1/3] reftable/stack: expose option to disable auto-compaction Justin Tobler via GitGitGadget
2024-04-08 16:16 ` [PATCH v6 2/3] reftable/stack: add env to disable autocompaction Justin Tobler via GitGitGadget
2024-04-08 16:16 ` [PATCH v6 3/3] reftable/stack: use geometric table compaction Justin Tobler via GitGitGadget
2024-04-08 16:20 ` [PATCH v6 0/3] " Patrick Steinhardt
2024-04-08 19:12 ` Junio C Hamano
2024-04-03 19:12 ` [PATCH v2 " Junio C Hamano
2024-04-03 19:30 ` Patrick Steinhardt
2024-04-04 5:34 ` Patrick Steinhardt
2024-04-04 18:28 ` Justin Tobler
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.1683.v2.git.1711060819.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=git@vger.kernel.org \
--cc=jltobler@gmail.com \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).