From: Derrick Stolee <derrickstolee@github.com>
To: Taylor Blau <me@ttaylorr.com>, git@vger.kernel.org
Cc: Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v2] builtin/pack-objects.c: introduce `pack.extraCruftTips`
Date: Wed, 3 May 2023 10:01:51 -0400 [thread overview]
Message-ID: <b29ff03e-5504-af04-72c6-f98abde1442f@github.com> (raw)
In-Reply-To: <73ad7b90e1fe6c15f41ff828651f7ab06076ffd8.1683072587.git.me@ttaylorr.com>
On 5/2/2023 8:09 PM, Taylor Blau wrote:
> This patch introduces a new multi-valued configuration option,
> `pack.extraCruftTips` as an alternative means to mark certain objects in
> the cruft pack as rescued, even if they would otherwise be pruned.
> Range-diff against v1:
> 1: 8af478ebe3 ! 1: 73ad7b90e1 builtin/pack-objects.c: introduce `pack.extraCruftTips`
> @@ Commit message
>
> - to point a reference at them, which may be undesirable or
> infeasible,
> + - to track them via the reflog, which may be undesirable since the
> + reflog's lifetime is limited to that of the reference it's tracking
> + (and callers may want to keep those unreachable objects around for
> + longer)
> - to extend the grace period, which may keep around other objects that
> the caller *does* want to discard,
> - or, to force the caller to construct the pack of objects they want
> @@ Documentation/config/pack.txt: pack.deltaCacheLimit::
> ++
> +Output must contain exactly one hex object ID per line, and nothing
> +else. Objects which cannot be found in the repository are ignored.
> ++Multiple hooks are supported, but all must exit successfully, else no
> ++cruft pack will be generated.
Thanks for these updates.
> pack.threads::
> Specifies the number of threads to spawn when searching for best
> @@ builtin/pack-objects.c: static void enumerate_and_traverse_cruft_objects(struct
> + add_pending_object(revs, obj, "");
> + }
> +
> ++ ret = finish_command(&cmd);
> ++
> +done:
> + if (out)
> + fclose(out);
> @@ builtin/pack-objects.c: static void enumerate_and_traverse_cruft_objects(struct
> + for (i = 0; i < programs->nr; i++) {
> + ret = enumerate_extra_cruft_tips_1(&revs,
> + programs->items[i].string);
> -+ if (!ret)
> ++ if (ret)
> + break;
> + }
If we have a failure, then we stop immediately and report a failure
(from this bit outside the range-diff's context):
> + if (ret)
> + die(_("unable to enumerate additional cruft tips"));
> @@ t/t5329-pack-objects-cruft.sh: test_expect_success 'cruft objects are freshend v
> + cruft_new="$(git rev-parse HEAD)" &&
> +
> + git checkout main &&
> -+ git branch -D old new &&
> ++ git branch -D discard old new &&
Good update.
> + git reflog expire --all --expire=all &&
> +
> + # mark cruft.old with an mtime that is many minutes
> @@ t/t5329-pack-objects-cruft.sh: test_expect_success 'cruft objects are freshend v
> + mtimes="$(ls .git/objects/pack/pack-*.mtimes)" &&
> + git show-index <${mtimes%.mtimes}.idx >cruft &&
> + cut -d" " -f2 cruft | sort >cruft.actual &&
> -+ test_cmp cruft.expect cruft.actual
> ++ test_cmp cruft.expect cruft.actual &&
> ++
> ++ # Ensure that the "old" objects are removed after
> ++ # dropping the pack.extraCruftTips hook.
> ++ git config --unset pack.extraCruftTips &&
> ++ git repack --cruft --cruft-expiration=now -d &&
Double-checking that removing the tips correctly removes the objects
at this time is good, but...
> ++ mtimes="$(ls .git/objects/pack/pack-*.mtimes)" &&
> ++ git show-index <${mtimes%.mtimes}.idx >cruft &&
> ++ cut -d" " -f2 cruft | sort >cruft.actual &&
> ++
> ++ git rev-list --objects --no-object-names $cruft_new >cruft.raw &&
> ++ cp cruft.expect cruft.old &&
> ++ sort cruft.raw >cruft.expect &&
> ++ test_cmp cruft.expect cruft.actual &&
> ++
> ++ # ensure objects which are no longer in the cruft pack were
> ++ # removed from the repository
> ++ for object in $(comm -13 cruft.expect cruft.old)
> ++ do
> ++ test_must_fail git cat-file -t $object || return 1
> ++ done
...it would be good to do this check before the removal of the hook
to be sure the objects from 'discard' are removed as part of the step
with the hook. I can imagine a world where the hook-based approach
erroneously keeps the 'discard' objects in the non-cruft pack only
for them to be deleted by the repack where hooks are disabled, and
having a test would help guarantee we don't live in that hypothetical
world.
I would also be satisfied with checking just the commits that were
previously referenced by 'discard' and 'old', but this more
thorough check is also good.
> ++test_expect_success 'multi-valued pack.extraCruftTips' '
> ++ git init repo &&
> ++ test_when_finished "rm -fr repo" &&
> ++ (
> ++ cd repo &&
> ++
> ++ test_commit base &&
> ++ git branch -M main &&
> ++
> ++ git checkout --orphan cruft.a &&
> ++ git rm -fr . &&
> ++ test_commit --no-tag cruft.a &&
> ++ cruft_a="$(git rev-parse HEAD)" &&
> ++
> ++ git checkout --orphan cruft.b &&
> ++ git rm -fr . &&
> ++ test_commit --no-tag cruft.b &&
> ++ cruft_b="$(git rev-parse HEAD)" &&
> ++
> ++ git checkout main &&
> ++ git branch -D cruft.a cruft.b &&
> ++ git reflog expire --all --expire=all &&
> ++
> ++ echo "echo $cruft_a" | write_script extra-tips.a &&
> ++ echo "echo $cruft_b" | write_script extra-tips.b &&
> ++ echo "false" | write_script extra-tips.c &&
> ++
> ++ git rev-list --objects --no-object-names $cruft_a $cruft_b \
> ++ >cruft.raw &&
> ++ sort cruft.raw >cruft.expect &&
> ++
> ++ # ensure that each extra cruft tip is saved by its
> ++ # respective hook
> ++ git config --add pack.extraCruftTips ./extra-tips.a &&
> ++ git config --add pack.extraCruftTips ./extra-tips.b &&
> ++ git repack --cruft --cruft-expiration=now -d &&
> ++
> ++ mtimes="$(ls .git/objects/pack/pack-*.mtimes)" &&
> ++ git show-index <${mtimes%.mtimes}.idx >cruft &&
> ++ cut -d" " -f2 cruft | sort >cruft.actual &&
> ++ test_cmp cruft.expect cruft.actual &&
> ++
> ++ # ensure that a dirty exit halts cruft pack generation
> ++ git config --add pack.extraCruftTips ./extra-tips.c &&
> ++ test_must_fail git repack --cruft -d 2>err &&
> ++ grep "unable to enumerate additional cruft tips" err &&
> ++
> ++ # and that the existing cruft pack is left alone
> ++ test_path_is_file "$mtimes"
> + )
This new test looks good to me.
Thanks,
-Stolee
next prev parent reply other threads:[~2023-05-03 14:01 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-20 17:27 [PATCH] builtin/pack-objects.c: introduce `pack.extraCruftTips` Taylor Blau
2023-04-20 18:12 ` Junio C Hamano
2023-04-20 19:30 ` Taylor Blau
2023-04-20 19:52 ` Junio C Hamano
2023-04-20 20:48 ` Taylor Blau
2023-04-21 0:10 ` Chris Torek
2023-04-21 2:14 ` Taylor Blau
2023-04-25 19:42 ` Derrick Stolee
2023-04-25 21:25 ` Taylor Blau
2023-04-26 10:52 ` Derrick Stolee
2023-05-03 0:06 ` Taylor Blau
2023-05-03 0:09 ` [PATCH v2] " Taylor Blau
2023-05-03 14:01 ` Derrick Stolee [this message]
2023-05-03 19:59 ` Jeff King
2023-05-03 21:22 ` Taylor Blau
2023-05-05 21:23 ` Jeff King
2023-05-06 0:06 ` Taylor Blau
2023-05-06 0:14 ` Taylor Blau
2023-05-03 21:28 ` Taylor Blau
2023-05-05 21:26 ` Jeff King
2023-05-05 22:13 ` Jeff King
2023-05-06 0:13 ` Taylor Blau
2023-05-06 0:20 ` Taylor Blau
2023-05-06 2:12 ` Jeff King
2023-05-03 22:05 ` [PATCH v3] " Taylor Blau
2023-05-03 23:18 ` Junio C Hamano
2023-05-03 23:42 ` Junio C Hamano
2023-05-03 23:48 ` Taylor Blau
2023-05-03 23:50 ` Taylor Blau
2023-05-05 21:39 ` Jeff King
2023-05-05 22:19 ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=b29ff03e-5504-af04-72c6-f98abde1442f@github.com \
--to=derrickstolee@github.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=me@ttaylorr.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).