Git Mailing List Archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org, Chris Torek <chris.torek@gmail.com>,
	Derrick Stolee <derrickstolee@github.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v3 2/2] builtin/pack-objects.c: introduce `pack.recentObjectsHook`
Date: Fri, 12 May 2023 18:01:17 -0400	[thread overview]
Message-ID: <20230512220117.GA2971903@coredump.intra.peff.net> (raw)
In-Reply-To: <20230512214542.GB2495860@coredump.intra.peff.net>

On Fri, May 12, 2023 at 05:45:43PM -0400, Jeff King wrote:

> The thing that confused me when looking at the code earlier is that
> git-prune itself checks the mtime of the objects, too, even if they were
> not mentioned in the recent-but-reachable walk. That seems redundant to
> me, but somehow it isn't. If I do:
> 
> diff --git a/builtin/prune.c b/builtin/prune.c
> index 5dc9b20720..22b7ce4b10 100644
> --- a/builtin/prune.c
> +++ b/builtin/prune.c
> @@ -92,7 +92,7 @@ static int prune_object(const struct object_id *oid, const char *fullpath,
>  		return 0;
>  	}
>  	if (st.st_mtime > expire)
> -		return 0;
> +		BUG("object should have been saved via is_object_reachable!");
>  	if (show_only || verbose) {
>  		enum object_type type = oid_object_info(the_repository, oid,
>  							NULL);
> 
> then t5304 shows some failures. I'm not quite sure what is going on, but
> _if_ that mtime check in prune_object() is important, then it probably
> should also respect the hook mechanism. So there may be a (3) to add to
> your patch there.

Ah, I see. The problem is that "git prune --expire=never" relies on this
check. In mark_reachable_objects(), we take a "0" expiration as "do not
bother adding these objects to the traversal", so it's here that we
catch them and say "ah, we are not expiring anything, so keep this".

If my hack above is switched to:

diff --git a/builtin/prune.c b/builtin/prune.c
index 5dc9b20720..108305cd26 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -91,8 +91,11 @@ static int prune_object(const struct object_id *oid, const char *fullpath,
 		error("Could not stat '%s'", fullpath);
 		return 0;
 	}
-	if (st.st_mtime > expire)
-		return 0;
+	if (st.st_mtime > expire) {
+		if (!expire)
+			return 0; /* we are not expiring anything! */
+		BUG("object should have been saved via is_object_reachable!");
+	}
 	if (show_only || verbose) {
 		enum object_type type = oid_object_info(the_repository, oid,
 							NULL);

then all of the tests pass. I do suspect this code could trigger racily
in the real world (we do our traversal, and then a new object is added,
which is of course recent). So we would not want to drop the check.

Is it important there for your patch to kick in and say "we were
specifically asked to consider this object recent, so do so"? I think it
shouldn't matter, because any such object is by definition recent,
having an mtime greater than or equal to when we started (it cannot even
get racily old while we traverse, because we decide the cutoff time as
an absolute value even before starting the traversal). The strict
greater-than in the check makes it feel like an off-by-one problem, but
unless you are saying "now" it must be at least 1 second in the past
anyway.

It does make me wonder why "git prune --expire=never" does not simply
exit immediately. Isn't there by definition nothing useful to do?

-Peff

  reply	other threads:[~2023-05-12 22:01 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-11 23:20 [PATCH v3 0/2] pack-objects: introduce `pack.extraRecentObjectsHook` Taylor Blau
2023-05-11 23:20 ` [PATCH v3 1/2] reachable.c: extract `obj_is_recent()` Taylor Blau
2023-05-11 23:20 ` [PATCH v3 2/2] builtin/pack-objects.c: introduce `pack.recentObjectsHook` Taylor Blau
2023-05-12  4:58   ` Jeff King
2023-05-15 20:15     ` Taylor Blau
2023-05-12 21:24   ` Jeff King
2023-05-12 21:36     ` Taylor Blau
2023-05-12 21:46       ` Jeff King
2023-05-12 21:45     ` Jeff King
2023-05-12 22:01       ` Jeff King [this message]
2023-05-12 23:21         ` Junio C Hamano
2023-05-13  0:11           ` Jeff King
2023-05-13  0:11             ` Jeff King
2023-05-15 20:49       ` Taylor Blau
2023-05-15 20:38     ` Taylor Blau
2023-05-11 23:23 ` [PATCH v3 0/2] pack-objects: introduce `pack.extraRecentObjectsHook` Taylor Blau
2023-05-11 23:39 ` Junio C Hamano
2023-05-11 23:48   ` Taylor Blau
2023-05-16  0:23 ` [PATCH v4 0/2] gc: introduce `gc.recentObjectsHook` Taylor Blau
2023-05-16  0:24   ` [PATCH v4 1/2] reachable.c: extract `obj_is_recent()` Taylor Blau
2023-05-16  0:24   ` [PATCH v4 2/2] gc: introduce `gc.recentObjectsHook` Taylor Blau
2023-05-24 23:21     ` Glen Choo
2023-06-07 22:56       ` Taylor Blau
2023-06-07 22:58 ` [PATCH v5 0/2] " Taylor Blau
2023-06-07 22:58   ` [PATCH v5 1/2] reachable.c: extract `obj_is_recent()` Taylor Blau
2023-06-07 22:58   ` [PATCH v5 2/2] gc: introduce `gc.recentObjectsHook` Taylor Blau
2023-06-09 23:33     ` Glen Choo
2023-06-12 21:14       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230512220117.GA2971903@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=chris.torek@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).