Git Mailing List Archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "John Cai via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, John Cai <johncai86@gmail.com>,
	John Cai <jcai@gitlab.com>
Subject: Re: [PATCH] pack-refs: teach --exclude option to exclude refs from being packed
Date: Thu, 04 May 2023 09:48:57 -0700	[thread overview]
Message-ID: <xmqq4joskpom.fsf@gitster.g> (raw)
In-Reply-To: <pull.1501.git.git.1683215331910.gitgitgadget@gmail.com> (John Cai via GitGitGadget's message of "Thu, 04 May 2023 15:48:51 +0000")

"John Cai via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: John Cai <jcai@gitlab.com>
>
> Currently once refs are packed into a pack-refs file, deleting them can be
> slow since it involves a full file scan of the entire pack-refs file. At

"pack-refs" -> "packed-refs".

But performing a full file scan of 100MB file would take how many
milliseconds?  Having to remove many refs from a single packed-refs
file would be costly if it is done one-by-one, though.  I wonder how
effective our batched update mechanism is these days...

Sorry for straying to a tangent.  In any case, I do not think the
first sentence is necessary; just start it with "At GitLab, ..." to
say that you have a need to be more selective than "is it a tag or
not?" to choose refs to be and not to be packed.  The reason why we
may want to leave a ref loose is not limited to ref deletion
performance, and the benefit of this new feature is not limited to
it, either.

> GitLab, we have a system that creates ephemeral internal refs that don't
> live long before getting deleted. Having an option to not include certain
> refs from a pack-refs file allows these internal references to be deleted
> much more efficiently.

I think that is a valid use case.

If we step back a bit, we can see "pack-refs" has an option "--all"
that was added by b3d4204f (git-pack-refs --all, 2006-10-08), to
allow us pack only tags by default, because packing branches that
are meant to be updated often and also removed more often than tags
were found to be detrimental.  We can view this new option a
follow-up work for the commit, to allow the users to define what to
be and what not to be packed, depending on their workflow.

This observation also makes us realize that we should consider the
opposite.  It would benefit us to include some refs that we normally
do not pack and be more selective than "--all" ("--exclude" proposed
here is a way to leave some refs that we normally pack and be more
selective than not running pack-refs at all).  A set of branches
that are only occasionally used may want to be packed while the rest
of branches want to be left loose because they are more active, or
something.

> Add an --exclude option to the pack-refs builtin, and use the ref
> exclusions API to exclude certain refs from being packed into the final
> pack-refs file

"pack-refs" appears here, too.

>  Documentation/git-pack-refs.txt |  8 +++++++-
>  builtin/pack-refs.c             | 17 +++++++++++++++--
>  refs.c                          |  4 ++--
>  refs.h                          |  6 +++++-
>  refs/debug.c                    |  4 ++--
>  refs/files-backend.c            | 13 ++++++++++---
>  refs/packed-backend.c           |  3 ++-
>  refs/refs-internal.h            |  4 +++-
>  t/helper/test-ref-store.c       |  3 ++-
>  t/t3210-pack-refs.sh            | 18 ++++++++++++++++++
>  10 files changed, 66 insertions(+), 14 deletions(-)
>
> diff --git a/Documentation/git-pack-refs.txt b/Documentation/git-pack-refs.txt
> index 154081f2de2..d80e0a1562d 100644
> --- a/Documentation/git-pack-refs.txt
> +++ b/Documentation/git-pack-refs.txt
> @@ -8,7 +8,7 @@ git-pack-refs - Pack heads and tags for efficient repository access
>  SYNOPSIS
>  --------
>  [verse]
> -'git pack-refs' [--all] [--no-prune]
> +'git pack-refs' [--all] [--no-prune] [--exclude <pattern>]
>  
>  DESCRIPTION
>  -----------
> @@ -59,6 +59,12 @@ a repository with many branches of historical interests.
>  The command usually removes loose refs under `$GIT_DIR/refs`
>  hierarchy after packing them.  This option tells it not to.
>  
> +--exclude <pattern>::
> +
> +Do not pack refs matching the given `glob(7)` pattern. Repetitions of this option
> +accumulate exclusion patterns. Use `--no-exclude` to clear and reset the list of
> +patterns.

The interaction of this option with "--all" needs to be described
somewhere.  If we are to be adding "--include" for completeness,
that one also needs to interact with "--all".

> diff --git a/refs.c b/refs.c
> index d2a98e1c21f..637810347a0 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -2132,9 +2132,9 @@ void base_ref_store_init(struct ref_store *refs, struct repository *repo,
>  }
>  
>  /* backend functions */
> -int refs_pack_refs(struct ref_store *refs, unsigned int flags)
> +int refs_pack_refs(struct ref_store *refs, unsigned int flags, struct pack_refs_opts *pack_opts)
>  {
> -	return refs->be->pack_refs(refs, flags);
> +	return refs->be->pack_refs(refs, flags, pack_opts);

That's a curious choice of the API.  It is not like backend methods
all share the same "flags" bitset (they share "refs" pointer to the
ref_store), so I would have expected that it would become part of
the pack_refs_opts structure.  Do not call the parameter pack_opts;
either spell it out as "pack_refs_opts", or just use "opts".  The
latter is probably more preferrable as I do not expect it to be
ambiguous with other kinds of "opts".

The rest of the pack I found nothing unexpected or surprising.

  reply	other threads:[~2023-05-04 16:49 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-04 15:48 [PATCH] pack-refs: teach --exclude option to exclude refs from being packed John Cai via GitGitGadget
2023-05-04 16:48 ` Junio C Hamano [this message]
2023-05-04 21:26   ` John Cai
2023-05-09 19:18 ` [PATCH v2 0/3] pack-refs: Teach " John Cai via GitGitGadget
2023-05-09 19:18   ` [PATCH v2 1/3] docs: clarify git-pack-refs --all will pack all refs John Cai via GitGitGadget
2023-05-09 19:18   ` [PATCH v2 2/3] pack-refs: teach --exclude option to exclude refs from being packed John Cai via GitGitGadget
2023-05-09 21:04     ` Junio C Hamano
2023-05-09 19:18   ` [PATCH v2 3/3] pack-refs: teach pack-refs --include option John Cai via GitGitGadget
2023-05-09 21:25     ` Junio C Hamano
2023-05-10 19:52       ` John Cai
2023-05-11 18:10   ` [PATCH v3 0/4] pack-refs: allow users control over which refs to pack John Cai via GitGitGadget
2023-05-11 18:10     ` [PATCH v3 1/4] docs: clarify git-pack-refs --all will pack all refs John Cai via GitGitGadget
2023-05-11 23:53       ` Taylor Blau
2023-05-11 18:10     ` [PATCH v3 2/4] pack-refs: teach --exclude option to exclude refs from being packed John Cai via GitGitGadget
2023-05-11 19:34       ` Junio C Hamano
2023-05-12  0:00       ` Taylor Blau
2023-05-12 12:53         ` John Cai
2023-05-12 21:11           ` John Cai
2023-05-11 18:10     ` [PATCH v3 3/4] revision: modify ref_exclusions to handle inclusions John Cai via GitGitGadget
2023-05-11 19:54       ` Junio C Hamano
2023-05-12 14:56         ` John Cai
2023-05-11 18:10     ` [PATCH v3 4/4] pack-refs: teach pack-refs --include option John Cai via GitGitGadget
2023-05-11 20:06       ` Junio C Hamano
2023-05-12 14:48         ` John Cai
2023-05-12 19:03       ` John Cai
2023-05-12 21:34     ` [PATCH v4 0/3] pack-refs: allow users control over which refs to pack John Cai via GitGitGadget
2023-05-12 21:34       ` [PATCH v4 1/3] docs: clarify git-pack-refs --all will pack all refs John Cai via GitGitGadget
2023-05-12 21:34       ` [PATCH v4 2/3] pack-refs: teach --exclude option to exclude refs from being packed John Cai via GitGitGadget
2023-05-12 21:34       ` [PATCH v4 3/3] pack-refs: teach pack-refs --include option John Cai via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqq4joskpom.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=jcai@gitlab.com \
    --cc=johncai86@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).