From: Christian Couder <christian.couder@gmail.com>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
John Cai <johncai86@gmail.com>,
Jonathan Tan <jonathantanmy@google.com>,
Jonathan Nieder <jrnieder@gmail.com>,
Derrick Stolee <stolee@gmail.com>, Patrick Steinhardt <ps@pks.im>,
Christian Couder <chriscool@tuxfamily.org>
Subject: Re: [PATCH 6/9] repack: add `--filter=<filter-spec>` option
Date: Wed, 5 Jul 2023 09:18:54 +0200 [thread overview]
Message-ID: <CAP8UFD1oOiH71NfzgNhE_aCMHT8U8mCNADoAfpghxgXzEN30iw@mail.gmail.com> (raw)
In-Reply-To: <ZJLcS78DDOjn6MzP@nand.local>
On Wed, Jun 21, 2023 at 1:17 PM Taylor Blau <me@ttaylorr.com> wrote:
>
> On Wed, Jun 14, 2023 at 09:25:38PM +0200, Christian Couder wrote:
> > ---
> > Documentation/git-repack.txt | 5 +++
> > builtin/repack.c | 75 ++++++++++++++++++++++++++++++++++--
> > t/t7700-repack.sh | 16 ++++++++
> > 3 files changed, 93 insertions(+), 3 deletions(-)
>
> Having read through the implementation in the repack builtin, I am
> almost certain that my suggestion earlier in the thread to implement
> this in terms of 'git pack-objects --filter' and 'git pack-objects
> --stdin-packs' would work.
Yeah, it works, thanks. That's what is used in the version 2 I just sent.
> > diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
> > index 4017157949..aa29c7e648 100644
> > --- a/Documentation/git-repack.txt
> > +++ b/Documentation/git-repack.txt
> > @@ -143,6 +143,11 @@ depth is 4095.
> > a larger and slower repository; see the discussion in
> > `pack.packSizeLimit`.
> >
> > +--filter=<filter-spec>::
> > + Remove objects matching the filter specification from the
> > + resulting packfile and put them into a separate packfile. See
> > + linkgit:git-rev-list[1] for valid `<filter-spec>` forms.
> > +
>
> This documentation leaves me with a handful of questions about how it
> interacts with other options.
I have improved this doc in version 2. It now says:
--filter=<filter-spec>::
Remove objects matching the filter specification from the
resulting packfile and put them into a separate packfile. Note
that objects used in the working directory are not filtered
out. So for the split to fully work, it's best to perform it
in a bare repo and to use the `-a` and `-d` options along with
this option. See linkgit:git-rev-list[1] for valid
`<filter-spec>` forms.
For comparison the doc about --cruft is:
--cruft::
Same as `-a`, unless `-d` is used. Then any unreachable objects
are packed into a separate cruft pack. Unreachable objects can
be pruned using the normal expiry rules with the next `git gc`
invocation (see linkgit:git-gc[1]). Incompatible with `-k`.
> Here are some:
>
> - What happens when you pass it with "-d"? Does it delete objects that
> didn't match the filter? Leave them alone?
With or without "-d", no object is deleted, objects are just put into
2 different packfiles depending on whether or not they match the
filter.
> If the latter, should
> this combination be declared invalid instead of silently ignoring
> the user's request to delete redundant packs?
"-d" is indeed about removing redundant packs. If you want to split
the objects into different packs according to a filter, then you
probably don't want to keep packs that contain both kinds of objects
that you want to separate into different packs, so the doc now
recommends using "-d" (along with "-a") when using "--filter". That's
also what the tests are using.
I am not sure it's worth erroring out when "-d", or "-a", is not used.
Perhaps there are some use cases where it's interesting to keep old
packs, or, in case of not using "-a", to split only loose objects into
separate packs, but not objects from existing packs.
> - What happens with --max-pack-size? Does the filtered pack get split
> into multiple packs (as I think we would expect from such a
> combination)?
Yes, in version 2 I have added a test (in the commit introducing
--filter-to) that checks that objects that are filtered out and that
are larger than --max-pack-size get split into their own packs.
> - What about with `--cruft`? Does it split the cruft pack into two
> based on whether or not the unreachable object(s) matched or didn't
> match the filter?
I am not sure as I don't know well how cruft works. I would need to
check the code and/or test it.
> - What happens when passed with "--geometric"? I don't think there is
> a sensible interpretation (at least, I can't think of what it would
> mean to do "--filter=<spec> --geometric=<factor>" off the top of my
> head).
Not sure either, as I also don't know well how --geometric works.
> - What about with "--write-bitmap-index"? Do we write one bitmap
> index? Two? If the latter, do we combine the packs into a MIDX
> before writing the bitmap? Should we?
In the tests, repack --filter is used with "-c
repack.writebitmaps=false" as otherwise the bitmap writing code tends
to complain that a pack is not complete or something like that. I
think it would make sense to have options that would make the bitmap
writing code write a regular bitmap index if all the objects are in a
single pack, but a MIDX if they are in multiple packs. I don't know
what --cruft or --max-pack-size are doing about this, but I think such
a feature could be useful in case those options are used.
In the meantime I would be Ok with disallowing using both
--write-bitmap-index and --filter. The issue is that writing a bitmap
index is the default behavior, even without --write-bitmap-index, so
it might be a bit more complex, but I plan to do something about it in
version 3.
> I think it may be worth spelling out answers to some of these questions
> in the documentation, and codifying those answers in the form of tests.
In version 2, the doc has been improved and now it seems comparable to
the --cruft doc. About the tests, I added one in version 2 related to
--max-pack-size, and I am open to adding a few others, but I don't
think it's a good idea to add a lot of them. It just doesn't scale
when commands have more than a few options.
The new code is reusing existing features (like --stdin-packs) that
are already used by other code (like --cruft code). It is not adding
new intricate mechanisms, nor a lot of new code. It is not changing
code used by other features except for a few clean refactorings. This
should give a good indication that it will work well with other
features.
> This makes me wonder whether or not this option should belong in repack
> at all, or whether there should be some new special-purpose builtin that
> is designed to split existing pack(s) based on whether or not they meet
> some filter criteria.
--cruft is also splitting existing packs based on some criteria. So if
we go this way, perhaps --cruft code should be moved first to this new
special purpose builtin?
next prev parent reply other threads:[~2023-07-05 7:19 UTC|newest]
Thread overview: 161+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-14 19:25 [PATCH 0/9] Repack objects into separate packfiles based on a filter Christian Couder
2023-06-14 19:25 ` [PATCH 1/9] pack-objects: allow `--filter` without `--stdout` Christian Couder
2023-06-21 10:49 ` Taylor Blau
2023-07-05 6:16 ` Christian Couder
2023-06-14 19:25 ` [PATCH 2/9] pack-objects: add `--print-filtered` to print omitted objects Christian Couder
2023-06-15 22:50 ` Junio C Hamano
2023-06-21 10:52 ` Taylor Blau
2023-06-21 11:11 ` Christian Couder
2023-06-21 11:54 ` Taylor Blau
2023-06-14 19:25 ` [PATCH 3/9] t/helper: add 'find-pack' test-tool Christian Couder
2023-06-15 23:32 ` Junio C Hamano
2023-06-21 10:40 ` Christian Couder
2023-06-21 10:54 ` Taylor Blau
2023-06-14 19:25 ` [PATCH 4/9] repack: refactor piping an oid to a command Christian Couder
2023-06-15 23:46 ` Junio C Hamano
2023-06-21 10:55 ` Taylor Blau
2023-06-21 10:56 ` Christian Couder
2023-06-14 19:25 ` [PATCH 5/9] repack: refactor finishing pack-objects command Christian Couder
2023-06-16 0:13 ` Junio C Hamano
2023-06-21 11:06 ` Taylor Blau
2023-06-21 11:19 ` Christian Couder
2023-06-21 11:05 ` Taylor Blau
2023-06-14 19:25 ` [PATCH 6/9] repack: add `--filter=<filter-spec>` option Christian Couder
2023-06-16 0:43 ` Junio C Hamano
2023-06-21 11:20 ` Taylor Blau
2023-06-21 15:04 ` Christian Couder
2023-06-22 11:05 ` Taylor Blau
2023-06-21 14:40 ` Christian Couder
2023-06-21 16:53 ` Junio C Hamano
2023-06-22 8:39 ` Christian Couder
2023-06-22 18:32 ` Junio C Hamano
2023-06-21 11:17 ` Taylor Blau
2023-07-05 7:18 ` Christian Couder [this message]
2023-06-14 19:25 ` [PATCH 7/9] gc: add `gc.repackFilter` config option Christian Couder
2023-06-14 19:25 ` [PATCH 8/9] repack: implement `--filter-to` for storing filtered out objects Christian Couder
2023-06-16 2:21 ` Junio C Hamano
2023-06-21 11:49 ` Taylor Blau
2023-06-21 12:08 ` Christian Couder
2023-06-21 12:25 ` Taylor Blau
2023-06-21 16:44 ` Junio C Hamano
2023-07-05 6:19 ` Christian Couder
2023-06-14 19:25 ` [PATCH 9/9] gc: add `gc.repackFilterTo` config option Christian Couder
2023-06-16 2:54 ` Junio C Hamano
2023-06-14 21:36 ` [PATCH 0/9] Repack objects into separate packfiles based on a filter Junio C Hamano
2023-06-16 3:08 ` Junio C Hamano
2023-07-05 6:08 ` [PATCH v2 0/8] " Christian Couder
2023-07-05 6:08 ` [PATCH v2 1/8] pack-objects: allow `--filter` without `--stdout` Christian Couder
2023-07-05 6:08 ` [PATCH v2 2/8] t/helper: add 'find-pack' test-tool Christian Couder
2023-07-05 6:08 ` [PATCH v2 3/8] repack: refactor finishing pack-objects command Christian Couder
2023-07-05 6:08 ` [PATCH v2 4/8] repack: refactor finding pack prefix Christian Couder
2023-07-05 6:08 ` [PATCH v2 5/8] repack: add `--filter=<filter-spec>` option Christian Couder
2023-07-05 17:53 ` Junio C Hamano
2023-07-24 9:01 ` Christian Couder
2023-07-24 18:28 ` Junio C Hamano
2023-07-25 15:22 ` Christian Couder
2023-07-25 17:25 ` Junio C Hamano
2023-07-25 23:08 ` Junio C Hamano
2023-08-08 8:45 ` Christian Couder
2023-08-09 20:38 ` Taylor Blau
2023-08-09 22:50 ` Junio C Hamano
2023-08-09 23:38 ` Junio C Hamano
2023-08-10 0:10 ` Jeff King
2023-07-05 18:12 ` Junio C Hamano
2023-07-24 9:02 ` Christian Couder
2023-07-05 6:08 ` [PATCH v2 6/8] gc: add `gc.repackFilter` config option Christian Couder
2023-07-05 6:08 ` [PATCH v2 7/8] repack: implement `--filter-to` for storing filtered out objects Christian Couder
2023-07-05 18:26 ` Junio C Hamano
2023-07-24 9:00 ` Christian Couder
2023-07-24 18:18 ` Junio C Hamano
2023-07-25 13:41 ` Robert Coup
2023-07-25 16:50 ` Junio C Hamano
2023-07-25 15:45 ` Christian Couder
2023-07-05 6:08 ` [PATCH v2 8/8] gc: add `gc.repackFilterTo` config option Christian Couder
2023-07-24 8:59 ` [PATCH v3 0/8] Repack objects into separate packfiles based on a filter Christian Couder
2023-07-24 8:59 ` [PATCH v3 1/8] pack-objects: allow `--filter` without `--stdout` Christian Couder
2023-07-25 22:38 ` Taylor Blau
2023-07-25 23:51 ` Junio C Hamano
2023-07-24 8:59 ` [PATCH v3 2/8] t/helper: add 'find-pack' test-tool Christian Couder
2023-07-25 22:44 ` Taylor Blau
2023-08-08 8:28 ` Christian Couder
2023-07-24 8:59 ` [PATCH v3 3/8] repack: refactor finishing pack-objects command Christian Couder
2023-07-25 22:45 ` Taylor Blau
2023-07-24 8:59 ` [PATCH v3 4/8] repack: refactor finding pack prefix Christian Couder
2023-07-25 22:47 ` Taylor Blau
2023-08-08 8:29 ` Christian Couder
2023-07-24 8:59 ` [PATCH v3 5/8] repack: add `--filter=<filter-spec>` option Christian Couder
2023-07-25 23:04 ` Taylor Blau
2023-08-08 8:34 ` Christian Couder
2023-08-09 21:12 ` Taylor Blau
2023-07-24 8:59 ` [PATCH v3 6/8] gc: add `gc.repackFilter` config option Christian Couder
2023-07-25 23:07 ` Taylor Blau
2023-08-08 8:38 ` Christian Couder
2023-08-09 21:15 ` Taylor Blau
2023-07-24 8:59 ` [PATCH v3 7/8] repack: implement `--filter-to` for storing filtered out objects Christian Couder
2023-07-24 8:59 ` [PATCH v3 8/8] gc: add `gc.repackFilterTo` config option Christian Couder
2023-07-25 23:10 ` [PATCH v3 0/8] Repack objects into separate packfiles based on a filter Taylor Blau
2023-08-08 8:26 ` [PATCH v4 " Christian Couder
2023-08-08 8:26 ` [PATCH v4 1/8] pack-objects: allow `--filter` without `--stdout` Christian Couder
2023-08-08 8:26 ` [PATCH v4 2/8] t/helper: add 'find-pack' test-tool Christian Couder
2023-08-09 21:18 ` Taylor Blau
2023-08-08 8:26 ` [PATCH v4 3/8] repack: refactor finishing pack-objects command Christian Couder
2023-08-08 8:26 ` [PATCH v4 4/8] repack: refactor finding pack prefix Christian Couder
2023-08-09 21:20 ` Taylor Blau
2023-08-08 8:26 ` [PATCH v4 5/8] repack: add `--filter=<filter-spec>` option Christian Couder
2023-08-09 21:40 ` Taylor Blau
2023-08-08 8:26 ` [PATCH v4 6/8] gc: add `gc.repackFilter` config option Christian Couder
2023-08-08 8:26 ` [PATCH v4 7/8] repack: implement `--filter-to` for storing filtered out objects Christian Couder
2023-08-08 8:26 ` [PATCH v4 8/8] gc: add `gc.repackFilterTo` config option Christian Couder
2023-08-09 21:45 ` [PATCH v4 0/8] Repack objects into separate packfiles based on a filter Taylor Blau
2023-08-09 21:57 ` Junio C Hamano
2023-08-12 0:12 ` Christian Couder
2023-08-12 0:00 ` [PATCH v5 " Christian Couder
2023-08-12 0:00 ` [PATCH v5 1/8] pack-objects: allow `--filter` without `--stdout` Christian Couder
2023-08-12 0:00 ` [PATCH v5 2/8] t/helper: add 'find-pack' test-tool Christian Couder
2023-08-12 0:00 ` [PATCH v5 3/8] repack: refactor finishing pack-objects command Christian Couder
2023-08-12 0:00 ` [PATCH v5 4/8] repack: refactor finding pack prefix Christian Couder
2023-08-12 0:00 ` [PATCH v5 5/8] repack: add `--filter=<filter-spec>` option Christian Couder
2023-08-12 0:00 ` [PATCH v5 6/8] gc: add `gc.repackFilter` config option Christian Couder
2023-08-12 0:00 ` [PATCH v5 7/8] repack: implement `--filter-to` for storing filtered out objects Christian Couder
2023-08-12 0:00 ` [PATCH v5 8/8] gc: add `gc.repackFilterTo` config option Christian Couder
2023-08-15 0:51 ` [PATCH v5 0/8] Repack objects into separate packfiles based on a filter Junio C Hamano
2023-08-15 21:43 ` Taylor Blau
2023-08-15 22:32 ` Junio C Hamano
2023-08-15 23:09 ` Taylor Blau
2023-08-15 23:18 ` Junio C Hamano
2023-08-16 0:38 ` Taylor Blau
2023-08-16 17:16 ` Junio C Hamano
2023-09-11 15:20 ` Christian Couder
2023-09-11 15:06 ` [PATCH v6 0/9] " Christian Couder
2023-09-11 15:06 ` [PATCH v6 1/9] pack-objects: allow `--filter` without `--stdout` Christian Couder
2023-09-11 15:06 ` [PATCH v6 2/9] t/helper: add 'find-pack' test-tool Christian Couder
2023-09-11 15:06 ` [PATCH v6 3/9] repack: refactor finishing pack-objects command Christian Couder
2023-09-11 15:06 ` [PATCH v6 4/9] repack: refactor finding pack prefix Christian Couder
2023-09-11 15:06 ` [PATCH v6 5/9] pack-bitmap-write: rebuild using new bitmap when remapping Christian Couder
2023-09-11 15:06 ` [PATCH v6 6/9] repack: add `--filter=<filter-spec>` option Christian Couder
2023-09-11 15:06 ` [PATCH v6 7/9] gc: add `gc.repackFilter` config option Christian Couder
2023-09-11 15:06 ` [PATCH v6 8/9] repack: implement `--filter-to` for storing filtered out objects Christian Couder
2023-09-11 15:06 ` [PATCH v6 9/9] gc: add `gc.repackFilterTo` config option Christian Couder
2023-09-25 15:25 ` [PATCH v7 0/9] Repack objects into separate packfiles based on a filter Christian Couder
2023-09-25 15:25 ` [PATCH v7 1/9] pack-objects: allow `--filter` without `--stdout` Christian Couder
2023-09-25 15:25 ` [PATCH v7 2/9] t/helper: add 'find-pack' test-tool Christian Couder
2023-09-25 15:25 ` [PATCH v7 3/9] repack: refactor finishing pack-objects command Christian Couder
2023-09-25 15:25 ` [PATCH v7 4/9] repack: refactor finding pack prefix Christian Couder
2023-09-25 15:25 ` [PATCH v7 5/9] pack-bitmap-write: rebuild using new bitmap when remapping Christian Couder
2023-09-25 15:25 ` [PATCH v7 6/9] repack: add `--filter=<filter-spec>` option Christian Couder
2023-09-25 15:25 ` [PATCH v7 7/9] gc: add `gc.repackFilter` config option Christian Couder
2023-09-25 15:25 ` [PATCH v7 8/9] repack: implement `--filter-to` for storing filtered out objects Christian Couder
2023-09-25 15:25 ` [PATCH v7 9/9] gc: add `gc.repackFilterTo` config option Christian Couder
2023-09-25 19:14 ` [PATCH v7 0/9] Repack objects into separate packfiles based on a filter Junio C Hamano
2023-09-25 22:41 ` Taylor Blau
2023-10-02 16:54 ` [PATCH v8 " Christian Couder
2023-10-02 16:54 ` [PATCH v8 1/9] pack-objects: allow `--filter` without `--stdout` Christian Couder
2023-10-02 16:54 ` [PATCH v8 2/9] t/helper: add 'find-pack' test-tool Christian Couder
2023-10-02 16:54 ` [PATCH v8 3/9] repack: refactor finishing pack-objects command Christian Couder
2023-10-02 16:54 ` [PATCH v8 4/9] repack: refactor finding pack prefix Christian Couder
2023-10-02 16:55 ` [PATCH v8 5/9] pack-bitmap-write: rebuild using new bitmap when remapping Christian Couder
2023-10-02 16:55 ` [PATCH v8 6/9] repack: add `--filter=<filter-spec>` option Christian Couder
2023-10-02 16:55 ` [PATCH v8 7/9] gc: add `gc.repackFilter` config option Christian Couder
2023-10-02 16:55 ` [PATCH v8 8/9] repack: implement `--filter-to` for storing filtered out objects Christian Couder
2023-10-02 16:55 ` [PATCH v8 9/9] gc: add `gc.repackFilterTo` config option Christian Couder
2023-10-02 20:14 ` [PATCH v8 0/9] Repack objects into separate packfiles based on a filter Taylor Blau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAP8UFD1oOiH71NfzgNhE_aCMHT8U8mCNADoAfpghxgXzEN30iw@mail.gmail.com \
--to=christian.couder@gmail.com \
--cc=chriscool@tuxfamily.org \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=johncai86@gmail.com \
--cc=jonathantanmy@google.com \
--cc=jrnieder@gmail.com \
--cc=me@ttaylorr.com \
--cc=ps@pks.im \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).