Git Mailing List Archive mirror
From: Tao Klerks <tao@klerks.biz>
To: git <git@vger.kernel.org>
Cc: Robert Coup <robert@coup.net.nz>
Subject: "git fetch --refetch" and multiple (separate/orphan) branches
Date: Fri, 2 Jun 2023 23:22:53 +0200
Message-ID: <CAPMMpoiJ4cNcAR9gO5d-749N3YW-88p1gMnX8ySGgz84Mr9coA@mail.gmail.com>

Hi folks,

I just recently noticed that "--refetch" was added in 2.36, and I got
pretty excited - the ability to "fill in" missing blobs after a
too-filtered clone is something that I've wanted a number of times, as
I mentioned in 2021 in thread
https://public-inbox.org/git/CAPMMpohOuXX-0YOjV46jFZFvx7mQdj0p7s8SDR4SQxj5hEhCgg@mail.gmail.com/
.

When I first ran "git fetch --refetch" today, however (git 2.38.1,
against server git/2.38.4.gl1), with a configured blob filter of
"blob:limit=1100M", a much higher limit than the size of any blob in
the history, it only got a *relatively* small number of objects - 3GB
of data rather than the 18GB that a new unfiltered fetch would have
retrieved.
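
For reference, the sequence described above can be sketched roughly as
follows. This is only an illustration: the remote name "origin" and the
helper name refetch_with_filter are assumptions, and the filter is
written in the "blob:limit=<size>" form that git accepts.

```shell
# Sketch only: the helper name and the default remote "origin" are
# assumptions, not taken from the original report.
refetch_with_filter() {
    remote="${1:-origin}"
    filter="${2:-blob:limit=1100m}"

    # Record the partial-clone filter for this remote.
    git config "remote.$remote.partialclonefilter" "$filter" || return 1

    # Ask the remote for objects again under that filter.
    git fetch --refetch "$remote" || return 1

    # Print local object storage figures, so the result can be compared
    # with the size of a fresh unfiltered clone.
    git count-objects -vH
}
```

Calling "refetch_with_filter origin blob:limit=1100m" inside the
affected repository and comparing the reported "size-pack" with the
size of a fresh clone is one way to spot a short refetch.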

After some more testing I tried again, and got the expected outcome
that time. The relevant difference between the two attempts is that in
the first case, when I only got some of the objects I expected, there
was an updated tag as a result of the fetch. The second time, when I
got everything, there were no updated refs.

In this repository there are several "independent" sets of branches,
and the tag updated in that first fetch belongs to one of the
smaller-history branches.

What I believe is happening is that *if* there are refs to be updated
(or new refs, presumably), *then* the objects returned to the client
are only those required for those refs. If, on the other hand, there
are no updated refs, then you get what is advertised in the doc: "all
objects as a fresh clone would [...]".

I've tested a couple of different scenarios and the behavior seems
consistent with this explanation.
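
To tell the two situations apart before refetching, something like the
following might help. It is a rough sketch under assumptions: the helper
name has_pending_ref_updates is made up, "origin" is an assumed remote
name, and it relies on a non-verbose "git fetch --dry-run" printing
nothing when no refs would change.

```shell
# Hypothetical helper: exit 0 if an ordinary fetch would report ref
# updates, 1 if it would report none, 2 if the fetch itself failed.
has_pending_ref_updates() {
    remote="${1:-origin}"

    # fetch reports ref updates on stderr, so capture both streams.
    # Note that --dry-run still contacts the remote and may transfer
    # data; it only skips updating the local refs.
    out=$(git fetch --dry-run "$remote" 2>&1) || return 2

    # Assumption: empty output means there is nothing to update.
    [ -n "$out" ]
}
```

If it returns 0, a subsequent "git fetch --refetch" would be in the
"refs to be updated" case described above.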

In a repo where all branches are derived from the same history, this
probably isn't very noticeable; in the repo I'm working on it makes a
huge difference, so the only way I can imagine getting "correct"
behavior would be to always do a "git fetch" right before the "git
fetch --refetch".
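
That workaround could be wrapped up roughly like this. It is a minimal
sketch, assuming a remote named "origin"; the helper name refetch_all is
made up, and whether a quiet first fetch is really enough to get the
full refetch is the open question of this message, not an established
fact.

```shell
# Hypothetical helper: bring refs up to date first, then refetch.
refetch_all() {
    remote="${1:-origin}"

    # Step 1: ordinary fetch, so that no ref updates remain pending.
    git fetch "$remote" || return 1

    # Step 2: with refs already current, request everything again
    # under the configured filter.
    git fetch --refetch "$remote"
}
```

A ref could still move on the server between the two steps, so this
narrows the window rather than closing it.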

Is this a bug, or expected behavior that should be noted in the doc,
or do we consider the multiple-independent-branches use case to be
edge-casey enough to be an easter egg for people like me?

Thanks,
Tao

Thread overview: 3+ messages
2023-06-02 21:22 Tao Klerks [this message]
2023-06-03  8:18 ` "git fetch --refetch" and multiple (separate/orphan) branches Robert Coup
2023-08-10  7:14   ` Tao Klerks
