Git Mailing List Archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "brian m. carlson" <sandals@crustytoothpaste.net>,
	Jeff King <peff@peff.net>
Cc: Adam Majer <adamm@zombino.com>, git@vger.kernel.org
Subject: Re: git clone of empty repositories doesn't preserve hash
Date: Wed, 26 Apr 2023 08:08:28 -0700	[thread overview]
Message-ID: <xmqqcz3qwuj7.fsf@gitster.g> (raw)
In-Reply-To: <20230426112508.GB130148@coredump.intra.peff.net> (Jeff King's message of "Wed, 26 Apr 2023 07:25:08 -0400")

Jeff King <peff@peff.net> writes:

> It sounds from your description that your test is running in a mode
> where the client defaults to sha256 (though I'm not sure how, since we
> explicitly document that GIT_DEFAULT_HASH should not affect clone), and
> then you clone an empty sha256 repository via v0, expecting the result
> to be sha256.

Thanks for coming up with an excellent guess that helped come up
with a reproduction.

With Brian's patch a bit tweaked (attached below), the test does
fail with the current 'master' and passes before the merge in
question.  And the trace clearly shows that without being told
anything about the object format via the capability, the client
chooses to honor GIT_DEFAULT_HASH to initialize the new repository
with sha256.

There is this entry in the release notes of 2.29.0:

 * "git clone" that clones from SHA-1 repository, while
   GIT_DEFAULT_HASH set to use SHA-256 already, resulted in an
   unusable repository that half-claims to be SHA-256 repository
   with SHA-1 objects and refs.  This has been corrected.

This refers to 47ac9703 (builtin/clone: avoid failure with
GIT_DEFAULT_HASH, 2020-09-20).  It seems that the "fix" described
there was incomplete and did not address "clone a void" case, and
the patch under discussion finished the incomplete fix without
knowing by accident.  The test that was added by 47ac9703 to t5601
does use non-empty repositories (one with SHA-1, the other with
SHA-256) and tries to check the interaction with "clone" with the
environment variable.  With the patch under discussion, this test
did not break and "clone" is still ignoring the GIT_DEFAULT_HASH
variable.

Also the description in the document of GIT_DEFAULT_HASH makes
readers ambivalent:

`GIT_DEFAULT_HASH`::
	If this variable is set, the default hash algorithm for new
	repositories will be set to this value. This value is currently
	ignored when cloning; the setting of the remote repository
	is used instead. The default is "sha1". THIS VARIABLE IS
	EXPERIMENTAL! See `--object-format` in linkgit:git-init[1].

To me, "is currently ignored" hints that the author, or somebody
close to the author, of this entry feels that it is a bug that
"clone" does not honor it.  The log message of 47ac9703 also phrases
"failed to honor the GIT_DEFAULT_HASH" as if it were a bad thing [*].

On the other hand, it is clearly documented as experimental so
anything that has been relying on the behaviour of any command with
a particular value set to this variable are waiting to be broken.

I am torn about this.  Even though we may have been very clear that
GIT_DEFAULT_HASH should not kick in in this case, "clone" was buggy
(in the sense that it did not behave as documented in an empty
repository) and whatever thing that changed behaviour that Brian
noticed was ignoring the documentation and taking advantage of that
"bug", and Hyrum's law ensues.

In the longer term, after/when we allow incremental/over-the-wire
migration of object-format, i.e. clone from SHA-1 repository to
create SHA-256 repository (or vice versa) and fetching and pushing
between them would bidirectionally convert the object format on the
fly, it is likely we would teach a new option "--object-format" to
"git clone" to say "you would use whatever object format the origin
uses by default, but this time, I am telling you to use this format
on our side, doing on-the-fly object format conversion as needed".

So it would be OK to make sure that GIT_DEFAULT_HASH is ignored by
"clone" as documented now, and even after on-the-fly object-format
migration is implemented, I would think.


[Footnote]

* ... but I think it was misguided.  What it says there is this:

    And we also don't want to initialize the repository as SHA-1
    initially, since that means if we're cloning an empty
    repository, we'll have failed to honor the GIT_DEFAULT_HASH
    variable and will end up with a SHA-1 repository, not a SHA-256
    repository.

If this were "When we're cloning an empty repository, we'd have
failed to honor the object format the other side has chosen and will
end up with a SHA-1 repository, not a SHA-256 repository.", then it
is very much in line with the reality before the patch under
discussion and also in line with the official stance that "clone"
should not honor GIT_DEFAULT_HASH.

Where the original description breaks down is when the other side is
SHA-1 and this side has GIT_DEFAULT_HASH set to SHA-256.  If we
honored the variable, we'd create a SHA-256 repository that will
talk to SHA-1 repository before the rest of the system is ready.

 t/t5700-protocol-v1.sh | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git i/t/t5700-protocol-v1.sh w/t/t5700-protocol-v1.sh
index 6c8d4c6cf1..2f39dd2e05 100755
--- i/t/t5700-protocol-v1.sh
+++ w/t/t5700-protocol-v1.sh
@@ -244,6 +244,19 @@ test_expect_success 'push with ssh:// using protocol v1' '
 	grep "push< version 1" log
 '
 
+test_expect_success 'clone propagates object-format from empty repo' '
+	test_when_finished "rm -fr src256 dst256" &&
+
+	echo sha256 >expect &&
+	git init --object-format=sha256 src256 &&
+	GIT_DEFAULT_HASH=sha256 \
+	GIT_TRACE_PACKET=1 \
+	git clone --no-local src256 dst256 &&
+	git -C dst256 rev-parse --show-object-format >actual &&
+
+	test_cmp expect actual
+'
+
 # Test protocol v1 with 'http://' transport
 #
 . "$TEST_DIRECTORY"/lib-httpd.sh

  reply	other threads:[~2023-04-26 15:08 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-05 10:28 git clone of empty repositories doesn't preserve hash Adam Majer
2023-04-05 19:04 ` Junio C Hamano
2023-04-05 19:47   ` Adam Majer
2023-04-05 20:01     ` Jeff King
2023-04-05 20:40       ` Junio C Hamano
2023-04-05 21:15         ` Junio C Hamano
2023-04-05 21:26           ` Jeff King
2023-04-05 22:48           ` brian m. carlson
2023-04-06 13:11           ` Adam Majer
2023-04-25 21:35           ` brian m. carlson
2023-04-25 22:24             ` Junio C Hamano
2023-04-25 23:12             ` Junio C Hamano
2023-04-26  0:20               ` brian m. carlson
2023-04-26 11:25                 ` Jeff King
2023-04-26 15:08                   ` Junio C Hamano [this message]
2023-04-26 15:13                     ` [PATCH] doc: GIT_DEFAULT_HASH is and will be ignored during "clone" Junio C Hamano
2023-04-26 21:06                       ` brian m. carlson
2023-04-27  4:46                     ` git clone of empty repositories doesn't preserve hash Jeff King
2023-04-26 10:51               ` Jeff King
2023-04-26 15:42                 ` Junio C Hamano
2023-04-26 20:40                 ` brian m. carlson
2023-04-26 20:53                   ` [PATCH 0/2] Fix empty SHA-256 clones with v0 and v1 brian m. carlson
2023-04-26 20:53                     ` [PATCH 1/2] http: advertise capabilities when cloning empty repos brian m. carlson
2023-04-26 21:14                       ` Junio C Hamano
2023-04-26 21:28                         ` brian m. carlson
2023-04-27  5:00                           ` Jeff King
2023-04-27  5:30                       ` Jeff King
2023-04-27 20:40                         ` Junio C Hamano
2023-04-26 20:53                     ` [PATCH 2/2] Honor GIT_DEFAULT_HASH for empty clones without remote algo brian m. carlson
2023-04-26 21:18                       ` Junio C Hamano
2023-04-26 21:33                       ` Junio C Hamano
2023-04-27  5:43                         ` Jeff King
2023-05-02 23:46                           ` Is GIT_DEFAULT_HASH flawed? Felipe Contreras
2023-05-03  9:03                             ` Adam Majer
2023-05-03 15:44                               ` Felipe Contreras
2023-05-03 17:21                                 ` Adam Majer
2023-05-08  0:34                                   ` Felipe Contreras
2023-05-03  9:09                             ` demerphq
2023-05-03 18:20                               ` Felipe Contreras
2023-05-03 22:54                             ` brian m. carlson
2023-05-08  2:00                               ` Felipe Contreras
2023-05-08 21:38                                 ` brian m. carlson
2023-05-09 10:32                                   ` Oswald Buddenhagen
2023-05-09 16:47                                     ` Junio C Hamano
2023-04-26 21:12                     ` [PATCH 0/2] Fix empty SHA-256 clones with v0 and v1 Junio C Hamano
2023-04-27  4:56                   ` git clone of empty repositories doesn't preserve hash Jeff King
2023-05-01 17:00                   ` [PATCH v2 0/1] Fix empty SHA-256 clones with v0 and v1 brian m. carlson
2023-05-01 17:00                     ` [PATCH v2 1/1] upload-pack: advertise capabilities when cloning empty repos brian m. carlson
2023-05-01 22:40                       ` Jeff King
2023-05-01 22:51                         ` Junio C Hamano
2023-05-01 17:37                     ` [PATCH v2 0/1] Fix empty SHA-256 clones with v0 and v1 Junio C Hamano
2023-05-17 19:24                   ` [PATCH v3 " brian m. carlson
2023-05-17 19:24                     ` [PATCH v3 1/1] upload-pack: advertise capabilities when cloning empty repos brian m. carlson
2023-05-17 21:48                     ` [PATCH v3 0/1] Fix empty SHA-256 clones with v0 and v1 Junio C Hamano
2023-05-17 22:28                       ` brian m. carlson
2023-05-18 18:28                     ` Jeff King
2023-05-19 15:32                       ` brian m. carlson
2023-04-05 21:23         ` git clone of empty repositories doesn't preserve hash Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqcz3qwuj7.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=adamm@zombino.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    --cc=sandals@crustytoothpaste.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).