From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
	"brian m. carlson" <sandals@crustytoothpaste.net>,
	Justin Tobler <jltobler@gmail.com>
Subject: [PATCH v3 06/13] remote-curl: fix parsing of detached SHA256 heads
Date: Mon, 29 Apr 2024 08:34:38 +0200
Message-ID: <53439067a1e94075c3f432b6c2774b1cfb11e88c.1714371422.git.ps@pks.im>
In-Reply-To: <cover.1714371422.git.ps@pks.im>

The dumb HTTP transport tries to read the remote HEAD reference by
downloading the "HEAD" file and then parsing it via `http_fetch_ref()`.
This function will either parse the file as an object ID in case it is
exactly `the_hash_algo->hexsz` long, or otherwise it will check whether
the reference starts with "ref: " and parse it as a symbolic ref.

This is broken when parsing detached HEADs of a remote SHA256 repository
because we never update `the_hash_algo` to the discovered remote object
hash. Consequently, `the_hash_algo` will always be the fallback SHA1
hash algorithm, which will cause us to fail parsing HEAD altogether when
it contains a SHA256 object ID.
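
To illustrate the failure mode, here is a minimal standalone sketch of
the length-based dispatch described above. It is not the code from
`http_fetch_ref()`: `classify_head()`, `enum head_kind` and the two
`*_HEXSZ` constants are hypothetical names made up for this example, and
the real function works on Git's own buffer and object ID types.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hex lengths of SHA1 and SHA256 object IDs. */
enum { SHA1_HEXSZ = 40, SHA256_HEXSZ = 64 };

/* Hypothetical result type for this sketch; not part of Git. */
enum head_kind { HEAD_INVALID, HEAD_OBJECT_ID, HEAD_SYMREF };

/*
 * Classify the contents of a downloaded "HEAD" file whose trailing
 * newline has already been stripped. `expected_hexsz` plays the role
 * of `the_hash_algo->hexsz`.
 */
static enum head_kind classify_head(const char *buf, size_t expected_hexsz)
{
	size_t len, i;

	assert(buf);
	len = strlen(buf);

	/* Exactly one object ID worth of hex digits: a detached HEAD. */
	if (len == expected_hexsz) {
		for (i = 0; i < len; i++)
			if (!isxdigit((unsigned char)buf[i]))
				return HEAD_INVALID;
		return HEAD_OBJECT_ID;
	}

	/* Otherwise only a symbolic ref is acceptable. */
	if (!strncmp(buf, "ref: ", 5))
		return HEAD_SYMREF;

	return HEAD_INVALID;
}
```

With an expected length of 40, a 64-character SHA256 object ID matches
neither branch and is rejected, which mirrors the bug described above.
Symbolic refs are unaffected because they do not depend on the hash
length.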

Fix this issue by setting up `the_hash_algo` via `repo_set_hash_algo()`.
While at it, let's make the expected SHA1 fallback explicit in our code,
which also addresses an upcoming issue where we are going to remove the
SHA1 fallback for `the_hash_algo`.
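
For readers unfamiliar with how the object hash is derived for dumb
transports, the following standalone sketch shows the idea behind
`detect_hash_algo()`: measure the hex object ID in front of the first
tab of the ref advertisement and fall back to SHA1 when there are no
refs at all. `detect_algo()` and `enum sketch_algo` are hypothetical
names used only here. The sketch compares hex lengths directly, whereas
the actual code passes the raw byte length to `hash_algo_by_length()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for Git's hash algorithm identifiers. */
enum sketch_algo { SKETCH_UNKNOWN, SKETCH_SHA1, SKETCH_SHA256 };

/*
 * Guess the object hash from an "info/refs" style buffer in which each
 * line has the form "<hex object ID>\t<refname>\n".
 */
static enum sketch_algo detect_algo(const char *buf, size_t len)
{
	const char *tab;

	assert(buf || !len);

	/* The first tab terminates the first advertised object ID. */
	tab = len ? memchr(buf, '\t', len) : NULL;

	/*
	 * No refs advertised: there is nothing to measure, so fall back
	 * to SHA1 explicitly. This may or may not be correct.
	 */
	if (!tab)
		return SKETCH_SHA1;

	switch (tab - buf) {
	case 40:
		return SKETCH_SHA1;
	case 64:
		return SKETCH_SHA256;
	default:
		return SKETCH_UNKNOWN;
	}
}
```

The point of the patch is that the detected value must also be
propagated to the repository via `repo_set_hash_algo()`. Detecting it
alone is not enough for the later HEAD parsing to use the right length.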

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 remote-curl.c              | 19 ++++++++++++++++++-
 t/t5550-http-fetch-dumb.sh | 15 +++++++++++++++
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/remote-curl.c b/remote-curl.c
index 0b6d7815fd..004b707fdf 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -266,12 +266,23 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
 	return list;
 }
 
+/*
+ * Try to detect the hash algorithm used by the remote repository when using
+ * the dumb HTTP transport. As dumb transports cannot tell us the object hash
+ * directly, we have to derive it from the advertised ref lengths.
+ */
 static const struct git_hash_algo *detect_hash_algo(struct discovery *heads)
 {
 	const char *p = memchr(heads->buf, '\t', heads->len);
 	int algo;
+
+	/*
+	 * In case the remote has no refs we have no way to reliably determine
+	 * the object hash used by that repository. In that case we simply fall
+	 * back to SHA1, which may or may not be correct.
+	 */
 	if (!p)
-		return the_hash_algo;
+		return &hash_algos[GIT_HASH_SHA1];
 
 	algo = hash_algo_by_length((p - heads->buf) / 2);
 	if (algo == GIT_HASH_UNKNOWN)
@@ -295,6 +306,12 @@ static struct ref *parse_info_refs(struct discovery *heads)
 		    "is this a git repository?",
 		    transport_anonymize_url(url.buf));
 
+	/*
+	 * Set the repository's hash algo to whatever we have just detected.
+	 * This ensures that we can correctly parse the remote references.
+	 */
+	repo_set_hash_algo(the_repository, hash_algo_by_ptr(options.hash_algo));
+
 	data = heads->buf;
 	start = NULL;
 	mid = data;
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 4c3b32785d..5f16cbc58d 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -55,6 +55,21 @@ test_expect_success 'list refs from outside any repository' '
 	test_cmp expect actual
 '
 
+
+test_expect_success 'list detached HEAD from outside any repository' '
+	git clone --mirror "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
+		"$HTTPD_DOCUMENT_ROOT_PATH/repo-detached.git" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo-detached.git" \
+		update-ref --no-deref HEAD refs/heads/main &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo-detached.git" update-server-info &&
+	cat >expect <<-EOF &&
+	$(git rev-parse main)	HEAD
+	$(git rev-parse main)	refs/heads/main
+	EOF
+	nongit git ls-remote "$HTTPD_URL/dumb/repo-detached.git" >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'create password-protected repository' '
 	mkdir -p "$HTTPD_DOCUMENT_ROOT_PATH/auth/dumb/" &&
 	cp -Rf "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
-- 
2.45.0-rc1

