Git Mailing List Archive mirror
 help / color / mirror / Atom feed
From: "Xing Xin" <bupt_xingxin@163.com>
To: "Junio C Hamano" <gitster@pobox.com>
Cc: "Xing Xin via GitGitGadget" <gitgitgadget@gmail.com>,
	git@vger.kernel.org,  "Patrick Steinhardt" <ps@pks.im>,
	 "Karthik Nayak" <karthik.188@gmail.com>,
	 "Xing Xin" <xingxin.xx@bytedance.com>
Subject: Re:Re: [PATCH v6 1/3] bundle-uri: verify oid before writing refs
Date: Mon, 17 Jun 2024 21:53:56 +0800 (CST)	[thread overview]
Message-ID: <77366503.bef9.1902679d458.Coremail.bupt_xingxin@163.com> (raw)
In-Reply-To: <xmqqa5jruwkr.fsf@gitster.g>

At 2024-06-12 03:08:04, "Junio C Hamano" <gitster@pobox.com> wrote:
>"Xing Xin via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> This commit fixes the bug by removing `REF_SKIP_OID_VERIFICATION` flag
>
>"Fix the bug by ...".
>
>> when writing bundle refs. When `refs.c:refs_update_ref` is called to to
>
>"to to"

Ah, always typos. :-)

[snip]
>>  test_expect_success 'create bundle' '
>>  	git init clone-from &&
>> -	git -C clone-from checkout -b topic &&
>> -	test_commit -C clone-from A &&
>> -	test_commit -C clone-from B &&
>> -	git -C clone-from bundle create B.bundle topic
>> +	(
>> +		cd clone-from &&
>> +		git checkout -b topic &&
>> +
>> +		test_commit A &&
>> +		git bundle create A.bundle topic &&
>> +
>> +		test_commit B &&
>> +		git bundle create B.bundle topic &&
>> +
>> +		# Create a bundle with reference pointing to non-existent object.
>> +		sed "s/$(git rev-parse A)/$(git rev-parse B)/" <A.bundle >bad-header.bundle
>
>I suspect that this would be terribly unportable.  The early part of
>a bundle file may be text and sed may be able to grok but are you
>sure everybody's implementation of sed would not barf (or even
>worse, corrupt) the pack stream data that follows?
>
>The code used in t/lib-bundle.sh:convert_bundle_to_pack() has been
>in use since 8315588b (bundle: fix wrong check of read_header()'s
>return value & add tests, 2007-03-06), so munging the bundle with
>a code similar to it may have a better portability story.
>
>Add something like:
>
>        corrupt_bundle_header () {
>                sed -e '/^$/q' "$@"
>                cat
>        }
>
>to t/lib-bundle.sh, which can take an arbitrary sequence of command
>line parameters to drive "sed", and can be used like so:
>
>	corrupt_bundle_header \
>		-e "s/^$(git rev-parse A) /$(git rev-parse B) /" \
>		<A.bndl >B.bndl
>
>perhaps?

Thanks, I never knew sed could be used this way! It's so concise!

However, after applying these code, the added test "t5558.5 clone with
bundle that has bad header" fails in some CI jobs showing that we can
not get expected error. For example CI "linux-musl (alpine)" gives:

	+ git clone '--bundle-uri=clone-from/bad-header.bundle' clone-from clone-bad-header
	+ git -C clone-from rev-parse B
	+ commit_b=d9df4505cb3522088b9e29d6051ac16f1564154a
	+ test_grep 'trying to write ref '"'"'refs/bundles/topic'"'"' with nonexistent object d9df4505cb3522088b9e29d6051ac16f1564154a' err
	+ eval 'last_arg=${2}'
	+ last_arg=err
	+ test -f err
	+ test 2 -lt 2
	+ test 'x!' '=' 'xtrying to write ref '"'"'refs/bundles/topic'"'"' with nonexistent object d9df4505cb3522088b9e29d6051ac16f1564154a'
	+ test 'x!' '=' 'xtrying to write ref '"'"'refs/bundles/topic'"'"' with nonexistent object d9df4505cb3522088b9e29d6051ac16f1564154a'
	+ grep 'trying to write ref '"'"'refs/bundles/topic'"'"' with nonexistent object d9df4505cb3522088b9e29d6051ac16f1564154a' err
	+ echo 'error: '"'"'grep trying to write ref '"'"'refs/bundles/topic'"'"' with nonexistent object d9df4505cb3522088b9e29d6051ac16f1564154a' 'err'"'"' didn'"'"'t find a match in:'
	error: 'grep trying to write ref 'refs/bundles/topic' with nonexistent object d9df4505cb3522088b9e29d6051ac16f1564154a err' didn't find a match in:
	+ test -s err
	+ cat err
	Cloning into 'clone-bad-header'...
	fatal: pack signature mismatch
	error: index-pack died
	done.
	+ return 1
	error: last command exited with $?=1
	not ok 5 - clone with bundle that has bad header

	+# Create a bundle with reference pointing to non-existent object.
	+sed -e "/^$/q" -e "s/$(git rev-parse A) /$(git rev-parse B) /" \
	+	<A.bundle >bad-header.bundle &&
	+convert_bundle_to_pack \
	+	<A.bundle >>bad-header.bundle

More details can be found at https://github.com/blanet/git/actions/runs/9541478191/job/26294731254.

After some digging, I discovered that inside the Alpine container,
`corrupt_bundle_header` is missing the leading "PACK\x00" in the pack
section of the converted bundle. This is likely caused by the
incompatibility of "sed" across different operating systems. I stopped
investigating further due to my unfamiliarity with containers and "sed"
itself. Instead, I found another approach that utilizes the suggested
usage of "sed" and `lib-bundle.sh:convert_bundle_to_pack`.

 	+# Create a bundle with reference pointing to non-existent object.
	+sed -e "/^$/q" -e "s/$(git rev-parse A) /$(git rev-parse B) /" \
	+	<A.bundle >bad-header.bundle &&
	+convert_bundle_to_pack \
	+	<A.bundle >>bad-header.bundle

>> @@ -33,6 +42,16 @@ test_expect_success 'clone with path bundle' '
>>  	test_cmp expect actual
>>  '
>>  
>> +test_expect_success 'clone with bundle that has bad header' '
>> +	git clone --bundle-uri="clone-from/bad-header.bundle" \
>> +		clone-from clone-bad-header 2>err &&
>> +	# Write bundle ref fails, but clone can still proceed.
>> +	commit_b=$(git -C clone-from rev-parse B) &&
>> +	test_grep "trying to write ref '\''refs/bundles/topic'\'' with nonexistent object $commit_b" err &&
>> +	git -C clone-bad-header for-each-ref --format="%(refname)" >refs &&
>> +	! grep "refs/bundles/" refs
>> +'
>> +
>
>So this is the test the proposed log message discussed.  The
>description gave a false impression that the "broken header" test
>that has not much to do with the bug being fixed was the only added
>test---we probably want to correct that impression.

Adjusted the commit message.

>> @@ -259,6 +278,132 @@ test_expect_success 'clone bundle list (file, any mode, all failures)' '
>>  	! grep "refs/bundles/" refs
>>  '
>>  
>> +#########################################################################
>> +# Clone negotiation related tests begin here
>
>Drop this divider and comment.  The HTTP tests you see below has a
>much better reason to be separated like that in order to warn test
>writers (they shouldn't add their random new tests after that point,
>because everything after that one is skipped when HTTPD tests are
>disabled---see the beginning of t/lib-httpd.sh which is included
>after that divider line), but everything here you added is not
>special.  Everybody should run these tests.

Thanks for the explanation, dropped the divider in new series.

>> +test_expect_success 'negotiation: bundle with part of wanted commits' '
>> +	test_when_finished rm -rf trace*.txt &&
>
>Do not overly depend on glob not matching at this point when you
>establish the when-finished handler (or glob matching the files you
>want to catch and later test not adding anything you would want to
>clean).  Quote "rm -f trace*.txt" and drop "r" if you do not absolutely
>need it (and I would imagine you don't, given the .txt suffix).

Yes, the "-r" is not safe and I just need to clean the "*.txt", not involving directories.

>> +	GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \
>> +	git clone --no-local --bundle-uri="clone-from/A.bundle" \
>> +		clone-from nego-bundle-part &&
>> +	git -C nego-bundle-part for-each-ref --format="%(refname)" >refs &&
>> +	grep "refs/bundles/" refs >actual &&
>> +	cat >expect <<-\EOF &&
>> +	refs/bundles/topic
>> +	EOF
>
>Hmph, if the expected pattern is only a few lines without any magic,
>
>	test_write_lines >expect refs/bundles/topic &&
>
>may be easier to follow.

Applied!

>> +	test_cmp expect actual &&
>> +	# Ensure that refs/bundles/topic are sent as "have".
>> +	grep "clone> have $(git -C clone-from rev-parse A)" trace-packet.txt
>> +'
>
>Using "test_grep" would make it easier to diagnose when test breaks.
>A failing "grep" will be silent (i.e., "I didn't find anything you
>told me to look for").  A failing "test_grep" will tell you "I was
>told to find this, but didn't find any in that".

Thanks, it just accelerated the digging process of the test failure
mentioned above. :)

>> +test_expect_success 'negotiation: bundle with all wanted commits' '
>> +	test_when_finished rm -rf trace*.txt &&
>> +	GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \
>> +	git clone --no-local --single-branch --branch=topic --no-tags \
>> +		--bundle-uri="clone-from/B.bundle" \
>> +		clone-from nego-bundle-all &&
>> +	git -C nego-bundle-all for-each-ref --format="%(refname)" >refs &&
>> +	grep "refs/bundles/" refs >actual &&
>> +	cat >expect <<-\EOF &&
>> +	refs/bundles/topic
>> +	EOF
>> +	test_cmp expect actual &&
>> +	# We already have all needed commits so no "want" needed.
>> +	! grep "clone> want " trace-packet.txt
>> +'
>> +
>> +test_expect_success 'negotiation: bundle list (no heuristic)' '
>> +	test_when_finished rm -f trace*.txt &&
>> +	cat >bundle-list <<-EOF &&
>> +	[bundle]
>> +		version = 1
>> +		mode = all
>> +
>> +	[bundle "bundle-1"]
>> +		uri = file://$(pwd)/clone-from/bundle-1.bundle
>> +
>> +	[bundle "bundle-2"]
>> +		uri = file://$(pwd)/clone-from/bundle-2.bundle
>> +	EOF
>
>OK.  This is a good use of here-doc (as opposed to test_write_lines
>I sugested earlier for different purposes).  I wondered if these
>$(pwd) and file://$(pwd) are safe (I always get confused by the need
>to sometimes use $PWD to help Windows), but I see them used in what
>Derrick wrote in this file, so they must be fine.
>
>But there may be characters in the leading part of $(pwd) that we do
>not control that needs quoting (like a double quote '"').  The value
>of bundle.*.uri may need to be quoted a bit carefully.  This is not
>a new problem this patch introduces, so you do not have to rewrite
>this part of the patch; I'll mark it with #leftoverbits here---the
>idea being somebody else who is too bored can come back, see if it
>is truly broken, and fix them after all dust settles.
>
>Abusing "git config -f bundle-list" might be safer, e.g.
>
>	$ git config -f bundle.list bundle.bundle-1.uri \
>		'file:///home/abc"def/t/trash dir/clone-from/b1.bndl'
>	$ cat bundle.list
>        [bundle "bundle-1"]
>                uri = file:///home/abc\"def/t/trash dir/clone-from/b1.bndl
>
>as you do not know what other garbage character is in $(pwd) part.
>
>> +	# We already have all needed commits so no "want" needed.
>> +	! grep "clone> want " trace-packet.txt
>
>Just FYI, to negate test_grep, use
>
>	test_grep ! "clone > want " trace-packet.txt
>
>not	
>
>	! test_grep "clone > want " trace-packet.txt ;# WRONG

Thanks your through review!

Xing Xin


  reply	other threads:[~2024-06-17 13:54 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-15  3:01 [PATCH] bundle-uri: refresh packed_git if unbundle succeed blanet via GitGitGadget
2024-05-17  5:00 ` Patrick Steinhardt
2024-05-17 16:14   ` Junio C Hamano
2024-05-20 11:48     ` Xing Xin
2024-05-20 17:19       ` Junio C Hamano
2024-05-27 16:04         ` Xing Xin
2024-05-20  9:41   ` Xing Xin
2024-05-17  7:36 ` Karthik Nayak
2024-05-20 10:19   ` Xing Xin
2024-05-20 12:36 ` [PATCH v2] bundle-uri: verify oid before writing refs blanet via GitGitGadget
2024-05-21 15:41   ` Karthik Nayak
2024-05-27 15:41   ` [PATCH v3 0/4] object checking related additions and fixes for bundles in fetches blanet via GitGitGadget
2024-05-27 15:41     ` [PATCH v3 1/4] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-05-28 11:55       ` Patrick Steinhardt
2024-05-30  8:32         ` Xing Xin
2024-05-27 15:41     ` [PATCH v3 2/4] unbundle: introduce unbundle_fsck_flags for fsckobjects handling Xing Xin via GitGitGadget
2024-05-28 12:03       ` Patrick Steinhardt
2024-05-29 18:12         ` Xing Xin
2024-05-30  4:38           ` Patrick Steinhardt
2024-05-30  8:46             ` Xing Xin
2024-05-27 15:41     ` [PATCH v3 3/4] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-05-28 12:03       ` Patrick Steinhardt
2024-05-28 17:10         ` Junio C Hamano
2024-05-28 17:24           ` Junio C Hamano
2024-05-29  5:52             ` Patrick Steinhardt
2024-05-30  8:48             ` Xing Xin
2024-05-29  5:52           ` Patrick Steinhardt
2024-05-27 15:41     ` [PATCH v3 4/4] unbundle: introduce new option UNBUNDLE_FSCK_FOLLOW_FETCH Xing Xin via GitGitGadget
2024-05-28 12:05       ` Patrick Steinhardt
2024-05-30  8:54         ` Xing Xin
2024-05-30  8:21     ` [PATCH v4 0/4] object checking related additions and fixes for bundles in fetches blanet via GitGitGadget
2024-05-30  8:21       ` [PATCH v4 1/4] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-05-30  8:21       ` [PATCH v4 2/4] unbundle: extend verify_bundle_flags to support fsck-objects Xing Xin via GitGitGadget
2024-06-06 12:06         ` Patrick Steinhardt
2024-06-11  6:46           ` Xing Xin
2024-05-30  8:21       ` [PATCH v4 3/4] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-05-30  8:21       ` [PATCH v4 4/4] unbundle: introduce option VERIFY_BUNDLE_FSCK_FOLLOW_FETCH Xing Xin via GitGitGadget
2024-06-06 12:06         ` Patrick Steinhardt
2024-06-11  6:46           ` Xing Xin
2024-06-11  6:42       ` [PATCH v5 0/4] object checking related additions and fixes for bundles in fetches blanet via GitGitGadget
2024-06-11  6:42         ` [PATCH v5 1/4] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-06-11  6:42         ` [PATCH v5 2/4] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-06-11  6:42         ` [PATCH v5 3/4] unbundle: extend options to support object verification Xing Xin via GitGitGadget
2024-06-11  9:11           ` Patrick Steinhardt
2024-06-11 12:47             ` Xing Xin
2024-06-11  6:42         ` [PATCH v5 4/4] unbundle: use VERIFY_BUNDLE_FSCK_FOLLOW_FETCH for fetches Xing Xin via GitGitGadget
2024-06-11 12:45         ` [PATCH v6 0/3] object checking related additions and fixes for bundles in fetches blanet via GitGitGadget
2024-06-11 12:45           ` [PATCH v6 1/3] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-06-11 19:08             ` Junio C Hamano
2024-06-17 13:53               ` Xing Xin [this message]
2024-06-11 12:45           ` [PATCH v6 2/3] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-06-11 19:20             ` Junio C Hamano
2024-06-11 12:45           ` [PATCH v6 3/3] unbundle: support object verification for fetches Xing Xin via GitGitGadget
2024-06-11 20:05             ` Junio C Hamano
2024-06-12 18:33               ` Xing Xin
2024-06-11 13:14           ` [PATCH v6 0/3] object checking related additions and fixes for bundles in fetches Patrick Steinhardt
2024-06-17 13:55           ` [PATCH v7 " blanet via GitGitGadget
2024-06-17 13:55             ` [PATCH v7 1/3] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-06-18 17:37               ` Junio C Hamano
2024-06-19  6:30                 ` Xing Xin
2024-06-17 13:55             ` [PATCH v7 2/3] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-06-17 13:55             ` [PATCH v7 3/3] unbundle: extend object verification for fetches Xing Xin via GitGitGadget
2024-06-19  4:07             ` [PATCH v8 0/3] object checking related additions and fixes for bundles in fetches blanet via GitGitGadget
2024-06-19  4:07               ` [PATCH v8 1/3] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-06-19  4:07               ` [PATCH v8 2/3] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-06-19  4:07               ` [PATCH v8 3/3] unbundle: extend object verification for fetches Xing Xin via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=77366503.bef9.1902679d458.Coremail.bupt_xingxin@163.com \
    --to=bupt_xingxin@163.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=karthik.188@gmail.com \
    --cc=ps@pks.im \
    --cc=xingxin.xx@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).