Git Mailing List Archive mirror
 help / color / mirror / Atom feed
From: "blanet via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Patrick Steinhardt <ps@pks.im>,
	Karthik Nayak <karthik.188@gmail.com>,
	blanet <bupt_xingxin@163.com>
Subject: [PATCH v5 0/4] object checking related additions and fixes for bundles in fetches
Date: Tue, 11 Jun 2024 06:42:02 +0000	[thread overview]
Message-ID: <pull.1730.v5.git.1718088126.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1730.v4.git.1717057290.gitgitgadget@gmail.com>

While attempting to fix a reference negotiation bug in bundle-uri, we
identified that the fetch process lacks some crucial object validation
checks when processing bundles. The primary issues are:

 1. In the bundle-uri scenario, object IDs were not validated before writing
    bundle references. This was the root cause of the original negotiation
    bug in bundle-uri and could lead to potential repository corruption.
 2. The existing "fetch.fsckObjects" and "transfer.fsckObjects"
    configurations were not applied when directly fetching bundles or
    fetching with bundle-uri enabled. In fact, there were no object
    validation supports for unbundle.

The first patch addresses the bundle-uri negotiation issue by removing the
REF_SKIP_OID_VERIFICATION flag when writing bundle references.

Patches 2 through 4 extend verify_bundle_flags for bundle.c:unbundle to add
support for object validation (fsck) in different scenarios, mainly
following the suggestions from Junio on the mailing list.

Xing Xin (4):
  bundle-uri: verify oid before writing refs
  fetch-pack: expose fsckObjects configuration logic
  unbundle: extend options to support object verification
  unbundle: use VERIFY_BUNDLE_FSCK_FOLLOW_FETCH for fetches

 bundle-uri.c                |   5 +-
 bundle.c                    |  10 ++
 bundle.h                    |   2 +
 fetch-pack.c                |  17 ++--
 fetch-pack.h                |   5 +
 t/t5558-clone-bundle-uri.sh | 186 +++++++++++++++++++++++++++++++++++-
 t/t5607-clone-bundle.sh     |  33 +++++++
 transport.c                 |   2 +-
 8 files changed, 246 insertions(+), 14 deletions(-)


base-commit: b9cfe4845cb2562584837bc0101c0ab76490a239
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1730%2Fblanet%2Fxx%2Fbundle-uri-bug-using-bundle-list-v5
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1730/blanet/xx/bundle-uri-bug-using-bundle-list-v5
Pull-Request: https://github.com/gitgitgadget/git/pull/1730

Range-diff vs v4:

 1:  e958a3ab20c = 1:  e958a3ab20c bundle-uri: verify oid before writing refs
 3:  5ddc894c2c1 = 2:  d21c236b8de fetch-pack: expose fsckObjects configuration logic
 2:  beb70735811 ! 3:  0a18d7839be unbundle: extend verify_bundle_flags to support fsck-objects
     @@ Metadata
      Author: Xing Xin <xingxin.xx@bytedance.com>
      
       ## Commit message ##
     -    unbundle: extend verify_bundle_flags to support fsck-objects
     +    unbundle: extend options to support object verification
      
     -    This commit extends `verify_bundle_flags` by adding a new option
     -    `VERIFY_BUNDLE_FSCK_ALWAYS`, which enables checks for broken objects in
     -    `bundle.c:unbundle`. This option is now used as the default for fetches
     -    involving bundles, specifically by `transport.c:fetch_refs_from_bundle`
     -    for direct bundle fetches and by `bundle-uri.c:unbundle_from_file` for
     -    _bundle-uri_ enabled fetches.
     +    This commit extends object verification support in `bundle.c:unbundle`
     +    by adding two new options to `verify_bundle_flags`:
      
     -    Upcoming commits will introduce another option as a replacement that
     -    fits better with fetch operations. `VERIFY_BUNDLE_FSCK_ALWAYS` will be
     -    further used to add "--fsck-objects" support for "git bundle unbundle"
     -    and "git bundle verify".
     +    - `VERIFY_BUNDLE_FSCK_ALWAYS` explicitly enables checks for broken
     +      objects. It will be used to add "--fsck-objects" support for "git
     +      bundle unbundle" in a separate series.
     +    - `VERIFY_BUNDLE_FSCK_FOLLOW_FETCH` is designed to be used during fetch
     +      operations, specifically for direct bundle fetches and _bundle-uri_
     +      enabled fetches. When enabled, `bundle.c:unbundle` invokes
     +      `fetch-pack.c:fetch_pack_fsck_objects` to determine whether to enable
     +      checks for broken objects. Passing this flag during fetching will be
     +      implemented in a subsequent commit.
     +
     +    Note that the option `VERIFY_BUNDLE_FSCK_ALWAYS` takes precedence over
     +    `VERIFY_BUNDLE_FSCK_FOLLOW_FETCH`.
      
          Reviewed-by: Patrick Steinhardt <ps@pks.im>
          Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
      
     - ## bundle-uri.c ##
     -@@ bundle-uri.c: static int unbundle_from_file(struct repository *r, const char *file)
     - 	 * the prerequisite commits.
     - 	 */
     - 	if ((result = unbundle(r, &header, bundle_fd, NULL,
     --			       VERIFY_BUNDLE_QUIET)))
     -+			       VERIFY_BUNDLE_QUIET | VERIFY_BUNDLE_FSCK_ALWAYS)))
     - 		return 1;
     - 
     - 	/*
     -
       ## bundle.c ##
     +@@
     + #include "list-objects-filter-options.h"
     + #include "connected.h"
     + #include "write-or-die.h"
     ++#include "fetch-pack.h"
     + 
     + static const char v2_bundle_signature[] = "# v2 git bundle\n";
     + static const char v3_bundle_signature[] = "# v3 git bundle\n";
     +@@ bundle.c: int unbundle(struct repository *r, struct bundle_header *header,
     + 	     enum verify_bundle_flags flags)
     + {
     + 	struct child_process ip = CHILD_PROCESS_INIT;
     ++	int fsck_objects = 0;
     + 
     + 	if (verify_bundle(r, header, flags))
     + 		return -1;
      @@ bundle.c: int unbundle(struct repository *r, struct bundle_header *header,
       	if (header->filter.choice)
       		strvec_push(&ip.args, "--promisor=from-bundle");
       
      +	if (flags & VERIFY_BUNDLE_FSCK_ALWAYS)
     ++		fsck_objects = 1;
     ++	else if (flags & VERIFY_BUNDLE_FSCK_FOLLOW_FETCH)
     ++		fsck_objects = fetch_pack_fsck_objects();
     ++
     ++	if (fsck_objects)
      +		strvec_push(&ip.args, "--fsck-objects");
      +
       	if (extra_index_pack_args) {
     @@ bundle.h: int create_bundle(struct repository *r, const char *path,
       	VERIFY_BUNDLE_VERBOSE = (1 << 0),
       	VERIFY_BUNDLE_QUIET = (1 << 1),
      +	VERIFY_BUNDLE_FSCK_ALWAYS = (1 << 2),
     ++	VERIFY_BUNDLE_FSCK_FOLLOW_FETCH = (1 << 3),
       };
       
       int verify_bundle(struct repository *r, struct bundle_header *header,
     -
     - ## transport.c ##
     -@@ transport.c: static int fetch_refs_from_bundle(struct transport *transport,
     - 	if (!data->get_refs_from_bundle_called)
     - 		get_refs_from_bundle_inner(transport);
     - 	ret = unbundle(the_repository, &data->header, data->fd,
     --		       &extra_index_pack_args, 0);
     -+		       &extra_index_pack_args, VERIFY_BUNDLE_FSCK_ALWAYS);
     - 	transport->hash_algo = data->header.hash_algo;
     - 	return ret;
     - }
 4:  68b9bca9f8b ! 4:  eb9f21f16b5 unbundle: introduce option VERIFY_BUNDLE_FSCK_FOLLOW_FETCH
     @@ Metadata
      Author: Xing Xin <xingxin.xx@bytedance.com>
      
       ## Commit message ##
     -    unbundle: introduce option VERIFY_BUNDLE_FSCK_FOLLOW_FETCH
     +    unbundle: use VERIFY_BUNDLE_FSCK_FOLLOW_FETCH for fetches
      
     -    This commit introduces a new option `VERIFY_BUNDLE_FSCK_FOLLOW_FETCH` to
     -    `verify_bundle_flags`. In `bundle.c:unbundle`, this new option controls
     -    whether broken object checks should be enabled by invoking
     -    `fetch-pack.c:fetch_pack_fsck_objects`. Note that the option
     -    `VERIFY_BUNDLE_FSCK_ALWAYS` takes precedence over
     -    `VERIFY_BUNDLE_FSCK_FOLLOW_FETCH`.
     -
     -    This flag is now used in the fetching process by:
     +    This commit passes `VERIFY_BUNDLE_FSCK_FOLLOW_FETCH` to `unbundle` in
     +    the fetching process, including:
      
          - `transport.c:fetch_refs_from_bundle` for direct bundle fetches.
          - `bundle-uri.c:unbundle_from_file` for bundle-uri enabled fetches.
     @@ Commit message
          fetch operations. Tests have been added to confirm functionality in the
          scenarios mentioned above.
      
     +    Reviewed-by: Patrick Steinhardt <ps@pks.im>
          Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
      
       ## bundle-uri.c ##
     @@ bundle-uri.c: static int unbundle_from_file(struct repository *r, const char *fi
       	 * the prerequisite commits.
       	 */
       	if ((result = unbundle(r, &header, bundle_fd, NULL,
     --			       VERIFY_BUNDLE_QUIET | VERIFY_BUNDLE_FSCK_ALWAYS)))
     +-			       VERIFY_BUNDLE_QUIET)))
      +			       VERIFY_BUNDLE_QUIET | VERIFY_BUNDLE_FSCK_FOLLOW_FETCH)))
       		return 1;
       
       	/*
      
     - ## bundle.c ##
     -@@
     - #include "list-objects-filter-options.h"
     - #include "connected.h"
     - #include "write-or-die.h"
     -+#include "fetch-pack.h"
     - 
     - static const char v2_bundle_signature[] = "# v2 git bundle\n";
     - static const char v3_bundle_signature[] = "# v3 git bundle\n";
     -@@ bundle.c: int unbundle(struct repository *r, struct bundle_header *header,
     - 	     enum verify_bundle_flags flags)
     - {
     - 	struct child_process ip = CHILD_PROCESS_INIT;
     -+	int fsck_objects = 0;
     - 
     - 	if (verify_bundle(r, header, flags))
     - 		return -1;
     -@@ bundle.c: int unbundle(struct repository *r, struct bundle_header *header,
     - 		strvec_push(&ip.args, "--promisor=from-bundle");
     - 
     - 	if (flags & VERIFY_BUNDLE_FSCK_ALWAYS)
     -+		fsck_objects = 1;
     -+	else if (flags & VERIFY_BUNDLE_FSCK_FOLLOW_FETCH)
     -+		fsck_objects = fetch_pack_fsck_objects();
     -+
     -+	if (fsck_objects)
     - 		strvec_push(&ip.args, "--fsck-objects");
     - 
     - 	if (extra_index_pack_args) {
     -
     - ## bundle.h ##
     -@@ bundle.h: enum verify_bundle_flags {
     - 	VERIFY_BUNDLE_VERBOSE = (1 << 0),
     - 	VERIFY_BUNDLE_QUIET = (1 << 1),
     - 	VERIFY_BUNDLE_FSCK_ALWAYS = (1 << 2),
     -+	VERIFY_BUNDLE_FSCK_FOLLOW_FETCH = (1 << 3),
     - };
     - 
     - int verify_bundle(struct repository *r, struct bundle_header *header,
     -
       ## t/t5558-clone-bundle-uri.sh ##
      @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'create bundle' '
       		git bundle create B.bundle topic &&
     @@ transport.c: static int fetch_refs_from_bundle(struct transport *transport,
       	if (!data->get_refs_from_bundle_called)
       		get_refs_from_bundle_inner(transport);
       	ret = unbundle(the_repository, &data->header, data->fd,
     --		       &extra_index_pack_args, VERIFY_BUNDLE_FSCK_ALWAYS);
     +-		       &extra_index_pack_args, 0);
      +		       &extra_index_pack_args, VERIFY_BUNDLE_FSCK_FOLLOW_FETCH);
       	transport->hash_algo = data->header.hash_algo;
       	return ret;

-- 
gitgitgadget

  parent reply	other threads:[~2024-06-11  6:42 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-15  3:01 [PATCH] bundle-uri: refresh packed_git if unbundle succeed blanet via GitGitGadget
2024-05-17  5:00 ` Patrick Steinhardt
2024-05-17 16:14   ` Junio C Hamano
2024-05-20 11:48     ` Xing Xin
2024-05-20 17:19       ` Junio C Hamano
2024-05-27 16:04         ` Xing Xin
2024-05-20  9:41   ` Xing Xin
2024-05-17  7:36 ` Karthik Nayak
2024-05-20 10:19   ` Xing Xin
2024-05-20 12:36 ` [PATCH v2] bundle-uri: verify oid before writing refs blanet via GitGitGadget
2024-05-21 15:41   ` Karthik Nayak
2024-05-27 15:41   ` [PATCH v3 0/4] object checking related additions and fixes for bundles in fetches blanet via GitGitGadget
2024-05-27 15:41     ` [PATCH v3 1/4] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-05-28 11:55       ` Patrick Steinhardt
2024-05-30  8:32         ` Xing Xin
2024-05-27 15:41     ` [PATCH v3 2/4] unbundle: introduce unbundle_fsck_flags for fsckobjects handling Xing Xin via GitGitGadget
2024-05-28 12:03       ` Patrick Steinhardt
2024-05-29 18:12         ` Xing Xin
2024-05-30  4:38           ` Patrick Steinhardt
2024-05-30  8:46             ` Xing Xin
2024-05-27 15:41     ` [PATCH v3 3/4] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-05-28 12:03       ` Patrick Steinhardt
2024-05-28 17:10         ` Junio C Hamano
2024-05-28 17:24           ` Junio C Hamano
2024-05-29  5:52             ` Patrick Steinhardt
2024-05-30  8:48             ` Xing Xin
2024-05-29  5:52           ` Patrick Steinhardt
2024-05-27 15:41     ` [PATCH v3 4/4] unbundle: introduce new option UNBUNDLE_FSCK_FOLLOW_FETCH Xing Xin via GitGitGadget
2024-05-28 12:05       ` Patrick Steinhardt
2024-05-30  8:54         ` Xing Xin
2024-05-30  8:21     ` [PATCH v4 0/4] object checking related additions and fixes for bundles in fetches blanet via GitGitGadget
2024-05-30  8:21       ` [PATCH v4 1/4] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-05-30  8:21       ` [PATCH v4 2/4] unbundle: extend verify_bundle_flags to support fsck-objects Xing Xin via GitGitGadget
2024-06-06 12:06         ` Patrick Steinhardt
2024-06-11  6:46           ` Xing Xin
2024-05-30  8:21       ` [PATCH v4 3/4] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-05-30  8:21       ` [PATCH v4 4/4] unbundle: introduce option VERIFY_BUNDLE_FSCK_FOLLOW_FETCH Xing Xin via GitGitGadget
2024-06-06 12:06         ` Patrick Steinhardt
2024-06-11  6:46           ` Xing Xin
2024-06-11  6:42       ` blanet via GitGitGadget [this message]
2024-06-11  6:42         ` [PATCH v5 1/4] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-06-11  6:42         ` [PATCH v5 2/4] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-06-11  6:42         ` [PATCH v5 3/4] unbundle: extend options to support object verification Xing Xin via GitGitGadget
2024-06-11  9:11           ` Patrick Steinhardt
2024-06-11 12:47             ` Xing Xin
2024-06-11  6:42         ` [PATCH v5 4/4] unbundle: use VERIFY_BUNDLE_FSCK_FOLLOW_FETCH for fetches Xing Xin via GitGitGadget
2024-06-11 12:45         ` [PATCH v6 0/3] object checking related additions and fixes for bundles in fetches blanet via GitGitGadget
2024-06-11 12:45           ` [PATCH v6 1/3] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-06-11 19:08             ` Junio C Hamano
2024-06-17 13:53               ` Xing Xin
2024-06-11 12:45           ` [PATCH v6 2/3] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-06-11 19:20             ` Junio C Hamano
2024-06-11 12:45           ` [PATCH v6 3/3] unbundle: support object verification for fetches Xing Xin via GitGitGadget
2024-06-11 20:05             ` Junio C Hamano
2024-06-12 18:33               ` Xing Xin
2024-06-11 13:14           ` [PATCH v6 0/3] object checking related additions and fixes for bundles in fetches Patrick Steinhardt
2024-06-17 13:55           ` [PATCH v7 " blanet via GitGitGadget
2024-06-17 13:55             ` [PATCH v7 1/3] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-06-18 17:37               ` Junio C Hamano
2024-06-19  6:30                 ` Xing Xin
2024-06-17 13:55             ` [PATCH v7 2/3] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-06-17 13:55             ` [PATCH v7 3/3] unbundle: extend object verification for fetches Xing Xin via GitGitGadget
2024-06-19  4:07             ` [PATCH v8 0/3] object checking related additions and fixes for bundles in fetches blanet via GitGitGadget
2024-06-19  4:07               ` [PATCH v8 1/3] bundle-uri: verify oid before writing refs Xing Xin via GitGitGadget
2024-06-19  4:07               ` [PATCH v8 2/3] fetch-pack: expose fsckObjects configuration logic Xing Xin via GitGitGadget
2024-06-19  4:07               ` [PATCH v8 3/3] unbundle: extend object verification for fetches Xing Xin via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.1730.v5.git.1718088126.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=bupt_xingxin@163.com \
    --cc=git@vger.kernel.org \
    --cc=karthik.188@gmail.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).