Git Mailing List Archive mirror
From: Junio C Hamano <gitster@pobox.com>
To: "Heather Lapointe via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, "René Scharfe" <l.s.r@web.de>,
	"Heather Lapointe" <alpha@alphaservcomputing.solutions>
Subject: Re: [PATCH v3 4/9] tree: handle submodule case for read_tree_at properly
Date: Mon, 17 Oct 2022 06:56:37 -0700
Message-ID: <xmqqsfjm1rl6.fsf@gitster.g>
In-Reply-To: d3d1738e670d5dbf1378fc5c3209b2e98234a771.1665973401.git.gitgitgadget@gmail.com

"Heather Lapointe via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Heather Lapointe <alpha@alphaservcomputing.solutions>
>
> This supports traversal into an actual submodule for read_tree_at.
> The logic is blocked on pathspec->recurse_submodules now,
> but previously hadn't been executed due to all fn() cases
> returning early for submodules.
>
> Signed-off-by: Heather Lapointe <alpha@alphaservcomputing.solutions>
> ---
>  tree.c | 88 ++++++++++++++++++++++++++++++++++++++++------------------
>  1 file changed, 61 insertions(+), 27 deletions(-)
>
> diff --git a/tree.c b/tree.c
> index 13f9173d45e..2a087c010f9 100644
> --- a/tree.c
> +++ b/tree.c
> @@ -8,6 +8,7 @@
>  #include "alloc.h"
>  #include "tree-walk.h"
>  #include "repository.h"
> +#include "pathspec.h"
>  
>  const char *tree_type = "tree";
>  
> @@ -47,40 +48,73 @@ int read_tree_at(struct repository *r,
>  			return -1;
>  		}
>  
> -		if (S_ISDIR(entry.mode))
> +		if (S_ISDIR(entry.mode)) {
>  			oidcpy(&oid, &entry.oid);
> +			len = tree_entry_len(&entry);
> +			strbuf_add(base, entry.path, len);
> +			strbuf_addch(base, '/');
> +			retval = read_tree_at(r, lookup_tree(r, &oid),
> +						base, pathspec,
> +						fn, context);
> +			strbuf_setlen(base, oldlen);
> +			if (retval)
> +				return -1;

The diff output makes it appear as if we are now adding a lot of extra
processing to the normal directory case, but it actually folds in the
code that was originally outside the if/else if cascade.  So
I think this does not break the normal directory case.

> +		} else if (pathspec->recurse_submodules && S_ISGITLINK(entry.mode)) {
> +			struct commit *commit;
> +			struct repository subrepo;
> +			struct repository* subrepo_p = &subrepo;
> +			struct tree* submodule_tree;

In our codebase, star/asterisk for a pointer declaration sticks to
the variable, not the type.

cf. Documentation/CodingGuidelines
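
As an illustration of why the star goes with the variable, here is a
minimal sketch.  The struct and function names are made up for this
example and are not Git's own types.  In C the star binds to the
declarator, so writing "struct tree_like* t, end;" would declare "end"
as a plain struct, not a pointer; keeping the star next to each name
makes that visible.

```c
#include <stddef.h>

/* Hypothetical stand-in for illustration; this is not Git's struct tree. */
struct tree_like {
	int id;
};

/* Return the first element whose id matches, or NULL if none does. */
static struct tree_like *find_tree_like(struct tree_like *trees,
					size_t nr, int id)
{
	/* Both "t" and "end" are pointers; each name carries its own star. */
	struct tree_like *t, *end = trees + nr;

	for (t = trees; t < end; t++)
		if (t->id == id)
			return t;
	return NULL;
}
```

Applied to the hunk above, the declarations would read
"struct repository *subrepo_p" and "struct tree *submodule_tree".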

> +			char *submodule_rel_path;

Funny that the new code sometimes gets it right ;-)

> +			int name_base_len = 0;
> +
> +			len = tree_entry_len(&entry);
> +			strbuf_add(base, entry.path, len);
> +			submodule_rel_path = base->buf;
> +			// repo_submodule_init expects a path relative to submodule_prefix

We avoid // comments.
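
To show what the comments could look like, here is a rough standalone
sketch of the prefix-stripping step from this hunk, written with
/* */ comments.  The helper name is made up and is not part of the
patch or of Git, and it checks only for '/' where the patch uses
is_dir_sep(), so take it as an approximation under those assumptions.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Git: return the part of "path"
 * that follows "prefix", also skipping one '/' separator if the
 * prefix did not already end with it.  A NULL prefix leaves the
 * path untouched.
 */
static const char *strip_prefix_sketch(const char *path, const char *prefix)
{
	size_t prefix_len;

	if (!prefix)
		return path;

	prefix_len = strlen(prefix);
	/* the caller is expected to pass a path that starts with prefix */
	assert(!strncmp(path, prefix, prefix_len));
	path += prefix_len;

	/* the prefix may or may not end with '/'; drop a leftover one */
	if (*path == '/')
		path++;
	return path;
}
```

With this sketch, a path of "sub/nested" yields "nested" whether the
prefix is given as "sub" or as "sub/".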

> +			if (r->submodule_prefix) {
> +				name_base_len = strlen(r->submodule_prefix);
> +				// we should always expect to start with submodule_prefix
> +				assert(!strncmp(submodule_rel_path, r->submodule_prefix, name_base_len));
> +				// strip the prefix
> +				submodule_rel_path += name_base_len;
> +				// if submodule_prefix doesn't end with a /, we want to get rid of that too
> +				if (is_dir_sep(submodule_rel_path[0])) {
> +					submodule_rel_path++;
> +				}
> +			}
> +
> +			if (repo_submodule_init(subrepo_p, r, submodule_rel_path, null_oid()))
> +				die("couldn't init submodule %s", base->buf);
> +
> +			if (repo_read_index(subrepo_p) < 0)
> +				die("index file corrupt");

Why?  You are going to query the object store of the submodule
repository, and to do so you do not need to have its index read into
core.

> +			commit = lookup_commit(subrepo_p, &entry.oid);
>  			if (!commit)
> -				die("Commit %s in submodule path %s%s not found",
> +				die("Commit %s in submodule path %s not found",
>  				    oid_to_hex(&entry.oid),
> -				    base->buf, entry.path);
> -
> -			// FIXME: This is the wrong repo instance (it refers to the superproject)
> -			// it will always fail as is (will fix in later patch)
> -			// This current codepath isn't executed by any existing callbacks
> -			// so it wouldn't show up as an issue at this time.
> -			if (repo_parse_commit(r, commit))
> -				die("Invalid commit %s in submodule path %s%s",
> +				    base->buf);
> +
> +			if (repo_parse_commit(subrepo_p, commit))
> +				die("Invalid commit %s in submodule path %s",
>  				    oid_to_hex(&entry.oid),
> -				    base->buf, entry.path);
> +				    base->buf);
>  
> -			oidcpy(&oid, get_commit_tree_oid(commit));
> -		}
> -		else
> -			continue;
> +			submodule_tree = repo_get_commit_tree(subrepo_p, commit);
> +			oidcpy(&oid, submodule_tree ? &submodule_tree->object.oid : NULL);
>  
> -		len = tree_entry_len(&entry);
> -		strbuf_add(base, entry.path, len);
> -		strbuf_addch(base, '/');
> -		retval = read_tree_at(r, lookup_tree(r, &oid),
> -				      base, pathspec,
> -				      fn, context);
> -		strbuf_setlen(base, oldlen);
> -		if (retval)
> -			return -1;
> +			strbuf_addch(base, '/');
> +
> +			retval = read_tree_at(subrepo_p, lookup_tree(subrepo_p, &oid),
> +						base, pathspec,
> +						fn, context);
> +			if (retval)
> +			    die("failed to read tree for %s", base->buf);
> +			strbuf_setlen(base, oldlen);
> +			repo_clear(subrepo_p);

This is a lot of new code, which must be done correctly.  An easier
way out, using the add_submodule_odb() trick that the original code
assumed, becomes somewhat tempting (I guess we would do that in the fn()
that tells us to recurse into this codepath upon seeing a gitlink
entry).  Then we wouldn't have had to touch any tree() calls that were
taught to take "struct repository *" in earlier steps in this series.

But at some point, we would need to bite the bullet and plumb the
repository pointer through the callchain of more APIs, and this may
be that point.  I dunno.

> +		}
> +		// else, this is a file (or a submodule, but no pathspec->recurse_submodules)
>  	}
>  	return 0;
>  }
