Git Mailing List Archive mirror
 help / color / mirror / Atom feed
From: Derrick Stolee <derrickstolee@github.com>
To: Han Xin <hanxin.hx@bytedance.com>, git@vger.kernel.org
Cc: xingxin.xx@bytedance.com, jonathantanmy@google.com,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v2 2/2] negotiator/skipping: fix some problems in mark_common()
Date: Wed, 26 Apr 2023 07:08:54 -0400	[thread overview]
Message-ID: <41273b5d-f4f8-2dce-94d1-37a9b56ed1ea@github.com> (raw)
In-Reply-To: <abbb1bc0b35d03e13249ec2e5bb8229a0a123685.1682473718.git.hanxin.hx@bytedance.com>

On 4/26/2023 12:05 AM, Han Xin wrote:
> Fixed the following problems:

This might be a good time to reference the change from recursive to
iterative:

  The mark_common() method in negotiator/skipping.c was converted
  from recursive to iterative in 4654134976f (negotiator/skipping:
  avoid stack overflow, 2022-10-25), but there is some more work
  to do:
 
> 1. prio_queue() should be used with clear_prio_queue(), otherwise there
>    will be a memory leak.
> 2. It does not do duplicate protection before prio_queue_put().
>    (The COMMON bit would work here, too.)
> 3. When it translated from recursive to iterative it kept "return"
>    statements that should probably be "continue" statements.
> 4. It does not attempt to parse commits, and instead returns
>    immediately when finding an unparsed commit. This is something
>    that it did in its original version, so maybe it is by design,
>    but it doesn't match the doc comment for the method.
> 
> Helped-by: Derrick Stolee <derrickstolee@github.com>
> Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
> ---
>  negotiator/skipping.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/negotiator/skipping.c b/negotiator/skipping.c
> index c7d6ab39bc..b06dcb197b 100644
> --- a/negotiator/skipping.c
> +++ b/negotiator/skipping.c
> @@ -85,7 +85,7 @@ static int clear_marks(const char *refname, const struct object_id *oid,
>  }
>  
>  /*
> - * Mark this SEEN commit and all its SEEN ancestors as COMMON.
> + * Mark this SEEN commit and all its parsed SEEN ancestors as COMMON.

Ok, the doc comment is updated here instead of starting to parse
commits. Since this is the behavior since it was first introduced
in 42cc7485a2e (negotiator/skipping: skip commits during fetch,
2018-07-16), this is the best thing to do.

I notice from a second glance that we are only walking the commits
that are marked SEEN, so we are only visiting commits with respect
to a previous walk of some kind, which provides some explanation.

>  	while ((c = prio_queue_get(&queue))) {
>  		struct commit_list *p;
>  		if (c->object.flags & COMMON)
> -			return;
> +			continue;
>  		c->object.flags |= COMMON;
>  		if (!(c->object.flags & POPPED))
>  			data->non_common_revs--;
>  
>  		if (!c->object.parsed)
> -			return;
> +			continue;
>  		for (p = c->parents; p; p = p->next) {
> -			if (p->item->object.flags & SEEN)
> +			if (p->item->object.flags & SEEN || p->item->object.flags & COMMON)
>  				prio_queue_put(&queue, p->item);

This is the incorrect check for the COMMON bit, because it is
a positive check (we add the common bit after we pop a commit
from the queue) _and_ because we could add a commit multiple
times before it is first popped and that bit is added.

Instead, we need

			if ((p->item->object.flags & SEEN) &&
			    !(p->item->object.flags & COMMON)) {
				p->item->object.flags |= COMMON;
				prio_queue_put(&queue, p->item);
			}

and at the start of the loop we need to add the COMMON bit to
the starting commit. We also need to remove this bit from the
main section of the loop:

  		if (c->object.flags & COMMON)
			continue;
 		c->object.flags |= COMMON;

because it does nothing if the COMMON bit is added before
being added to the queue.

I'm very suspicious that this change did not trigger a test
failure, since the behavior is quite different from the previous
version. Of course, the recursive-to-iterative change was first
to change the behavior, so I'm not surprised that it isn't caught
by tests. What kind of tests can we introduce to harden our
coverage here?

>  		}
>  	}
> +
> +	clear_prio_queue(&queue);

Good to clean up this queue.

Thanks,
-Stolee


  reply	other threads:[~2023-04-26 11:09 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-24  2:23 [PATCH v1] negotiator/default.c: avoid stack overflow Han Xin
2023-04-24 14:44 ` Derrick Stolee
2023-04-25  3:02   ` [External] " Han Xin
2023-04-25 13:34     ` Derrick Stolee
2023-04-26  4:05 ` [PATCH v2 0/2] negotiator/default: " Han Xin
2023-04-26  4:05   ` [PATCH v2 1/2] " Han Xin
2023-04-26 11:13     ` Derrick Stolee
2023-04-26 11:40       ` [External] " Han Xin
2023-04-26  4:05   ` [PATCH v2 2/2] negotiator/skipping: fix some problems in mark_common() Han Xin
2023-04-26 11:08     ` Derrick Stolee [this message]
2023-04-26 11:55       ` [External] " Han Xin
2023-04-26 13:15   ` [PATCH v2 0/2] negotiator/default: avoid stack overflow Han Xin
2023-04-26 13:15     ` [PATCH v3 1/2] " Han Xin
2023-04-26 17:14       ` Junio C Hamano
2023-04-26 17:30         ` Derrick Stolee
2023-04-26 17:38           ` Junio C Hamano
2023-04-26 13:15     ` [PATCH v3 2/2] negotiator/skipping: fix some problems in mark_common() Han Xin
2023-05-01 22:11     ` [PATCH v2 0/2] negotiator/default: avoid stack overflow Junio C Hamano
2023-05-02  1:49       ` Derrick Stolee
2023-05-02 15:51         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=41273b5d-f4f8-2dce-94d1-37a9b56ed1ea@github.com \
    --to=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=hanxin.hx@bytedance.com \
    --cc=jonathantanmy@google.com \
    --cc=xingxin.xx@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).