Git Mailing List Archive mirror
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: phillip.wood@dunelm.org.uk
Cc: Phillip Wood via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: Re: [PATCH 1/2] sequencer: stop exporting GIT_REFLOG_ACTION
Date: Tue, 08 Nov 2022 15:51:08 +0100
Message-ID: <221108.864jv9sc9r.gmgdl@evledraar.gmail.com>
In-Reply-To: <16baa0cd-0797-4427-3e39-e5ffd2dca544@dunelm.org.uk>


On Tue, Nov 08 2022, Phillip Wood wrote:

> Hi Ævar
>
> On 07/11/2022 19:35, Phillip Wood wrote:
>>>> @@ -5116,7 +5121,7 @@ static int single_pick(struct repository *r,
>>>>               TODO_PICK : TODO_REVERT;
>>>>       item.commit = cmit;
>>>> -    setenv(GIT_REFLOG_ACTION, action_name(opts), 0);
>>>> +    opts->reflog_message = sequencer_reflog_action(opts);
>>>>       return do_pick_commit(r, &item, opts, 0, &check_todo);
>>>
>>> Here you're adding a new memory leak, which you can see if you run
>>> e.g. the 1st test of ./t1013-read-tree-submodule.sh before & after this
>>> change.
>
> What's a read-tree test using rebase for? I find the submodule tests
> completely incomprehensible. It is calling 
> test_submodule_switch_recursing_with_args() which does not call rebase
> directly but who knows what is going on in all the helper functions. 

I don't know, I just worked my way backwards from the leak logs, so...

> Have you got a simple example of a test which shows a new leak?

...yes, e.g. (after make SANITIZE=leak):

	./t3425-rebase-topology-merges.sh -vixd

Will, on "master", emit:
	
	Direct leak of 1408 byte(s) in 1 object(s) allocated from:
	    #0 0x7ff891b5f545 in __interceptor_malloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:75
	    #1 0x6c45e8 in do_xmalloc wrapper.c:51
	    #2 0x6c4670 in xmalloc wrapper.c:72
	    #3 0x6037e2 in parse_options_concat parse-options-cb.c:188
	    #4 0x4c547c in run_sequencer builtin/revert.c:140
	    #5 0x4c5a4c in cmd_revert builtin/revert.c:247
	    #6 0x407a32 in run_builtin git.c:466
	    #7 0x407e0a in handle_builtin git.c:721
	    #8 0x40803d in run_argv git.c:788
	    #9 0x40850f in cmd_main git.c:923
	    #10 0x4eed6f in main common-main.c:57
	    #11 0x7ff8918b9209 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
	    #12 0x7ff8918b92bb in __libc_start_main_impl ../csu/libc-start.c:389
	    #13 0x405fd0 in _start (git+0x405fd0)
	
	Direct leak of 4 byte(s) in 1 object(s) allocated from:
	    #0 0x7ff891b5f545 in __interceptor_malloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:75
	    #1 0x7ff891929caa in __GI___strdup string/strdup.c:42
	    #2 0x6c4591 in xstrdup wrapper.c:39
	    #3 0x4c58f8 in run_sequencer builtin/revert.c:223
	    #4 0x4c5a4c in cmd_revert builtin/revert.c:247
	    #5 0x407a32 in run_builtin git.c:466
	    #6 0x407e0a in handle_builtin git.c:721
	    #7 0x40803d in run_argv git.c:788
	    #8 0x40850f in cmd_main git.c:923
	    #9 0x4eed6f in main common-main.c:57
	    #10 0x7ff8918b9209 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
	    #11 0x7ff8918b92bb in __libc_start_main_impl ../csu/libc-start.c:389
	    #12 0x405fd0 in _start (git+0x405fd0)

After this change we still have the first leak (which is unrelated) and
the second, but have added this one:
	
	Direct leak of 7 byte(s) in 1 object(s) allocated from:
	    #0 0x7f7cc51e5545 in __interceptor_malloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:75
	    #1 0x7f7cc4fafcaa in __GI___strdup string/strdup.c:42
	    #2 0x6c460b in xstrdup wrapper.c:39
	    #3 0x66df91 in sequencer_reflog_action sequencer.c:3685
	    #4 0x6725ad in single_pick sequencer.c:5124
	    #5 0x6728dd in sequencer_pick_revisions sequencer.c:5178
	    #6 0x4c5a17 in run_sequencer builtin/revert.c:237
	    #7 0x4c5aa9 in cmd_revert builtin/revert.c:247
	    #8 0x407a32 in run_builtin git.c:466
	    #9 0x407e0a in handle_builtin git.c:721
	    #10 0x40803d in run_argv git.c:788
	    #11 0x40850f in cmd_main git.c:923
	    #12 0x4eedcc in main common-main.c:57
	    #13 0x7f7cc4f3f209 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
	    #14 0x7f7cc4f3f2bb in __libc_start_main_impl ../csu/libc-start.c:389
	    #15 0x405fd0 in _start (git+0x405fd0)

But more to the point, if you run the test suite with e.g.:

	GIT_TEST_PASSING_SANITIZE_LEAK=check GIT_TEST_SANITIZE_LEAK_LOG=true

You can find these raw reports in:

	grep -r sequencer test-results/*.leak

Or, from my github.com/avar/git.git use this nice script/alias to
summarize it (I haven't upstreamed this yet):

	$ git help scan-leaks-top
	'scan-leaks-top' is aliased to '!f() { cd t && git cat-file blob avar/add-new-sanitize-leak-test-modes-follow-up:t/aggregate-leaks.perl | perl - | less -S; }; f'

>> I'm not sure how, opts->reflog_message will be a copy of
>> opts->reflog_action which is freed at the end of the rebase. I'll
>> have a proper look tomorrow to see if I'm missing something.
>
> So it is possible this is showing up because I think we only free the
> heap allocated members of opts in sequencer_remove_state() and that is 
> not called when we stop for a conflict resolution, a break command, a
> failed exec or a rescheduled pick/reset etc. The way to fix that would 
> be to refactor sequencer_remove_state() to create
> replay_opts_release() and call that from builtin/{revert,rebase}.c

Yes, I think that's probably the root cause. I have a leak-fixing topic
as a follow-up to my current one, which among other things tried to
address this: https://github.com/avar/git/commit/7a150d1b7e2

I'd just forgotten about it. That link currently says committed <24hrs
ago, but that's only because I was just rebasing the topic for something
unrelated; I hacked this up in mid-August.
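
To make the pattern concrete, here is a minimal standalone sketch of it.
This is not git's code: "struct demo_opts", demo_remove_state() and
demo_run() are hypothetical stand-ins for "struct replay_opts",
sequencer_remove_state() and the sequencer run. The heap-allocated
member is only freed on the "finished" path, so any early stop leaves
the allocation behind:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for "struct replay_opts". */
struct demo_opts {
	char *reflog_action; /* heap-allocated, owned by the struct */
};

/* Portable strdup() replacement so the sketch is plain ISO C. */
static char *demo_strdup(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);
	if (p)
		memcpy(p, s, n);
	return p;
}

/*
 * Hypothetical stand-in for sequencer_remove_state(): in this sketch
 * it is the only place that frees the heap-allocated members.
 */
static void demo_remove_state(struct demo_opts *opts)
{
	free(opts->reflog_action);
	opts->reflog_action = NULL;
}

/*
 * Returns 0 when the run finished, 1 when it stopped early (think:
 * conflict, "break", failed "exec"), and -1 on allocation failure.
 */
static int demo_run(struct demo_opts *opts, int stop_early)
{
	assert(opts && !opts->reflog_action);
	opts->reflog_action = demo_strdup("revert");
	if (!opts->reflog_action)
		return -1;
	if (stop_early)
		return 1; /* cleanup skipped: the string is still allocated */
	demo_remove_state(opts);
	return 0;
}
```

Calling demo_run() with stop_early set returns with
opts->reflog_action still allocated. Unless the caller frees it, a leak
checker reports it as a direct leak, in the same way as the xstrdup()
in the report above.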

> As that is unrelated to removing the setenv() calls which is the focus
> of this series I will not be doing that in this series.

I'm fine with us leaving this for now, and saying that it's OK that
we're adding some new leaks, if we're addressing the setenv/getenv
issue, and that we can fix the root cause of the current leaks later.

But let's be clear: It's not unrelated to your refactoring in this
topic, we didn't have this leak before, and now we have it. These two
patches are the cause of some new leaks we didn't have before.

And if we run this on my topic, which narrowly attempted to fix these
leaks, e.g. that "t3425-rebase-topology-merges.sh" test will have just 1
leak in that failing test, vs. 3 leaks with this topic (the "4 byte(s)
in 1 object(s)" above).

It's just a nice coincidence that our memory leaks are currently in such
a sorry state overall that this isn't failing e.g. the linux-leaks CI,
because the new leaks are being masked by tests that are already
failing due to other pre-existing leaks.

But all that being said I think the right move is for this topic to
proceed, perhaps with an updated commit message noting some of this.

It's really just running into the existing problem of
replay_opts_release(). If that destructor isn't reliable (which it
isn't) we can still make new use of it, and then fix how we call it
later for all its callers.

Which I've just tested: I cherry-picked 7a150d1b7e2 and the few
preceding commits it needs (dcc104aef89..7a150d1b7e2), and then applied
this on top:
	
	diff --git a/builtin/revert.c b/builtin/revert.c
	index ee32c714a76..0abd805beed 100644
	--- a/builtin/revert.c
	+++ b/builtin/revert.c
	@@ -250,6 +250,7 @@ int cmd_revert(int argc, const char **argv, const char *prefix)
	 	if (opts.revs)
	 		release_revisions(opts.revs);
	 	free(opts.revs);
	+	replay_opts_release(&opts);
	 	return res;
	 }

With that, the leaks above are down to just the unrelated
parse_options_concat() leak. I.e. this really is just a case of us
missing the destructor due to a more general issue.
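
The shape of that fix, reduced to a standalone sketch. Again this is not
git's code: "struct demo_opts", demo_opts_release(), demo_pick() and
demo_cmd() are hypothetical stand-ins for "struct replay_opts",
replay_opts_release(), the pick itself and cmd_revert(). The assumption
is that the destructor is safe to call on every exit path, including
after an early stop:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for "struct replay_opts". */
struct demo_opts {
	char *reflog_action; /* heap-allocated, owned by the struct */
};

/*
 * Hypothetical destructor in the spirit of replay_opts_release().
 * It resets the pointer, so calling it twice is harmless.
 */
static void demo_opts_release(struct demo_opts *opts)
{
	assert(opts);
	free(opts->reflog_action);
	opts->reflog_action = NULL;
}

/*
 * Allocates the member and then either finishes (0) or stops early
 * (1). It never frees anything itself. Returns -1 if malloc() fails.
 */
static int demo_pick(struct demo_opts *opts, int stop_early)
{
	static const char action[] = "revert";

	opts->reflog_action = malloc(sizeof(action));
	if (!opts->reflog_action)
		return -1;
	memcpy(opts->reflog_action, action, sizeof(action));
	return stop_early ? 1 : 0;
}

/*
 * Hypothetical stand-in for cmd_revert(): the caller releases the
 * options unconditionally, whatever demo_pick() returned.
 */
static int demo_cmd(struct demo_opts *opts, int stop_early)
{
	int res = demo_pick(opts, stop_early);

	demo_opts_release(opts);
	return res;
}
```

Because the release happens in the caller rather than only on the
"finished" path, it no longer matters whether the run completed or
stopped early.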

1. https://lore.kernel.org/git/8eec228d-d392-523d-2415-149b946f642e@dunelm.org.uk/

