Git Mailing List Archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] fix a leak with excludes_file
@ 2024-04-06 14:29 Rubén Justo
  2024-04-06 14:31 ` [PATCH 1/4] path.c: introduce strbuf_interpolate_path Rubén Justo
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Rubén Justo @ 2024-04-06 14:29 UTC (permalink / raw
  To: Git List

Having ...
	[core]
        	excludesFile = /some/global/path/.gitignore

... this triggers a leak:

	$ GIT_EDITOR=: git rebase -i HEAD
	Successfully rebased and updated detached HEAD.

	=================================================================
	==...==ERROR: LeakSanitizer: detected memory leaks

	Direct leak of 60 byte(s) in 1 object(s) allocated from:
	    #0  ... in realloc 
	    #1  ... in xrealloc wrapper.c:137
	    #2  ... in strbuf_grow strbuf.c:112
	    #3  ... in strbuf_add strbuf.c:311
	    #4  ... in strbuf_addstr strbuf.h:310
	    #5  ... in interpolate_path path.c:771
	    #6  ... in git_config_pathname config.c:1352
	    #7  ... in git_default_core_config config.c:1588
	    #8  ... in git_default_config config.c:1791
	    #9  ... in rebase_config builtin/rebase.c:801
	    #10 ... in configset_iter config.c:2161
	    #11 ... in repo_config config.c:2540
	    #12 ... in git_config config.c:2663
	    #13 ... in cmd_rebase builtin/rebase.c:1187
	    #14 ... in run_builtin git.c:469
	    #15 ... in handle_builtin git.c:724
	    #16 ... in run_argv git.c:788
	    #17 ... in cmd_main git.c:923
	    #18 ... in main common-main.c:62

It happens because we parse twice the configuration: 

	$ GIT_EDITOR=: gdb --ex "break git_config" --ex "run" --args git rebase -i HEAD
	(gdb) bt
	#0  git_config () at config.c:2663
	#1  ... in cmd_rebase () at builtin/rebase.c:1187
	#2  ... in run_builtin () at git.c:469
	#3  ... in handle_builtin () at git.c:724
	#4  ... in run_argv () at git.c:788
	#5  ... in cmd_main () at git.c:923
	#6  ... in main () at common-main.c:62
	(gdb) c
	(gdb) bt
	#0  git_config () at config.c:2663
	#1  ... in sequencer_init_config () at sequencer.c:291
	#2  ... in get_replay_opts () at builtin/rebase.c:161
	#3  ... in do_interactive_rebase () at builtin/rebase.c:271
	#4  ... in run_sequencer_rebase () at builtin/rebase.c:339
	#5  ... in run_specific_rebase () at builtin/rebase.c:705
	#6  ... in cmd_rebase () at builtin/rebase.c:1830
	#7  ... in run_builtin () at git.c:469
	#8  ... in handle_builtin () at git.c:724
	#9  ... in run_argv () at git.c:788
	#10 ... in cmd_main () at git.c:923
	#11 ... in main () at common-main.c:62

We call twice to git_config(): first to get the main git-branch(1) options,
and second to get the ones related to the sequencer.

Due to how git_config() works the global configuration is parsed twice,
therefore if core.excludesFile is set, it will be allocated twice.

A free() before the git_config_pathname() can be a simpler fix, but I
think this series offers a better approach, perhaps applicable to other
potential similar leaks.

Rubén Justo (4):
  path.c: introduce strbuf_interpolate_path
  config.c: introduce git_config_strbuf_pathname
  environment.c: convert excludes_file to struct strbuf
  t7300: mark as leak-free

 config.c         | 12 +++++++++++-
 config.h         |  2 ++
 dir.c            | 13 +++++++++----
 environment.c    |  2 +-
 environment.h    |  2 +-
 path.c           | 20 ++++++++++++++------
 path.h           |  1 +
 t/t7300-clean.sh |  1 +
 8 files changed, 40 insertions(+), 13 deletions(-)

-- 
2.44.0.697.g9b33b46f29


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/4] path.c: introduce strbuf_interpolate_path
  2024-04-06 14:29 [PATCH 0/4] fix a leak with excludes_file Rubén Justo
@ 2024-04-06 14:31 ` Rubén Justo
  2024-04-06 14:32 ` [PATCH 2/4] config.c: introduce git_config_strbuf_pathname Rubén Justo
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Rubén Justo @ 2024-04-06 14:31 UTC (permalink / raw
  To: Git List

Factorize interpolate_path to have a similar function that uses a
strbuf, instead of allocating a new string, to return the interpolated
path.

It will allow us to avoid some allocs and also some frees, which we will
take advantage of in the next commits.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
---
 path.c | 20 ++++++++++++++------
 path.h |  1 +
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/path.c b/path.c
index 8bb223c92c..7a1a1a9bc0 100644
--- a/path.c
+++ b/path.c
@@ -737,8 +737,16 @@ static struct passwd *getpw_str(const char *username, size_t len)
 char *interpolate_path(const char *path, int real_home)
 {
 	struct strbuf user_path = STRBUF_INIT;
+
+	return strbuf_interpolate_path(path, real_home, &user_path);
+}
+
+char *strbuf_interpolate_path(const char *path, int real_home, struct strbuf* dst)
+{
 	const char *to_copy = path;
 
+	strbuf_reset(dst);
+
 	if (!path)
 		goto return_null;
 
@@ -754,9 +762,9 @@ char *interpolate_path(const char *path, int real_home)
 			if (!home)
 				goto return_null;
 			if (real_home)
-				strbuf_add_real_path(&user_path, home);
+				strbuf_add_real_path(dst, home);
 			else
-				strbuf_addstr(&user_path, home);
+				strbuf_addstr(dst, home);
 #ifdef GIT_WINDOWS_NATIVE
 			convert_slashes(user_path.buf);
 #endif
@@ -764,14 +772,14 @@ char *interpolate_path(const char *path, int real_home)
 			struct passwd *pw = getpw_str(username, username_len);
 			if (!pw)
 				goto return_null;
-			strbuf_addstr(&user_path, pw->pw_dir);
+			strbuf_addstr(dst, pw->pw_dir);
 		}
 		to_copy = first_slash;
 	}
-	strbuf_addstr(&user_path, to_copy);
-	return strbuf_detach(&user_path, NULL);
+	strbuf_addstr(dst, to_copy);
+	return dst->buf;
 return_null:
-	strbuf_release(&user_path);
+	strbuf_release(dst);
 	return NULL;
 }
 
diff --git a/path.h b/path.h
index e053effef2..da7e5384a3 100644
--- a/path.h
+++ b/path.h
@@ -185,6 +185,7 @@ int calc_shared_perm(int mode);
 int adjust_shared_perm(const char *path);
 
 char *interpolate_path(const char *path, int real_home);
+char *strbuf_interpolate_path(const char *path, int real_home, struct strbuf *dst);
 const char *enter_repo(const char *path, int strict);
 const char *remove_leading_path(const char *in, const char *prefix);
 const char *relative_path(const char *in, const char *prefix, struct strbuf *sb);
-- 
2.44.0.697.g9b33b46f29

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/4] config.c: introduce git_config_strbuf_pathname
  2024-04-06 14:29 [PATCH 0/4] fix a leak with excludes_file Rubén Justo
  2024-04-06 14:31 ` [PATCH 1/4] path.c: introduce strbuf_interpolate_path Rubén Justo
@ 2024-04-06 14:32 ` Rubén Justo
  2024-04-06 14:32 ` [PATCH 3/4] environment.c: convert excludes_file to struct strbuf Rubén Justo
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Rubén Justo @ 2024-04-06 14:32 UTC (permalink / raw
  To: Git List

Add a new git_config_strbuf_pathname function, similar to
git_config_pathname, that works with a strbuf relying on the recently
introduced strbuf_interpolate_path.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
---
 config.c | 10 ++++++++++
 config.h |  2 ++
 2 files changed, 12 insertions(+)

diff --git a/config.c b/config.c
index eebce8c7e0..9beeb63b50 100644
--- a/config.c
+++ b/config.c
@@ -1355,6 +1355,16 @@ int git_config_pathname(const char **dest, const char *var, const char *value)
 	return 0;
 }
 
+int git_config_strbuf_pathname(struct strbuf *dest, const char *var, const char *value)
+{
+	if (!value)
+		return config_error_nonbool(var);
+	strbuf_interpolate_path(value, 0, dest);
+	if (!dest->len)
+		die(_("failed to expand user dir in: '%s'"), value);
+	return 0;
+}
+
 int git_config_expiry_date(timestamp_t *timestamp, const char *var, const char *value)
 {
 	if (!value)
diff --git a/config.h b/config.h
index f4966e3749..e405f1c140 100644
--- a/config.h
+++ b/config.h
@@ -22,6 +22,7 @@
  */
 
 struct object_id;
+struct strbuf;
 
 /* git_config_parse_key() returns these negated: */
 #define CONFIG_INVALID_KEY 1
@@ -287,6 +288,7 @@ int git_config_string(const char **, const char *, const char *);
  * user's home directory when found at the beginning of the path.
  */
 int git_config_pathname(const char **, const char *, const char *);
+int git_config_strbuf_pathname(struct strbuf *, const char *, const char *);
 
 int git_config_expiry_date(timestamp_t *, const char *, const char *);
 int git_config_color(char *, const char *, const char *);
-- 
2.44.0.697.g9b33b46f29

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/4] environment.c: convert excludes_file to struct strbuf
  2024-04-06 14:29 [PATCH 0/4] fix a leak with excludes_file Rubén Justo
  2024-04-06 14:31 ` [PATCH 1/4] path.c: introduce strbuf_interpolate_path Rubén Justo
  2024-04-06 14:32 ` [PATCH 2/4] config.c: introduce git_config_strbuf_pathname Rubén Justo
@ 2024-04-06 14:32 ` Rubén Justo
  2024-04-06 14:32 ` [PATCH 4/4] t7300: mark as leak-free Rubén Justo
  2024-04-06 17:53 ` [PATCH 0/4] fix a leak with excludes_file Junio C Hamano
  4 siblings, 0 replies; 9+ messages in thread
From: Rubén Justo @ 2024-04-06 14:32 UTC (permalink / raw
  To: Git List

Make excludes_file a strbuf, so that we don't have to worry about freeing
it if it is set multiple times.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
---
 config.c      |  2 +-
 dir.c         | 13 +++++++++----
 environment.c |  2 +-
 environment.h |  2 +-
 4 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/config.c b/config.c
index 9beeb63b50..bc64af8efa 100644
--- a/config.c
+++ b/config.c
@@ -1595,7 +1595,7 @@ static int git_default_core_config(const char *var, const char *value,
 		return git_config_string(&askpass_program, var, value);
 
 	if (!strcmp(var, "core.excludesfile"))
-		return git_config_pathname(&excludes_file, var, value);
+		return git_config_strbuf_pathname(&excludes_file, var, value);
 
 	if (!strcmp(var, "core.whitespace")) {
 		if (!value)
diff --git a/dir.c b/dir.c
index 20ebe4cba2..e31ccd8e48 100644
--- a/dir.c
+++ b/dir.c
@@ -3386,10 +3386,15 @@ void setup_standard_excludes(struct dir_struct *dir)
 	dir->exclude_per_dir = ".gitignore";
 
 	/* core.excludesfile defaulting to $XDG_CONFIG_HOME/git/ignore */
-	if (!excludes_file)
-		excludes_file = xdg_config_home("ignore");
-	if (excludes_file && !access_or_warn(excludes_file, R_OK, 0))
-		add_patterns_from_file_1(dir, excludes_file,
+	if (!excludes_file.len) {
+		char *str = xdg_config_home("ignore");
+		if (str) {
+			strbuf_addstr(&excludes_file, str);
+			free(str);
+		}
+	}
+	if (excludes_file.len && !access_or_warn(excludes_file.buf, R_OK, 0))
+		add_patterns_from_file_1(dir, excludes_file.buf,
 					 dir->untracked ? &dir->internal.ss_excludes_file : NULL);
 
 	/* per repository user preference */
diff --git a/environment.c b/environment.c
index a73ba9c12c..b4c66e7153 100644
--- a/environment.c
+++ b/environment.c
@@ -60,7 +60,7 @@ size_t delta_base_cache_limit = 96 * 1024 * 1024;
 unsigned long big_file_threshold = 512 * 1024 * 1024;
 const char *editor_program;
 const char *askpass_program;
-const char *excludes_file;
+struct strbuf excludes_file = STRBUF_INIT;
 enum auto_crlf auto_crlf = AUTO_CRLF_FALSE;
 enum eol core_eol = EOL_UNSET;
 int global_conv_flags_eol = CONV_EOL_RNDTRP_WARN;
diff --git a/environment.h b/environment.h
index 05fd94d7be..114e3dde99 100644
--- a/environment.h
+++ b/environment.h
@@ -222,7 +222,7 @@ extern const char *git_log_output_encoding;
 
 extern const char *editor_program;
 extern const char *askpass_program;
-extern const char *excludes_file;
+extern struct strbuf excludes_file;
 
 /*
  * Should we print an ellipsis after an abbreviated SHA-1 value
-- 
2.44.0.697.g9b33b46f29

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 4/4] t7300: mark as leak-free
  2024-04-06 14:29 [PATCH 0/4] fix a leak with excludes_file Rubén Justo
                   ` (2 preceding siblings ...)
  2024-04-06 14:32 ` [PATCH 3/4] environment.c: convert excludes_file to struct strbuf Rubén Justo
@ 2024-04-06 14:32 ` Rubén Justo
  2024-04-06 17:53 ` [PATCH 0/4] fix a leak with excludes_file Junio C Hamano
  4 siblings, 0 replies; 9+ messages in thread
From: Rubén Justo @ 2024-04-06 14:32 UTC (permalink / raw
  To: Git List

Since the previous commit this test does not leak.

Mark it with TEST_PASSES_SANITIZE_LEAK=true so to make the leak
checkers happy.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
---
 t/t7300-clean.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 1f7201eb60..0aae0dee67 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -5,6 +5,7 @@
 
 test_description='git clean basic tests'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 git config clean.requireForce no
-- 
2.44.0.697.g9b33b46f29

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/4] fix a leak with excludes_file
  2024-04-06 14:29 [PATCH 0/4] fix a leak with excludes_file Rubén Justo
                   ` (3 preceding siblings ...)
  2024-04-06 14:32 ` [PATCH 4/4] t7300: mark as leak-free Rubén Justo
@ 2024-04-06 17:53 ` Junio C Hamano
  2024-04-07 17:48   ` Rubén Justo
  4 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2024-04-06 17:53 UTC (permalink / raw
  To: Rubén Justo; +Cc: Git List

Rubén Justo <rjusto@gmail.com> writes:

> We call twice to git_config(): first to get the main git-branch(1) options,
> and second to get the ones related to the sequencer.

The diagnosis of the control flow that leads to overwriting the
value assigned earlier to "excludes_file" is quite correct, but
I find that your fix is somehow not addressing the real issue.

We used to assume that excludes_file *always* points at a piece of
memory that somebody else owns.  But seeing the two places that
assign to the variable, both actually give the variable a piece of
memory allocated for the caller.

 - The one in config.c that reads core.excludesfile via
   git_config_pathname(), which calls interpolate_path(), whose
   return value is always allocated.

 - The one in dir.c that sets up standard excludes uses
   xdg_config_home() that returns a value obtained from mkpathdup().

The core of the fix ought to be just like this, but to lose the cast
that discards the constness away, the fallouts will be large (if you
remove "const" from the first parameter to git_config_pathname(),
the compiler will tell you where the fallouts are).

We may want to eventually fix the type of the first parameter of the
git_config_pathmame() function, but I somehow doubt that your
approach gets us any closer to that future.  Adding a parallel API
next to the existing git_config_pathname() and interpolate_path()
and convert only these callers that touch excludes_file would not
help other callers that hold the pointer git_config_pathname()
returned.

 config.c         | 4 +++-
 t/t7300-clean.sh | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git c/config.c w/config.c
index eebce8c7e0..ae3652b08f 100644
--- c/config.c
+++ w/config.c
@@ -1584,8 +1584,10 @@ static int git_default_core_config(const char *var, const char *value,
 	if (!strcmp(var, "core.askpass"))
 		return git_config_string(&askpass_program, var, value);
 
-	if (!strcmp(var, "core.excludesfile"))
+	if (!strcmp(var, "core.excludesfile")) {
+		free((char *)excludes_file);
 		return git_config_pathname(&excludes_file, var, value);
+	}
 
 	if (!strcmp(var, "core.whitespace")) {
 		if (!value)
diff --git c/t/t7300-clean.sh w/t/t7300-clean.sh
index 1f7201eb60..0aae0dee67 100755
--- c/t/t7300-clean.sh
+++ w/t/t7300-clean.sh
@@ -5,6 +5,7 @@
 
 test_description='git clean basic tests'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 git config clean.requireForce no

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/4] fix a leak with excludes_file
  2024-04-06 17:53 ` [PATCH 0/4] fix a leak with excludes_file Junio C Hamano
@ 2024-04-07 17:48   ` Rubén Justo
  2024-04-08 17:36     ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Rubén Justo @ 2024-04-07 17:48 UTC (permalink / raw
  To: Junio C Hamano; +Cc: Git List

On Sat, Apr 06, 2024 at 10:53:09AM -0700, Junio C Hamano wrote:
> Rubén Justo <rjusto@gmail.com> writes:
> 
> > We call twice to git_config(): first to get the main git-branch(1) options,
> > and second to get the ones related to the sequencer.

Obviously I meant git-rebase(1) -- or any other command that uses the
sequencer under the hood.

> Adding a parallel API
> next to the existing git_config_pathname() and interpolate_path()
> and convert only these callers that touch excludes_file would not
> help other callers that hold the pointer git_config_pathname()
> returned.

It does not have to be like that.  We may no longer need the current
and problematic git_config_pathname().  However I did not want to go
that far in this series.

>  config.c         | 4 +++-
>  t/t7300-clean.sh | 1 +
>  2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git c/config.c w/config.c
> index eebce8c7e0..ae3652b08f 100644
> --- c/config.c
> +++ w/config.c
> @@ -1584,8 +1584,10 @@ static int git_default_core_config(const char *var, const char *value,
>  	if (!strcmp(var, "core.askpass"))
>  		return git_config_string(&askpass_program, var, value);
>  
> -	if (!strcmp(var, "core.excludesfile"))
> +	if (!strcmp(var, "core.excludesfile")) {
> +		free((char *)excludes_file);

Aaah, you prefer this :-( ...  this free is what I was referring to, in
the message you are replying to, as the simpler fix ...

It obviously plugs the leak and so my itch is gone;  therefore OK by me
to the series I have seen that you have already sent with this.

Still, I find the approach in this series more interesting than the
simple free.  The strbuf saves us from subsequent free+alloc's, for
exactly the same "core.excludesFile".  The strbuf states clearly, IMHO,
who owns the pointer.  And other niceties.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/4] fix a leak with excludes_file
  2024-04-07 17:48   ` Rubén Justo
@ 2024-04-08 17:36     ` Junio C Hamano
  2024-04-08 19:33       ` Rubén Justo
  0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2024-04-08 17:36 UTC (permalink / raw
  To: Rubén Justo; +Cc: Git List

Rubén Justo <rjusto@gmail.com> writes:

> It does not have to be like that.  We may no longer need the current
> and problematic git_config_pathname().  However I did not want to go
> that far in this series.

True, but that one and only true interface we will end up with MUST
NOT be strbuf based one, and that is why I said the patch as posted
will not take us into a better future.

And as a stop-gap measure, ...

>> -	if (!strcmp(var, "core.excludesfile"))
>> +	if (!strcmp(var, "core.excludesfile")) {
>> +		free((char *)excludes_file);
>
> Aaah, you prefer this :-( ...  this free is what I was referring to, in
> the message you are replying to, as the simpler fix ...

... obviously this would be much more preferrable.  And when the one
and only true interface comes, we only need to drop the cast from
around here.

Thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/4] fix a leak with excludes_file
  2024-04-08 17:36     ` Junio C Hamano
@ 2024-04-08 19:33       ` Rubén Justo
  0 siblings, 0 replies; 9+ messages in thread
From: Rubén Justo @ 2024-04-08 19:33 UTC (permalink / raw
  To: Junio C Hamano; +Cc: Git List

On Mon, Apr 08, 2024 at 10:36:30AM -0700, Junio C Hamano wrote:
> Rubén Justo <rjusto@gmail.com> writes:
> 
> > It does not have to be like that.  We may no longer need the current
> > and problematic git_config_pathname().  However I did not want to go
> > that far in this series.
> 
> True, but that one and only true interface we will end up with MUST
> NOT be strbuf based one, and that is why I said the patch as posted
> will not take us into a better future.

It doesn't have to be having a strbuf-based-only interface, either.

It is about introducing an interface where callers:

	- can take advantage of a not-always-allocate interface, and

	- can stop worrying about freeing previous values, when reusing
	  variables.

This latter fixes the leak while the former introduces a nice "if you
set n-times the same value, we'll allocate once".

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-04-08 19:33 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-06 14:29 [PATCH 0/4] fix a leak with excludes_file Rubén Justo
2024-04-06 14:31 ` [PATCH 1/4] path.c: introduce strbuf_interpolate_path Rubén Justo
2024-04-06 14:32 ` [PATCH 2/4] config.c: introduce git_config_strbuf_pathname Rubén Justo
2024-04-06 14:32 ` [PATCH 3/4] environment.c: convert excludes_file to struct strbuf Rubén Justo
2024-04-06 14:32 ` [PATCH 4/4] t7300: mark as leak-free Rubén Justo
2024-04-06 17:53 ` [PATCH 0/4] fix a leak with excludes_file Junio C Hamano
2024-04-07 17:48   ` Rubén Justo
2024-04-08 17:36     ` Junio C Hamano
2024-04-08 19:33       ` Rubén Justo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).