Git Mailing List Archive mirror
 help / color / mirror / Atom feed
* [PATCH] config: do not leak excludes_file
@ 2024-04-06 18:11 Junio C Hamano
  2024-04-06 19:17 ` [WIP] git_config_pathname() leakfix Junio C Hamano
  2024-04-07  0:56 ` [PATCH 0/12] git_config_string() considered harmful Jeff King
  0 siblings, 2 replies; 23+ messages in thread
From: Junio C Hamano @ 2024-04-06 18:11 UTC (permalink / raw
  To: git; +Cc: Rubén Justo

The excludes_file variable is marked "const char *", but all the
assignments to it are made with a piece of memory allocated just
for it, and the variable is responsible for owning it.

When "core.excludesfile" is read, the code just lost the previous
value, leaking memory.  Plug it.

The real problem is that the variable is mistyped; our convention
is to never make a variable that owns the piece of memory pointed
by it as "const".  Fixing that would reduce the chance of this kind
of bug happening, and also would make it unnecessary to cast the
constness away while free()ing it, but that would be a much larger
follow-up effort.

Reported-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 config.c         | 4 +++-
 t/t7300-clean.sh | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/config.c b/config.c
index eebce8c7e0..ae3652b08f 100644
--- a/config.c
+++ b/config.c
@@ -1584,8 +1584,10 @@ static int git_default_core_config(const char *var, const char *value,
 	if (!strcmp(var, "core.askpass"))
 		return git_config_string(&askpass_program, var, value);
 
-	if (!strcmp(var, "core.excludesfile"))
+	if (!strcmp(var, "core.excludesfile")) {
+		free((char *)excludes_file);
 		return git_config_pathname(&excludes_file, var, value);
+	}
 
 	if (!strcmp(var, "core.whitespace")) {
 		if (!value)
diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 1f7201eb60..0aae0dee67 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -5,6 +5,7 @@
 
 test_description='git clean basic tests'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 git config clean.requireForce no
-- 
2.44.0-501-g19981daefd


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [WIP] git_config_pathname() leakfix
  2024-04-06 18:11 [PATCH] config: do not leak excludes_file Junio C Hamano
@ 2024-04-06 19:17 ` Junio C Hamano
  2024-04-07  0:56 ` [PATCH 0/12] git_config_string() considered harmful Jeff King
  1 sibling, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2024-04-06 19:17 UTC (permalink / raw
  To: git

The git_config_pathname() function takes a pointer to "const char *"
that specifies where to store the resulting pathname string, but
always assigns a newly assigned value to it.  The caller has to cast
the constness away if it wants to call free() on it, e.g.,

	const char *str;
	int ret = git_config_pathname(&str, var, value);
	use(str)
	free((const char *)str);

Here is a not-working WIP just to show the scale of fallout from an
exercise to plug leaks of returned value from git_config_pathname().
It revealed a few more issues around memory ownership rules.

 * OPTION_FILENAME of parse-options API assigns the string from the
   command line argument directly to the supplied variable.  The
   variable obviously does not own the value and cannot and does not
   have to free() it.  It however is very common to use the same
   variable to also hold a string read from the config API, via
   git_config_pathname(), which wants the caller to own the string.

   In this sketch, I attempted to introduce OPTION_FILENAME_DUP and
   have the parse-options API to treat the destination variable as
   owning the string (i.e. free the existing contents before
   assigning the value from the command line, and xstrdup() the
   string from the command line option), but I am sure I missed
   some (which will lead to segfault when a caller later tries
   to free() it before calling git_config_pathname()) and/or 
   converted too many (which will lead to leaks when a caller
   later does not free() it).

 * http.c has a private set_from_env() helper function, which lets
   the callers to borrow memory from the environment.  But some
   variables that receive value from this mechanism also receive
   pathnames from configuration via git_config_pathname().

   In this sketch, I attempted to solve it by introducing
   dup_from_env() and changed callers that wants to overwrite the
   variable with a value obtained from git_config_pathname().

Some readers might wonder if it is easier to go in the other
direction, i.e. allowing cllaers of git_config_pathname() to
borrow and not worry about freeing, but this is fundamentally
impossible as expanding ~/.gitconfig into /home/jch/.gitconfig
and the like is the central part of its function.

In any case, I am not going to finish this soonish, but I think I
found some existing leaks that can be fixed _without_ going through
the whole nine yards, so perhaps I'll see if I can salvage small
changes like that as "preliminary clean-up".

 builtin/blame.c        |  3 ++-
 builtin/commit.c       |  4 ++--
 builtin/config.c       |  4 ++--
 builtin/log.c          |  4 ++--
 builtin/receive-pack.c |  4 ++--
 config.c               |  8 ++++----
 config.h               |  8 ++++----
 diff.c                 |  2 +-
 environment.c          |  6 +++---
 environment.h          |  6 +++---
 fetch-pack.c           |  4 ++--
 fsck.c                 |  4 ++--
 fsmonitor-settings.c   |  8 ++++++--
 gpg-interface.c        |  3 ++-
 http.c                 | 29 +++++++++++++++++++----------
 mailmap.c              |  2 +-
 mailmap.h              |  2 +-
 parse-options.c        | 14 ++++++++++++--
 parse-options.h        | 11 ++++++++++-
 setup.c                |  8 ++++----
 20 files changed, 84 insertions(+), 50 deletions(-)

diff --git c/builtin/blame.c w/builtin/blame.c
index db1f56de61..23810323e6 100644
--- c/builtin/blame.c
+++ w/builtin/blame.c
@@ -718,13 +718,14 @@ static int git_blame_config(const char *var, const char *value,
 		return 0;
 	}
 	if (!strcmp(var, "blame.ignorerevsfile")) {
-		const char *str;
+		char *str;
 		int ret;
 
 		ret = git_config_pathname(&str, var, value);
 		if (ret)
 			return ret;
 		string_list_insert(&ignore_revs_file_list, str);
+		free(str);
 		return 0;
 	}
 	if (!strcmp(var, "blame.markunblamablelines")) {
diff --git c/builtin/commit.c w/builtin/commit.c
index 7ba7201cfb..a35c524458 100644
--- c/builtin/commit.c
+++ w/builtin/commit.c
@@ -107,7 +107,7 @@ static enum {
 } commit_style;
 
 static const char *logfile, *force_author;
-static const char *template_file;
+static char *template_file;
 /*
  * The _message variables are commit names from which to take
  * the commit message and/or authorship.
@@ -1319,7 +1319,7 @@ static int parse_and_validate_options(int argc, const char *argv[],
 				  !!use_message, "-C",
 				  !!logfile, "-F");
 	if (use_message || edit_message || logfile ||fixup_message || have_option_m)
-		template_file = NULL;
+		FREE_AND_NULL(template_file);
 	if (edit_message)
 		use_message = edit_message;
 	if (amend && !use_message && !fixup_message)
diff --git c/builtin/config.c w/builtin/config.c
index 0015620dde..739d3beb57 100644
--- c/builtin/config.c
+++ w/builtin/config.c
@@ -282,11 +282,11 @@ static int format_config(struct strbuf *buf, const char *key_,
 			else
 				strbuf_addstr(buf, v ? "true" : "false");
 		} else if (type == TYPE_PATH) {
-			const char *v;
+			char *v;
 			if (git_config_pathname(&v, key_, value_) < 0)
 				return -1;
 			strbuf_addstr(buf, v);
-			free((char *)v);
+			free(v);
 		} else if (type == TYPE_EXPIRY_DATE) {
 			timestamp_t t;
 			if (git_config_expiry_date(&t, key_, value_) < 0)
diff --git c/builtin/log.c w/builtin/log.c
index c0a8bb95e9..e90d3f13e7 100644
--- c/builtin/log.c
+++ w/builtin/log.c
@@ -957,7 +957,7 @@ static int do_signoff;
 static enum auto_base_setting auto_base;
 static char *from;
 static const char *signature = git_version_string;
-static const char *signature_file;
+static char *signature_file;
 static enum cover_setting config_cover_letter;
 static const char *config_output_directory;
 static enum cover_from_description cover_from_description_mode = COVER_FROM_MESSAGE;
@@ -1981,7 +1981,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 		OPT_CALLBACK_F(0, "base", &base_commit, N_("base-commit"),
 			       N_("add prerequisite tree info to the patch series"),
 			       0, base_callback),
-		OPT_FILENAME(0, "signature-file", &signature_file,
+		OPT_FILENAME_DUP(0, "signature-file", &signature_file,
 				N_("add a signature from a file")),
 		OPT__QUIET(&quiet, N_("don't print the patch filenames")),
 		OPT_BOOL(0, "progress", &show_progress,
diff --git c/builtin/receive-pack.c w/builtin/receive-pack.c
index 56d8a77ed7..f5e55bad51 100644
--- c/builtin/receive-pack.c
+++ w/builtin/receive-pack.c
@@ -168,13 +168,13 @@ static int receive_pack_config(const char *var, const char *value,
 	}
 
 	if (strcmp(var, "receive.fsck.skiplist") == 0) {
-		const char *path;
+		char *path;
 
 		if (git_config_pathname(&path, var, value))
 			return 1;
 		strbuf_addf(&fsck_msg_types, "%cskiplist=%s",
 			fsck_msg_types.len ? ',' : '=', path);
-		free((char *)path);
+		free(path);
 		return 0;
 	}
 
diff --git c/config.c w/config.c
index ae3652b08f..683acbe235 100644
--- c/config.c
+++ w/config.c
@@ -1345,7 +1345,7 @@ int git_config_string(const char **dest, const char *var, const char *value)
 	return 0;
 }
 
-int git_config_pathname(const char **dest, const char *var, const char *value)
+int git_config_pathname(char **dest, const char *var, const char *value)
 {
 	if (!value)
 		return config_error_nonbool(var);
@@ -2482,7 +2482,7 @@ int git_configset_get_maybe_bool(struct config_set *set, const char *key, int *d
 		return 1;
 }
 
-int git_configset_get_pathname(struct config_set *set, const char *key, const char **dest)
+int git_configset_get_pathname(struct config_set *set, const char *key, char **dest)
 {
 	const char *value;
 	if (!git_configset_get_value(set, key, &value, NULL))
@@ -2627,7 +2627,7 @@ int repo_config_get_maybe_bool(struct repository *repo,
 }
 
 int repo_config_get_pathname(struct repository *repo,
-			     const char *key, const char **dest)
+			     const char *key, char **dest)
 {
 	int ret;
 	git_config_check_init(repo);
@@ -2726,7 +2726,7 @@ int git_config_get_maybe_bool(const char *key, int *dest)
 	return repo_config_get_maybe_bool(the_repository, key, dest);
 }
 
-int git_config_get_pathname(const char *key, const char **dest)
+int git_config_get_pathname(const char *key, char **dest)
 {
 	return repo_config_get_pathname(the_repository, key, dest);
 }
diff --git c/config.h w/config.h
index f4966e3749..868057c77f 100644
--- c/config.h
+++ w/config.h
@@ -286,7 +286,7 @@ int git_config_string(const char **, const char *, const char *);
  * Similar to `git_config_string`, but expands `~` or `~user` into the
  * user's home directory when found at the beginning of the path.
  */
-int git_config_pathname(const char **, const char *, const char *);
+int git_config_pathname(char **, const char *, const char *);
 
 int git_config_expiry_date(timestamp_t *, const char *, const char *);
 int git_config_color(char *, const char *, const char *);
@@ -541,7 +541,7 @@ int git_configset_get_ulong(struct config_set *cs, const char *key, unsigned lon
 int git_configset_get_bool(struct config_set *cs, const char *key, int *dest);
 int git_configset_get_bool_or_int(struct config_set *cs, const char *key, int *is_bool, int *dest);
 int git_configset_get_maybe_bool(struct config_set *cs, const char *key, int *dest);
-int git_configset_get_pathname(struct config_set *cs, const char *key, const char **dest);
+int git_configset_get_pathname(struct config_set *cs, const char *key, char **dest);
 
 /* Functions for reading a repository's config */
 struct repository;
@@ -577,7 +577,7 @@ int repo_config_get_bool_or_int(struct repository *repo,
 int repo_config_get_maybe_bool(struct repository *repo,
 			       const char *key, int *dest);
 int repo_config_get_pathname(struct repository *repo,
-			     const char *key, const char **dest);
+			     const char *key, char **dest);
 
 /*
  * Functions for reading protected config. By definition, protected
@@ -687,7 +687,7 @@ int git_config_get_maybe_bool(const char *key, int *dest);
  * Similar to `git_config_get_string`, but expands `~` or `~user` into
  * the user's home directory when found at the beginning of the path.
  */
-int git_config_get_pathname(const char *key, const char **dest);
+int git_config_get_pathname(const char *key, char **dest);
 
 int git_config_get_index_threads(int *dest);
 int git_config_get_split_index(void);
diff --git c/diff.c w/diff.c
index 108c187577..e8754888f4 100644
--- c/diff.c
+++ w/diff.c
@@ -58,7 +58,7 @@ static int diff_context_default = 3;
 static int diff_interhunk_context_default;
 static const char *diff_word_regex_cfg;
 static const char *external_diff_cmd_cfg;
-static const char *diff_order_file_cfg;
+static char *diff_order_file_cfg;
 int diff_auto_refresh_index = 1;
 static int diff_mnemonic_prefix;
 static int diff_no_prefix;
diff --git c/environment.c w/environment.c
index a73ba9c12c..279ea3fd5e 100644
--- c/environment.c
+++ w/environment.c
@@ -46,8 +46,8 @@ const char *git_commit_encoding;
 const char *git_log_output_encoding;
 char *apply_default_whitespace;
 char *apply_default_ignorewhitespace;
-const char *git_attributes_file;
-const char *git_hooks_path;
+char *git_attributes_file;
+char *git_hooks_path;
 int zlib_compression_level = Z_BEST_SPEED;
 int pack_compression_level = Z_DEFAULT_COMPRESSION;
 int fsync_object_files = -1;
@@ -60,7 +60,7 @@ size_t delta_base_cache_limit = 96 * 1024 * 1024;
 unsigned long big_file_threshold = 512 * 1024 * 1024;
 const char *editor_program;
 const char *askpass_program;
-const char *excludes_file;
+char *excludes_file;
 enum auto_crlf auto_crlf = AUTO_CRLF_FALSE;
 enum eol core_eol = EOL_UNSET;
 int global_conv_flags_eol = CONV_EOL_RNDTRP_WARN;
diff --git c/environment.h w/environment.h
index 05fd94d7be..c0e0c8b1f9 100644
--- c/environment.h
+++ w/environment.h
@@ -124,8 +124,8 @@ extern int warn_ambiguous_refs;
 extern int warn_on_object_refname_ambiguity;
 extern char *apply_default_whitespace;
 extern char *apply_default_ignorewhitespace;
-extern const char *git_attributes_file;
-extern const char *git_hooks_path;
+extern char *git_attributes_file;
+extern char *git_hooks_path;
 extern int zlib_compression_level;
 extern int pack_compression_level;
 extern size_t packed_git_window_size;
@@ -222,7 +222,7 @@ extern const char *git_log_output_encoding;
 
 extern const char *editor_program;
 extern const char *askpass_program;
-extern const char *excludes_file;
+extern char *excludes_file;
 
 /*
  * Should we print an ellipsis after an abbreviated SHA-1 value
diff --git c/fetch-pack.c w/fetch-pack.c
index 091f9a80a9..f7579cb3bc 100644
--- c/fetch-pack.c
+++ w/fetch-pack.c
@@ -1863,13 +1863,13 @@ static int fetch_pack_config_cb(const char *var, const char *value,
 	const char *msg_id;
 
 	if (strcmp(var, "fetch.fsck.skiplist") == 0) {
-		const char *path;
+		char *path;
 
 		if (git_config_pathname(&path, var, value))
 			return 1;
 		strbuf_addf(&fsck_msg_types, "%cskiplist=%s",
 			fsck_msg_types.len ? ',' : '=', path);
-		free((char *)path);
+		free(path);
 		return 0;
 	}
 
diff --git c/fsck.c w/fsck.c
index 78af29d264..dd0a33028e 100644
--- c/fsck.c
+++ w/fsck.c
@@ -1274,13 +1274,13 @@ int git_fsck_config(const char *var, const char *value,
 	const char *msg_id;
 
 	if (strcmp(var, "fsck.skiplist") == 0) {
-		const char *path;
+		char *path;
 		struct strbuf sb = STRBUF_INIT;
 
 		if (git_config_pathname(&path, var, value))
 			return 1;
 		strbuf_addf(&sb, "skiplist=%s", path);
-		free((char *)path);
+		free(path);
 		fsck_set_msg_types(options, sb.buf);
 		strbuf_release(&sb);
 		return 0;
diff --git c/fsmonitor-settings.c w/fsmonitor-settings.c
index a6a9e6bc19..20e9347907 100644
--- c/fsmonitor-settings.c
+++ w/fsmonitor-settings.c
@@ -129,8 +129,12 @@ static void lookup_fsmonitor_settings(struct repository *r)
 		break;
 
 	case -1: /* config value set to an arbitrary string */
-		if (repo_config_get_pathname(r, "core.fsmonitor", &const_str))
-			return; /* should not happen */
+		{
+			char *str;
+			if (repo_config_get_pathname(r, "core.fsmonitor", &str))
+				return; /* should not happen */
+			const_str = str;
+		}
 		break;
 
 	default: /* should not happen */
diff --git c/gpg-interface.c w/gpg-interface.c
index b5993385ff..87e214c76e 100644
--- c/gpg-interface.c
+++ w/gpg-interface.c
@@ -27,7 +27,8 @@ static void gpg_interface_lazy_init(void)
 }
 
 static char *configured_signing_key;
-static const char *ssh_default_key_command, *ssh_allowed_signers, *ssh_revocation_file;
+static const char *ssh_default_key_command;
+static char *ssh_allowed_signers, *ssh_revocation_file;
 static enum signature_trust_level configured_min_trust_level = TRUST_UNDEFINED;
 
 struct gpg_format {
diff --git c/http.c w/http.c
index e73b136e58..b03a9a169d 100644
--- c/http.c
+++ w/http.c
@@ -39,7 +39,7 @@ char curl_errorstr[CURL_ERROR_SIZE];
 static int curl_ssl_verify = -1;
 static int curl_ssl_try;
 static const char *curl_http_version = NULL;
-static const char *ssl_cert;
+static char *ssl_cert;
 static const char *ssl_cert_type;
 static const char *ssl_cipherlist;
 static const char *ssl_version;
@@ -59,14 +59,14 @@ static struct {
 	{ "tlsv1.3", CURL_SSLVERSION_TLSv1_3 },
 #endif
 };
-static const char *ssl_key;
+static char *ssl_key;
 static const char *ssl_key_type;
-static const char *ssl_capath;
+static char *ssl_capath;
 static const char *curl_no_proxy;
 #ifdef GIT_CURL_HAVE_CURLOPT_PINNEDPUBLICKEY
-static const char *ssl_pinnedkey;
+static char *ssl_pinnedkey;
 #endif
-static const char *ssl_cainfo;
+static char *ssl_cainfo;
 static long curl_low_speed_limit = -1;
 static long curl_low_speed_time = -1;
 static int curl_ftp_no_epsv;
@@ -108,7 +108,7 @@ static struct {
 
 static struct credential proxy_auth = CREDENTIAL_INIT;
 static const char *curl_proxyuserpwd;
-static const char *curl_cookie_file;
+static char *curl_cookie_file;
 static int curl_save_cookies;
 struct credential http_auth = CREDENTIAL_INIT;
 static int http_proactive_auth;
@@ -1215,6 +1215,15 @@ static void set_from_env(const char **var, const char *envname)
 		*var = val;
 }
 
+static void dup_from_env(char **var, const char *envname)
+{
+	const char *val = getenv(envname);
+	if (val) {
+		free(*var);
+		*var = strdup(val);
+	}
+}
+
 void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
 	char *low_speed_limit;
@@ -1291,12 +1300,12 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 	if (getenv("GIT_SSL_NO_VERIFY"))
 		curl_ssl_verify = 0;
 
-	set_from_env(&ssl_cert, "GIT_SSL_CERT");
+	dup_from_env(&ssl_cert, "GIT_SSL_CERT");
 	set_from_env(&ssl_cert_type, "GIT_SSL_CERT_TYPE");
-	set_from_env(&ssl_key, "GIT_SSL_KEY");
+	dup_from_env(&ssl_key, "GIT_SSL_KEY");
 	set_from_env(&ssl_key_type, "GIT_SSL_KEY_TYPE");
-	set_from_env(&ssl_capath, "GIT_SSL_CAPATH");
-	set_from_env(&ssl_cainfo, "GIT_SSL_CAINFO");
+	dup_from_env(&ssl_capath, "GIT_SSL_CAPATH");
+	dup_from_env(&ssl_cainfo, "GIT_SSL_CAINFO");
 
 	set_from_env(&user_agent, "GIT_HTTP_USER_AGENT");
 
diff --git c/mailmap.c w/mailmap.c
index 3d6a5e9400..044466b043 100644
--- c/mailmap.c
+++ w/mailmap.c
@@ -6,7 +6,7 @@
 #include "object-store-ll.h"
 #include "setup.h"
 
-const char *git_mailmap_file;
+char *git_mailmap_file;
 const char *git_mailmap_blob;
 
 struct mailmap_info {
diff --git c/mailmap.h w/mailmap.h
index 0f8fd2c586..429a760945 100644
--- c/mailmap.h
+++ w/mailmap.h
@@ -3,7 +3,7 @@
 
 struct string_list;
 
-extern const char *git_mailmap_file;
+extern char *git_mailmap_file;
 extern const char *git_mailmap_blob;
 
 int read_mailmap(struct string_list *map);
diff --git c/parse-options.c w/parse-options.c
index 30b9e68f8a..94c1e52b60 100644
--- c/parse-options.c
+++ w/parse-options.c
@@ -64,8 +64,11 @@ static void fix_filename(const char *prefix, char **file)
 {
 	if (!file || !*file)
 		; /* leave as NULL */
-	else
-		*file = prefix_filename_except_for_dash(prefix, *file);
+	else {
+		char *tmp = prefix_filename_except_for_dash(prefix, *file);
+		free(*file);
+		*file = tmp;
+	}
 }
 
 static enum parse_opt_result do_get_value(struct parse_opt_ctx_t *p,
@@ -128,6 +131,9 @@ static enum parse_opt_result do_get_value(struct parse_opt_ctx_t *p,
 			return get_arg(p, opt, flags, (const char **)opt->value);
 		return 0;
 
+	case OPTION_FILENAME_DUP:
+		FREE_AND_NULL(*(char **)opt->value);
+		/* fallthru */
 	case OPTION_FILENAME:
 		err = 0;
 		if (unset)
@@ -137,6 +143,8 @@ static enum parse_opt_result do_get_value(struct parse_opt_ctx_t *p,
 		else
 			err = get_arg(p, opt, flags, (const char **)opt->value);
 
+		if (opt->type == OPTION_FILENAME_DUP)
+			*(char **)opt->value = xstrdup_or_null(*(const char **)opt->value);
 		if (!err)
 			fix_filename(p->prefix, (char **)opt->value);
 		return err;
@@ -649,6 +657,7 @@ static void show_negated_gitcomp(const struct option *opts, int show_all,
 		switch (opts->type) {
 		case OPTION_STRING:
 		case OPTION_FILENAME:
+		case OPTION_FILENAME_DUP:
 		case OPTION_INTEGER:
 		case OPTION_MAGNITUDE:
 		case OPTION_CALLBACK:
@@ -701,6 +710,7 @@ static int show_gitcomp(const struct option *opts, int show_all)
 			continue;
 		case OPTION_STRING:
 		case OPTION_FILENAME:
+		case OPTION_FILENAME_DUP:
 		case OPTION_INTEGER:
 		case OPTION_MAGNITUDE:
 		case OPTION_CALLBACK:
diff --git c/parse-options.h w/parse-options.h
index bd62e20268..e78f72f986 100644
--- c/parse-options.h
+++ w/parse-options.h
@@ -26,7 +26,8 @@ enum parse_opt_type {
 	OPTION_MAGNITUDE,
 	OPTION_CALLBACK,
 	OPTION_LOWLEVEL_CALLBACK,
-	OPTION_FILENAME
+	OPTION_FILENAME,
+	OPTION_FILENAME_DUP
 };
 
 enum parse_opt_flags {
@@ -330,6 +331,14 @@ struct option {
 	.argh = N_("file"), \
 	.help = (h), \
 }
+#define OPT_FILENAME_DUP(s, l, v, h) { \
+	.type = OPTION_FILENAME_DUP, \
+	.short_name = (s), \
+	.long_name = (l), \
+	.value = (v), \
+	.argh = N_("file"), \
+	.help = (h), \
+}
 #define OPT_COLOR_FLAG(s, l, v, h) { \
 	.type = OPTION_CALLBACK, \
 	.short_name = (s), \
diff --git c/setup.c w/setup.c
index f4b32f76e3..d99856ac81 100644
--- c/setup.c
+++ w/setup.c
@@ -1176,13 +1176,13 @@ static int safe_directory_cb(const char *key, const char *value,
 	} else if (!strcmp(value, "*")) {
 		data->is_safe = 1;
 	} else {
-		const char *interpolated = NULL;
+		char *interpolated = NULL;
 
 		if (!git_config_pathname(&interpolated, key, value) &&
 		    !fspathcmp(data->path, interpolated ? interpolated : value))
 			data->is_safe = 1;
 
-		free((char *)interpolated);
+		free(interpolated);
 	}
 
 	return 0;
@@ -2023,7 +2023,7 @@ static int create_default_files(const char *template_path,
 	char *path;
 	int reinit;
 	int filemode;
-	const char *init_template_dir = NULL;
+	char *init_template_dir = NULL;
 	const char *work_tree = get_git_work_tree();
 
 	/*
@@ -2037,7 +2037,7 @@ static int create_default_files(const char *template_path,
 	 */
 	git_config_get_pathname("init.templatedir", &init_template_dir);
 	copy_templates(template_path, init_template_dir);
-	free((char *)init_template_dir);
+	free(init_template_dir);
 	git_config_clear();
 	reset_shared_repository();
 	git_config(git_default_config, NULL);

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 0/12] git_config_string() considered harmful
  2024-04-06 18:11 [PATCH] config: do not leak excludes_file Junio C Hamano
  2024-04-06 19:17 ` [WIP] git_config_pathname() leakfix Junio C Hamano
@ 2024-04-07  0:56 ` Jeff King
  2024-04-07  0:58   ` [PATCH 01/11] config: make sequencer.c's git_config_string_dup() public Jeff King
                     ` (12 more replies)
  1 sibling, 13 replies; 23+ messages in thread
From: Jeff King @ 2024-04-07  0:56 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, Rubén Justo

On Sat, Apr 06, 2024 at 11:11:12AM -0700, Junio C Hamano wrote:

> The excludes_file variable is marked "const char *", but all the
> assignments to it are made with a piece of memory allocated just
> for it, and the variable is responsible for owning it.
> 
> When "core.excludesfile" is read, the code just lost the previous
> value, leaking memory.  Plug it.
> 
> The real problem is that the variable is mistyped; our convention
> is to never make a variable that owns the piece of memory pointed
> by it as "const".  Fixing that would reduce the chance of this kind
> of bug happening, and also would make it unnecessary to cast the
> constness away while free()ing it, but that would be a much larger
> follow-up effort.

As you noticed in your follow-up, this is just the tip of the iceberg.
And it's not just git_config_pathname(), but really git_config_string(),
and it is a potential problem for almost every call.

I have a series that I started a few months ago to try to improve this,
but I never sent it in because I didn't have a good solution for the
long tail of variables where we assign a string literal as the default.

But that doesn't mean we can't incrementally make things better. So I
polished it up a bit, and will send the result in a minute.

Looking at your sketch, I think I glossed over the parse-options
OPT_FILENAME_DUP() issue. In practice it's OK because we wouldn't
generally re-read the config after parsing the options. But leaving it
does seem rather ugly, and your solution looks reasonable. I'm not sure
if there's an easy way to get the compiler to point to spots which need
it; the type information is all lost when parse-options passes
everything through a void pointer.

(I remember a while ago looking at retaining type annotations for
parse-options; this could be another use case for that).

I think it would also be useful if we could enable -Wwrite-strings to
catch cases where string literals are assigned to non-const pointers.
But there's some cleanup/refactoring to get that to compile cleanly.

  [01/11]: config: make sequencer.c's git_config_string_dup() public
  [02/11]: config: add git_config_pathname_dup()
  [03/11]: config: prefer git_config_string_dup() for temp variables
  [04/11]: config: use git_config_string_dup() for open-coded equivalents
  [05/11]: config: use git_config_string_dup() to fix leaky open coding
  [06/11]: config: use git_config_string() in easy cases
  [07/11]: config: use git_config_pathname_dup() in easy cases
  [08/11]: http: use git_config_string_dup()
  [09/11]: merge: use git_config_string_dup() for pull strategies
  [10/11]: userdiff: use git_config_string_dup() when we can
  [11/11]: blame: use "dup" string_list for ignore-revs files

 alias.c                |   3 +-
 archive-tar.c          |  10 ++--
 attr.c                 |   2 +-
 attr.h                 |   2 +-
 builtin/blame.c        |   7 +--
 builtin/commit.c       |   8 ++--
 builtin/config.c       |   6 +--
 builtin/help.c         |   7 +--
 builtin/log.c          |  16 +++----
 builtin/merge.c        |  12 ++---
 builtin/receive-pack.c |  10 ++--
 builtin/repack.c       |  16 +++----
 compat/mingw.c         |   7 +--
 config.c               |  48 ++++++++++++-------
 config.h               |  19 ++++++++
 convert.c              |  12 ++---
 delta-islands.c        |   4 +-
 diff.c                 |  12 ++---
 environment.c          |  14 +++---
 environment.h          |  14 +++---
 fetch-pack.c           |   6 +--
 fsck.c                 |   6 +--
 gpg-interface.c        |   9 ++--
 http.c                 | 105 +++++++++++++++++++----------------------
 imap-send.c            |  20 ++++----
 mailmap.c              |   4 +-
 mailmap.h              |   4 +-
 merge-ll.c             |  17 +++----
 pager.c                |   4 +-
 promisor-remote.c      |   2 +-
 promisor-remote.h      |   2 +-
 remote.c               |  45 +++++++++---------
 remote.h               |   8 ++--
 sequencer.c            |  12 +----
 setup.c                |  11 ++---
 upload-pack.c          |   4 +-
 userdiff.c             |   6 +--
 userdiff.h             |   6 +--
 38 files changed, 251 insertions(+), 249 deletions(-)

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 01/11] config: make sequencer.c's git_config_string_dup() public
  2024-04-07  0:56 ` [PATCH 0/12] git_config_string() considered harmful Jeff King
@ 2024-04-07  0:58   ` Jeff King
  2024-04-07  1:00   ` [PATCH 02/11] config: add git_config_pathname_dup() Jeff King
                     ` (11 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2024-04-07  0:58 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, Rubén Justo

Just about every caller of git_config_string() has a possible leak in
it: if we parse a config variable twice, it will overwrite the pointer
that was allocated the first time, leaving the memory unreferenced.

Unfortunately we can't just fix this directly in git_config_string().
Some callers do something like:

   const char *foo = "default_value";
   ...
   git_config_string(&foo, var, value);

And we must _not_ free that initial value, as it's a string literal. We
can't help those cases easily, as there's no way to distinguish a
heap-allocated variable from the default one. But let's start by at
least providing a helper that avoids the leak. That will let us convert
some cases right away, and give us one potential path forward for the
more complex ones.

It turns out we already have such a helper, courtesy of 03a4e260e2
(sequencer: plug memory leaks for the option values, 2016-10-21). The
problem is more acute in sequencer.c, which may load config multiple
times. Hence the solution was limited to that file back then. But this
really is a more general problem within our config callbacks.

Note that the new helper takes a "char *" rather than a const pointer.
This is more appropriate, since we tend to use "const" as a signal for
a lack of memory ownership (and this function is most certainly
asserting ownership over the pointed-to memory).

Signed-off-by: Jeff King <peff@peff.net>
---
 config.c    |  9 +++++++++
 config.h    | 12 ++++++++++++
 sequencer.c | 10 ----------
 3 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/config.c b/config.c
index eebce8c7e0..2194fb078a 100644
--- a/config.c
+++ b/config.c
@@ -1345,6 +1345,15 @@ int git_config_string(const char **dest, const char *var, const char *value)
 	return 0;
 }
 
+int git_config_string_dup(char **dest, const char *var, const char *value)
+{
+	if (!value)
+		return config_error_nonbool(var);
+	free(*dest);
+	*dest = xstrdup(value);
+	return 0;
+}
+
 int git_config_pathname(const char **dest, const char *var, const char *value)
 {
 	if (!value)
diff --git a/config.h b/config.h
index f4966e3749..cdffc14ccf 100644
--- a/config.h
+++ b/config.h
@@ -279,9 +279,21 @@ int git_config_bool(const char *, const char *);
 /**
  * Allocates and copies the value string into the `dest` parameter; if no
  * string is given, prints an error message and returns -1.
+ *
+ * Note that this function does _not_ free the memory referenced by the
+ * destination pointer. This makes it safe to use on a variable that initially
+ * points to a string literal, but it also means that it leaks if the config
+ * option is seen multiple times.
  */
 int git_config_string(const char **, const char *, const char *);
 
+/**
+ * Like git_config_string(), but frees any previously-allocated
+ * string at the destination pointer, avoiding a leak when a
+ * config variable is seen multiple times.
+ */
+int git_config_string_dup(char **, const char *, const char *);
+
 /**
  * Similar to `git_config_string`, but expands `~` or `~user` into the
  * user's home directory when found at the beginning of the path.
diff --git a/sequencer.c b/sequencer.c
index 2c19846385..3e5d82e0e5 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2920,16 +2920,6 @@ static int read_populate_todo(struct repository *r,
 	return 0;
 }
 
-static int git_config_string_dup(char **dest,
-				 const char *var, const char *value)
-{
-	if (!value)
-		return config_error_nonbool(var);
-	free(*dest);
-	*dest = xstrdup(value);
-	return 0;
-}
-
 static int populate_opts_cb(const char *key, const char *value,
 			    const struct config_context *ctx,
 			    void *data)
-- 
2.44.0.872.g288abe5b5b


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 02/11] config: add git_config_pathname_dup()
  2024-04-07  0:56 ` [PATCH 0/12] git_config_string() considered harmful Jeff King
  2024-04-07  0:58   ` [PATCH 01/11] config: make sequencer.c's git_config_string_dup() public Jeff King
@ 2024-04-07  1:00   ` Jeff King
  2024-04-07  1:00   ` [PATCH 03/11] config: prefer git_config_string_dup() for temp variables Jeff King
                     ` (10 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2024-04-07  1:00 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, Rubén Justo

The git_config_pathname() function suffers the same potential leak issue
as git_config_string(), since it is basically the same thing but with
the added twist of interpolating the path rather than just duplicating
the value.

Let's provide a similar "dup()" variant to help call sites transition to
using the leak-free variant.

Signed-off-by: Jeff King <peff@peff.net>
---
 config.c | 11 +++++++++++
 config.h |  7 +++++++
 2 files changed, 18 insertions(+)

diff --git a/config.c b/config.c
index 2194fb078a..a0aa45abd5 100644
--- a/config.c
+++ b/config.c
@@ -1364,6 +1364,17 @@ int git_config_pathname(const char **dest, const char *var, const char *value)
 	return 0;
 }
 
+int git_config_pathname_dup(char **dest, const char *var, const char *value)
+{
+	if (!value)
+		return config_error_nonbool(var);
+	free(*dest);
+	*dest = interpolate_path(value, 0);
+	if (!*dest)
+		die(_("failed to expand user dir in: '%s'"), value);
+	return 0;
+}
+
 int git_config_expiry_date(timestamp_t *timestamp, const char *var, const char *value)
 {
 	if (!value)
diff --git a/config.h b/config.h
index cdffc14ccf..fed21d3144 100644
--- a/config.h
+++ b/config.h
@@ -300,6 +300,13 @@ int git_config_string_dup(char **, const char *, const char *);
  */
 int git_config_pathname(const char **, const char *, const char *);
 
+/**
+ * Like git_config_pathname(), but frees any previously-allocated
+ * string at the destination pointer, avoiding a leak when a
+ * config variable is seen multiple times.
+ */
+int git_config_pathname_dup(char **, const char *, const char *);
+
 int git_config_expiry_date(timestamp_t *, const char *, const char *);
 int git_config_color(char *, const char *, const char *);
 int git_config_set_in_file_gently(const char *, const char *, const char *, const char *);
-- 
2.44.0.872.g288abe5b5b


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 03/11] config: prefer git_config_string_dup() for temp variables
  2024-04-07  0:56 ` [PATCH 0/12] git_config_string() considered harmful Jeff King
  2024-04-07  0:58   ` [PATCH 01/11] config: make sequencer.c's git_config_string_dup() public Jeff King
  2024-04-07  1:00   ` [PATCH 02/11] config: add git_config_pathname_dup() Jeff King
@ 2024-04-07  1:00   ` Jeff King
  2024-04-07  1:50     ` Eric Sunshine
  2024-04-07  1:01   ` [PATCH 04/11] config: use git_config_string_dup() for open-coded equivalents Jeff King
                     ` (9 subsequent siblings)
  12 siblings, 1 reply; 23+ messages in thread
From: Jeff King @ 2024-04-07  1:00 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, Rubén Justo

In some cases we may use git_config_string() or git_config_pathname() to
read a value into a temporary variable, and then either hand off memory
ownership of the new variable or free it. These are not potential leaks,
since we know that there is no previous value we are overwriting.

However, it's worth converting these to use git_config_string_dup() and
git_config_pathname_dup(). It makes it easier to audit for leaky cases,
and possibly we can get rid of the leak-prone functions in the future.
Plus it lets the const-ness of our variables match their expected memory
ownership, which avoids some casts when calling free().

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/blame.c        |  4 ++--
 builtin/config.c       |  6 +++---
 builtin/receive-pack.c |  6 +++---
 fetch-pack.c           |  6 +++---
 fsck.c                 |  6 +++---
 remote.c               | 28 ++++++++++++++--------------
 setup.c                |  6 +++---
 7 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index db1f56de61..0b07ceb110 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -718,10 +718,10 @@ static int git_blame_config(const char *var, const char *value,
 		return 0;
 	}
 	if (!strcmp(var, "blame.ignorerevsfile")) {
-		const char *str;
+		char *str = NULL;
 		int ret;
 
-		ret = git_config_pathname(&str, var, value);
+		ret = git_config_pathname_dup(&str, var, value);
 		if (ret)
 			return ret;
 		string_list_insert(&ignore_revs_file_list, str);
diff --git a/builtin/config.c b/builtin/config.c
index 0015620dde..fc5d96bb4c 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -282,11 +282,11 @@ static int format_config(struct strbuf *buf, const char *key_,
 			else
 				strbuf_addstr(buf, v ? "true" : "false");
 		} else if (type == TYPE_PATH) {
-			const char *v;
-			if (git_config_pathname(&v, key_, value_) < 0)
+			char *v = NULL;
+			if (git_config_pathname_dup(&v, key_, value_) < 0)
 				return -1;
 			strbuf_addstr(buf, v);
-			free((char *)v);
+			free(v);
 		} else if (type == TYPE_EXPIRY_DATE) {
 			timestamp_t t;
 			if (git_config_expiry_date(&t, key_, value_) < 0)
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 56d8a77ed7..15ed81a3f6 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -168,13 +168,13 @@ static int receive_pack_config(const char *var, const char *value,
 	}
 
 	if (strcmp(var, "receive.fsck.skiplist") == 0) {
-		const char *path;
+		char *path = NULL;
 
-		if (git_config_pathname(&path, var, value))
+		if (git_config_pathname_dup(&path, var, value))
 			return 1;
 		strbuf_addf(&fsck_msg_types, "%cskiplist=%s",
 			fsck_msg_types.len ? ',' : '=', path);
-		free((char *)path);
+		free(path);
 		return 0;
 	}
 
diff --git a/fetch-pack.c b/fetch-pack.c
index 091f9a80a9..fd59c497b4 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1863,13 +1863,13 @@ static int fetch_pack_config_cb(const char *var, const char *value,
 	const char *msg_id;
 
 	if (strcmp(var, "fetch.fsck.skiplist") == 0) {
-		const char *path;
+		char *path = NULL;
 
-		if (git_config_pathname(&path, var, value))
+		if (git_config_pathname_dup(&path, var, value))
 			return 1;
 		strbuf_addf(&fsck_msg_types, "%cskiplist=%s",
 			fsck_msg_types.len ? ',' : '=', path);
-		free((char *)path);
+		free(path);
 		return 0;
 	}
 
diff --git a/fsck.c b/fsck.c
index 78af29d264..a9d27a660f 100644
--- a/fsck.c
+++ b/fsck.c
@@ -1274,13 +1274,13 @@ int git_fsck_config(const char *var, const char *value,
 	const char *msg_id;
 
 	if (strcmp(var, "fsck.skiplist") == 0) {
-		const char *path;
+		char *path = NULL;
 		struct strbuf sb = STRBUF_INIT;
 
-		if (git_config_pathname(&path, var, value))
+		if (git_config_pathname_dup(&path, var, value))
 			return 1;
 		strbuf_addf(&sb, "skiplist=%s", path);
-		free((char *)path);
+		free(path);
 		fsck_set_msg_types(options, sb.buf);
 		strbuf_release(&sb);
 		return 0;
diff --git a/remote.c b/remote.c
index 2b650b813b..09912bebf1 100644
--- a/remote.c
+++ b/remote.c
@@ -428,38 +428,38 @@ static int handle_config(const char *key, const char *value,
 	else if (!strcmp(subkey, "prunetags"))
 		remote->prune_tags = git_config_bool(key, value);
 	else if (!strcmp(subkey, "url")) {
-		const char *v;
-		if (git_config_string(&v, key, value))
+		char *v = NULL;
+		if (git_config_string_dup(&v, key, value))
 			return -1;
 		add_url(remote, v);
 	} else if (!strcmp(subkey, "pushurl")) {
-		const char *v;
-		if (git_config_string(&v, key, value))
+		char *v = NULL;
+		if (git_config_string_dup(&v, key, value))
 			return -1;
 		add_pushurl(remote, v);
 	} else if (!strcmp(subkey, "push")) {
-		const char *v;
-		if (git_config_string(&v, key, value))
+		char *v = NULL;
+		if (git_config_string_dup(&v, key, value))
 			return -1;
 		refspec_append(&remote->push, v);
-		free((char *)v);
+		free(v);
 	} else if (!strcmp(subkey, "fetch")) {
-		const char *v;
-		if (git_config_string(&v, key, value))
+		char *v = NULL;
+		if (git_config_string_dup(&v, key, value))
 			return -1;
 		refspec_append(&remote->fetch, v);
-		free((char *)v);
+		free(v);
 	} else if (!strcmp(subkey, "receivepack")) {
-		const char *v;
-		if (git_config_string(&v, key, value))
+		char *v = NULL;
+		if (git_config_string_dup(&v, key, value))
 			return -1;
 		if (!remote->receivepack)
 			remote->receivepack = v;
 		else
 			error(_("more than one receivepack given, using the first"));
 	} else if (!strcmp(subkey, "uploadpack")) {
-		const char *v;
-		if (git_config_string(&v, key, value))
+		char *v = NULL;
+		if (git_config_string_dup(&v, key, value))
 			return -1;
 		if (!remote->uploadpack)
 			remote->uploadpack = v;
diff --git a/setup.c b/setup.c
index f4b32f76e3..9f35a27978 100644
--- a/setup.c
+++ b/setup.c
@@ -1176,13 +1176,13 @@ static int safe_directory_cb(const char *key, const char *value,
 	} else if (!strcmp(value, "*")) {
 		data->is_safe = 1;
 	} else {
-		const char *interpolated = NULL;
+		char *interpolated = NULL;
 
-		if (!git_config_pathname(&interpolated, key, value) &&
+		if (!git_config_pathname_dup(&interpolated, key, value) &&
 		    !fspathcmp(data->path, interpolated ? interpolated : value))
 			data->is_safe = 1;
 
-		free((char *)interpolated);
+		free(interpolated);
 	}
 
 	return 0;
-- 
2.44.0.872.g288abe5b5b


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 04/11] config: use git_config_string_dup() for open-coded equivalents
  2024-04-07  0:56 ` [PATCH 0/12] git_config_string() considered harmful Jeff King
                     ` (2 preceding siblings ...)
  2024-04-07  1:00   ` [PATCH 03/11] config: prefer git_config_string_dup() for temp variables Jeff King
@ 2024-04-07  1:01   ` Jeff King
  2024-04-07  1:01   ` [PATCH 05/11] config: use git_config_string_dup() to fix leaky open coding Jeff King
                     ` (8 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2024-04-07  1:01 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, Rubén Justo

These are cases where the calling code does the exact same thing
git_config_string_dup() would do. We can shorten the code a bit by using
it.

Note in the final case that we rely on leaving the if-else chain to
return "0" for success, and now we'll return more directly. The two are
equivalent.

Signed-off-by: Jeff King <peff@peff.net>
---
 archive-tar.c  | 10 +++-------
 compat/mingw.c |  7 ++-----
 setup.c        |  5 +----
 3 files changed, 6 insertions(+), 16 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index 8ae30125f8..6da7101553 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -393,13 +393,9 @@ static int tar_filter_config(const char *var, const char *value,
 		tar_filters[nr_tar_filters++] = ar;
 	}
 
-	if (!strcmp(type, "command")) {
-		if (!value)
-			return config_error_nonbool(var);
-		free(ar->filter_command);
-		ar->filter_command = xstrdup(value);
-		return 0;
-	}
+	if (!strcmp(type, "command"))
+		return git_config_string_dup(&ar->filter_command, var, value);
+
 	if (!strcmp(type, "remote")) {
 		if (git_config_bool(var, value))
 			ar->flags |= ARCHIVER_REMOTE;
diff --git a/compat/mingw.c b/compat/mingw.c
index 320fb99a90..aeccb3957f 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -255,11 +255,8 @@ int mingw_core_config(const char *var, const char *value,
 	}
 
 	if (!strcmp(var, "core.unsetenvvars")) {
-		if (!value)
-			return config_error_nonbool(var);
-		free(unset_environment_variables);
-		unset_environment_variables = xstrdup(value);
-		return 0;
+		return git_config_string_dup(&unset_environment_variables, var,
+					     value);
 	}
 
 	if (!strcmp(var, "core.restrictinheritedhandles")) {
diff --git a/setup.c b/setup.c
index 9f35a27978..7204fd2815 100644
--- a/setup.c
+++ b/setup.c
@@ -529,10 +529,7 @@ static int read_worktree_config(const char *var, const char *value,
 	if (strcmp(var, "core.bare") == 0) {
 		data->is_bare = git_config_bool(var, value);
 	} else if (strcmp(var, "core.worktree") == 0) {
-		if (!value)
-			return config_error_nonbool(var);
-		free(data->work_tree);
-		data->work_tree = xstrdup(value);
+		return git_config_string_dup(&data->work_tree, var, value);
 	}
 	return 0;
 }
-- 
2.44.0.872.g288abe5b5b


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 05/11] config: use git_config_string_dup() to fix leaky open coding
  2024-04-07  0:56 ` [PATCH 0/12] git_config_string() considered harmful Jeff King
                     ` (3 preceding siblings ...)
  2024-04-07  1:01   ` [PATCH 04/11] config: use git_config_string_dup() for open-coded equivalents Jeff King
@ 2024-04-07  1:01   ` Jeff King
  2024-04-07  1:02   ` [PATCH 06/11] config: use git_config_string() in easy cases Jeff King
                     ` (7 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2024-04-07  1:01 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, Rubén Justo

These are cases which open-code the equivalent of git_config_string(),
but end up leaking the resulting value if the config value is seen
multiple times (because they never free an existing value). Using
git_config_string_dup() plug these potential leaks.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/help.c | 7 ++-----
 config.c       | 8 ++------
 merge-ll.c     | 5 +----
 3 files changed, 5 insertions(+), 15 deletions(-)

diff --git a/builtin/help.c b/builtin/help.c
index dc1fbe2b98..1bdd2faee0 100644
--- a/builtin/help.c
+++ b/builtin/help.c
@@ -52,7 +52,7 @@ static enum help_action {
 	HELP_ACTION_CONFIG_SECTIONS_FOR_COMPLETION,
 } cmd_mode;
 
-static const char *html_path;
+static char *html_path;
 static int verbose = 1;
 static enum help_format help_format = HELP_FORMAT_NONE;
 static int exclude_guides;
@@ -407,10 +407,7 @@ static int git_help_config(const char *var, const char *value,
 		return 0;
 	}
 	if (!strcmp(var, "help.htmlpath")) {
-		if (!value)
-			return config_error_nonbool(var);
-		html_path = xstrdup(value);
-		return 0;
+		return git_config_string_dup(&html_path, var, value);
 	}
 	if (!strcmp(var, "man.viewer")) {
 		if (!value)
diff --git a/config.c b/config.c
index a0aa45abd5..c115e6d8c9 100644
--- a/config.c
+++ b/config.c
@@ -1575,12 +1575,8 @@ static int git_default_core_config(const char *var, const char *value,
 	if (!strcmp(var, "core.checkroundtripencoding"))
 		return git_config_string(&check_roundtrip_encoding, var, value);
 
-	if (!strcmp(var, "core.notesref")) {
-		if (!value)
-			return config_error_nonbool(var);
-		notes_ref_name = xstrdup(value);
-		return 0;
-	}
+	if (!strcmp(var, "core.notesref"))
+		return git_config_string_dup(&notes_ref_name, var, value);
 
 	if (!strcmp(var, "core.editor"))
 		return git_config_string(&editor_program, var, value);
diff --git a/merge-ll.c b/merge-ll.c
index bf1077ae09..660d9a3bd6 100644
--- a/merge-ll.c
+++ b/merge-ll.c
@@ -308,8 +308,6 @@ static int read_merge_config(const char *var, const char *value,
 		return git_config_string(&fn->description, var, value);
 
 	if (!strcmp("driver", key)) {
-		if (!value)
-			return config_error_nonbool(var);
 		/*
 		 * merge.<name>.driver specifies the command line:
 		 *
@@ -333,8 +331,7 @@ static int read_merge_config(const char *var, const char *value,
 		 * file named by %A, and signal that it has done with zero exit
 		 * status.
 		 */
-		fn->cmdline = xstrdup(value);
-		return 0;
+		return git_config_string_dup(&fn->cmdline, var, value);
 	}
 
 	if (!strcmp("recursive", key))
-- 
2.44.0.872.g288abe5b5b


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 06/11] config: use git_config_string() in easy cases
  2024-04-07  0:56 ` [PATCH 0/12] git_config_string() considered harmful Jeff King
                     ` (4 preceding siblings ...)
  2024-04-07  1:01   ` [PATCH 05/11] config: use git_config_string_dup() to fix leaky open coding Jeff King
@ 2024-04-07  1:02   ` Jeff King
  2024-04-07  2:05     ` Eric Sunshine
  2024-04-07  1:03   ` [PATCH 07/11] config: use git_config_pathname_dup() " Jeff King
                     ` (6 subsequent siblings)
  12 siblings, 1 reply; 23+ messages in thread
From: Jeff King @ 2024-04-07  1:02 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, Rubén Justo

The git_config_string() interface is very prone to leaking; it
overwrites the contents of the pointer without freeing any existing
value, so:

  const char *foo;
  ...
  if (!strcmp(var, "foo.bar"))
	git_config_string(&foo, var, value);

is always going to leak if we see the "foo.bar" config more than once.
We don't tend to notice in practice because people don't have a lot of
redundant config, but basically every use of git_config_string() has
this problem.

For simple cases where the initial value of the "foo" pointer is NULL
(like the one above), we can easily convert them to use a variant of the
helper that frees the existing value before writing over it. That works
because free(NULL) is a noop. What we can't change, though, is cases
where the pointer is initialized to a string literal, like:

  const char *foo = "some string literal";

In that case it would be incorrect to free() the original value. Those
cases will take more refactoring to make them leak-free, and this patch
punts on that for now.

In each case we switch the underlying variable from "const char *" to
"char *" to indicate that it will always point to allocated memory (and
our git_config_string_dup() interface enforces this).

Signed-off-by: Jeff King <peff@peff.net>
---
 alias.c                |  3 +--
 attr.c                 |  2 +-
 attr.h                 |  2 +-
 builtin/commit.c       |  4 ++--
 builtin/log.c          | 12 ++++++------
 builtin/merge.c        |  4 ++--
 builtin/receive-pack.c |  4 ++--
 builtin/repack.c       | 16 ++++++++--------
 config.c               | 12 ++++++------
 convert.c              | 12 ++++++------
 delta-islands.c        |  4 ++--
 diff.c                 |  8 ++++----
 environment.c          |  8 ++++----
 environment.h          |  8 ++++----
 gpg-interface.c        |  5 +++--
 imap-send.c            | 20 ++++++++++----------
 mailmap.c              |  2 +-
 mailmap.h              |  2 +-
 merge-ll.c             | 12 ++++++------
 pager.c                |  4 ++--
 promisor-remote.c      |  2 +-
 promisor-remote.h      |  2 +-
 remote.c               | 17 ++++++++---------
 remote.h               |  8 ++++----
 sequencer.c            |  2 +-
 upload-pack.c          |  4 ++--
 26 files changed, 89 insertions(+), 90 deletions(-)

diff --git a/alias.c b/alias.c
index 5a238f2e30..7f8c1c3e7d 100644
--- a/alias.c
+++ b/alias.c
@@ -22,8 +22,7 @@ static int config_alias_cb(const char *key, const char *value,
 
 	if (data->alias) {
 		if (!strcasecmp(p, data->alias))
-			return git_config_string((const char **)&data->v,
-						 key, value);
+			return git_config_string_dup(&data->v, key, value);
 	} else if (data->list) {
 		string_list_append(data->list, p);
 	}
diff --git a/attr.c b/attr.c
index 679e42258c..4eda36c865 100644
--- a/attr.c
+++ b/attr.c
@@ -25,7 +25,7 @@
 #include "tree-walk.h"
 #include "object-name.h"
 
-const char *git_attr_tree;
+char *git_attr_tree;
 
 const char git_attr__true[] = "(builtin)true";
 const char git_attr__false[] = "\0(builtin)false";
diff --git a/attr.h b/attr.h
index 127998ae01..e7cc318b0c 100644
--- a/attr.h
+++ b/attr.h
@@ -236,6 +236,6 @@ const char *git_attr_global_file(void);
 /* Return whether the system gitattributes file is enabled and should be used. */
 int git_attr_system_is_enabled(void);
 
-extern const char *git_attr_tree;
+extern char *git_attr_tree;
 
 #endif /* ATTR_H */
diff --git a/builtin/commit.c b/builtin/commit.c
index 7ba7201cfb..9b6546d401 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -133,7 +133,7 @@ static struct strvec trailer_args = STRVEC_INIT;
  * is specified explicitly.
  */
 static enum commit_msg_cleanup_mode cleanup_mode;
-static const char *cleanup_arg;
+static char *cleanup_arg;
 
 static enum commit_whence whence;
 static int use_editor = 1, include_status = 1;
@@ -1632,7 +1632,7 @@ static int git_commit_config(const char *k, const char *v,
 		return 0;
 	}
 	if (!strcmp(k, "commit.cleanup"))
-		return git_config_string(&cleanup_arg, k, v);
+		return git_config_string_dup(&cleanup_arg, k, v);
 	if (!strcmp(k, "commit.gpgsign")) {
 		sign_commit = git_config_bool(k, v) ? "" : NULL;
 		return 0;
diff --git a/builtin/log.c b/builtin/log.c
index c0a8bb95e9..eb336f7efb 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -49,7 +49,7 @@
 #define FORMAT_PATCH_NAME_MAX_DEFAULT 64
 
 /* Set a default date-time format for git log ("log.date" config variable) */
-static const char *default_date_mode = NULL;
+static char *default_date_mode;
 
 static int default_abbrev_commit;
 static int default_show_root = 1;
@@ -63,7 +63,7 @@ static unsigned int force_in_body_from;
 static int stdout_mboxrd;
 static const char *fmt_patch_subject_prefix = "PATCH";
 static int fmt_patch_name_max = FORMAT_PATCH_NAME_MAX_DEFAULT;
-static const char *fmt_pretty;
+static char *fmt_pretty;
 static int format_no_prefix;
 
 static const char * const builtin_log_usage[] = {
@@ -569,7 +569,7 @@ static int git_log_config(const char *var, const char *value,
 	const char *slot_name;
 
 	if (!strcmp(var, "format.pretty"))
-		return git_config_string(&fmt_pretty, var, value);
+		return git_config_string_dup(&fmt_pretty, var, value);
 	if (!strcmp(var, "format.subjectprefix"))
 		return git_config_string(&fmt_patch_subject_prefix, var, value);
 	if (!strcmp(var, "format.filenamemaxlength")) {
@@ -585,7 +585,7 @@ static int git_log_config(const char *var, const char *value,
 		return 0;
 	}
 	if (!strcmp(var, "log.date"))
-		return git_config_string(&default_date_mode, var, value);
+		return git_config_string_dup(&default_date_mode, var, value);
 	if (!strcmp(var, "log.decorate")) {
 		decoration_style = parse_decoration_style(value);
 		if (decoration_style < 0)
@@ -959,7 +959,7 @@ static char *from;
 static const char *signature = git_version_string;
 static const char *signature_file;
 static enum cover_setting config_cover_letter;
-static const char *config_output_directory;
+static char *config_output_directory;
 static enum cover_from_description cover_from_description_mode = COVER_FROM_MESSAGE;
 static int show_notes;
 static struct display_notes_opt notes_opt;
@@ -1054,7 +1054,7 @@ static int git_format_config(const char *var, const char *value,
 		return 0;
 	}
 	if (!strcmp(var, "format.outputdirectory"))
-		return git_config_string(&config_output_directory, var, value);
+		return git_config_string_dup(&config_output_directory, var, value);
 	if (!strcmp(var, "format.useautobase")) {
 		if (value && !strcasecmp(value, "whenAble")) {
 			auto_base = AUTO_BASE_WHEN_ABLE;
diff --git a/builtin/merge.c b/builtin/merge.c
index 6f4fec87fc..c2be29ed2f 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -111,7 +111,7 @@ enum ff_type {
 
 static enum ff_type fast_forward = FF_ALLOW;
 
-static const char *cleanup_arg;
+static char *cleanup_arg;
 static enum commit_msg_cleanup_mode cleanup_mode;
 
 static int option_parse_message(const struct option *opt,
@@ -619,7 +619,7 @@ static int git_merge_config(const char *k, const char *v,
 	else if (!strcmp(k, "pull.octopus"))
 		return git_config_string(&pull_octopus, k, v);
 	else if (!strcmp(k, "commit.cleanup"))
-		return git_config_string(&cleanup_arg, k, v);
+		return git_config_string_dup(&cleanup_arg, k, v);
 	else if (!strcmp(k, "merge.ff")) {
 		int boolval = git_parse_maybe_bool(v);
 		if (0 <= boolval) {
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 15ed81a3f6..2ba71b6673 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -88,7 +88,7 @@ static struct strbuf push_cert = STRBUF_INIT;
 static struct object_id push_cert_oid;
 static struct signature_check sigcheck;
 static const char *push_cert_nonce;
-static const char *cert_nonce_seed;
+static char *cert_nonce_seed;
 static struct strvec hidden_refs = STRVEC_INIT;
 
 static const char *NONCE_UNSOLICITED = "UNSOLICITED";
@@ -230,7 +230,7 @@ static int receive_pack_config(const char *var, const char *value,
 	}
 
 	if (strcmp(var, "receive.certnonceseed") == 0)
-		return git_config_string(&cert_nonce_seed, var, value);
+		return git_config_string_dup(&cert_nonce_seed, var, value);
 
 	if (strcmp(var, "receive.certnonceslop") == 0) {
 		nonce_stamp_slop_limit = git_config_ulong(var, value, ctx->kvi);
diff --git a/builtin/repack.c b/builtin/repack.c
index 15e4cccc45..c144e18d9f 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -48,10 +48,10 @@ static const char incremental_bitmap_conflict_error[] = N_(
 );
 
 struct pack_objects_args {
-	const char *window;
-	const char *window_memory;
-	const char *depth;
-	const char *threads;
+	char *window;
+	char *window_memory;
+	char *depth;
+	char *threads;
 	unsigned long max_pack_size;
 	int no_reuse_delta;
 	int no_reuse_object;
@@ -86,13 +86,13 @@ static int repack_config(const char *var, const char *value,
 		return 0;
 	}
 	if (!strcmp(var, "repack.cruftwindow"))
-		return git_config_string(&cruft_po_args->window, var, value);
+		return git_config_string_dup(&cruft_po_args->window, var, value);
 	if (!strcmp(var, "repack.cruftwindowmemory"))
-		return git_config_string(&cruft_po_args->window_memory, var, value);
+		return git_config_string_dup(&cruft_po_args->window_memory, var, value);
 	if (!strcmp(var, "repack.cruftdepth"))
-		return git_config_string(&cruft_po_args->depth, var, value);
+		return git_config_string_dup(&cruft_po_args->depth, var, value);
 	if (!strcmp(var, "repack.cruftthreads"))
-		return git_config_string(&cruft_po_args->threads, var, value);
+		return git_config_string_dup(&cruft_po_args->threads, var, value);
 	return git_default_config(var, value, ctx, cb);
 }
 
diff --git a/config.c b/config.c
index c115e6d8c9..449b8f4f66 100644
--- a/config.c
+++ b/config.c
@@ -1579,7 +1579,7 @@ static int git_default_core_config(const char *var, const char *value,
 		return git_config_string_dup(&notes_ref_name, var, value);
 
 	if (!strcmp(var, "core.editor"))
-		return git_config_string(&editor_program, var, value);
+		return git_config_string_dup(&editor_program, var, value);
 
 	if (!strcmp(var, "core.commentchar") ||
 	    !strcmp(var, "core.commentstring")) {
@@ -1598,7 +1598,7 @@ static int git_default_core_config(const char *var, const char *value,
 	}
 
 	if (!strcmp(var, "core.askpass"))
-		return git_config_string(&askpass_program, var, value);
+		return git_config_string_dup(&askpass_program, var, value);
 
 	if (!strcmp(var, "core.excludesfile"))
 		return git_config_pathname(&excludes_file, var, value);
@@ -1703,10 +1703,10 @@ static int git_default_sparse_config(const char *var, const char *value)
 static int git_default_i18n_config(const char *var, const char *value)
 {
 	if (!strcmp(var, "i18n.commitencoding"))
-		return git_config_string(&git_commit_encoding, var, value);
+		return git_config_string_dup(&git_commit_encoding, var, value);
 
 	if (!strcmp(var, "i18n.logoutputencoding"))
-		return git_config_string(&git_log_output_encoding, var, value);
+		return git_config_string_dup(&git_log_output_encoding, var, value);
 
 	/* Add other config variables here and to Documentation/config.txt. */
 	return 0;
@@ -1782,7 +1782,7 @@ static int git_default_mailmap_config(const char *var, const char *value)
 	if (!strcmp(var, "mailmap.file"))
 		return git_config_pathname(&git_mailmap_file, var, value);
 	if (!strcmp(var, "mailmap.blob"))
-		return git_config_string(&git_mailmap_blob, var, value);
+		return git_config_string_dup(&git_mailmap_blob, var, value);
 
 	/* Add other config variables here and to Documentation/config.txt. */
 	return 0;
@@ -1791,7 +1791,7 @@ static int git_default_mailmap_config(const char *var, const char *value)
 static int git_default_attr_config(const char *var, const char *value)
 {
 	if (!strcmp(var, "attr.tree"))
-		return git_config_string(&git_attr_tree, var, value);
+		return git_config_string_dup(&git_attr_tree, var, value);
 
 	/*
 	 * Add other attribute related config variables here and to
diff --git a/convert.c b/convert.c
index 35b25eb3cb..d462485d1f 100644
--- a/convert.c
+++ b/convert.c
@@ -979,9 +979,9 @@ int async_query_available_blobs(const char *cmd, struct string_list *available_p
 static struct convert_driver {
 	const char *name;
 	struct convert_driver *next;
-	const char *smudge;
-	const char *clean;
-	const char *process;
+	char *smudge;
+	char *clean;
+	char *process;
 	int required;
 } *user_convert, **user_convert_tail;
 
@@ -1047,13 +1047,13 @@ static int read_convert_config(const char *var, const char *value,
 	 */
 
 	if (!strcmp("smudge", key))
-		return git_config_string(&drv->smudge, var, value);
+		return git_config_string_dup(&drv->smudge, var, value);
 
 	if (!strcmp("clean", key))
-		return git_config_string(&drv->clean, var, value);
+		return git_config_string_dup(&drv->clean, var, value);
 
 	if (!strcmp("process", key))
-		return git_config_string(&drv->process, var, value);
+		return git_config_string_dup(&drv->process, var, value);
 
 	if (!strcmp("required", key)) {
 		drv->required = git_config_bool(var, value);
diff --git a/delta-islands.c b/delta-islands.c
index f7e079425f..c8ff736a4d 100644
--- a/delta-islands.c
+++ b/delta-islands.c
@@ -313,7 +313,7 @@ struct island_load_data {
 	size_t nr;
 	size_t alloc;
 };
-static const char *core_island_name;
+static char *core_island_name;
 
 static void free_config_regexes(struct island_load_data *ild)
 {
@@ -362,7 +362,7 @@ static int island_config_callback(const char *k, const char *v,
 	}
 
 	if (!strcmp(k, "pack.islandcore"))
-		return git_config_string(&core_island_name, k, v);
+		return git_config_string_dup(&core_island_name, k, v);
 
 	return 0;
 }
diff --git a/diff.c b/diff.c
index 108c187577..bb00bc1110 100644
--- a/diff.c
+++ b/diff.c
@@ -56,8 +56,8 @@ static int diff_color_moved_default;
 static int diff_color_moved_ws_default;
 static int diff_context_default = 3;
 static int diff_interhunk_context_default;
-static const char *diff_word_regex_cfg;
-static const char *external_diff_cmd_cfg;
+static char *diff_word_regex_cfg;
+static char *external_diff_cmd_cfg;
 static const char *diff_order_file_cfg;
 int diff_auto_refresh_index = 1;
 static int diff_mnemonic_prefix;
@@ -429,9 +429,9 @@ int git_diff_ui_config(const char *var, const char *value,
 		return 0;
 	}
 	if (!strcmp(var, "diff.external"))
-		return git_config_string(&external_diff_cmd_cfg, var, value);
+		return git_config_string_dup(&external_diff_cmd_cfg, var, value);
 	if (!strcmp(var, "diff.wordregex"))
-		return git_config_string(&diff_word_regex_cfg, var, value);
+		return git_config_string_dup(&diff_word_regex_cfg, var, value);
 	if (!strcmp(var, "diff.orderfile"))
 		return git_config_pathname(&diff_order_file_cfg, var, value);
 
diff --git a/environment.c b/environment.c
index a73ba9c12c..5e9fab4d1a 100644
--- a/environment.c
+++ b/environment.c
@@ -42,8 +42,8 @@ int is_bare_repository_cfg = -1; /* unspecified */
 int warn_ambiguous_refs = 1;
 int warn_on_object_refname_ambiguity = 1;
 int repository_format_precious_objects;
-const char *git_commit_encoding;
-const char *git_log_output_encoding;
+char *git_commit_encoding;
+char *git_log_output_encoding;
 char *apply_default_whitespace;
 char *apply_default_ignorewhitespace;
 const char *git_attributes_file;
@@ -58,8 +58,8 @@ size_t packed_git_window_size = DEFAULT_PACKED_GIT_WINDOW_SIZE;
 size_t packed_git_limit = DEFAULT_PACKED_GIT_LIMIT;
 size_t delta_base_cache_limit = 96 * 1024 * 1024;
 unsigned long big_file_threshold = 512 * 1024 * 1024;
-const char *editor_program;
-const char *askpass_program;
+char *editor_program;
+char *askpass_program;
 const char *excludes_file;
 enum auto_crlf auto_crlf = AUTO_CRLF_FALSE;
 enum eol core_eol = EOL_UNSET;
diff --git a/environment.h b/environment.h
index 05fd94d7be..f06024a457 100644
--- a/environment.h
+++ b/environment.h
@@ -217,11 +217,11 @@ int odb_pack_keep(const char *name);
 const char *get_log_output_encoding(void);
 const char *get_commit_output_encoding(void);
 
-extern const char *git_commit_encoding;
-extern const char *git_log_output_encoding;
+extern char *git_commit_encoding;
+extern char *git_log_output_encoding;
 
-extern const char *editor_program;
-extern const char *askpass_program;
+extern char *editor_program;
+extern char *askpass_program;
 extern const char *excludes_file;
 
 /*
diff --git a/gpg-interface.c b/gpg-interface.c
index b5993385ff..4b2f70ef44 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -27,7 +27,8 @@ static void gpg_interface_lazy_init(void)
 }
 
 static char *configured_signing_key;
-static const char *ssh_default_key_command, *ssh_allowed_signers, *ssh_revocation_file;
+static char *ssh_default_key_command;
+static const char *ssh_allowed_signers, *ssh_revocation_file;
 static enum signature_trust_level configured_min_trust_level = TRUST_UNDEFINED;
 
 struct gpg_format {
@@ -762,7 +763,7 @@ static int git_gpg_config(const char *var, const char *value,
 	}
 
 	if (!strcmp(var, "gpg.ssh.defaultkeycommand"))
-		return git_config_string(&ssh_default_key_command, var, value);
+		return git_config_string_dup(&ssh_default_key_command, var, value);
 
 	if (!strcmp(var, "gpg.ssh.allowedsignersfile"))
 		return git_config_pathname(&ssh_allowed_signers, var, value);
diff --git a/imap-send.c b/imap-send.c
index f2e1947e63..d4454a3b49 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -87,16 +87,16 @@ static int nfvasprintf(char **strp, const char *fmt, va_list ap)
 
 struct imap_server_conf {
 	const char *name;
-	const char *tunnel;
+	char *tunnel;
 	const char *host;
 	int port;
-	const char *folder;
-	const char *user;
-	const char *pass;
+	char *folder;
+	char *user;
+	char *pass;
 	int use_ssl;
 	int ssl_verify;
 	int use_html;
-	const char *auth_method;
+	char *auth_method;
 };
 
 static struct imap_server_conf server = {
@@ -1332,15 +1332,15 @@ static int git_imap_config(const char *var, const char *val,
 	else if (!strcmp("imap.preformattedhtml", var))
 		server.use_html = git_config_bool(var, val);
 	else if (!strcmp("imap.folder", var))
-		return git_config_string(&server.folder, var, val);
+		return git_config_string_dup(&server.folder, var, val);
 	else if (!strcmp("imap.user", var))
-		return git_config_string(&server.user, var, val);
+		return git_config_string_dup(&server.user, var, val);
 	else if (!strcmp("imap.pass", var))
-		return git_config_string(&server.pass, var, val);
+		return git_config_string_dup(&server.pass, var, val);
 	else if (!strcmp("imap.tunnel", var))
-		return git_config_string(&server.tunnel, var, val);
+		return git_config_string_dup(&server.tunnel, var, val);
 	else if (!strcmp("imap.authmethod", var))
-		return git_config_string(&server.auth_method, var, val);
+		return git_config_string_dup(&server.auth_method, var, val);
 	else if (!strcmp("imap.port", var))
 		server.port = git_config_int(var, val, ctx->kvi);
 	else if (!strcmp("imap.host", var)) {
diff --git a/mailmap.c b/mailmap.c
index 3d6a5e9400..9bd11c90e6 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -7,7 +7,7 @@
 #include "setup.h"
 
 const char *git_mailmap_file;
-const char *git_mailmap_blob;
+char *git_mailmap_blob;
 
 struct mailmap_info {
 	char *name;
diff --git a/mailmap.h b/mailmap.h
index 0f8fd2c586..a1bc019b52 100644
--- a/mailmap.h
+++ b/mailmap.h
@@ -4,7 +4,7 @@
 struct string_list;
 
 extern const char *git_mailmap_file;
-extern const char *git_mailmap_blob;
+extern char *git_mailmap_blob;
 
 int read_mailmap(struct string_list *map);
 void clear_mailmap(struct string_list *map);
diff --git a/merge-ll.c b/merge-ll.c
index 660d9a3bd6..5ef96309d8 100644
--- a/merge-ll.c
+++ b/merge-ll.c
@@ -27,9 +27,9 @@ typedef enum ll_merge_result (*ll_merge_fn)(const struct ll_merge_driver *,
 
 struct ll_merge_driver {
 	const char *name;
-	const char *description;
+	char *description;
 	ll_merge_fn fn;
-	const char *recursive;
+	char *recursive;
 	struct ll_merge_driver *next;
 	char *cmdline;
 };
@@ -268,7 +268,7 @@ static enum ll_merge_result ll_ext_merge(const struct ll_merge_driver *fn,
  * merge.default and merge.driver configuration items
  */
 static struct ll_merge_driver *ll_user_merge, **ll_user_merge_tail;
-static const char *default_ll_merge;
+static char *default_ll_merge;
 
 static int read_merge_config(const char *var, const char *value,
 			     const struct config_context *ctx UNUSED,
@@ -279,7 +279,7 @@ static int read_merge_config(const char *var, const char *value,
 	size_t namelen;
 
 	if (!strcmp(var, "merge.default"))
-		return git_config_string(&default_ll_merge, var, value);
+		return git_config_string_dup(&default_ll_merge, var, value);
 
 	/*
 	 * We are not interested in anything but "merge.<name>.variable";
@@ -305,7 +305,7 @@ static int read_merge_config(const char *var, const char *value,
 	}
 
 	if (!strcmp("name", key))
-		return git_config_string(&fn->description, var, value);
+		return git_config_string_dup(&fn->description, var, value);
 
 	if (!strcmp("driver", key)) {
 		/*
@@ -335,7 +335,7 @@ static int read_merge_config(const char *var, const char *value,
 	}
 
 	if (!strcmp("recursive", key))
-		return git_config_string(&fn->recursive, var, value);
+		return git_config_string_dup(&fn->recursive, var, value);
 
 	return 0;
 }
diff --git a/pager.c b/pager.c
index b8822a9381..6d75528e56 100644
--- a/pager.c
+++ b/pager.c
@@ -13,7 +13,7 @@ int pager_use_color = 1;
 #endif
 
 static struct child_process pager_process;
-static const char *pager_program;
+static char *pager_program;
 
 /* Is the value coming back from term_columns() just a guess? */
 static int term_columns_guessed;
@@ -47,7 +47,7 @@ static int core_pager_config(const char *var, const char *value,
 			     void *data UNUSED)
 {
 	if (!strcmp(var, "core.pager"))
-		return git_config_string(&pager_program, var, value);
+		return git_config_string_dup(&pager_program, var, value);
 	return 0;
 }
 
diff --git a/promisor-remote.c b/promisor-remote.c
index ac3aa1e365..65693e7931 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -139,7 +139,7 @@ static int promisor_remote_config(const char *var, const char *value,
 		if (!r)
 			return 0;
 
-		return git_config_string(&r->partial_clone_filter, var, value);
+		return git_config_string_dup(&r->partial_clone_filter, var, value);
 	}
 
 	return 0;
diff --git a/promisor-remote.h b/promisor-remote.h
index 2cb9eda9ea..88cb599c39 100644
--- a/promisor-remote.h
+++ b/promisor-remote.h
@@ -13,7 +13,7 @@ struct object_id;
  */
 struct promisor_remote {
 	struct promisor_remote *next;
-	const char *partial_clone_filter;
+	char *partial_clone_filter;
 	const char name[FLEX_ARRAY];
 };
 
diff --git a/remote.c b/remote.c
index 09912bebf1..761a8b8fe3 100644
--- a/remote.c
+++ b/remote.c
@@ -367,9 +367,9 @@ static int handle_config(const char *key, const char *value,
 			return -1;
 		branch = make_branch(remote_state, name, namelen);
 		if (!strcmp(subkey, "remote")) {
-			return git_config_string(&branch->remote_name, key, value);
+			return git_config_string_dup(&branch->remote_name, key, value);
 		} else if (!strcmp(subkey, "pushremote")) {
-			return git_config_string(&branch->pushremote_name, key, value);
+			return git_config_string_dup(&branch->pushremote_name, key, value);
 		} else if (!strcmp(subkey, "merge")) {
 			if (!value)
 				return config_error_nonbool(key);
@@ -401,8 +401,8 @@ static int handle_config(const char *key, const char *value,
 
 	/* Handle remote.* variables */
 	if (!name && !strcmp(subkey, "pushdefault"))
-		return git_config_string(&remote_state->pushremote_name, key,
-					 value);
+		return git_config_string_dup(&remote_state->pushremote_name, key,
+					     value);
 
 	if (!name)
 		return 0;
@@ -471,13 +471,12 @@ static int handle_config(const char *key, const char *value,
 		else if (!strcmp(value, "--tags"))
 			remote->fetch_tags = 2;
 	} else if (!strcmp(subkey, "proxy")) {
-		return git_config_string((const char **)&remote->http_proxy,
-					 key, value);
+		return git_config_string_dup(&remote->http_proxy, key, value);
 	} else if (!strcmp(subkey, "proxyauthmethod")) {
-		return git_config_string((const char **)&remote->http_proxy_authmethod,
-					 key, value);
+		return git_config_string_dup(&remote->http_proxy_authmethod,
+					     key, value);
 	} else if (!strcmp(subkey, "vcs")) {
-		return git_config_string(&remote->foreign_vcs, key, value);
+		return git_config_string_dup(&remote->foreign_vcs, key, value);
 	}
 	return 0;
 }
diff --git a/remote.h b/remote.h
index dfd4837e60..e8c6655e42 100644
--- a/remote.h
+++ b/remote.h
@@ -46,7 +46,7 @@ struct remote_state {
 	struct hashmap branches_hash;
 
 	struct branch *current_branch;
-	const char *pushremote_name;
+	char *pushremote_name;
 
 	struct rewrites rewrites;
 	struct rewrites rewrites_push;
@@ -65,7 +65,7 @@ struct remote {
 
 	int origin, configured_in_repo;
 
-	const char *foreign_vcs;
+	char *foreign_vcs;
 
 	/* An array of all of the url_nr URLs configured for the remote */
 	const char **url;
@@ -309,9 +309,9 @@ struct branch {
 	const char *refname;
 
 	/* The name of the remote listed in the configuration. */
-	const char *remote_name;
+	char *remote_name;
 
-	const char *pushremote_name;
+	char *pushremote_name;
 
 	/* An array of the "merge" lines in the configuration. */
 	const char **merge_name;
diff --git a/sequencer.c b/sequencer.c
index 3e5d82e0e5..29f019b2a0 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -266,7 +266,7 @@ static int git_sequencer_config(const char *k, const char *v,
 	}
 
 	if (!opts->default_strategy && !strcmp(k, "pull.twohead")) {
-		int ret = git_config_string((const char**)&opts->default_strategy, k, v);
+		int ret = git_config_string_dup(&opts->default_strategy, k, v);
 		if (ret == 0) {
 			/*
 			 * pull.twohead is allowed to be multi-valued; we only
diff --git a/upload-pack.c b/upload-pack.c
index 902144b9d3..5f5943cc19 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -94,7 +94,7 @@ struct upload_pack_data {
 
 	struct packet_writer writer;
 
-	const char *pack_objects_hook;
+	char *pack_objects_hook;
 
 	unsigned stateless_rpc : 1;				/* v0 only */
 	unsigned no_done : 1;					/* v0 only */
@@ -1386,7 +1386,7 @@ static int upload_pack_protected_config(const char *var, const char *value,
 	struct upload_pack_data *data = cb_data;
 
 	if (!strcmp("uploadpack.packobjectshook", var))
-		return git_config_string(&data->pack_objects_hook, var, value);
+		return git_config_string_dup(&data->pack_objects_hook, var, value);
 	return 0;
 }
 
-- 
2.44.0.872.g288abe5b5b


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 07/11] config: use git_config_pathname_dup() in easy cases
  2024-04-07  0:56 ` [PATCH 0/12] git_config_string() considered harmful Jeff King
                     ` (5 preceding siblings ...)
  2024-04-07  1:02   ` [PATCH 06/11] config: use git_config_string() in easy cases Jeff King
@ 2024-04-07  1:03   ` Jeff King
  2024-04-07  1:03   ` [PATCH 08/11] http: use git_config_string_dup() Jeff King
                     ` (5 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2024-04-07  1:03 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, Rubén Justo

As with the previous commit for git_config_string(), this fixes simple
cases where calling git_config_pathname() may leak if a config variable
is seen multiple times.

Signed-off-by: Jeff King <peff@peff.net>
---
I mostly split this out because the diff is big, but maybe it would make
sense squashed to share the rationale?

 builtin/commit.c | 4 ++--
 builtin/log.c    | 4 ++--
 config.c         | 8 ++++----
 diff.c           | 4 ++--
 environment.c    | 6 +++---
 environment.h    | 6 +++---
 gpg-interface.c  | 6 +++---
 mailmap.c        | 2 +-
 mailmap.h        | 2 +-
 9 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 9b6546d401..693175e5c4 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -107,7 +107,7 @@ static enum {
 } commit_style;
 
 static const char *logfile, *force_author;
-static const char *template_file;
+static char *template_file;
 /*
  * The _message variables are commit names from which to take
  * the commit message and/or authorship.
@@ -1626,7 +1626,7 @@ static int git_commit_config(const char *k, const char *v,
 	struct wt_status *s = cb;
 
 	if (!strcmp(k, "commit.template"))
-		return git_config_pathname(&template_file, k, v);
+		return git_config_pathname_dup(&template_file, k, v);
 	if (!strcmp(k, "commit.status")) {
 		include_status = git_config_bool(k, v);
 		return 0;
diff --git a/builtin/log.c b/builtin/log.c
index eb336f7efb..3df261f94a 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -957,7 +957,7 @@ static int do_signoff;
 static enum auto_base_setting auto_base;
 static char *from;
 static const char *signature = git_version_string;
-static const char *signature_file;
+static char *signature_file;
 static enum cover_setting config_cover_letter;
 static char *config_output_directory;
 static enum cover_from_description cover_from_description_mode = COVER_FROM_MESSAGE;
@@ -1044,7 +1044,7 @@ static int git_format_config(const char *var, const char *value,
 	if (!strcmp(var, "format.signature"))
 		return git_config_string(&signature, var, value);
 	if (!strcmp(var, "format.signaturefile"))
-		return git_config_pathname(&signature_file, var, value);
+		return git_config_pathname_dup(&signature_file, var, value);
 	if (!strcmp(var, "format.coverletter")) {
 		if (value && !strcasecmp(value, "auto")) {
 			config_cover_letter = COVER_AUTO;
diff --git a/config.c b/config.c
index 449b8f4f66..c2863d6514 100644
--- a/config.c
+++ b/config.c
@@ -1434,10 +1434,10 @@ static int git_default_core_config(const char *var, const char *value,
 	}
 
 	if (!strcmp(var, "core.attributesfile"))
-		return git_config_pathname(&git_attributes_file, var, value);
+		return git_config_pathname_dup(&git_attributes_file, var, value);
 
 	if (!strcmp(var, "core.hookspath"))
-		return git_config_pathname(&git_hooks_path, var, value);
+		return git_config_pathname_dup(&git_hooks_path, var, value);
 
 	if (!strcmp(var, "core.bare")) {
 		is_bare_repository_cfg = git_config_bool(var, value);
@@ -1601,7 +1601,7 @@ static int git_default_core_config(const char *var, const char *value,
 		return git_config_string_dup(&askpass_program, var, value);
 
 	if (!strcmp(var, "core.excludesfile"))
-		return git_config_pathname(&excludes_file, var, value);
+		return git_config_pathname_dup(&excludes_file, var, value);
 
 	if (!strcmp(var, "core.whitespace")) {
 		if (!value)
@@ -1780,7 +1780,7 @@ static int git_default_push_config(const char *var, const char *value)
 static int git_default_mailmap_config(const char *var, const char *value)
 {
 	if (!strcmp(var, "mailmap.file"))
-		return git_config_pathname(&git_mailmap_file, var, value);
+		return git_config_pathname_dup(&git_mailmap_file, var, value);
 	if (!strcmp(var, "mailmap.blob"))
 		return git_config_string_dup(&git_mailmap_blob, var, value);
 
diff --git a/diff.c b/diff.c
index bb00bc1110..6236746920 100644
--- a/diff.c
+++ b/diff.c
@@ -58,7 +58,7 @@ static int diff_context_default = 3;
 static int diff_interhunk_context_default;
 static char *diff_word_regex_cfg;
 static char *external_diff_cmd_cfg;
-static const char *diff_order_file_cfg;
+static char *diff_order_file_cfg;
 int diff_auto_refresh_index = 1;
 static int diff_mnemonic_prefix;
 static int diff_no_prefix;
@@ -433,7 +433,7 @@ int git_diff_ui_config(const char *var, const char *value,
 	if (!strcmp(var, "diff.wordregex"))
 		return git_config_string_dup(&diff_word_regex_cfg, var, value);
 	if (!strcmp(var, "diff.orderfile"))
-		return git_config_pathname(&diff_order_file_cfg, var, value);
+		return git_config_pathname_dup(&diff_order_file_cfg, var, value);
 
 	if (!strcmp(var, "diff.ignoresubmodules")) {
 		if (!value)
diff --git a/environment.c b/environment.c
index 5e9fab4d1a..87123f902c 100644
--- a/environment.c
+++ b/environment.c
@@ -46,8 +46,8 @@ char *git_commit_encoding;
 char *git_log_output_encoding;
 char *apply_default_whitespace;
 char *apply_default_ignorewhitespace;
-const char *git_attributes_file;
-const char *git_hooks_path;
+char *git_attributes_file;
+char *git_hooks_path;
 int zlib_compression_level = Z_BEST_SPEED;
 int pack_compression_level = Z_DEFAULT_COMPRESSION;
 int fsync_object_files = -1;
@@ -60,7 +60,7 @@ size_t delta_base_cache_limit = 96 * 1024 * 1024;
 unsigned long big_file_threshold = 512 * 1024 * 1024;
 char *editor_program;
 char *askpass_program;
-const char *excludes_file;
+char *excludes_file;
 enum auto_crlf auto_crlf = AUTO_CRLF_FALSE;
 enum eol core_eol = EOL_UNSET;
 int global_conv_flags_eol = CONV_EOL_RNDTRP_WARN;
diff --git a/environment.h b/environment.h
index f06024a457..c33f101b24 100644
--- a/environment.h
+++ b/environment.h
@@ -124,8 +124,8 @@ extern int warn_ambiguous_refs;
 extern int warn_on_object_refname_ambiguity;
 extern char *apply_default_whitespace;
 extern char *apply_default_ignorewhitespace;
-extern const char *git_attributes_file;
-extern const char *git_hooks_path;
+extern char *git_attributes_file;
+extern char *git_hooks_path;
 extern int zlib_compression_level;
 extern int pack_compression_level;
 extern size_t packed_git_window_size;
@@ -222,7 +222,7 @@ extern char *git_log_output_encoding;
 
 extern char *editor_program;
 extern char *askpass_program;
-extern const char *excludes_file;
+extern char *excludes_file;
 
 /*
  * Should we print an ellipsis after an abbreviated SHA-1 value
diff --git a/gpg-interface.c b/gpg-interface.c
index 4b2f70ef44..d6e45ff09c 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -28,7 +28,7 @@ static void gpg_interface_lazy_init(void)
 
 static char *configured_signing_key;
 static char *ssh_default_key_command;
-static const char *ssh_allowed_signers, *ssh_revocation_file;
+static char *ssh_allowed_signers, *ssh_revocation_file;
 static enum signature_trust_level configured_min_trust_level = TRUST_UNDEFINED;
 
 struct gpg_format {
@@ -766,10 +766,10 @@ static int git_gpg_config(const char *var, const char *value,
 		return git_config_string_dup(&ssh_default_key_command, var, value);
 
 	if (!strcmp(var, "gpg.ssh.allowedsignersfile"))
-		return git_config_pathname(&ssh_allowed_signers, var, value);
+		return git_config_pathname_dup(&ssh_allowed_signers, var, value);
 
 	if (!strcmp(var, "gpg.ssh.revocationfile"))
-		return git_config_pathname(&ssh_revocation_file, var, value);
+		return git_config_pathname_dup(&ssh_revocation_file, var, value);
 
 	if (!strcmp(var, "gpg.program") || !strcmp(var, "gpg.openpgp.program"))
 		fmtname = "openpgp";
diff --git a/mailmap.c b/mailmap.c
index 9bd11c90e6..b2efe29b3d 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -6,7 +6,7 @@
 #include "object-store-ll.h"
 #include "setup.h"
 
-const char *git_mailmap_file;
+char *git_mailmap_file;
 char *git_mailmap_blob;
 
 struct mailmap_info {
diff --git a/mailmap.h b/mailmap.h
index a1bc019b52..cbda9bc5e0 100644
--- a/mailmap.h
+++ b/mailmap.h
@@ -3,7 +3,7 @@
 
 struct string_list;
 
-extern const char *git_mailmap_file;
+extern char *git_mailmap_file;
 extern char *git_mailmap_blob;
 
 int read_mailmap(struct string_list *map);
-- 
2.44.0.872.g288abe5b5b


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 08/11] http: use git_config_string_dup()
  2024-04-07  0:56 ` [PATCH 0/12] git_config_string() considered harmful Jeff King
                     ` (6 preceding siblings ...)
  2024-04-07  1:03   ` [PATCH 07/11] config: use git_config_pathname_dup() " Jeff King
@ 2024-04-07  1:03   ` Jeff King
  2024-04-07  1:04   ` [PATCH 09/11] merge: use git_config_string_dup() for pull strategies Jeff King
                     ` (4 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2024-04-07  1:03 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, Rubén Justo

There are many calls to git_config_string() in the HTTP code, leading to
possible leaks if a config variable is seen multiple times. I split
these out from the other "easy" cases because the change to non-const
pointers has some follow-on effects:

  1. When these variables are passed to var_override(), that should now
     take a non-const pointer (and consequently can drop the cast it
     passes to free).

  2. That file also has the somewhat redundant set_from_env() helper,
     which did _not_ free the existing value, and so has basically the
     same leak as git_config_string(). Rather than adjust it to take a
     non-const pointer, we can just swap it out for var_override(),
     fixing the leak and reducing the number of helpers at the same
     time.

This makes for a fairly big patch, but we have to do it all at once to
keep the compiler happy (or alternatively, insert a bunch of temporary
casts into http.c).

Note that some of these are git_config_pathname_dup(); it seemed better
to convert them all together, since the root of the problem (and the
solution) are essentially the same.

Signed-off-by: Jeff King <peff@peff.net>
---
 http.c | 105 +++++++++++++++++++++++++++------------------------------
 1 file changed, 49 insertions(+), 56 deletions(-)

diff --git a/http.c b/http.c
index e73b136e58..f2bcc9083e 100644
--- a/http.c
+++ b/http.c
@@ -38,11 +38,11 @@ char curl_errorstr[CURL_ERROR_SIZE];
 
 static int curl_ssl_verify = -1;
 static int curl_ssl_try;
-static const char *curl_http_version = NULL;
-static const char *ssl_cert;
-static const char *ssl_cert_type;
-static const char *ssl_cipherlist;
-static const char *ssl_version;
+static char *curl_http_version;
+static char *ssl_cert;
+static char *ssl_cert_type;
+static char *ssl_cipherlist;
+static char *ssl_version;
 static struct {
 	const char *name;
 	long ssl_version;
@@ -59,23 +59,23 @@ static struct {
 	{ "tlsv1.3", CURL_SSLVERSION_TLSv1_3 },
 #endif
 };
-static const char *ssl_key;
-static const char *ssl_key_type;
-static const char *ssl_capath;
-static const char *curl_no_proxy;
+static char *ssl_key;
+static char *ssl_key_type;
+static char *ssl_capath;
+static char *curl_no_proxy;
 #ifdef GIT_CURL_HAVE_CURLOPT_PINNEDPUBLICKEY
-static const char *ssl_pinnedkey;
+static char *ssl_pinnedkey;
 #endif
-static const char *ssl_cainfo;
+static char *ssl_cainfo;
 static long curl_low_speed_limit = -1;
 static long curl_low_speed_time = -1;
 static int curl_ftp_no_epsv;
-static const char *curl_http_proxy;
-static const char *http_proxy_authmethod;
+static char *curl_http_proxy;
+static char *http_proxy_authmethod;
 
-static const char *http_proxy_ssl_cert;
-static const char *http_proxy_ssl_key;
-static const char *http_proxy_ssl_ca_info;
+static char *http_proxy_ssl_cert;
+static char *http_proxy_ssl_key;
+static char *http_proxy_ssl_ca_info;
 static struct credential proxy_cert_auth = CREDENTIAL_INIT;
 static int proxy_ssl_cert_password_required;
 
@@ -95,7 +95,7 @@ static struct {
 	 */
 };
 #ifdef CURLGSSAPI_DELEGATION_FLAG
-static const char *curl_deleg;
+static char *curl_deleg;
 static struct {
 	const char *name;
 	long curl_deleg_param;
@@ -108,11 +108,11 @@ static struct {
 
 static struct credential proxy_auth = CREDENTIAL_INIT;
 static const char *curl_proxyuserpwd;
-static const char *curl_cookie_file;
+static char *curl_cookie_file;
 static int curl_save_cookies;
 struct credential http_auth = CREDENTIAL_INIT;
 static int http_proactive_auth;
-static const char *user_agent;
+static char *user_agent;
 static int curl_empty_auth = -1;
 
 enum http_follow_config http_follow_config = HTTP_FOLLOW_INITIAL;
@@ -366,28 +366,28 @@ static int http_options(const char *var, const char *value,
 			const struct config_context *ctx, void *data)
 {
 	if (!strcmp("http.version", var)) {
-		return git_config_string(&curl_http_version, var, value);
+		return git_config_string_dup(&curl_http_version, var, value);
 	}
 	if (!strcmp("http.sslverify", var)) {
 		curl_ssl_verify = git_config_bool(var, value);
 		return 0;
 	}
 	if (!strcmp("http.sslcipherlist", var))
-		return git_config_string(&ssl_cipherlist, var, value);
+		return git_config_string_dup(&ssl_cipherlist, var, value);
 	if (!strcmp("http.sslversion", var))
-		return git_config_string(&ssl_version, var, value);
+		return git_config_string_dup(&ssl_version, var, value);
 	if (!strcmp("http.sslcert", var))
-		return git_config_pathname(&ssl_cert, var, value);
+		return git_config_pathname_dup(&ssl_cert, var, value);
 	if (!strcmp("http.sslcerttype", var))
-		return git_config_string(&ssl_cert_type, var, value);
+		return git_config_string_dup(&ssl_cert_type, var, value);
 	if (!strcmp("http.sslkey", var))
-		return git_config_pathname(&ssl_key, var, value);
+		return git_config_pathname_dup(&ssl_key, var, value);
 	if (!strcmp("http.sslkeytype", var))
-		return git_config_string(&ssl_key_type, var, value);
+		return git_config_string_dup(&ssl_key_type, var, value);
 	if (!strcmp("http.sslcapath", var))
-		return git_config_pathname(&ssl_capath, var, value);
+		return git_config_pathname_dup(&ssl_capath, var, value);
 	if (!strcmp("http.sslcainfo", var))
-		return git_config_pathname(&ssl_cainfo, var, value);
+		return git_config_pathname_dup(&ssl_cainfo, var, value);
 	if (!strcmp("http.sslcertpasswordprotected", var)) {
 		ssl_cert_password_required = git_config_bool(var, value);
 		return 0;
@@ -436,27 +436,27 @@ static int http_options(const char *var, const char *value,
 		return 0;
 	}
 	if (!strcmp("http.proxy", var))
-		return git_config_string(&curl_http_proxy, var, value);
+		return git_config_string_dup(&curl_http_proxy, var, value);
 
 	if (!strcmp("http.proxyauthmethod", var))
-		return git_config_string(&http_proxy_authmethod, var, value);
+		return git_config_string_dup(&http_proxy_authmethod, var, value);
 
 	if (!strcmp("http.proxysslcert", var))
-		return git_config_string(&http_proxy_ssl_cert, var, value);
+		return git_config_string_dup(&http_proxy_ssl_cert, var, value);
 
 	if (!strcmp("http.proxysslkey", var))
-		return git_config_string(&http_proxy_ssl_key, var, value);
+		return git_config_string_dup(&http_proxy_ssl_key, var, value);
 
 	if (!strcmp("http.proxysslcainfo", var))
-		return git_config_string(&http_proxy_ssl_ca_info, var, value);
+		return git_config_string_dup(&http_proxy_ssl_ca_info, var, value);
 
 	if (!strcmp("http.proxysslcertpasswordprotected", var)) {
 		proxy_ssl_cert_password_required = git_config_bool(var, value);
 		return 0;
 	}
 
 	if (!strcmp("http.cookiefile", var))
-		return git_config_pathname(&curl_cookie_file, var, value);
+		return git_config_pathname_dup(&curl_cookie_file, var, value);
 	if (!strcmp("http.savecookies", var)) {
 		curl_save_cookies = git_config_bool(var, value);
 		return 0;
@@ -472,7 +472,7 @@ static int http_options(const char *var, const char *value,
 	}
 
 	if (!strcmp("http.useragent", var))
-		return git_config_string(&user_agent, var, value);
+		return git_config_string_dup(&user_agent, var, value);
 
 	if (!strcmp("http.emptyauth", var)) {
 		if (value && !strcmp("auto", value))
@@ -484,7 +484,7 @@ static int http_options(const char *var, const char *value,
 
 	if (!strcmp("http.delegation", var)) {
 #ifdef CURLGSSAPI_DELEGATION_FLAG
-		return git_config_string(&curl_deleg, var, value);
+		return git_config_string_dup(&curl_deleg, var, value);
 #else
 		warning(_("Delegation control is not supported with cURL < 7.22.0"));
 		return 0;
@@ -493,7 +493,7 @@ static int http_options(const char *var, const char *value,
 
 	if (!strcmp("http.pinnedpubkey", var)) {
 #ifdef GIT_CURL_HAVE_CURLOPT_PINNEDPUBLICKEY
-		return git_config_pathname(&ssl_pinnedkey, var, value);
+		return git_config_pathname_dup(&ssl_pinnedkey, var, value);
 #else
 		warning(_("Public key pinning not supported with cURL < 7.39.0"));
 		return 0;
@@ -572,10 +572,10 @@ static void init_curl_http_auth(CURL *result)
 }
 
 /* *var must be free-able */
-static void var_override(const char **var, char *value)
+static void var_override(char **var, char *value)
 {
 	if (value) {
-		free((void *)*var);
+		free(*var);
 		*var = xstrdup(value);
 	}
 }
@@ -1208,13 +1208,6 @@ static CURL *get_curl_handle(void)
 	return result;
 }
 
-static void set_from_env(const char **var, const char *envname)
-{
-	const char *val = getenv(envname);
-	if (val)
-		*var = val;
-}
-
 void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
 	char *low_speed_limit;
@@ -1291,14 +1284,14 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 	if (getenv("GIT_SSL_NO_VERIFY"))
 		curl_ssl_verify = 0;
 
-	set_from_env(&ssl_cert, "GIT_SSL_CERT");
-	set_from_env(&ssl_cert_type, "GIT_SSL_CERT_TYPE");
-	set_from_env(&ssl_key, "GIT_SSL_KEY");
-	set_from_env(&ssl_key_type, "GIT_SSL_KEY_TYPE");
-	set_from_env(&ssl_capath, "GIT_SSL_CAPATH");
-	set_from_env(&ssl_cainfo, "GIT_SSL_CAINFO");
+	var_override(&ssl_cert, getenv("GIT_SSL_CERT"));
+	var_override(&ssl_cert_type, getenv("GIT_SSL_CERT_TYPE"));
+	var_override(&ssl_key, getenv("GIT_SSL_KEY"));
+	var_override(&ssl_key_type, getenv("GIT_SSL_KEY_TYPE"));
+	var_override(&ssl_capath, getenv("GIT_SSL_CAPATH"));
+	var_override(&ssl_cainfo, getenv("GIT_SSL_CAINFO"));
 
-	set_from_env(&user_agent, "GIT_HTTP_USER_AGENT");
+	var_override(&user_agent, getenv("GIT_HTTP_USER_AGENT"));
 
 	low_speed_limit = getenv("GIT_HTTP_LOW_SPEED_LIMIT");
 	if (low_speed_limit)
@@ -1314,9 +1307,9 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 	if (max_requests < 1)
 		max_requests = DEFAULT_MAX_REQUESTS;
 
-	set_from_env(&http_proxy_ssl_cert, "GIT_PROXY_SSL_CERT");
-	set_from_env(&http_proxy_ssl_key, "GIT_PROXY_SSL_KEY");
-	set_from_env(&http_proxy_ssl_ca_info, "GIT_PROXY_SSL_CAINFO");
+	var_override(&http_proxy_ssl_cert, getenv("GIT_PROXY_SSL_CERT"));
+	var_override(&http_proxy_ssl_key, getenv("GIT_PROXY_SSL_KEY"));
+	var_override(&http_proxy_ssl_ca_info, getenv("GIT_PROXY_SSL_CAINFO"));
 
 	if (getenv("GIT_PROXY_SSL_CERT_PASSWORD_PROTECTED"))
 		proxy_ssl_cert_password_required = 1;
-- 
2.44.0.872.g288abe5b5b


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 09/11] merge: use git_config_string_dup() for pull strategies
  2024-04-07  0:56 ` [PATCH 0/12] git_config_string() considered harmful Jeff King
                     ` (7 preceding siblings ...)
  2024-04-07  1:03   ` [PATCH 08/11] http: use git_config_string_dup() Jeff King
@ 2024-04-07  1:04   ` Jeff King
  2024-04-07  1:04   ` [PATCH 10/11] userdiff: use git_config_string_dup() when we can Jeff King
                     ` (3 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2024-04-07  1:04 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, Rubén Justo

Converting pull.twohead and pull.octopus to use git_config_string_dup()
fixes possible leaks if we see those config variables defined multiple
times. Doing so is mostly an "easy" case, except that we later may
assign a string literal to pull_twohead (which is now a non-const
pointer).

That's actually not _too_ bad in practice, as it happens after we've
done all of our config parsing (so nobody would try to free it). And the
compiler won't complain unless -Wwrite-strings is used (and turning that
on creates a host of other warnings, some useful and some not). But
while we're here let's future proof it as best we can.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/merge.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index c2be29ed2f..a9e5100e70 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -101,7 +101,7 @@ static struct strategy all_strategy[] = {
 	{ "subtree",    NO_FAST_FORWARD | NO_TRIVIAL },
 };
 
-static const char *pull_twohead, *pull_octopus;
+static char *pull_twohead, *pull_octopus;
 
 enum ff_type {
 	FF_NO,
@@ -615,9 +615,9 @@ static int git_merge_config(const char *k, const char *v,
 	else if (!strcmp(k, "merge.verifysignatures"))
 		verify_signatures = git_config_bool(k, v);
 	else if (!strcmp(k, "pull.twohead"))
-		return git_config_string(&pull_twohead, k, v);
+		return git_config_string_dup(&pull_twohead, k, v);
 	else if (!strcmp(k, "pull.octopus"))
-		return git_config_string(&pull_octopus, k, v);
+		return git_config_string_dup(&pull_octopus, k, v);
 	else if (!strcmp(k, "commit.cleanup"))
 		return git_config_string_dup(&cleanup_arg, k, v);
 	else if (!strcmp(k, "merge.ff")) {
@@ -1291,7 +1291,7 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 	if (!pull_twohead) {
 		char *default_strategy = getenv("GIT_TEST_MERGE_ALGORITHM");
 		if (default_strategy && !strcmp(default_strategy, "ort"))
-			pull_twohead = "ort";
+			pull_twohead = xstrdup("ort");
 	}
 
 	init_diff_ui_defaults();
-- 
2.44.0.872.g288abe5b5b


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 10/11] userdiff: use git_config_string_dup() when we can
  2024-04-07  0:56 ` [PATCH 0/12] git_config_string() considered harmful Jeff King
                     ` (8 preceding siblings ...)
  2024-04-07  1:04   ` [PATCH 09/11] merge: use git_config_string_dup() for pull strategies Jeff King
@ 2024-04-07  1:04   ` Jeff King
  2024-04-07  1:07   ` [PATCH 11/11] blame: use "dup" string_list for ignore-revs files Jeff King
                     ` (2 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2024-04-07  1:04 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, Rubén Justo

There are many uses of git_config_string() in the userdiff code which
have the usual "leak when we see the variable multiple times" problem.
We can fix many of these in the usual way by switching to the _dup()
variant.

But note there are some adjacent ones we cannot fix right now: funcname
patterns and word regexes default to string literals, and we can't use
those with our dup variant (which would try to free the literals). For
now let's convert what we can.

Signed-off-by: Jeff King <peff@peff.net>
---
 userdiff.c | 6 +++---
 userdiff.h | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/userdiff.c b/userdiff.c
index 92ef649c99..ba168f50b5 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -409,15 +409,15 @@ int userdiff_config(const char *k, const char *v)
 	if (!strcmp(type, "binary"))
 		return parse_tristate(&drv->binary, k, v);
 	if (!strcmp(type, "command"))
-		return git_config_string(&drv->external, k, v);
+		return git_config_string_dup(&drv->external, k, v);
 	if (!strcmp(type, "textconv"))
-		return git_config_string(&drv->textconv, k, v);
+		return git_config_string_dup(&drv->textconv, k, v);
 	if (!strcmp(type, "cachetextconv"))
 		return parse_bool(&drv->textconv_want_cache, k, v);
 	if (!strcmp(type, "wordregex"))
 		return git_config_string(&drv->word_regex, k, v);
 	if (!strcmp(type, "algorithm"))
-		return git_config_string(&drv->algorithm, k, v);
+		return git_config_string_dup(&drv->algorithm, k, v);
 
 	return 0;
 }
diff --git a/userdiff.h b/userdiff.h
index d726804c3e..7cae1f4da0 100644
--- a/userdiff.h
+++ b/userdiff.h
@@ -13,13 +13,13 @@ struct userdiff_funcname {
 
 struct userdiff_driver {
 	const char *name;
-	const char *external;
-	const char *algorithm;
+	char *external;
+	char *algorithm;
 	int binary;
 	struct userdiff_funcname funcname;
 	const char *word_regex;
 	const char *word_regex_multi_byte;
-	const char *textconv;
+	char *textconv;
 	struct notes_cache *textconv_cache;
 	int textconv_want_cache;
 };
-- 
2.44.0.872.g288abe5b5b


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 11/11] blame: use "dup" string_list for ignore-revs files
  2024-04-07  0:56 ` [PATCH 0/12] git_config_string() considered harmful Jeff King
                     ` (9 preceding siblings ...)
  2024-04-07  1:04   ` [PATCH 10/11] userdiff: use git_config_string_dup() when we can Jeff King
@ 2024-04-07  1:07   ` Jeff King
  2024-04-07  2:42     ` Eric Sunshine
  2024-04-07  1:08   ` [PATCH 0/12] git_config_string() considered harmful Jeff King
  2024-04-07 17:58   ` Rubén Justo
  12 siblings, 1 reply; 23+ messages in thread
From: Jeff King @ 2024-04-07  1:07 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, Rubén Justo

We feed items into ignore_revs_file_list in two ways: from config and
from the command-line. Items from the command-line point to argv entries
that we don't own, but items from config are hea-allocated strings whose
ownership is handed to the string list when we insert them.

Because the string_list is declared with "NODUP", our string_list_clear()
call will not free the entries themselves. This is the right thing for
the argv entries, but means that we leak the allocated config entries.

Let's declare the string list as owning its own strings, which means
that the argv entries will be copied.

For the config entries we would ideally use string_list_append_nodup()
to just hand off ownership of our strings. But we insert them into the
sorted list with string_list_insert(), which doesn't have a nodup
variant. Since this isn't a hot path, we can accept that the path
interpolation will produce a temporary string which is then freed.

Signed-off-by: Jeff King <peff@peff.net>
---
Not strictly related to this series, but something I noticed while
converting this spot in an earlier patch. I had thought originally we
could switch to avoid the extra copy altogether:

  if (!value)
	return config_error_nonbool();
  string_list_insert(&ignore_revs_file_list, value);

but of course that is missing the call to interpolate_path().

I imagine that it could also be further refactored to append while
reading the config, and then sort after. That's more efficient overall,
and would let us use append_nodup().

 builtin/blame.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 0b07ceb110..fa7f8fce09 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -67,7 +67,7 @@ static int no_whole_file_rename;
 static int show_progress;
 static char repeated_meta_color[COLOR_MAXLEN];
 static int coloring_mode;
-static struct string_list ignore_revs_file_list = STRING_LIST_INIT_NODUP;
+static struct string_list ignore_revs_file_list = STRING_LIST_INIT_DUP;
 static int mark_unblamable_lines;
 static int mark_ignored_lines;
 
@@ -725,6 +725,7 @@ static int git_blame_config(const char *var, const char *value,
 		if (ret)
 			return ret;
 		string_list_insert(&ignore_revs_file_list, str);
+		free(str);
 		return 0;
 	}
 	if (!strcmp(var, "blame.markunblamablelines")) {
-- 
2.44.0.872.g288abe5b5b

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/12] git_config_string() considered harmful
  2024-04-07  0:56 ` [PATCH 0/12] git_config_string() considered harmful Jeff King
                     ` (10 preceding siblings ...)
  2024-04-07  1:07   ` [PATCH 11/11] blame: use "dup" string_list for ignore-revs files Jeff King
@ 2024-04-07  1:08   ` Jeff King
  2024-04-07 17:58   ` Rubén Justo
  12 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2024-04-07  1:08 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, Rubén Justo

On Sat, Apr 06, 2024 at 08:56:56PM -0400, Jeff King wrote:

> Subject: Re: [PATCH 0/12] git_config_string() considered harmful
> [...]
>   [01/11]: config: make sequencer.c's git_config_string_dup() public
>   [02/11]: config: add git_config_pathname_dup()
>   [03/11]: config: prefer git_config_string_dup() for temp variables
>   [04/11]: config: use git_config_string_dup() for open-coded equivalents
>   [05/11]: config: use git_config_string_dup() to fix leaky open coding
>   [06/11]: config: use git_config_string() in easy cases
>   [07/11]: config: use git_config_pathname_dup() in easy cases
>   [08/11]: http: use git_config_string_dup()
>   [09/11]: merge: use git_config_string_dup() for pull strategies
>   [10/11]: userdiff: use git_config_string_dup() when we can
>   [11/11]: blame: use "dup" string_list for ignore-revs files

The cover letter subject is wrong, of course. :) I wrote it manually and
ended up squashing two of the patches at the last minute.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 03/11] config: prefer git_config_string_dup() for temp variables
  2024-04-07  1:00   ` [PATCH 03/11] config: prefer git_config_string_dup() for temp variables Jeff King
@ 2024-04-07  1:50     ` Eric Sunshine
  2024-04-07  2:45       ` Jeff King
  0 siblings, 1 reply; 23+ messages in thread
From: Eric Sunshine @ 2024-04-07  1:50 UTC (permalink / raw
  To: Jeff King; +Cc: Junio C Hamano, git, Rubén Justo

On Sat, Apr 6, 2024 at 9:00 PM Jeff King <peff@peff.net> wrote:
> In some cases we may use git_config_string() or git_config_pathname() to
> read a value into a temporary variable, and then either hand off memory
> ownership of the new variable or free it. These are not potential leaks,
> since we know that there is no previous value we are overwriting.
>
> However, it's worth converting these to use git_config_string_dup() and
> git_config_pathname_dup(). It makes it easier to audit for leaky cases,
> and possibly we can get rid of the leak-prone functions in the future.
> Plus it lets the const-ness of our variables match their expected memory
> ownership, which avoids some casts when calling free().
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> diff --git a/builtin/config.c b/builtin/config.c
> @@ -282,11 +282,11 @@ static int format_config(struct strbuf *buf, const char *key_,
> -                       const char *v;
> -                       if (git_config_pathname(&v, key_, value_) < 0)
> +                       char *v = NULL;
> +                       if (git_config_pathname_dup(&v, key_, value_) < 0)
>                                 return -1;
>                         strbuf_addstr(buf, v);
> -                       free((char *)v);
> +                       free(v);

This revised code implies that git_config_pathname_dup() doesn't
assign allocated memory to `v` in the "failure" case, thus it is safe
to `return` immediately without calling free(v). However, the
documentation for git_config_pathname_dup() and cousins doesn't state
this explicitly, which means the caller needs to peek into the
implementation of git_config_pathname_dup() to verify that it is safe
to write code such as the above. Hence, should the documentation be
updated to explain that `v` won't be modified in the "failure" case?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 06/11] config: use git_config_string() in easy cases
  2024-04-07  1:02   ` [PATCH 06/11] config: use git_config_string() in easy cases Jeff King
@ 2024-04-07  2:05     ` Eric Sunshine
  2024-04-07  2:47       ` Jeff King
  0 siblings, 1 reply; 23+ messages in thread
From: Eric Sunshine @ 2024-04-07  2:05 UTC (permalink / raw
  To: Jeff King; +Cc: Junio C Hamano, git, Rubén Justo

On Sat, Apr 6, 2024 at 9:02 PM Jeff King <peff@peff.net> wrote:
> config: use git_config_string() in easy cases

Did you instead mean?

    config: use git_config_string_dup() in easy cases

> (like the one above), we can easily convert them to use a variant of the
> helper that frees the existing value before writing over it. That works
> because free(NULL) is a noop. What we can't change, though, is cases
> where the pointer is initialized to a string literal, like:

Fun historical (historic?) fact: On NeXTSTEP operating system,
free(NULL) and realloc(NULL,n) would crash the program, so getting
non-native (open-source) code to run correctly often involved more
than a little effort..

>   const char *foo = "some string literal";
>
> In that case it would be incorrect to free() the original value. Those
> cases will take more refactoring to make them leak-free, and this patch
> punts on that for now.
>
> In each case we switch the underlying variable from "const char *" to
> "char *" to indicate that it will always point to allocated memory (and
> our git_config_string_dup() interface enforces this).
>
> Signed-off-by: Jeff King <peff@peff.net>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 11/11] blame: use "dup" string_list for ignore-revs files
  2024-04-07  1:07   ` [PATCH 11/11] blame: use "dup" string_list for ignore-revs files Jeff King
@ 2024-04-07  2:42     ` Eric Sunshine
  0 siblings, 0 replies; 23+ messages in thread
From: Eric Sunshine @ 2024-04-07  2:42 UTC (permalink / raw
  To: Jeff King; +Cc: Junio C Hamano, git, Rubén Justo

On Sat, Apr 6, 2024 at 9:07 PM Jeff King <peff@peff.net> wrote:
> We feed items into ignore_revs_file_list in two ways: from config and
> from the command-line. Items from the command-line point to argv entries
> that we don't own, but items from config are hea-allocated strings whose
> ownership is handed to the string list when we insert them.

s/hea/heap/

> Because the string_list is declared with "NODUP", our string_list_clear()
> call will not free the entries themselves. This is the right thing for
> the argv entries, but means that we leak the allocated config entries.
>
> Let's declare the string list as owning its own strings, which means
> that the argv entries will be copied.
>
> For the config entries we would ideally use string_list_append_nodup()
> to just hand off ownership of our strings. But we insert them into the
> sorted list with string_list_insert(), which doesn't have a nodup
> variant. Since this isn't a hot path, we can accept that the path
> interpolation will produce a temporary string which is then freed.
>
> Signed-off-by: Jeff King <peff@peff.net>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 03/11] config: prefer git_config_string_dup() for temp variables
  2024-04-07  1:50     ` Eric Sunshine
@ 2024-04-07  2:45       ` Jeff King
  0 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2024-04-07  2:45 UTC (permalink / raw
  To: Eric Sunshine; +Cc: Junio C Hamano, git, Rubén Justo

On Sat, Apr 06, 2024 at 09:50:23PM -0400, Eric Sunshine wrote:

> On Sat, Apr 6, 2024 at 9:00 PM Jeff King <peff@peff.net> wrote:
> > In some cases we may use git_config_string() or git_config_pathname() to
> > read a value into a temporary variable, and then either hand off memory
> > ownership of the new variable or free it. These are not potential leaks,
> > since we know that there is no previous value we are overwriting.
> >
> > However, it's worth converting these to use git_config_string_dup() and
> > git_config_pathname_dup(). It makes it easier to audit for leaky cases,
> > and possibly we can get rid of the leak-prone functions in the future.
> > Plus it lets the const-ness of our variables match their expected memory
> > ownership, which avoids some casts when calling free().
> >
> > Signed-off-by: Jeff King <peff@peff.net>
> > ---
> > diff --git a/builtin/config.c b/builtin/config.c
> > @@ -282,11 +282,11 @@ static int format_config(struct strbuf *buf, const char *key_,
> > -                       const char *v;
> > -                       if (git_config_pathname(&v, key_, value_) < 0)
> > +                       char *v = NULL;
> > +                       if (git_config_pathname_dup(&v, key_, value_) < 0)
> >                                 return -1;
> >                         strbuf_addstr(buf, v);
> > -                       free((char *)v);
> > +                       free(v);
> 
> This revised code implies that git_config_pathname_dup() doesn't
> assign allocated memory to `v` in the "failure" case, thus it is safe
> to `return` immediately without calling free(v). However, the
> documentation for git_config_pathname_dup() and cousins doesn't state
> this explicitly, which means the caller needs to peek into the
> implementation of git_config_pathname_dup() to verify that it is safe
> to write code such as the above. Hence, should the documentation be
> updated to explain that `v` won't be modified in the "failure" case?

That's nothing new with my patch, though, is it? The same would apply
for git_config_pathname(). And git_config_string() for that matter. I'm
also struggling to imagine what it _would_ copy into "v" on failure
anyway.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 06/11] config: use git_config_string() in easy cases
  2024-04-07  2:05     ` Eric Sunshine
@ 2024-04-07  2:47       ` Jeff King
  0 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2024-04-07  2:47 UTC (permalink / raw
  To: Eric Sunshine; +Cc: Junio C Hamano, git, Rubén Justo

On Sat, Apr 06, 2024 at 10:05:00PM -0400, Eric Sunshine wrote:

> On Sat, Apr 6, 2024 at 9:02 PM Jeff King <peff@peff.net> wrote:
> > config: use git_config_string() in easy cases
> 
> Did you instead mean?
> 
>     config: use git_config_string_dup() in easy cases

Oops, yes.

> > (like the one above), we can easily convert them to use a variant of the
> > helper that frees the existing value before writing over it. That works
> > because free(NULL) is a noop. What we can't change, though, is cases
> > where the pointer is initialized to a string literal, like:
> 
> Fun historical (historic?) fact: On NeXTSTEP operating system,
> free(NULL) and realloc(NULL,n) would crash the program, so getting
> non-native (open-source) code to run correctly often involved more
> than a little effort..

I'm happy that's no longer something we need to worry about. ;) IIRC we
had some discussion long ago about whether free(NULL) needed to be
protected, and came to the conclusion that it's OK everywhere.

(There's a discussion in a nearby thread about a possible git_free()
wrapper. The point there is errno handling, but of course it could also
cover this case for us if we found a modern platform where it was a
problem).

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/12] git_config_string() considered harmful
  2024-04-07  0:56 ` [PATCH 0/12] git_config_string() considered harmful Jeff King
                     ` (11 preceding siblings ...)
  2024-04-07  1:08   ` [PATCH 0/12] git_config_string() considered harmful Jeff King
@ 2024-04-07 17:58   ` Rubén Justo
  2024-04-08 20:55     ` Jeff King
  12 siblings, 1 reply; 23+ messages in thread
From: Rubén Justo @ 2024-04-07 17:58 UTC (permalink / raw
  To: Jeff King, Junio C Hamano; +Cc: git

On Sat, Apr 06, 2024 at 08:56:56PM -0400, Jeff King wrote:

> And it's not just git_config_pathname(), but really git_config_string(),

Indeed.

After Junio's series and yours, I'm on the fence now, but my envision was
to introduce:

--- >8 ---
diff --git a/config.c b/config.c
index eebce8c7e0..7322bdfb94 100644
--- a/config.c
+++ b/config.c
@@ -1345,6 +1345,15 @@ int git_config_string(const char **dest, const char *var, const char *value)
 	return 0;
 }
 
+int git_config_strbuf(struct strbuf *dest, const char *var, const char *value)
+{
+	if (!value)
+		return config_error_nonbool(var);
+	strbuf_reset(dest);
+	strbuf_addstr(dest, value);
+	return 0;
+}
+
 int git_config_pathname(const char **dest, const char *var, const char *value)
 {
 	if (!value)
diff --git a/config.h b/config.h
index f4966e3749..46e3137612 100644
--- a/config.h
+++ b/config.h
@@ -282,6 +282,12 @@ int git_config_bool(const char *, const char *);
  */
 int git_config_string(const char **, const char *, const char *);
 
+/**
+ * Copies the value string into the `dest` parameter; if no
+ * string is given, prints an error message and returns -1.
+ */
+int git_config_strbuf(struct strbuf *, const char *, const char *);
+
 /**
  * Similar to `git_config_string`, but expands `~` or `~user` into the
  * user's home directory when found at the beginning of the path.
--- 8< ---

To allow uses like:

--- >8 ---
diff --git a/config.c b/config.c
index 7322bdfb94..03884fa782 100644
--- a/config.c
+++ b/config.c
@@ -1572,7 +1572,7 @@ static int git_default_core_config(const char *var, const char *value,
 	}
 
 	if (!strcmp(var, "core.editor"))
-		return git_config_string(&editor_program, var, value);
+		return git_config_strbuf(&editor_program, var, value);
 
 	if (!strcmp(var, "core.commentchar") ||
 	    !strcmp(var, "core.commentstring")) {
diff --git a/editor.c b/editor.c
index b67b802ddf..618c193249 100644
--- a/editor.c
+++ b/editor.c
@@ -27,8 +27,8 @@ const char *git_editor(void)
 	const char *editor = getenv("GIT_EDITOR");
 	int terminal_is_dumb = is_terminal_dumb();
 
-	if (!editor && editor_program)
-		editor = editor_program;
+	if (!editor && editor_program.len)
+		editor = editor_program.buf;
 	if (!editor && !terminal_is_dumb)
 		editor = getenv("VISUAL");
 	if (!editor)
diff --git a/environment.c b/environment.c
index a73ba9c12c..b5073ff972 100644
--- a/environment.c
+++ b/environment.c
@@ -58,7 +58,7 @@ size_t packed_git_window_size = DEFAULT_PACKED_GIT_WINDOW_SIZE;
 size_t packed_git_limit = DEFAULT_PACKED_GIT_LIMIT;
 size_t delta_base_cache_limit = 96 * 1024 * 1024;
 unsigned long big_file_threshold = 512 * 1024 * 1024;
-const char *editor_program;
+struct strbuf editor_program = STRBUF_INIT;
 const char *askpass_program;
 const char *excludes_file;
 enum auto_crlf auto_crlf = AUTO_CRLF_FALSE;
diff --git a/environment.h b/environment.h
index 05fd94d7be..c20898345e 100644
--- a/environment.h
+++ b/environment.h
@@ -220,7 +220,7 @@ const char *get_commit_output_encoding(void);
 extern const char *git_commit_encoding;
 extern const char *git_log_output_encoding;
 
-extern const char *editor_program;
+extern struct strbuf editor_program;
 extern const char *askpass_program;
 extern const char *excludes_file;

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/12] git_config_string() considered harmful
  2024-04-07 17:58   ` Rubén Justo
@ 2024-04-08 20:55     ` Jeff King
  2024-04-11 23:36       ` Rubén Justo
  0 siblings, 1 reply; 23+ messages in thread
From: Jeff King @ 2024-04-08 20:55 UTC (permalink / raw
  To: Rubén Justo; +Cc: Junio C Hamano, git

On Sun, Apr 07, 2024 at 07:58:02PM +0200, Rubén Justo wrote:

> After Junio's series and yours, I'm on the fence now, but my envision was
> to introduce:
> 
> --- >8 ---
> diff --git a/config.c b/config.c
> index eebce8c7e0..7322bdfb94 100644
> --- a/config.c
> +++ b/config.c
> @@ -1345,6 +1345,15 @@ int git_config_string(const char **dest, const char *var, const char *value)
>  	return 0;
>  }
>  
> +int git_config_strbuf(struct strbuf *dest, const char *var, const char *value)
> +{
> +	if (!value)
> +		return config_error_nonbool(var);
> +	strbuf_reset(dest);
> +	strbuf_addstr(dest, value);
> +	return 0;
> +}
> +
>  int git_config_pathname(const char **dest, const char *var, const char *value)
>  {
>  	if (!value)

Hmm. I think that is nice in some ways, because it is a much bigger
signal about memory ownership than just dropping "const" from the
pointer.

But it is less nice in other ways. Every _user_ of the value now needs
to care that it is a strbuf, and use foo.buf consistently (which you
obviously noticed). Likewise, any downstream writers of the variable
need to treat it like a strbuf, too. So the parse-options OPT_FILENAME()
macro, for example, needs to be replaced with a strbuf-aware variant
(though arguably that is an improvement, as using the wrong one would
fail catastrophically, whereas using a non-const pointer with
OPT_FILENAME() creates a subtle bug).

I'm also not sure what the solution is for setting default values, like:

  const char *my_config = "default";

Of course that is a problem with my solution, too. Perhaps in the long
run we need to accept that they should always default to NULL, and
readers of the variable need to fill in the default when they look at it
(possibly with an accessor function).

Or I guess the alternative is to stop using bare pointers, and carry a
bit which tells us if something is heap-allocated. Which starts to look
an awful lot like a strbuf. ;)

I think in the past we've talked about being able to initialize a strbuf
like:

  struct strbuf foo = STRBUF_INIT_VAL("bar");

That would use "bar" instead of the usual slopbuf, but we can still tell
it's not a heap buffer by the fact that foo.alloc is 0.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/12] git_config_string() considered harmful
  2024-04-08 20:55     ` Jeff King
@ 2024-04-11 23:36       ` Rubén Justo
  0 siblings, 0 replies; 23+ messages in thread
From: Rubén Justo @ 2024-04-11 23:36 UTC (permalink / raw
  To: Jeff King; +Cc: Junio C Hamano, git

On Mon, Apr 08, 2024 at 04:55:11PM -0400, Jeff King wrote:
> On Sun, Apr 07, 2024 at 07:58:02PM +0200, Rubén Justo wrote:
> 
> > After Junio's series and yours, I'm on the fence now, but my envision was
> > to introduce:
> > 
> > --- >8 ---
> > diff --git a/config.c b/config.c
> > index eebce8c7e0..7322bdfb94 100644
> > --- a/config.c
> > +++ b/config.c
> > @@ -1345,6 +1345,15 @@ int git_config_string(const char **dest, const char *var, const char *value)
> >  	return 0;
> >  }
> >  
> > +int git_config_strbuf(struct strbuf *dest, const char *var, const char *value)
> > +{
> > +	if (!value)
> > +		return config_error_nonbool(var);
> > +	strbuf_reset(dest);
> > +	strbuf_addstr(dest, value);
> > +	return 0;
> > +}
> > +
> >  int git_config_pathname(const char **dest, const char *var, const char *value)
> >  {
> >  	if (!value)
> 
> Hmm. I think that is nice in some ways, because it is a much bigger
> signal about memory ownership than just dropping "const" from the
> pointer.
> 
> But it is less nice in other ways. Every _user_ of the value now needs
> to care that it is a strbuf, and use foo.buf consistently (which you
> obviously noticed). Likewise, any downstream writers of the variable
> need to treat it like a strbuf, too. So the parse-options OPT_FILENAME()
> macro, for example, needs to be replaced with a strbuf-aware variant
> (though arguably that is an improvement, as using the wrong one would
> fail catastrophically, whereas using a non-const pointer with
> OPT_FILENAME() creates a subtle bug).
> 
> I'm also not sure what the solution is for setting default values, like:
> 
>   const char *my_config = "default";
> 
> Of course that is a problem with my solution, too. Perhaps in the long
> run we need to accept that they should always default to NULL, and
> readers of the variable need to fill in the default when they look at it
> (possibly with an accessor function).
> 
> Or I guess the alternative is to stop using bare pointers, and carry a
> bit which tells us if something is heap-allocated. Which starts to look
> an awful lot like a strbuf. ;)
> 
> I think in the past we've talked about being able to initialize a strbuf
> like:
> 
>   struct strbuf foo = STRBUF_INIT_VAL("bar");
> 
> That would use "bar" instead of the usual slopbuf, but we can still tell
> it's not a heap buffer by the fact that foo.alloc is 0.
> 
> -Peff

Hi Peff.

Thanks for your ideas.

For the globals we have in environment.h, maybe we can keep them const
and avoid the other inconveniences, doing something like:

diff --git a/config.c b/config.c
index 146856567a..ead3565c27 100644
--- a/config.c
+++ b/config.c
@@ -1671,8 +1671,13 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
                return 0;
        }

-       if (!strcmp(var, "core.editor"))
-               return git_config_string(&editor_program, var, value);
+       if (!strcmp(var, "core.editor")) {
+               static struct strbuf editor_program_ = STRBUF_INIT;
+               if (git_config_strbuf(&editor_program_, var, value))
+                       return -1;
+               editor_program = editor_program_.buf;
+               return 0;
+       }

        if (!strcmp(var, "core.commentchar")) {
                if (!value)


^ permalink raw reply related	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2024-04-11 23:36 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-06 18:11 [PATCH] config: do not leak excludes_file Junio C Hamano
2024-04-06 19:17 ` [WIP] git_config_pathname() leakfix Junio C Hamano
2024-04-07  0:56 ` [PATCH 0/12] git_config_string() considered harmful Jeff King
2024-04-07  0:58   ` [PATCH 01/11] config: make sequencer.c's git_config_string_dup() public Jeff King
2024-04-07  1:00   ` [PATCH 02/11] config: add git_config_pathname_dup() Jeff King
2024-04-07  1:00   ` [PATCH 03/11] config: prefer git_config_string_dup() for temp variables Jeff King
2024-04-07  1:50     ` Eric Sunshine
2024-04-07  2:45       ` Jeff King
2024-04-07  1:01   ` [PATCH 04/11] config: use git_config_string_dup() for open-coded equivalents Jeff King
2024-04-07  1:01   ` [PATCH 05/11] config: use git_config_string_dup() to fix leaky open coding Jeff King
2024-04-07  1:02   ` [PATCH 06/11] config: use git_config_string() in easy cases Jeff King
2024-04-07  2:05     ` Eric Sunshine
2024-04-07  2:47       ` Jeff King
2024-04-07  1:03   ` [PATCH 07/11] config: use git_config_pathname_dup() " Jeff King
2024-04-07  1:03   ` [PATCH 08/11] http: use git_config_string_dup() Jeff King
2024-04-07  1:04   ` [PATCH 09/11] merge: use git_config_string_dup() for pull strategies Jeff King
2024-04-07  1:04   ` [PATCH 10/11] userdiff: use git_config_string_dup() when we can Jeff King
2024-04-07  1:07   ` [PATCH 11/11] blame: use "dup" string_list for ignore-revs files Jeff King
2024-04-07  2:42     ` Eric Sunshine
2024-04-07  1:08   ` [PATCH 0/12] git_config_string() considered harmful Jeff King
2024-04-07 17:58   ` Rubén Justo
2024-04-08 20:55     ` Jeff King
2024-04-11 23:36       ` Rubén Justo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).