Git Mailing List Archive mirror
 help / color / mirror / Atom feed
From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: vdye@github.com, johannes.schindelin@gmx.de, newren@gmail.com,
	peff@peff.net, gitster@pobox.com,
	Derrick Stolee <derrickstolee@github.com>
Subject: [PATCH 0/6] [RFC] Lazy-loaded default Git config
Date: Fri, 02 Jun 2023 14:33:34 +0000	[thread overview]
Message-ID: <pull.1539.git.1685716420.gitgitgadget@gmail.com> (raw)

While working on the stronger protections around replace refs config [1], I
wanted to think about a way to scale that concept of ensuring that we've
read the config before accessing the global to a wider scope of config keys.

[1]
https://lore.kernel.org/git/pull.1537.v2.git.1685716157.gitgitgadget@gmail.com/

If you wish to apply these patches, then do so on top of v2 of [1].

This RFC attempt to explore this space, to get a feeling whether this is
solving a worthwhile problem and what this system should look like.

One example option is presented here, but I'm not super-confident in it and
only implemented enough translations such that I could gather enough
examples to know this would work at least a little bit.

The basic idea is to create a new 'global-config' library that has enums for
different config keys, and from each enum value we can load the config
string and the stored global value, as well as access an indicator that the
value has been loaded from config yet.

In this way, we can guard accesses to the global state through a method
which will lazy-load the config value as needed. This avoids the kind of bug
where a builtin forgets to initialize config through a git_config() call at
the right time.

One of the issues that becomes particularly important is that some of our
globals are accessed before config is available. Specifically, in patch 6,
we replace the ignore_case global with the method accessor. This is used in
fspathcmp() to toggle between strcasecmp() and strcmp(). However, this
method is called as part of the process to set up the Git directory, which
must happen before we can start loading config. The current behavior is to
always use the default of 0 at this time.

To get around this, I introduce a declare_config_available() method which is
called at the right time in git.c and test-tool.c. Before this is called, we
will silently pass the default value instead of loading from config. Without
this behavior, our test suite would fail in a BUG() statement due to the
inconsistent use of the config library.

My hope with submitting this RFC is that we could come to some conclusions
on these questions:

 * Is this a worthwhile category of bug to protect against?
 * Is wrapping global state in accessor methods the right way to protect
   against that class of bugs?
 * Are there other benefits to doing this?
 * Are there reasons why this cure is worse than the disease?

There are a few things that I personally find could use improvement and I
would like to update before submitting this for full review. (I chose to not
spend extra time re-working this until there is agreement on the general
direction.) Here are the things I've identified so far:

 1. The (static) global state in global-config.c is split across two arrays
    and a bitmask. It would be better to have a single array of structs so
    we can have the default values described right next to the config key
    strings.
 2. The current accessor method is limited to returning ints, and is named
    as such. But many of the keys it replace are actually boolean-valued.
    The current implementation tries ...get_maybe_bool() before then trying
    ...get_int(), which would extend values like core.filemode to allow
    taking the value "3". This could be fixed by using the struct approach
    and signalling that some int-typed globals only allow boolean values.
    (This holds the same for int-valued things that shouldn't take "true".)
 3. The method is currently scoped globally, and does not allow for
    repository-scoped values. None of these globals are currently scoped to
    repositories, but it would be less intrusive to do repository-scoping
    later if the accessor method allowed repository scoping and then looked
    at the key-specific settings as to whether we should look globally or in
    a repository.
 4. The replacement of a constant like 'ignore_case' with a method like
    'get_int_config_global(INT_CONFIG_IGNORE_CASE)' is quite verbose and
    ugly. I specifically chose this approach instead of dedicated methods
    like 'replace_refs_enabled()' in [1] because I didn't think that
    approach would scale. However, we could perhaps use macros as a way to
    reduce the verbosity, especially when deciding to add repository-scoping
    as necessary. For instance, we could define 'get_ignore_case()' to be
    'get_int_config_global(the_repository, INT_CONFIG_IGNORE_CASE)' in
    global-config.h to avoid spreading this verbosity across the codebase.
    Alternatively, we could define these as inline methods if that is
    preferred.

Thank you for your help determining the right way forward here.

Thanks, -Stolee

Derrick Stolee (6):
  config: create new global config helpers
  config: add trust_executable_bit to global config
  config: move trust_ctime to global config
  config: move quote_path_fully to global config
  config: move has_symlinks to global config
  config: move ignore_case to global config

 Makefile                                |  1 +
 apply.c                                 |  6 ++-
 builtin/difftool.c                      |  4 +-
 builtin/mv.c                            |  2 +-
 cache.h                                 |  9 ++++-
 combine-diff.c                          |  2 +-
 config.c                                | 24 -----------
 dir.c                                   | 23 ++++++-----
 entry.c                                 |  3 +-
 environment.c                           |  4 --
 environment.h                           |  4 --
 git.c                                   |  4 ++
 global-config.c                         | 54 +++++++++++++++++++++++++
 global-config.h                         | 30 ++++++++++++++
 merge-recursive.c                       | 11 ++---
 name-hash.c                             |  6 +--
 quote.c                                 |  6 +--
 quote.h                                 |  2 -
 read-cache.c                            | 22 +++++-----
 t/helper/test-lazy-init-name-hash.c     |  5 ---
 t/helper/test-tool.c                    |  3 ++
 t/perf/p0004-lazy-init-name-hash.sh     |  6 +++
 t/t3008-ls-files-lazy-init-name-hash.sh |  1 +
 unpack-trees.c                          |  2 +-
 24 files changed, 156 insertions(+), 78 deletions(-)
 create mode 100644 global-config.c
 create mode 100644 global-config.h


base-commit: 4c7dbeb8c6dd6ab4c1903196967d5e0906a293c2
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1539%2Fderrickstolee%2Fdefault-config-safety-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1539/derrickstolee/default-config-safety-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1539
-- 
gitgitgadget

             reply	other threads:[~2023-06-02 14:33 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-02 14:33 Derrick Stolee via GitGitGadget [this message]
2023-06-02 14:33 ` [PATCH 1/6] config: create new global config helpers Derrick Stolee via GitGitGadget
2023-06-02 14:33 ` [PATCH 2/6] config: add trust_executable_bit to global config Derrick Stolee via GitGitGadget
2023-06-02 14:33 ` [PATCH 3/6] config: move trust_ctime " Derrick Stolee via GitGitGadget
2023-06-02 14:33 ` [PATCH 4/6] config: move quote_path_fully " Derrick Stolee via GitGitGadget
2023-06-02 14:33 ` [PATCH 5/6] config: move has_symlinks " Derrick Stolee via GitGitGadget
2023-06-02 14:33 ` [PATCH 6/6] config: move ignore_case " Derrick Stolee via GitGitGadget
2023-06-08 18:19 ` [PATCH 0/6] [RFC] Lazy-loaded default Git config Glen Choo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.1539.git.1685716420.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johannes.schindelin@gmx.de \
    --cc=newren@gmail.com \
    --cc=peff@peff.net \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).