Git Mailing List Archive mirror
 help / color / mirror / code / Atom feed
* [RFC PATCH 0/8] Introduce Git Standard Library
@ 2023-06-27 19:52 Calvin Wan
  2023-06-27 19:52 ` [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper Calvin Wan
                   ` (10 more replies)
  0 siblings, 11 replies; 70+ messages in thread
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

Introduction / Pre-reading
================

The Git Standard Library intends to serve as the foundational library
and root dependency that other libraries in Git will be built off of.
That is to say, suppose we have libraries X and Y; a user that wants to
use X and Y would need to include X, Y, and this Git Standard Library.
This cover letter will explain the rationale behind having a root
dependency that encompasses many files in the form of a standard library
rather than many root dependencies/libraries of those files. This does
not mean that the Git Standard Library will be the only possible root
dependency in the future, but rather the most significant and widely
used one. I will also explain why each file was chosen to be a part of
Git Standard Library v1. I will not explain entirely why we would like
to libify parts of Git -- see here[1] for that context.

Before looking at this series, it probably makes sense to look at the
other series that this is built on top of since that is the state I will
be referring to in this cover letter:

  - Elijah's final cache.h cleanup series[2]
  - my strbuf cleanup series[3]
  - my git-compat-util cleanup series[4]

Most importantly, in the git-compat-util series, the declarations for
functions implemented in wrapper.c and usage.c have been moved to their
respective header files, wrapper.h and usage.h, from git-compat-util.h.
Also config.[ch] had its general parsing code moved to parse.[ch].

Dependency graph in libified Git
================

If you look in the Git Makefile, all of the objects defined in the Git
library are compiled and archived into a singular file, libgit.a, which
is linked against by common-main.o with other external dependencies and
turned into the Git executable. In other words, the Git executable has
dependencies on libgit.a and a couple of external libraries. While our
efforts to libify Git will not affect this current build flow, it will
provide an alternate method for building Git.

With our current method of building Git, we can imagine the dependency
graph as such:

        Git
         /\
        /  \
       /    \
  libgit.a   ext deps

In libifying parts of Git, we want to shrink the dependency graph to
only the minimal set of dependencies, so libraries should not use
libgit.a. Instead, it would look like:

                Git
                /\
               /  \
              /    \
          libgit.a  ext deps
             /\
            /  \
           /    \
object-store.a  (other lib)
      |        /
      |       /
      |      /
 config.a   / 
      |    /
      |   /
      |  /
git-std-lib.a

Instead of containing all of the objects in Git, libgit.a would contain
objects that are not built by libraries it links against. Consequently,
if someone wanted their own custom build of Git with their own custom
implementation of the object store, they would only have to swap out
object-store.a rather than do a hard fork of Git.

Rationale behind Git Standard Library
================

The rationale behind Git Standard Library essentially is the result of
two observations within the Git codebase: every file includes
git-compat-util.h which defines functions in a couple of different
files, and wrapper.c + usage.c have difficult-to-separate circular
dependencies with each other and other files.

Ubiquity of git-compat-util.h and circular dependencies
========

Every file in the Git codebase includes git-compat-util.h. It serves as
"a compatibility aid that isolates the knowledge of platform specific
inclusion order and what feature macros to define before including which
system header" (Junio[5]). Since every file includes git-compat-util.h, and
git-compat-util.h includes wrapper.h and usage.h, it would make sense
for wrapper.c and usage.c to be a part of the root library. They have
difficult to separate circular dependencies with each other so they
can't be independent libraries. Wrapper.c has dependencies on parse.c,
abspath.c, strbuf.c, which in turn also have dependencies on usage.c and
wrapper.c -- more circular dependencies. 

Tradeoff between swappability and refactoring
========

From the above dependency graph, we can see that git-std-lib.a could be
many smaller libraries rather than a singular library. So why choose a
singular library when multiple libraries can be individually easier to
swap and are more modular? A singular library requires less work to
separate out circular dependencies within itself so it becomes a
tradeoff question between work and reward. While there may be a point in
the future where a file like usage.c would want its own library so that
someone can have custom die() or error(), the work required to refactor
out the circular dependencies in some files would be enormous due to
their ubiquity so therefore I believe it is not worth the tradeoff
currently. Additionally, we can in the future choose to do this refactor
and change the API for the library if there becomes enough of a reason
to do so (remember we are avoiding promising stability of the interfaces
of those libraries).

Reuse of compatibility functions in git-compat-util.h
========

Most functions defined in git-compat-util.h are implemented in compat/
and have dependencies limited to strbuf.h and wrapper.h so they can be
easily included in git-std-lib.a, which as a root dependency means that
higher level libraries do not have to worry about compatibility files in
compat/. The rest of the functions defined in git-compat-util.h are
implemented in top level files and, in this patch set, are hidden behind
an #ifdef if their implementation is not in git-std-lib.a.

Rationale summary
========

The Git Standard Library allows us to get the libification ball rolling
with other libraries in Git (such as Glen's removal of global state from
config iteration[6] prepares a config library). By not spending many
more months attempting to refactor difficult circular dependencies and
instead spending that time getting to a state where we can test out
swapping a library out such as config or object store, we can prove the
viability of Git libification on a much faster time scale. Additionally
the code cleanups that have happened so far have been minor and
beneficial for the codebase. It is probable that making large movements
would negatively affect code clarity.

Git Standard Library boundary
================

While I have described above some useful heuristics for identifying
potential candidates for git-std-lib.a, a standard library should not
have a shaky definition for what belongs in it.

 - Low-level files (aka operates only on other primitive types) that are
   used everywhere within the codebase (wrapper.c, usage.c, strbuf.c)
   - Dependencies that are low-level and widely used
     (abspath.c, date.c, hex-ll.c, parse.c, utf8.c)
 - low-level git/* files with functions defined in git-compat-util.h
   (ctype.c)
 - compat/*

There are other files that might fit this definition, but that does not
mean it should belong in git-std-lib.a. Those files should start as
their own separate library since any file added to git-std-lib.a loses
its flexibility of being easily swappable.

Files inside of Git Standard Library
================

The initial set of files in git-std-lib.a are:
abspath.c
ctype.c
date.c
hex-ll.c
parse.c
strbuf.c
usage.c
utf8.c
wrapper.c
relevant compat/ files

Pitfalls
================

In patch 7, I use #ifdef GIT_STD_LIB to both stub out code and hide
certain function headers. As other parts of Git are libified, if we
have to use more ifdefs for each different library, then the codebase
will become uglier and harder to understand. 

There are a small amount of files under compat/* that have dependencies
not inside of git-std-lib.a. While those functions are not called on
Linux, other OSes might call those problematic functions. I don't see
this as a major problem, just moreso an observation that libification in
general may also require some minor compatibility work in the future.

Testing
================

Patch 8 introduces a temporary test file which will be replaced with
unit tests once a unit testing framework is decided upon[7]. It simply
proves that all of the functions in git-std-lib.a do not have any
missing dependencies and can stand up by itself.

I have not yet tested building Git with git-std-lib.a yet (basically
removing the objects in git-std-lib.a from LIB_OBJS and linking against
git-std-lib.a instead), but I intend on testing this in a future version
of this patch. As an RFC, I want to showcase git-std-lib.a as an
experimental dependency that other executables can include in order to
use Git binaries. Internally we have tested building and calling
functions in git-std-lib.a from other programs.

Unit tests should catch any breakages caused by changes to files in
git-std-lib.a (i.e. introduction of a out of scope dependency) and new
functions introduced to git-std-lib.a will require unit tests written
for them.

Series structure
================

While my strbuf and git-compat-util series can stand alone, they also
function as preparatory patches for this series. There are more cleanup
patches in this series, but since most of them have marginal benefits
probably not worth the churn on its own, I decided not to split them
into a separate series like with strbuf and git-compat-util. As an RFC,
I am looking for comments on whether the rationale behind git-std-lib
makes sense as well as whether there are better ways to build and enable
git-std-lib in patch 7, specifically regarding Makefile rules and the
usage of ifdef's to stub out certain functions and headers. 

The patch series is structured as follows:

Patches 1-6 are cleanup patches to remove the last few extraneous
dependencies from git-std-lib.a. Here's a short summary of the
dependencies that are specifically removed from git-std-lib.a since some
of the commit messages and diffs showcase dependency cleanups for other
files not directly related to git-std-lib.a:
 - Patch 1 removes trace2.h and repository.h dependencies from wrapper.c
 - Patch 2 removes the repository.h dependency from strbuf.c inherited from
   hex.c by separating it into hex-ll.c and hex.c
 - Patch 3 removes the object.h dependency from wrapper.c
 - Patch 4 is a bug fix that sets up the next patch. This importantly
   removes the git_config_bool() call from git_env_bool() so that env
   parsing can go in a separate file
 - Patch 5 removes the config.h dependency from wrapper.c and swaps it
   with a dependency to parse.h, which doesn't have extraneous
   dependencies to files outside of git-std-lib.a
 - Patch 6 removes the pager.h dependency from date.c

Patch 7 introduces Git standard library.

Patch 8 introduces a temporary test file for Git standard library. The
test file directly or indirectly calls all functions in git-std-lib.a to
showcase that the functions don't reference missing objects and that
git-std-lib.a can stand on its own.

[1] https://lore.kernel.org/git/CAJoAoZ=Cig_kLocxKGax31sU7Xe4==BGzC__Bg2_pr7krNq6MA@mail.gmail.com/
[2] https://lore.kernel.org/git/pull.1525.v3.git.1684218848.gitgitgadget@gmail.com/
[3] https://lore.kernel.org/git/20230606194720.2053551-1-calvinwan@google.com/
[4] https://lore.kernel.org/git/20230606170711.912972-1-calvinwan@google.com/
[5] https://lore.kernel.org/git/xmqqwn17sydw.fsf@gitster.g/
[6] https://lore.kernel.org/git/pull.1497.v3.git.git.1687290231.gitgitgadget@gmail.com/
[7] https://lore.kernel.org/git/8afdb215d7e10ca16a2ce8226b4127b3d8a2d971.1686352386.git.steadmon@google.com/

Calvin Wan (8):
  trace2: log fsync stats in trace2 rather than wrapper
  hex-ll: split out functionality from hex
  object: move function to object.c
  config: correct bad boolean env value error message
  parse: create new library for parsing strings and env values
  pager: remove pager_in_use()
  git-std-lib: introduce git standard library
  git-std-lib: add test file to call git-std-lib.a functions

 Documentation/technical/git-std-lib.txt | 182 ++++++++++++++++++
 Makefile                                |  30 ++-
 attr.c                                  |   2 +-
 builtin/log.c                           |   2 +-
 color.c                                 |   4 +-
 column.c                                |   2 +-
 config.c                                | 173 +----------------
 config.h                                |  14 +-
 date.c                                  |   4 +-
 git-compat-util.h                       |   7 +-
 git.c                                   |   2 +-
 hex-ll.c                                |  49 +++++
 hex-ll.h                                |  27 +++
 hex.c                                   |  47 -----
 hex.h                                   |  24 +--
 mailinfo.c                              |   2 +-
 object.c                                |   5 +
 object.h                                |   6 +
 pack-objects.c                          |   2 +-
 pack-revindex.c                         |   2 +-
 pager.c                                 |   5 -
 pager.h                                 |   1 -
 parse-options.c                         |   3 +-
 parse.c                                 | 182 ++++++++++++++++++
 parse.h                                 |  20 ++
 pathspec.c                              |   2 +-
 preload-index.c                         |   2 +-
 progress.c                              |   2 +-
 prompt.c                                |   2 +-
 rebase.c                                |   2 +-
 strbuf.c                                |   2 +-
 symlinks.c                              |   2 +
 t/Makefile                              |   4 +
 t/helper/test-env-helper.c              |   2 +-
 t/stdlib-test.c                         | 239 ++++++++++++++++++++++++
 trace2.c                                |  13 ++
 trace2.h                                |   5 +
 unpack-trees.c                          |   2 +-
 url.c                                   |   2 +-
 urlmatch.c                              |   2 +-
 usage.c                                 |   8 +
 wrapper.c                               |  25 +--
 wrapper.h                               |   9 +-
 write-or-die.c                          |   2 +-
 44 files changed, 813 insertions(+), 311 deletions(-)
 create mode 100644 Documentation/technical/git-std-lib.txt
 create mode 100644 hex-ll.c
 create mode 100644 hex-ll.h
 create mode 100644 parse.c
 create mode 100644 parse.h
 create mode 100644 t/stdlib-test.c

-- 
2.41.0.162.gfafddb0af9-goog


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
@ 2023-06-27 19:52 ` Calvin Wan
  2023-06-28  2:05   ` Victoria Dye
  2023-07-11 20:07   ` Jeff Hostetler
  2023-06-27 19:52 ` [RFC PATCH 2/8] hex-ll: split out functionality from hex Calvin Wan
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 70+ messages in thread
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

As a library boundary, wrapper.c should not directly log trace2
statistics, but instead provide those statistics upon
request. Therefore, move the trace2 logging code to trace2.[ch.]. This
also allows wrapper.c to not be dependent on trace2.h and repository.h.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 trace2.c  | 13 +++++++++++++
 trace2.h  |  5 +++++
 wrapper.c | 17 ++++++-----------
 wrapper.h |  4 ++--
 4 files changed, 26 insertions(+), 13 deletions(-)

diff --git a/trace2.c b/trace2.c
index 0efc4e7b95..f367a1ce31 100644
--- a/trace2.c
+++ b/trace2.c
@@ -915,3 +915,16 @@ const char *trace2_session_id(void)
 {
 	return tr2_sid_get();
 }
+
+static void log_trace_fsync_if(const char *key)
+{
+	intmax_t value = get_trace_git_fsync_stats(key);
+	if (value)
+		trace2_data_intmax("fsync", the_repository, key, value);
+}
+
+void trace_git_fsync_stats(void)
+{
+	log_trace_fsync_if("fsync/writeout-only");
+	log_trace_fsync_if("fsync/hardware-flush");
+}
diff --git a/trace2.h b/trace2.h
index 4ced30c0db..689e9a4027 100644
--- a/trace2.h
+++ b/trace2.h
@@ -581,4 +581,9 @@ void trace2_collect_process_info(enum trace2_process_info_reason reason);
 
 const char *trace2_session_id(void);
 
+/*
+ * Writes out trace statistics for fsync
+ */
+void trace_git_fsync_stats(void);
+
 #endif /* TRACE2_H */
diff --git a/wrapper.c b/wrapper.c
index 22be9812a7..bd7f0a9752 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -6,9 +6,7 @@
 #include "config.h"
 #include "gettext.h"
 #include "object.h"
-#include "repository.h"
 #include "strbuf.h"
-#include "trace2.h"
 
 static intmax_t count_fsync_writeout_only;
 static intmax_t count_fsync_hardware_flush;
@@ -600,16 +598,13 @@ int git_fsync(int fd, enum fsync_action action)
 	}
 }
 
-static void log_trace_fsync_if(const char *key, intmax_t value)
+intmax_t get_trace_git_fsync_stats(const char *key)
 {
-	if (value)
-		trace2_data_intmax("fsync", the_repository, key, value);
-}
-
-void trace_git_fsync_stats(void)
-{
-	log_trace_fsync_if("fsync/writeout-only", count_fsync_writeout_only);
-	log_trace_fsync_if("fsync/hardware-flush", count_fsync_hardware_flush);
+	if (!strcmp(key, "fsync/writeout-only"))
+		return count_fsync_writeout_only;
+	if (!strcmp(key, "fsync/hardware-flush"))
+		return count_fsync_hardware_flush;
+	return 0;
 }
 
 static int warn_if_unremovable(const char *op, const char *file, int rc)
diff --git a/wrapper.h b/wrapper.h
index c85b1328d1..db1bc109ed 100644
--- a/wrapper.h
+++ b/wrapper.h
@@ -88,9 +88,9 @@ enum fsync_action {
 int git_fsync(int fd, enum fsync_action action);
 
 /*
- * Writes out trace statistics for fsync using the trace2 API.
+ * Returns trace statistics for fsync using the trace2 API.
  */
-void trace_git_fsync_stats(void);
+intmax_t get_trace_git_fsync_stats(const char *key);
 
 /*
  * Preserves errno, prints a message, but gives no warning for ENOENT.
-- 
2.41.0.162.gfafddb0af9-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [RFC PATCH 2/8] hex-ll: split out functionality from hex
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
  2023-06-27 19:52 ` [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper Calvin Wan
@ 2023-06-27 19:52 ` Calvin Wan
  2023-06-28 13:15   ` Phillip Wood
  2023-06-27 19:52 ` [RFC PATCH 3/8] object: move function to object.c Calvin Wan
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 70+ messages in thread
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

Separate out hex functionality that doesn't require a hash algo into
hex-ll.[ch]. Since the hash algo is currently a global that sits in
repository, this separation removes that dependency for files that only
need basic hex manipulation functions.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 Makefile   |  1 +
 color.c    |  2 +-
 hex-ll.c   | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 hex-ll.h   | 27 +++++++++++++++++++++++++++
 hex.c      | 47 -----------------------------------------------
 hex.h      | 24 +-----------------------
 mailinfo.c |  2 +-
 strbuf.c   |  2 +-
 url.c      |  2 +-
 urlmatch.c |  2 +-
 10 files changed, 83 insertions(+), 75 deletions(-)
 create mode 100644 hex-ll.c
 create mode 100644 hex-ll.h

diff --git a/Makefile b/Makefile
index 045e2187c4..83b385b0be 100644
--- a/Makefile
+++ b/Makefile
@@ -1040,6 +1040,7 @@ LIB_OBJS += hash-lookup.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
+LIB_OBJS += hex-ll.o
 LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += json-writer.o
diff --git a/color.c b/color.c
index 83abb11eda..f3c0a4659b 100644
--- a/color.c
+++ b/color.c
@@ -3,7 +3,7 @@
 #include "color.h"
 #include "editor.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "pager.h"
 #include "strbuf.h"
 
diff --git a/hex-ll.c b/hex-ll.c
new file mode 100644
index 0000000000..4d7ece1de5
--- /dev/null
+++ b/hex-ll.c
@@ -0,0 +1,49 @@
+#include "git-compat-util.h"
+#include "hex-ll.h"
+
+const signed char hexval_table[256] = {
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 00-07 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 08-0f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 10-17 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 18-1f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 20-27 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 28-2f */
+	  0,  1,  2,  3,  4,  5,  6,  7,		/* 30-37 */
+	  8,  9, -1, -1, -1, -1, -1, -1,		/* 38-3f */
+	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 40-47 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 48-4f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 50-57 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 58-5f */
+	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 60-67 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 68-67 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 70-77 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 78-7f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 80-87 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 88-8f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 90-97 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 98-9f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a0-a7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a8-af */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b0-b7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b8-bf */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c0-c7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c8-cf */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d0-d7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d8-df */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e0-e7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e8-ef */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f0-f7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f8-ff */
+};
+
+int hex_to_bytes(unsigned char *binary, const char *hex, size_t len)
+{
+	for (; len; len--, hex += 2) {
+		unsigned int val = (hexval(hex[0]) << 4) | hexval(hex[1]);
+
+		if (val & ~0xff)
+			return -1;
+		*binary++ = val;
+	}
+	return 0;
+}
diff --git a/hex-ll.h b/hex-ll.h
new file mode 100644
index 0000000000..a381fa8556
--- /dev/null
+++ b/hex-ll.h
@@ -0,0 +1,27 @@
+#ifndef HEX_LL_H
+#define HEX_LL_H
+
+extern const signed char hexval_table[256];
+static inline unsigned int hexval(unsigned char c)
+{
+	return hexval_table[c];
+}
+
+/*
+ * Convert two consecutive hexadecimal digits into a char.  Return a
+ * negative value on error.  Don't run over the end of short strings.
+ */
+static inline int hex2chr(const char *s)
+{
+	unsigned int val = hexval(s[0]);
+	return (val & ~0xf) ? val : (val << 4) | hexval(s[1]);
+}
+
+/*
+ * Read `len` pairs of hexadecimal digits from `hex` and write the
+ * values to `binary` as `len` bytes. Return 0 on success, or -1 if
+ * the input does not consist of hex digits).
+ */
+int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
+
+#endif
diff --git a/hex.c b/hex.c
index 7bb440e794..03e55841ed 100644
--- a/hex.c
+++ b/hex.c
@@ -2,53 +2,6 @@
 #include "hash.h"
 #include "hex.h"
 
-const signed char hexval_table[256] = {
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 00-07 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 08-0f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 10-17 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 18-1f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 20-27 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 28-2f */
-	  0,  1,  2,  3,  4,  5,  6,  7,		/* 30-37 */
-	  8,  9, -1, -1, -1, -1, -1, -1,		/* 38-3f */
-	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 40-47 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 48-4f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 50-57 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 58-5f */
-	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 60-67 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 68-67 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 70-77 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 78-7f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 80-87 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 88-8f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 90-97 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 98-9f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a0-a7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a8-af */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b0-b7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b8-bf */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c0-c7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c8-cf */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d0-d7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d8-df */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e0-e7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e8-ef */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f0-f7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f8-ff */
-};
-
-int hex_to_bytes(unsigned char *binary, const char *hex, size_t len)
-{
-	for (; len; len--, hex += 2) {
-		unsigned int val = (hexval(hex[0]) << 4) | hexval(hex[1]);
-
-		if (val & ~0xff)
-			return -1;
-		*binary++ = val;
-	}
-	return 0;
-}
-
 static int get_hash_hex_algop(const char *hex, unsigned char *hash,
 			      const struct git_hash_algo *algop)
 {
diff --git a/hex.h b/hex.h
index 7df4b3c460..c07c8b34c2 100644
--- a/hex.h
+++ b/hex.h
@@ -2,22 +2,7 @@
 #define HEX_H
 
 #include "hash-ll.h"
-
-extern const signed char hexval_table[256];
-static inline unsigned int hexval(unsigned char c)
-{
-	return hexval_table[c];
-}
-
-/*
- * Convert two consecutive hexadecimal digits into a char.  Return a
- * negative value on error.  Don't run over the end of short strings.
- */
-static inline int hex2chr(const char *s)
-{
-	unsigned int val = hexval(s[0]);
-	return (val & ~0xf) ? val : (val << 4) | hexval(s[1]);
-}
+#include "hex-ll.h"
 
 /*
  * Try to read a SHA1 in hexadecimal format from the 40 characters
@@ -32,13 +17,6 @@ int get_oid_hex(const char *hex, struct object_id *sha1);
 /* Like get_oid_hex, but for an arbitrary hash algorithm. */
 int get_oid_hex_algop(const char *hex, struct object_id *oid, const struct git_hash_algo *algop);
 
-/*
- * Read `len` pairs of hexadecimal digits from `hex` and write the
- * values to `binary` as `len` bytes. Return 0 on success, or -1 if
- * the input does not consist of hex digits).
- */
-int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
-
 /*
  * Convert a binary hash in "unsigned char []" or an object name in
  * "struct object_id *" to its hex equivalent. The `_r` variant is reentrant,
diff --git a/mailinfo.c b/mailinfo.c
index 2aeb20e5e6..eb34c30be7 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -1,7 +1,7 @@
 #include "git-compat-util.h"
 #include "config.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "utf8.h"
 #include "strbuf.h"
 #include "mailinfo.h"
diff --git a/strbuf.c b/strbuf.c
index 8dac52b919..a2a05fe168 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "string-list.h"
 #include "utf8.h"
diff --git a/url.c b/url.c
index 2e1a9f6fee..282b12495a 100644
--- a/url.c
+++ b/url.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "url.h"
 
diff --git a/urlmatch.c b/urlmatch.c
index eba0bdd77f..f1aa87d1dd 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "urlmatch.h"
 
-- 
2.41.0.162.gfafddb0af9-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [RFC PATCH 3/8] object: move function to object.c
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
  2023-06-27 19:52 ` [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper Calvin Wan
  2023-06-27 19:52 ` [RFC PATCH 2/8] hex-ll: split out functionality from hex Calvin Wan
@ 2023-06-27 19:52 ` Calvin Wan
  2023-06-27 19:52 ` [RFC PATCH 4/8] config: correct bad boolean env value error message Calvin Wan
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 70+ messages in thread
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

While remove_or_warn() is a simple ternary operator to call two other
wrapper functions, it creates an unnecessary dependency to object.h in
wrapper.c. Therefore move the function to object.[ch] where the concept
of GITLINKs is first defined.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 object.c  | 5 +++++
 object.h  | 6 ++++++
 wrapper.c | 6 ------
 wrapper.h | 5 -----
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/object.c b/object.c
index 60f954194f..cb29fcc304 100644
--- a/object.c
+++ b/object.c
@@ -617,3 +617,8 @@ void parsed_object_pool_clear(struct parsed_object_pool *o)
 	FREE_AND_NULL(o->object_state);
 	FREE_AND_NULL(o->shallow_stat);
 }
+
+int remove_or_warn(unsigned int mode, const char *file)
+{
+	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
+}
diff --git a/object.h b/object.h
index 5871615fee..e908ef6515 100644
--- a/object.h
+++ b/object.h
@@ -284,4 +284,10 @@ void clear_object_flags(unsigned flags);
  */
 void repo_clear_commit_marks(struct repository *r, unsigned int flags);
 
+/*
+ * Calls the correct function out of {unlink,rmdir}_or_warn based on
+ * the supplied file mode.
+ */
+int remove_or_warn(unsigned int mode, const char *path);
+
 #endif /* OBJECT_H */
diff --git a/wrapper.c b/wrapper.c
index bd7f0a9752..62c04aeb17 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -5,7 +5,6 @@
 #include "abspath.h"
 #include "config.h"
 #include "gettext.h"
-#include "object.h"
 #include "strbuf.h"
 
 static intmax_t count_fsync_writeout_only;
@@ -642,11 +641,6 @@ int rmdir_or_warn(const char *file)
 	return warn_if_unremovable("rmdir", file, rmdir(file));
 }
 
-int remove_or_warn(unsigned int mode, const char *file)
-{
-	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
-}
-
 static int access_error_is_ok(int err, unsigned flag)
 {
 	return (is_missing_file_error(err) ||
diff --git a/wrapper.h b/wrapper.h
index db1bc109ed..166740ae60 100644
--- a/wrapper.h
+++ b/wrapper.h
@@ -111,11 +111,6 @@ int unlink_or_msg(const char *file, struct strbuf *err);
  * not exist.
  */
 int rmdir_or_warn(const char *path);
-/*
- * Calls the correct function out of {unlink,rmdir}_or_warn based on
- * the supplied file mode.
- */
-int remove_or_warn(unsigned int mode, const char *path);
 
 /*
  * Call access(2), but warn for any error except "missing file"
-- 
2.41.0.162.gfafddb0af9-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [RFC PATCH 4/8] config: correct bad boolean env value error message
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
                   ` (2 preceding siblings ...)
  2023-06-27 19:52 ` [RFC PATCH 3/8] object: move function to object.c Calvin Wan
@ 2023-06-27 19:52 ` Calvin Wan
  2023-06-27 19:52 ` [RFC PATCH 5/8] parse: create new library for parsing strings and env values Calvin Wan
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 70+ messages in thread
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

An incorrectly defined boolean environment value would result in the
following error message:

bad boolean config value '%s' for '%s'

This is a misnomer since environment value != config value. Instead of
calling git_config_bool() to parse the environment value, mimic the
functionality inside of git_config_bool() but with the correct error
message.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 config.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/config.c b/config.c
index 09851a6909..5b71ef1624 100644
--- a/config.c
+++ b/config.c
@@ -2172,7 +2172,14 @@ void git_global_config(char **user_out, char **xdg_out)
 int git_env_bool(const char *k, int def)
 {
 	const char *v = getenv(k);
-	return v ? git_config_bool(k, v) : def;
+	int val;
+	if (!v)
+		return def;
+	val = git_parse_maybe_bool(v);
+	if (val < 0)
+		die(_("bad boolean environment value '%s' for '%s'"),
+		    v, k);
+	return val;
 }
 
 /*
-- 
2.41.0.162.gfafddb0af9-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [RFC PATCH 5/8] parse: create new library for parsing strings and env values
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
                   ` (3 preceding siblings ...)
  2023-06-27 19:52 ` [RFC PATCH 4/8] config: correct bad boolean env value error message Calvin Wan
@ 2023-06-27 19:52 ` Calvin Wan
  2023-06-27 22:58   ` Junio C Hamano
  2023-06-27 19:52 ` [RFC PATCH 6/8] pager: remove pager_in_use() Calvin Wan
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 70+ messages in thread
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

While string and environment value parsing is mainly consumed by
config.c, there are other files that only need parsing functionality and
not config functionality. By separating out string and environment value
parsing from config, those files can instead be dependent on parse,
which has a much smaller dependency chain than config.

Move general string and env parsing functions from config.[ch] to
parse.[ch].

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 Makefile                   |   1 +
 attr.c                     |   2 +-
 config.c                   | 180 +-----------------------------------
 config.h                   |  14 +--
 pack-objects.c             |   2 +-
 pack-revindex.c            |   2 +-
 parse-options.c            |   3 +-
 parse.c                    | 182 +++++++++++++++++++++++++++++++++++++
 parse.h                    |  20 ++++
 pathspec.c                 |   2 +-
 preload-index.c            |   2 +-
 progress.c                 |   2 +-
 prompt.c                   |   2 +-
 rebase.c                   |   2 +-
 t/helper/test-env-helper.c |   2 +-
 unpack-trees.c             |   2 +-
 wrapper.c                  |   2 +-
 write-or-die.c             |   2 +-
 18 files changed, 219 insertions(+), 205 deletions(-)
 create mode 100644 parse.c
 create mode 100644 parse.h

diff --git a/Makefile b/Makefile
index 83b385b0be..e9ad9f9ef1 100644
--- a/Makefile
+++ b/Makefile
@@ -1091,6 +1091,7 @@ LIB_OBJS += pack-write.o
 LIB_OBJS += packfile.o
 LIB_OBJS += pager.o
 LIB_OBJS += parallel-checkout.o
+LIB_OBJS += parse.o
 LIB_OBJS += parse-options-cb.o
 LIB_OBJS += parse-options.o
 LIB_OBJS += patch-delta.o
diff --git a/attr.c b/attr.c
index e9c81b6e07..cb047b4618 100644
--- a/attr.c
+++ b/attr.c
@@ -7,7 +7,7 @@
  */
 
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "environment.h"
 #include "exec-cmd.h"
 #include "attr.h"
diff --git a/config.c b/config.c
index 5b71ef1624..cdd70999aa 100644
--- a/config.c
+++ b/config.c
@@ -11,6 +11,7 @@
 #include "date.h"
 #include "branch.h"
 #include "config.h"
+#include "parse.h"
 #include "convert.h"
 #include "environment.h"
 #include "gettext.h"
@@ -1204,129 +1205,6 @@ static int git_parse_source(struct config_source *cs, config_fn_t fn,
 	return error_return;
 }
 
-static uintmax_t get_unit_factor(const char *end)
-{
-	if (!*end)
-		return 1;
-	else if (!strcasecmp(end, "k"))
-		return 1024;
-	else if (!strcasecmp(end, "m"))
-		return 1024 * 1024;
-	else if (!strcasecmp(end, "g"))
-		return 1024 * 1024 * 1024;
-	return 0;
-}
-
-static int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
-{
-	if (value && *value) {
-		char *end;
-		intmax_t val;
-		intmax_t factor;
-
-		if (max < 0)
-			BUG("max must be a positive integer");
-
-		errno = 0;
-		val = strtoimax(value, &end, 0);
-		if (errno == ERANGE)
-			return 0;
-		if (end == value) {
-			errno = EINVAL;
-			return 0;
-		}
-		factor = get_unit_factor(end);
-		if (!factor) {
-			errno = EINVAL;
-			return 0;
-		}
-		if ((val < 0 && -max / factor > val) ||
-		    (val > 0 && max / factor < val)) {
-			errno = ERANGE;
-			return 0;
-		}
-		val *= factor;
-		*ret = val;
-		return 1;
-	}
-	errno = EINVAL;
-	return 0;
-}
-
-static int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
-{
-	if (value && *value) {
-		char *end;
-		uintmax_t val;
-		uintmax_t factor;
-
-		/* negative values would be accepted by strtoumax */
-		if (strchr(value, '-')) {
-			errno = EINVAL;
-			return 0;
-		}
-		errno = 0;
-		val = strtoumax(value, &end, 0);
-		if (errno == ERANGE)
-			return 0;
-		if (end == value) {
-			errno = EINVAL;
-			return 0;
-		}
-		factor = get_unit_factor(end);
-		if (!factor) {
-			errno = EINVAL;
-			return 0;
-		}
-		if (unsigned_mult_overflows(factor, val) ||
-		    factor * val > max) {
-			errno = ERANGE;
-			return 0;
-		}
-		val *= factor;
-		*ret = val;
-		return 1;
-	}
-	errno = EINVAL;
-	return 0;
-}
-
-int git_parse_int(const char *value, int *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-static int git_parse_int64(const char *value, int64_t *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int64_t)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-int git_parse_ulong(const char *value, unsigned long *ret)
-{
-	uintmax_t tmp;
-	if (!git_parse_unsigned(value, &tmp, maximum_unsigned_value_of_type(long)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-int git_parse_ssize_t(const char *value, ssize_t *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
 static int reader_config_name(struct config_reader *reader, const char **out);
 static int reader_origin_type(struct config_reader *reader,
 			      enum config_origin_type *type);
@@ -1404,23 +1282,6 @@ ssize_t git_config_ssize_t(const char *name, const char *value)
 	return ret;
 }
 
-static int git_parse_maybe_bool_text(const char *value)
-{
-	if (!value)
-		return 1;
-	if (!*value)
-		return 0;
-	if (!strcasecmp(value, "true")
-	    || !strcasecmp(value, "yes")
-	    || !strcasecmp(value, "on"))
-		return 1;
-	if (!strcasecmp(value, "false")
-	    || !strcasecmp(value, "no")
-	    || !strcasecmp(value, "off"))
-		return 0;
-	return -1;
-}
-
 static const struct fsync_component_name {
 	const char *name;
 	enum fsync_component component_bits;
@@ -1495,16 +1356,6 @@ static enum fsync_component parse_fsync_components(const char *var, const char *
 	return (current & ~negative) | positive;
 }
 
-int git_parse_maybe_bool(const char *value)
-{
-	int v = git_parse_maybe_bool_text(value);
-	if (0 <= v)
-		return v;
-	if (git_parse_int(value, &v))
-		return !!v;
-	return -1;
-}
-
 int git_config_bool_or_int(const char *name, const char *value, int *is_bool)
 {
 	int v = git_parse_maybe_bool_text(value);
@@ -2165,35 +2016,6 @@ void git_global_config(char **user_out, char **xdg_out)
 	*xdg_out = xdg_config;
 }
 
-/*
- * Parse environment variable 'k' as a boolean (in various
- * possible spellings); if missing, use the default value 'def'.
- */
-int git_env_bool(const char *k, int def)
-{
-	const char *v = getenv(k);
-	int val;
-	if (!v)
-		return def;
-	val = git_parse_maybe_bool(v);
-	if (val < 0)
-		die(_("bad boolean environment value '%s' for '%s'"),
-		    v, k);
-	return val;
-}
-
-/*
- * Parse environment variable 'k' as ulong with possibly a unit
- * suffix; if missing, use the default value 'val'.
- */
-unsigned long git_env_ulong(const char *k, unsigned long val)
-{
-	const char *v = getenv(k);
-	if (v && !git_parse_ulong(v, &val))
-		die(_("failed to parse %s"), k);
-	return val;
-}
-
 int git_config_system(void)
 {
 	return !git_env_bool("GIT_CONFIG_NOSYSTEM", 0);
diff --git a/config.h b/config.h
index 247b572b37..7a7f53e503 100644
--- a/config.h
+++ b/config.h
@@ -3,7 +3,7 @@
 
 #include "hashmap.h"
 #include "string-list.h"
-
+#include "parse.h"
 
 /**
  * The config API gives callers a way to access Git configuration files
@@ -205,16 +205,6 @@ int config_with_options(config_fn_t fn, void *,
  * The following helper functions aid in parsing string values
  */
 
-int git_parse_ssize_t(const char *, ssize_t *);
-int git_parse_ulong(const char *, unsigned long *);
-int git_parse_int(const char *value, int *ret);
-
-/**
- * Same as `git_config_bool`, except that it returns -1 on error rather
- * than dying.
- */
-int git_parse_maybe_bool(const char *);
-
 /**
  * Parse the string to an integer, including unit factors. Dies on error;
  * otherwise, returns the parsed result.
@@ -343,8 +333,6 @@ int git_config_rename_section(const char *, const char *);
 int git_config_rename_section_in_file(const char *, const char *, const char *);
 int git_config_copy_section(const char *, const char *);
 int git_config_copy_section_in_file(const char *, const char *, const char *);
-int git_env_bool(const char *, int);
-unsigned long git_env_ulong(const char *, unsigned long);
 int git_config_system(void);
 int config_error_nonbool(const char *);
 #if defined(__GNUC__)
diff --git a/pack-objects.c b/pack-objects.c
index 1b8052bece..f403ca6986 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -3,7 +3,7 @@
 #include "pack.h"
 #include "pack-objects.h"
 #include "packfile.h"
-#include "config.h"
+#include "parse.h"
 
 static uint32_t locate_object_entry_hash(struct packing_data *pdata,
 					 const struct object_id *oid,
diff --git a/pack-revindex.c b/pack-revindex.c
index 7fffcad912..a01a2a4640 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -6,7 +6,7 @@
 #include "packfile.h"
 #include "strbuf.h"
 #include "trace2.h"
-#include "config.h"
+#include "parse.h"
 #include "midx.h"
 #include "csum-file.h"
 
diff --git a/parse-options.c b/parse-options.c
index f8a155ee13..9f542950a7 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -1,11 +1,12 @@
 #include "git-compat-util.h"
 #include "parse-options.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "commit.h"
 #include "color.h"
 #include "gettext.h"
 #include "strbuf.h"
+#include "string-list.h"
 #include "utf8.h"
 
 static int disallow_abbreviated_options;
diff --git a/parse.c b/parse.c
new file mode 100644
index 0000000000..42d691a0fb
--- /dev/null
+++ b/parse.c
@@ -0,0 +1,182 @@
+#include "git-compat-util.h"
+#include "gettext.h"
+#include "parse.h"
+
+static uintmax_t get_unit_factor(const char *end)
+{
+	if (!*end)
+		return 1;
+	else if (!strcasecmp(end, "k"))
+		return 1024;
+	else if (!strcasecmp(end, "m"))
+		return 1024 * 1024;
+	else if (!strcasecmp(end, "g"))
+		return 1024 * 1024 * 1024;
+	return 0;
+}
+
+int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
+{
+	if (value && *value) {
+		char *end;
+		intmax_t val;
+		intmax_t factor;
+
+		if (max < 0)
+			BUG("max must be a positive integer");
+
+		errno = 0;
+		val = strtoimax(value, &end, 0);
+		if (errno == ERANGE)
+			return 0;
+		if (end == value) {
+			errno = EINVAL;
+			return 0;
+		}
+		factor = get_unit_factor(end);
+		if (!factor) {
+			errno = EINVAL;
+			return 0;
+		}
+		if ((val < 0 && -max / factor > val) ||
+		    (val > 0 && max / factor < val)) {
+			errno = ERANGE;
+			return 0;
+		}
+		val *= factor;
+		*ret = val;
+		return 1;
+	}
+	errno = EINVAL;
+	return 0;
+}
+
+static int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
+{
+	if (value && *value) {
+		char *end;
+		uintmax_t val;
+		uintmax_t factor;
+
+		/* negative values would be accepted by strtoumax */
+		if (strchr(value, '-')) {
+			errno = EINVAL;
+			return 0;
+		}
+		errno = 0;
+		val = strtoumax(value, &end, 0);
+		if (errno == ERANGE)
+			return 0;
+		if (end == value) {
+			errno = EINVAL;
+			return 0;
+		}
+		factor = get_unit_factor(end);
+		if (!factor) {
+			errno = EINVAL;
+			return 0;
+		}
+		if (unsigned_mult_overflows(factor, val) ||
+		    factor * val > max) {
+			errno = ERANGE;
+			return 0;
+		}
+		val *= factor;
+		*ret = val;
+		return 1;
+	}
+	errno = EINVAL;
+	return 0;
+}
+
+int git_parse_int(const char *value, int *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_int64(const char *value, int64_t *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int64_t)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_ulong(const char *value, unsigned long *ret)
+{
+	uintmax_t tmp;
+	if (!git_parse_unsigned(value, &tmp, maximum_unsigned_value_of_type(long)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_ssize_t(const char *value, ssize_t *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_maybe_bool_text(const char *value)
+{
+	if (!value)
+		return 1;
+	if (!*value)
+		return 0;
+	if (!strcasecmp(value, "true")
+	    || !strcasecmp(value, "yes")
+	    || !strcasecmp(value, "on"))
+		return 1;
+	if (!strcasecmp(value, "false")
+	    || !strcasecmp(value, "no")
+	    || !strcasecmp(value, "off"))
+		return 0;
+	return -1;
+}
+
+int git_parse_maybe_bool(const char *value)
+{
+	int v = git_parse_maybe_bool_text(value);
+	if (0 <= v)
+		return v;
+	if (git_parse_int(value, &v))
+		return !!v;
+	return -1;
+}
+
+/*
+ * Parse environment variable 'k' as a boolean (in various
+ * possible spellings); if missing, use the default value 'def'.
+ */
+int git_env_bool(const char *k, int def)
+{
+	const char *v = getenv(k);
+	int val;
+	if (!v)
+		return def;
+	val = git_parse_maybe_bool(v);
+	if (val < 0)
+		die(_("bad boolean environment value '%s' for '%s'"),
+		    v, k);
+	return val;
+}
+
+/*
+ * Parse environment variable 'k' as ulong with possibly a unit
+ * suffix; if missing, use the default value 'val'.
+ */
+unsigned long git_env_ulong(const char *k, unsigned long val)
+{
+	const char *v = getenv(k);
+	if (v && !git_parse_ulong(v, &val))
+		die(_("failed to parse %s"), k);
+	return val;
+}
diff --git a/parse.h b/parse.h
new file mode 100644
index 0000000000..07d2193d69
--- /dev/null
+++ b/parse.h
@@ -0,0 +1,20 @@
+#ifndef PARSE_H
+#define PARSE_H
+
+int git_parse_signed(const char *value, intmax_t *ret, intmax_t max);
+int git_parse_ssize_t(const char *, ssize_t *);
+int git_parse_ulong(const char *, unsigned long *);
+int git_parse_int(const char *value, int *ret);
+int git_parse_int64(const char *value, int64_t *ret);
+
+/**
+ * Same as `git_config_bool`, except that it returns -1 on error rather
+ * than dying.
+ */
+int git_parse_maybe_bool(const char *);
+int git_parse_maybe_bool_text(const char *value);
+
+int git_env_bool(const char *, int);
+unsigned long git_env_ulong(const char *, unsigned long);
+
+#endif /* PARSE_H */
diff --git a/pathspec.c b/pathspec.c
index 4991455281..39337999d4 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "dir.h"
 #include "environment.h"
 #include "gettext.h"
diff --git a/preload-index.c b/preload-index.c
index e44530c80c..63fd35d64b 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -7,7 +7,7 @@
 #include "environment.h"
 #include "fsmonitor.h"
 #include "gettext.h"
-#include "config.h"
+#include "parse.h"
 #include "preload-index.h"
 #include "progress.h"
 #include "read-cache.h"
diff --git a/progress.c b/progress.c
index f695798aca..c83cb60bf1 100644
--- a/progress.c
+++ b/progress.c
@@ -17,7 +17,7 @@
 #include "trace.h"
 #include "trace2.h"
 #include "utf8.h"
-#include "config.h"
+#include "parse.h"
 
 #define TP_IDX_MAX      8
 
diff --git a/prompt.c b/prompt.c
index 3baa33f63d..8935fe4dfb 100644
--- a/prompt.c
+++ b/prompt.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "environment.h"
 #include "run-command.h"
 #include "strbuf.h"
diff --git a/rebase.c b/rebase.c
index 17a570f1ff..69a1822da3 100644
--- a/rebase.c
+++ b/rebase.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "rebase.h"
-#include "config.h"
+#include "parse.h"
 #include "gettext.h"
 
 /*
diff --git a/t/helper/test-env-helper.c b/t/helper/test-env-helper.c
index 66c88b8ff3..1c486888a4 100644
--- a/t/helper/test-env-helper.c
+++ b/t/helper/test-env-helper.c
@@ -1,5 +1,5 @@
 #include "test-tool.h"
-#include "config.h"
+#include "parse.h"
 #include "parse-options.h"
 
 static char const * const env__helper_usage[] = {
diff --git a/unpack-trees.c b/unpack-trees.c
index 87517364dc..761562a96e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -2,7 +2,7 @@
 #include "advice.h"
 #include "strvec.h"
 #include "repository.h"
-#include "config.h"
+#include "parse.h"
 #include "dir.h"
 #include "environment.h"
 #include "gettext.h"
diff --git a/wrapper.c b/wrapper.c
index 62c04aeb17..3e554f50c6 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -3,7 +3,7 @@
  */
 #include "git-compat-util.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "gettext.h"
 #include "strbuf.h"
 
diff --git a/write-or-die.c b/write-or-die.c
index d8355c0c3e..42a2dc73cd 100644
--- a/write-or-die.c
+++ b/write-or-die.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "run-command.h"
 #include "write-or-die.h"
 
-- 
2.41.0.162.gfafddb0af9-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [RFC PATCH 6/8] pager: remove pager_in_use()
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
                   ` (4 preceding siblings ...)
  2023-06-27 19:52 ` [RFC PATCH 5/8] parse: create new library for parsing strings and env values Calvin Wan
@ 2023-06-27 19:52 ` Calvin Wan
  2023-06-27 23:00   ` Junio C Hamano
  2023-06-27 19:52 ` [RFC PATCH 7/8] git-std-lib: introduce git standard library Calvin Wan
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 70+ messages in thread
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

pager_in_use() is simply a wrapper around
git_env_bool("GIT_PAGER_IN_USE", 0). Other places that call
git_env_bool() in this fashion also do not have a wrapper function
around it. By removing pager_in_use(), we can also get rid of the
pager.h dependency from a few files.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 builtin/log.c | 2 +-
 color.c       | 2 +-
 column.c      | 2 +-
 date.c        | 4 ++--
 git.c         | 2 +-
 pager.c       | 5 -----
 pager.h       | 1 -
 7 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 03954fb749..d5e979932f 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -82,7 +82,7 @@ struct line_opt_callback_data {
 
 static int session_is_interactive(void)
 {
-	return isatty(1) || pager_in_use();
+	return isatty(1) || git_env_bool("GIT_PAGER_IN_USE", 0);
 }
 
 static int auto_decoration_style(void)
diff --git a/color.c b/color.c
index f3c0a4659b..dd6f26b8db 100644
--- a/color.c
+++ b/color.c
@@ -388,7 +388,7 @@ static int check_auto_color(int fd)
 	int *is_tty_p = fd == 1 ? &color_stdout_is_tty : &color_stderr_is_tty;
 	if (*is_tty_p < 0)
 		*is_tty_p = isatty(fd);
-	if (*is_tty_p || (fd == 1 && pager_in_use() && pager_use_color)) {
+	if (*is_tty_p || (fd == 1 && git_env_bool("GIT_PAGER_IN_USE", 0) && pager_use_color)) {
 		if (!is_terminal_dumb())
 			return 1;
 	}
diff --git a/column.c b/column.c
index ff2f0abf39..e15ca70f36 100644
--- a/column.c
+++ b/column.c
@@ -214,7 +214,7 @@ int finalize_colopts(unsigned int *colopts, int stdout_is_tty)
 		if (stdout_is_tty < 0)
 			stdout_is_tty = isatty(1);
 		*colopts &= ~COL_ENABLE_MASK;
-		if (stdout_is_tty || pager_in_use())
+		if (stdout_is_tty || git_env_bool("GIT_PAGER_IN_USE", 0))
 			*colopts |= COL_ENABLED;
 	}
 	return 0;
diff --git a/date.c b/date.c
index 619ada5b20..95c0f568ba 100644
--- a/date.c
+++ b/date.c
@@ -7,7 +7,7 @@
 #include "git-compat-util.h"
 #include "date.h"
 #include "gettext.h"
-#include "pager.h"
+#include "parse.h"
 #include "strbuf.h"
 
 /*
@@ -1009,7 +1009,7 @@ void parse_date_format(const char *format, struct date_mode *mode)
 
 	/* "auto:foo" is "if tty/pager, then foo, otherwise normal" */
 	if (skip_prefix(format, "auto:", &p)) {
-		if (isatty(1) || pager_in_use())
+		if (isatty(1) || git_env_bool("GIT_PAGER_IN_USE", 0))
 			format = p;
 		else
 			format = "default";
diff --git a/git.c b/git.c
index eb69f4f997..3bfb673a4c 100644
--- a/git.c
+++ b/git.c
@@ -131,7 +131,7 @@ static void commit_pager_choice(void)
 
 void setup_auto_pager(const char *cmd, int def)
 {
-	if (use_pager != -1 || pager_in_use())
+	if (use_pager != -1 || git_env_bool("GIT_PAGER_IN_USE", 0))
 		return;
 	use_pager = check_pager_config(cmd);
 	if (use_pager == -1)
diff --git a/pager.c b/pager.c
index 63055d0873..9b392622d2 100644
--- a/pager.c
+++ b/pager.c
@@ -149,11 +149,6 @@ void setup_pager(void)
 	atexit(wait_for_pager_atexit);
 }
 
-int pager_in_use(void)
-{
-	return git_env_bool("GIT_PAGER_IN_USE", 0);
-}
-
 /*
  * Return cached value (if set) or $COLUMNS environment variable (if
  * set and positive) or ioctl(1, TIOCGWINSZ).ws_col (if positive),
diff --git a/pager.h b/pager.h
index b77433026d..6832c6168d 100644
--- a/pager.h
+++ b/pager.h
@@ -5,7 +5,6 @@ struct child_process;
 
 const char *git_pager(int stdout_is_tty);
 void setup_pager(void);
-int pager_in_use(void);
 int term_columns(void);
 void term_clear_line(void);
 int decimal_width(uintmax_t);
-- 
2.41.0.162.gfafddb0af9-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [RFC PATCH 7/8] git-std-lib: introduce git standard library
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
                   ` (5 preceding siblings ...)
  2023-06-27 19:52 ` [RFC PATCH 6/8] pager: remove pager_in_use() Calvin Wan
@ 2023-06-27 19:52 ` Calvin Wan
  2023-06-28 13:27   ` Phillip Wood
  2023-06-27 19:52 ` [RFC PATCH 8/8] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 70+ messages in thread
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

The Git Standard Library intends to serve as the foundational library
and root dependency that other libraries in Git will be built off of.
That is to say, suppose we have libraries X and Y; a user that wants to
use X and Y would need to include X, Y, and this Git Standard Library.

Add Documentation/technical/git-std-lib.txt to further explain the
design and rationale.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 Documentation/technical/git-std-lib.txt | 182 ++++++++++++++++++++++++
 Makefile                                |  28 +++-
 git-compat-util.h                       |   7 +-
 symlinks.c                              |   2 +
 usage.c                                 |   8 ++
 5 files changed, 225 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/technical/git-std-lib.txt

diff --git a/Documentation/technical/git-std-lib.txt b/Documentation/technical/git-std-lib.txt
new file mode 100644
index 0000000000..3dce36c9f9
--- /dev/null
+++ b/Documentation/technical/git-std-lib.txt
@@ -0,0 +1,182 @@
+Git Standard Library
+================
+
+The Git Standard Library intends to serve as the foundational library
+and root dependency that other libraries in Git will be built off of.
+That is to say, suppose we have libraries X and Y; a user that wants to
+use X and Y would need to include X, Y, and this Git Standard Library.
+This does not mean that the Git Standard Library will be the only
+possible root dependency in the future, but rather the most significant
+and widely used one.
+
+Dependency graph in libified Git
+================
+
+If you look in the Git Makefile, all of the objects defined in the Git
+library are compiled and archived into a singular file, libgit.a, which
+is linked against by common-main.o with other external dependencies and
+turned into the Git executable. In other words, the Git executable has
+dependencies on libgit.a and a couple of external libraries. The
+libfication of Git will not affect this current build flow, but instead
+will provide an alternate method for building Git.
+
+With our current method of building Git, we can imagine the dependency
+graph as such:
+
+        Git
+         /\
+        /  \
+       /    \
+  libgit.a   ext deps
+
+In libifying parts of Git, we want to shrink the dependency graph to
+only the minimal set of dependencies, so libraries should not use
+libgit.a. Instead, it would look like:
+
+                Git
+                /\
+               /  \
+              /    \
+          libgit.a  ext deps
+             /\
+            /  \
+           /    \
+object-store.a  (other lib)
+      |        /
+      |       /
+      |      /
+ config.a   / 
+      |    /
+      |   /
+      |  /
+git-std-lib.a
+
+Instead of containing all of the objects in Git, libgit.a would contain
+objects that are not built by libraries it links against. Consequently,
+if someone wanted their own custom build of Git with their own custom
+implementation of the object store, they would only have to swap out
+object-store.a rather than do a hard fork of Git.
+
+Rationale behind Git Standard Library
+================
+
+The rationale behind Git Standard Library essentially is the result of
+two observations within the Git codebase: every file includes
+git-compat-util.h which defines functions in a couple of different
+files, and wrapper.c + usage.c have difficult-to-separate circular
+dependencies with each other and other files.
+
+Ubiquity of git-compat-util.h and circular dependencies
+========
+
+Every file in the Git codebase includes git-compat-util.h. It serves as
+"a compatibility aid that isolates the knowledge of platform specific
+inclusion order and what feature macros to define before including which
+system header" (Junio[1]). Since every file includes git-compat-util.h, and
+git-compat-util.h includes wrapper.h and usage.h, it would make sense
+for wrapper.c and usage.c to be a part of the root library. They have
+difficult to separate circular dependencies with each other so they
+can't be independent libraries. Wrapper.c has dependencies on parse.c,
+abspath.c, strbuf.c, which in turn also have dependencies on usage.c and
+wrapper.c -- more circular dependencies. 
+
+Tradeoff between swappability and refactoring
+========
+
+From the above dependency graph, we can see that git-std-lib.a could be
+many smaller libraries rather than a singular library. So why choose a
+singular library when multiple libraries can be individually easier to
+swap and are more modular? A singular library requires less work to
+separate out circular dependencies within itself so it becomes a
+tradeoff question between work and reward. While there may be a point in
+the future where a file like usage.c would want its own library so that
+someone can have custom die() or error(), the work required to refactor
+out the circular dependencies in some files would be enormous due to
+their ubiquity so therefore I believe it is not worth the tradeoff
+currently. Additionally, we can in the future choose to do this refactor
+and change the API for the library if there becomes enough of a reason
+to do so (remember we are avoiding promising stability of the interfaces
+of those libraries).
+
+Reuse of compatibility functions in git-compat-util.h
+========
+
+Most functions defined in git-compat-util.h are implemented in compat/
+and have dependencies limited to strbuf.h and wrapper.h so they can be
+easily included in git-std-lib.a, which as a root dependency means that
+higher level libraries do not have to worry about compatibility files in
+compat/. The rest of the functions defined in git-compat-util.h are
+implemented in top level files and, in this patch set, are hidden behind
+an #ifdef if their implementation is not in git-std-lib.a.
+
+Rationale summary
+========
+
+The Git Standard Library allows us to get the libification ball rolling
+with other libraries in Git. By not spending many
+more months attempting to refactor difficult circular dependencies and
+instead spending that time getting to a state where we can test out
+swapping a library out such as config or object store, we can prove the
+viability of Git libification on a much faster time scale. Additionally
+the code cleanups that have happened so far have been minor and
+beneficial for the codebase. It is probable that making large movements
+would negatively affect code clarity.
+
+Git Standard Library boundary
+================
+
+While I have described above some useful heuristics for identifying
+potential candidates for git-std-lib.a, a standard library should not
+have a shaky definition for what belongs in it.
+
+ - Low-level files (aka operates only on other primitive types) that are
+   used everywhere within the codebase (wrapper.c, usage.c, strbuf.c)
+   - Dependencies that are low-level and widely used
+     (abspath.c, date.c, hex-ll.c, parse.c, utf8.c)
+ - low-level git/* files with functions defined in git-compat-util.h
+   (ctype.c)
+ - compat/*
+
+There are other files that might fit this definition, but that does not
+mean it should belong in git-std-lib.a. Those files should start as
+their own separate library since any file added to git-std-lib.a loses
+its flexibility of being easily swappable.
+
+Files inside of Git Standard Library
+================
+
+The initial set of files in git-std-lib.a are:
+abspath.c
+ctype.c
+date.c
+hex-ll.c
+parse.c
+strbuf.c
+usage.c
+utf8.c
+wrapper.c
+relevant compat/ files
+
+Pitfalls
+================
+
+In patch 7, I use #ifdef GIT_STD_LIB to both stub out code and hide
+certain function headers. As other parts of Git are libified, if we
+have to use more ifdefs for each different library, then the codebase
+will become uglier and harder to understand. 
+
+There are a small amount of files under compat/* that have dependencies
+not inside of git-std-lib.a. While those functions are not called on
+Linux, other OSes might call those problematic functions. I don't see
+this as a major problem, just moreso an observation that libification in
+general may also require some minor compatibility work in the future.
+
+Testing
+================
+
+Unit tests should catch any breakages caused by changes to files in
+git-std-lib.a (i.e. introduction of a out of scope dependency) and new
+functions introduced to git-std-lib.a will require unit tests written
+for them.
+
+[1] https://lore.kernel.org/git/xmqqwn17sydw.fsf@gitster.g/
\ No newline at end of file
diff --git a/Makefile b/Makefile
index e9ad9f9ef1..255bd10b82 100644
--- a/Makefile
+++ b/Makefile
@@ -2162,6 +2162,11 @@ ifdef FSMONITOR_OS_SETTINGS
 	COMPAT_OBJS += compat/fsmonitor/fsm-path-utils-$(FSMONITOR_OS_SETTINGS).o
 endif
 
+ifdef GIT_STD_LIB
+	BASIC_CFLAGS += -DGIT_STD_LIB
+	BASIC_CFLAGS += -DNO_GETTEXT
+endif
+
 ifeq ($(TCLTK_PATH),)
 NO_TCLTK = NoThanks
 endif
@@ -3654,7 +3659,7 @@ clean: profile-clean coverage-clean cocciclean
 	$(RM) po/git.pot po/git-core.pot
 	$(RM) git.res
 	$(RM) $(OBJECTS)
-	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
+	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB) $(STD_LIB_FILE)
 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) $(OTHER_PROGRAMS)
 	$(RM) $(TEST_PROGRAMS)
 	$(RM) $(FUZZ_PROGRAMS)
@@ -3834,3 +3839,24 @@ $(FUZZ_PROGRAMS): all
 		$(XDIFF_OBJS) $(EXTLIBS) git.o $@.o $(LIB_FUZZING_ENGINE) -o $@
 
 fuzz-all: $(FUZZ_PROGRAMS)
+
+### Libified Git rules
+
+# git-std-lib
+# `make git-std-lib GIT_STD_LIB=YesPlease`
+STD_LIB = git-std-lib.a
+
+GIT_STD_LIB_OBJS += abspath.o
+GIT_STD_LIB_OBJS += ctype.o
+GIT_STD_LIB_OBJS += date.o
+GIT_STD_LIB_OBJS += hex-ll.o
+GIT_STD_LIB_OBJS += parse.o
+GIT_STD_LIB_OBJS += strbuf.o
+GIT_STD_LIB_OBJS += usage.o
+GIT_STD_LIB_OBJS += utf8.o
+GIT_STD_LIB_OBJS += wrapper.o
+
+$(STD_LIB): $(GIT_STD_LIB_OBJS) $(COMPAT_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
+git-std-lib: $(STD_LIB)
diff --git a/git-compat-util.h b/git-compat-util.h
index 481dac22b0..75aa9b263e 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -396,8 +396,8 @@ static inline int noop_core_config(const char *var UNUSED,
 #define platform_core_config noop_core_config
 #endif
 
+#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
 int lstat_cache_aware_rmdir(const char *path);
-#if !defined(__MINGW32__) && !defined(_MSC_VER)
 #define rmdir lstat_cache_aware_rmdir
 #endif
 
@@ -787,9 +787,11 @@ const char *inet_ntop(int af, const void *src, char *dst, size_t size);
 #endif
 
 #ifdef NO_PTHREADS
+#ifdef GIT_STD_LIB
 #define atexit git_atexit
 int git_atexit(void (*handler)(void));
 #endif
+#endif
 
 /*
  * Limit size of IO chunks, because huge chunks only cause pain.  OS X
@@ -951,14 +953,17 @@ int git_access(const char *path, int mode);
 # endif
 #endif
 
+#ifndef GIT_STD_LIB
 int cmd_main(int, const char **);
 
 /*
  * Intercept all calls to exit() and route them to trace2 to
  * optionally emit a message before calling the real exit().
  */
+
 int common_exit(const char *file, int line, int code);
 #define exit(code) exit(common_exit(__FILE__, __LINE__, (code)))
+#endif
 
 /*
  * You can mark a stack variable with UNLEAK(var) to avoid it being
diff --git a/symlinks.c b/symlinks.c
index b29e340c2d..bced721a0c 100644
--- a/symlinks.c
+++ b/symlinks.c
@@ -337,6 +337,7 @@ void invalidate_lstat_cache(void)
 	reset_lstat_cache(&default_cache);
 }
 
+#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
 #undef rmdir
 int lstat_cache_aware_rmdir(const char *path)
 {
@@ -348,3 +349,4 @@ int lstat_cache_aware_rmdir(const char *path)
 
 	return ret;
 }
+#endif
diff --git a/usage.c b/usage.c
index 09f0ed509b..58994e0d5c 100644
--- a/usage.c
+++ b/usage.c
@@ -5,7 +5,15 @@
  */
 #include "git-compat-util.h"
 #include "gettext.h"
+
+#ifdef GIT_STD_LIB
+#undef trace2_cmd_name
+#undef trace2_cmd_error_va
+#define trace2_cmd_name(x) 
+#define trace2_cmd_error_va(x, y)
+#else
 #include "trace2.h"
+#endif
 
 static void vreportf(const char *prefix, const char *err, va_list params)
 {
-- 
2.41.0.162.gfafddb0af9-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [RFC PATCH 8/8] git-std-lib: add test file to call git-std-lib.a functions
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
                   ` (6 preceding siblings ...)
  2023-06-27 19:52 ` [RFC PATCH 7/8] git-std-lib: introduce git standard library Calvin Wan
@ 2023-06-27 19:52 ` Calvin Wan
  2023-06-28  0:14 ` [RFC PATCH 0/8] Introduce Git Standard Library Glen Choo
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 70+ messages in thread
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

Add test file that directly or indirectly calls all functions defined in
git-std-lib.a object files to showcase that they do not reference
missing objects and that git-std-lib.a can stand on its own.

Certain functions that cause the program to exit or are already called
by other functions are commented out.

TODO: replace with unit tests
Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 t/Makefile      |   4 +
 t/stdlib-test.c | 239 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 243 insertions(+)
 create mode 100644 t/stdlib-test.c

diff --git a/t/Makefile b/t/Makefile
index 3e00cdd801..b6d0bc9daa 100644
--- a/t/Makefile
+++ b/t/Makefile
@@ -150,3 +150,7 @@ perf:
 
 .PHONY: pre-clean $(T) aggregate-results clean valgrind perf \
 	check-chainlint clean-chainlint test-chainlint
+
+test-git-std-lib:
+	cc -It -o stdlib-test stdlib-test.c -L. -l:../git-std-lib.a
+	./stdlib-test
diff --git a/t/stdlib-test.c b/t/stdlib-test.c
new file mode 100644
index 0000000000..0e4f6d5807
--- /dev/null
+++ b/t/stdlib-test.c
@@ -0,0 +1,239 @@
+#include "../git-compat-util.h"
+#include "../abspath.h"
+#include "../hex-ll.h"
+#include "../parse.h"
+#include "../strbuf.h"
+#include "../string-list.h"
+
+/*
+ * Calls all functions from git-std-lib
+ * Some inline/trivial functions are skipped
+ */
+
+void abspath_funcs(void) {
+	struct strbuf sb = STRBUF_INIT;
+
+	fprintf(stderr, "calling abspath functions\n");
+	is_directory("foo");
+	strbuf_realpath(&sb, "foo", 0);
+	strbuf_realpath_forgiving(&sb, "foo", 0);
+	real_pathdup("foo", 0);
+	absolute_path("foo");
+	absolute_pathdup("foo");
+	prefix_filename("foo/", "bar");
+	prefix_filename_except_for_dash("foo/", "bar");
+	is_absolute_path("foo");
+	strbuf_add_absolute_path(&sb, "foo");
+	strbuf_add_real_path(&sb, "foo");
+}
+
+void hex_ll_funcs(void) {
+	unsigned char c;
+
+	fprintf(stderr, "calling hex-ll functions\n");
+
+	hexval('c');
+	hex2chr("A1");
+	hex_to_bytes(&c, "A1", 2);
+}
+
+void parse_funcs(void) {
+	intmax_t foo;
+	ssize_t foo1 = -1;
+	unsigned long foo2;
+	int foo3;
+	int64_t foo4;
+
+	fprintf(stderr, "calling parse functions\n");
+
+	git_parse_signed("42", &foo, maximum_signed_value_of_type(int));
+	git_parse_ssize_t("42", &foo1);
+	git_parse_ulong("42", &foo2);
+	git_parse_int("42", &foo3);
+	git_parse_int64("42", &foo4);
+	git_parse_maybe_bool("foo");
+	git_parse_maybe_bool_text("foo");
+	git_env_bool("foo", 1);
+	git_env_ulong("foo", 1);
+}
+
+static int allow_unencoded_fn(char ch) {
+	return 0;
+}
+
+void strbuf_funcs(void) {
+	struct strbuf *sb = xmalloc(sizeof(void*));
+	struct strbuf *sb2 = xmalloc(sizeof(void*));
+	struct strbuf sb3 = STRBUF_INIT;
+	struct string_list list = STRING_LIST_INIT_NODUP;
+	char *buf = "foo";
+	struct strbuf_expand_dict_entry dict[] = {
+		{ "foo", NULL, },
+		{ "bar", NULL, },
+	};
+	int fd = open("/dev/null", O_RDONLY);
+
+	fprintf(stderr, "calling strbuf functions\n");
+
+	starts_with("foo", "bar");
+	istarts_with("foo", "bar");
+	// skip_to_optional_arg_default(const char *str, const char *prefix,
+	// 			 const char **arg, const char *def)
+	strbuf_init(sb, 0);
+	strbuf_init(sb2, 0);
+	strbuf_release(sb);
+	strbuf_attach(sb, strbuf_detach(sb, NULL), 0, 0); // calls strbuf_grow
+	strbuf_swap(sb, sb2);
+	strbuf_setlen(sb, 0);
+	strbuf_trim(sb); // calls strbuf_rtrim, strbuf_ltrim
+	// strbuf_rtrim() called by strbuf_trim()
+	// strbuf_ltrim() called by strbuf_trim()
+	strbuf_trim_trailing_dir_sep(sb);
+	strbuf_trim_trailing_newline(sb);
+	strbuf_reencode(sb, "foo", "bar");
+	strbuf_tolower(sb);
+	strbuf_add_separated_string_list(sb, " ", &list);
+	strbuf_list_free(strbuf_split_buf("foo bar", 8, ' ', -1));
+	strbuf_cmp(sb, sb2);
+	strbuf_addch(sb, 1);
+	strbuf_splice(sb, 0, 1, "foo", 3);
+	strbuf_insert(sb, 0, "foo", 3);
+	// strbuf_vinsertf() called by strbuf_insertf
+	strbuf_insertf(sb, 0, "%s", "foo"); 
+	strbuf_remove(sb, 0, 1);
+	strbuf_add(sb, "foo", 3);
+	strbuf_addbuf(sb, sb2);
+	strbuf_join_argv(sb, 0, NULL, ' ');
+	strbuf_addchars(sb, 1, 1);
+	strbuf_addf(sb, "%s", "foo");
+	strbuf_add_commented_lines(sb, "foo", 3, '#');
+	strbuf_commented_addf(sb, '#', "%s", "foo");
+	// strbuf_vaddf() called by strbuf_addf()
+	strbuf_expand(sb, "%s", strbuf_expand_literal_cb, NULL);
+	strbuf_expand(sb, "%s", strbuf_expand_dict_cb, &dict);
+	// strbuf_expand_literal_cb() called by strbuf_expand()
+	// strbuf_expand_dict_cb() called by strbuf_expand()
+	strbuf_addbuf_percentquote(sb, &sb3);
+	strbuf_add_percentencode(sb, "foo", STRBUF_ENCODE_SLASH);
+	strbuf_fread(sb, 0, stdin);
+	strbuf_read(sb, fd, 0);
+	strbuf_read_once(sb, fd, 0);
+	strbuf_write(sb, stderr);
+	strbuf_readlink(sb, "/dev/null", 0);
+	strbuf_getcwd(sb);
+	strbuf_getwholeline(sb, stderr, '\n');
+	strbuf_appendwholeline(sb, stderr, '\n');
+	strbuf_getline(sb, stderr);
+	strbuf_getline_lf(sb, stderr);
+	strbuf_getline_nul(sb, stderr);
+	strbuf_getwholeline_fd(sb, fd, '\n');
+	strbuf_read_file(sb, "/dev/null", 0);
+	strbuf_add_lines(sb, "foo", "bar", 0);
+	strbuf_addstr_xml_quoted(sb, "foo");
+	strbuf_addstr_urlencode(sb, "foo", allow_unencoded_fn);
+	strbuf_humanise_bytes(sb, 42);
+	strbuf_humanise_rate(sb, 42);
+	printf_ln("%s", sb);
+	fprintf_ln(stderr, "%s", sb);
+	xstrdup_tolower("foo");
+	xstrdup_toupper("foo");
+	// xstrvfmt() called by xstrfmt()
+	xstrfmt("%s", "foo");
+	// strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
+	// 	     int tz_offset, int suppress_tz_name)
+	// strbuf_stripspace(struct strbuf *sb, char comment_line_char)
+	// strbuf_strip_suffix(struct strbuf *sb, const char *suffix)
+	// strbuf_strip_file_from_path(struct strbuf *sb)
+}
+
+static void error_builtin(const char *err, va_list params) {}
+static void warn_builtin(const char *err, va_list params) {}
+
+static report_fn error_routine = error_builtin;
+static report_fn warn_routine = warn_builtin;
+
+void usage_funcs(void) {
+	fprintf(stderr, "calling usage functions\n");
+	// Functions that call exit() are commented out
+
+	// usage()
+	// usagef()
+	// die()
+	// die_errno();
+	error("foo");
+	error_errno("foo");
+	die_message("foo");
+	die_message_errno("foo");
+	warning("foo");
+	warning_errno("foo");
+
+	// set_die_routine();
+	get_die_message_routine();
+	set_error_routine(error_builtin);
+	get_error_routine();
+	set_warn_routine(warn_builtin);
+	get_warn_routine();
+	// set_die_is_recursing_routine();
+}
+
+void wrapper_funcs(void) {
+	void *ptr = xmalloc(1);
+	int fd = open("/dev/null", O_RDONLY);
+	struct strbuf sb = STRBUF_INIT;
+	int mode = 0444;
+	char host[PATH_MAX], path[PATH_MAX], path1[PATH_MAX];
+	xsnprintf(path, sizeof(path), "out-XXXXXX");
+	xsnprintf(path1, sizeof(path1), "out-XXXXXX");
+	int tmp;
+
+	fprintf(stderr, "calling wrapper functions\n");
+
+	xstrdup("foo");
+	xmalloc(1);
+	xmallocz(1);
+	xmallocz_gently(1);
+	xmemdupz("foo", 3);
+	xstrndup("foo", 3);
+	xrealloc(ptr, 2);
+	xcalloc(1, 1);
+	xsetenv("foo", "bar", 0);
+	xopen("/dev/null", O_RDONLY);
+	xread(fd, &sb, 1);
+	xwrite(fd, &sb, 1);
+	xpread(fd, &sb, 1, 0);
+	xdup(fd);
+	xfopen("/dev/null", "r");
+	xfdopen(fd, "r");
+	tmp = xmkstemp(path);
+	close(tmp);
+	unlink(path);
+	tmp = xmkstemp_mode(path1, mode);
+	close(tmp);
+	unlink(path1);
+	xgetcwd();
+	fopen_for_writing(path);
+	fopen_or_warn(path, "r");
+	xstrncmpz("foo", "bar", 3);
+	// xsnprintf() called above
+	xgethostname(host, 3);
+	tmp = git_mkstemps_mode(path, 1, mode);
+	close(tmp);
+	unlink(path);
+	tmp = git_mkstemp_mode(path, mode);
+	close(tmp);
+	unlink(path);
+	read_in_full(fd, &sb, 1);
+	write_in_full(fd, &sb, 1);
+	pread_in_full(fd, &sb, 1, 0);	
+}
+
+int main() {
+	abspath_funcs();
+	hex_ll_funcs();
+	parse_funcs();
+	strbuf_funcs();
+	usage_funcs();
+	wrapper_funcs();
+	fprintf(stderr, "all git-std-lib functions finished calling\n");
+	return 0;
+}
\ No newline at end of file
-- 
2.41.0.162.gfafddb0af9-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 5/8] parse: create new library for parsing strings and env values
  2023-06-27 19:52 ` [RFC PATCH 5/8] parse: create new library for parsing strings and env values Calvin Wan
@ 2023-06-27 22:58   ` Junio C Hamano
  0 siblings, 0 replies; 70+ messages in thread
From: Junio C Hamano @ 2023-06-27 22:58 UTC (permalink / raw)
  To: Calvin Wan; +Cc: git, nasamuffin, chooglen, johnathantanmy

Calvin Wan <calvinwan@google.com> writes:

> While string and environment value parsing is mainly consumed by
> config.c, there are other files that only need parsing functionality and
> not config functionality. By separating out string and environment value
> parsing from config, those files can instead be dependent on parse,
> which has a much smaller dependency chain than config.
>
> Move general string and env parsing functions from config.[ch] to
> parse.[ch].

Quite sensible and ...

>
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> ---
>  Makefile                   |   1 +
>  attr.c                     |   2 +-
>  config.c                   | 180 +-----------------------------------

... long overdue to have this.

>  config.h                   |  14 +--
>  pack-objects.c             |   2 +-
>  pack-revindex.c            |   2 +-
>  parse-options.c            |   3 +-
>  parse.c                    | 182 +++++++++++++++++++++++++++++++++++++
>  parse.h                    |  20 ++++
>  pathspec.c                 |   2 +-
>  preload-index.c            |   2 +-
>  progress.c                 |   2 +-
>  prompt.c                   |   2 +-
>  rebase.c                   |   2 +-
>  t/helper/test-env-helper.c |   2 +-
>  unpack-trees.c             |   2 +-
>  wrapper.c                  |   2 +-
>  write-or-die.c             |   2 +-
>  18 files changed, 219 insertions(+), 205 deletions(-)
>  create mode 100644 parse.c
>  create mode 100644 parse.h

It is somewhat surprising and very pleasing to see so many *.c files
had and now can lose dependency on <config.h>.  Very nice.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 6/8] pager: remove pager_in_use()
  2023-06-27 19:52 ` [RFC PATCH 6/8] pager: remove pager_in_use() Calvin Wan
@ 2023-06-27 23:00   ` Junio C Hamano
  2023-06-27 23:18     ` Calvin Wan
  2023-06-28  0:30     ` Glen Choo
  0 siblings, 2 replies; 70+ messages in thread
From: Junio C Hamano @ 2023-06-27 23:00 UTC (permalink / raw)
  To: Calvin Wan; +Cc: git, nasamuffin, chooglen, johnathantanmy

Calvin Wan <calvinwan@google.com> writes:

> pager_in_use() is simply a wrapper around
> git_env_bool("GIT_PAGER_IN_USE", 0). Other places that call
> git_env_bool() in this fashion also do not have a wrapper function
> around it. By removing pager_in_use(), we can also get rid of the
> pager.h dependency from a few files.
>
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> ---
>  builtin/log.c | 2 +-
>  color.c       | 2 +-
>  column.c      | 2 +-
>  date.c        | 4 ++--
>  git.c         | 2 +-
>  pager.c       | 5 -----
>  pager.h       | 1 -
>  7 files changed, 6 insertions(+), 12 deletions(-)

With so many (read: more than 3) callsites, I am not sure if this is
an improvement.  pager_in_use() cannot be misspelt without getting
noticed by compilers, but git_env_bool("GIT_PAGOR_IN_USE", 0) will
go silently unnoticed.  Is there no other way to lose the dependency
you do not like?


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 6/8] pager: remove pager_in_use()
  2023-06-27 23:00   ` Junio C Hamano
@ 2023-06-27 23:18     ` Calvin Wan
  2023-06-28  0:30     ` Glen Choo
  1 sibling, 0 replies; 70+ messages in thread
From: Calvin Wan @ 2023-06-27 23:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, nasamuffin, chooglen, Jonathan Tan

> With so many (read: more than 3) callsites, I am not sure if this is
> an improvement.  pager_in_use() cannot be misspelt without getting
> noticed by compilers, but git_env_bool("GIT_PAGOR_IN_USE", 0) will
> go silently unnoticed.  Is there no other way to lose the dependency
> you do not like?

I thought about only changing this call site, but that creates an
inconsistency that shouldn't exist. The other way is to move this
function into a different file, but it is also unclear to me which
file that would be. It would be awkward in parse.c and if it was in
environment.c then we would have many more inherited dependencies from
that. I agree that the value of this patch is dubious in and of
itself, which is why it's coupled together with this series rather
than in a separate standalone cleanup series.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 0/8] Introduce Git Standard Library
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
                   ` (7 preceding siblings ...)
  2023-06-27 19:52 ` [RFC PATCH 8/8] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
@ 2023-06-28  0:14 ` Glen Choo
  2023-06-28 16:30   ` Calvin Wan
  2023-06-30  7:01 ` Linus Arver
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
  10 siblings, 1 reply; 70+ messages in thread
From: Glen Choo @ 2023-06-28  0:14 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: Calvin Wan, nasamuffin, johnathantanmy

I see that this doesn't apply cleanly to 'master'. Do you have a base
commit that reviewers can easily apply this to?

Calvin Wan <calvinwan@google.com> writes:

> Before looking at this series, it probably makes sense to look at the
> other series that this is built on top of since that is the state I will
> be referring to in this cover letter:
>
>   - Elijah's final cache.h cleanup series[2]
>   - my strbuf cleanup series[3]
>   - my git-compat-util cleanup series[4]

Unfortunately, not all of these series apply cleanly to 'master' either,
so I went digging for the topic branches, which I think are:

- en/header-split-cache-h-part-3
- cw/header-compat-util-shuffle
- cw/strbuf-cleanup

And then I tried merging them, but it looks like they don't merge
cleanly either :/

(Btw Junio, I think cw/header-compat-util-shuffle didn't get called out
in What's Cooking.)

> [2] https://lore.kernel.org/git/pull.1525.v3.git.1684218848.gitgitgadget@gmail.com/
> [3] https://lore.kernel.org/git/20230606194720.2053551-1-calvinwan@google.com/
> [4] https://lore.kernel.org/git/20230606170711.912972-1-calvinwan@google.com/

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 6/8] pager: remove pager_in_use()
  2023-06-27 23:00   ` Junio C Hamano
  2023-06-27 23:18     ` Calvin Wan
@ 2023-06-28  0:30     ` Glen Choo
  2023-06-28 16:37       ` Glen Choo
  2023-06-28 20:58       ` Junio C Hamano
  1 sibling, 2 replies; 70+ messages in thread
From: Glen Choo @ 2023-06-28  0:30 UTC (permalink / raw)
  To: Junio C Hamano, Calvin Wan; +Cc: git, nasamuffin, johnathantanmy

Junio C Hamano <gitster@pobox.com> writes:

>> pager_in_use() is simply a wrapper around
>> git_env_bool("GIT_PAGER_IN_USE", 0). Other places that call
>> git_env_bool() in this fashion also do not have a wrapper function
>> around it. By removing pager_in_use(), we can also get rid of the
>> pager.h dependency from a few files.
>
> With so many (read: more than 3) callsites, I am not sure if this is
> an improvement.  pager_in_use() cannot be misspelt without getting
> noticed by compilers, but git_env_bool("GIT_PAGOR_IN_USE", 0) will
> go silently unnoticed.  Is there no other way to lose the dependency
> you do not like?

Having the function isn't just nice for typo prevention - it's also a
reasonable boundary around the pager subsystem. We could imagine a
world where we wanted to track the pager status using a static
var instead of an env var (not that we'd even want that :P), and this
inlining makes that harder.

From the cover letter, it seems like we only need this to remove
"#include pager.h" from date.c, and that's only used in
parse_date_format(). Could we add a is_pager/pager_in_use to that
function and push the pager.h dependency upwards?

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper
  2023-06-27 19:52 ` [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper Calvin Wan
@ 2023-06-28  2:05   ` Victoria Dye
  2023-07-05 17:57     ` Calvin Wan
  2023-07-11 20:07   ` Jeff Hostetler
  1 sibling, 1 reply; 70+ messages in thread
From: Victoria Dye @ 2023-06-28  2:05 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: nasamuffin, chooglen, johnathantanmy

Calvin Wan wrote:
> As a library boundary, wrapper.c should not directly log trace2
> statistics, but instead provide those statistics upon
> request. Therefore, move the trace2 logging code to trace2.[ch.]. This
> also allows wrapper.c to not be dependent on trace2.h and repository.h.
> 

...

> diff --git a/trace2.h b/trace2.h
> index 4ced30c0db..689e9a4027 100644
> --- a/trace2.h
> +++ b/trace2.h
> @@ -581,4 +581,9 @@ void trace2_collect_process_info(enum trace2_process_info_reason reason);
>  
>  const char *trace2_session_id(void);
>  
> +/*
> + * Writes out trace statistics for fsync
> + */
> +void trace_git_fsync_stats(void);
> +

This function does not belong in 'trace2.h', IMO. The purpose of that file
is to contain the generic API for Trace2 (e.g., 'trace2_printf()',
'trace2_region_(enter|exit)'), whereas this function is effectively a
wrapper around a specific invocation of that API. 

You note in the commit message that "wrapper.c should not directly log
trace2 statistics" with the reasoning of "[it's] a library boundary," but I
suspect the unstated underlying reason is "because it tracks 'count_fsync_*'
in static variables." This case would be better handled, then, by replacing
the usage in 'wrapper.c' with a new Trace2 counter (API introduced in [1]).
That keeps this usage consistent with the API already established for
Trace2, rather than starting an unsustainable trend of creating ad-hoc,
per-metric wrappers in 'trace2.[c|h]'.

An added note re: the commit message - it's extremely important that
functions _anywhere in Git_ are able to use the Trace2 API directly. A
developer could reasonably want to measure performance, keep track of an
interesting metric, log when a region is entered in the larger trace,
capture error information, etc. for any function, regardless of where in
falls in the internal library organization. To that end, I think either the
commit message should be rephrased to remove that statement (if the issue is
really "we're using a static variable and we want to avoid that"), or the
libification effort should be updated to accommodate use of Trace2 anywhere
in Git. 

[1] https://lore.kernel.org/git/pull.1373.v4.git.1666618868.gitgitgadget@gmail.com/


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 2/8] hex-ll: split out functionality from hex
  2023-06-27 19:52 ` [RFC PATCH 2/8] hex-ll: split out functionality from hex Calvin Wan
@ 2023-06-28 13:15   ` Phillip Wood
  2023-06-28 16:55     ` Calvin Wan
  0 siblings, 1 reply; 70+ messages in thread
From: Phillip Wood @ 2023-06-28 13:15 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: nasamuffin, chooglen, johnathantanmy

Hi Calvin

On 27/06/2023 20:52, Calvin Wan wrote:
> Separate out hex functionality that doesn't require a hash algo into
> hex-ll.[ch]. Since the hash algo is currently a global that sits in
> repository, this separation removes that dependency for files that only
> need basic hex manipulation functions.
>
> diff --git a/hex.h b/hex.h
> index 7df4b3c460..c07c8b34c2 100644
> --- a/hex.h
> +++ b/hex.h
> @@ -2,22 +2,7 @@
>   #define HEX_H
>   
>   #include "hash-ll.h"
> -
> -extern const signed char hexval_table[256];
> -static inline unsigned int hexval(unsigned char c)
> -{
> -	return hexval_table[c];
> -}
> -
> -/*
> - * Convert two consecutive hexadecimal digits into a char.  Return a
> - * negative value on error.  Don't run over the end of short strings.
> - */
> -static inline int hex2chr(const char *s)
> -{
> -	unsigned int val = hexval(s[0]);
> -	return (val & ~0xf) ? val : (val << 4) | hexval(s[1]);
> -}
> +#include "hex-ll.h"

I don't think any of the remaining declarations in hex.h depend on the 
ones that are moved to "hex-ll.h" so this include should probably be in 
"hex.c" rather than "hex.h"

Best Wishes

Phillip


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 7/8] git-std-lib: introduce git standard library
  2023-06-27 19:52 ` [RFC PATCH 7/8] git-std-lib: introduce git standard library Calvin Wan
@ 2023-06-28 13:27   ` Phillip Wood
  2023-06-28 21:15     ` Calvin Wan
  0 siblings, 1 reply; 70+ messages in thread
From: Phillip Wood @ 2023-06-28 13:27 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: nasamuffin, chooglen, johnathantanmy

Hi Calvin

On 27/06/2023 20:52, Calvin Wan wrote:
> The Git Standard Library intends to serve as the foundational library
> and root dependency that other libraries in Git will be built off of.
> That is to say, suppose we have libraries X and Y; a user that wants to
> use X and Y would need to include X, Y, and this Git Standard Library.

I think having a library of commonly used functions and structures is a 
good idea. While I appreciate that we don't want to include everything 
I'm surprised to see it does not include things like "hashmap.c" and 
"string-list.c" that will be required by the config library as well as 
other code in "libgit.a". I don't think we want "libgitconfig.a" and 
"libgit.a" to both contain a copy of "hashmap.o" and "string-list.o"

> diff --git a/Makefile b/Makefile
> index e9ad9f9ef1..255bd10b82 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -2162,6 +2162,11 @@ ifdef FSMONITOR_OS_SETTINGS
>   	COMPAT_OBJS += compat/fsmonitor/fsm-path-utils-$(FSMONITOR_OS_SETTINGS).o
>   endif
>   
> +ifdef GIT_STD_LIB
> +	BASIC_CFLAGS += -DGIT_STD_LIB
> +	BASIC_CFLAGS += -DNO_GETTEXT

I can see other projects may want to build git-std-lib without gettext 
support but if we're going to use git-std-lib within git it needs to be 
able to be built with that support. The same goes for the trace 
functions that you are redefining in usage.h

> diff --git a/git-compat-util.h b/git-compat-util.h
> index 481dac22b0..75aa9b263e 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -396,8 +396,8 @@ static inline int noop_core_config(const char *var UNUSED,
>   #define platform_core_config noop_core_config
>   #endif
>   
> +#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
>   int lstat_cache_aware_rmdir(const char *path);
> -#if !defined(__MINGW32__) && !defined(_MSC_VER)
>   #define rmdir lstat_cache_aware_rmdir
>   #endif

I'm not sure why the existing condition is being moved here

Thanks for posting this RFC. I've only really given it a quick glance 
but on the whole it seems to make sense.

Best Wishes

Phillip


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 0/8] Introduce Git Standard Library
  2023-06-28  0:14 ` [RFC PATCH 0/8] Introduce Git Standard Library Glen Choo
@ 2023-06-28 16:30   ` Calvin Wan
  0 siblings, 0 replies; 70+ messages in thread
From: Calvin Wan @ 2023-06-28 16:30 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, nasamuffin, Jonathan Tan

Ah I failed to mention that this is built on top of 2.41. You can also
get this series with the correctly applied patches from:
https://github.com/calvin-wan-google/git/tree/git-std-lib-rfc

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 6/8] pager: remove pager_in_use()
  2023-06-28  0:30     ` Glen Choo
@ 2023-06-28 16:37       ` Glen Choo
  2023-06-28 16:44         ` Calvin Wan
  2023-06-28 20:58       ` Junio C Hamano
  1 sibling, 1 reply; 70+ messages in thread
From: Glen Choo @ 2023-06-28 16:37 UTC (permalink / raw)
  To: Junio C Hamano, Calvin Wan; +Cc: git, nasamuffin, johnathantanmy

Glen Choo <chooglen@google.com> writes:

>                      Could we add a is_pager/pager_in_use to that
> function and push the pager.h dependency upwards?

Bleh, I meant "Could we add a new is_pager/pager_in_use parameter to
that function?"

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 6/8] pager: remove pager_in_use()
  2023-06-28 16:37       ` Glen Choo
@ 2023-06-28 16:44         ` Calvin Wan
  2023-06-28 17:30           ` Junio C Hamano
  0 siblings, 1 reply; 70+ messages in thread
From: Calvin Wan @ 2023-06-28 16:44 UTC (permalink / raw)
  To: Glen Choo; +Cc: Junio C Hamano, git, nasamuffin, johnathantanmy

> Glen Choo <chooglen@google.com> writes:
>
> >                      Could we add a is_pager/pager_in_use to that
> > function and push the pager.h dependency upwards?
>
> Bleh, I meant "Could we add a new is_pager/pager_in_use parameter to
> that function?"

Refactoring the function signature to:

parse_date_format(const char *format, struct date_mode *mode, int pager_in_use)

as you suggested is a much better solution, thanks! I'll make that
change in the next reroll.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 2/8] hex-ll: split out functionality from hex
  2023-06-28 13:15   ` Phillip Wood
@ 2023-06-28 16:55     ` Calvin Wan
  0 siblings, 0 replies; 70+ messages in thread
From: Calvin Wan @ 2023-06-28 16:55 UTC (permalink / raw)
  To: phillip.wood; +Cc: git, nasamuffin, chooglen, Jonathan Tan

> I don't think any of the remaining declarations in hex.h depend on the
> ones that are moved to "hex-ll.h" so this include should probably be in
> "hex.c" rather than "hex.h"

The reason why hex-ll.h is included in hex.h isn't because there might
be other declarations in hex.h that depend on it. It is for files that
include hex.h to also inherit the inclusion of hex-ll.h. If we moved
the inclusion of hex-ll.h to hex.c rather than hex.h, then those files
would have to include both hex.h and hex-ll.h. It clarifies whether a
file needs all of hex or just the low level functionality of hex.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 6/8] pager: remove pager_in_use()
  2023-06-28 16:44         ` Calvin Wan
@ 2023-06-28 17:30           ` Junio C Hamano
  0 siblings, 0 replies; 70+ messages in thread
From: Junio C Hamano @ 2023-06-28 17:30 UTC (permalink / raw)
  To: Calvin Wan; +Cc: Glen Choo, git, nasamuffin, johnathantanmy

Calvin Wan <calvinwan@google.com> writes:

>> Glen Choo <chooglen@google.com> writes:
>>
>> >                      Could we add a is_pager/pager_in_use to that
>> > function and push the pager.h dependency upwards?
>>
>> Bleh, I meant "Could we add a new is_pager/pager_in_use parameter to
>> that function?"
>
> Refactoring the function signature to:
>
> parse_date_format(const char *format, struct date_mode *mode, int pager_in_use)
>
> as you suggested is a much better solution, thanks! I'll make that
> change in the next reroll.

Yeah, the date format "auto:" that changes behaviour between the
output medium feels a serious layering violation, but given the
constraints, it looks like the best thing to do.

Thanks.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 6/8] pager: remove pager_in_use()
  2023-06-28  0:30     ` Glen Choo
  2023-06-28 16:37       ` Glen Choo
@ 2023-06-28 20:58       ` Junio C Hamano
  1 sibling, 0 replies; 70+ messages in thread
From: Junio C Hamano @ 2023-06-28 20:58 UTC (permalink / raw)
  To: Glen Choo; +Cc: Calvin Wan, git, nasamuffin, jonathantanmy

Glen Choo <chooglen@google.com> writes:

> Having the function isn't just nice for typo prevention - it's also a
> reasonable boundary around the pager subsystem. We could imagine a
> world where we wanted to track the pager status using a static
> var instead of an env var (not that we'd even want that :P), and this
> inlining makes that harder.
>
> From the cover letter, it seems like we only need this to remove
> "#include pager.h" from date.c, and that's only used in
> parse_date_format(). Could we add a is_pager/pager_in_use to that
> function and push the pager.h dependency upwards?

Thanks---I think that may show a good direction.  parse_date_format()
reacts to "auto:foo" and as long as that feature needs to be there,
pager_in_use() must be available to the function.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 7/8] git-std-lib: introduce git standard library
  2023-06-28 13:27   ` Phillip Wood
@ 2023-06-28 21:15     ` Calvin Wan
  2023-06-30 10:00       ` Phillip Wood
  0 siblings, 1 reply; 70+ messages in thread
From: Calvin Wan @ 2023-06-28 21:15 UTC (permalink / raw)
  To: phillip.wood; +Cc: git, nasamuffin, chooglen, Jonathan Tan

> On 27/06/2023 20:52, Calvin Wan wrote:
> > The Git Standard Library intends to serve as the foundational library
> > and root dependency that other libraries in Git will be built off of.
> > That is to say, suppose we have libraries X and Y; a user that wants to
> > use X and Y would need to include X, Y, and this Git Standard Library.
>
> I think having a library of commonly used functions and structures is a
> good idea. While I appreciate that we don't want to include everything
> I'm surprised to see it does not include things like "hashmap.c" and
> "string-list.c" that will be required by the config library as well as
> other code in "libgit.a". I don't think we want "libgitconfig.a" and
> "libgit.a" to both contain a copy of "hashmap.o" and "string-list.o"

I chose not to include hashmap and string-list in git-std-lib.a in the
first pass since they can exist as libraries built on top of
git-std-lib.a. There is no harm starting off with more libraries than
fewer besides having something like the config library be dependent on
lib-hashmap.a, lib-string-list.a, and git-std-lib.a rather than only
git-std-lib.a. They can always be added into git-std-lib.a in the
future. That being said, I do find it extremely unlikely that someone
would want to swap out the implementation for hashmap or string-list
so it is also very reasonable to include them into git-std-lib.a

>
> > diff --git a/Makefile b/Makefile
> > index e9ad9f9ef1..255bd10b82 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -2162,6 +2162,11 @@ ifdef FSMONITOR_OS_SETTINGS
> >       COMPAT_OBJS += compat/fsmonitor/fsm-path-utils-$(FSMONITOR_OS_SETTINGS).o
> >   endif
> >
> > +ifdef GIT_STD_LIB
> > +     BASIC_CFLAGS += -DGIT_STD_LIB
> > +     BASIC_CFLAGS += -DNO_GETTEXT
>
> I can see other projects may want to build git-std-lib without gettext
> support but if we're going to use git-std-lib within git it needs to be
> able to be built with that support. The same goes for the trace
> functions that you are redefining in usage.h

Taking a closer look at gettext.[ch], I believe I can also include it
into git-std-lib.a with a couple of minor changes. I'm currently
thinking about how the trace functions should interact with
git-std-lib.a since Victoria had similar comments on patch 1. I'll
reply to that thread when I come up with an answer.

>
> > diff --git a/git-compat-util.h b/git-compat-util.h
> > index 481dac22b0..75aa9b263e 100644
> > --- a/git-compat-util.h
> > +++ b/git-compat-util.h
> > @@ -396,8 +396,8 @@ static inline int noop_core_config(const char *var UNUSED,
> >   #define platform_core_config noop_core_config
> >   #endif
> >
> > +#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
> >   int lstat_cache_aware_rmdir(const char *path);
> > -#if !defined(__MINGW32__) && !defined(_MSC_VER)
> >   #define rmdir lstat_cache_aware_rmdir
> >   #endif
>
> I'm not sure why the existing condition is being moved here

Ah I see that this changes behavior for callers of
lstat_cache_aware_rmdir if those conditions aren't satisfied. I
should've added an extra #if for GIT_STD_LIB instead of adding it to
the end of the current check and moving it. Thanks for spotting this.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 0/8] Introduce Git Standard Library
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
                   ` (8 preceding siblings ...)
  2023-06-28  0:14 ` [RFC PATCH 0/8] Introduce Git Standard Library Glen Choo
@ 2023-06-30  7:01 ` Linus Arver
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
  10 siblings, 0 replies; 70+ messages in thread
From: Linus Arver @ 2023-06-30  7:01 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

Hello Calvin,

Calvin Wan <calvinwan@google.com> writes:
> With our current method of building Git, we can imagine the dependency
> graph as such:
>
>         Git
>          /\
>         /  \
>        /    \
>   libgit.a   ext deps
>
> In libifying parts of Git, we want to shrink the dependency graph to
> only the minimal set of dependencies, so libraries should not use
> libgit.a. Instead, it would look like:
>
>                 Git
>                 /\
>                /  \
>               /    \
>           libgit.a  ext deps
>              /\
>             /  \
>            /    \
> object-store.a  (other lib)
>       |        /
>       |       /
>       |      /
>  config.a   / 
>       |    /
>       |   /
>       |  /
> git-std-lib.a
>
> Instead of containing all of the objects in Git, libgit.a would contain
> objects that are not built by libraries it links against. Consequently,
> if someone wanted their own custom build of Git with their own custom
> implementation of the object store, they would only have to swap out
> object-store.a rather than do a hard fork of Git.

What about the case where someone wants to build program Foo which just
pulls in some bits of Git? For example, I am thinking of trailer.[ch]
which could be refactored to expose a public API. Then the Foo program
could pull this public trailer manipulation API in as a library
dependency (so that Foo can parse trailers in commit messages without
re-implementing that logic in Foo's own codebase). With the proposed Git
Standard Library (GSL) model above, would my Foo program also have to
pull in GSL? If so, isn't this onerous because of the additional bloat?
The Foo developers just want the banana, not the gorilla holding the
banana in the jungle, so to speak.

> Rationale behind Git Standard Library
> ================
>
> The rationale behind Git Standard Library essentially is the result of
> two observations within the Git codebase: every file includes
> git-compat-util.h which defines functions in a couple of different
> files, and wrapper.c + usage.c have difficult-to-separate circular
> dependencies with each other and other files.
>
> Ubiquity of git-compat-util.h and circular dependencies
> ========
>
> Every file in the Git codebase includes git-compat-util.h. It serves as
> "a compatibility aid that isolates the knowledge of platform specific
> inclusion order and what feature macros to define before including which
> system header" (Junio[5]). Since every file includes git-compat-util.h, and
> git-compat-util.h includes wrapper.h and usage.h, it would make sense
> for wrapper.c and usage.c to be a part of the root library. They have
> difficult to separate circular dependencies with each other so they

s/difficult to separate/difficult-to-separate

> can't be independent libraries. Wrapper.c has dependencies on parse.c,
> abspath.c, strbuf.c, which in turn also have dependencies on usage.c and
> wrapper.c -- more circular dependencies. 
>
> Tradeoff between swappability and refactoring
> ========
>
> From the above dependency graph, we can see that git-std-lib.a could be
> many smaller libraries rather than a singular library. So why choose a
> singular library when multiple libraries can be individually easier to
> swap and are more modular? A singular library requires less work to
> separate out circular dependencies within itself so it becomes a
> tradeoff question between work and reward. While there may be a point in
> the future where a file like usage.c would want its own library so that
> someone can have custom die() or error(), the work required to refactor
> out the circular dependencies in some files would be enormous due to
> their ubiquity so therefore I believe it is not worth the tradeoff
> currently. Additionally, we can in the future choose to do this refactor
> and change the API for the library if there becomes enough of a reason
> to do so (remember we are avoiding promising stability of the interfaces
> of those libraries).

Would getting us down the currently proposed path make it even more
difficult to do this refactor? If so, I think it's worth mentioning.

> Reuse of compatibility functions in git-compat-util.h
> ========
>
> Most functions defined in git-compat-util.h are implemented in compat/
> and have dependencies limited to strbuf.h and wrapper.h so they can be
> easily included in git-std-lib.a, which as a root dependency means that
> higher level libraries do not have to worry about compatibility files in
> compat/. The rest of the functions defined in git-compat-util.h are
> implemented in top level files and, in this patch set, are hidden behind
> an #ifdef if their implementation is not in git-std-lib.a.
>
> Rationale summary
> ========
>
> The Git Standard Library allows us to get the libification ball rolling
> with other libraries in Git (such as Glen's removal of global state from
> config iteration[6] prepares a config library). By not spending many
> more months attempting to refactor difficult circular dependencies and
> instead spending that time getting to a state where we can test out
> swapping a library out such as config or object store, we can prove the
> viability of Git libification on a much faster time scale. Additionally
> the code cleanups that have happened so far have been minor and
> beneficial for the codebase. It is probable that making large movements
> would negatively affect code clarity.

It sounds like the circular dependencies are so difficult to untangle that they
are the primary motivation behind grouping these tightly-coupled libraries
together into the Git Standard Library (GSL) banner. Still, I think it would
help reviewers if you explain what tradeoffs we are making by accepting the
circular dependencies as they are instead of untangling them. Conversely, if we
assume that there are no circular dependencies, what kind of benefits do we get
when designing the GSL from this (improved) position? Would there be little to
no additional benefits? If so, then I think it would be easier to support the
current approach (as removing the circularities would not give us significant
advantages for libification).

> Git Standard Library boundary
> ================
>
> While I have described above some useful heuristics for identifying
> potential candidates for git-std-lib.a, a standard library should not
> have a shaky definition for what belongs in it.
>
>  - Low-level files (aka operates only on other primitive types) that are
>    used everywhere within the codebase (wrapper.c, usage.c, strbuf.c)
>    - Dependencies that are low-level and widely used
>      (abspath.c, date.c, hex-ll.c, parse.c, utf8.c)
>  - low-level git/* files with functions defined in git-compat-util.h
>    (ctype.c)
>  - compat/*

I'm confused. Is the list above an example of a shaky definition, or the
opposite? IOW, do you mean that the list above should be the initial set
of content to include in the GSL? Or _not_ to include?

> Series structure
> ================
>
> While my strbuf and git-compat-util series can stand alone, they also
> function as preparatory patches for this series. There are more cleanup
> patches in this series, but since most of them have marginal benefits
> probably not worth the churn on its own, I decided not to split them
> into a separate series like with strbuf and git-compat-util. As an RFC,
> I am looking for comments on whether the rationale behind git-std-lib
> makes sense as well as whether there are better ways to build and enable
> git-std-lib in patch 7, specifically regarding Makefile rules and the
> usage of ifdef's to stub out certain functions and headers. 

If the cleanups are independent I think it would be simpler to put them
in a separate series.

In general, I think the doc would make a stronger case if it expanded
the discussions around alternative approaches to the one proposed, with
the reasons why they were rejected.

Minor nits:
- Documentation/technical/git-std-lib.txt: (style) prefer "we" over "I" ("we
  believe" instead of "I believe").
- There are some "\ No newline at end of file" warnings in this series.

Thanks,
Linus

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 7/8] git-std-lib: introduce git standard library
  2023-06-28 21:15     ` Calvin Wan
@ 2023-06-30 10:00       ` Phillip Wood
  0 siblings, 0 replies; 70+ messages in thread
From: Phillip Wood @ 2023-06-30 10:00 UTC (permalink / raw)
  To: Calvin Wan, phillip.wood; +Cc: git, nasamuffin, chooglen, Jonathan Tan

Hi Calvin

On 28/06/2023 22:15, Calvin Wan wrote:
>> On 27/06/2023 20:52, Calvin Wan wrote:
>>> The Git Standard Library intends to serve as the foundational library
>>> and root dependency that other libraries in Git will be built off of.
>>> That is to say, suppose we have libraries X and Y; a user that wants to
>>> use X and Y would need to include X, Y, and this Git Standard Library.
>>
>> I think having a library of commonly used functions and structures is a
>> good idea. While I appreciate that we don't want to include everything
>> I'm surprised to see it does not include things like "hashmap.c" and
>> "string-list.c" that will be required by the config library as well as
>> other code in "libgit.a". I don't think we want "libgitconfig.a" and
>> "libgit.a" to both contain a copy of "hashmap.o" and "string-list.o"
> 
> I chose not to include hashmap and string-list in git-std-lib.a in the
> first pass since they can exist as libraries built on top of
> git-std-lib.a. There is no harm starting off with more libraries than
> fewer besides having something like the config library be dependent on
> lib-hashmap.a, lib-string-list.a, and git-std-lib.a rather than only
> git-std-lib.a. They can always be added into git-std-lib.a in the
> future. That being said, I do find it extremely unlikely that someone
> would want to swap out the implementation for hashmap or string-list
> so it is also very reasonable to include them into git-std-lib.a

Finding the right boundary for git-std-lib is a bit of a judgement call. 
We certainly could have separate libraries for things like hashmap, 
string-list, strvec, strmap and wildmatch but there is some overhead 
adding each one to the Makefile. I think their use is common enough that 
it would be continent to have them in git-std-lib but we can always add 
them later.

>>> diff --git a/Makefile b/Makefile
>>> index e9ad9f9ef1..255bd10b82 100644
>>> --- a/Makefile
>>> +++ b/Makefile
>>> @@ -2162,6 +2162,11 @@ ifdef FSMONITOR_OS_SETTINGS
>>>        COMPAT_OBJS += compat/fsmonitor/fsm-path-utils-$(FSMONITOR_OS_SETTINGS).o
>>>    endif
>>>
>>> +ifdef GIT_STD_LIB
>>> +     BASIC_CFLAGS += -DGIT_STD_LIB
>>> +     BASIC_CFLAGS += -DNO_GETTEXT
>>
>> I can see other projects may want to build git-std-lib without gettext
>> support but if we're going to use git-std-lib within git it needs to be
>> able to be built with that support. The same goes for the trace
>> functions that you are redefining in usage.h
> 
> Taking a closer look at gettext.[ch], I believe I can also include it
> into git-std-lib.a with a couple of minor changes.

That's great

> I'm currently
> thinking about how the trace functions should interact with
> git-std-lib.a since Victoria had similar comments on patch 1. I'll
> reply to that thread when I come up with an answer.

One thought I had was to have a compile time flag so someone building 
git-std-lib for an external project could build it with

	make git-std-lib NO_TRACE2=YesPlease

and then we'd either compile against a stub version of trace2 that does 
nothing or use some #define magic to get rid of the calls if that is not 
too invasive.

Best Wishes

Phillip


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper
  2023-06-28  2:05   ` Victoria Dye
@ 2023-07-05 17:57     ` Calvin Wan
  2023-07-05 18:22       ` Victoria Dye
  0 siblings, 1 reply; 70+ messages in thread
From: Calvin Wan @ 2023-07-05 17:57 UTC (permalink / raw)
  To: Victoria Dye; +Cc: git, nasamuffin, chooglen, johnathantanmy

> This function does not belong in 'trace2.h', IMO. The purpose of that file
> is to contain the generic API for Trace2 (e.g., 'trace2_printf()',
> 'trace2_region_(enter|exit)'), whereas this function is effectively a
> wrapper around a specific invocation of that API.
>
> You note in the commit message that "wrapper.c should not directly log
> trace2 statistics" with the reasoning of "[it's] a library boundary," but I
> suspect the unstated underlying reason is "because it tracks 'count_fsync_*'
> in static variables." This case would be better handled, then, by replacing
> the usage in 'wrapper.c' with a new Trace2 counter (API introduced in [1]).
> That keeps this usage consistent with the API already established for
> Trace2, rather than starting an unsustainable trend of creating ad-hoc,
> per-metric wrappers in 'trace2.[c|h]'.

The underlying reason is for removing the trace2 dependency from
wrapper.c so that when git-std-lib is compiled, there isn't a missing
object for  trace_git_fsync_stats(), resulting in a compilation error.
However I do agree that the method I chose to do so by creating an
ad-hoc wrapper is unsustainable and I will come up with a better
method for doing so.

>
> An added note re: the commit message - it's extremely important that
> functions _anywhere in Git_ are able to use the Trace2 API directly. A
> developer could reasonably want to measure performance, keep track of an
> interesting metric, log when a region is entered in the larger trace,
> capture error information, etc. for any function, regardless of where in
> falls in the internal library organization.

I don't quite agree that functions _anywhere in Git_ are able to use
the Trace2 API directly for the same reason that we don't have the
ability to log functions in external libraries -- logging common,
low-level functionality creates an unnecessary amount of log churn and
those logs generally contain practically useless information. However,
that does not mean that all of the functions in git-std-lib fall into
that category (usage has certain functions definitely worth logging).
This means that files like usage.c could instead be separated into its
own library and git-std-lib would only contain files that we deem
"should never be logged".

> To that end, I think either the
> commit message should be rephrased to remove that statement (if the issue is
> really "we're using a static variable and we want to avoid that"), or the
> libification effort should be updated to accommodate use of Trace2 anywhere
> in Git.

Besides potentially redrawing the boundaries of git-std-lib to
accommodate Trace2, we're also looking into the possibility of
stubbing out tracing in git-std-lib so that it and other libraries can
be built and tested, and then when Trace2 is turned into a library,
it's full functionality can be linked to.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper
  2023-07-05 17:57     ` Calvin Wan
@ 2023-07-05 18:22       ` Victoria Dye
  0 siblings, 0 replies; 70+ messages in thread
From: Victoria Dye @ 2023-07-05 18:22 UTC (permalink / raw)
  To: Calvin Wan; +Cc: git, nasamuffin, chooglen, johnathantanmy

Calvin Wan wrote:
>> An added note re: the commit message - it's extremely important that
>> functions _anywhere in Git_ are able to use the Trace2 API directly. A
>> developer could reasonably want to measure performance, keep track of an
>> interesting metric, log when a region is entered in the larger trace,
>> capture error information, etc. for any function, regardless of where in
>> falls in the internal library organization.
> 
> I don't quite agree that functions _anywhere in Git_ are able to use
> the Trace2 API directly for the same reason that we don't have the
> ability to log functions in external libraries -- logging common,
> low-level functionality creates an unnecessary amount of log churn and
> those logs generally contain practically useless information. 

That may be true in your use cases, but it isn't in mine and may not be for
others'. In fact, I was just using these exact fsync metrics a couple weeks
ago to do some performance analysis; I could easily imagine doing something
similar for another "low level" function. It's unreasonable - and unfair to
future development - to make an absolute declaration about "what's useful
vs. useless" and use that decision to justify severely limiting our future
flexibility on the matter.

> However,
> that does not mean that all of the functions in git-std-lib fall into
> that category (usage has certain functions definitely worth logging).
> This means that files like usage.c could instead be separated into its
> own library and git-std-lib would only contain files that we deem
> "should never be logged".

How do you make that determination? What about if/when someone realizes,
somewhere down the line, that one of those "should never be logged" files
would actually benefit from some aggregate metric, e.g. a Trace2 timer? This
isn't a case of extracting an extraneous dependency (where a function really
doesn't _need_ something it has access to); tracing & logging is a core
functionality in Git, and should not be artificially constrained in the name
of organization. 

>> To that end, I think either the
>> commit message should be rephrased to remove that statement (if the issue is
>> really "we're using a static variable and we want to avoid that"), or the
>> libification effort should be updated to accommodate use of Trace2 anywhere
>> in Git.
> 
> Besides potentially redrawing the boundaries of git-std-lib to
> accommodate Trace2, we're also looking into the possibility of
> stubbing out tracing in git-std-lib so that it and other libraries can
> be built and tested, and then when Trace2 is turned into a library,
> it's full functionality can be linked to.

If that allows you to meet your libification goals without limiting Trace2's
accessibility throughout the codebase, that works for me.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper
  2023-06-27 19:52 ` [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper Calvin Wan
  2023-06-28  2:05   ` Victoria Dye
@ 2023-07-11 20:07   ` Jeff Hostetler
  1 sibling, 0 replies; 70+ messages in thread
From: Jeff Hostetler @ 2023-07-11 20:07 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: nasamuffin, chooglen, johnathantanmy



On 6/27/23 3:52 PM, Calvin Wan wrote:
> As a library boundary, wrapper.c should not directly log trace2
> statistics, but instead provide those statistics upon
> request. Therefore, move the trace2 logging code to trace2.[ch.]. This
> also allows wrapper.c to not be dependent on trace2.h and repository.h.
> 
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> ---
>   trace2.c  | 13 +++++++++++++
>   trace2.h  |  5 +++++
>   wrapper.c | 17 ++++++-----------
>   wrapper.h |  4 ++--
>   4 files changed, 26 insertions(+), 13 deletions(-)
> 
> diff --git a/trace2.c b/trace2.c
> index 0efc4e7b95..f367a1ce31 100644
> --- a/trace2.c
> +++ b/trace2.c
> @@ -915,3 +915,16 @@ const char *trace2_session_id(void)
>   {
>   	return tr2_sid_get();
>   }
> +
> +static void log_trace_fsync_if(const char *key)
> +{
> +	intmax_t value = get_trace_git_fsync_stats(key);
> +	if (value)
> +		trace2_data_intmax("fsync", the_repository, key, value);
> +}
> +
> +void trace_git_fsync_stats(void)
> +{
> +	log_trace_fsync_if("fsync/writeout-only");
> +	log_trace_fsync_if("fsync/hardware-flush");
> +}
> diff --git a/trace2.h b/trace2.h
> index 4ced30c0db..689e9a4027 100644
> --- a/trace2.h
> +++ b/trace2.h
> @@ -581,4 +581,9 @@ void trace2_collect_process_info(enum trace2_process_info_reason reason);
>   
>   const char *trace2_session_id(void);
>   
> +/*
> + * Writes out trace statistics for fsync
> + */
> +void trace_git_fsync_stats(void);
> +
>   #endif /* TRACE2_H */

Sorry to be late to this party, but none of this belongs
in trace2.[ch].

As Victoria stated, you can/should use the new "timers and counters"
feature in Trace2 to collect and log these stats.

And then you don't need specific log_trace_* functions or wrappers
-- just use the trace2_timer_start()/stop() or trace2_counter_add()
functions as necessary around the various fsync operations.


I haven't really followed the lib-ification effort, so I'm just going
to GUESS that all of the Trace2_ and tr2_ prefixed functions and data
structures will need to be in the lowest-level .a so that it can be
called from the main .exe and any other .a's between them.

Jeff


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [RFC PATCH v2 0/7] Introduce Git Standard Library
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
                   ` (9 preceding siblings ...)
  2023-06-30  7:01 ` Linus Arver
@ 2023-08-10 16:33 ` Calvin Wan
  2023-08-10 16:36   ` [RFC PATCH v2 1/7] hex-ll: split out functionality from hex Calvin Wan
                     ` (8 more replies)
  10 siblings, 9 replies; 70+ messages in thread
From: Calvin Wan @ 2023-08-10 16:33 UTC (permalink / raw)
  To: git
  Cc: Calvin Wan, nasamuffin, chooglen, jonathantanmy, linusa,
	phillip.wood123, vdye

Original cover letter:
https://lore.kernel.org/git/20230627195251.1973421-1-calvinwan@google.com/

In the initial RFC, I had a patch that removed the trace2 dependency
from usage.c so that git-std-lib.a would not have dependencies outside
of git-std-lib.a files. Consequently this meant that tracing would not
be possible in git-std-lib.a files for other developers of Git, and it
is not a good idea for the libification effort to close the door on
tracing in certain files for future development (thanks Victoria for
pointing this out). That patch has been removed and instead I introduce
stubbed out versions of repository.[ch] and trace2.[ch] that are swapped
in during compilation time (I'm no Makefile expert so any advice on how
on I could do this better would be much appreciated). These stubbed out
files contain no implementations and therefore do not have any
additional dependencies, allowing git-std-lib.a to compile with only the
stubs as additional dependencies. This also has the added benefit of
removing `#ifdef GIT_STD_LIB` macros in C files for specific library
compilation rules. Libification shouldn't pollute C files with these
macros. The boundaries for git-std-lib.a have also been updated to
contain these stubbed out files.

I have also made some additional changes to the Makefile to piggy back
off of our existing build rules for .c/.o targets and their
dependencies. As I learn more about Makefiles, I am continuing to look
for ways to improve these rules. Eventually I would like to be able to
have a set of rules that future libraries can emulate and is scalable
in the sense of not creating additional toil for developers that are not
interested in libification.

Calvin Wan (7):
  hex-ll: split out functionality from hex
  object: move function to object.c
  config: correct bad boolean env value error message
  parse: create new library for parsing strings and env values
  date: push pager.h dependency up
  git-std-lib: introduce git standard library
  git-std-lib: add test file to call git-std-lib.a functions

 Documentation/technical/git-std-lib.txt | 186 ++++++++++++++++++
 Makefile                                |  64 ++++++-
 attr.c                                  |   2 +-
 builtin/blame.c                         |   2 +-
 builtin/log.c                           |   2 +-
 color.c                                 |   2 +-
 config.c                                | 173 +----------------
 config.h                                |  14 +-
 date.c                                  |   5 +-
 date.h                                  |   2 +-
 git-compat-util.h                       |   7 +-
 hex-ll.c                                |  49 +++++
 hex-ll.h                                |  27 +++
 hex.c                                   |  47 -----
 hex.h                                   |  24 +--
 mailinfo.c                              |   2 +-
 object.c                                |   5 +
 object.h                                |   6 +
 pack-objects.c                          |   2 +-
 pack-revindex.c                         |   2 +-
 parse-options.c                         |   3 +-
 parse.c                                 | 182 ++++++++++++++++++
 parse.h                                 |  20 ++
 pathspec.c                              |   2 +-
 preload-index.c                         |   2 +-
 progress.c                              |   2 +-
 prompt.c                                |   2 +-
 rebase.c                                |   2 +-
 ref-filter.c                            |   3 +-
 revision.c                              |   3 +-
 strbuf.c                                |   2 +-
 stubs/repository.c                      |   4 +
 stubs/repository.h                      |   8 +
 stubs/trace2.c                          |  22 +++
 stubs/trace2.h                          |  69 +++++++
 symlinks.c                              |   2 +
 t/Makefile                              |   4 +
 t/helper/test-date.c                    |   3 +-
 t/helper/test-env-helper.c              |   2 +-
 t/stdlib-test.c                         | 239 ++++++++++++++++++++++++
 unpack-trees.c                          |   2 +-
 url.c                                   |   2 +-
 urlmatch.c                              |   2 +-
 wrapper.c                               |   8 +-
 wrapper.h                               |   5 -
 write-or-die.c                          |   2 +-
 46 files changed, 925 insertions(+), 295 deletions(-)
 create mode 100644 Documentation/technical/git-std-lib.txt
 create mode 100644 hex-ll.c
 create mode 100644 hex-ll.h
 create mode 100644 parse.c
 create mode 100644 parse.h
 create mode 100644 stubs/repository.c
 create mode 100644 stubs/repository.h
 create mode 100644 stubs/trace2.c
 create mode 100644 stubs/trace2.h
 create mode 100644 t/stdlib-test.c

Range-diff against v1:
1:  f7abe7a239 < -:  ---------- trace2: log fsync stats in trace2 rather than wrapper
2:  c302ae0052 = 1:  78634bc406 hex-ll: split out functionality from hex
3:  74e8e35ae2 ! 2:  21ec1d276e object: move function to object.c
    @@ wrapper.c
      #include "config.h"
      #include "gettext.h"
     -#include "object.h"
    + #include "repository.h"
      #include "strbuf.h"
    - 
    - static intmax_t count_fsync_writeout_only;
    + #include "trace2.h"
     @@ wrapper.c: int rmdir_or_warn(const char *file)
      	return warn_if_unremovable("rmdir", file, rmdir(file));
      }
4:  419c702633 = 3:  41dcf8107c config: correct bad boolean env value error message
5:  a325002438 ! 4:  3e800a41c4 parse: create new library for parsing strings and env values
    @@ wrapper.c
     -#include "config.h"
     +#include "parse.h"
      #include "gettext.h"
    + #include "repository.h"
      #include "strbuf.h"
    - 
     
      ## write-or-die.c ##
     @@
6:  475190310a < -:  ---------- pager: remove pager_in_use()
-:  ---------- > 5:  7a4a088bc3 date: push pager.h dependency up
7:  d7f4d4a137 ! 6:  c9002734d0 git-std-lib: introduce git standard library
    @@ Documentation/technical/git-std-lib.txt (new)
     +easily included in git-std-lib.a, which as a root dependency means that
     +higher level libraries do not have to worry about compatibility files in
     +compat/. The rest of the functions defined in git-compat-util.h are
    -+implemented in top level files and, in this patch set, are hidden behind
    ++implemented in top level files and are hidden behind
     +an #ifdef if their implementation is not in git-std-lib.a.
     +
     +Rationale summary
    @@ Documentation/technical/git-std-lib.txt (new)
     + - low-level git/* files with functions defined in git-compat-util.h
     +   (ctype.c)
     + - compat/*
    ++ - stubbed out dependencies in stubs/ (stubs/repository.c, stubs/trace2.c)
     +
     +There are other files that might fit this definition, but that does not
     +mean it should belong in git-std-lib.a. Those files should start as
     +their own separate library since any file added to git-std-lib.a loses
     +its flexibility of being easily swappable.
     +
    ++Wrapper.c and usage.c have dependencies on repository and trace2 that are
    ++possible to remove at the cost of sacrificing the ability for standard Git
    ++to be able to trace functions in those files and other files in git-std-lib.a.
    ++In order for git-std-lib.a to compile with those dependencies, stubbed out
    ++versions of those files are implemented and swapped in during compilation time.
    ++
     +Files inside of Git Standard Library
     +================
     +
    @@ Documentation/technical/git-std-lib.txt (new)
     +usage.c
     +utf8.c
     +wrapper.c
    ++stubs/repository.c
    ++stubs/trace2.c
     +relevant compat/ files
     +
     +Pitfalls
     +================
     +
    -+In patch 7, I use #ifdef GIT_STD_LIB to both stub out code and hide
    -+certain function headers. As other parts of Git are libified, if we
    -+have to use more ifdefs for each different library, then the codebase
    -+will become uglier and harder to understand. 
    -+
     +There are a small amount of files under compat/* that have dependencies
     +not inside of git-std-lib.a. While those functions are not called on
     +Linux, other OSes might call those problematic functions. I don't see
    @@ Documentation/technical/git-std-lib.txt (new)
      \ No newline at end of file
     
      ## Makefile ##
    +@@ Makefile: FUZZ_PROGRAMS =
    + GIT_OBJS =
    + LIB_OBJS =
    + SCALAR_OBJS =
    ++STUB_OBJS =
    + OBJECTS =
    + OTHER_PROGRAMS =
    + PROGRAM_OBJS =
    +@@ Makefile: COCCI_SOURCES = $(filter-out $(THIRD_PARTY_SOURCES),$(FOUND_C_SOURCES))
    + 
    + LIB_H = $(FOUND_H_SOURCES)
    + 
    ++ifndef GIT_STD_LIB
    + LIB_OBJS += abspath.o
    + LIB_OBJS += add-interactive.o
    + LIB_OBJS += add-patch.o
    +@@ Makefile: LIB_OBJS += write-or-die.o
    + LIB_OBJS += ws.o
    + LIB_OBJS += wt-status.o
    + LIB_OBJS += xdiff-interface.o
    ++else ifdef GIT_STD_LIB
    ++LIB_OBJS += abspath.o
    ++LIB_OBJS += ctype.o
    ++LIB_OBJS += date.o
    ++LIB_OBJS += hex-ll.o
    ++LIB_OBJS += parse.o
    ++LIB_OBJS += strbuf.o
    ++LIB_OBJS += usage.o
    ++LIB_OBJS += utf8.o
    ++LIB_OBJS += wrapper.o
    ++
    ++ifdef STUB_REPOSITORY
    ++STUB_OBJS += stubs/repository.o
    ++endif
    ++
    ++ifdef STUB_TRACE2
    ++STUB_OBJS += stubs/trace2.o
    ++endif
    ++
    ++LIB_OBJS += $(STUB_OBJS)
    ++endif
    + 
    + BUILTIN_OBJS += builtin/add.o
    + BUILTIN_OBJS += builtin/am.o
     @@ Makefile: ifdef FSMONITOR_OS_SETTINGS
      	COMPAT_OBJS += compat/fsmonitor/fsm-path-utils-$(FSMONITOR_OS_SETTINGS).o
      endif
    @@ Makefile: $(FUZZ_PROGRAMS): all
     +### Libified Git rules
     +
     +# git-std-lib
    -+# `make git-std-lib GIT_STD_LIB=YesPlease`
    ++# `make git-std-lib GIT_STD_LIB=YesPlease STUB_REPOSITORY=YesPlease STUB_TRACE2=YesPlease`
     +STD_LIB = git-std-lib.a
     +
    -+GIT_STD_LIB_OBJS += abspath.o
    -+GIT_STD_LIB_OBJS += ctype.o
    -+GIT_STD_LIB_OBJS += date.o
    -+GIT_STD_LIB_OBJS += hex-ll.o
    -+GIT_STD_LIB_OBJS += parse.o
    -+GIT_STD_LIB_OBJS += strbuf.o
    -+GIT_STD_LIB_OBJS += usage.o
    -+GIT_STD_LIB_OBJS += utf8.o
    -+GIT_STD_LIB_OBJS += wrapper.o
    -+
    -+$(STD_LIB): $(GIT_STD_LIB_OBJS) $(COMPAT_OBJS)
    ++$(STD_LIB): $(LIB_OBJS) $(COMPAT_OBJS) $(STUB_OBJS)
     +	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
     +
    -+git-std-lib: $(STD_LIB)
    ++TEMP_HEADERS = temp_headers/
    ++
    ++git-std-lib:
    ++# Move headers to temporary folder and replace them with stubbed headers.
    ++# After building, move headers and stubbed headers back.
    ++ifneq ($(STUB_OBJS),)
    ++	mkdir -p $(TEMP_HEADERS); \
    ++	for d in $(STUB_OBJS); do \
    ++		BASE=$${d%.*}; \
    ++		mv $${BASE##*/}.h $(TEMP_HEADERS)$${BASE##*/}.h; \
    ++		mv $${BASE}.h $${BASE##*/}.h; \
    ++	done; \
    ++	$(MAKE) $(STD_LIB); \
    ++	for d in $(STUB_OBJS); do \
    ++		BASE=$${d%.*}; \
    ++		mv $${BASE##*/}.h $${BASE}.h; \
    ++		mv $(TEMP_HEADERS)$${BASE##*/}.h $${BASE##*/}.h; \
    ++	done; \
    ++	rm -rf temp_headers
    ++else
    ++	$(MAKE) $(STD_LIB)
    ++endif
     
      ## git-compat-util.h ##
     @@ git-compat-util.h: static inline int noop_core_config(const char *var UNUSED,
    @@ git-compat-util.h: int git_access(const char *path, int mode);
      /*
       * You can mark a stack variable with UNLEAK(var) to avoid it being
     
    + ## stubs/repository.c (new) ##
    +@@
    ++#include "git-compat-util.h"
    ++#include "repository.h"
    ++
    ++struct repository *the_repository;
    +
    + ## stubs/repository.h (new) ##
    +@@
    ++#ifndef REPOSITORY_H
    ++#define REPOSITORY_H
    ++
    ++struct repository { int stub; };
    ++
    ++extern struct repository *the_repository;
    ++
    ++#endif /* REPOSITORY_H */
    +
    + ## stubs/trace2.c (new) ##
    +@@
    ++#include "git-compat-util.h"
    ++#include "trace2.h"
    ++
    ++void trace2_region_enter_fl(const char *file, int line, const char *category,
    ++			    const char *label, const struct repository *repo, ...) { }
    ++void trace2_region_leave_fl(const char *file, int line, const char *category,
    ++			    const char *label, const struct repository *repo, ...) { }
    ++void trace2_data_string_fl(const char *file, int line, const char *category,
    ++			   const struct repository *repo, const char *key,
    ++			   const char *value) { }
    ++void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names) { }
    ++void trace2_cmd_error_va_fl(const char *file, int line, const char *fmt,
    ++			    va_list ap) { }
    ++void trace2_cmd_name_fl(const char *file, int line, const char *name) { }
    ++void trace2_thread_start_fl(const char *file, int line,
    ++			    const char *thread_base_name) { }
    ++void trace2_thread_exit_fl(const char *file, int line) { }
    ++void trace2_data_intmax_fl(const char *file, int line, const char *category,
    ++			   const struct repository *repo, const char *key,
    ++			   intmax_t value) { }
    ++int trace2_is_enabled(void) { return 0; }
    ++void trace2_collect_process_info(enum trace2_process_info_reason reason) { }
    +
    + ## stubs/trace2.h (new) ##
    +@@
    ++#ifndef TRACE2_H
    ++#define TRACE2_H
    ++
    ++struct child_process { int stub; };
    ++struct repository;
    ++struct json_writer { int stub; };
    ++
    ++void trace2_region_enter_fl(const char *file, int line, const char *category,
    ++			    const char *label, const struct repository *repo, ...);
    ++
    ++#define trace2_region_enter(category, label, repo) \
    ++	trace2_region_enter_fl(__FILE__, __LINE__, (category), (label), (repo))
    ++
    ++void trace2_region_leave_fl(const char *file, int line, const char *category,
    ++			    const char *label, const struct repository *repo, ...);
    ++
    ++#define trace2_region_leave(category, label, repo) \
    ++	trace2_region_leave_fl(__FILE__, __LINE__, (category), (label), (repo))
    ++
    ++void trace2_data_string_fl(const char *file, int line, const char *category,
    ++			   const struct repository *repo, const char *key,
    ++			   const char *value);
    ++
    ++#define trace2_data_string(category, repo, key, value)                       \
    ++	trace2_data_string_fl(__FILE__, __LINE__, (category), (repo), (key), \
    ++			      (value))
    ++
    ++void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names);
    ++
    ++#define trace2_cmd_ancestry(v) trace2_cmd_ancestry_fl(__FILE__, __LINE__, (v))
    ++
    ++void trace2_cmd_error_va_fl(const char *file, int line, const char *fmt,
    ++			    va_list ap);
    ++
    ++#define trace2_cmd_error_va(fmt, ap) \
    ++	trace2_cmd_error_va_fl(__FILE__, __LINE__, (fmt), (ap))
    ++
    ++
    ++void trace2_cmd_name_fl(const char *file, int line, const char *name);
    ++
    ++#define trace2_cmd_name(v) trace2_cmd_name_fl(__FILE__, __LINE__, (v))
    ++
    ++void trace2_thread_start_fl(const char *file, int line,
    ++			    const char *thread_base_name);
    ++
    ++#define trace2_thread_start(thread_base_name) \
    ++	trace2_thread_start_fl(__FILE__, __LINE__, (thread_base_name))
    ++
    ++void trace2_thread_exit_fl(const char *file, int line);
    ++
    ++#define trace2_thread_exit() trace2_thread_exit_fl(__FILE__, __LINE__)
    ++
    ++void trace2_data_intmax_fl(const char *file, int line, const char *category,
    ++			   const struct repository *repo, const char *key,
    ++			   intmax_t value);
    ++
    ++#define trace2_data_intmax(category, repo, key, value)                       \
    ++	trace2_data_intmax_fl(__FILE__, __LINE__, (category), (repo), (key), \
    ++			      (value))
    ++
    ++enum trace2_process_info_reason {
    ++	TRACE2_PROCESS_INFO_STARTUP,
    ++	TRACE2_PROCESS_INFO_EXIT,
    ++};
    ++int trace2_is_enabled(void);
    ++void trace2_collect_process_info(enum trace2_process_info_reason reason);
    ++
    ++#endif /* TRACE2_H */
    ++
    +
      ## symlinks.c ##
     @@ symlinks.c: void invalidate_lstat_cache(void)
      	reset_lstat_cache(&default_cache);
    @@ symlinks.c: int lstat_cache_aware_rmdir(const char *path)
      	return ret;
      }
     +#endif
    -
    - ## usage.c ##
    -@@
    -  */
    - #include "git-compat-util.h"
    - #include "gettext.h"
    -+
    -+#ifdef GIT_STD_LIB
    -+#undef trace2_cmd_name
    -+#undef trace2_cmd_error_va
    -+#define trace2_cmd_name(x) 
    -+#define trace2_cmd_error_va(x, y)
    -+#else
    - #include "trace2.h"
    -+#endif
    - 
    - static void vreportf(const char *prefix, const char *err, va_list params)
    - {
8:  cb96e67774 ! 7:  0bead8f980 git-std-lib: add test file to call git-std-lib.a functions
    @@ t/stdlib-test.c (new)
     +	strbuf_splice(sb, 0, 1, "foo", 3);
     +	strbuf_insert(sb, 0, "foo", 3);
     +	// strbuf_vinsertf() called by strbuf_insertf
    -+	strbuf_insertf(sb, 0, "%s", "foo"); 
    ++	strbuf_insertf(sb, 0, "%s", "foo");
     +	strbuf_remove(sb, 0, 1);
     +	strbuf_add(sb, "foo", 3);
     +	strbuf_addbuf(sb, sb2);
    @@ t/stdlib-test.c (new)
     +	unlink(path);
     +	read_in_full(fd, &sb, 1);
     +	write_in_full(fd, &sb, 1);
    -+	pread_in_full(fd, &sb, 1, 0);	
    ++	pread_in_full(fd, &sb, 1, 0);
     +}
     +
     +int main() {
    @@ t/stdlib-test.c (new)
     +	fprintf(stderr, "all git-std-lib functions finished calling\n");
     +	return 0;
     +}
    - \ No newline at end of file
-- 
2.41.0.640.ga95def55d0-goog


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [RFC PATCH v2 1/7] hex-ll: split out functionality from hex
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
@ 2023-08-10 16:36   ` Calvin Wan
  2023-08-10 16:36   ` [RFC PATCH v2 2/7] object: move function to object.c Calvin Wan
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 70+ messages in thread
From: Calvin Wan @ 2023-08-10 16:36 UTC (permalink / raw)
  To: git
  Cc: Calvin Wan, nasamuffin, chooglen, jonathantanmy, linusa,
	phillip.wood123, vdye

Separate out hex functionality that doesn't require a hash algo into
hex-ll.[ch]. Since the hash algo is currently a global that sits in
repository, this separation removes that dependency for files that only
need basic hex manipulation functions.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 Makefile   |  1 +
 color.c    |  2 +-
 hex-ll.c   | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 hex-ll.h   | 27 +++++++++++++++++++++++++++
 hex.c      | 47 -----------------------------------------------
 hex.h      | 24 +-----------------------
 mailinfo.c |  2 +-
 strbuf.c   |  2 +-
 url.c      |  2 +-
 urlmatch.c |  2 +-
 10 files changed, 83 insertions(+), 75 deletions(-)
 create mode 100644 hex-ll.c
 create mode 100644 hex-ll.h

diff --git a/Makefile b/Makefile
index 045e2187c4..83b385b0be 100644
--- a/Makefile
+++ b/Makefile
@@ -1040,6 +1040,7 @@ LIB_OBJS += hash-lookup.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
+LIB_OBJS += hex-ll.o
 LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += json-writer.o
diff --git a/color.c b/color.c
index 83abb11eda..f3c0a4659b 100644
--- a/color.c
+++ b/color.c
@@ -3,7 +3,7 @@
 #include "color.h"
 #include "editor.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "pager.h"
 #include "strbuf.h"
 
diff --git a/hex-ll.c b/hex-ll.c
new file mode 100644
index 0000000000..4d7ece1de5
--- /dev/null
+++ b/hex-ll.c
@@ -0,0 +1,49 @@
+#include "git-compat-util.h"
+#include "hex-ll.h"
+
+const signed char hexval_table[256] = {
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 00-07 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 08-0f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 10-17 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 18-1f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 20-27 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 28-2f */
+	  0,  1,  2,  3,  4,  5,  6,  7,		/* 30-37 */
+	  8,  9, -1, -1, -1, -1, -1, -1,		/* 38-3f */
+	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 40-47 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 48-4f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 50-57 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 58-5f */
+	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 60-67 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 68-67 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 70-77 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 78-7f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 80-87 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 88-8f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 90-97 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 98-9f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a0-a7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a8-af */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b0-b7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b8-bf */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c0-c7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c8-cf */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d0-d7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d8-df */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e0-e7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e8-ef */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f0-f7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f8-ff */
+};
+
+int hex_to_bytes(unsigned char *binary, const char *hex, size_t len)
+{
+	for (; len; len--, hex += 2) {
+		unsigned int val = (hexval(hex[0]) << 4) | hexval(hex[1]);
+
+		if (val & ~0xff)
+			return -1;
+		*binary++ = val;
+	}
+	return 0;
+}
diff --git a/hex-ll.h b/hex-ll.h
new file mode 100644
index 0000000000..a381fa8556
--- /dev/null
+++ b/hex-ll.h
@@ -0,0 +1,27 @@
+#ifndef HEX_LL_H
+#define HEX_LL_H
+
+extern const signed char hexval_table[256];
+static inline unsigned int hexval(unsigned char c)
+{
+	return hexval_table[c];
+}
+
+/*
+ * Convert two consecutive hexadecimal digits into a char.  Return a
+ * negative value on error.  Don't run over the end of short strings.
+ */
+static inline int hex2chr(const char *s)
+{
+	unsigned int val = hexval(s[0]);
+	return (val & ~0xf) ? val : (val << 4) | hexval(s[1]);
+}
+
+/*
+ * Read `len` pairs of hexadecimal digits from `hex` and write the
+ * values to `binary` as `len` bytes. Return 0 on success, or -1 if
+ * the input does not consist of hex digits).
+ */
+int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
+
+#endif
diff --git a/hex.c b/hex.c
index 7bb440e794..03e55841ed 100644
--- a/hex.c
+++ b/hex.c
@@ -2,53 +2,6 @@
 #include "hash.h"
 #include "hex.h"
 
-const signed char hexval_table[256] = {
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 00-07 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 08-0f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 10-17 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 18-1f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 20-27 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 28-2f */
-	  0,  1,  2,  3,  4,  5,  6,  7,		/* 30-37 */
-	  8,  9, -1, -1, -1, -1, -1, -1,		/* 38-3f */
-	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 40-47 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 48-4f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 50-57 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 58-5f */
-	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 60-67 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 68-67 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 70-77 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 78-7f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 80-87 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 88-8f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 90-97 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 98-9f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a0-a7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a8-af */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b0-b7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b8-bf */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c0-c7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c8-cf */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d0-d7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d8-df */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e0-e7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e8-ef */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f0-f7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f8-ff */
-};
-
-int hex_to_bytes(unsigned char *binary, const char *hex, size_t len)
-{
-	for (; len; len--, hex += 2) {
-		unsigned int val = (hexval(hex[0]) << 4) | hexval(hex[1]);
-
-		if (val & ~0xff)
-			return -1;
-		*binary++ = val;
-	}
-	return 0;
-}
-
 static int get_hash_hex_algop(const char *hex, unsigned char *hash,
 			      const struct git_hash_algo *algop)
 {
diff --git a/hex.h b/hex.h
index 7df4b3c460..c07c8b34c2 100644
--- a/hex.h
+++ b/hex.h
@@ -2,22 +2,7 @@
 #define HEX_H
 
 #include "hash-ll.h"
-
-extern const signed char hexval_table[256];
-static inline unsigned int hexval(unsigned char c)
-{
-	return hexval_table[c];
-}
-
-/*
- * Convert two consecutive hexadecimal digits into a char.  Return a
- * negative value on error.  Don't run over the end of short strings.
- */
-static inline int hex2chr(const char *s)
-{
-	unsigned int val = hexval(s[0]);
-	return (val & ~0xf) ? val : (val << 4) | hexval(s[1]);
-}
+#include "hex-ll.h"
 
 /*
  * Try to read a SHA1 in hexadecimal format from the 40 characters
@@ -32,13 +17,6 @@ int get_oid_hex(const char *hex, struct object_id *sha1);
 /* Like get_oid_hex, but for an arbitrary hash algorithm. */
 int get_oid_hex_algop(const char *hex, struct object_id *oid, const struct git_hash_algo *algop);
 
-/*
- * Read `len` pairs of hexadecimal digits from `hex` and write the
- * values to `binary` as `len` bytes. Return 0 on success, or -1 if
- * the input does not consist of hex digits).
- */
-int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
-
 /*
  * Convert a binary hash in "unsigned char []" or an object name in
  * "struct object_id *" to its hex equivalent. The `_r` variant is reentrant,
diff --git a/mailinfo.c b/mailinfo.c
index 2aeb20e5e6..eb34c30be7 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -1,7 +1,7 @@
 #include "git-compat-util.h"
 #include "config.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "utf8.h"
 #include "strbuf.h"
 #include "mailinfo.h"
diff --git a/strbuf.c b/strbuf.c
index 8dac52b919..a2a05fe168 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "string-list.h"
 #include "utf8.h"
diff --git a/url.c b/url.c
index 2e1a9f6fee..282b12495a 100644
--- a/url.c
+++ b/url.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "url.h"
 
diff --git a/urlmatch.c b/urlmatch.c
index eba0bdd77f..f1aa87d1dd 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "urlmatch.h"
 
-- 
2.41.0.640.ga95def55d0-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [RFC PATCH v2 2/7] object: move function to object.c
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
  2023-08-10 16:36   ` [RFC PATCH v2 1/7] hex-ll: split out functionality from hex Calvin Wan
@ 2023-08-10 16:36   ` Calvin Wan
  2023-08-10 20:32     ` Junio C Hamano
  2023-08-10 22:36     ` Glen Choo
  2023-08-10 16:36   ` [RFC PATCH v2 3/7] config: correct bad boolean env value error message Calvin Wan
                     ` (6 subsequent siblings)
  8 siblings, 2 replies; 70+ messages in thread
From: Calvin Wan @ 2023-08-10 16:36 UTC (permalink / raw)
  To: git
  Cc: Calvin Wan, nasamuffin, chooglen, jonathantanmy, linusa,
	phillip.wood123, vdye

While remove_or_warn() is a simple ternary operator to call two other
wrapper functions, it creates an unnecessary dependency to object.h in
wrapper.c. Therefore move the function to object.[ch] where the concept
of GITLINKs is first defined.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 object.c  | 5 +++++
 object.h  | 6 ++++++
 wrapper.c | 6 ------
 wrapper.h | 5 -----
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/object.c b/object.c
index 60f954194f..cb29fcc304 100644
--- a/object.c
+++ b/object.c
@@ -617,3 +617,8 @@ void parsed_object_pool_clear(struct parsed_object_pool *o)
 	FREE_AND_NULL(o->object_state);
 	FREE_AND_NULL(o->shallow_stat);
 }
+
+int remove_or_warn(unsigned int mode, const char *file)
+{
+	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
+}
diff --git a/object.h b/object.h
index 5871615fee..e908ef6515 100644
--- a/object.h
+++ b/object.h
@@ -284,4 +284,10 @@ void clear_object_flags(unsigned flags);
  */
 void repo_clear_commit_marks(struct repository *r, unsigned int flags);
 
+/*
+ * Calls the correct function out of {unlink,rmdir}_or_warn based on
+ * the supplied file mode.
+ */
+int remove_or_warn(unsigned int mode, const char *path);
+
 #endif /* OBJECT_H */
diff --git a/wrapper.c b/wrapper.c
index 22be9812a7..118d3033de 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -5,7 +5,6 @@
 #include "abspath.h"
 #include "config.h"
 #include "gettext.h"
-#include "object.h"
 #include "repository.h"
 #include "strbuf.h"
 #include "trace2.h"
@@ -647,11 +646,6 @@ int rmdir_or_warn(const char *file)
 	return warn_if_unremovable("rmdir", file, rmdir(file));
 }
 
-int remove_or_warn(unsigned int mode, const char *file)
-{
-	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
-}
-
 static int access_error_is_ok(int err, unsigned flag)
 {
 	return (is_missing_file_error(err) ||
diff --git a/wrapper.h b/wrapper.h
index c85b1328d1..272795f863 100644
--- a/wrapper.h
+++ b/wrapper.h
@@ -111,11 +111,6 @@ int unlink_or_msg(const char *file, struct strbuf *err);
  * not exist.
  */
 int rmdir_or_warn(const char *path);
-/*
- * Calls the correct function out of {unlink,rmdir}_or_warn based on
- * the supplied file mode.
- */
-int remove_or_warn(unsigned int mode, const char *path);
 
 /*
  * Call access(2), but warn for any error except "missing file"
-- 
2.41.0.640.ga95def55d0-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [RFC PATCH v2 3/7] config: correct bad boolean env value error message
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
  2023-08-10 16:36   ` [RFC PATCH v2 1/7] hex-ll: split out functionality from hex Calvin Wan
  2023-08-10 16:36   ` [RFC PATCH v2 2/7] object: move function to object.c Calvin Wan
@ 2023-08-10 16:36   ` Calvin Wan
  2023-08-10 20:36     ` Junio C Hamano
  2023-08-10 16:36   ` [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values Calvin Wan
                     ` (5 subsequent siblings)
  8 siblings, 1 reply; 70+ messages in thread
From: Calvin Wan @ 2023-08-10 16:36 UTC (permalink / raw)
  To: git
  Cc: Calvin Wan, nasamuffin, chooglen, jonathantanmy, linusa,
	phillip.wood123, vdye

An incorrectly defined boolean environment value would result in the
following error message:

bad boolean config value '%s' for '%s'

This is a misnomer since environment value != config value. Instead of
calling git_config_bool() to parse the environment value, mimic the
functionality inside of git_config_bool() but with the correct error
message.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 config.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/config.c b/config.c
index 09851a6909..5b71ef1624 100644
--- a/config.c
+++ b/config.c
@@ -2172,7 +2172,14 @@ void git_global_config(char **user_out, char **xdg_out)
 int git_env_bool(const char *k, int def)
 {
 	const char *v = getenv(k);
-	return v ? git_config_bool(k, v) : def;
+	int val;
+	if (!v)
+		return def;
+	val = git_parse_maybe_bool(v);
+	if (val < 0)
+		die(_("bad boolean environment value '%s' for '%s'"),
+		    v, k);
+	return val;
 }
 
 /*
-- 
2.41.0.640.ga95def55d0-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
                     ` (2 preceding siblings ...)
  2023-08-10 16:36   ` [RFC PATCH v2 3/7] config: correct bad boolean env value error message Calvin Wan
@ 2023-08-10 16:36   ` Calvin Wan
  2023-08-10 23:21     ` Glen Choo
  2023-08-14 22:09     ` Jonathan Tan
  2023-08-10 16:36   ` [RFC PATCH v2 5/7] date: push pager.h dependency up Calvin Wan
                     ` (4 subsequent siblings)
  8 siblings, 2 replies; 70+ messages in thread
From: Calvin Wan @ 2023-08-10 16:36 UTC (permalink / raw)
  To: git
  Cc: Calvin Wan, nasamuffin, chooglen, jonathantanmy, linusa,
	phillip.wood123, vdye

While string and environment value parsing is mainly consumed by
config.c, there are other files that only need parsing functionality and
not config functionality. By separating out string and environment value
parsing from config, those files can instead be dependent on parse,
which has a much smaller dependency chain than config.

Move general string and env parsing functions from config.[ch] to
parse.[ch].

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 Makefile                   |   1 +
 attr.c                     |   2 +-
 config.c                   | 180 +-----------------------------------
 config.h                   |  14 +--
 pack-objects.c             |   2 +-
 pack-revindex.c            |   2 +-
 parse-options.c            |   3 +-
 parse.c                    | 182 +++++++++++++++++++++++++++++++++++++
 parse.h                    |  20 ++++
 pathspec.c                 |   2 +-
 preload-index.c            |   2 +-
 progress.c                 |   2 +-
 prompt.c                   |   2 +-
 rebase.c                   |   2 +-
 t/helper/test-env-helper.c |   2 +-
 unpack-trees.c             |   2 +-
 wrapper.c                  |   2 +-
 write-or-die.c             |   2 +-
 18 files changed, 219 insertions(+), 205 deletions(-)
 create mode 100644 parse.c
 create mode 100644 parse.h

diff --git a/Makefile b/Makefile
index 83b385b0be..e9ad9f9ef1 100644
--- a/Makefile
+++ b/Makefile
@@ -1091,6 +1091,7 @@ LIB_OBJS += pack-write.o
 LIB_OBJS += packfile.o
 LIB_OBJS += pager.o
 LIB_OBJS += parallel-checkout.o
+LIB_OBJS += parse.o
 LIB_OBJS += parse-options-cb.o
 LIB_OBJS += parse-options.o
 LIB_OBJS += patch-delta.o
diff --git a/attr.c b/attr.c
index e9c81b6e07..cb047b4618 100644
--- a/attr.c
+++ b/attr.c
@@ -7,7 +7,7 @@
  */
 
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "environment.h"
 #include "exec-cmd.h"
 #include "attr.h"
diff --git a/config.c b/config.c
index 5b71ef1624..cdd70999aa 100644
--- a/config.c
+++ b/config.c
@@ -11,6 +11,7 @@
 #include "date.h"
 #include "branch.h"
 #include "config.h"
+#include "parse.h"
 #include "convert.h"
 #include "environment.h"
 #include "gettext.h"
@@ -1204,129 +1205,6 @@ static int git_parse_source(struct config_source *cs, config_fn_t fn,
 	return error_return;
 }
 
-static uintmax_t get_unit_factor(const char *end)
-{
-	if (!*end)
-		return 1;
-	else if (!strcasecmp(end, "k"))
-		return 1024;
-	else if (!strcasecmp(end, "m"))
-		return 1024 * 1024;
-	else if (!strcasecmp(end, "g"))
-		return 1024 * 1024 * 1024;
-	return 0;
-}
-
-static int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
-{
-	if (value && *value) {
-		char *end;
-		intmax_t val;
-		intmax_t factor;
-
-		if (max < 0)
-			BUG("max must be a positive integer");
-
-		errno = 0;
-		val = strtoimax(value, &end, 0);
-		if (errno == ERANGE)
-			return 0;
-		if (end == value) {
-			errno = EINVAL;
-			return 0;
-		}
-		factor = get_unit_factor(end);
-		if (!factor) {
-			errno = EINVAL;
-			return 0;
-		}
-		if ((val < 0 && -max / factor > val) ||
-		    (val > 0 && max / factor < val)) {
-			errno = ERANGE;
-			return 0;
-		}
-		val *= factor;
-		*ret = val;
-		return 1;
-	}
-	errno = EINVAL;
-	return 0;
-}
-
-static int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
-{
-	if (value && *value) {
-		char *end;
-		uintmax_t val;
-		uintmax_t factor;
-
-		/* negative values would be accepted by strtoumax */
-		if (strchr(value, '-')) {
-			errno = EINVAL;
-			return 0;
-		}
-		errno = 0;
-		val = strtoumax(value, &end, 0);
-		if (errno == ERANGE)
-			return 0;
-		if (end == value) {
-			errno = EINVAL;
-			return 0;
-		}
-		factor = get_unit_factor(end);
-		if (!factor) {
-			errno = EINVAL;
-			return 0;
-		}
-		if (unsigned_mult_overflows(factor, val) ||
-		    factor * val > max) {
-			errno = ERANGE;
-			return 0;
-		}
-		val *= factor;
-		*ret = val;
-		return 1;
-	}
-	errno = EINVAL;
-	return 0;
-}
-
-int git_parse_int(const char *value, int *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-static int git_parse_int64(const char *value, int64_t *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int64_t)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-int git_parse_ulong(const char *value, unsigned long *ret)
-{
-	uintmax_t tmp;
-	if (!git_parse_unsigned(value, &tmp, maximum_unsigned_value_of_type(long)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-int git_parse_ssize_t(const char *value, ssize_t *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
 static int reader_config_name(struct config_reader *reader, const char **out);
 static int reader_origin_type(struct config_reader *reader,
 			      enum config_origin_type *type);
@@ -1404,23 +1282,6 @@ ssize_t git_config_ssize_t(const char *name, const char *value)
 	return ret;
 }
 
-static int git_parse_maybe_bool_text(const char *value)
-{
-	if (!value)
-		return 1;
-	if (!*value)
-		return 0;
-	if (!strcasecmp(value, "true")
-	    || !strcasecmp(value, "yes")
-	    || !strcasecmp(value, "on"))
-		return 1;
-	if (!strcasecmp(value, "false")
-	    || !strcasecmp(value, "no")
-	    || !strcasecmp(value, "off"))
-		return 0;
-	return -1;
-}
-
 static const struct fsync_component_name {
 	const char *name;
 	enum fsync_component component_bits;
@@ -1495,16 +1356,6 @@ static enum fsync_component parse_fsync_components(const char *var, const char *
 	return (current & ~negative) | positive;
 }
 
-int git_parse_maybe_bool(const char *value)
-{
-	int v = git_parse_maybe_bool_text(value);
-	if (0 <= v)
-		return v;
-	if (git_parse_int(value, &v))
-		return !!v;
-	return -1;
-}
-
 int git_config_bool_or_int(const char *name, const char *value, int *is_bool)
 {
 	int v = git_parse_maybe_bool_text(value);
@@ -2165,35 +2016,6 @@ void git_global_config(char **user_out, char **xdg_out)
 	*xdg_out = xdg_config;
 }
 
-/*
- * Parse environment variable 'k' as a boolean (in various
- * possible spellings); if missing, use the default value 'def'.
- */
-int git_env_bool(const char *k, int def)
-{
-	const char *v = getenv(k);
-	int val;
-	if (!v)
-		return def;
-	val = git_parse_maybe_bool(v);
-	if (val < 0)
-		die(_("bad boolean environment value '%s' for '%s'"),
-		    v, k);
-	return val;
-}
-
-/*
- * Parse environment variable 'k' as ulong with possibly a unit
- * suffix; if missing, use the default value 'val'.
- */
-unsigned long git_env_ulong(const char *k, unsigned long val)
-{
-	const char *v = getenv(k);
-	if (v && !git_parse_ulong(v, &val))
-		die(_("failed to parse %s"), k);
-	return val;
-}
-
 int git_config_system(void)
 {
 	return !git_env_bool("GIT_CONFIG_NOSYSTEM", 0);
diff --git a/config.h b/config.h
index 247b572b37..7a7f53e503 100644
--- a/config.h
+++ b/config.h
@@ -3,7 +3,7 @@
 
 #include "hashmap.h"
 #include "string-list.h"
-
+#include "parse.h"
 
 /**
  * The config API gives callers a way to access Git configuration files
@@ -205,16 +205,6 @@ int config_with_options(config_fn_t fn, void *,
  * The following helper functions aid in parsing string values
  */
 
-int git_parse_ssize_t(const char *, ssize_t *);
-int git_parse_ulong(const char *, unsigned long *);
-int git_parse_int(const char *value, int *ret);
-
-/**
- * Same as `git_config_bool`, except that it returns -1 on error rather
- * than dying.
- */
-int git_parse_maybe_bool(const char *);
-
 /**
  * Parse the string to an integer, including unit factors. Dies on error;
  * otherwise, returns the parsed result.
@@ -343,8 +333,6 @@ int git_config_rename_section(const char *, const char *);
 int git_config_rename_section_in_file(const char *, const char *, const char *);
 int git_config_copy_section(const char *, const char *);
 int git_config_copy_section_in_file(const char *, const char *, const char *);
-int git_env_bool(const char *, int);
-unsigned long git_env_ulong(const char *, unsigned long);
 int git_config_system(void);
 int config_error_nonbool(const char *);
 #if defined(__GNUC__)
diff --git a/pack-objects.c b/pack-objects.c
index 1b8052bece..f403ca6986 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -3,7 +3,7 @@
 #include "pack.h"
 #include "pack-objects.h"
 #include "packfile.h"
-#include "config.h"
+#include "parse.h"
 
 static uint32_t locate_object_entry_hash(struct packing_data *pdata,
 					 const struct object_id *oid,
diff --git a/pack-revindex.c b/pack-revindex.c
index 7fffcad912..a01a2a4640 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -6,7 +6,7 @@
 #include "packfile.h"
 #include "strbuf.h"
 #include "trace2.h"
-#include "config.h"
+#include "parse.h"
 #include "midx.h"
 #include "csum-file.h"
 
diff --git a/parse-options.c b/parse-options.c
index f8a155ee13..9f542950a7 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -1,11 +1,12 @@
 #include "git-compat-util.h"
 #include "parse-options.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "commit.h"
 #include "color.h"
 #include "gettext.h"
 #include "strbuf.h"
+#include "string-list.h"
 #include "utf8.h"
 
 static int disallow_abbreviated_options;
diff --git a/parse.c b/parse.c
new file mode 100644
index 0000000000..42d691a0fb
--- /dev/null
+++ b/parse.c
@@ -0,0 +1,182 @@
+#include "git-compat-util.h"
+#include "gettext.h"
+#include "parse.h"
+
+static uintmax_t get_unit_factor(const char *end)
+{
+	if (!*end)
+		return 1;
+	else if (!strcasecmp(end, "k"))
+		return 1024;
+	else if (!strcasecmp(end, "m"))
+		return 1024 * 1024;
+	else if (!strcasecmp(end, "g"))
+		return 1024 * 1024 * 1024;
+	return 0;
+}
+
+int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
+{
+	if (value && *value) {
+		char *end;
+		intmax_t val;
+		intmax_t factor;
+
+		if (max < 0)
+			BUG("max must be a positive integer");
+
+		errno = 0;
+		val = strtoimax(value, &end, 0);
+		if (errno == ERANGE)
+			return 0;
+		if (end == value) {
+			errno = EINVAL;
+			return 0;
+		}
+		factor = get_unit_factor(end);
+		if (!factor) {
+			errno = EINVAL;
+			return 0;
+		}
+		if ((val < 0 && -max / factor > val) ||
+		    (val > 0 && max / factor < val)) {
+			errno = ERANGE;
+			return 0;
+		}
+		val *= factor;
+		*ret = val;
+		return 1;
+	}
+	errno = EINVAL;
+	return 0;
+}
+
+static int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
+{
+	if (value && *value) {
+		char *end;
+		uintmax_t val;
+		uintmax_t factor;
+
+		/* negative values would be accepted by strtoumax */
+		if (strchr(value, '-')) {
+			errno = EINVAL;
+			return 0;
+		}
+		errno = 0;
+		val = strtoumax(value, &end, 0);
+		if (errno == ERANGE)
+			return 0;
+		if (end == value) {
+			errno = EINVAL;
+			return 0;
+		}
+		factor = get_unit_factor(end);
+		if (!factor) {
+			errno = EINVAL;
+			return 0;
+		}
+		if (unsigned_mult_overflows(factor, val) ||
+		    factor * val > max) {
+			errno = ERANGE;
+			return 0;
+		}
+		val *= factor;
+		*ret = val;
+		return 1;
+	}
+	errno = EINVAL;
+	return 0;
+}
+
+int git_parse_int(const char *value, int *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_int64(const char *value, int64_t *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int64_t)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_ulong(const char *value, unsigned long *ret)
+{
+	uintmax_t tmp;
+	if (!git_parse_unsigned(value, &tmp, maximum_unsigned_value_of_type(long)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_ssize_t(const char *value, ssize_t *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_maybe_bool_text(const char *value)
+{
+	if (!value)
+		return 1;
+	if (!*value)
+		return 0;
+	if (!strcasecmp(value, "true")
+	    || !strcasecmp(value, "yes")
+	    || !strcasecmp(value, "on"))
+		return 1;
+	if (!strcasecmp(value, "false")
+	    || !strcasecmp(value, "no")
+	    || !strcasecmp(value, "off"))
+		return 0;
+	return -1;
+}
+
+int git_parse_maybe_bool(const char *value)
+{
+	int v = git_parse_maybe_bool_text(value);
+	if (0 <= v)
+		return v;
+	if (git_parse_int(value, &v))
+		return !!v;
+	return -1;
+}
+
+/*
+ * Parse environment variable 'k' as a boolean (in various
+ * possible spellings); if missing, use the default value 'def'.
+ */
+int git_env_bool(const char *k, int def)
+{
+	const char *v = getenv(k);
+	int val;
+	if (!v)
+		return def;
+	val = git_parse_maybe_bool(v);
+	if (val < 0)
+		die(_("bad boolean environment value '%s' for '%s'"),
+		    v, k);
+	return val;
+}
+
+/*
+ * Parse environment variable 'k' as ulong with possibly a unit
+ * suffix; if missing, use the default value 'val'.
+ */
+unsigned long git_env_ulong(const char *k, unsigned long val)
+{
+	const char *v = getenv(k);
+	if (v && !git_parse_ulong(v, &val))
+		die(_("failed to parse %s"), k);
+	return val;
+}
diff --git a/parse.h b/parse.h
new file mode 100644
index 0000000000..07d2193d69
--- /dev/null
+++ b/parse.h
@@ -0,0 +1,20 @@
+#ifndef PARSE_H
+#define PARSE_H
+
+int git_parse_signed(const char *value, intmax_t *ret, intmax_t max);
+int git_parse_ssize_t(const char *, ssize_t *);
+int git_parse_ulong(const char *, unsigned long *);
+int git_parse_int(const char *value, int *ret);
+int git_parse_int64(const char *value, int64_t *ret);
+
+/**
+ * Same as `git_config_bool`, except that it returns -1 on error rather
+ * than dying.
+ */
+int git_parse_maybe_bool(const char *);
+int git_parse_maybe_bool_text(const char *value);
+
+int git_env_bool(const char *, int);
+unsigned long git_env_ulong(const char *, unsigned long);
+
+#endif /* PARSE_H */
diff --git a/pathspec.c b/pathspec.c
index 4991455281..39337999d4 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "dir.h"
 #include "environment.h"
 #include "gettext.h"
diff --git a/preload-index.c b/preload-index.c
index e44530c80c..63fd35d64b 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -7,7 +7,7 @@
 #include "environment.h"
 #include "fsmonitor.h"
 #include "gettext.h"
-#include "config.h"
+#include "parse.h"
 #include "preload-index.h"
 #include "progress.h"
 #include "read-cache.h"
diff --git a/progress.c b/progress.c
index f695798aca..c83cb60bf1 100644
--- a/progress.c
+++ b/progress.c
@@ -17,7 +17,7 @@
 #include "trace.h"
 #include "trace2.h"
 #include "utf8.h"
-#include "config.h"
+#include "parse.h"
 
 #define TP_IDX_MAX      8
 
diff --git a/prompt.c b/prompt.c
index 3baa33f63d..8935fe4dfb 100644
--- a/prompt.c
+++ b/prompt.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "environment.h"
 #include "run-command.h"
 #include "strbuf.h"
diff --git a/rebase.c b/rebase.c
index 17a570f1ff..69a1822da3 100644
--- a/rebase.c
+++ b/rebase.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "rebase.h"
-#include "config.h"
+#include "parse.h"
 #include "gettext.h"
 
 /*
diff --git a/t/helper/test-env-helper.c b/t/helper/test-env-helper.c
index 66c88b8ff3..1c486888a4 100644
--- a/t/helper/test-env-helper.c
+++ b/t/helper/test-env-helper.c
@@ -1,5 +1,5 @@
 #include "test-tool.h"
-#include "config.h"
+#include "parse.h"
 #include "parse-options.h"
 
 static char const * const env__helper_usage[] = {
diff --git a/unpack-trees.c b/unpack-trees.c
index 87517364dc..761562a96e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -2,7 +2,7 @@
 #include "advice.h"
 #include "strvec.h"
 #include "repository.h"
-#include "config.h"
+#include "parse.h"
 #include "dir.h"
 #include "environment.h"
 #include "gettext.h"
diff --git a/wrapper.c b/wrapper.c
index 118d3033de..a6249cc30e 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -3,7 +3,7 @@
  */
 #include "git-compat-util.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "gettext.h"
 #include "repository.h"
 #include "strbuf.h"
diff --git a/write-or-die.c b/write-or-die.c
index d8355c0c3e..42a2dc73cd 100644
--- a/write-or-die.c
+++ b/write-or-die.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "run-command.h"
 #include "write-or-die.h"
 
-- 
2.41.0.640.ga95def55d0-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [RFC PATCH v2 5/7] date: push pager.h dependency up
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
                     ` (3 preceding siblings ...)
  2023-08-10 16:36   ` [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values Calvin Wan
@ 2023-08-10 16:36   ` Calvin Wan
  2023-08-10 23:41     ` Glen Choo
  2023-08-14 22:17     ` Jonathan Tan
  2023-08-10 16:36   ` [RFC PATCH v2 6/7] git-std-lib: introduce git standard library Calvin Wan
                     ` (3 subsequent siblings)
  8 siblings, 2 replies; 70+ messages in thread
From: Calvin Wan @ 2023-08-10 16:36 UTC (permalink / raw)
  To: git
  Cc: Calvin Wan, nasamuffin, chooglen, jonathantanmy, linusa,
	phillip.wood123, vdye

In order for date.c to be included in git-std-lib, the dependency to
pager.h must be removed since it has dependencies on many other files
not in git-std-lib. We achieve this by passing a boolean for
"pager_in_use", rather than checking for it in parse_date_format() so
callers of the function will have that dependency.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 builtin/blame.c      | 2 +-
 builtin/log.c        | 2 +-
 date.c               | 5 ++---
 date.h               | 2 +-
 ref-filter.c         | 3 ++-
 revision.c           | 3 ++-
 t/helper/test-date.c | 3 ++-
 7 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 9a3f9facea..665511570d 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -714,7 +714,7 @@ static int git_blame_config(const char *var, const char *value, void *cb)
 	if (!strcmp(var, "blame.date")) {
 		if (!value)
 			return config_error_nonbool(var);
-		parse_date_format(value, &blame_date_mode);
+		parse_date_format(value, &blame_date_mode, pager_in_use());
 		return 0;
 	}
 	if (!strcmp(var, "blame.ignorerevsfile")) {
diff --git a/builtin/log.c b/builtin/log.c
index 03954fb749..a72ce30c2e 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -185,7 +185,7 @@ static void cmd_log_init_defaults(struct rev_info *rev)
 	rev->diffopt.flags.allow_textconv = 1;
 
 	if (default_date_mode)
-		parse_date_format(default_date_mode, &rev->date_mode);
+		parse_date_format(default_date_mode, &rev->date_mode, pager_in_use());
 }
 
 static void set_default_decoration_filter(struct decoration_filter *decoration_filter)
diff --git a/date.c b/date.c
index 619ada5b20..55f73ce2e0 100644
--- a/date.c
+++ b/date.c
@@ -7,7 +7,6 @@
 #include "git-compat-util.h"
 #include "date.h"
 #include "gettext.h"
-#include "pager.h"
 #include "strbuf.h"
 
 /*
@@ -1003,13 +1002,13 @@ static enum date_mode_type parse_date_type(const char *format, const char **end)
 	die("unknown date format %s", format);
 }
 
-void parse_date_format(const char *format, struct date_mode *mode)
+void parse_date_format(const char *format, struct date_mode *mode, int pager_in_use)
 {
 	const char *p;
 
 	/* "auto:foo" is "if tty/pager, then foo, otherwise normal" */
 	if (skip_prefix(format, "auto:", &p)) {
-		if (isatty(1) || pager_in_use())
+		if (isatty(1) || pager_in_use)
 			format = p;
 		else
 			format = "default";
diff --git a/date.h b/date.h
index 6136212a19..d9bd6dc09f 100644
--- a/date.h
+++ b/date.h
@@ -53,7 +53,7 @@ const char *show_date(timestamp_t time, int timezone, const struct date_mode *mo
  * be used with strbuf_addftime(), in which case you'll need to call
  * date_mode_release() later.
  */
-void parse_date_format(const char *format, struct date_mode *mode);
+void parse_date_format(const char *format, struct date_mode *mode, int pager_in_use);
 
 /**
  * Release a "struct date_mode", currently only required if
diff --git a/ref-filter.c b/ref-filter.c
index 2ed0ecf260..1b96bb7822 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -28,6 +28,7 @@
 #include "worktree.h"
 #include "hashmap.h"
 #include "strvec.h"
+#include "pager.h"
 
 static struct ref_msg {
 	const char *gone;
@@ -1323,7 +1324,7 @@ static void grab_date(const char *buf, struct atom_value *v, const char *atomnam
 	formatp = strchr(atomname, ':');
 	if (formatp) {
 		formatp++;
-		parse_date_format(formatp, &date_mode);
+		parse_date_format(formatp, &date_mode, pager_in_use());
 	}
 
 	if (!eoemail)
diff --git a/revision.c b/revision.c
index 985b8b2f51..c7efd11914 100644
--- a/revision.c
+++ b/revision.c
@@ -46,6 +46,7 @@
 #include "resolve-undo.h"
 #include "parse-options.h"
 #include "wildmatch.h"
+#include "pager.h"
 
 volatile show_early_output_fn_t show_early_output;
 
@@ -2577,7 +2578,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->date_mode.type = DATE_RELATIVE;
 		revs->date_mode_explicit = 1;
 	} else if ((argcount = parse_long_opt("date", argv, &optarg))) {
-		parse_date_format(optarg, &revs->date_mode);
+		parse_date_format(optarg, &revs->date_mode, pager_in_use());
 		revs->date_mode_explicit = 1;
 		return argcount;
 	} else if (!strcmp(arg, "--log-size")) {
diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index 0683d46574..b3927a95b3 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -1,5 +1,6 @@
 #include "test-tool.h"
 #include "date.h"
+#include "pager.h"
 #include "trace.h"
 
 static const char *usage_msg = "\n"
@@ -37,7 +38,7 @@ static void show_dates(const char **argv, const char *format)
 {
 	struct date_mode mode = DATE_MODE_INIT;
 
-	parse_date_format(format, &mode);
+	parse_date_format(format, &mode, pager_in_use());
 	for (; *argv; argv++) {
 		char *arg;
 		timestamp_t t;
-- 
2.41.0.640.ga95def55d0-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [RFC PATCH v2 6/7] git-std-lib: introduce git standard library
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
                     ` (4 preceding siblings ...)
  2023-08-10 16:36   ` [RFC PATCH v2 5/7] date: push pager.h dependency up Calvin Wan
@ 2023-08-10 16:36   ` Calvin Wan
  2023-08-14 22:26     ` Jonathan Tan
  2023-08-10 16:36   ` [RFC PATCH v2 7/7] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
                     ` (2 subsequent siblings)
  8 siblings, 1 reply; 70+ messages in thread
From: Calvin Wan @ 2023-08-10 16:36 UTC (permalink / raw)
  To: git
  Cc: Calvin Wan, nasamuffin, chooglen, jonathantanmy, linusa,
	phillip.wood123, vdye

The Git Standard Library intends to serve as the foundational library
and root dependency that other libraries in Git will be built off of.
That is to say, suppose we have libraries X and Y; a user that wants to
use X and Y would need to include X, Y, and this Git Standard Library.

Add Documentation/technical/git-std-lib.txt to further explain the
design and rationale.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 Documentation/technical/git-std-lib.txt | 186 ++++++++++++++++++++++++
 Makefile                                |  62 +++++++-
 git-compat-util.h                       |   7 +-
 stubs/repository.c                      |   4 +
 stubs/repository.h                      |   8 +
 stubs/trace2.c                          |  22 +++
 stubs/trace2.h                          |  69 +++++++++
 symlinks.c                              |   2 +
 8 files changed, 358 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/technical/git-std-lib.txt
 create mode 100644 stubs/repository.c
 create mode 100644 stubs/repository.h
 create mode 100644 stubs/trace2.c
 create mode 100644 stubs/trace2.h

diff --git a/Documentation/technical/git-std-lib.txt b/Documentation/technical/git-std-lib.txt
new file mode 100644
index 0000000000..3d901a89b0
--- /dev/null
+++ b/Documentation/technical/git-std-lib.txt
@@ -0,0 +1,186 @@
+Git Standard Library
+================
+
+The Git Standard Library intends to serve as the foundational library
+and root dependency that other libraries in Git will be built off of.
+That is to say, suppose we have libraries X and Y; a user that wants to
+use X and Y would need to include X, Y, and this Git Standard Library.
+This does not mean that the Git Standard Library will be the only
+possible root dependency in the future, but rather the most significant
+and widely used one.
+
+Dependency graph in libified Git
+================
+
+If you look in the Git Makefile, all of the objects defined in the Git
+library are compiled and archived into a singular file, libgit.a, which
+is linked against by common-main.o with other external dependencies and
+turned into the Git executable. In other words, the Git executable has
+dependencies on libgit.a and a couple of external libraries. The
+libfication of Git will not affect this current build flow, but instead
+will provide an alternate method for building Git.
+
+With our current method of building Git, we can imagine the dependency
+graph as such:
+
+        Git
+         /\
+        /  \
+       /    \
+  libgit.a   ext deps
+
+In libifying parts of Git, we want to shrink the dependency graph to
+only the minimal set of dependencies, so libraries should not use
+libgit.a. Instead, it would look like:
+
+                Git
+                /\
+               /  \
+              /    \
+          libgit.a  ext deps
+             /\
+            /  \
+           /    \
+object-store.a  (other lib)
+      |        /
+      |       /
+      |      /
+ config.a   / 
+      |    /
+      |   /
+      |  /
+git-std-lib.a
+
+Instead of containing all of the objects in Git, libgit.a would contain
+objects that are not built by libraries it links against. Consequently,
+if someone wanted their own custom build of Git with their own custom
+implementation of the object store, they would only have to swap out
+object-store.a rather than do a hard fork of Git.
+
+Rationale behind Git Standard Library
+================
+
+The rationale behind Git Standard Library essentially is the result of
+two observations within the Git codebase: every file includes
+git-compat-util.h which defines functions in a couple of different
+files, and wrapper.c + usage.c have difficult-to-separate circular
+dependencies with each other and other files.
+
+Ubiquity of git-compat-util.h and circular dependencies
+========
+
+Every file in the Git codebase includes git-compat-util.h. It serves as
+"a compatibility aid that isolates the knowledge of platform specific
+inclusion order and what feature macros to define before including which
+system header" (Junio[1]). Since every file includes git-compat-util.h, and
+git-compat-util.h includes wrapper.h and usage.h, it would make sense
+for wrapper.c and usage.c to be a part of the root library. They have
+difficult to separate circular dependencies with each other so they
+can't be independent libraries. Wrapper.c has dependencies on parse.c,
+abspath.c, strbuf.c, which in turn also have dependencies on usage.c and
+wrapper.c -- more circular dependencies. 
+
+Tradeoff between swappability and refactoring
+========
+
+From the above dependency graph, we can see that git-std-lib.a could be
+many smaller libraries rather than a singular library. So why choose a
+singular library when multiple libraries can be individually easier to
+swap and are more modular? A singular library requires less work to
+separate out circular dependencies within itself so it becomes a
+tradeoff question between work and reward. While there may be a point in
+the future where a file like usage.c would want its own library so that
+someone can have custom die() or error(), the work required to refactor
+out the circular dependencies in some files would be enormous due to
+their ubiquity so therefore I believe it is not worth the tradeoff
+currently. Additionally, we can in the future choose to do this refactor
+and change the API for the library if there becomes enough of a reason
+to do so (remember we are avoiding promising stability of the interfaces
+of those libraries).
+
+Reuse of compatibility functions in git-compat-util.h
+========
+
+Most functions defined in git-compat-util.h are implemented in compat/
+and have dependencies limited to strbuf.h and wrapper.h so they can be
+easily included in git-std-lib.a, which as a root dependency means that
+higher level libraries do not have to worry about compatibility files in
+compat/. The rest of the functions defined in git-compat-util.h are
+implemented in top level files and are hidden behind
+an #ifdef if their implementation is not in git-std-lib.a.
+
+Rationale summary
+========
+
+The Git Standard Library allows us to get the libification ball rolling
+with other libraries in Git. By not spending many
+more months attempting to refactor difficult circular dependencies and
+instead spending that time getting to a state where we can test out
+swapping a library out such as config or object store, we can prove the
+viability of Git libification on a much faster time scale. Additionally
+the code cleanups that have happened so far have been minor and
+beneficial for the codebase. It is probable that making large movements
+would negatively affect code clarity.
+
+Git Standard Library boundary
+================
+
+While I have described above some useful heuristics for identifying
+potential candidates for git-std-lib.a, a standard library should not
+have a shaky definition for what belongs in it.
+
+ - Low-level files (aka operates only on other primitive types) that are
+   used everywhere within the codebase (wrapper.c, usage.c, strbuf.c)
+   - Dependencies that are low-level and widely used
+     (abspath.c, date.c, hex-ll.c, parse.c, utf8.c)
+ - low-level git/* files with functions defined in git-compat-util.h
+   (ctype.c)
+ - compat/*
+ - stubbed out dependencies in stubs/ (stubs/repository.c, stubs/trace2.c)
+
+There are other files that might fit this definition, but that does not
+mean it should belong in git-std-lib.a. Those files should start as
+their own separate library since any file added to git-std-lib.a loses
+its flexibility of being easily swappable.
+
+Wrapper.c and usage.c have dependencies on repository and trace2 that are
+possible to remove at the cost of sacrificing the ability for standard Git
+to be able to trace functions in those files and other files in git-std-lib.a.
+In order for git-std-lib.a to compile with those dependencies, stubbed out
+versions of those files are implemented and swapped in during compilation time.
+
+Files inside of Git Standard Library
+================
+
+The initial set of files in git-std-lib.a are:
+abspath.c
+ctype.c
+date.c
+hex-ll.c
+parse.c
+strbuf.c
+usage.c
+utf8.c
+wrapper.c
+stubs/repository.c
+stubs/trace2.c
+relevant compat/ files
+
+Pitfalls
+================
+
+There are a small amount of files under compat/* that have dependencies
+not inside of git-std-lib.a. While those functions are not called on
+Linux, other OSes might call those problematic functions. I don't see
+this as a major problem, just moreso an observation that libification in
+general may also require some minor compatibility work in the future.
+
+Testing
+================
+
+Unit tests should catch any breakages caused by changes to files in
+git-std-lib.a (i.e. introduction of a out of scope dependency) and new
+functions introduced to git-std-lib.a will require unit tests written
+for them.
+
+[1] https://lore.kernel.org/git/xmqqwn17sydw.fsf@gitster.g/
\ No newline at end of file
diff --git a/Makefile b/Makefile
index e9ad9f9ef1..82510cf50e 100644
--- a/Makefile
+++ b/Makefile
@@ -669,6 +669,7 @@ FUZZ_PROGRAMS =
 GIT_OBJS =
 LIB_OBJS =
 SCALAR_OBJS =
+STUB_OBJS =
 OBJECTS =
 OTHER_PROGRAMS =
 PROGRAM_OBJS =
@@ -956,6 +957,7 @@ COCCI_SOURCES = $(filter-out $(THIRD_PARTY_SOURCES),$(FOUND_C_SOURCES))
 
 LIB_H = $(FOUND_H_SOURCES)
 
+ifndef GIT_STD_LIB
 LIB_OBJS += abspath.o
 LIB_OBJS += add-interactive.o
 LIB_OBJS += add-patch.o
@@ -1196,6 +1198,27 @@ LIB_OBJS += write-or-die.o
 LIB_OBJS += ws.o
 LIB_OBJS += wt-status.o
 LIB_OBJS += xdiff-interface.o
+else ifdef GIT_STD_LIB
+LIB_OBJS += abspath.o
+LIB_OBJS += ctype.o
+LIB_OBJS += date.o
+LIB_OBJS += hex-ll.o
+LIB_OBJS += parse.o
+LIB_OBJS += strbuf.o
+LIB_OBJS += usage.o
+LIB_OBJS += utf8.o
+LIB_OBJS += wrapper.o
+
+ifdef STUB_REPOSITORY
+STUB_OBJS += stubs/repository.o
+endif
+
+ifdef STUB_TRACE2
+STUB_OBJS += stubs/trace2.o
+endif
+
+LIB_OBJS += $(STUB_OBJS)
+endif
 
 BUILTIN_OBJS += builtin/add.o
 BUILTIN_OBJS += builtin/am.o
@@ -2162,6 +2185,11 @@ ifdef FSMONITOR_OS_SETTINGS
 	COMPAT_OBJS += compat/fsmonitor/fsm-path-utils-$(FSMONITOR_OS_SETTINGS).o
 endif
 
+ifdef GIT_STD_LIB
+	BASIC_CFLAGS += -DGIT_STD_LIB
+	BASIC_CFLAGS += -DNO_GETTEXT
+endif
+
 ifeq ($(TCLTK_PATH),)
 NO_TCLTK = NoThanks
 endif
@@ -3654,7 +3682,7 @@ clean: profile-clean coverage-clean cocciclean
 	$(RM) po/git.pot po/git-core.pot
 	$(RM) git.res
 	$(RM) $(OBJECTS)
-	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
+	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB) $(STD_LIB_FILE)
 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) $(OTHER_PROGRAMS)
 	$(RM) $(TEST_PROGRAMS)
 	$(RM) $(FUZZ_PROGRAMS)
@@ -3834,3 +3862,35 @@ $(FUZZ_PROGRAMS): all
 		$(XDIFF_OBJS) $(EXTLIBS) git.o $@.o $(LIB_FUZZING_ENGINE) -o $@
 
 fuzz-all: $(FUZZ_PROGRAMS)
+
+### Libified Git rules
+
+# git-std-lib
+# `make git-std-lib GIT_STD_LIB=YesPlease STUB_REPOSITORY=YesPlease STUB_TRACE2=YesPlease`
+STD_LIB = git-std-lib.a
+
+$(STD_LIB): $(LIB_OBJS) $(COMPAT_OBJS) $(STUB_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
+TEMP_HEADERS = temp_headers/
+
+git-std-lib:
+# Move headers to temporary folder and replace them with stubbed headers.
+# After building, move headers and stubbed headers back.
+ifneq ($(STUB_OBJS),)
+	mkdir -p $(TEMP_HEADERS); \
+	for d in $(STUB_OBJS); do \
+		BASE=$${d%.*}; \
+		mv $${BASE##*/}.h $(TEMP_HEADERS)$${BASE##*/}.h; \
+		mv $${BASE}.h $${BASE##*/}.h; \
+	done; \
+	$(MAKE) $(STD_LIB); \
+	for d in $(STUB_OBJS); do \
+		BASE=$${d%.*}; \
+		mv $${BASE##*/}.h $${BASE}.h; \
+		mv $(TEMP_HEADERS)$${BASE##*/}.h $${BASE##*/}.h; \
+	done; \
+	rm -rf temp_headers
+else
+	$(MAKE) $(STD_LIB)
+endif
diff --git a/git-compat-util.h b/git-compat-util.h
index 481dac22b0..75aa9b263e 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -396,8 +396,8 @@ static inline int noop_core_config(const char *var UNUSED,
 #define platform_core_config noop_core_config
 #endif
 
+#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
 int lstat_cache_aware_rmdir(const char *path);
-#if !defined(__MINGW32__) && !defined(_MSC_VER)
 #define rmdir lstat_cache_aware_rmdir
 #endif
 
@@ -787,9 +787,11 @@ const char *inet_ntop(int af, const void *src, char *dst, size_t size);
 #endif
 
 #ifdef NO_PTHREADS
+#ifdef GIT_STD_LIB
 #define atexit git_atexit
 int git_atexit(void (*handler)(void));
 #endif
+#endif
 
 /*
  * Limit size of IO chunks, because huge chunks only cause pain.  OS X
@@ -951,14 +953,17 @@ int git_access(const char *path, int mode);
 # endif
 #endif
 
+#ifndef GIT_STD_LIB
 int cmd_main(int, const char **);
 
 /*
  * Intercept all calls to exit() and route them to trace2 to
  * optionally emit a message before calling the real exit().
  */
+
 int common_exit(const char *file, int line, int code);
 #define exit(code) exit(common_exit(__FILE__, __LINE__, (code)))
+#endif
 
 /*
  * You can mark a stack variable with UNLEAK(var) to avoid it being
diff --git a/stubs/repository.c b/stubs/repository.c
new file mode 100644
index 0000000000..f81520d083
--- /dev/null
+++ b/stubs/repository.c
@@ -0,0 +1,4 @@
+#include "git-compat-util.h"
+#include "repository.h"
+
+struct repository *the_repository;
diff --git a/stubs/repository.h b/stubs/repository.h
new file mode 100644
index 0000000000..18262d748e
--- /dev/null
+++ b/stubs/repository.h
@@ -0,0 +1,8 @@
+#ifndef REPOSITORY_H
+#define REPOSITORY_H
+
+struct repository { int stub; };
+
+extern struct repository *the_repository;
+
+#endif /* REPOSITORY_H */
diff --git a/stubs/trace2.c b/stubs/trace2.c
new file mode 100644
index 0000000000..efc3f9c1f3
--- /dev/null
+++ b/stubs/trace2.c
@@ -0,0 +1,22 @@
+#include "git-compat-util.h"
+#include "trace2.h"
+
+void trace2_region_enter_fl(const char *file, int line, const char *category,
+			    const char *label, const struct repository *repo, ...) { }
+void trace2_region_leave_fl(const char *file, int line, const char *category,
+			    const char *label, const struct repository *repo, ...) { }
+void trace2_data_string_fl(const char *file, int line, const char *category,
+			   const struct repository *repo, const char *key,
+			   const char *value) { }
+void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names) { }
+void trace2_cmd_error_va_fl(const char *file, int line, const char *fmt,
+			    va_list ap) { }
+void trace2_cmd_name_fl(const char *file, int line, const char *name) { }
+void trace2_thread_start_fl(const char *file, int line,
+			    const char *thread_base_name) { }
+void trace2_thread_exit_fl(const char *file, int line) { }
+void trace2_data_intmax_fl(const char *file, int line, const char *category,
+			   const struct repository *repo, const char *key,
+			   intmax_t value) { }
+int trace2_is_enabled(void) { return 0; }
+void trace2_collect_process_info(enum trace2_process_info_reason reason) { }
diff --git a/stubs/trace2.h b/stubs/trace2.h
new file mode 100644
index 0000000000..88ad7387ff
--- /dev/null
+++ b/stubs/trace2.h
@@ -0,0 +1,69 @@
+#ifndef TRACE2_H
+#define TRACE2_H
+
+struct child_process { int stub; };
+struct repository;
+struct json_writer { int stub; };
+
+void trace2_region_enter_fl(const char *file, int line, const char *category,
+			    const char *label, const struct repository *repo, ...);
+
+#define trace2_region_enter(category, label, repo) \
+	trace2_region_enter_fl(__FILE__, __LINE__, (category), (label), (repo))
+
+void trace2_region_leave_fl(const char *file, int line, const char *category,
+			    const char *label, const struct repository *repo, ...);
+
+#define trace2_region_leave(category, label, repo) \
+	trace2_region_leave_fl(__FILE__, __LINE__, (category), (label), (repo))
+
+void trace2_data_string_fl(const char *file, int line, const char *category,
+			   const struct repository *repo, const char *key,
+			   const char *value);
+
+#define trace2_data_string(category, repo, key, value)                       \
+	trace2_data_string_fl(__FILE__, __LINE__, (category), (repo), (key), \
+			      (value))
+
+void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names);
+
+#define trace2_cmd_ancestry(v) trace2_cmd_ancestry_fl(__FILE__, __LINE__, (v))
+
+void trace2_cmd_error_va_fl(const char *file, int line, const char *fmt,
+			    va_list ap);
+
+#define trace2_cmd_error_va(fmt, ap) \
+	trace2_cmd_error_va_fl(__FILE__, __LINE__, (fmt), (ap))
+
+
+void trace2_cmd_name_fl(const char *file, int line, const char *name);
+
+#define trace2_cmd_name(v) trace2_cmd_name_fl(__FILE__, __LINE__, (v))
+
+void trace2_thread_start_fl(const char *file, int line,
+			    const char *thread_base_name);
+
+#define trace2_thread_start(thread_base_name) \
+	trace2_thread_start_fl(__FILE__, __LINE__, (thread_base_name))
+
+void trace2_thread_exit_fl(const char *file, int line);
+
+#define trace2_thread_exit() trace2_thread_exit_fl(__FILE__, __LINE__)
+
+void trace2_data_intmax_fl(const char *file, int line, const char *category,
+			   const struct repository *repo, const char *key,
+			   intmax_t value);
+
+#define trace2_data_intmax(category, repo, key, value)                       \
+	trace2_data_intmax_fl(__FILE__, __LINE__, (category), (repo), (key), \
+			      (value))
+
+enum trace2_process_info_reason {
+	TRACE2_PROCESS_INFO_STARTUP,
+	TRACE2_PROCESS_INFO_EXIT,
+};
+int trace2_is_enabled(void);
+void trace2_collect_process_info(enum trace2_process_info_reason reason);
+
+#endif /* TRACE2_H */
+
diff --git a/symlinks.c b/symlinks.c
index b29e340c2d..bced721a0c 100644
--- a/symlinks.c
+++ b/symlinks.c
@@ -337,6 +337,7 @@ void invalidate_lstat_cache(void)
 	reset_lstat_cache(&default_cache);
 }
 
+#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
 #undef rmdir
 int lstat_cache_aware_rmdir(const char *path)
 {
@@ -348,3 +349,4 @@ int lstat_cache_aware_rmdir(const char *path)
 
 	return ret;
 }
+#endif
-- 
2.41.0.640.ga95def55d0-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [RFC PATCH v2 7/7] git-std-lib: add test file to call git-std-lib.a functions
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
                     ` (5 preceding siblings ...)
  2023-08-10 16:36   ` [RFC PATCH v2 6/7] git-std-lib: introduce git standard library Calvin Wan
@ 2023-08-10 16:36   ` Calvin Wan
  2023-08-14 22:28     ` Jonathan Tan
  2023-08-10 22:05   ` [RFC PATCH v2 0/7] Introduce Git Standard Library Glen Choo
  2023-08-15  9:41   ` Phillip Wood
  8 siblings, 1 reply; 70+ messages in thread
From: Calvin Wan @ 2023-08-10 16:36 UTC (permalink / raw)
  To: git
  Cc: Calvin Wan, nasamuffin, chooglen, jonathantanmy, linusa,
	phillip.wood123, vdye

Add test file that directly or indirectly calls all functions defined in
git-std-lib.a object files to showcase that they do not reference
missing objects and that git-std-lib.a can stand on its own.

Certain functions that cause the program to exit or are already called
by other functions are commented out.

TODO: replace with unit tests
Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 t/Makefile      |   4 +
 t/stdlib-test.c | 239 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 243 insertions(+)
 create mode 100644 t/stdlib-test.c

diff --git a/t/Makefile b/t/Makefile
index 3e00cdd801..b6d0bc9daa 100644
--- a/t/Makefile
+++ b/t/Makefile
@@ -150,3 +150,7 @@ perf:
 
 .PHONY: pre-clean $(T) aggregate-results clean valgrind perf \
 	check-chainlint clean-chainlint test-chainlint
+
+test-git-std-lib:
+	cc -It -o stdlib-test stdlib-test.c -L. -l:../git-std-lib.a
+	./stdlib-test
diff --git a/t/stdlib-test.c b/t/stdlib-test.c
new file mode 100644
index 0000000000..a5d7374e2f
--- /dev/null
+++ b/t/stdlib-test.c
@@ -0,0 +1,239 @@
+#include "../git-compat-util.h"
+#include "../abspath.h"
+#include "../hex-ll.h"
+#include "../parse.h"
+#include "../strbuf.h"
+#include "../string-list.h"
+
+/*
+ * Calls all functions from git-std-lib
+ * Some inline/trivial functions are skipped
+ */
+
+void abspath_funcs(void) {
+	struct strbuf sb = STRBUF_INIT;
+
+	fprintf(stderr, "calling abspath functions\n");
+	is_directory("foo");
+	strbuf_realpath(&sb, "foo", 0);
+	strbuf_realpath_forgiving(&sb, "foo", 0);
+	real_pathdup("foo", 0);
+	absolute_path("foo");
+	absolute_pathdup("foo");
+	prefix_filename("foo/", "bar");
+	prefix_filename_except_for_dash("foo/", "bar");
+	is_absolute_path("foo");
+	strbuf_add_absolute_path(&sb, "foo");
+	strbuf_add_real_path(&sb, "foo");
+}
+
+void hex_ll_funcs(void) {
+	unsigned char c;
+
+	fprintf(stderr, "calling hex-ll functions\n");
+
+	hexval('c');
+	hex2chr("A1");
+	hex_to_bytes(&c, "A1", 2);
+}
+
+void parse_funcs(void) {
+	intmax_t foo;
+	ssize_t foo1 = -1;
+	unsigned long foo2;
+	int foo3;
+	int64_t foo4;
+
+	fprintf(stderr, "calling parse functions\n");
+
+	git_parse_signed("42", &foo, maximum_signed_value_of_type(int));
+	git_parse_ssize_t("42", &foo1);
+	git_parse_ulong("42", &foo2);
+	git_parse_int("42", &foo3);
+	git_parse_int64("42", &foo4);
+	git_parse_maybe_bool("foo");
+	git_parse_maybe_bool_text("foo");
+	git_env_bool("foo", 1);
+	git_env_ulong("foo", 1);
+}
+
+static int allow_unencoded_fn(char ch) {
+	return 0;
+}
+
+void strbuf_funcs(void) {
+	struct strbuf *sb = xmalloc(sizeof(void*));
+	struct strbuf *sb2 = xmalloc(sizeof(void*));
+	struct strbuf sb3 = STRBUF_INIT;
+	struct string_list list = STRING_LIST_INIT_NODUP;
+	char *buf = "foo";
+	struct strbuf_expand_dict_entry dict[] = {
+		{ "foo", NULL, },
+		{ "bar", NULL, },
+	};
+	int fd = open("/dev/null", O_RDONLY);
+
+	fprintf(stderr, "calling strbuf functions\n");
+
+	starts_with("foo", "bar");
+	istarts_with("foo", "bar");
+	// skip_to_optional_arg_default(const char *str, const char *prefix,
+	// 			 const char **arg, const char *def)
+	strbuf_init(sb, 0);
+	strbuf_init(sb2, 0);
+	strbuf_release(sb);
+	strbuf_attach(sb, strbuf_detach(sb, NULL), 0, 0); // calls strbuf_grow
+	strbuf_swap(sb, sb2);
+	strbuf_setlen(sb, 0);
+	strbuf_trim(sb); // calls strbuf_rtrim, strbuf_ltrim
+	// strbuf_rtrim() called by strbuf_trim()
+	// strbuf_ltrim() called by strbuf_trim()
+	strbuf_trim_trailing_dir_sep(sb);
+	strbuf_trim_trailing_newline(sb);
+	strbuf_reencode(sb, "foo", "bar");
+	strbuf_tolower(sb);
+	strbuf_add_separated_string_list(sb, " ", &list);
+	strbuf_list_free(strbuf_split_buf("foo bar", 8, ' ', -1));
+	strbuf_cmp(sb, sb2);
+	strbuf_addch(sb, 1);
+	strbuf_splice(sb, 0, 1, "foo", 3);
+	strbuf_insert(sb, 0, "foo", 3);
+	// strbuf_vinsertf() called by strbuf_insertf
+	strbuf_insertf(sb, 0, "%s", "foo");
+	strbuf_remove(sb, 0, 1);
+	strbuf_add(sb, "foo", 3);
+	strbuf_addbuf(sb, sb2);
+	strbuf_join_argv(sb, 0, NULL, ' ');
+	strbuf_addchars(sb, 1, 1);
+	strbuf_addf(sb, "%s", "foo");
+	strbuf_add_commented_lines(sb, "foo", 3, '#');
+	strbuf_commented_addf(sb, '#', "%s", "foo");
+	// strbuf_vaddf() called by strbuf_addf()
+	strbuf_expand(sb, "%s", strbuf_expand_literal_cb, NULL);
+	strbuf_expand(sb, "%s", strbuf_expand_dict_cb, &dict);
+	// strbuf_expand_literal_cb() called by strbuf_expand()
+	// strbuf_expand_dict_cb() called by strbuf_expand()
+	strbuf_addbuf_percentquote(sb, &sb3);
+	strbuf_add_percentencode(sb, "foo", STRBUF_ENCODE_SLASH);
+	strbuf_fread(sb, 0, stdin);
+	strbuf_read(sb, fd, 0);
+	strbuf_read_once(sb, fd, 0);
+	strbuf_write(sb, stderr);
+	strbuf_readlink(sb, "/dev/null", 0);
+	strbuf_getcwd(sb);
+	strbuf_getwholeline(sb, stderr, '\n');
+	strbuf_appendwholeline(sb, stderr, '\n');
+	strbuf_getline(sb, stderr);
+	strbuf_getline_lf(sb, stderr);
+	strbuf_getline_nul(sb, stderr);
+	strbuf_getwholeline_fd(sb, fd, '\n');
+	strbuf_read_file(sb, "/dev/null", 0);
+	strbuf_add_lines(sb, "foo", "bar", 0);
+	strbuf_addstr_xml_quoted(sb, "foo");
+	strbuf_addstr_urlencode(sb, "foo", allow_unencoded_fn);
+	strbuf_humanise_bytes(sb, 42);
+	strbuf_humanise_rate(sb, 42);
+	printf_ln("%s", sb);
+	fprintf_ln(stderr, "%s", sb);
+	xstrdup_tolower("foo");
+	xstrdup_toupper("foo");
+	// xstrvfmt() called by xstrfmt()
+	xstrfmt("%s", "foo");
+	// strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
+	// 	     int tz_offset, int suppress_tz_name)
+	// strbuf_stripspace(struct strbuf *sb, char comment_line_char)
+	// strbuf_strip_suffix(struct strbuf *sb, const char *suffix)
+	// strbuf_strip_file_from_path(struct strbuf *sb)
+}
+
+static void error_builtin(const char *err, va_list params) {}
+static void warn_builtin(const char *err, va_list params) {}
+
+static report_fn error_routine = error_builtin;
+static report_fn warn_routine = warn_builtin;
+
+void usage_funcs(void) {
+	fprintf(stderr, "calling usage functions\n");
+	// Functions that call exit() are commented out
+
+	// usage()
+	// usagef()
+	// die()
+	// die_errno();
+	error("foo");
+	error_errno("foo");
+	die_message("foo");
+	die_message_errno("foo");
+	warning("foo");
+	warning_errno("foo");
+
+	// set_die_routine();
+	get_die_message_routine();
+	set_error_routine(error_builtin);
+	get_error_routine();
+	set_warn_routine(warn_builtin);
+	get_warn_routine();
+	// set_die_is_recursing_routine();
+}
+
+void wrapper_funcs(void) {
+	void *ptr = xmalloc(1);
+	int fd = open("/dev/null", O_RDONLY);
+	struct strbuf sb = STRBUF_INIT;
+	int mode = 0444;
+	char host[PATH_MAX], path[PATH_MAX], path1[PATH_MAX];
+	xsnprintf(path, sizeof(path), "out-XXXXXX");
+	xsnprintf(path1, sizeof(path1), "out-XXXXXX");
+	int tmp;
+
+	fprintf(stderr, "calling wrapper functions\n");
+
+	xstrdup("foo");
+	xmalloc(1);
+	xmallocz(1);
+	xmallocz_gently(1);
+	xmemdupz("foo", 3);
+	xstrndup("foo", 3);
+	xrealloc(ptr, 2);
+	xcalloc(1, 1);
+	xsetenv("foo", "bar", 0);
+	xopen("/dev/null", O_RDONLY);
+	xread(fd, &sb, 1);
+	xwrite(fd, &sb, 1);
+	xpread(fd, &sb, 1, 0);
+	xdup(fd);
+	xfopen("/dev/null", "r");
+	xfdopen(fd, "r");
+	tmp = xmkstemp(path);
+	close(tmp);
+	unlink(path);
+	tmp = xmkstemp_mode(path1, mode);
+	close(tmp);
+	unlink(path1);
+	xgetcwd();
+	fopen_for_writing(path);
+	fopen_or_warn(path, "r");
+	xstrncmpz("foo", "bar", 3);
+	// xsnprintf() called above
+	xgethostname(host, 3);
+	tmp = git_mkstemps_mode(path, 1, mode);
+	close(tmp);
+	unlink(path);
+	tmp = git_mkstemp_mode(path, mode);
+	close(tmp);
+	unlink(path);
+	read_in_full(fd, &sb, 1);
+	write_in_full(fd, &sb, 1);
+	pread_in_full(fd, &sb, 1, 0);
+}
+
+int main() {
+	abspath_funcs();
+	hex_ll_funcs();
+	parse_funcs();
+	strbuf_funcs();
+	usage_funcs();
+	wrapper_funcs();
+	fprintf(stderr, "all git-std-lib functions finished calling\n");
+	return 0;
+}
-- 
2.41.0.640.ga95def55d0-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH v2 2/7] object: move function to object.c
  2023-08-10 16:36   ` [RFC PATCH v2 2/7] object: move function to object.c Calvin Wan
@ 2023-08-10 20:32     ` Junio C Hamano
  2023-08-10 22:36     ` Glen Choo
  1 sibling, 0 replies; 70+ messages in thread
From: Junio C Hamano @ 2023-08-10 20:32 UTC (permalink / raw)
  To: Calvin Wan
  Cc: git, nasamuffin, chooglen, jonathantanmy, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:

> While remove_or_warn() is a simple ternary operator to call two other
> wrapper functions, it creates an unnecessary dependency to object.h in
> wrapper.c. Therefore move the function to object.[ch] where the concept
> of GITLINKs is first defined.

An untold assumption here is that we would want to make wrapper.[ch]
independent of Git's internals?

If so, where the thing is moved to (i.e. object.c) is much less
interesting than the fact that the goal of this function is to make
wrapper.[ch] less dependent on Git, so the title should reflect
that, no?

> +/*
> + * Calls the correct function out of {unlink,rmdir}_or_warn based on
> + * the supplied file mode.
> + */
> +int remove_or_warn(unsigned int mode, const char *path);

OK.  That "file mode" thing is not a regular "struct stat .st_mode",
but knows Git's internals, hence it makes sense to have it on our
side, not on the wrapper.[ch] side.  That makes sense.

>  #endif /* OBJECT_H */
> diff --git a/wrapper.c b/wrapper.c
> index 22be9812a7..118d3033de 100644
> --- a/wrapper.c
> +++ b/wrapper.c
> @@ -5,7 +5,6 @@
>  #include "abspath.h"
>  #include "config.h"
>  #include "gettext.h"
> -#include "object.h"
>  #include "repository.h"
>  #include "strbuf.h"
>  #include "trace2.h"
> @@ -647,11 +646,6 @@ int rmdir_or_warn(const char *file)
>  	return warn_if_unremovable("rmdir", file, rmdir(file));
>  }
>  
> -int remove_or_warn(unsigned int mode, const char *file)
> -{
> -	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
> -}
> -
>  static int access_error_is_ok(int err, unsigned flag)
>  {
>  	return (is_missing_file_error(err) ||
> diff --git a/wrapper.h b/wrapper.h
> index c85b1328d1..272795f863 100644
> --- a/wrapper.h
> +++ b/wrapper.h
> @@ -111,11 +111,6 @@ int unlink_or_msg(const char *file, struct strbuf *err);
>   * not exist.
>   */
>  int rmdir_or_warn(const char *path);
> -/*
> - * Calls the correct function out of {unlink,rmdir}_or_warn based on
> - * the supplied file mode.
> - */
> -int remove_or_warn(unsigned int mode, const char *path);
>  
>  /*
>   * Call access(2), but warn for any error except "missing file"

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH v2 3/7] config: correct bad boolean env value error message
  2023-08-10 16:36   ` [RFC PATCH v2 3/7] config: correct bad boolean env value error message Calvin Wan
@ 2023-08-10 20:36     ` Junio C Hamano
  0 siblings, 0 replies; 70+ messages in thread
From: Junio C Hamano @ 2023-08-10 20:36 UTC (permalink / raw)
  To: Calvin Wan
  Cc: git, nasamuffin, chooglen, jonathantanmy, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:

> An incorrectly defined boolean environment value would result in the
> following error message:
>
> bad boolean config value '%s' for '%s'
>
> This is a misnomer since environment value != config value. Instead of
> calling git_config_bool() to parse the environment value, mimic the
> functionality inside of git_config_bool() but with the correct error
> message.

Makes sense.

>
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> ---
>  config.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/config.c b/config.c
> index 09851a6909..5b71ef1624 100644
> --- a/config.c
> +++ b/config.c
> @@ -2172,7 +2172,14 @@ void git_global_config(char **user_out, char **xdg_out)
>  int git_env_bool(const char *k, int def)
>  {
>  	const char *v = getenv(k);
> -	return v ? git_config_bool(k, v) : def;
> +	int val;
> +	if (!v)
> +		return def;
> +	val = git_parse_maybe_bool(v);
> +	if (val < 0)
> +		die(_("bad boolean environment value '%s' for '%s'"),
> +		    v, k);
> +	return val;
>  }
>  
>  /*

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH v2 0/7] Introduce Git Standard Library
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
                     ` (6 preceding siblings ...)
  2023-08-10 16:36   ` [RFC PATCH v2 7/7] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
@ 2023-08-10 22:05   ` Glen Choo
  2023-08-15  9:20     ` Phillip Wood
  2023-08-15  9:41   ` Phillip Wood
  8 siblings, 1 reply; 70+ messages in thread
From: Glen Choo @ 2023-08-10 22:05 UTC (permalink / raw)
  To: Calvin Wan, git
  Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:

> Calvin Wan (7):
>   hex-ll: split out functionality from hex
>   object: move function to object.c
>   config: correct bad boolean env value error message
>   parse: create new library for parsing strings and env values
>   date: push pager.h dependency up
>   git-std-lib: introduce git standard library
>   git-std-lib: add test file to call git-std-lib.a functions

This doesn't seem to apply to 'master'. Do you have a base commit that
reviewers could apply the patches to?

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH v2 2/7] object: move function to object.c
  2023-08-10 16:36   ` [RFC PATCH v2 2/7] object: move function to object.c Calvin Wan
  2023-08-10 20:32     ` Junio C Hamano
@ 2023-08-10 22:36     ` Glen Choo
  2023-08-10 22:43       ` Junio C Hamano
  1 sibling, 1 reply; 70+ messages in thread
From: Glen Choo @ 2023-08-10 22:36 UTC (permalink / raw)
  To: Calvin Wan, git
  Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:

> While remove_or_warn() is a simple ternary operator to call two other
> wrapper functions, it creates an unnecessary dependency to object.h in
> wrapper.c. Therefore move the function to object.[ch] where the concept
> of GITLINKs is first defined.

As Junio mentioned elsewhere, I think we need to establish that
wrapper.c should be free of Git-specific internals.

> diff --git a/object.c b/object.c
> index 60f954194f..cb29fcc304 100644
> --- a/object.c
> +++ b/object.c
> @@ -617,3 +617,8 @@ void parsed_object_pool_clear(struct parsed_object_pool *o)
>  	FREE_AND_NULL(o->object_state);
>  	FREE_AND_NULL(o->shallow_stat);
>  }
> +
> +int remove_or_warn(unsigned int mode, const char *file)
> +{
> +	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
> +}

Since this function really needs S_ISGITLINK (I tried to see if we could
just replace it with S_ISDIR and get the same behavior, but we can't),
this really is a Git-specific thing, so yes, this should be moved out of
wrapper.c.

Minor point: I think a better home might be entry.[ch], because those
files care about performing changes on the worktree based on the
Git-specific file modes in the index, whereas object.[ch] seems more
concerned about the format of objects.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH v2 2/7] object: move function to object.c
  2023-08-10 22:36     ` Glen Choo
@ 2023-08-10 22:43       ` Junio C Hamano
  0 siblings, 0 replies; 70+ messages in thread
From: Junio C Hamano @ 2023-08-10 22:43 UTC (permalink / raw)
  To: Glen Choo
  Cc: Calvin Wan, git, nasamuffin, jonathantanmy, linusa,
	phillip.wood123, vdye

Glen Choo <chooglen@google.com> writes:

> Minor point: I think a better home might be entry.[ch], because those
> files care about performing changes on the worktree based on the
> Git-specific file modes in the index, whereas object.[ch] seems more
> concerned about the format of objects.

Yeah, I wasn't paying much attention on that point while reading the
patch, and I do agree with you that entry.[ch] may be a better fit.

Thanks.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values
  2023-08-10 16:36   ` [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values Calvin Wan
@ 2023-08-10 23:21     ` Glen Choo
  2023-08-10 23:43       ` Junio C Hamano
  2023-08-14 22:15       ` Jonathan Tan
  2023-08-14 22:09     ` Jonathan Tan
  1 sibling, 2 replies; 70+ messages in thread
From: Glen Choo @ 2023-08-10 23:21 UTC (permalink / raw)
  To: Calvin Wan, git
  Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:

> While string and environment value parsing is mainly consumed by
> config.c, there are other files that only need parsing functionality and
> not config functionality. By separating out string and environment value
> parsing from config, those files can instead be dependent on parse,
> which has a much smaller dependency chain than config.
>
> Move general string and env parsing functions from config.[ch] to
> parse.[ch].

An unstated purpose of this patch is that parse.[ch] becomes part of
git-std-lib, but not config.[ch], right?

I think it's reasonable to have the string value parsing logic in
git-std-lib, e.g. this parsing snippet from diff.c seems like a good
thing to put into a library that wants to accept user input:

  static int parse_color_moved(const char *arg)
  {
    switch (git_parse_maybe_bool(arg)) {
    case 0:
      return COLOR_MOVED_NO;
    case 1:
      return COLOR_MOVED_DEFAULT;
    default:
      break;
    }

    if (!strcmp(arg, "no"))
      return COLOR_MOVED_NO;
    else if (!strcmp(arg, "plain"))
      return COLOR_MOVED_PLAIN;
    else if (!strcmp(arg, "blocks"))
      return COLOR_MOVED_BLOCKS;
    /* ... */
  }

But, I don't see a why a non-Git caller would want environment value
parsing in git-std-lib. I wouldn't think that libraries should be
reading Git-formatted environment variables. If I had to guess, you
arranged it this way because you want to keep xmalloc in git-std-lib,
which has a dependency on env var parsing here:

  static int memory_limit_check(size_t size, int gentle)
  {
    static size_t limit = 0;
    if (!limit) {
      limit = git_env_ulong("GIT_ALLOC_LIMIT", 0);
      if (!limit)
        limit = SIZE_MAX;
    }
    if (size > limit) {
      if (gentle) {
        error("attempting to allocate %"PRIuMAX" over limit %"PRIuMAX,
              (uintmax_t)size, (uintmax_t)limit);
        return -1;
      } else
        die("attempting to allocate %"PRIuMAX" over limit %"PRIuMAX,
            (uintmax_t)size, (uintmax_t)limit);
    }
    return 0;
  }

If we libified this as-is, wouldn't our caller start paying attention to
the GIT_ALLOC_LIMIT environment variable? That seems like an undesirable
side effect.

I see later in the series that you have "stubs", which are presumably
entrypoints for the caller to specify their own implementations of
Git-specific things. If so, then an alternative would be to provide a
"stub" to get the memory limit, something like:

  /* wrapper.h aka the things to stub */
  size_t git_get_memory_limit(void);

  /* stub-wrapper-or-something.c aka Git's implementation of the stub */

  #include "wrapper.h"
  size_t git_get_memory_limit(void)
  {
      return git_env_ulong("GIT_ALLOC_LIMIT", 0);
  }

  /* wrapper.c aka the thing in git-stb-lib */
  static int memory_limit_check(size_t size, int gentle)
  {
    static size_t limit = 0;
    if (!limit) {
      limit = git_get_memory_limit();
      if (!limit)
        limit = SIZE_MAX;
    }
    if (size > limit) {
      if (gentle) {
        error("attempting to allocate %"PRIuMAX" over limit %"PRIuMAX,
              (uintmax_t)size, (uintmax_t)limit);
        return -1;
      } else
        die("attempting to allocate %"PRIuMAX" over limit %"PRIuMAX,
            (uintmax_t)size, (uintmax_t)limit);
    }
    return 0;
  }

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH v2 5/7] date: push pager.h dependency up
  2023-08-10 16:36   ` [RFC PATCH v2 5/7] date: push pager.h dependency up Calvin Wan
@ 2023-08-10 23:41     ` Glen Choo
  2023-08-14 22:17     ` Jonathan Tan
  1 sibling, 0 replies; 70+ messages in thread
From: Glen Choo @ 2023-08-10 23:41 UTC (permalink / raw)
  To: Calvin Wan, git
  Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:

> In order for date.c to be included in git-std-lib, the dependency to
> pager.h must be removed since it has dependencies on many other files
> not in git-std-lib.

Dependencies aside, I doubt callers of Git libraries want Git's
pager-handling logic bundled in git-std-lib ;)

> @@ -1003,13 +1002,13 @@ static enum date_mode_type parse_date_type(const char *format, const char **end)
>  	die("unknown date format %s", format);
>  }
>  
> -void parse_date_format(const char *format, struct date_mode *mode)
> +void parse_date_format(const char *format, struct date_mode *mode, int pager_in_use)
>  {
>  	const char *p;
>  
>  	/* "auto:foo" is "if tty/pager, then foo, otherwise normal" */
>  	if (skip_prefix(format, "auto:", &p)) {
> -		if (isatty(1) || pager_in_use())
> +		if (isatty(1) || pager_in_use)
>  			format = p;
>  		else
>  			format = "default";

Hm, it feels odd to ship a parsing option that changes based on whether
the caller isatty or not. Ideally we would stub this "switch the value
of auto" logic too.

Without reading ahead, I'm not sure if there are other sorts of "library
influencing process-wide" oddities like the one here and in the previous
patch. I think it would be okay for us to merge this series with these,
as long as we advertise to callers that the library boundary isn't very
clean yet, and we eventually clean it up.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values
  2023-08-10 23:21     ` Glen Choo
@ 2023-08-10 23:43       ` Junio C Hamano
  2023-08-14 22:15       ` Jonathan Tan
  1 sibling, 0 replies; 70+ messages in thread
From: Junio C Hamano @ 2023-08-10 23:43 UTC (permalink / raw)
  To: Glen Choo
  Cc: Calvin Wan, git, nasamuffin, jonathantanmy, linusa,
	phillip.wood123, vdye

Glen Choo <chooglen@google.com> writes:

> I think it's reasonable to have the string value parsing logic in
> git-std-lib, e.g. this parsing snippet from diff.c seems like a good
> thing to put into a library that wants to accept user input:
>
>   static int parse_color_moved(const char *arg)
>   {
>     switch (git_parse_maybe_bool(arg)) {
>     case 0:
>       return COLOR_MOVED_NO;
>     case 1:
>       return COLOR_MOVED_DEFAULT;
>     default:
>       break;
>     }
>
>     if (!strcmp(arg, "no"))
>       return COLOR_MOVED_NO;
>     else if (!strcmp(arg, "plain"))
>       return COLOR_MOVED_PLAIN;
>     else if (!strcmp(arg, "blocks"))
>       return COLOR_MOVED_BLOCKS;
>     /* ... */
>   }
>
> But, I don't see a why a non-Git caller would want environment value
> parsing in git-std-lib.

It also is debatable why a non-Git caller wants to parse the value
to the "--color-moved" option (or a configuration variable) to begin
with.  Its vocabulary is closely tied to what the diff machinery in
Git can do, isn't it?

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values
  2023-08-10 16:36   ` [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values Calvin Wan
  2023-08-10 23:21     ` Glen Choo
@ 2023-08-14 22:09     ` Jonathan Tan
  2023-08-14 22:19       ` Junio C Hamano
  1 sibling, 1 reply; 70+ messages in thread
From: Jonathan Tan @ 2023-08-14 22:09 UTC (permalink / raw)
  To: Calvin Wan
  Cc: Jonathan Tan, git, nasamuffin, chooglen, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:
> While string and environment value parsing is mainly consumed by
> config.c, there are other files that only need parsing functionality and
> not config functionality. By separating out string and environment value
> parsing from config, those files can instead be dependent on parse,
> which has a much smaller dependency chain than config.
> 
> Move general string and env parsing functions from config.[ch] to
> parse.[ch].
> 
> Signed-off-by: Calvin Wan <calvinwan@google.com>

Thanks - I think that patches 1 through 4 are worth merging even now.
One thing we hoped to accomplish through the libification effort is to
make changes that are beneficial even outside the libification context,
and it seems that this is one of them. Previously, code needed to
include config.h even when it didn't use the main functionality that
config.h provides (config), but now it no longer needs to do so. (And
same argument for hex, although on a smaller scale.)

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values
  2023-08-10 23:21     ` Glen Choo
  2023-08-10 23:43       ` Junio C Hamano
@ 2023-08-14 22:15       ` Jonathan Tan
  1 sibling, 0 replies; 70+ messages in thread
From: Jonathan Tan @ 2023-08-14 22:15 UTC (permalink / raw)
  To: Glen Choo
  Cc: Jonathan Tan, Calvin Wan, git, nasamuffin, linusa, phillip.wood123, vdye

Glen Choo <chooglen@google.com> writes:
> But, I don't see a why a non-Git caller would want environment value
> parsing in git-std-lib. I wouldn't think that libraries should be
> reading Git-formatted environment variables.

I think environment parsing in git-std-lib is fine, at least for the
short term. First, currently we expect a lot from a user of our library
(including tolerating breaking changes in API), so I think it is
reasonable for such a user to be aware that some functionality can be
changed by an environment variable. Second, the purpose of the library
is to provide functionality that currently is only accessible through
CLI in library form, and if the CLI deems that some functionality
should be accessible through an environment variable instead of a config
variable or CLI parameter for whatever reason, we should reflect that in
the library as well.
 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH v2 5/7] date: push pager.h dependency up
  2023-08-10 16:36   ` [RFC PATCH v2 5/7] date: push pager.h dependency up Calvin Wan
  2023-08-10 23:41     ` Glen Choo
@ 2023-08-14 22:17     ` Jonathan Tan
  1 sibling, 0 replies; 70+ messages in thread
From: Jonathan Tan @ 2023-08-14 22:17 UTC (permalink / raw)
  To: Calvin Wan
  Cc: Jonathan Tan, git, nasamuffin, chooglen, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:
> In order for date.c to be included in git-std-lib, the dependency to
> pager.h must be removed since it has dependencies on many other files
> not in git-std-lib. We achieve this by passing a boolean for
> "pager_in_use", rather than checking for it in parse_date_format() so
> callers of the function will have that dependency.

Instead of doing it as you describe here, could this be another stub
instead? That way, we don't need to change the code here.

I don't feel strongly about this, though, so if other reviewers think
that the approach in this patch makes the code better, I'm OK with that.
 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values
  2023-08-14 22:09     ` Jonathan Tan
@ 2023-08-14 22:19       ` Junio C Hamano
  0 siblings, 0 replies; 70+ messages in thread
From: Junio C Hamano @ 2023-08-14 22:19 UTC (permalink / raw)
  To: Jonathan Tan
  Cc: Calvin Wan, git, nasamuffin, chooglen, linusa, phillip.wood123, vdye

Jonathan Tan <jonathantanmy@google.com> writes:

> Thanks - I think that patches 1 through 4 are worth merging even now.
> One thing we hoped to accomplish through the libification effort is to
> make changes that are beneficial even outside the libification context,
> and it seems that this is one of them. Previously, code needed to
> include config.h even when it didn't use the main functionality that
> config.h provides (config), but now it no longer needs to do so. (And
> same argument for hex, although on a smaller scale.)

Thanks for writing this down.  The parser is shared across handling
data that come from config, environ, and options, and separating it
as a component different from the config does make sense.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH v2 6/7] git-std-lib: introduce git standard library
  2023-08-10 16:36   ` [RFC PATCH v2 6/7] git-std-lib: introduce git standard library Calvin Wan
@ 2023-08-14 22:26     ` Jonathan Tan
  0 siblings, 0 replies; 70+ messages in thread
From: Jonathan Tan @ 2023-08-14 22:26 UTC (permalink / raw)
  To: Calvin Wan
  Cc: Jonathan Tan, git, nasamuffin, chooglen, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:
> +Rationale behind Git Standard Library
> +================

Would it be clearer to write "Rationale behind what's in and what's not
in the Git Standard Library"? Or maybe that is too much of a mouthful.

> +Files inside of Git Standard Library
> +================
> +
> +The initial set of files in git-std-lib.a are:
> +abspath.c
> +ctype.c
> +date.c
> +hex-ll.c
> +parse.c
> +strbuf.c
> +usage.c
> +utf8.c
> +wrapper.c
> +stubs/repository.c
> +stubs/trace2.c
> +relevant compat/ files

I noticed that an earlier version did not have the "stubs" lines and
this version does, but could not find a comment about why these were
added. For me, what would make sense is to remove the "stubs" lines,
and then say "When these files are compiled together with the following
files (or user-provided files that provide the same functions), they
form a complete library", and then list the stubs after.

> diff --git a/git-compat-util.h b/git-compat-util.h
> index 481dac22b0..75aa9b263e 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -396,8 +396,8 @@ static inline int noop_core_config(const char *var UNUSED,
>  #define platform_core_config noop_core_config
>  #endif
>  
> +#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
>  int lstat_cache_aware_rmdir(const char *path);
> -#if !defined(__MINGW32__) && !defined(_MSC_VER)
>  #define rmdir lstat_cache_aware_rmdir
>  #endif

(and other changes that use defined(GIT_STD_LIB))

One alternative is to add stubs for lstat_cache_aware_rmdir that call
the "real" rmdir, but I guess that would be unnecessarily confusing.
Also, it would be strange if a user included a header file that
redefined a standard library function, so I guess we do need such a
"defined()" guard.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH v2 7/7] git-std-lib: add test file to call git-std-lib.a functions
  2023-08-10 16:36   ` [RFC PATCH v2 7/7] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
@ 2023-08-14 22:28     ` Jonathan Tan
  0 siblings, 0 replies; 70+ messages in thread
From: Jonathan Tan @ 2023-08-14 22:28 UTC (permalink / raw)
  To: Calvin Wan
  Cc: Jonathan Tan, git, nasamuffin, chooglen, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:
> Add test file that directly or indirectly calls all functions defined in
> git-std-lib.a object files to showcase that they do not reference
> missing objects and that git-std-lib.a can stand on its own.
> 
> Certain functions that cause the program to exit or are already called
> by other functions are commented out.
> 
> TODO: replace with unit tests
> Signed-off-by: Calvin Wan <calvinwan@google.com>

Thanks for this patch - it's useful for reviewers to see what this
patch set accomplishes (a way to compile a subset of files in Git that
can provide library functionality). I don't think we should merge it
as-is but should wait until we have a unit test that also exercises
functions, and then merge that instead (I think your TODO expresses the
same sentiment).
 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH v2 0/7] Introduce Git Standard Library
  2023-08-10 22:05   ` [RFC PATCH v2 0/7] Introduce Git Standard Library Glen Choo
@ 2023-08-15  9:20     ` Phillip Wood
  2023-08-16 17:17       ` Calvin Wan
  0 siblings, 1 reply; 70+ messages in thread
From: Phillip Wood @ 2023-08-15  9:20 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: nasamuffin, jonathantanmy, linusa, vdye

On 10/08/2023 23:05, Glen Choo wrote:
> Calvin Wan <calvinwan@google.com> writes:
> 
>> Calvin Wan (7):
>>    hex-ll: split out functionality from hex
>>    object: move function to object.c
>>    config: correct bad boolean env value error message
>>    parse: create new library for parsing strings and env values
>>    date: push pager.h dependency up
>>    git-std-lib: introduce git standard library
>>    git-std-lib: add test file to call git-std-lib.a functions
> 
> This doesn't seem to apply to 'master'. Do you have a base commit that
> reviewers could apply the patches to?

I don't know what they are based on, but I did manage to apply them to 
master by using "am -3" and resolving the conflicts. The result is at 
https://github.com/phillipwood/git/tree/cw/git-std-lib/rfc-v2 if anyone 
is interested.

Best Wishes

Phillip


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH v2 0/7] Introduce Git Standard Library
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
                     ` (7 preceding siblings ...)
  2023-08-10 22:05   ` [RFC PATCH v2 0/7] Introduce Git Standard Library Glen Choo
@ 2023-08-15  9:41   ` Phillip Wood
  2023-09-08 17:41     ` [PATCH v3 0/6] " Calvin Wan
  8 siblings, 1 reply; 70+ messages in thread
From: Phillip Wood @ 2023-08-15  9:41 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: nasamuffin, chooglen, jonathantanmy, linusa, vdye

Hi Calvin

On 10/08/2023 17:33, Calvin Wan wrote:
> Original cover letter:
> https://lore.kernel.org/git/20230627195251.1973421-1-calvinwan@google.com/
> 
> In the initial RFC, I had a patch that removed the trace2 dependency
> from usage.c so that git-std-lib.a would not have dependencies outside
> of git-std-lib.a files. Consequently this meant that tracing would not
> be possible in git-std-lib.a files for other developers of Git, and it
> is not a good idea for the libification effort to close the door on
> tracing in certain files for future development (thanks Victoria for
> pointing this out). That patch has been removed and instead I introduce
> stubbed out versions of repository.[ch] and trace2.[ch] that are swapped
> in during compilation time (I'm no Makefile expert so any advice on how
> on I could do this better would be much appreciated). These stubbed out
> files contain no implementations and therefore do not have any
> additional dependencies, allowing git-std-lib.a to compile with only the
> stubs as additional dependencies.

I think stubbing out trace2 is a sensible approach. I don't think we
need separate headers when using the stub though, or a stub for
repository.c as we don't call any of the functions declared in that
header. I've appended a patch that shows a simplified stub. It also
removes the recursive make call as it no-longer needs to juggle the
header files.

> This also has the added benefit of
> removing `#ifdef GIT_STD_LIB` macros in C files for specific library
> compilation rules. Libification shouldn't pollute C files with these
> macros. The boundaries for git-std-lib.a have also been updated to
> contain these stubbed out files.

Do you have any plans to support building with gettext support so we
can use git-std-lib.a as a dependency of libgit.a?
  
> I have also made some additional changes to the Makefile to piggy back
> off of our existing build rules for .c/.o targets and their
> dependencies. As I learn more about Makefiles, I am continuing to look
> for ways to improve these rules. Eventually I would like to be able to
> have a set of rules that future libraries can emulate and is scalable
> in the sense of not creating additional toil for developers that are not
> interested in libification.

I'm not sure reusing LIB_OBJS for different targets is a good idea.
Once libgit.a starts to depend on git-std-lib.a we'll want to build them
both with a single make invocation without resorting to recursive make
calls. I think we could perhaps make a template function to create the
compilation rules for each library - see the end of
https://wingolog.org/archives/2023/08/08/a-negative-result

Best Wishes

Phillip

---- >8 -----
 From 194403e42f116cc3c6ed8eb8b03d6933b24067e4 Mon Sep 17 00:00:00 2001
From: Phillip Wood <phillip.wood@dunelm.org.uk>
Date: Sat, 12 Aug 2023 17:27:23 +0100
Subject: [PATCH] git-std-lib: simplify sub implementation

The code in std-lib does not depend directly on the functions declared
in repository.h and so it does not need to provide stub
implementations of the functions declared in repository.h. There is a
transitive dependency on `struct repository` from the functions
declared in trace2.h but the stub implementation of those functions
can simply define its own stub for struct repository. There is also no
need to use different headers when compiling against the stub
implementation of trace2.

This means we can simplify the stub implementation by removing
stubs/{repository.[ch],trace2.h} and simplify the Makefile by removing
the code that replaces header files when compiling against the trace2
stub. git-std-lib.a can now be built by running

   make git-std-lib.a GIT_STD_LIB=YesPlease STUB_TRACE2=YesPlease

There is one other small fixup in this commit:

  - `wrapper.c` includes `repository.h` but does not use any of the
    declarations.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
  Makefile           | 29 +-------------------
  stubs/repository.c |  4 ---
  stubs/repository.h |  8 ------
  stubs/trace2.c     |  5 ++++
  stubs/trace2.h     | 68 ----------------------------------------------
  wrapper.c          |  1 -
  6 files changed, 6 insertions(+), 109 deletions(-)
  delete mode 100644 stubs/repository.c
  delete mode 100644 stubs/repository.h
  delete mode 100644 stubs/trace2.h

diff --git a/Makefile b/Makefile
index a821d73c9d0..8eff4021025 100644
--- a/Makefile
+++ b/Makefile
@@ -1209,10 +1209,6 @@ LIB_OBJS += usage.o
  LIB_OBJS += utf8.o
  LIB_OBJS += wrapper.o
  
-ifdef STUB_REPOSITORY
-STUB_OBJS += stubs/repository.o
-endif
-
  ifdef STUB_TRACE2
  STUB_OBJS += stubs/trace2.o
  endif
@@ -3866,31 +3862,8 @@ fuzz-all: $(FUZZ_PROGRAMS)
  ### Libified Git rules
  
  # git-std-lib
-# `make git-std-lib GIT_STD_LIB=YesPlease STUB_REPOSITORY=YesPlease STUB_TRACE2=YesPlease`
+# `make git-std-lib.a GIT_STD_LIB=YesPlease STUB_TRACE2=YesPlease`
  STD_LIB = git-std-lib.a
  
  $(STD_LIB): $(LIB_OBJS) $(COMPAT_OBJS) $(STUB_OBJS)
  	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
-
-TEMP_HEADERS = temp_headers/
-
-git-std-lib:
-# Move headers to temporary folder and replace them with stubbed headers.
-# After building, move headers and stubbed headers back.
-ifneq ($(STUB_OBJS),)
-	mkdir -p $(TEMP_HEADERS); \
-	for d in $(STUB_OBJS); do \
-		BASE=$${d%.*}; \
-		mv $${BASE##*/}.h $(TEMP_HEADERS)$${BASE##*/}.h; \
-		mv $${BASE}.h $${BASE##*/}.h; \
-	done; \
-	$(MAKE) $(STD_LIB); \
-	for d in $(STUB_OBJS); do \
-		BASE=$${d%.*}; \
-		mv $${BASE##*/}.h $${BASE}.h; \
-		mv $(TEMP_HEADERS)$${BASE##*/}.h $${BASE##*/}.h; \
-	done; \
-	rm -rf temp_headers
-else
-	$(MAKE) $(STD_LIB)
-endif
diff --git a/stubs/repository.c b/stubs/repository.c
deleted file mode 100644
index f81520d083a..00000000000
--- a/stubs/repository.c
+++ /dev/null
@@ -1,4 +0,0 @@
-#include "git-compat-util.h"
-#include "repository.h"
-
-struct repository *the_repository;
diff --git a/stubs/repository.h b/stubs/repository.h
deleted file mode 100644
index 18262d748e5..00000000000
--- a/stubs/repository.h
+++ /dev/null
@@ -1,8 +0,0 @@
-#ifndef REPOSITORY_H
-#define REPOSITORY_H
-
-struct repository { int stub; };
-
-extern struct repository *the_repository;
-
-#endif /* REPOSITORY_H */
diff --git a/stubs/trace2.c b/stubs/trace2.c
index efc3f9c1f39..7d894822288 100644
--- a/stubs/trace2.c
+++ b/stubs/trace2.c
@@ -1,6 +1,10 @@
  #include "git-compat-util.h"
  #include "trace2.h"
  
+struct child_process { int stub; };
+struct repository { int stub; };
+struct json_writer { int stub; };
+
  void trace2_region_enter_fl(const char *file, int line, const char *category,
  			    const char *label, const struct repository *repo, ...) { }
  void trace2_region_leave_fl(const char *file, int line, const char *category,
@@ -19,4 +23,5 @@ void trace2_data_intmax_fl(const char *file, int line, const char *category,
  			   const struct repository *repo, const char *key,
  			   intmax_t value) { }
  int trace2_is_enabled(void) { return 0; }
+void trace2_counter_add(enum trace2_counter_id cid, uint64_t value) { }
  void trace2_collect_process_info(enum trace2_process_info_reason reason) { }
diff --git a/stubs/trace2.h b/stubs/trace2.h
deleted file mode 100644
index 836a14797cc..00000000000
--- a/stubs/trace2.h
+++ /dev/null
@@ -1,68 +0,0 @@
-#ifndef TRACE2_H
-#define TRACE2_H
-
-struct child_process { int stub; };
-struct repository;
-struct json_writer { int stub; };
-
-void trace2_region_enter_fl(const char *file, int line, const char *category,
-			    const char *label, const struct repository *repo, ...);
-
-#define trace2_region_enter(category, label, repo) \
-	trace2_region_enter_fl(__FILE__, __LINE__, (category), (label), (repo))
-
-void trace2_region_leave_fl(const char *file, int line, const char *category,
-			    const char *label, const struct repository *repo, ...);
-
-#define trace2_region_leave(category, label, repo) \
-	trace2_region_leave_fl(__FILE__, __LINE__, (category), (label), (repo))
-
-void trace2_data_string_fl(const char *file, int line, const char *category,
-			   const struct repository *repo, const char *key,
-			   const char *value);
-
-#define trace2_data_string(category, repo, key, value)                       \
-	trace2_data_string_fl(__FILE__, __LINE__, (category), (repo), (key), \
-			      (value))
-
-void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names);
-
-#define trace2_cmd_ancestry(v) trace2_cmd_ancestry_fl(__FILE__, __LINE__, (v))
-
-void trace2_cmd_error_va_fl(const char *file, int line, const char *fmt,
-			    va_list ap);
-
-#define trace2_cmd_error_va(fmt, ap) \
-	trace2_cmd_error_va_fl(__FILE__, __LINE__, (fmt), (ap))
-
-
-void trace2_cmd_name_fl(const char *file, int line, const char *name);
-
-#define trace2_cmd_name(v) trace2_cmd_name_fl(__FILE__, __LINE__, (v))
-
-void trace2_thread_start_fl(const char *file, int line,
-			    const char *thread_base_name);
-
-#define trace2_thread_start(thread_base_name) \
-	trace2_thread_start_fl(__FILE__, __LINE__, (thread_base_name))
-
-void trace2_thread_exit_fl(const char *file, int line);
-
-#define trace2_thread_exit() trace2_thread_exit_fl(__FILE__, __LINE__)
-
-void trace2_data_intmax_fl(const char *file, int line, const char *category,
-			   const struct repository *repo, const char *key,
-			   intmax_t value);
-
-#define trace2_data_intmax(category, repo, key, value)                       \
-	trace2_data_intmax_fl(__FILE__, __LINE__, (category), (repo), (key), \
-			      (value))
-
-enum trace2_process_info_reason {
-	TRACE2_PROCESS_INFO_STARTUP,
-	TRACE2_PROCESS_INFO_EXIT,
-};
-int trace2_is_enabled(void);
-void trace2_collect_process_info(enum trace2_process_info_reason reason);
-
-#endif /* TRACE2_H */
diff --git a/wrapper.c b/wrapper.c
index 9eae4a8b3a0..e6facc5ff0c 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -5,7 +5,6 @@
  #include "abspath.h"
  #include "parse.h"
  #include "gettext.h"
-#include "repository.h"
  #include "strbuf.h"
  #include "trace2.h"
  
-- 
2.40.1.850.ge5e148ffb7d



^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH v2 0/7] Introduce Git Standard Library
  2023-08-15  9:20     ` Phillip Wood
@ 2023-08-16 17:17       ` Calvin Wan
  2023-08-16 21:19         ` Junio C Hamano
  0 siblings, 1 reply; 70+ messages in thread
From: Calvin Wan @ 2023-08-16 17:17 UTC (permalink / raw)
  To: phillip.wood; +Cc: git, nasamuffin, jonathantanmy, linusa, vdye

Thanks for resolving the conflicts on master. I should've rebased
before sending out this v2 since it's built off of 2.41 with some of
my other patch cleanup series.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [RFC PATCH v2 0/7] Introduce Git Standard Library
  2023-08-16 17:17       ` Calvin Wan
@ 2023-08-16 21:19         ` Junio C Hamano
  0 siblings, 0 replies; 70+ messages in thread
From: Junio C Hamano @ 2023-08-16 21:19 UTC (permalink / raw)
  To: Calvin Wan; +Cc: phillip.wood, git, nasamuffin, jonathantanmy, linusa, vdye

Calvin Wan <calvinwan@google.com> writes:

> Thanks for resolving the conflicts on master. I should've rebased
> before sending out this v2 since it's built off of 2.41 with some of
> my other patch cleanup series.

I think the freeze period before the release would be a good time to
rebuild on an updated base to prepare v3 for posting.

Thanks.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 0/6] Introduce Git Standard Library
  2023-08-15  9:41   ` Phillip Wood
@ 2023-09-08 17:41     ` Calvin Wan
  2023-09-08 17:44       ` [PATCH v3 1/6] hex-ll: split out functionality from hex Calvin Wan
                         ` (6 more replies)
  0 siblings, 7 replies; 70+ messages in thread
From: Calvin Wan @ 2023-09-08 17:41 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Original cover letter:
https://lore.kernel.org/git/20230627195251.1973421-1-calvinwan@google.com/

I have taken this series out of RFC since there weren't any significant
concerns with the overall concept and design of this series. This reroll
incorporates some smaller changes such as dropping the "push pager
dependency" patch in favor of stubbing it out. The main change this
reroll cleans up the Makefile rules and stubs, as suggested by
Phillip Wood (appreciate the help on this one)!

This series has been rebased onto 1fc548b2d6a: The sixth batch

Originally this series was built on other patches that have since been
merged, which is why the range-diff is shown removing many of them.

Calvin Wan (6):
  hex-ll: split out functionality from hex
  wrapper: remove dependency to Git-specific internal file
  config: correct bad boolean env value error message
  parse: create new library for parsing strings and env values
  git-std-lib: introduce git standard library
  git-std-lib: add test file to call git-std-lib.a functions

 Documentation/technical/git-std-lib.txt | 191 ++++++++++++++++++++
 Makefile                                |  41 ++++-
 attr.c                                  |   2 +-
 color.c                                 |   2 +-
 config.c                                | 173 +-----------------
 config.h                                |  14 +-
 entry.c                                 |   5 +
 entry.h                                 |   6 +
 git-compat-util.h                       |   7 +-
 hex-ll.c                                |  49 +++++
 hex-ll.h                                |  27 +++
 hex.c                                   |  47 -----
 hex.h                                   |  24 +--
 mailinfo.c                              |   2 +-
 pack-objects.c                          |   2 +-
 pack-revindex.c                         |   2 +-
 parse-options.c                         |   3 +-
 parse.c                                 | 182 +++++++++++++++++++
 parse.h                                 |  20 ++
 pathspec.c                              |   2 +-
 preload-index.c                         |   2 +-
 progress.c                              |   2 +-
 prompt.c                                |   2 +-
 rebase.c                                |   2 +-
 strbuf.c                                |   2 +-
 stubs/pager.c                           |   6 +
 stubs/pager.h                           |   6 +
 stubs/trace2.c                          |  27 +++
 symlinks.c                              |   2 +
 t/Makefile                              |   4 +
 t/helper/test-env-helper.c              |   2 +-
 t/stdlib-test.c                         | 231 ++++++++++++++++++++++++
 unpack-trees.c                          |   2 +-
 url.c                                   |   2 +-
 urlmatch.c                              |   2 +-
 wrapper.c                               |   9 +-
 wrapper.h                               |   5 -
 write-or-die.c                          |   2 +-
 38 files changed, 824 insertions(+), 287 deletions(-)
 create mode 100644 Documentation/technical/git-std-lib.txt
 create mode 100644 hex-ll.c
 create mode 100644 hex-ll.h
 create mode 100644 parse.c
 create mode 100644 parse.h
 create mode 100644 stubs/pager.c
 create mode 100644 stubs/pager.h
 create mode 100644 stubs/trace2.c
 create mode 100644 t/stdlib-test.c

Range-diff against v2:
 1:  121788f263 <  -:  ---------- strbuf: clarify API boundary
 2:  5e91404ecd <  -:  ---------- strbuf: clarify dependency
 3:  5c05f40181 <  -:  ---------- abspath: move related functions to abspath
 4:  e1addc77e5 <  -:  ---------- credential-store: move related functions to credential-store file
 5:  62e8c42f59 <  -:  ---------- object-name: move related functions to object-name
 6:  0abba57acb <  -:  ---------- path: move related function to path
 7:  d33267a390 <  -:  ---------- strbuf: remove global variable
 8:  665d2c2089 <  -:  ---------- init-db: document existing bug with core.bare in template config
 9:  68d0a8ff16 <  -:  ---------- init-db: remove unnecessary global variable
10:  8c8ec85507 <  -:  ---------- init-db, clone: change unnecessary global into passed parameter
11:  d555e2b365 <  -:  ---------- setup: adopt shared init-db & clone code
12:  689a7bc8aa <  -:  ---------- read-cache: move shared commit and ls-files code
13:  392f8e75b7 <  -:  ---------- add: modify add_files_to_cache() to avoid globals
14:  49ce237013 <  -:  ---------- read-cache: move shared add/checkout/commit code
15:  c5d8370d40 <  -:  ---------- statinfo: move stat_{data,validity} functions from cache/read-cache
16:  90a72b6f86 <  -:  ---------- run-command.h: move declarations for run-command.c from cache.h
17:  f27516c780 <  -:  ---------- name-hash.h: move declarations for name-hash.c from cache.h
18:  895c38a050 <  -:  ---------- sparse-index.h: move declarations for sparse-index.c from cache.h
19:  8678d4ad20 <  -:  ---------- preload-index.h: move declarations for preload-index.c from elsewhere
20:  4a463abaae <  -:  ---------- diff.h: move declaration for global in diff.c from cache.h
21:  3440e762c7 <  -:  ---------- merge.h: move declarations for merge.c from cache.h
22:  e70853e398 <  -:  ---------- repository.h: move declaration of the_index from cache.h
23:  ccd2014d73 <  -:  ---------- read-cache*.h: move declarations for read-cache.c functions from cache.h
24:  d3a482afa9 <  -:  ---------- cache.h: remove this no-longer-used header
25:  eaa087f446 <  -:  ---------- log-tree: replace include of revision.h with simple forward declaration
26:  5d2b0a9c75 <  -:  ---------- repository: remove unnecessary include of path.h
27:  250f83014e <  -:  ---------- diff.h: remove unnecessary include of oidset.h
28:  d0f9913958 <  -:  ---------- list-objects-filter-options.h: remove unneccessary include
29:  03a2b2a515 <  -:  ---------- builtin.h: remove unneccessary includes
30:  15edc22d00 <  -:  ---------- git-compat-util.h: remove unneccessary include of wildmatch.h
31:  e4e1bec8bd <  -:  ---------- merge-ll: rename from ll-merge
32:  9185495fd0 <  -:  ---------- khash: name the structs that khash declares
33:  15fb05e453 <  -:  ---------- object-store-ll.h: split this header out of object-store.h
34:  2608fe4b23 <  -:  ---------- hash-ll, hashmap: move oidhash() to hash-ll
35:  5e8dc5b574 <  -:  ---------- fsmonitor-ll.h: split this header out of fsmonitor.h
36:  37d32fc3fd <  -:  ---------- git-compat-util: move strbuf.c funcs to its header
37:  6ed19d5fe2 <  -:  ---------- git-compat-util: move wrapper.c funcs to its header
38:  555d1b8942 <  -:  ---------- sane-ctype.h: create header for sane-ctype macros
39:  72d591e282 <  -:  ---------- kwset: move translation table from ctype
40:  5d1dc2a118 <  -:  ---------- common.h: move non-compat specific macros and functions
41:  33e07e552e <  -:  ---------- git-compat-util: move usage.c funcs to its header
42:  417a8aa733 <  -:  ---------- treewide: remove unnecessary includes for wrapper.h
43:  65e35d00c1 <  -:  ---------- common: move alloc macros to common.h
44:  78634bc406 !  1:  2f99eb2ca4 hex-ll: split out functionality from hex
    @@ hex.h
     +#include "hex-ll.h"
      
      /*
    -  * Try to read a SHA1 in hexadecimal format from the 40 characters
    -@@ hex.h: int get_oid_hex(const char *hex, struct object_id *sha1);
    +  * Try to read a hash (specified by the_hash_algo) in hexadecimal
    +@@ hex.h: int get_oid_hex(const char *hex, struct object_id *oid);
      /* Like get_oid_hex, but for an arbitrary hash algorithm. */
      int get_oid_hex_algop(const char *hex, struct object_id *oid, const struct git_hash_algo *algop);
      
45:  21ec1d276e !  2:  7b2d123628 object: move function to object.c
    @@ Metadata
     Author: Calvin Wan <calvinwan@google.com>
     
      ## Commit message ##
    -    object: move function to object.c
    +    wrapper: remove dependency to Git-specific internal file
     
    -    While remove_or_warn() is a simple ternary operator to call two other
    -    wrapper functions, it creates an unnecessary dependency to object.h in
    -    wrapper.c. Therefore move the function to object.[ch] where the concept
    -    of GITLINKs is first defined.
    +    In order for wrapper.c to be built independently as part of a smaller
    +    library, it cannot have dependencies to other Git specific
    +    internals. remove_or_warn() creates an unnecessary dependency to
    +    object.h in wrapper.c. Therefore move the function to entry.[ch] which
    +    performs changes on the worktree based on the Git-specific file modes in
    +    the index.
     
    - ## object.c ##
    -@@ object.c: void parsed_object_pool_clear(struct parsed_object_pool *o)
    - 	FREE_AND_NULL(o->object_state);
    - 	FREE_AND_NULL(o->shallow_stat);
    + ## entry.c ##
    +@@ entry.c: void unlink_entry(const struct cache_entry *ce, const char *super_prefix)
    + 		return;
    + 	schedule_dir_for_removal(ce->name, ce_namelen(ce));
      }
     +
     +int remove_or_warn(unsigned int mode, const char *file)
    @@ object.c: void parsed_object_pool_clear(struct parsed_object_pool *o)
     +	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
     +}
     
    - ## object.h ##
    -@@ object.h: void clear_object_flags(unsigned flags);
    -  */
    - void repo_clear_commit_marks(struct repository *r, unsigned int flags);
    + ## entry.h ##
    +@@ entry.h: int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st)
    + void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
    + 			   struct stat *st);
      
     +/*
     + * Calls the correct function out of {unlink,rmdir}_or_warn based on
    @@ object.h: void clear_object_flags(unsigned flags);
     + */
     +int remove_or_warn(unsigned int mode, const char *path);
     +
    - #endif /* OBJECT_H */
    + #endif /* ENTRY_H */
     
      ## wrapper.c ##
     @@
46:  41dcf8107c =  3:  b37beb206a config: correct bad boolean env value error message
47:  3e800a41c4 !  4:  3a827cf45c parse: create new library for parsing strings and env values
    @@ Commit message
         config.c, there are other files that only need parsing functionality and
         not config functionality. By separating out string and environment value
         parsing from config, those files can instead be dependent on parse,
    -    which has a much smaller dependency chain than config.
    +    which has a much smaller dependency chain than config. This ultimately
    +    allows us to inclue parse.[ch] in an independent library since it
    +    doesn't have dependencies to Git-specific internals unlike in
    +    config.[ch].
     
         Move general string and env parsing functions from config.[ch] to
         parse.[ch].
    @@ config.c: static int git_parse_source(struct config_source *cs, config_fn_t fn,
     -	return 1;
     -}
     -
    - static int reader_config_name(struct config_reader *reader, const char **out);
    - static int reader_origin_type(struct config_reader *reader,
    - 			      enum config_origin_type *type);
    -@@ config.c: ssize_t git_config_ssize_t(const char *name, const char *value)
    + NORETURN
    + static void die_bad_number(const char *name, const char *value,
    + 			   const struct key_value_info *kvi)
    +@@ config.c: ssize_t git_config_ssize_t(const char *name, const char *value,
      	return ret;
      }
      
    @@ config.c: static enum fsync_component parse_fsync_components(const char *var, co
     -	return -1;
     -}
     -
    - int git_config_bool_or_int(const char *name, const char *value, int *is_bool)
    + int git_config_bool_or_int(const char *name, const char *value,
    + 			   const struct key_value_info *kvi, int *is_bool)
      {
    - 	int v = git_parse_maybe_bool_text(value);
     @@ config.c: void git_global_config(char **user_out, char **xdg_out)
      	*xdg_out = xdg_config;
      }
    @@ config.c: void git_global_config(char **user_out, char **xdg_out)
     
      ## config.h ##
     @@
    - 
      #include "hashmap.h"
      #include "string-list.h"
    + #include "repository.h"
     -
     +#include "parse.h"
      
48:  7a4a088bc3 <  -:  ---------- date: push pager.h dependency up
49:  c9002734d0 !  5:  f8e4ac50a0 git-std-lib: introduce git standard library
    @@ Documentation/technical/git-std-lib.txt (new)
     +Rationale behind Git Standard Library
     +================
     +
    -+The rationale behind Git Standard Library essentially is the result of
    -+two observations within the Git codebase: every file includes
    -+git-compat-util.h which defines functions in a couple of different
    -+files, and wrapper.c + usage.c have difficult-to-separate circular
    -+dependencies with each other and other files.
    ++The rationale behind what's in and what's not in the Git Standard
    ++Library essentially is the result of two observations within the Git
    ++codebase: every file includes git-compat-util.h which defines functions
    ++in a couple of different files, and wrapper.c + usage.c have
    ++difficult-to-separate circular dependencies with each other and other
    ++files.
     +
     +Ubiquity of git-compat-util.h and circular dependencies
     +========
    @@ Documentation/technical/git-std-lib.txt (new)
     + - low-level git/* files with functions defined in git-compat-util.h
     +   (ctype.c)
     + - compat/*
    -+ - stubbed out dependencies in stubs/ (stubs/repository.c, stubs/trace2.c)
    ++ - stubbed out dependencies in stubs/ (stubs/pager.c, stubs/trace2.c)
     +
     +There are other files that might fit this definition, but that does not
     +mean it should belong in git-std-lib.a. Those files should start as
     +their own separate library since any file added to git-std-lib.a loses
     +its flexibility of being easily swappable.
     +
    -+Wrapper.c and usage.c have dependencies on repository and trace2 that are
    ++Wrapper.c and usage.c have dependencies on pager and trace2 that are
     +possible to remove at the cost of sacrificing the ability for standard Git
     +to be able to trace functions in those files and other files in git-std-lib.a.
     +In order for git-std-lib.a to compile with those dependencies, stubbed out
    @@ Documentation/technical/git-std-lib.txt (new)
     +usage.c
     +utf8.c
     +wrapper.c
    -+stubs/repository.c
    -+stubs/trace2.c
     +relevant compat/ files
     +
    ++When these files are compiled together with the following files (or
    ++user-provided files that provide the same functions), they form a
    ++complete library:
    ++stubs/pager.c
    ++stubs/trace2.c
    ++
     +Pitfalls
     +================
     +
    @@ Makefile: LIB_OBJS += write-or-die.o
     +LIB_OBJS += utf8.o
     +LIB_OBJS += wrapper.o
     +
    -+ifdef STUB_REPOSITORY
    -+STUB_OBJS += stubs/repository.o
    -+endif
    -+
     +ifdef STUB_TRACE2
     +STUB_OBJS += stubs/trace2.o
     +endif
     +
    ++ifdef STUB_PAGER
    ++STUB_OBJS += stubs/pager.o
    ++endif
    ++
     +LIB_OBJS += $(STUB_OBJS)
     +endif
      
    @@ Makefile: ifdef FSMONITOR_OS_SETTINGS
      NO_TCLTK = NoThanks
      endif
     @@ Makefile: clean: profile-clean coverage-clean cocciclean
    - 	$(RM) po/git.pot po/git-core.pot
      	$(RM) git.res
      	$(RM) $(OBJECTS)
    + 	$(RM) headless-git.o
     -	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
     +	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB) $(STD_LIB_FILE)
      	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) $(OTHER_PROGRAMS)
    @@ Makefile: $(FUZZ_PROGRAMS): all
     +### Libified Git rules
     +
     +# git-std-lib
    -+# `make git-std-lib GIT_STD_LIB=YesPlease STUB_REPOSITORY=YesPlease STUB_TRACE2=YesPlease`
    ++# `make git-std-lib.a GIT_STD_LIB=YesPlease STUB_TRACE2=YesPlease STUB_PAGER=YesPlease`
     +STD_LIB = git-std-lib.a
     +
     +$(STD_LIB): $(LIB_OBJS) $(COMPAT_OBJS) $(STUB_OBJS)
     +	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
    -+
    -+TEMP_HEADERS = temp_headers/
    -+
    -+git-std-lib:
    -+# Move headers to temporary folder and replace them with stubbed headers.
    -+# After building, move headers and stubbed headers back.
    -+ifneq ($(STUB_OBJS),)
    -+	mkdir -p $(TEMP_HEADERS); \
    -+	for d in $(STUB_OBJS); do \
    -+		BASE=$${d%.*}; \
    -+		mv $${BASE##*/}.h $(TEMP_HEADERS)$${BASE##*/}.h; \
    -+		mv $${BASE}.h $${BASE##*/}.h; \
    -+	done; \
    -+	$(MAKE) $(STD_LIB); \
    -+	for d in $(STUB_OBJS); do \
    -+		BASE=$${d%.*}; \
    -+		mv $${BASE##*/}.h $${BASE}.h; \
    -+		mv $(TEMP_HEADERS)$${BASE##*/}.h $${BASE##*/}.h; \
    -+	done; \
    -+	rm -rf temp_headers
    -+else
    -+	$(MAKE) $(STD_LIB)
    -+endif
     
      ## git-compat-util.h ##
     @@ git-compat-util.h: static inline int noop_core_config(const char *var UNUSED,
    @@ git-compat-util.h: const char *inet_ntop(int af, const void *src, char *dst, siz
      #endif
     +#endif
      
    - /*
    -  * Limit size of IO chunks, because huge chunks only cause pain.  OS X
    -@@ git-compat-util.h: int git_access(const char *path, int mode);
    - # endif
    - #endif
    + static inline size_t st_add(size_t a, size_t b)
    + {
    +@@ git-compat-util.h: static inline int is_missing_file_error(int errno_)
    + 	return (errno_ == ENOENT || errno_ == ENOTDIR);
    + }
      
     +#ifndef GIT_STD_LIB
      int cmd_main(int, const char **);
    @@ git-compat-util.h: int git_access(const char *path, int mode);
      /*
       * You can mark a stack variable with UNLEAK(var) to avoid it being
     
    - ## stubs/repository.c (new) ##
    + ## stubs/pager.c (new) ##
     @@
    -+#include "git-compat-util.h"
    -+#include "repository.h"
    ++#include "pager.h"
     +
    -+struct repository *the_repository;
    ++int pager_in_use(void)
    ++{
    ++	return 0;
    ++}
     
    - ## stubs/repository.h (new) ##
    + ## stubs/pager.h (new) ##
     @@
    -+#ifndef REPOSITORY_H
    -+#define REPOSITORY_H
    ++#ifndef PAGER_H
    ++#define PAGER_H
     +
    -+struct repository { int stub; };
    ++int pager_in_use(void);
     +
    -+extern struct repository *the_repository;
    -+
    -+#endif /* REPOSITORY_H */
    ++#endif /* PAGER_H */
     
      ## stubs/trace2.c (new) ##
     @@
     +#include "git-compat-util.h"
     +#include "trace2.h"
     +
    ++struct child_process { int stub; };
    ++struct repository { int stub; };
    ++struct json_writer { int stub; };
    ++
     +void trace2_region_enter_fl(const char *file, int line, const char *category,
     +			    const char *label, const struct repository *repo, ...) { }
     +void trace2_region_leave_fl(const char *file, int line, const char *category,
    @@ stubs/trace2.c (new)
     +			   const struct repository *repo, const char *key,
     +			   intmax_t value) { }
     +int trace2_is_enabled(void) { return 0; }
    ++void trace2_counter_add(enum trace2_counter_id cid, uint64_t value) { }
     +void trace2_collect_process_info(enum trace2_process_info_reason reason) { }
     
    - ## stubs/trace2.h (new) ##
    -@@
    -+#ifndef TRACE2_H
    -+#define TRACE2_H
    -+
    -+struct child_process { int stub; };
    -+struct repository;
    -+struct json_writer { int stub; };
    -+
    -+void trace2_region_enter_fl(const char *file, int line, const char *category,
    -+			    const char *label, const struct repository *repo, ...);
    -+
    -+#define trace2_region_enter(category, label, repo) \
    -+	trace2_region_enter_fl(__FILE__, __LINE__, (category), (label), (repo))
    -+
    -+void trace2_region_leave_fl(const char *file, int line, const char *category,
    -+			    const char *label, const struct repository *repo, ...);
    -+
    -+#define trace2_region_leave(category, label, repo) \
    -+	trace2_region_leave_fl(__FILE__, __LINE__, (category), (label), (repo))
    -+
    -+void trace2_data_string_fl(const char *file, int line, const char *category,
    -+			   const struct repository *repo, const char *key,
    -+			   const char *value);
    -+
    -+#define trace2_data_string(category, repo, key, value)                       \
    -+	trace2_data_string_fl(__FILE__, __LINE__, (category), (repo), (key), \
    -+			      (value))
    -+
    -+void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names);
    -+
    -+#define trace2_cmd_ancestry(v) trace2_cmd_ancestry_fl(__FILE__, __LINE__, (v))
    -+
    -+void trace2_cmd_error_va_fl(const char *file, int line, const char *fmt,
    -+			    va_list ap);
    -+
    -+#define trace2_cmd_error_va(fmt, ap) \
    -+	trace2_cmd_error_va_fl(__FILE__, __LINE__, (fmt), (ap))
    -+
    -+
    -+void trace2_cmd_name_fl(const char *file, int line, const char *name);
    -+
    -+#define trace2_cmd_name(v) trace2_cmd_name_fl(__FILE__, __LINE__, (v))
    -+
    -+void trace2_thread_start_fl(const char *file, int line,
    -+			    const char *thread_base_name);
    -+
    -+#define trace2_thread_start(thread_base_name) \
    -+	trace2_thread_start_fl(__FILE__, __LINE__, (thread_base_name))
    -+
    -+void trace2_thread_exit_fl(const char *file, int line);
    -+
    -+#define trace2_thread_exit() trace2_thread_exit_fl(__FILE__, __LINE__)
    -+
    -+void trace2_data_intmax_fl(const char *file, int line, const char *category,
    -+			   const struct repository *repo, const char *key,
    -+			   intmax_t value);
    -+
    -+#define trace2_data_intmax(category, repo, key, value)                       \
    -+	trace2_data_intmax_fl(__FILE__, __LINE__, (category), (repo), (key), \
    -+			      (value))
    -+
    -+enum trace2_process_info_reason {
    -+	TRACE2_PROCESS_INFO_STARTUP,
    -+	TRACE2_PROCESS_INFO_EXIT,
    -+};
    -+int trace2_is_enabled(void);
    -+void trace2_collect_process_info(enum trace2_process_info_reason reason);
    -+
    -+#endif /* TRACE2_H */
    -+
    -
      ## symlinks.c ##
     @@ symlinks.c: void invalidate_lstat_cache(void)
      	reset_lstat_cache(&default_cache);
    @@ symlinks.c: int lstat_cache_aware_rmdir(const char *path)
      	return ret;
      }
     +#endif
    +
    + ## wrapper.c ##
    +@@
    + #include "abspath.h"
    + #include "parse.h"
    + #include "gettext.h"
    +-#include "repository.h"
    + #include "strbuf.h"
    + #include "trace2.h"
    + 
50:  0bead8f980 !  6:  7840e1830a git-std-lib: add test file to call git-std-lib.a functions
    @@ t/stdlib-test.c (new)
     +	struct strbuf sb3 = STRBUF_INIT;
     +	struct string_list list = STRING_LIST_INIT_NODUP;
     +	char *buf = "foo";
    -+	struct strbuf_expand_dict_entry dict[] = {
    -+		{ "foo", NULL, },
    -+		{ "bar", NULL, },
    -+	};
     +	int fd = open("/dev/null", O_RDONLY);
     +
     +	fprintf(stderr, "calling strbuf functions\n");
    @@ t/stdlib-test.c (new)
     +	strbuf_add_commented_lines(sb, "foo", 3, '#');
     +	strbuf_commented_addf(sb, '#', "%s", "foo");
     +	// strbuf_vaddf() called by strbuf_addf()
    -+	strbuf_expand(sb, "%s", strbuf_expand_literal_cb, NULL);
    -+	strbuf_expand(sb, "%s", strbuf_expand_dict_cb, &dict);
    -+	// strbuf_expand_literal_cb() called by strbuf_expand()
    -+	// strbuf_expand_dict_cb() called by strbuf_expand()
     +	strbuf_addbuf_percentquote(sb, &sb3);
     +	strbuf_add_percentencode(sb, "foo", STRBUF_ENCODE_SLASH);
     +	strbuf_fread(sb, 0, stdin);
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 1/6] hex-ll: split out functionality from hex
  2023-09-08 17:41     ` [PATCH v3 0/6] " Calvin Wan
@ 2023-09-08 17:44       ` Calvin Wan
  2023-09-08 17:44       ` [PATCH v3 2/6] wrapper: remove dependency to Git-specific internal file Calvin Wan
                         ` (5 subsequent siblings)
  6 siblings, 0 replies; 70+ messages in thread
From: Calvin Wan @ 2023-09-08 17:44 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Separate out hex functionality that doesn't require a hash algo into
hex-ll.[ch]. Since the hash algo is currently a global that sits in
repository, this separation removes that dependency for files that only
need basic hex manipulation functions.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 Makefile   |  1 +
 color.c    |  2 +-
 hex-ll.c   | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 hex-ll.h   | 27 +++++++++++++++++++++++++++
 hex.c      | 47 -----------------------------------------------
 hex.h      | 24 +-----------------------
 mailinfo.c |  2 +-
 strbuf.c   |  2 +-
 url.c      |  2 +-
 urlmatch.c |  2 +-
 10 files changed, 83 insertions(+), 75 deletions(-)
 create mode 100644 hex-ll.c
 create mode 100644 hex-ll.h

diff --git a/Makefile b/Makefile
index 5776309365..861e643708 100644
--- a/Makefile
+++ b/Makefile
@@ -1040,6 +1040,7 @@ LIB_OBJS += hash-lookup.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
+LIB_OBJS += hex-ll.o
 LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += json-writer.o
diff --git a/color.c b/color.c
index b24b19566b..f663c06ac4 100644
--- a/color.c
+++ b/color.c
@@ -3,7 +3,7 @@
 #include "color.h"
 #include "editor.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "pager.h"
 #include "strbuf.h"
 
diff --git a/hex-ll.c b/hex-ll.c
new file mode 100644
index 0000000000..4d7ece1de5
--- /dev/null
+++ b/hex-ll.c
@@ -0,0 +1,49 @@
+#include "git-compat-util.h"
+#include "hex-ll.h"
+
+const signed char hexval_table[256] = {
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 00-07 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 08-0f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 10-17 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 18-1f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 20-27 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 28-2f */
+	  0,  1,  2,  3,  4,  5,  6,  7,		/* 30-37 */
+	  8,  9, -1, -1, -1, -1, -1, -1,		/* 38-3f */
+	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 40-47 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 48-4f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 50-57 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 58-5f */
+	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 60-67 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 68-67 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 70-77 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 78-7f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 80-87 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 88-8f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 90-97 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 98-9f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a0-a7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a8-af */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b0-b7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b8-bf */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c0-c7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c8-cf */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d0-d7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d8-df */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e0-e7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e8-ef */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f0-f7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f8-ff */
+};
+
+int hex_to_bytes(unsigned char *binary, const char *hex, size_t len)
+{
+	for (; len; len--, hex += 2) {
+		unsigned int val = (hexval(hex[0]) << 4) | hexval(hex[1]);
+
+		if (val & ~0xff)
+			return -1;
+		*binary++ = val;
+	}
+	return 0;
+}
diff --git a/hex-ll.h b/hex-ll.h
new file mode 100644
index 0000000000..a381fa8556
--- /dev/null
+++ b/hex-ll.h
@@ -0,0 +1,27 @@
+#ifndef HEX_LL_H
+#define HEX_LL_H
+
+extern const signed char hexval_table[256];
+static inline unsigned int hexval(unsigned char c)
+{
+	return hexval_table[c];
+}
+
+/*
+ * Convert two consecutive hexadecimal digits into a char.  Return a
+ * negative value on error.  Don't run over the end of short strings.
+ */
+static inline int hex2chr(const char *s)
+{
+	unsigned int val = hexval(s[0]);
+	return (val & ~0xf) ? val : (val << 4) | hexval(s[1]);
+}
+
+/*
+ * Read `len` pairs of hexadecimal digits from `hex` and write the
+ * values to `binary` as `len` bytes. Return 0 on success, or -1 if
+ * the input does not consist of hex digits).
+ */
+int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
+
+#endif
diff --git a/hex.c b/hex.c
index 01f17fe5c9..d42262bdca 100644
--- a/hex.c
+++ b/hex.c
@@ -2,53 +2,6 @@
 #include "hash.h"
 #include "hex.h"
 
-const signed char hexval_table[256] = {
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 00-07 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 08-0f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 10-17 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 18-1f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 20-27 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 28-2f */
-	  0,  1,  2,  3,  4,  5,  6,  7,		/* 30-37 */
-	  8,  9, -1, -1, -1, -1, -1, -1,		/* 38-3f */
-	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 40-47 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 48-4f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 50-57 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 58-5f */
-	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 60-67 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 68-67 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 70-77 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 78-7f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 80-87 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 88-8f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 90-97 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 98-9f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a0-a7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a8-af */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b0-b7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b8-bf */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c0-c7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c8-cf */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d0-d7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d8-df */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e0-e7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e8-ef */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f0-f7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f8-ff */
-};
-
-int hex_to_bytes(unsigned char *binary, const char *hex, size_t len)
-{
-	for (; len; len--, hex += 2) {
-		unsigned int val = (hexval(hex[0]) << 4) | hexval(hex[1]);
-
-		if (val & ~0xff)
-			return -1;
-		*binary++ = val;
-	}
-	return 0;
-}
-
 static int get_hash_hex_algop(const char *hex, unsigned char *hash,
 			      const struct git_hash_algo *algop)
 {
diff --git a/hex.h b/hex.h
index 87abf66602..e0b83f776f 100644
--- a/hex.h
+++ b/hex.h
@@ -2,22 +2,7 @@
 #define HEX_H
 
 #include "hash-ll.h"
-
-extern const signed char hexval_table[256];
-static inline unsigned int hexval(unsigned char c)
-{
-	return hexval_table[c];
-}
-
-/*
- * Convert two consecutive hexadecimal digits into a char.  Return a
- * negative value on error.  Don't run over the end of short strings.
- */
-static inline int hex2chr(const char *s)
-{
-	unsigned int val = hexval(s[0]);
-	return (val & ~0xf) ? val : (val << 4) | hexval(s[1]);
-}
+#include "hex-ll.h"
 
 /*
  * Try to read a hash (specified by the_hash_algo) in hexadecimal
@@ -34,13 +19,6 @@ int get_oid_hex(const char *hex, struct object_id *oid);
 /* Like get_oid_hex, but for an arbitrary hash algorithm. */
 int get_oid_hex_algop(const char *hex, struct object_id *oid, const struct git_hash_algo *algop);
 
-/*
- * Read `len` pairs of hexadecimal digits from `hex` and write the
- * values to `binary` as `len` bytes. Return 0 on success, or -1 if
- * the input does not consist of hex digits).
- */
-int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
-
 /*
  * Convert a binary hash in "unsigned char []" or an object name in
  * "struct object_id *" to its hex equivalent. The `_r` variant is reentrant,
diff --git a/mailinfo.c b/mailinfo.c
index 931505363c..a07d2da16d 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -1,7 +1,7 @@
 #include "git-compat-util.h"
 #include "config.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "utf8.h"
 #include "strbuf.h"
 #include "mailinfo.h"
diff --git a/strbuf.c b/strbuf.c
index 4c9ac6dc5e..7827178d8e 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "string-list.h"
 #include "utf8.h"
diff --git a/url.c b/url.c
index 2e1a9f6fee..282b12495a 100644
--- a/url.c
+++ b/url.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "url.h"
 
diff --git a/urlmatch.c b/urlmatch.c
index 1c45f23adf..1d0254abac 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "urlmatch.h"
 
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 2/6] wrapper: remove dependency to Git-specific internal file
  2023-09-08 17:41     ` [PATCH v3 0/6] " Calvin Wan
  2023-09-08 17:44       ` [PATCH v3 1/6] hex-ll: split out functionality from hex Calvin Wan
@ 2023-09-08 17:44       ` Calvin Wan
  2023-09-15 17:54         ` Jonathan Tan
  2023-09-08 17:44       ` [PATCH v3 3/6] config: correct bad boolean env value error message Calvin Wan
                         ` (4 subsequent siblings)
  6 siblings, 1 reply; 70+ messages in thread
From: Calvin Wan @ 2023-09-08 17:44 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

In order for wrapper.c to be built independently as part of a smaller
library, it cannot have dependencies to other Git specific
internals. remove_or_warn() creates an unnecessary dependency to
object.h in wrapper.c. Therefore move the function to entry.[ch] which
performs changes on the worktree based on the Git-specific file modes in
the index.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 entry.c   | 5 +++++
 entry.h   | 6 ++++++
 wrapper.c | 6 ------
 wrapper.h | 5 -----
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/entry.c b/entry.c
index 43767f9043..076e97eb89 100644
--- a/entry.c
+++ b/entry.c
@@ -581,3 +581,8 @@ void unlink_entry(const struct cache_entry *ce, const char *super_prefix)
 		return;
 	schedule_dir_for_removal(ce->name, ce_namelen(ce));
 }
+
+int remove_or_warn(unsigned int mode, const char *file)
+{
+	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
+}
diff --git a/entry.h b/entry.h
index 7329f918a9..ca3ed35bc0 100644
--- a/entry.h
+++ b/entry.h
@@ -62,4 +62,10 @@ int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st)
 void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
 			   struct stat *st);
 
+/*
+ * Calls the correct function out of {unlink,rmdir}_or_warn based on
+ * the supplied file mode.
+ */
+int remove_or_warn(unsigned int mode, const char *path);
+
 #endif /* ENTRY_H */
diff --git a/wrapper.c b/wrapper.c
index 48065c4f53..453a20ed99 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -5,7 +5,6 @@
 #include "abspath.h"
 #include "config.h"
 #include "gettext.h"
-#include "object.h"
 #include "repository.h"
 #include "strbuf.h"
 #include "trace2.h"
@@ -632,11 +631,6 @@ int rmdir_or_warn(const char *file)
 	return warn_if_unremovable("rmdir", file, rmdir(file));
 }
 
-int remove_or_warn(unsigned int mode, const char *file)
-{
-	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
-}
-
 static int access_error_is_ok(int err, unsigned flag)
 {
 	return (is_missing_file_error(err) ||
diff --git a/wrapper.h b/wrapper.h
index 79c7321bb3..1b2b047ea0 100644
--- a/wrapper.h
+++ b/wrapper.h
@@ -106,11 +106,6 @@ int unlink_or_msg(const char *file, struct strbuf *err);
  * not exist.
  */
 int rmdir_or_warn(const char *path);
-/*
- * Calls the correct function out of {unlink,rmdir}_or_warn based on
- * the supplied file mode.
- */
-int remove_or_warn(unsigned int mode, const char *path);
 
 /*
  * Call access(2), but warn for any error except "missing file"
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 3/6] config: correct bad boolean env value error message
  2023-09-08 17:41     ` [PATCH v3 0/6] " Calvin Wan
  2023-09-08 17:44       ` [PATCH v3 1/6] hex-ll: split out functionality from hex Calvin Wan
  2023-09-08 17:44       ` [PATCH v3 2/6] wrapper: remove dependency to Git-specific internal file Calvin Wan
@ 2023-09-08 17:44       ` Calvin Wan
  2023-09-08 17:44       ` [PATCH v3 4/6] parse: create new library for parsing strings and env values Calvin Wan
                         ` (3 subsequent siblings)
  6 siblings, 0 replies; 70+ messages in thread
From: Calvin Wan @ 2023-09-08 17:44 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

An incorrectly defined boolean environment value would result in the
following error message:

bad boolean config value '%s' for '%s'

This is a misnomer since environment value != config value. Instead of
calling git_config_bool() to parse the environment value, mimic the
functionality inside of git_config_bool() but with the correct error
message.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 config.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/config.c b/config.c
index 3846a37be9..7dde0aaa02 100644
--- a/config.c
+++ b/config.c
@@ -2133,7 +2133,14 @@ void git_global_config(char **user_out, char **xdg_out)
 int git_env_bool(const char *k, int def)
 {
 	const char *v = getenv(k);
-	return v ? git_config_bool(k, v) : def;
+	int val;
+	if (!v)
+		return def;
+	val = git_parse_maybe_bool(v);
+	if (val < 0)
+		die(_("bad boolean environment value '%s' for '%s'"),
+		    v, k);
+	return val;
 }
 
 /*
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 4/6] parse: create new library for parsing strings and env values
  2023-09-08 17:41     ` [PATCH v3 0/6] " Calvin Wan
                         ` (2 preceding siblings ...)
  2023-09-08 17:44       ` [PATCH v3 3/6] config: correct bad boolean env value error message Calvin Wan
@ 2023-09-08 17:44       ` Calvin Wan
  2023-09-08 17:44       ` [PATCH v3 5/6] git-std-lib: introduce git standard library Calvin Wan
                         ` (2 subsequent siblings)
  6 siblings, 0 replies; 70+ messages in thread
From: Calvin Wan @ 2023-09-08 17:44 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

While string and environment value parsing is mainly consumed by
config.c, there are other files that only need parsing functionality and
not config functionality. By separating out string and environment value
parsing from config, those files can instead be dependent on parse,
which has a much smaller dependency chain than config. This ultimately
allows us to inclue parse.[ch] in an independent library since it
doesn't have dependencies to Git-specific internals unlike in
config.[ch].

Move general string and env parsing functions from config.[ch] to
parse.[ch].

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 Makefile                   |   1 +
 attr.c                     |   2 +-
 config.c                   | 180 +-----------------------------------
 config.h                   |  14 +--
 pack-objects.c             |   2 +-
 pack-revindex.c            |   2 +-
 parse-options.c            |   3 +-
 parse.c                    | 182 +++++++++++++++++++++++++++++++++++++
 parse.h                    |  20 ++++
 pathspec.c                 |   2 +-
 preload-index.c            |   2 +-
 progress.c                 |   2 +-
 prompt.c                   |   2 +-
 rebase.c                   |   2 +-
 t/helper/test-env-helper.c |   2 +-
 unpack-trees.c             |   2 +-
 wrapper.c                  |   2 +-
 write-or-die.c             |   2 +-
 18 files changed, 219 insertions(+), 205 deletions(-)
 create mode 100644 parse.c
 create mode 100644 parse.h

diff --git a/Makefile b/Makefile
index 861e643708..9226c719a0 100644
--- a/Makefile
+++ b/Makefile
@@ -1091,6 +1091,7 @@ LIB_OBJS += pack-write.o
 LIB_OBJS += packfile.o
 LIB_OBJS += pager.o
 LIB_OBJS += parallel-checkout.o
+LIB_OBJS += parse.o
 LIB_OBJS += parse-options-cb.o
 LIB_OBJS += parse-options.o
 LIB_OBJS += patch-delta.o
diff --git a/attr.c b/attr.c
index 71c84fbcf8..3c0b4fb3d9 100644
--- a/attr.c
+++ b/attr.c
@@ -7,7 +7,7 @@
  */
 
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "environment.h"
 #include "exec-cmd.h"
 #include "attr.h"
diff --git a/config.c b/config.c
index 7dde0aaa02..c7bc21a25d 100644
--- a/config.c
+++ b/config.c
@@ -11,6 +11,7 @@
 #include "date.h"
 #include "branch.h"
 #include "config.h"
+#include "parse.h"
 #include "convert.h"
 #include "environment.h"
 #include "gettext.h"
@@ -1165,129 +1166,6 @@ static int git_parse_source(struct config_source *cs, config_fn_t fn,
 	return error_return;
 }
 
-static uintmax_t get_unit_factor(const char *end)
-{
-	if (!*end)
-		return 1;
-	else if (!strcasecmp(end, "k"))
-		return 1024;
-	else if (!strcasecmp(end, "m"))
-		return 1024 * 1024;
-	else if (!strcasecmp(end, "g"))
-		return 1024 * 1024 * 1024;
-	return 0;
-}
-
-static int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
-{
-	if (value && *value) {
-		char *end;
-		intmax_t val;
-		intmax_t factor;
-
-		if (max < 0)
-			BUG("max must be a positive integer");
-
-		errno = 0;
-		val = strtoimax(value, &end, 0);
-		if (errno == ERANGE)
-			return 0;
-		if (end == value) {
-			errno = EINVAL;
-			return 0;
-		}
-		factor = get_unit_factor(end);
-		if (!factor) {
-			errno = EINVAL;
-			return 0;
-		}
-		if ((val < 0 && -max / factor > val) ||
-		    (val > 0 && max / factor < val)) {
-			errno = ERANGE;
-			return 0;
-		}
-		val *= factor;
-		*ret = val;
-		return 1;
-	}
-	errno = EINVAL;
-	return 0;
-}
-
-static int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
-{
-	if (value && *value) {
-		char *end;
-		uintmax_t val;
-		uintmax_t factor;
-
-		/* negative values would be accepted by strtoumax */
-		if (strchr(value, '-')) {
-			errno = EINVAL;
-			return 0;
-		}
-		errno = 0;
-		val = strtoumax(value, &end, 0);
-		if (errno == ERANGE)
-			return 0;
-		if (end == value) {
-			errno = EINVAL;
-			return 0;
-		}
-		factor = get_unit_factor(end);
-		if (!factor) {
-			errno = EINVAL;
-			return 0;
-		}
-		if (unsigned_mult_overflows(factor, val) ||
-		    factor * val > max) {
-			errno = ERANGE;
-			return 0;
-		}
-		val *= factor;
-		*ret = val;
-		return 1;
-	}
-	errno = EINVAL;
-	return 0;
-}
-
-int git_parse_int(const char *value, int *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-static int git_parse_int64(const char *value, int64_t *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int64_t)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-int git_parse_ulong(const char *value, unsigned long *ret)
-{
-	uintmax_t tmp;
-	if (!git_parse_unsigned(value, &tmp, maximum_unsigned_value_of_type(long)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-int git_parse_ssize_t(const char *value, ssize_t *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
 NORETURN
 static void die_bad_number(const char *name, const char *value,
 			   const struct key_value_info *kvi)
@@ -1363,23 +1241,6 @@ ssize_t git_config_ssize_t(const char *name, const char *value,
 	return ret;
 }
 
-static int git_parse_maybe_bool_text(const char *value)
-{
-	if (!value)
-		return 1;
-	if (!*value)
-		return 0;
-	if (!strcasecmp(value, "true")
-	    || !strcasecmp(value, "yes")
-	    || !strcasecmp(value, "on"))
-		return 1;
-	if (!strcasecmp(value, "false")
-	    || !strcasecmp(value, "no")
-	    || !strcasecmp(value, "off"))
-		return 0;
-	return -1;
-}
-
 static const struct fsync_component_name {
 	const char *name;
 	enum fsync_component component_bits;
@@ -1454,16 +1315,6 @@ static enum fsync_component parse_fsync_components(const char *var, const char *
 	return (current & ~negative) | positive;
 }
 
-int git_parse_maybe_bool(const char *value)
-{
-	int v = git_parse_maybe_bool_text(value);
-	if (0 <= v)
-		return v;
-	if (git_parse_int(value, &v))
-		return !!v;
-	return -1;
-}
-
 int git_config_bool_or_int(const char *name, const char *value,
 			   const struct key_value_info *kvi, int *is_bool)
 {
@@ -2126,35 +1977,6 @@ void git_global_config(char **user_out, char **xdg_out)
 	*xdg_out = xdg_config;
 }
 
-/*
- * Parse environment variable 'k' as a boolean (in various
- * possible spellings); if missing, use the default value 'def'.
- */
-int git_env_bool(const char *k, int def)
-{
-	const char *v = getenv(k);
-	int val;
-	if (!v)
-		return def;
-	val = git_parse_maybe_bool(v);
-	if (val < 0)
-		die(_("bad boolean environment value '%s' for '%s'"),
-		    v, k);
-	return val;
-}
-
-/*
- * Parse environment variable 'k' as ulong with possibly a unit
- * suffix; if missing, use the default value 'val'.
- */
-unsigned long git_env_ulong(const char *k, unsigned long val)
-{
-	const char *v = getenv(k);
-	if (v && !git_parse_ulong(v, &val))
-		die(_("failed to parse %s"), k);
-	return val;
-}
-
 int git_config_system(void)
 {
 	return !git_env_bool("GIT_CONFIG_NOSYSTEM", 0);
diff --git a/config.h b/config.h
index 6332d74904..14f881ecfa 100644
--- a/config.h
+++ b/config.h
@@ -4,7 +4,7 @@
 #include "hashmap.h"
 #include "string-list.h"
 #include "repository.h"
-
+#include "parse.h"
 
 /**
  * The config API gives callers a way to access Git configuration files
@@ -243,16 +243,6 @@ int config_with_options(config_fn_t fn, void *,
  * The following helper functions aid in parsing string values
  */
 
-int git_parse_ssize_t(const char *, ssize_t *);
-int git_parse_ulong(const char *, unsigned long *);
-int git_parse_int(const char *value, int *ret);
-
-/**
- * Same as `git_config_bool`, except that it returns -1 on error rather
- * than dying.
- */
-int git_parse_maybe_bool(const char *);
-
 /**
  * Parse the string to an integer, including unit factors. Dies on error;
  * otherwise, returns the parsed result.
@@ -385,8 +375,6 @@ int git_config_rename_section(const char *, const char *);
 int git_config_rename_section_in_file(const char *, const char *, const char *);
 int git_config_copy_section(const char *, const char *);
 int git_config_copy_section_in_file(const char *, const char *, const char *);
-int git_env_bool(const char *, int);
-unsigned long git_env_ulong(const char *, unsigned long);
 int git_config_system(void);
 int config_error_nonbool(const char *);
 #if defined(__GNUC__)
diff --git a/pack-objects.c b/pack-objects.c
index 1b8052bece..f403ca6986 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -3,7 +3,7 @@
 #include "pack.h"
 #include "pack-objects.h"
 #include "packfile.h"
-#include "config.h"
+#include "parse.h"
 
 static uint32_t locate_object_entry_hash(struct packing_data *pdata,
 					 const struct object_id *oid,
diff --git a/pack-revindex.c b/pack-revindex.c
index 7fffcad912..a01a2a4640 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -6,7 +6,7 @@
 #include "packfile.h"
 #include "strbuf.h"
 #include "trace2.h"
-#include "config.h"
+#include "parse.h"
 #include "midx.h"
 #include "csum-file.h"
 
diff --git a/parse-options.c b/parse-options.c
index e8e076c3a6..093eaf2db8 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -1,11 +1,12 @@
 #include "git-compat-util.h"
 #include "parse-options.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "commit.h"
 #include "color.h"
 #include "gettext.h"
 #include "strbuf.h"
+#include "string-list.h"
 #include "utf8.h"
 
 static int disallow_abbreviated_options;
diff --git a/parse.c b/parse.c
new file mode 100644
index 0000000000..42d691a0fb
--- /dev/null
+++ b/parse.c
@@ -0,0 +1,182 @@
+#include "git-compat-util.h"
+#include "gettext.h"
+#include "parse.h"
+
+static uintmax_t get_unit_factor(const char *end)
+{
+	if (!*end)
+		return 1;
+	else if (!strcasecmp(end, "k"))
+		return 1024;
+	else if (!strcasecmp(end, "m"))
+		return 1024 * 1024;
+	else if (!strcasecmp(end, "g"))
+		return 1024 * 1024 * 1024;
+	return 0;
+}
+
+int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
+{
+	if (value && *value) {
+		char *end;
+		intmax_t val;
+		intmax_t factor;
+
+		if (max < 0)
+			BUG("max must be a positive integer");
+
+		errno = 0;
+		val = strtoimax(value, &end, 0);
+		if (errno == ERANGE)
+			return 0;
+		if (end == value) {
+			errno = EINVAL;
+			return 0;
+		}
+		factor = get_unit_factor(end);
+		if (!factor) {
+			errno = EINVAL;
+			return 0;
+		}
+		if ((val < 0 && -max / factor > val) ||
+		    (val > 0 && max / factor < val)) {
+			errno = ERANGE;
+			return 0;
+		}
+		val *= factor;
+		*ret = val;
+		return 1;
+	}
+	errno = EINVAL;
+	return 0;
+}
+
+static int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
+{
+	if (value && *value) {
+		char *end;
+		uintmax_t val;
+		uintmax_t factor;
+
+		/* negative values would be accepted by strtoumax */
+		if (strchr(value, '-')) {
+			errno = EINVAL;
+			return 0;
+		}
+		errno = 0;
+		val = strtoumax(value, &end, 0);
+		if (errno == ERANGE)
+			return 0;
+		if (end == value) {
+			errno = EINVAL;
+			return 0;
+		}
+		factor = get_unit_factor(end);
+		if (!factor) {
+			errno = EINVAL;
+			return 0;
+		}
+		if (unsigned_mult_overflows(factor, val) ||
+		    factor * val > max) {
+			errno = ERANGE;
+			return 0;
+		}
+		val *= factor;
+		*ret = val;
+		return 1;
+	}
+	errno = EINVAL;
+	return 0;
+}
+
+int git_parse_int(const char *value, int *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_int64(const char *value, int64_t *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int64_t)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_ulong(const char *value, unsigned long *ret)
+{
+	uintmax_t tmp;
+	if (!git_parse_unsigned(value, &tmp, maximum_unsigned_value_of_type(long)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_ssize_t(const char *value, ssize_t *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_maybe_bool_text(const char *value)
+{
+	if (!value)
+		return 1;
+	if (!*value)
+		return 0;
+	if (!strcasecmp(value, "true")
+	    || !strcasecmp(value, "yes")
+	    || !strcasecmp(value, "on"))
+		return 1;
+	if (!strcasecmp(value, "false")
+	    || !strcasecmp(value, "no")
+	    || !strcasecmp(value, "off"))
+		return 0;
+	return -1;
+}
+
+int git_parse_maybe_bool(const char *value)
+{
+	int v = git_parse_maybe_bool_text(value);
+	if (0 <= v)
+		return v;
+	if (git_parse_int(value, &v))
+		return !!v;
+	return -1;
+}
+
+/*
+ * Parse environment variable 'k' as a boolean (in various
+ * possible spellings); if missing, use the default value 'def'.
+ */
+int git_env_bool(const char *k, int def)
+{
+	const char *v = getenv(k);
+	int val;
+	if (!v)
+		return def;
+	val = git_parse_maybe_bool(v);
+	if (val < 0)
+		die(_("bad boolean environment value '%s' for '%s'"),
+		    v, k);
+	return val;
+}
+
+/*
+ * Parse environment variable 'k' as ulong with possibly a unit
+ * suffix; if missing, use the default value 'val'.
+ */
+unsigned long git_env_ulong(const char *k, unsigned long val)
+{
+	const char *v = getenv(k);
+	if (v && !git_parse_ulong(v, &val))
+		die(_("failed to parse %s"), k);
+	return val;
+}
diff --git a/parse.h b/parse.h
new file mode 100644
index 0000000000..07d2193d69
--- /dev/null
+++ b/parse.h
@@ -0,0 +1,20 @@
+#ifndef PARSE_H
+#define PARSE_H
+
+int git_parse_signed(const char *value, intmax_t *ret, intmax_t max);
+int git_parse_ssize_t(const char *, ssize_t *);
+int git_parse_ulong(const char *, unsigned long *);
+int git_parse_int(const char *value, int *ret);
+int git_parse_int64(const char *value, int64_t *ret);
+
+/**
+ * Same as `git_config_bool`, except that it returns -1 on error rather
+ * than dying.
+ */
+int git_parse_maybe_bool(const char *);
+int git_parse_maybe_bool_text(const char *value);
+
+int git_env_bool(const char *, int);
+unsigned long git_env_ulong(const char *, unsigned long);
+
+#endif /* PARSE_H */
diff --git a/pathspec.c b/pathspec.c
index 3a3a5724c4..7f88f1c02b 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "dir.h"
 #include "environment.h"
 #include "gettext.h"
diff --git a/preload-index.c b/preload-index.c
index e44530c80c..63fd35d64b 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -7,7 +7,7 @@
 #include "environment.h"
 #include "fsmonitor.h"
 #include "gettext.h"
-#include "config.h"
+#include "parse.h"
 #include "preload-index.h"
 #include "progress.h"
 #include "read-cache.h"
diff --git a/progress.c b/progress.c
index f695798aca..c83cb60bf1 100644
--- a/progress.c
+++ b/progress.c
@@ -17,7 +17,7 @@
 #include "trace.h"
 #include "trace2.h"
 #include "utf8.h"
-#include "config.h"
+#include "parse.h"
 
 #define TP_IDX_MAX      8
 
diff --git a/prompt.c b/prompt.c
index 3baa33f63d..8935fe4dfb 100644
--- a/prompt.c
+++ b/prompt.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "environment.h"
 #include "run-command.h"
 #include "strbuf.h"
diff --git a/rebase.c b/rebase.c
index 17a570f1ff..69a1822da3 100644
--- a/rebase.c
+++ b/rebase.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "rebase.h"
-#include "config.h"
+#include "parse.h"
 #include "gettext.h"
 
 /*
diff --git a/t/helper/test-env-helper.c b/t/helper/test-env-helper.c
index 66c88b8ff3..1c486888a4 100644
--- a/t/helper/test-env-helper.c
+++ b/t/helper/test-env-helper.c
@@ -1,5 +1,5 @@
 #include "test-tool.h"
-#include "config.h"
+#include "parse.h"
 #include "parse-options.h"
 
 static char const * const env__helper_usage[] = {
diff --git a/unpack-trees.c b/unpack-trees.c
index 87517364dc..761562a96e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -2,7 +2,7 @@
 #include "advice.h"
 #include "strvec.h"
 #include "repository.h"
-#include "config.h"
+#include "parse.h"
 #include "dir.h"
 #include "environment.h"
 #include "gettext.h"
diff --git a/wrapper.c b/wrapper.c
index 453a20ed99..7da15a56da 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -3,7 +3,7 @@
  */
 #include "git-compat-util.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "gettext.h"
 #include "repository.h"
 #include "strbuf.h"
diff --git a/write-or-die.c b/write-or-die.c
index d8355c0c3e..42a2dc73cd 100644
--- a/write-or-die.c
+++ b/write-or-die.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "run-command.h"
 #include "write-or-die.h"
 
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 5/6] git-std-lib: introduce git standard library
  2023-09-08 17:41     ` [PATCH v3 0/6] " Calvin Wan
                         ` (3 preceding siblings ...)
  2023-09-08 17:44       ` [PATCH v3 4/6] parse: create new library for parsing strings and env values Calvin Wan
@ 2023-09-08 17:44       ` Calvin Wan
  2023-09-11 13:22         ` Phillip Wood
  2023-09-15 18:39         ` Jonathan Tan
  2023-09-08 17:44       ` [PATCH v3 6/6] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
  2023-09-08 20:36       ` [PATCH v3 0/6] Introduce Git Standard Library Junio C Hamano
  6 siblings, 2 replies; 70+ messages in thread
From: Calvin Wan @ 2023-09-08 17:44 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

The Git Standard Library intends to serve as the foundational library
and root dependency that other libraries in Git will be built off of.
That is to say, suppose we have libraries X and Y; a user that wants to
use X and Y would need to include X, Y, and this Git Standard Library.

Add Documentation/technical/git-std-lib.txt to further explain the
design and rationale.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
---
 Documentation/technical/git-std-lib.txt | 191 ++++++++++++++++++++++++
 Makefile                                |  39 ++++-
 git-compat-util.h                       |   7 +-
 stubs/pager.c                           |   6 +
 stubs/pager.h                           |   6 +
 stubs/trace2.c                          |  27 ++++
 symlinks.c                              |   2 +
 wrapper.c                               |   1 -
 8 files changed, 276 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/technical/git-std-lib.txt
 create mode 100644 stubs/pager.c
 create mode 100644 stubs/pager.h
 create mode 100644 stubs/trace2.c

diff --git a/Documentation/technical/git-std-lib.txt b/Documentation/technical/git-std-lib.txt
new file mode 100644
index 0000000000..397c1da8c8
--- /dev/null
+++ b/Documentation/technical/git-std-lib.txt
@@ -0,0 +1,191 @@
+Git Standard Library
+================
+
+The Git Standard Library intends to serve as the foundational library
+and root dependency that other libraries in Git will be built off of.
+That is to say, suppose we have libraries X and Y; a user that wants to
+use X and Y would need to include X, Y, and this Git Standard Library.
+This does not mean that the Git Standard Library will be the only
+possible root dependency in the future, but rather the most significant
+and widely used one.
+
+Dependency graph in libified Git
+================
+
+If you look in the Git Makefile, all of the objects defined in the Git
+library are compiled and archived into a singular file, libgit.a, which
+is linked against by common-main.o with other external dependencies and
+turned into the Git executable. In other words, the Git executable has
+dependencies on libgit.a and a couple of external libraries. The
+libfication of Git will not affect this current build flow, but instead
+will provide an alternate method for building Git.
+
+With our current method of building Git, we can imagine the dependency
+graph as such:
+
+        Git
+         /\
+        /  \
+       /    \
+  libgit.a   ext deps
+
+In libifying parts of Git, we want to shrink the dependency graph to
+only the minimal set of dependencies, so libraries should not use
+libgit.a. Instead, it would look like:
+
+                Git
+                /\
+               /  \
+              /    \
+          libgit.a  ext deps
+             /\
+            /  \
+           /    \
+object-store.a  (other lib)
+      |        /
+      |       /
+      |      /
+ config.a   / 
+      |    /
+      |   /
+      |  /
+git-std-lib.a
+
+Instead of containing all of the objects in Git, libgit.a would contain
+objects that are not built by libraries it links against. Consequently,
+if someone wanted their own custom build of Git with their own custom
+implementation of the object store, they would only have to swap out
+object-store.a rather than do a hard fork of Git.
+
+Rationale behind Git Standard Library
+================
+
+The rationale behind what's in and what's not in the Git Standard
+Library essentially is the result of two observations within the Git
+codebase: every file includes git-compat-util.h which defines functions
+in a couple of different files, and wrapper.c + usage.c have
+difficult-to-separate circular dependencies with each other and other
+files.
+
+Ubiquity of git-compat-util.h and circular dependencies
+========
+
+Every file in the Git codebase includes git-compat-util.h. It serves as
+"a compatibility aid that isolates the knowledge of platform specific
+inclusion order and what feature macros to define before including which
+system header" (Junio[1]). Since every file includes git-compat-util.h, and
+git-compat-util.h includes wrapper.h and usage.h, it would make sense
+for wrapper.c and usage.c to be a part of the root library. They have
+difficult to separate circular dependencies with each other so they
+can't be independent libraries. Wrapper.c has dependencies on parse.c,
+abspath.c, strbuf.c, which in turn also have dependencies on usage.c and
+wrapper.c -- more circular dependencies. 
+
+Tradeoff between swappability and refactoring
+========
+
+From the above dependency graph, we can see that git-std-lib.a could be
+many smaller libraries rather than a singular library. So why choose a
+singular library when multiple libraries can be individually easier to
+swap and are more modular? A singular library requires less work to
+separate out circular dependencies within itself so it becomes a
+tradeoff question between work and reward. While there may be a point in
+the future where a file like usage.c would want its own library so that
+someone can have custom die() or error(), the work required to refactor
+out the circular dependencies in some files would be enormous due to
+their ubiquity so therefore I believe it is not worth the tradeoff
+currently. Additionally, we can in the future choose to do this refactor
+and change the API for the library if there becomes enough of a reason
+to do so (remember we are avoiding promising stability of the interfaces
+of those libraries).
+
+Reuse of compatibility functions in git-compat-util.h
+========
+
+Most functions defined in git-compat-util.h are implemented in compat/
+and have dependencies limited to strbuf.h and wrapper.h so they can be
+easily included in git-std-lib.a, which as a root dependency means that
+higher level libraries do not have to worry about compatibility files in
+compat/. The rest of the functions defined in git-compat-util.h are
+implemented in top level files and are hidden behind
+an #ifdef if their implementation is not in git-std-lib.a.
+
+Rationale summary
+========
+
+The Git Standard Library allows us to get the libification ball rolling
+with other libraries in Git. By not spending many
+more months attempting to refactor difficult circular dependencies and
+instead spending that time getting to a state where we can test out
+swapping a library out such as config or object store, we can prove the
+viability of Git libification on a much faster time scale. Additionally
+the code cleanups that have happened so far have been minor and
+beneficial for the codebase. It is probable that making large movements
+would negatively affect code clarity.
+
+Git Standard Library boundary
+================
+
+While I have described above some useful heuristics for identifying
+potential candidates for git-std-lib.a, a standard library should not
+have a shaky definition for what belongs in it.
+
+ - Low-level files (aka operates only on other primitive types) that are
+   used everywhere within the codebase (wrapper.c, usage.c, strbuf.c)
+   - Dependencies that are low-level and widely used
+     (abspath.c, date.c, hex-ll.c, parse.c, utf8.c)
+ - low-level git/* files with functions defined in git-compat-util.h
+   (ctype.c)
+ - compat/*
+ - stubbed out dependencies in stubs/ (stubs/pager.c, stubs/trace2.c)
+
+There are other files that might fit this definition, but that does not
+mean it should belong in git-std-lib.a. Those files should start as
+their own separate library since any file added to git-std-lib.a loses
+its flexibility of being easily swappable.
+
+Wrapper.c and usage.c have dependencies on pager and trace2 that are
+possible to remove at the cost of sacrificing the ability for standard Git
+to be able to trace functions in those files and other files in git-std-lib.a.
+In order for git-std-lib.a to compile with those dependencies, stubbed out
+versions of those files are implemented and swapped in during compilation time.
+
+Files inside of Git Standard Library
+================
+
+The initial set of files in git-std-lib.a are:
+abspath.c
+ctype.c
+date.c
+hex-ll.c
+parse.c
+strbuf.c
+usage.c
+utf8.c
+wrapper.c
+relevant compat/ files
+
+When these files are compiled together with the following files (or
+user-provided files that provide the same functions), they form a
+complete library:
+stubs/pager.c
+stubs/trace2.c
+
+Pitfalls
+================
+
+There are a small amount of files under compat/* that have dependencies
+not inside of git-std-lib.a. While those functions are not called on
+Linux, other OSes might call those problematic functions. I don't see
+this as a major problem, just moreso an observation that libification in
+general may also require some minor compatibility work in the future.
+
+Testing
+================
+
+Unit tests should catch any breakages caused by changes to files in
+git-std-lib.a (i.e. introduction of a out of scope dependency) and new
+functions introduced to git-std-lib.a will require unit tests written
+for them.
+
+[1] https://lore.kernel.org/git/xmqqwn17sydw.fsf@gitster.g/
\ No newline at end of file
diff --git a/Makefile b/Makefile
index 9226c719a0..0a2d1ae3cc 100644
--- a/Makefile
+++ b/Makefile
@@ -669,6 +669,7 @@ FUZZ_PROGRAMS =
 GIT_OBJS =
 LIB_OBJS =
 SCALAR_OBJS =
+STUB_OBJS =
 OBJECTS =
 OTHER_PROGRAMS =
 PROGRAM_OBJS =
@@ -956,6 +957,7 @@ COCCI_SOURCES = $(filter-out $(THIRD_PARTY_SOURCES),$(FOUND_C_SOURCES))
 
 LIB_H = $(FOUND_H_SOURCES)
 
+ifndef GIT_STD_LIB
 LIB_OBJS += abspath.o
 LIB_OBJS += add-interactive.o
 LIB_OBJS += add-patch.o
@@ -1196,6 +1198,27 @@ LIB_OBJS += write-or-die.o
 LIB_OBJS += ws.o
 LIB_OBJS += wt-status.o
 LIB_OBJS += xdiff-interface.o
+else ifdef GIT_STD_LIB
+LIB_OBJS += abspath.o
+LIB_OBJS += ctype.o
+LIB_OBJS += date.o
+LIB_OBJS += hex-ll.o
+LIB_OBJS += parse.o
+LIB_OBJS += strbuf.o
+LIB_OBJS += usage.o
+LIB_OBJS += utf8.o
+LIB_OBJS += wrapper.o
+
+ifdef STUB_TRACE2
+STUB_OBJS += stubs/trace2.o
+endif
+
+ifdef STUB_PAGER
+STUB_OBJS += stubs/pager.o
+endif
+
+LIB_OBJS += $(STUB_OBJS)
+endif
 
 BUILTIN_OBJS += builtin/add.o
 BUILTIN_OBJS += builtin/am.o
@@ -2162,6 +2185,11 @@ ifdef FSMONITOR_OS_SETTINGS
 	COMPAT_OBJS += compat/fsmonitor/fsm-path-utils-$(FSMONITOR_OS_SETTINGS).o
 endif
 
+ifdef GIT_STD_LIB
+	BASIC_CFLAGS += -DGIT_STD_LIB
+	BASIC_CFLAGS += -DNO_GETTEXT
+endif
+
 ifeq ($(TCLTK_PATH),)
 NO_TCLTK = NoThanks
 endif
@@ -3668,7 +3696,7 @@ clean: profile-clean coverage-clean cocciclean
 	$(RM) git.res
 	$(RM) $(OBJECTS)
 	$(RM) headless-git.o
-	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
+	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB) $(STD_LIB_FILE)
 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) $(OTHER_PROGRAMS)
 	$(RM) $(TEST_PROGRAMS)
 	$(RM) $(FUZZ_PROGRAMS)
@@ -3849,3 +3877,12 @@ $(FUZZ_PROGRAMS): all
 		$(XDIFF_OBJS) $(EXTLIBS) git.o $@.o $(LIB_FUZZING_ENGINE) -o $@
 
 fuzz-all: $(FUZZ_PROGRAMS)
+
+### Libified Git rules
+
+# git-std-lib
+# `make git-std-lib.a GIT_STD_LIB=YesPlease STUB_TRACE2=YesPlease STUB_PAGER=YesPlease`
+STD_LIB = git-std-lib.a
+
+$(STD_LIB): $(LIB_OBJS) $(COMPAT_OBJS) $(STUB_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
diff --git a/git-compat-util.h b/git-compat-util.h
index 3e7a59b5ff..14bf71c530 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -455,8 +455,8 @@ static inline int noop_core_config(const char *var UNUSED,
 #define platform_core_config noop_core_config
 #endif
 
+#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
 int lstat_cache_aware_rmdir(const char *path);
-#if !defined(__MINGW32__) && !defined(_MSC_VER)
 #define rmdir lstat_cache_aware_rmdir
 #endif
 
@@ -966,9 +966,11 @@ const char *inet_ntop(int af, const void *src, char *dst, size_t size);
 #endif
 
 #ifdef NO_PTHREADS
+#ifdef GIT_STD_LIB
 #define atexit git_atexit
 int git_atexit(void (*handler)(void));
 #endif
+#endif
 
 static inline size_t st_add(size_t a, size_t b)
 {
@@ -1462,14 +1464,17 @@ static inline int is_missing_file_error(int errno_)
 	return (errno_ == ENOENT || errno_ == ENOTDIR);
 }
 
+#ifndef GIT_STD_LIB
 int cmd_main(int, const char **);
 
 /*
  * Intercept all calls to exit() and route them to trace2 to
  * optionally emit a message before calling the real exit().
  */
+
 int common_exit(const char *file, int line, int code);
 #define exit(code) exit(common_exit(__FILE__, __LINE__, (code)))
+#endif
 
 /*
  * You can mark a stack variable with UNLEAK(var) to avoid it being
diff --git a/stubs/pager.c b/stubs/pager.c
new file mode 100644
index 0000000000..4f575cada7
--- /dev/null
+++ b/stubs/pager.c
@@ -0,0 +1,6 @@
+#include "pager.h"
+
+int pager_in_use(void)
+{
+	return 0;
+}
diff --git a/stubs/pager.h b/stubs/pager.h
new file mode 100644
index 0000000000..b797910881
--- /dev/null
+++ b/stubs/pager.h
@@ -0,0 +1,6 @@
+#ifndef PAGER_H
+#define PAGER_H
+
+int pager_in_use(void);
+
+#endif /* PAGER_H */
diff --git a/stubs/trace2.c b/stubs/trace2.c
new file mode 100644
index 0000000000..7d89482228
--- /dev/null
+++ b/stubs/trace2.c
@@ -0,0 +1,27 @@
+#include "git-compat-util.h"
+#include "trace2.h"
+
+struct child_process { int stub; };
+struct repository { int stub; };
+struct json_writer { int stub; };
+
+void trace2_region_enter_fl(const char *file, int line, const char *category,
+			    const char *label, const struct repository *repo, ...) { }
+void trace2_region_leave_fl(const char *file, int line, const char *category,
+			    const char *label, const struct repository *repo, ...) { }
+void trace2_data_string_fl(const char *file, int line, const char *category,
+			   const struct repository *repo, const char *key,
+			   const char *value) { }
+void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names) { }
+void trace2_cmd_error_va_fl(const char *file, int line, const char *fmt,
+			    va_list ap) { }
+void trace2_cmd_name_fl(const char *file, int line, const char *name) { }
+void trace2_thread_start_fl(const char *file, int line,
+			    const char *thread_base_name) { }
+void trace2_thread_exit_fl(const char *file, int line) { }
+void trace2_data_intmax_fl(const char *file, int line, const char *category,
+			   const struct repository *repo, const char *key,
+			   intmax_t value) { }
+int trace2_is_enabled(void) { return 0; }
+void trace2_counter_add(enum trace2_counter_id cid, uint64_t value) { }
+void trace2_collect_process_info(enum trace2_process_info_reason reason) { }
diff --git a/symlinks.c b/symlinks.c
index b29e340c2d..bced721a0c 100644
--- a/symlinks.c
+++ b/symlinks.c
@@ -337,6 +337,7 @@ void invalidate_lstat_cache(void)
 	reset_lstat_cache(&default_cache);
 }
 
+#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
 #undef rmdir
 int lstat_cache_aware_rmdir(const char *path)
 {
@@ -348,3 +349,4 @@ int lstat_cache_aware_rmdir(const char *path)
 
 	return ret;
 }
+#endif
diff --git a/wrapper.c b/wrapper.c
index 7da15a56da..eeac3741cf 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -5,7 +5,6 @@
 #include "abspath.h"
 #include "parse.h"
 #include "gettext.h"
-#include "repository.h"
 #include "strbuf.h"
 #include "trace2.h"
 
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 6/6] git-std-lib: add test file to call git-std-lib.a functions
  2023-09-08 17:41     ` [PATCH v3 0/6] " Calvin Wan
                         ` (4 preceding siblings ...)
  2023-09-08 17:44       ` [PATCH v3 5/6] git-std-lib: introduce git standard library Calvin Wan
@ 2023-09-08 17:44       ` Calvin Wan
  2023-09-09  5:26         ` Junio C Hamano
  2023-09-15 18:43         ` Jonathan Tan
  2023-09-08 20:36       ` [PATCH v3 0/6] Introduce Git Standard Library Junio C Hamano
  6 siblings, 2 replies; 70+ messages in thread
From: Calvin Wan @ 2023-09-08 17:44 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Add test file that directly or indirectly calls all functions defined in
git-std-lib.a object files to showcase that they do not reference
missing objects and that git-std-lib.a can stand on its own.

Certain functions that cause the program to exit or are already called
by other functions are commented out.

TODO: replace with unit tests
Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 t/Makefile      |   4 +
 t/stdlib-test.c | 231 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 235 insertions(+)
 create mode 100644 t/stdlib-test.c

diff --git a/t/Makefile b/t/Makefile
index 3e00cdd801..b6d0bc9daa 100644
--- a/t/Makefile
+++ b/t/Makefile
@@ -150,3 +150,7 @@ perf:
 
 .PHONY: pre-clean $(T) aggregate-results clean valgrind perf \
 	check-chainlint clean-chainlint test-chainlint
+
+test-git-std-lib:
+	cc -It -o stdlib-test stdlib-test.c -L. -l:../git-std-lib.a
+	./stdlib-test
diff --git a/t/stdlib-test.c b/t/stdlib-test.c
new file mode 100644
index 0000000000..76fed9ecbf
--- /dev/null
+++ b/t/stdlib-test.c
@@ -0,0 +1,231 @@
+#include "../git-compat-util.h"
+#include "../abspath.h"
+#include "../hex-ll.h"
+#include "../parse.h"
+#include "../strbuf.h"
+#include "../string-list.h"
+
+/*
+ * Calls all functions from git-std-lib
+ * Some inline/trivial functions are skipped
+ */
+
+void abspath_funcs(void) {
+	struct strbuf sb = STRBUF_INIT;
+
+	fprintf(stderr, "calling abspath functions\n");
+	is_directory("foo");
+	strbuf_realpath(&sb, "foo", 0);
+	strbuf_realpath_forgiving(&sb, "foo", 0);
+	real_pathdup("foo", 0);
+	absolute_path("foo");
+	absolute_pathdup("foo");
+	prefix_filename("foo/", "bar");
+	prefix_filename_except_for_dash("foo/", "bar");
+	is_absolute_path("foo");
+	strbuf_add_absolute_path(&sb, "foo");
+	strbuf_add_real_path(&sb, "foo");
+}
+
+void hex_ll_funcs(void) {
+	unsigned char c;
+
+	fprintf(stderr, "calling hex-ll functions\n");
+
+	hexval('c');
+	hex2chr("A1");
+	hex_to_bytes(&c, "A1", 2);
+}
+
+void parse_funcs(void) {
+	intmax_t foo;
+	ssize_t foo1 = -1;
+	unsigned long foo2;
+	int foo3;
+	int64_t foo4;
+
+	fprintf(stderr, "calling parse functions\n");
+
+	git_parse_signed("42", &foo, maximum_signed_value_of_type(int));
+	git_parse_ssize_t("42", &foo1);
+	git_parse_ulong("42", &foo2);
+	git_parse_int("42", &foo3);
+	git_parse_int64("42", &foo4);
+	git_parse_maybe_bool("foo");
+	git_parse_maybe_bool_text("foo");
+	git_env_bool("foo", 1);
+	git_env_ulong("foo", 1);
+}
+
+static int allow_unencoded_fn(char ch) {
+	return 0;
+}
+
+void strbuf_funcs(void) {
+	struct strbuf *sb = xmalloc(sizeof(void*));
+	struct strbuf *sb2 = xmalloc(sizeof(void*));
+	struct strbuf sb3 = STRBUF_INIT;
+	struct string_list list = STRING_LIST_INIT_NODUP;
+	char *buf = "foo";
+	int fd = open("/dev/null", O_RDONLY);
+
+	fprintf(stderr, "calling strbuf functions\n");
+
+	starts_with("foo", "bar");
+	istarts_with("foo", "bar");
+	// skip_to_optional_arg_default(const char *str, const char *prefix,
+	// 			 const char **arg, const char *def)
+	strbuf_init(sb, 0);
+	strbuf_init(sb2, 0);
+	strbuf_release(sb);
+	strbuf_attach(sb, strbuf_detach(sb, NULL), 0, 0); // calls strbuf_grow
+	strbuf_swap(sb, sb2);
+	strbuf_setlen(sb, 0);
+	strbuf_trim(sb); // calls strbuf_rtrim, strbuf_ltrim
+	// strbuf_rtrim() called by strbuf_trim()
+	// strbuf_ltrim() called by strbuf_trim()
+	strbuf_trim_trailing_dir_sep(sb);
+	strbuf_trim_trailing_newline(sb);
+	strbuf_reencode(sb, "foo", "bar");
+	strbuf_tolower(sb);
+	strbuf_add_separated_string_list(sb, " ", &list);
+	strbuf_list_free(strbuf_split_buf("foo bar", 8, ' ', -1));
+	strbuf_cmp(sb, sb2);
+	strbuf_addch(sb, 1);
+	strbuf_splice(sb, 0, 1, "foo", 3);
+	strbuf_insert(sb, 0, "foo", 3);
+	// strbuf_vinsertf() called by strbuf_insertf
+	strbuf_insertf(sb, 0, "%s", "foo");
+	strbuf_remove(sb, 0, 1);
+	strbuf_add(sb, "foo", 3);
+	strbuf_addbuf(sb, sb2);
+	strbuf_join_argv(sb, 0, NULL, ' ');
+	strbuf_addchars(sb, 1, 1);
+	strbuf_addf(sb, "%s", "foo");
+	strbuf_add_commented_lines(sb, "foo", 3, '#');
+	strbuf_commented_addf(sb, '#', "%s", "foo");
+	// strbuf_vaddf() called by strbuf_addf()
+	strbuf_addbuf_percentquote(sb, &sb3);
+	strbuf_add_percentencode(sb, "foo", STRBUF_ENCODE_SLASH);
+	strbuf_fread(sb, 0, stdin);
+	strbuf_read(sb, fd, 0);
+	strbuf_read_once(sb, fd, 0);
+	strbuf_write(sb, stderr);
+	strbuf_readlink(sb, "/dev/null", 0);
+	strbuf_getcwd(sb);
+	strbuf_getwholeline(sb, stderr, '\n');
+	strbuf_appendwholeline(sb, stderr, '\n');
+	strbuf_getline(sb, stderr);
+	strbuf_getline_lf(sb, stderr);
+	strbuf_getline_nul(sb, stderr);
+	strbuf_getwholeline_fd(sb, fd, '\n');
+	strbuf_read_file(sb, "/dev/null", 0);
+	strbuf_add_lines(sb, "foo", "bar", 0);
+	strbuf_addstr_xml_quoted(sb, "foo");
+	strbuf_addstr_urlencode(sb, "foo", allow_unencoded_fn);
+	strbuf_humanise_bytes(sb, 42);
+	strbuf_humanise_rate(sb, 42);
+	printf_ln("%s", sb);
+	fprintf_ln(stderr, "%s", sb);
+	xstrdup_tolower("foo");
+	xstrdup_toupper("foo");
+	// xstrvfmt() called by xstrfmt()
+	xstrfmt("%s", "foo");
+	// strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
+	// 	     int tz_offset, int suppress_tz_name)
+	// strbuf_stripspace(struct strbuf *sb, char comment_line_char)
+	// strbuf_strip_suffix(struct strbuf *sb, const char *suffix)
+	// strbuf_strip_file_from_path(struct strbuf *sb)
+}
+
+static void error_builtin(const char *err, va_list params) {}
+static void warn_builtin(const char *err, va_list params) {}
+
+static report_fn error_routine = error_builtin;
+static report_fn warn_routine = warn_builtin;
+
+void usage_funcs(void) {
+	fprintf(stderr, "calling usage functions\n");
+	// Functions that call exit() are commented out
+
+	// usage()
+	// usagef()
+	// die()
+	// die_errno();
+	error("foo");
+	error_errno("foo");
+	die_message("foo");
+	die_message_errno("foo");
+	warning("foo");
+	warning_errno("foo");
+
+	// set_die_routine();
+	get_die_message_routine();
+	set_error_routine(error_builtin);
+	get_error_routine();
+	set_warn_routine(warn_builtin);
+	get_warn_routine();
+	// set_die_is_recursing_routine();
+}
+
+void wrapper_funcs(void) {
+	void *ptr = xmalloc(1);
+	int fd = open("/dev/null", O_RDONLY);
+	struct strbuf sb = STRBUF_INIT;
+	int mode = 0444;
+	char host[PATH_MAX], path[PATH_MAX], path1[PATH_MAX];
+	xsnprintf(path, sizeof(path), "out-XXXXXX");
+	xsnprintf(path1, sizeof(path1), "out-XXXXXX");
+	int tmp;
+
+	fprintf(stderr, "calling wrapper functions\n");
+
+	xstrdup("foo");
+	xmalloc(1);
+	xmallocz(1);
+	xmallocz_gently(1);
+	xmemdupz("foo", 3);
+	xstrndup("foo", 3);
+	xrealloc(ptr, 2);
+	xcalloc(1, 1);
+	xsetenv("foo", "bar", 0);
+	xopen("/dev/null", O_RDONLY);
+	xread(fd, &sb, 1);
+	xwrite(fd, &sb, 1);
+	xpread(fd, &sb, 1, 0);
+	xdup(fd);
+	xfopen("/dev/null", "r");
+	xfdopen(fd, "r");
+	tmp = xmkstemp(path);
+	close(tmp);
+	unlink(path);
+	tmp = xmkstemp_mode(path1, mode);
+	close(tmp);
+	unlink(path1);
+	xgetcwd();
+	fopen_for_writing(path);
+	fopen_or_warn(path, "r");
+	xstrncmpz("foo", "bar", 3);
+	// xsnprintf() called above
+	xgethostname(host, 3);
+	tmp = git_mkstemps_mode(path, 1, mode);
+	close(tmp);
+	unlink(path);
+	tmp = git_mkstemp_mode(path, mode);
+	close(tmp);
+	unlink(path);
+	read_in_full(fd, &sb, 1);
+	write_in_full(fd, &sb, 1);
+	pread_in_full(fd, &sb, 1, 0);
+}
+
+int main() {
+	abspath_funcs();
+	hex_ll_funcs();
+	parse_funcs();
+	strbuf_funcs();
+	usage_funcs();
+	wrapper_funcs();
+	fprintf(stderr, "all git-std-lib functions finished calling\n");
+	return 0;
+}
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 0/6] Introduce Git Standard Library
  2023-09-08 17:41     ` [PATCH v3 0/6] " Calvin Wan
                         ` (5 preceding siblings ...)
  2023-09-08 17:44       ` [PATCH v3 6/6] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
@ 2023-09-08 20:36       ` Junio C Hamano
  2023-09-08 21:30         ` Junio C Hamano
  6 siblings, 1 reply; 70+ messages in thread
From: Junio C Hamano @ 2023-09-08 20:36 UTC (permalink / raw)
  To: Calvin Wan; +Cc: git, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:

> I have taken this series out of RFC since there weren't any significant
> concerns with the overall concept and design of this series. This reroll
> incorporates some smaller changes such as dropping the "push pager
> dependency" patch in favor of stubbing it out. The main change this
> reroll cleans up the Makefile rules and stubs, as suggested by
> Phillip Wood (appreciate the help on this one)!

What is your plan for the "config-parse" stuff?  The "create new library"
step in this series seem to aim for the same goal in a different ways.

> This series has been rebased onto 1fc548b2d6a: The sixth batch
>
> Originally this series was built on other patches that have since been
> merged, which is why the range-diff is shown removing many of them.

Good.  Previous rounds did not really attract much interest from the
public if I recall correctly.  Let's see how well this round fares.

>  Documentation/technical/git-std-lib.txt | 191 ++++++++++++++++++++

It is interesting to see that there is no "std.*lib\.c" in the set
of source files, or "std.*lib\.a" target in the Makefile.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 0/6] Introduce Git Standard Library
  2023-09-08 20:36       ` [PATCH v3 0/6] Introduce Git Standard Library Junio C Hamano
@ 2023-09-08 21:30         ` Junio C Hamano
  0 siblings, 0 replies; 70+ messages in thread
From: Junio C Hamano @ 2023-09-08 21:30 UTC (permalink / raw)
  To: Calvin Wan; +Cc: git, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Junio C Hamano <gitster@pobox.com> writes:

> Calvin Wan <calvinwan@google.com> writes:
>
>> I have taken this series out of RFC since there weren't any significant
>> concerns with the overall concept and design of this series. This reroll
>> incorporates some smaller changes such as dropping the "push pager
>> dependency" patch in favor of stubbing it out. The main change this
>> reroll cleans up the Makefile rules and stubs, as suggested by
>> Phillip Wood (appreciate the help on this one)!
>
> What is your plan for the "config-parse" stuff?  The "create new library"
> step in this series seem to aim for the same goal in a different ways.

Actually, this one is far less ambitious in touching "config"
subsystem, in that it only deals with parsing strings as values.
The other one knows how a config file is laid out, what key the
value we are about to read is expected for, etc., and it will
benefit by having the "parse" code separated out by this series, but
they are more or less orthogonal.

Queued.  Thanks.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 6/6] git-std-lib: add test file to call git-std-lib.a functions
  2023-09-08 17:44       ` [PATCH v3 6/6] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
@ 2023-09-09  5:26         ` Junio C Hamano
  2023-09-15 18:43         ` Jonathan Tan
  1 sibling, 0 replies; 70+ messages in thread
From: Junio C Hamano @ 2023-09-09  5:26 UTC (permalink / raw)
  To: Calvin Wan; +Cc: git, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:

> +
> +test-git-std-lib:
> +	cc -It -o stdlib-test stdlib-test.c -L. -l:../git-std-lib.a

Yuck, no.  Try to share as much with the main Makefile one level up.

> +	./stdlib-test
> diff --git a/t/stdlib-test.c b/t/stdlib-test.c
> new file mode 100644
> index 0000000000..76fed9ecbf
> --- /dev/null
> +++ b/t/stdlib-test.c
> @@ -0,0 +1,231 @@
> +#include "../git-compat-util.h"
> +#include "../abspath.h"
> +#include "../hex-ll.h"
> +#include "../parse.h"
> +#include "../strbuf.h"
> +#include "../string-list.h"

Use -I.. or something, to match what the main Makefile does, so that
you do not have to have these "../".  With -I.., you could even say

    #include <hex-ll.h>
    #include <parse.h>

etc.


> +	// skip_to_optional_arg_default(const char *str, const char *prefix,
> +	// 			 const char **arg, const char *def)

No // comments in this codebase, please.

> +	strbuf_addchars(sb, 1, 1);
> +	strbuf_addf(sb, "%s", "foo");

https://github.com/git/git/actions/runs/6126669144/job/16631124765#step:4:657


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 5/6] git-std-lib: introduce git standard library
  2023-09-08 17:44       ` [PATCH v3 5/6] git-std-lib: introduce git standard library Calvin Wan
@ 2023-09-11 13:22         ` Phillip Wood
  2023-09-15 18:39         ` Jonathan Tan
  1 sibling, 0 replies; 70+ messages in thread
From: Phillip Wood @ 2023-09-11 13:22 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: nasamuffin, jonathantanmy, linusa, vdye

Hi Calvin

On 08/09/2023 18:44, Calvin Wan wrote:
> +ifndef GIT_STD_LIB
>   LIB_OBJS += abspath.o
>   LIB_OBJS += add-interactive.o
>   LIB_OBJS += add-patch.o
> @@ -1196,6 +1198,27 @@ LIB_OBJS += write-or-die.o
>   LIB_OBJS += ws.o
>   LIB_OBJS += wt-status.o
>   LIB_OBJS += xdiff-interface.o
> +else ifdef GIT_STD_LIB
> +LIB_OBJS += abspath.o
> +LIB_OBJS += ctype.o
> +LIB_OBJS += date.o
> +LIB_OBJS += hex-ll.o
> +LIB_OBJS += parse.o
> +LIB_OBJS += strbuf.o
> +LIB_OBJS += usage.o
> +LIB_OBJS += utf8.o
> +LIB_OBJS += wrapper.o

It is still not clear to me how re-using LIB_OBJS like this is 
compatible with building libgit.a and git-stb-lib.a in a single make 
process c.f. [1].

> +ifdef GIT_STD_LIB
> +	BASIC_CFLAGS += -DGIT_STD_LIB
> +	BASIC_CFLAGS += -DNO_GETTEXT

As I've said before [2] I think that being able to built git-std-lib.a 
with gettext support is a prerequisite for using it to build git (just 
like trace2 support is). If we cannot build git using git-std-lib then 
the latter is likely to bit rot and so I don't think git-std-lib should 
be merged until there is a demonstration of building git using it.


> +### Libified Git rules
> +
> +# git-std-lib
> +# `make git-std-lib.a GIT_STD_LIB=YesPlease STUB_TRACE2=YesPlease STUB_PAGER=YesPlease`
> +STD_LIB = git-std-lib.a
> +
> +$(STD_LIB): $(LIB_OBJS) $(COMPAT_OBJS) $(STUB_OBJS)
> +	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^

This is much nicer that the previous version.

> diff --git a/git-compat-util.h b/git-compat-util.h
> index 3e7a59b5ff..14bf71c530 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -455,8 +455,8 @@ static inline int noop_core_config(const char *var UNUSED,
>   #define platform_core_config noop_core_config
>   #endif
>   
> +#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
>   int lstat_cache_aware_rmdir(const char *path);
> -#if !defined(__MINGW32__) && !defined(_MSC_VER)
>   #define rmdir lstat_cache_aware_rmdir
>   #endif

I thought we'd agreed that this represents a change in behavior that 
should be fixed c.f. [2]

> @@ -1462,14 +1464,17 @@ static inline int is_missing_file_error(int errno_)
>   	return (errno_ == ENOENT || errno_ == ENOTDIR);
>   }
>   
> +#ifndef GIT_STD_LIB
>   int cmd_main(int, const char **);
>   
>   /*
>    * Intercept all calls to exit() and route them to trace2 to
>    * optionally emit a message before calling the real exit().
>    */
> +

Nit: this blank line seems unnecessary

>   int common_exit(const char *file, int line, int code);
>   #define exit(code) exit(common_exit(__FILE__, __LINE__, (code)))
> +#endif
>   
>   /*
>    * You can mark a stack variable with UNLEAK(var) to avoid it being
> diff --git a/stubs/pager.c b/stubs/pager.c

> diff --git a/stubs/pager.h b/stubs/pager.h
> new file mode 100644
> index 0000000000..b797910881
> --- /dev/null
> +++ b/stubs/pager.h
> @@ -0,0 +1,6 @@
> +#ifndef PAGER_H
> +#define PAGER_H
> +
> +int pager_in_use(void);
> +
> +#endif /* PAGER_H */

Is this file actually used for anything? pager_in_use() is already 
declared in pager.h in the project root directory.

> diff --git a/wrapper.c b/wrapper.c
> index 7da15a56da..eeac3741cf 100644
> --- a/wrapper.c
> +++ b/wrapper.c
> @@ -5,7 +5,6 @@
>   #include "abspath.h"
>   #include "parse.h"
>   #include "gettext.h"
> -#include "repository.h"

It is probably worth splitting this change out with a commit message 
explaining why the include is unneeded.

This is looking good, it would be really nice to see a demonstration of 
building git using git-std-lib (with gettext support) in the next iteration.

Best Wishes

Phillip


[1] 
https://lore.kernel.org/git/a0f04bd7-3a1e-b303-fd52-eee2af4d38b3@gmail.com/
[2] 
https://lore.kernel.org/git/CAFySSZBMng9nEdCkuT5+fc6rfFgaFfU2E0NP3=jUQC1yRcUE6Q@mail.gmail.com/

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 2/6] wrapper: remove dependency to Git-specific internal file
  2023-09-08 17:44       ` [PATCH v3 2/6] wrapper: remove dependency to Git-specific internal file Calvin Wan
@ 2023-09-15 17:54         ` Jonathan Tan
  0 siblings, 0 replies; 70+ messages in thread
From: Jonathan Tan @ 2023-09-15 17:54 UTC (permalink / raw)
  To: Calvin Wan; +Cc: Jonathan Tan, git, nasamuffin, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:
> In order for wrapper.c to be built independently as part of a smaller
> library, it cannot have dependencies to other Git specific
> internals. remove_or_warn() creates an unnecessary dependency to
> object.h in wrapper.c. Therefore move the function to entry.[ch] which
> performs changes on the worktree based on the Git-specific file modes in
> the index.

Looking at remove_or_warn(), it's only used from entry.c and apply.c
(which already includes entry.h for another reason) so moving it to
entry.c looks fine.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 5/6] git-std-lib: introduce git standard library
  2023-09-08 17:44       ` [PATCH v3 5/6] git-std-lib: introduce git standard library Calvin Wan
  2023-09-11 13:22         ` Phillip Wood
@ 2023-09-15 18:39         ` Jonathan Tan
  1 sibling, 0 replies; 70+ messages in thread
From: Jonathan Tan @ 2023-09-15 18:39 UTC (permalink / raw)
  To: Calvin Wan; +Cc: Jonathan Tan, git, nasamuffin, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:
> diff --git a/Makefile b/Makefile
> index 9226c719a0..0a2d1ae3cc 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -669,6 +669,7 @@ FUZZ_PROGRAMS =
>  GIT_OBJS =
>  LIB_OBJS =
>  SCALAR_OBJS =
> +STUB_OBJS =
>  OBJECTS =
>  OTHER_PROGRAMS =
>  PROGRAM_OBJS =

I don't think stubs should be compiled into git-std-lib.a - I would
expect a consumer of this library to be able to specify their own
implementations if needed (e.g. their own trace2).

> @@ -956,6 +957,7 @@ COCCI_SOURCES = $(filter-out $(THIRD_PARTY_SOURCES),$(FOUND_C_SOURCES))
>  
>  LIB_H = $(FOUND_H_SOURCES)
>  
> +ifndef GIT_STD_LIB
>  LIB_OBJS += abspath.o
>  LIB_OBJS += add-interactive.o
>  LIB_OBJS += add-patch.o
> @@ -1196,6 +1198,27 @@ LIB_OBJS += write-or-die.o
>  LIB_OBJS += ws.o
>  LIB_OBJS += wt-status.o
>  LIB_OBJS += xdiff-interface.o
> +else ifdef GIT_STD_LIB
> +LIB_OBJS += abspath.o
> +LIB_OBJS += ctype.o
> +LIB_OBJS += date.o
> +LIB_OBJS += hex-ll.o
> +LIB_OBJS += parse.o
> +LIB_OBJS += strbuf.o
> +LIB_OBJS += usage.o
> +LIB_OBJS += utf8.o
> +LIB_OBJS += wrapper.o

This means that LIB_OBJS (in this patch, used both by git-std-lib and
as part of compiling the regular Git binary) can differ based on the
GIT_STD_LIB variable. It does seem that we cannot avoid GIT_STD_LIB
for now, because the git-std-lib can only be compiled without GETTEXT
(so we need a variable to make sure that none of these .o files are
compiled with GETTEXT), but we should still minimize the changes between
compiling with GIT_STD_LIB and without it, at least to minimize future
work. Could we have two separate lists? So, leave LIB_OBJS alone and
make a new STD_LIB_OBJS.

> diff --git a/git-compat-util.h b/git-compat-util.h
> index 3e7a59b5ff..14bf71c530 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -455,8 +455,8 @@ static inline int noop_core_config(const char *var UNUSED,
>  #define platform_core_config noop_core_config
>  #endif
>  
> +#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
>  int lstat_cache_aware_rmdir(const char *path);
> -#if !defined(__MINGW32__) && !defined(_MSC_VER)
>  #define rmdir lstat_cache_aware_rmdir
>  #endif

I think we still want to keep the idea of "the code should still be good
even if we have no use for git-std-lib" as much as possible, so could we
stub lstat_cache_aware_rmdir() instead? We could have a new git-compat-
util-stub.c (or whatever we want to call it).

> @@ -966,9 +966,11 @@ const char *inet_ntop(int af, const void *src, char *dst, size_t size);
>  #endif
>  
>  #ifdef NO_PTHREADS
> +#ifdef GIT_STD_LIB
>  #define atexit git_atexit
>  int git_atexit(void (*handler)(void));
>  #endif
> +#endif
>  
>  static inline size_t st_add(size_t a, size_t b)
>  {

Same for git_atexit().

> @@ -1462,14 +1464,17 @@ static inline int is_missing_file_error(int errno_)
>  	return (errno_ == ENOENT || errno_ == ENOTDIR);
>  }
>  
> +#ifndef GIT_STD_LIB
>  int cmd_main(int, const char **);
>  
>  /*
>   * Intercept all calls to exit() and route them to trace2 to
>   * optionally emit a message before calling the real exit().
>   */
> +
>  int common_exit(const char *file, int line, int code);
>  #define exit(code) exit(common_exit(__FILE__, __LINE__, (code)))
> +#endif
>  
>  /*
>   * You can mark a stack variable with UNLEAK(var) to avoid it being

And for common_exit().

As for cmd_main(), that seems to be a convenience so that we can link
common_main.o with various other files (e.g. http-backend.c). I think
the right thing to do is to define a new cmd-main.h that declares only
cmd_main(), and then have only the files that need it (common_main.c and
all the files that define cmd_main()) include it. This cleanup patch can
be done before this patch. I think this is a good change that we would
want even without libification.
 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 6/6] git-std-lib: add test file to call git-std-lib.a functions
  2023-09-08 17:44       ` [PATCH v3 6/6] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
  2023-09-09  5:26         ` Junio C Hamano
@ 2023-09-15 18:43         ` Jonathan Tan
  2023-09-15 20:22           ` Junio C Hamano
  1 sibling, 1 reply; 70+ messages in thread
From: Jonathan Tan @ 2023-09-15 18:43 UTC (permalink / raw)
  To: Calvin Wan; +Cc: Jonathan Tan, git, nasamuffin, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:
> Add test file that directly or indirectly calls all functions defined in
> git-std-lib.a object files to showcase that they do not reference
> missing objects and that git-std-lib.a can stand on its own.
> 
> Certain functions that cause the program to exit or are already called
> by other functions are commented out.
> 
> TODO: replace with unit tests
> Signed-off-by: Calvin Wan <calvinwan@google.com>

I think the TODO should go into the code, so that when we add a unit
test that also deletes stdlib-test.c, we can see what's happening just
from the diff. The TODO should also explain what stdlib-test.c is hoping
to do, and why replacing it is OK. (Also, do we need to invoke all the
functions? I thought that missing functions are checked at link time, or
at the very latest, when the executable is run. No need to change this,
though - invoking all the functions we can is fine.)
 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 6/6] git-std-lib: add test file to call git-std-lib.a functions
  2023-09-15 18:43         ` Jonathan Tan
@ 2023-09-15 20:22           ` Junio C Hamano
  0 siblings, 0 replies; 70+ messages in thread
From: Junio C Hamano @ 2023-09-15 20:22 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Calvin Wan, git, nasamuffin, linusa, phillip.wood123, vdye

Jonathan Tan <jonathantanmy@google.com> writes:

> Calvin Wan <calvinwan@google.com> writes:
>> Add test file that directly or indirectly calls all functions defined in
>> git-std-lib.a object files to showcase that they do not reference
>> missing objects and that git-std-lib.a can stand on its own.
>> 
>> Certain functions that cause the program to exit or are already called
>> by other functions are commented out.
>> 
>> TODO: replace with unit tests
>> Signed-off-by: Calvin Wan <calvinwan@google.com>
>
> I think the TODO should go into the code, so that when we add a unit
> test that also deletes stdlib-test.c, we can see what's happening just
> from the diff. The TODO should also explain what stdlib-test.c is hoping
> to do, and why replacing it is OK. (Also, do we need to invoke all the
> functions? I thought that missing functions are checked at link time, or
> at the very latest, when the executable is run. No need to change this,
> though - invoking all the functions we can is fine.)
>  

Thanks for excellent reviews (not just against this 6/6 but others,
too).


^ permalink raw reply	[flat|nested] 70+ messages in thread

end of thread, other threads:[~2023-09-15 20:23 UTC | newest]

Thread overview: 70+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
2023-06-27 19:52 ` [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper Calvin Wan
2023-06-28  2:05   ` Victoria Dye
2023-07-05 17:57     ` Calvin Wan
2023-07-05 18:22       ` Victoria Dye
2023-07-11 20:07   ` Jeff Hostetler
2023-06-27 19:52 ` [RFC PATCH 2/8] hex-ll: split out functionality from hex Calvin Wan
2023-06-28 13:15   ` Phillip Wood
2023-06-28 16:55     ` Calvin Wan
2023-06-27 19:52 ` [RFC PATCH 3/8] object: move function to object.c Calvin Wan
2023-06-27 19:52 ` [RFC PATCH 4/8] config: correct bad boolean env value error message Calvin Wan
2023-06-27 19:52 ` [RFC PATCH 5/8] parse: create new library for parsing strings and env values Calvin Wan
2023-06-27 22:58   ` Junio C Hamano
2023-06-27 19:52 ` [RFC PATCH 6/8] pager: remove pager_in_use() Calvin Wan
2023-06-27 23:00   ` Junio C Hamano
2023-06-27 23:18     ` Calvin Wan
2023-06-28  0:30     ` Glen Choo
2023-06-28 16:37       ` Glen Choo
2023-06-28 16:44         ` Calvin Wan
2023-06-28 17:30           ` Junio C Hamano
2023-06-28 20:58       ` Junio C Hamano
2023-06-27 19:52 ` [RFC PATCH 7/8] git-std-lib: introduce git standard library Calvin Wan
2023-06-28 13:27   ` Phillip Wood
2023-06-28 21:15     ` Calvin Wan
2023-06-30 10:00       ` Phillip Wood
2023-06-27 19:52 ` [RFC PATCH 8/8] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
2023-06-28  0:14 ` [RFC PATCH 0/8] Introduce Git Standard Library Glen Choo
2023-06-28 16:30   ` Calvin Wan
2023-06-30  7:01 ` Linus Arver
2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
2023-08-10 16:36   ` [RFC PATCH v2 1/7] hex-ll: split out functionality from hex Calvin Wan
2023-08-10 16:36   ` [RFC PATCH v2 2/7] object: move function to object.c Calvin Wan
2023-08-10 20:32     ` Junio C Hamano
2023-08-10 22:36     ` Glen Choo
2023-08-10 22:43       ` Junio C Hamano
2023-08-10 16:36   ` [RFC PATCH v2 3/7] config: correct bad boolean env value error message Calvin Wan
2023-08-10 20:36     ` Junio C Hamano
2023-08-10 16:36   ` [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values Calvin Wan
2023-08-10 23:21     ` Glen Choo
2023-08-10 23:43       ` Junio C Hamano
2023-08-14 22:15       ` Jonathan Tan
2023-08-14 22:09     ` Jonathan Tan
2023-08-14 22:19       ` Junio C Hamano
2023-08-10 16:36   ` [RFC PATCH v2 5/7] date: push pager.h dependency up Calvin Wan
2023-08-10 23:41     ` Glen Choo
2023-08-14 22:17     ` Jonathan Tan
2023-08-10 16:36   ` [RFC PATCH v2 6/7] git-std-lib: introduce git standard library Calvin Wan
2023-08-14 22:26     ` Jonathan Tan
2023-08-10 16:36   ` [RFC PATCH v2 7/7] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
2023-08-14 22:28     ` Jonathan Tan
2023-08-10 22:05   ` [RFC PATCH v2 0/7] Introduce Git Standard Library Glen Choo
2023-08-15  9:20     ` Phillip Wood
2023-08-16 17:17       ` Calvin Wan
2023-08-16 21:19         ` Junio C Hamano
2023-08-15  9:41   ` Phillip Wood
2023-09-08 17:41     ` [PATCH v3 0/6] " Calvin Wan
2023-09-08 17:44       ` [PATCH v3 1/6] hex-ll: split out functionality from hex Calvin Wan
2023-09-08 17:44       ` [PATCH v3 2/6] wrapper: remove dependency to Git-specific internal file Calvin Wan
2023-09-15 17:54         ` Jonathan Tan
2023-09-08 17:44       ` [PATCH v3 3/6] config: correct bad boolean env value error message Calvin Wan
2023-09-08 17:44       ` [PATCH v3 4/6] parse: create new library for parsing strings and env values Calvin Wan
2023-09-08 17:44       ` [PATCH v3 5/6] git-std-lib: introduce git standard library Calvin Wan
2023-09-11 13:22         ` Phillip Wood
2023-09-15 18:39         ` Jonathan Tan
2023-09-08 17:44       ` [PATCH v3 6/6] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
2023-09-09  5:26         ` Junio C Hamano
2023-09-15 18:43         ` Jonathan Tan
2023-09-15 20:22           ` Junio C Hamano
2023-09-08 20:36       ` [PATCH v3 0/6] Introduce Git Standard Library Junio C Hamano
2023-09-08 21:30         ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/pub/scm/git/git.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).