Git Mailing List Archive mirror
 help / color / mirror / Atom feed
From: "Linus Arver via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Glen Choo <glencbz@gmail.com>,
	Christian Couder <chriscool@tuxfamily.org>,
	Phillip Wood <phillip.wood123@gmail.com>,
	Jonathan Tan <jonathantanmy@google.com>,
	Linus Arver <linusa@google.com>, Linus Arver <linusa@google.com>
Subject: [PATCH v5 2/3] trailer: find the end of the log message
Date: Fri, 20 Oct 2023 19:01:34 +0000	[thread overview]
Message-ID: <ce25420db29c9953095db652584dbed4e35d67ad.1697828495.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1563.v5.git.1697828495.gitgitgadget@gmail.com>

From: Linus Arver <linusa@google.com>

Previously, trailer_info_get() computed the trailer block end position
by

(1) checking for the opts->no_divider flag and optionally calling
    find_patch_start() to find the "patch start" location (patch_start), and
(2) calling find_trailer_end() to find the end of the trailer block
    using patch_start as a guide, saving the return value into
    "trailer_end".

The logic in (1) was awkward because the variable "patch_start" is
misleading if there is no patch in the input. The logic in (2) was
misleading because it could be the case that no trailers are in the
input (yet we are setting a "trailer_end" variable before even searching
for trailers, which happens later in find_trailer_start()). The name
"find_trailer_end" was misleading because that function did not look for
any trailer block itself --- instead it just computed the end position
of the log message in the input where the end of the trailer block (if
it exists) would be (because trailer blocks must always come after the
end of the log message).

Combine the logic in (1) and (2) together into find_patch_start() by
renaming it to find_end_of_log_message(). The end of the log message is
the starting point which find_trailer_start() needs to start searching
backward to parse individual trailers (if any).

Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Arver <linusa@google.com>
---
 trailer.c | 64 +++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 41 insertions(+), 23 deletions(-)

diff --git a/trailer.c b/trailer.c
index 3c54b38a85a..70c81fda710 100644
--- a/trailer.c
+++ b/trailer.c
@@ -809,21 +809,50 @@ static ssize_t last_line(const char *buf, size_t len)
 }
 
 /*
- * Return the position of the start of the patch or the length of str if there
- * is no patch in the message.
+ * Find the end of the log message as an offset from the start of the input
+ * (where callers of this function are interested in looking for a trailers
+ * block in the same input). We have to consider two categories of content that
+ * can come at the end of the input which we want to ignore (because they don't
+ * belong in the log message):
+ *
+ * (1) the "patch part" which begins with a "---" divider and has patch
+ * information (like the output of git-format-patch), and
+ *
+ * (2) any trailing comment lines, blank lines like in the output of "git
+ * commit -v", or stuff below the "cut" (scissor) line.
+ *
+ * As a formula, the situation looks like this:
+ *
+ *     INPUT = LOG MESSAGE + IGNORED
+ *
+ * where IGNORED can be either of the two categories described above. It may be
+ * that there is nothing to ignore. Now it may be the case that the LOG MESSAGE
+ * contains a trailer block, but that's not the concern of this function.
  */
-static size_t find_patch_start(const char *str)
+static size_t find_end_of_log_message(const char *input, int no_divider)
 {
+	size_t end;
 	const char *s;
 
-	for (s = str; *s; s = next_line(s)) {
+	/* Assume the naive end of the input is already what we want. */
+	end = strlen(input);
+
+	if (no_divider) {
+		return end;
+	}
+
+	/* Optionally skip over any patch part ("---" line and below). */
+	for (s = input; *s; s = next_line(s)) {
 		const char *v;
 
-		if (skip_prefix(s, "---", &v) && isspace(*v))
-			return s - str;
+		if (skip_prefix(s, "---", &v) && isspace(*v)) {
+			end = s - input;
+			break;
+		}
 	}
 
-	return s - str;
+	/* Skip over other ignorable bits. */
+	return end - ignored_log_message_bytes(input, end);
 }
 
 /*
@@ -925,12 +954,6 @@ continue_outer_loop:
 	return len;
 }
 
-/* Return the position of the end of the trailers. */
-static size_t find_trailer_end(const char *buf, size_t len)
-{
-	return len - ignored_log_message_bytes(buf, len);
-}
-
 static int ends_with_blank_line(const char *buf, size_t len)
 {
 	ssize_t ll = last_line(buf, len);
@@ -1101,7 +1124,7 @@ void process_trailers(const char *file,
 void trailer_info_get(struct trailer_info *info, const char *str,
 		      const struct process_trailer_options *opts)
 {
-	int patch_start, trailer_end, trailer_start;
+	int end_of_log_message, trailer_start;
 	struct strbuf **trailer_lines, **ptr;
 	char **trailer_strings = NULL;
 	size_t nr = 0, alloc = 0;
@@ -1109,16 +1132,11 @@ void trailer_info_get(struct trailer_info *info, const char *str,
 
 	ensure_configured();
 
-	if (opts->no_divider)
-		patch_start = strlen(str);
-	else
-		patch_start = find_patch_start(str);
-
-	trailer_end = find_trailer_end(str, patch_start);
-	trailer_start = find_trailer_start(str, trailer_end);
+	end_of_log_message = find_end_of_log_message(str, opts->no_divider);
+	trailer_start = find_trailer_start(str, end_of_log_message);
 
 	trailer_lines = strbuf_split_buf(str + trailer_start,
-					 trailer_end - trailer_start,
+					 end_of_log_message - trailer_start,
 					 '\n',
 					 0);
 	for (ptr = trailer_lines; *ptr; ptr++) {
@@ -1141,7 +1159,7 @@ void trailer_info_get(struct trailer_info *info, const char *str,
 	info->blank_line_before_trailer = ends_with_blank_line(str,
 							       trailer_start);
 	info->trailer_start = str + trailer_start;
-	info->trailer_end = str + trailer_end;
+	info->trailer_end = str + end_of_log_message;
 	info->trailers = trailer_strings;
 	info->trailer_nr = nr;
 }
-- 
gitgitgadget


  parent reply	other threads:[~2023-10-20 19:01 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-05  5:04 [PATCH 0/5] Trailer readability cleanups Linus Arver via GitGitGadget
2023-08-05  5:04 ` [PATCH 1/5] trailer: separate public from internal portion of trailer_iterator Linus Arver via GitGitGadget
2023-08-07 21:16   ` Glen Choo
2023-08-08 12:19     ` Phillip Wood
2023-08-10 23:15       ` Linus Arver
2023-08-10 22:50     ` Linus Arver
2023-08-05  5:04 ` [PATCH 2/5] trailer: split process_input_file into separate pieces Linus Arver via GitGitGadget
2023-08-07 22:39   ` Glen Choo
2023-08-11  0:41     ` Linus Arver
2023-08-05  5:04 ` [PATCH 3/5] trailer: split process_command_line_args into separate functions Linus Arver via GitGitGadget
2023-08-07 22:51   ` Glen Choo
2023-08-11  0:59     ` Linus Arver
2023-08-11  1:02       ` Linus Arver
2023-08-11 21:11       ` Glen Choo
2023-08-05  5:04 ` [PATCH 4/5] trailer: teach find_patch_start about --no-divider Linus Arver via GitGitGadget
2023-08-07 23:28   ` Glen Choo
2023-08-11  1:25     ` Linus Arver
2023-08-11 20:51       ` Glen Choo
2023-08-05  5:04 ` [PATCH 5/5] trailer: rename *_DEFAULT enums to *_UNSPECIFIED Linus Arver via GitGitGadget
2023-08-07 23:45   ` Glen Choo
2023-08-11 18:00     ` Linus Arver
2023-09-09  6:16 ` [PATCH v2 0/6] Trailer readability cleanups Linus Arver via GitGitGadget
2023-09-09  6:16   ` [PATCH v2 1/6] trailer: separate public from internal portion of trailer_iterator Linus Arver via GitGitGadget
2023-09-11 17:10     ` Junio C Hamano
2023-09-09  6:16   ` [PATCH v2 2/6] trailer: split process_input_file into separate pieces Linus Arver via GitGitGadget
2023-09-11 17:10     ` Junio C Hamano
2023-09-09  6:16   ` [PATCH v2 3/6] trailer: split process_command_line_args into separate functions Linus Arver via GitGitGadget
2023-09-09  6:16   ` [PATCH v2 4/6] trailer: teach find_patch_start about --no-divider Linus Arver via GitGitGadget
2023-09-11 17:25     ` Junio C Hamano
2023-09-14  2:19       ` Linus Arver
2023-09-14  3:12         ` Junio C Hamano
2023-09-14  5:31           ` Linus Arver
2023-09-09  6:16   ` [PATCH v2 5/6] trailer: rename *_DEFAULT enums to *_UNSPECIFIED Linus Arver via GitGitGadget
2023-09-11 18:54     ` Junio C Hamano
2023-09-14  2:41       ` Linus Arver
2023-09-14  3:16         ` Junio C Hamano
2023-09-22 18:23           ` Linus Arver
2023-09-22 19:48             ` Junio C Hamano
2023-09-26  5:30               ` Linus Arver
2023-09-09  6:16   ` [PATCH v2 6/6] trailer: use offsets for trailer_start/trailer_end Linus Arver via GitGitGadget
2023-09-11 19:01     ` Junio C Hamano
2023-09-14  1:21       ` Linus Arver
2023-09-14  3:18         ` Linus Arver
2023-09-22 19:50   ` [PATCH v3 0/9] Trailer readability cleanups Linus Arver via GitGitGadget
2023-09-22 19:50     ` [PATCH v3 1/9] trailer: separate public from internal portion of trailer_iterator Linus Arver via GitGitGadget
2023-09-22 19:50     ` [PATCH v3 2/9] trailer: split process_input_file into separate pieces Linus Arver via GitGitGadget
2023-09-22 19:50     ` [PATCH v3 3/9] trailer: split process_command_line_args into separate functions Linus Arver via GitGitGadget
2023-09-22 19:50     ` [PATCH v3 4/9] trailer: rename *_DEFAULT enums to *_UNSPECIFIED Linus Arver via GitGitGadget
2023-09-22 19:50     ` [PATCH v3 5/9] commit: ignore_non_trailer computes number of bytes to ignore Linus Arver via GitGitGadget
2023-09-22 19:50     ` [PATCH v3 6/9] trailer: find the end of the log message Linus Arver via GitGitGadget
2023-09-22 19:50     ` [PATCH v3 7/9] trailer: use offsets for trailer_start/trailer_end Linus Arver via GitGitGadget
2023-09-22 19:50     ` [PATCH v3 8/9] trailer: only use trailer_block_* variables if trailers were found Linus Arver via GitGitGadget
2023-09-22 19:50     ` [PATCH v3 9/9] trailer: make stack variable names match field names Linus Arver via GitGitGadget
2023-09-22 22:47     ` [PATCH v3 0/9] Trailer readability cleanups Junio C Hamano
2023-09-22 23:13       ` Linus Arver
2023-09-23  0:48         ` Junio C Hamano
2023-09-26  5:40           ` Linus Arver
2023-09-26  6:22     ` [PATCH v4 0/4] " Linus Arver via GitGitGadget
2023-09-26  6:22       ` [PATCH v4 1/4] commit: ignore_non_trailer computes number of bytes to ignore Linus Arver via GitGitGadget
2023-09-26  6:22       ` [PATCH v4 2/4] trailer: find the end of the log message Linus Arver via GitGitGadget
2023-09-28 23:16         ` Jonathan Tan
2023-10-20  0:24           ` Linus Arver
2023-10-20  0:36             ` Junio C Hamano
2023-09-26  6:22       ` [PATCH v4 3/4] trailer: use offsets for trailer_start/trailer_end Linus Arver via GitGitGadget
2023-09-26  6:22       ` [PATCH v4 4/4] trailer: only use trailer_block_* variables if trailers were found Linus Arver via GitGitGadget
2023-10-20 19:01       ` [PATCH v5 0/3] Trailer readability cleanups Linus Arver via GitGitGadget
2023-10-20 19:01         ` [PATCH v5 1/3] commit: ignore_non_trailer computes number of bytes to ignore Linus Arver via GitGitGadget
2023-10-20 19:01         ` Linus Arver via GitGitGadget [this message]
2023-10-20 21:29           ` [PATCH v5 2/3] trailer: find the end of the log message Junio C Hamano
2023-12-29  6:42             ` Linus Arver
2023-12-29 21:03               ` Linus Arver
2023-10-20 19:01         ` [PATCH v5 3/3] trailer: use offsets for trailer_start/trailer_end Linus Arver via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ce25420db29c9953095db652584dbed4e35d67ad.1697828495.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=chriscool@tuxfamily.org \
    --cc=git@vger.kernel.org \
    --cc=glencbz@gmail.com \
    --cc=jonathantanmy@google.com \
    --cc=linusa@google.com \
    --cc=phillip.wood123@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).