Git Mailing List Archive mirror
From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: Taylor Blau <me@ttaylorr.com>, Toon Claes <toon@iotcl.com>,
	Phillip Wood <phillip.wood123@gmail.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v2 3/5] strbuf: provide CRLF-aware helper to read until a specified delimiter
Date: Tue, 6 Jun 2023 07:19:37 +0200
Message-ID: <8127eeac97200da9aafccdf16cb7ba06f68b0121.1686028409.git.ps@pks.im>
In-Reply-To: <cover.1686028409.git.ps@pks.im>

Many of our commands support reading input that is separated either via
newlines or via NUL characters. Furthermore, in order to be a better
cross-platform citizen, these commands typically know to strip the CRLF
sequence so that we also support reading newline-separated inputs on
platforms such as Windows. This results in the following kind of awkward
pattern:

```
struct strbuf input = STRBUF_INIT;

while (1) {
	int ret;

	if (nul_terminated)
		ret = strbuf_getline_nul(&input, stdin);
	else
		ret = strbuf_getline(&input, stdin);
	if (ret)
		break;

	...
}
```

Introduce a new CRLF-aware helper function that can read up to a
user-specified delimiter. If the delimiter is `\n`, the function knows to
also strip CRLF; otherwise it will only strip the specified delimiter.
This results in the following, much more readable code pattern:

```
struct strbuf input = STRBUF_INIT;

while (strbuf_getdelim_strip_crlf(&input, stdin, delim) != EOF) {
	...
}
```

The new function will be used in a subsequent commit.
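
For readers without the Git tree at hand, the contract described above
can be sketched in plain C. This is only an illustration of the
documented behaviour (strip the delimiter, and strip a preceding CR when
the delimiter is LF), not the strbuf implementation. The names `struct
linebuf`, `linebuf_reserve()`, `linebuf_addch()`, `linebuf_release()`
and `read_record()` are hypothetical and do not exist in Git.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for git's struct strbuf: a growable byte buffer. */
struct linebuf {
	char *buf;
	size_t len, alloc;
};

#define LINEBUF_INIT { NULL, 0, 0 }

/* Make sure at least `need` bytes are allocated; abort on OOM. */
void linebuf_reserve(struct linebuf *lb, size_t need)
{
	size_t alloc = lb->alloc ? lb->alloc : 64;
	char *p;

	if (need <= lb->alloc)
		return;
	while (alloc < need)
		alloc *= 2;
	p = realloc(lb->buf, alloc);
	if (!p)
		abort();
	lb->buf = p;
	lb->alloc = alloc;
}

/* Append one byte and keep the buffer NUL-terminated. */
void linebuf_addch(struct linebuf *lb, int c)
{
	linebuf_reserve(lb, lb->len + 2);
	lb->buf[lb->len++] = (char)c;
	lb->buf[lb->len] = '\0';
}

void linebuf_release(struct linebuf *lb)
{
	free(lb->buf);
	lb->buf = NULL;
	lb->len = lb->alloc = 0;
}

/*
 * Read one record up to `term`, overwriting the previous contents.
 * The terminator is dropped; if it is LF and preceded by CR, the CR
 * is dropped too. Returns 0 on success and EOF only when nothing was
 * left to read.
 */
int read_record(struct linebuf *lb, FILE *fp, int term)
{
	int c;

	assert(lb && fp);
	linebuf_reserve(lb, 1);
	lb->len = 0;
	lb->buf[0] = '\0';

	while ((c = fgetc(fp)) != EOF) {
		if (c == term) {
			if (term == '\n' && lb->len &&
			    lb->buf[lb->len - 1] == '\r')
				lb->buf[--lb->len] = '\0';
			return 0;
		}
		linebuf_addch(lb, c);
	}
	/* A final record without a terminator still counts. */
	return lb->len ? 0 : EOF;
}
```

A caller would loop with `while (read_record(&lb, stdin, delim) != EOF)`,
passing `'\n'` or `'\0'` as `delim`, and call `linebuf_release(&lb)`
once done. Note that a CR is only stripped when the delimiter is LF, so
NUL-delimited records keep any trailing CR untouched.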

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 strbuf.c | 11 ++++++++---
 strbuf.h | 12 ++++++++++++
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/strbuf.c b/strbuf.c
index 08eec8f1d8..31dc48c0ab 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -721,11 +721,11 @@ static int strbuf_getdelim(struct strbuf *sb, FILE *fp, int term)
 	return 0;
 }
 
-int strbuf_getline(struct strbuf *sb, FILE *fp)
+int strbuf_getdelim_strip_crlf(struct strbuf *sb, FILE *fp, int term)
 {
-	if (strbuf_getwholeline(sb, fp, '\n'))
+	if (strbuf_getwholeline(sb, fp, term))
 		return EOF;
-	if (sb->buf[sb->len - 1] == '\n') {
+	if (term == '\n' && sb->buf[sb->len - 1] == '\n') {
 		strbuf_setlen(sb, sb->len - 1);
 		if (sb->len && sb->buf[sb->len - 1] == '\r')
 			strbuf_setlen(sb, sb->len - 1);
@@ -733,6 +733,11 @@ int strbuf_getline(struct strbuf *sb, FILE *fp)
 	return 0;
 }
 
+int strbuf_getline(struct strbuf *sb, FILE *fp)
+{
+	return strbuf_getdelim_strip_crlf(sb, fp, '\n');
+}
+
 int strbuf_getline_lf(struct strbuf *sb, FILE *fp)
 {
 	return strbuf_getdelim(sb, fp, '\n');
diff --git a/strbuf.h b/strbuf.h
index 3dfeadb44c..0e69b656bc 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -475,6 +475,18 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint);
  */
 ssize_t strbuf_write(struct strbuf *sb, FILE *stream);
 
+/**
+ * Read from a FILE * until the specified terminator is encountered,
+ * overwriting the existing contents of the strbuf.
+ *
+ * Reading stops after the terminator or at EOF.  The terminator is
+ * removed from the buffer before returning.  If the terminator is LF
+ * and if it is preceded by a CR, then the whole CRLF is stripped.
+ * Returns 0 unless there was nothing left before EOF, in which case
+ * it returns `EOF`.
+ */
+int strbuf_getdelim_strip_crlf(struct strbuf *sb, FILE *fp, int term);
+
 /**
  * Read a line from a FILE *, overwriting the existing contents of
  * the strbuf.  The strbuf_getline*() family of functions share
-- 
2.41.0


