From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: Taylor Blau <me@ttaylorr.com>, Toon Claes <toon@iotcl.com>,
Phillip Wood <phillip.wood123@gmail.com>,
Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v2 3/5] strbuf: provide CRLF-aware helper to read until a specified delimiter
Date: Tue, 6 Jun 2023 07:19:37 +0200 [thread overview]
Message-ID: <8127eeac97200da9aafccdf16cb7ba06f68b0121.1686028409.git.ps@pks.im> (raw)
In-Reply-To: <cover.1686028409.git.ps@pks.im>
[-- Attachment #1: Type: text/plain, Size: 3114 bytes --]
Many of our commands support reading input that is separated either via
newlines or via NUL characters. Furthermore, in order to be a better
cross platform citizen, these commands typically know to strip the CRLF
sequence so that we also support reading newline-separated inputs on
e.g. the Windows platform. This results in the following kind of awkward
pattern:
```
struct strbuf input = STRBUF_INIT;
while (1) {
int ret;
if (nul_terminated)
ret = strbuf_getline_nul(&input, stdin);
else
ret = strbuf_getline(&input, stdin);
if (ret)
break;
...
}
```
Introduce a new CRLF-aware helper function that can read up to a user
specified delimiter. If the delimiter is `\n` the function knows to also
strip CRLF, otherwise it will only strip the specified delimiter. This
results in the following, much more readable code pattern:
```
struct strbuf input = STRBUF_INIT;
while (strbuf_getdelim_strip_crlf(&input, stdin, delim) != EOF) {
...
}
```
The new function will be used in a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
strbuf.c | 11 ++++++++---
strbuf.h | 12 ++++++++++++
2 files changed, 20 insertions(+), 3 deletions(-)
diff --git a/strbuf.c b/strbuf.c
index 08eec8f1d8..31dc48c0ab 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -721,11 +721,11 @@ static int strbuf_getdelim(struct strbuf *sb, FILE *fp, int term)
return 0;
}
-int strbuf_getline(struct strbuf *sb, FILE *fp)
+int strbuf_getdelim_strip_crlf(struct strbuf *sb, FILE *fp, int term)
{
- if (strbuf_getwholeline(sb, fp, '\n'))
+ if (strbuf_getwholeline(sb, fp, term))
return EOF;
- if (sb->buf[sb->len - 1] == '\n') {
+ if (term == '\n' && sb->buf[sb->len - 1] == '\n') {
strbuf_setlen(sb, sb->len - 1);
if (sb->len && sb->buf[sb->len - 1] == '\r')
strbuf_setlen(sb, sb->len - 1);
@@ -733,6 +733,11 @@ int strbuf_getline(struct strbuf *sb, FILE *fp)
return 0;
}
+int strbuf_getline(struct strbuf *sb, FILE *fp)
+{
+ return strbuf_getdelim_strip_crlf(sb, fp, '\n');
+}
+
int strbuf_getline_lf(struct strbuf *sb, FILE *fp)
{
return strbuf_getdelim(sb, fp, '\n');
diff --git a/strbuf.h b/strbuf.h
index 3dfeadb44c..0e69b656bc 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -475,6 +475,18 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint);
*/
ssize_t strbuf_write(struct strbuf *sb, FILE *stream);
+/**
+ * Read from a FILE * until the specified terminator is encountered,
+ * overwriting the existing contents of the strbuf.
+ *
+ * Reading stops after the terminator or at EOF. The terminator is
+ * removed from the buffer before returning. If the terminator is LF
+ * and if it is preceded by a CR, then the whole CRLF is stripped.
+ * Returns 0 unless there was nothing left before EOF, in which case
+ * it returns `EOF`.
+ */
+int strbuf_getdelim_strip_crlf(struct strbuf *sb, FILE *fp, int term);
+
/**
* Read a line from a FILE *, overwriting the existing contents of
* the strbuf. The strbuf_getline*() family of functions share
--
2.41.0
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2023-06-06 5:20 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-02 13:02 [PATCH 0/5] cat-file: introduce NUL-terminated output format Patrick Steinhardt
2023-06-02 13:02 ` [PATCH 1/5] t1006: don't strip timestamps from expected results Patrick Steinhardt
2023-06-02 13:02 ` [PATCH 2/5] t1006: modernize test style to use `test_cmp` Patrick Steinhardt
2023-06-02 13:02 ` [PATCH 3/5] strbuf: provide CRLF-aware helper to read until a specified delimiter Patrick Steinhardt
2023-06-02 13:02 ` [PATCH 4/5] cat-file: simplify reading from standard input Patrick Steinhardt
2023-06-02 13:02 ` [PATCH 5/5] cat-file: Introduce new option to delimit output with NUL characters Patrick Steinhardt
2023-06-05 15:47 ` Phillip Wood
2023-06-05 23:54 ` Junio C Hamano
2023-06-06 4:52 ` Patrick Steinhardt
2023-06-06 5:22 ` Junio C Hamano
2023-06-06 5:31 ` Patrick Steinhardt
2023-06-12 19:12 ` Junio C Hamano
2023-06-06 5:00 ` Patrick Steinhardt
2023-06-06 1:23 ` Junio C Hamano
2023-06-03 1:44 ` [PATCH 0/5] cat-file: introduce NUL-terminated output format Junio C Hamano
2023-06-06 5:19 ` [PATCH v2 0/5] catfile: " Patrick Steinhardt
2023-06-06 5:19 ` [PATCH v2 1/5] t1006: don't strip timestamps from expected results Patrick Steinhardt
2023-06-06 5:19 ` [PATCH v2 2/5] t1006: modernize test style to use `test_cmp` Patrick Steinhardt
2023-06-06 5:19 ` Patrick Steinhardt [this message]
2023-06-06 5:19 ` [PATCH v2 4/5] cat-file: simplify reading from standard input Patrick Steinhardt
2023-06-06 5:19 ` [PATCH v2 5/5] cat-file: introduce option to delimit input and output with NUL Patrick Steinhardt
2023-06-12 20:43 ` [PATCH v2 0/5] catfile: introduce NUL-terminated output format Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8127eeac97200da9aafccdf16cb7ba06f68b0121.1686028409.git.ps@pks.im \
--to=ps@pks.im \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=me@ttaylorr.com \
--cc=phillip.wood123@gmail.com \
--cc=toon@iotcl.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).