Git Mailing List Archive mirror
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: phillip.wood@dunelm.org.uk
Cc: git@vger.kernel.org, Taylor Blau <me@ttaylorr.com>,
	Toon Claes <toon@iotcl.com>, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 5/5] cat-file: Introduce new option to delimit output with NUL characters
Date: Tue, 6 Jun 2023 07:00:00 +0200	[thread overview]
Message-ID: <ZH69UBT_H0OtGR6l@ncase> (raw)
In-Reply-To: <9900512f-b0da-2e47-f1ab-ed51ec2c78ff@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4261 bytes --]

On Mon, Jun 05, 2023 at 04:47:14PM +0100, Phillip Wood wrote:
> > @@ -384,6 +390,11 @@ notdir SP <size> LF
> >   is printed when, during symlink resolution, a file is used as a
> >   directory name.
> >   
> > +Alternatively, when `-Z` is passed, the line feeds in any of the above examples
> > +are replaced with NUL terminators. This ensures that output will be parsable if
> > +the output itself would contain a linefeed and is thus recommended for
> > +scripting purposes.
> > +
> >   CAVEATS
> >   -------
> >   
> > diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> > index 001dcb24d6..90ef407d30 100644
> > --- a/builtin/cat-file.c
> > +++ b/builtin/cat-file.c
> > @@ -492,17 +494,18 @@ static void batch_object_write(const char *obj_name,
> >   	strbuf_reset(scratch);
> >   
> >   	if (!opt->format) {
> > -		print_default_format(scratch, data);
> > +		print_default_format(scratch, data, opt);
> >   	} else {
> >   		strbuf_expand(scratch, opt->format, expand_format, data);
> > -		strbuf_addch(scratch, '\n');
> > +		strbuf_addch(scratch, opt->output_delim);
> >   	}
> >   
> >   	batch_write(opt, scratch->buf, scratch->len);
> >   
> >   	if (opt->batch_mode == BATCH_MODE_CONTENTS) {
> > +		char buf[] = {opt->output_delim};
> 
> I found this a bit confusing, I think it would be clearer just to do
> 
> 	batch_write(opt, &opt->output_delim, 1);

Agreed, that's cleaner.

> >   		print_object_or_die(opt, data);
> > -		batch_write(opt, "\n", 1);
> > +		batch_write(opt, buf, 1);
> >   	}
> >   }
> 
> > @@ -920,7 +927,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
> >   		N_("git cat-file (-t | -s) [--allow-unknown-type] <object>"),
> >   		N_("git cat-file (--batch | --batch-check | --batch-command) [--batch-all-objects]\n"
> >   		   "             [--buffer] [--follow-symlinks] [--unordered]\n"
> > -		   "             [--textconv | --filters] [-z]"),
> > +		   "             [--textconv | --filters] [-z] [-Z]"),
> 
> If we're recommending that people don't use '-z' then maybe we should 
> remove it from the synopsis and add OPT_HIDDEN to it below.

I might still change this depending on the conclusion Junio and I will
arrive at, but for now I agree that this makes sense.

> >   		N_("git cat-file (--textconv | --filters)\n"
> >   		   "             [<rev>:<path|tree-ish> | --path=<path|tree-ish> <rev>]"),
> >   		NULL
> > @@ -950,6 +957,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
> >   			PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
> >   			batch_option_callback),
> >   		OPT_BOOL('z', NULL, &input_nul_terminated, N_("stdin is NUL-terminated")),
> > +		OPT_BOOL('Z', NULL, &nul_terminated, N_("stdin and stdout is NUL-terminated")),
> 
> > diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
> > index 7b985cfded..d73a0be1b9 100755
> > --- a/t/t1006-cat-file.sh
> > +++ b/t/t1006-cat-file.sh
> > @@ -392,17 +393,18 @@ deadbeef
> >   
> >   "
> >   
> > -batch_output="$hello_sha1 blob $hello_size
> > -$hello_content
> > -$commit_sha1 commit $commit_size
> > -$commit_content
> > -$tag_sha1 tag $tag_size
> > -$tag_content
> > -deadbeef missing
> > - missing"
> > +printf "%s\0" \
> > +	"$hello_sha1 blob $hello_size" \
> > +	"$hello_content" \
> > +	"$commit_sha1 commit $commit_size" \
> > +	"$commit_content" \
> > +	"$tag_sha1 tag $tag_size" \
> > +	"$tag_content" \
> > +	"deadbeef missing" \
> > +	" missing" >batch_output
> 
> I think writing the expected output to a file is a good change as we 
> always use it with test_cmp. As "-z" is deprecated I think it makes 
> sense to model the expected output for "-Z" and use tr for the "-z" 
> tests as you have done here. It looks like we have good coverage of the 
> new option.

It's actually also required in order to not have to specify the expected
output twice. While we could leave this as-is, translating it to be NUL
terminated via `tr \n \0` doesn't work as the output contains newlines
in places where we don't want to translate them to NUL delimiters. And
storing the NUL-delimited string in a variable doesn't work either as
shells will truncate the C strings.

Thanks for your review!

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  parent reply	other threads:[~2023-06-06  5:00 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-02 13:02 [PATCH 0/5] cat-file: introduce NUL-terminated output format Patrick Steinhardt
2023-06-02 13:02 ` [PATCH 1/5] t1006: don't strip timestamps from expected results Patrick Steinhardt
2023-06-02 13:02 ` [PATCH 2/5] t1006: modernize test style to use `test_cmp` Patrick Steinhardt
2023-06-02 13:02 ` [PATCH 3/5] strbuf: provide CRLF-aware helper to read until a specified delimiter Patrick Steinhardt
2023-06-02 13:02 ` [PATCH 4/5] cat-file: simplify reading from standard input Patrick Steinhardt
2023-06-02 13:02 ` [PATCH 5/5] cat-file: Introduce new option to delimit output with NUL characters Patrick Steinhardt
2023-06-05 15:47   ` Phillip Wood
2023-06-05 23:54     ` Junio C Hamano
2023-06-06  4:52       ` Patrick Steinhardt
2023-06-06  5:22         ` Junio C Hamano
2023-06-06  5:31           ` Patrick Steinhardt
2023-06-12 19:12             ` Junio C Hamano
2023-06-06  5:00     ` Patrick Steinhardt [this message]
2023-06-06  1:23   ` Junio C Hamano
2023-06-03  1:44 ` [PATCH 0/5] cat-file: introduce NUL-terminated output format Junio C Hamano
2023-06-06  5:19 ` [PATCH v2 0/5] catfile: " Patrick Steinhardt
2023-06-06  5:19   ` [PATCH v2 1/5] t1006: don't strip timestamps from expected results Patrick Steinhardt
2023-06-06  5:19   ` [PATCH v2 2/5] t1006: modernize test style to use `test_cmp` Patrick Steinhardt
2023-06-06  5:19   ` [PATCH v2 3/5] strbuf: provide CRLF-aware helper to read until a specified delimiter Patrick Steinhardt
2023-06-06  5:19   ` [PATCH v2 4/5] cat-file: simplify reading from standard input Patrick Steinhardt
2023-06-06  5:19   ` [PATCH v2 5/5] cat-file: introduce option to delimit input and output with NUL Patrick Steinhardt
2023-06-12 20:43   ` [PATCH v2 0/5] catfile: introduce NUL-terminated output format Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZH69UBT_H0OtGR6l@ncase \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=phillip.wood@dunelm.org.uk \
    --cc=toon@iotcl.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).