Git Mailing List Archive mirror
From: Junio C Hamano <gitster@pobox.com>
To: Toon Claes <toon@iotcl.com>
Cc: git@vger.kernel.org, Phillip Wood <phillip.wood123@gmail.com>,
	Taylor Blau <me@ttaylorr.com>
Subject: Re: [PATCH v5 1/1] cat-file: quote-format name in error when using -z
Date: Fri, 12 May 2023 09:57:56 -0700
Message-ID: <xmqqy1ltqygb.fsf@gitster.g>
In-Reply-To: <87h6sh6f81.fsf@iotcl.com> (Toon Claes's message of "Fri, 12 May 2023 10:54:20 +0200")

Toon Claes <toon@iotcl.com> writes:

> Ideally the output should be NUL-terminated if -z is used. This was also
> suggested[2] when the flag was introduced. Obviously we cannot change
> this now, because it would break behavior for *everyone* using -z, not
> only when funny names are used. So if we want to go this route, we
> should only do so with another flag (e.g. `--null-output`) or a config
> option.

Yes, `--null-output` also came to my mind.  As this new mode of
output is for consumption by programs, letting them read
NUL-terminated records is a viable, if cumbersome, possibility.
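
To illustrate what reading NUL-terminated records would mean for a
consumer, here is a minimal sketch in Python.  It assumes the
hypothetical `--null-output` mode discussed above, in which every
output record ends with a NUL byte; the function name and the sample
data are made up for illustration and are not part of any patch.

```python
def split_nul_records(data: bytes) -> list:
    """Split a byte string into NUL-terminated records.

    Assumes every complete record is followed by a NUL byte, as the
    hypothetical --null-output mode would guarantee.
    """
    if not data:
        return []
    records = data.split(b"\0")
    # A well-formed stream ends with NUL, which leaves one empty
    # trailing piece after the split; drop it.
    if records[-1] == b"":
        records.pop()
    return records


# A name containing LF stays in one record, which a line-by-line
# reader could not guarantee.
sample = b"funny\nname missing\0plain missing\0"
records = split_nul_records(sample)
```

The consumer has to buffer and split on NUL instead of reading lines,
which is the "cumbersome" part, but embedded newlines no longer split
a record in two.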

> But I was looking at the git-config(1) documentation:
>
>> core.quotePath::
>> 	Commands that output paths (e.g. 'ls-files', 'diff'), will
>> 	quote "unusual" characters in the pathname by enclosing the
>> 	pathname in double-quotes and escaping those characters with
>> 	backslashes in the same way C escapes control characters (e.g.
>> 	`\t` for TAB, `\n` for LF, `\\` for backslash) or bytes with
>> 	values larger than 0x80 (e.g. octal `\302\265` for "micro" in
>> 	UTF-8).  If this variable is set to false, bytes higher than
>> 	0x80 are not considered "unusual" any more. Double-quotes,
>> 	backslash and control characters are always escaped regardless
>> 	of the setting of this variable.  A simple space character is
>> 	not considered "unusual".  Many commands can output pathnames
>> 	completely verbatim using the `-z` option. The default value
>> 	is true.
>
> If you read this, the changes of this patch fully contradict this.

Hmph, I do not quite see where the contradiction is.  If you mean
the "Many commands can output" part, I do not think it applies here.
First, your "cat-file" does not have to be a part of "many".  More
importantly, the mention of `-z` there is about the option accepted
by the diff family of commands, e.g. "git diff --name-only -z
HEAD^", where it is an output record separator.  Your "-z" is about
the input record separator, and unless you are changing "-z" to
suddenly mean both input and output separator, which would break
existing scripts that expect "-z" to apply only to input, the above
"completely verbatim" does not apply to you.

> Also
> documentation on other commands (e.g. git-check-ignore(1)) using `-z`
> will mention the verbatim output.

Again, it is about the output.

Stepping back a bit, how big a problem is this in real life?  It
certainly is possible to create a pathname with funny byte values in
it, and in some environments, letters like single-quote that are
considered cumbersome to handle by those who are used to CLI
programs may be commonplace.  But a path with newline?  Or any
control character for that matter?  And this is not even the primary
output from the program but is an error message for consumption by
humans, no?

I am wondering if it is simpler to just declare that the paths
output in error messages have certain bytes, probably all control
characters other than HT, replaced with a dot, and tell the users
not to rely on the pathnames being intact if they contain funny
bytes in them.  That way, with the definition of "work" being "you
can read the path out of error messages that talk about it", paths
with bytes that c-quote mechanism butchers, like double quotes and
backslashes, that have worked before will not be broken, and paths
with LF or CRLF in them that have never worked would still not work,
but at least they would not break the input stream of whoever is
reading the error messages line by line.
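
A minimal sketch of that replacement rule, in Python for brevity
rather than as proposed C code for cat-file.  The function name is
hypothetical, and treating DEL (0x7f) as a control character is an
assumption on top of what is said above.

```python
def sanitize_for_error(path: bytes) -> bytes:
    """Replace control bytes other than HT with a dot.

    Assumption: "control characters" means bytes below 0x20 plus
    DEL (0x7f).  HT (0x09) is kept as-is, and so are double quotes,
    backslashes and bytes with the high bit set.
    """
    out = bytearray()
    for b in path:
        if (b < 0x20 and b != 0x09) or b == 0x7F:
            out.append(ord("."))
        else:
            out.append(b)
    return bytes(out)


# LF and CR are replaced, so the message stays on a single line.
cleaned = sanitize_for_error(b"dir/na\r\nme")
```

With this rule a path like `say "hi"\` comes out unchanged, while a
path with LF in it is shown with a dot in place of the LF and cannot
be recovered exactly from the message.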

I dunno.


