From: Phillip Wood <phillip.wood123@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Toon Claes <toon@iotcl.com>,
git@vger.kernel.org, Taylor Blau <me@ttaylorr.com>
Subject: Re: [PATCH v5 1/1] cat-file: quote-format name in error when using -z
Date: Fri, 2 Jun 2023 14:29:08 +0100
Message-ID: <5b83dba0-e900-ebae-2ad8-f036a40a15c5@gmail.com>
In-Reply-To: <xmqqmt25a4uk.fsf@gitster.g>

Hi Junio

Sorry for the slow reply, I had intended to reply before but got
distracted and forgot about it.

On 15/05/2023 18:20, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
>
>> On 12/05/2023 17:57, Junio C Hamano wrote:
>>> Toon Claes <toon@iotcl.com> writes:
>>> Stepping back a bit, how big a problem is this in real life? It
>>> certainly is possible to create a pathname with funny byte values in
>>> it, and in some environments, letters like single-quote that are
>>> considered cumbersome to handle by those who are used to CLI
>>> programs may be commonplace. But a path with newline? Or any
>>> control character for that matter? And this is not even the primary
>>> output from the program but is an error message for consumption by
>>> humans, no?
>>> I am wondering if it is simpler to just declare that the paths
>>> output in error messages have certain bytes, probably all control
>>> characters other than HT, replaced with a dot, and tell the users
>>> not to rely on the pathnames being intact if they contain funny
>>> bytes in them.
>>
>> We could only c-quote the name when it contains a control character
>> other than HT. That way names containing double quotes and backslashes
>> are unchanged but it will still be possible to parse the path from the
>> error message. If we're going to munge the name we might as well use
>> our standard quoting rather than some ad-hoc scheme.
>
> In the above suggestion, I gave up and no longer aim to do
> "quoting". A more appropriate word for the approach is "redacting".
> The message essentially is: If you use truly problematic bytes in
> your path, they are redacted (so do not use them if it hurts).
>
> This is because I am not sure how "names containing dq and bs are
> unchanged" can be done without ambiguity.

D'oh, I should have thought of that. You're right, it ends up being
ambiguous. Anyway, Patrick has just posted a patch to add NUL-terminated
output, which looks like a cleaner approach.

Best Wishes
Phillip

> If I see a message that
> comes out of this:
>
> printf("%s missing\n", obj_name);
>
> and it looks like
>
> "a\nb" missing
>
> how do I tell if it is complaining about the object the user named
> with a three-byte string (i.e. lowercase-A, newline, lowercase-B),
> or a six-byte string (i.e. dq, lowercase-A, bs, lowercase-N,
> lowercase-B, dq)?
>
> If we were forbidding '"' to appear in a refname, then we could take
> advantage of the fact that the name of an object inside a tree at a
> funny path would not start with '"', to disambiguate. For the
> three- and six-byte string cases above, the formatting function will
> give these messages (referred to as "sample output" below):
>
> "master:a\nb" missing
> master:"a\nb" missing
>
> because of your "we do not exactly do our standard c-quote; we
> exempt dq and bs from the bytes to be quoted" rule.
>
> But it still feels a bit misleading. This codepath may have the
> whole objectname as a single string so that c-quoting the entire
> "<commit> <colon> <path>" inside a single c-quoted string that
> begins with a dq is easy, but not all codepaths are lucky and some
> may have to show <commit> and <path> separately, concatenated with
> <colon> at the outermost output layer, which means that the second
> one from the sample output may still mean the path with three-byte
> name in the tree of 'master' commit.
>
> And worse yet, because
>
> git branch '"master'
>
> is possible (even though nobody sane would do that), the "treat the
> string as c-quoted only if the object name as a whole begins with a
> dq" disambiguation idea would not work. The first one from
> the sample output could be the blob at the path with a five-byte
> string name (i.e. lowercase-A, bs, lowercase-N, lowercase-B, dq)
> in the tree of the commit at the tip of branch with seven-byte
> string name (i.e. dq followed by 'master').
>
> So, I dunno.