Git Mailing List Archive mirror
 help / color / mirror / Atom feed
From: Alex Henrie <alexhenrie24@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, eyvind.bernhardsen@gmail.com, tboegi@web.de
Subject: Re: [PATCH v2] docs: rewrite the documentation of the text and eol attributes
Date: Sat, 29 Apr 2023 13:19:00 -0600	[thread overview]
Message-ID: <CAMMLpeQB1zdTEWd+=d0kaKwpzax093iLTuytZtvn0nTSJ2xT4A@mail.gmail.com> (raw)
In-Reply-To: <xmqqsfcjevbz.fsf@gitster.g>

On Fri, Apr 28, 2023 at 12:05 PM Junio C Hamano <gitster@pobox.com> wrote:

> Thanks.  I wonder helped-by tboegi@ might be in order, but hopefully
> it may come during the review of this round.

Sure, I'll add a Helped-by line in the next revision.

> 1. Overall, the new text is clearly written, and I think it made the
>    updated description of "Set" and "Set to string 'auto'" easier to
>    understand.

I'm glad you like it!

> 2. "on checkin and possibly also checkout" may be redundant, as the
>    section in which the `text` attribute is described is all about
>    various conversions upon checking in (i.e. moving contents to the
>    index) and checking out (i.e. moving contents out of the index).

For the `text` attribute's behavior to make sense, the first thing a
newbie has to understand is that `text` definitely enables
normalization on checkin and might or might not enable normalization
on checkout. The original text made it sound like `text` only controls
checkin and `eol` controls checkout, a false impression that is
especially misleading because on Linux and OS X, `text` without `eol`
(or core.autocrlf or core.eol) does only enable normalization on
checkin. The new text may be slightly redundant, but it makes the
important part crystal clear.

> 3. It is unclear why we would want to mention "even if it has CRLF
>    line endings in the working directory".  "Regardless of the line
>    endings in the working tree files" may slightly be better [*1*],
>    but I think the text reads better without it as storing contents
>    with possibly different end-of-line convention is the point of
>    "end-of-line normalization" in the first place.

> *1* The intent is not about special-casing CRLF but any other; if
>     Git were to be extended to support ancient MacOS, we would
>     convert even if it uses CR line endings.

I called out CRLF precisely because I was thinking about platforms
that use CR instead of LF or CRLF. I wanted to make it clear that if
you add a file that was written on or for an ancient Mac or Commodore,
the line endings are not normalized. Git can be used (and probably is
being used by someone) to version-control files for those platforms
without actually running on those platforms.

Then again, maybe the fact that the `text` attribute does not
normalize CR line endings is a bug in Git, and we shouldn't advertise
it in the documentation as if it were intended behavior. What do you
think?

>    "the file is stored in the index with LF line endings" may better
>    be rephrased to "the contents is stored, after converting to use
>    LF line endings if necessary, in the index", so that we can use
>    the verb "convert" to stress the point of setting this attribute.

The phrase "if necessary" would be a bit confusing. What makes
conversion on checkin "necessary"? The reader would wonder if it
depends on the Git config and the platform like conversion on checkout
does.

Would you be OK with your proposed wording minus "if necessary"?

> 4. This problem is shared with the original text, but "This
>    attribute marks the path as a text file" might be a bit
>    misleading.  If it is left unspecified (i.e. unmentioned, or any
>    earlier settings by explicitly overriding them with "!text") or
>    worse yet, explicitly unset (i.e. "-text"), it marks the path as
>    a non text.
>
>       This attribute is used to mark the path as a text file, which
>       needs certain end-of-line normalization upon check-in and
>       check-out, or a non-text file, which do not.
>
>   I dunno if the extra verbosity is too much, though.

Hmm, I would find your proposed wording misleading because it seems to
imply that the default is `text=auto` and that `-text` (or a config
variable) is necessary to turn end-of-line normalization off. In my
opinion it's already clear enough that the attribute is unset by
default and that `-text` unsets a previous `text`.

> >  Set::
> >
> >       Setting the `text` attribute on a path enables end-of-line
> > -     normalization and marks the path as a text file.  End-of-line
> > -     conversion takes place without guessing the content type.
> > +     normalization on checkin and checkout as described above.  Line
> > +     endings are normalized in the index the next time the file is
> > +     checked in, even if the file was previously added to Git with CRLF
> > +     line endings.
>
> "the next time" -> "every time", but that is probably moot, because
> I would question the need for the whole sentence.
>
> The last sentence, starting with "Line endigns are normalized", may
> be redundant and we probably are better off without it.  It would
> risk becoming maintenance burden because we have to make sure it
> stays in sync with what you wrote in the introductory text.  "as
> described above" before that sentence is clear enough.

I added that sentence to make the difference between `text` and
`text=auto` more clear. If all we have is the introductory paragraph,
it sounds like Git only converts CRLF to LF on checkin if it had
converted LF to CRLF on checkout. However, if you feel strongly that
the sentence is unhelpful here, I'll remove it.

> We may want to use the phrase "checking out" somewhere in
> conjunction with "in the working directory" to be consistent with
> how the section is titled.  Something like
>
>     This attribute marks a path to use a specific line-ending style
>     in the working tree when it is checked out.

That's fine.

> >  Set to string value "lf"::
> >
> > -     This setting forces Git to normalize line endings to LF on
> > -     checkin and prevents conversion to CRLF when the file is
> > -     checked out.
> > +     This setting uses the same line endings in the working directory as
> > +     in the index, whether they are LF or CRLF.  However, unless
> > +     `text=auto`, adding the file to the index again will normalize its
> > +     line endings to LF in the index.
>
> First of all, "uses the same, except if ... then LF is used" is much
> harder to understand than "use LF, except if ..." as an explanation
> for "lf".
>
> The description may be correct but is 'eol=lf' what controls how the
> normalization is done upon checking in?
>
> As you said in the explanation for the `eol` attribute as a whole,
> `eol` is effective only when `text` says the path is a text file,
> and that (i.e. the fact that `text` controls the end-of-line) makes
> the path to use LF line endings, isn't it?  The last sentence
> starting with "However, unless" that talks about conversion going
> into the index feels out of place.

That's a good point, it would be more clear here to explicitly say
that the setting affects checkout, and not mention what happens on
checkin (which was described previously in the `text` documentation).
Also, if we delete the sentence here then the seemingly redundant
sentence that starts with "Line endings are normalized" wouldn't be as
redundant.

> There are some exception handling in the code for odd cases like the
> contents with mixed line endings, a path set to use LF but the file
> actually has CRLF, etc.  While they are worth describing, I wonder
> if they should be done in a separate paragraph.

Probably best done in a separate patch after this rewrite is done.

Wow, that was a LOT of feedback on a relatively short piece of text.
Are you sure you don't want to rewrite the documentation yourself? ;-)

-Alex

  reply	other threads:[~2023-04-29 19:20 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-21  5:56 [PATCH] docs: clarify how the text and text=auto attributes are different Alex Henrie
2023-04-21 16:35 ` Junio C Hamano
2023-04-21 18:25 ` Torsten Bögershausen
2023-04-26 13:18   ` Alex Henrie
2023-04-28  4:22 ` [PATCH v2] docs: rewrite the documentation of the text and eol attributes Alex Henrie
2023-04-28 18:05   ` Junio C Hamano
2023-04-29 19:19     ` Alex Henrie [this message]
2023-04-30  5:16       ` Junio C Hamano
2023-05-01  0:40         ` Alex Henrie
2023-04-30 13:19   ` Torsten Bögershausen
2023-04-30 23:43     ` Alex Henrie

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAMMLpeQB1zdTEWd+=d0kaKwpzax093iLTuytZtvn0nTSJ2xT4A@mail.gmail.com' \
    --to=alexhenrie24@gmail.com \
    --cc=eyvind.bernhardsen@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=tboegi@web.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).