From: Alejandro Colomar <alx@kernel.org>
To: Lee Griffiths <poddster@gmail.com>
Cc: linux-man@vger.kernel.org
Subject: Re: Fwd: [PATCH] sscanf.3: Remove term 'deprecated', and expand BUGS
Date: Sat, 9 Dec 2023 12:55:32 +0100
Message-ID: <ZXRVtFY9lffPFnyI@debian>
In-Reply-To: <CAKXok1Fdm0aYskE25+DPkiOc194gMLYdJyvVMybZLAUf+uwn1A@mail.gmail.com>

On Thu, Dec 07, 2023 at 09:50:35PM +0000, Lee Griffiths wrote:
> (repost to mailing list, as my previous message attempt looked like
> plain-text but was actually html)
> 
> 
> 
> > Hi Lee!
> 
> > Thanks for the report.  After seeing how much frustration it has caused,
> > I propose this change.  Does it look good to you?
> 
> I don't wish to bike-shed this (as the current man-page is fine by me)
> and I have no idea on the style guide used by the man-pages, but if I
> was making the change I would replace the 'deprecated' on every
> integer specifier with "CAVEAT: SEE BUGS". That way the inexperienced
> reader is still frightened into using the function carefully. But if
> that kind of thing isn't allowed then the proposed patch looks good to
> me.

We could do that kind of thing.  There are pages where the first line in
the DESCRIPTION is something like 'Never use this function.' (that
exact text appears in gets(3)).

> 
> As a general point: A _lot_ of inexperienced users use this function
> to parse user input. At the start of every semester you see an influx
> of "why is my use of scanf broken?" posts on the various C and
> learn-programming based subreddits, as well as Stackoverflow.

Not exactly.  This page is only about sscanf(3), which is not as bad as
scanf(3).

For scanf(3), I've re-read the page after these discussions, and have
added some more text, documenting some of the problems:

-  <https://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/commit/?h=contrib&id=4ea602c6ab2716c00d189d28199a9236180d2145>

-  <https://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/commit/?h=contrib&id=8c3bd620bca7de41c9d3e28d73f09ec88fd52a86>

> I have
> no idea why but it seems there's a large body of professors out there
> teaching people to use scanf() instead of getc() or fgets() etc, so
> I'm of the opinion that the scanf() page needs to be as scary as
> possible :)

My guess is that the old manual page wasn't scary enough (if at all).

I've taken a few steps to try to prevent that.

Split [f]scanf(3) from sscanf(3).  The latter is not so bad, since it
doesn't need to differentiate newlines from other white space, and it
doesn't leave the unrecognized text in the input stream.
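
A rough sketch of why that matters: when sscanf(3) fails to match, the
caller still holds the whole line and can simply try another format.
The helper name parse_point() and the two input layouts are made up for
illustration.  Note that "%d" still has the out-of-range problem
discussed for the integer conversions, so this is not meant as a
recommendation for untrusted input.

```c
#include <stdio.h>

/*
 * parse_point() is a hypothetical helper, not part of any library.
 * Accept either "(x,y)" or "x y".  A failed sscanf(3) call consumes
 * nothing, because the input is a string that the caller still owns.
 * Text after the second number is not checked here.
 * Returns 0 on success, -1 if neither layout matches.
 */
static int
parse_point(const char *line, int *x, int *y)
{
	int  a, b;

	/* Parse into locals so *x and *y stay untouched on failure. */
	if (sscanf(line, "(%d,%d)", &a, &b) != 2
	    && sscanf(line, "%d %d", &a, &b) != 2)
	{
		return -1;
	}

	*x = a;
	*y = b;
	return 0;
}
```

With fscanf(3), the first failed attempt would have left the stream at
an unspecified point inside the line, and there would be no simple way
to retry from its start.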

So, the new page for sscanf(3) is the one that documents the conversions
and all that, while the new page for scanf(3) (and fscanf(3)) is shorter
and just recommends avoiding these functions altogether (but still
refers to sscanf(3) for documentation of the conversions).

> 
> Again, I know nothing about how man pages are written, but if it was
> documentation for legacy code I'd inherited I'd make sure to stress
> the following somewhere on the page:

We have man-pages(7) with a small style guide.

> 1. scanf() is intended to parse FORMATTED input, i.e. it consumes the
> kind of strings produced by printf(), and NOT user input. (I'm not
> 100% sure if K&R had that as their rationale, but that's the way it's
> designed now. Though this might confuse people into thinking they can
> use their similar, but not identical, format strings between printf
> and scanf!). Currently the word "format" or "formatted" barely
> appears. But it's this feature that distinguishes it from the other
> parsing functions.

Agree.  I've added this commit:

<https://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/commit/?h=contrib&id=bb4dbdb82f141f6394984aced67d65810ec7f747>
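
To illustrate the "similar, but not identical" point with one small
case: printf(3) prints a double with "%f", while sscanf(3) needs "%lf"
to store into a double ("%f" there expects a pointer to float).  The
helper name roundtrip() is made up for this sketch.

```c
#include <stdio.h>

/*
 * roundtrip() is a hypothetical helper, not part of any library.
 * Format a double with snprintf(3) and read it back with sscanf(3).
 * The two format strings differ: "%f" for output, "%lf" for input.
 * Returns 0 on success, -1 if the text could not be produced or parsed.
 */
static int
roundtrip(double in, double *out)
{
	char  buf[512];
	int   len;

	len = snprintf(buf, sizeof(buf), "%f", in);
	if (len < 0 || (size_t) len >= sizeof(buf))
		return -1;  /* Output error or truncated text. */

	if (sscanf(buf, "%lf", out) != 1)
		return -1;
	return 0;
}
```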

> 2. Things like fgets() are much better for consuming user input, which
> you can then parse with all the other functions.

That's already specified in scanf(3), in the first paragraph:

DESCRIPTION
     The scanf() family of functions scans input like  sscanf(3),  but
     read  from  a  FILE.  It is very difficult to use these functions
     correctly, and  it  is  preferable  to  read  entire  lines  with
     fgets(3)  or  getline(3)  and  parse them later with sscanf(3) or
     more specialized functions such as strtol(3).
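
A minimal sketch of that advice, assuming each line holds a single
decimal integer.  The helper names parse_long() and read_long(), and
the 64-byte buffer, are made up for illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * parse_long() is a hypothetical helper, not part of any library.
 * Parse a line holding one decimal integer with strtol(3).
 * Returns 0 and stores the value in *out, or -1 on any error.
 */
static int
parse_long(const char *line, long *out)
{
	char  *end;
	long  val;

	errno = 0;
	val = strtol(line, &end, 10);
	if (end == line)
		return -1;  /* No digits found. */
	if (errno == ERANGE)
		return -1;  /* Out of range for long; reported, not undefined. */

	/* Allow trailing white space (including the newline), nothing else. */
	end += strspn(end, " \t\n");
	if (*end != '\0')
		return -1;

	*out = val;
	return 0;
}

/*
 * read_long() is also hypothetical.  Read one whole line with fgets(3),
 * then parse it.  Returns -1 on EOF, read error, overlong line, or bad input.
 */
static int
read_long(FILE *stream, long *out)
{
	char  buf[64];

	if (fgets(buf, sizeof(buf), stream) == NULL)
		return -1;
	if (strchr(buf, '\n') == NULL && !feof(stream))
		return -1;  /* Line too long for buf; the rest is still unread. */

	return parse_long(buf, out);
}
```

A caller would use it as 'if (read_long(stdin, &n) == -1) ...'.  Since
the line has already been consumed, bad input can be reported and the
next line read without anything being left behind in the stream, unless
the line was too long for the buffer.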

Thanks,
Alex

-- 
<https://www.alejandro-colomar.es/>
