From: Paul Eggert <eggert@cs.ucla.edu>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 8545@debbugs.gnu.org
Subject: bug#8545: issues with recent doprnt-related changes
Date: Wed, 27 Apr 2011 16:51:06 -0700 [thread overview]
Message-ID: <4DB8ABEA.3080503@cs.ucla.edu> (raw)
In-Reply-To: <83aafb8p4a.fsf@gnu.org>
On 04/27/11 12:34, Eli Zaretskii wrote:
> I added more checks, thanks for pointing this out.
Thanks, but I don't see the need for this newly-added check:
if (fmt > format_end)
fmt = format_end;
If fmt is actually greater than format_end, it's pointing past the end
of an object, so the C code is relying on undefined behavior and the
check therefore isn't portable. But how can fmt ever be greater than
format_end here? I suggest removing the check.
> ??? %-10s means to print a string LEFT-justified, and the code handles
> that in this loop...
> for %s and %c the "0" flag is not supported anyway (as stated in the
> comments) and GCC flags that with a warning.
Good points both; thanks.
>> A quick second scan found a minor bug in size parsing: the
>> expression "n >= SIZE_MAX / 10" should be "n > SIZE_MAX / 10".
>
> When they get to messages as long as SIZE_MAX, let them sue me for
> taking away one byte.
It's not a question of saving space at run-time. It's a question of
helping the reader. The reader is left wondering: why is that ">="
there? And why is there another test "n * 10 > SIZE_MAX - (fmt[1] -
'0')" that always returns 0, no matter what? Code like that will
puzzle future maintainers, who may well mess it up later because they
don't know what's going on. Also, most printf implementations will
mess up if given a width or precision greater than INT_MAX, so I
suggest not allowing widths or precisions greater than that. In summary,
I suggest replacing this:
if (n >= SIZE_MAX / 10
|| n * 10 > SIZE_MAX - (fmt[1] - '0'))
error ("Format width or precision too large");
with this:
/* Avoid int overflow, because many sprintfs seriously mess up
with widths or precisions greater than INT_MAX. Avoid size_t
overflow, since our counters use size_t. This test is slightly
conservative, for speed and simplicity. */
if (n >= min (INT_MAX, SIZE_MAX) / 10)
error ("Format width or precision too large");
> "MOST_POSITIVE_FIXNUM + 1" is too much, since MOST_POSITIVE_FIXNUM
> should be able to cover the terminating null character in Emacs.
Why? Emacs size fields count the bytes in the string, and does not
count the terminating null byte (which is not part of the string).
In briefly reviewing the new version, I found that doprnt
doesn't support the "ll" modifier for "long long", and thus wouldn't
port to platforms that use long long for EMACS_INT. This is easy to
fix, so I installed a fix for that. I also found three more problems:
* doprnt invokes strlen to find the length of the format. The
vsnprintf code didn't need to do that: it traversed the format once.
Surely it shouldn't be hard to change doprnt so that it traverses
the format once rather than twice.
* Sometimes verror will incorrectly truncate a string, even when there
is plenty of memory. verror might call doprnt (buffer, SIZE, m, m +
mlen, ap), and doprnt might discover that a multibyte character is
chopped in half at the end of the output buffer, and might return
(say) SIZE - 2. verror will incorrectly conclude that the output
was just fine, but it wasn't complete.
* verror might invoke doprnt two or more times, which means that
doprnt will traverse ap twice. This does not work in general; the C
standard is quite clear that the behavior is undefined in this case.
One way to fix this would be to modify doprnt so that it always
walks through the format completely, and generates all the output
that it is going to generate, and that it reallocates the output
buffer as needed as it goes. This would require an API change.
next prev parent reply other threads:[~2011-04-27 23:51 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-04-25 5:46 bug#8545: issues with recent doprnt-related changes Paul Eggert
2011-04-25 9:00 ` Eli Zaretskii
2011-04-25 13:37 ` Stefan Monnier
2011-04-26 20:25 ` Paul Eggert
2011-04-27 1:14 ` Stefan Monnier
2011-04-26 6:02 ` Paul Eggert
2011-04-27 19:34 ` Eli Zaretskii
2011-04-27 23:51 ` Paul Eggert [this message]
2011-04-28 1:32 ` Juanma Barranquero
2011-04-28 3:11 ` Paul Eggert
2011-04-28 3:42 ` Juanma Barranquero
2011-04-28 5:06 ` Paul Eggert
2011-04-28 5:15 ` Eli Zaretskii
2011-04-28 5:29 ` Paul Eggert
2011-04-28 6:10 ` Eli Zaretskii
2011-04-28 6:42 ` Paul Eggert
2011-04-28 7:26 ` Eli Zaretskii
2011-04-28 7:54 ` Paul Eggert
2011-04-28 11:14 ` Eli Zaretskii
2011-04-29 12:28 ` Richard Stallman
2011-04-29 19:56 ` Eli Zaretskii
2011-04-29 23:49 ` Paul Eggert
2011-04-30 21:03 ` Richard Stallman
2011-05-01 5:41 ` Paul Eggert
2011-05-01 23:59 ` Richard Stallman
2011-05-02 0:23 ` Paul Eggert
[not found] ` <E1QH37h-0001yM-HR@fencepost.gnu.org>
2011-05-03 20:24 ` Paul Eggert
2011-05-01 4:25 ` Jason Rumney
2011-05-01 5:56 ` Paul Eggert
2011-05-01 8:12 ` Jason Rumney
2011-05-01 11:02 ` Andreas Schwab
2011-04-28 5:02 ` Eli Zaretskii
2011-04-28 5:50 ` Eli Zaretskii
[not found] ` <4DB9146D.2040702@cs.ucla.edu>
[not found] ` <E1QFQVO-0004Dq-6o@fencepost.gnu.org>
[not found] ` <4DB9E5FF.9020506@cs.ucla.edu>
2011-04-29 11:16 ` Eli Zaretskii
2011-04-29 14:41 ` Paul Eggert
2011-04-29 19:35 ` Eli Zaretskii
2011-04-29 20:32 ` Paul Eggert
2011-04-30 8:59 ` Eli Zaretskii
2011-05-04 7:28 ` Paul Eggert
2011-05-04 9:52 ` Eli Zaretskii
2011-05-04 14:56 ` Paul Eggert
[not found] ` <4DC1692B.1090101@cs.ucla.edu>
2011-05-05 20:36 ` Eli Zaretskii
[not found] ` <83ei4cnau6.fsf@gnu.org>
2011-05-06 13:33 ` Stefan Monnier
[not found] ` <jwvsjss2bz3.fsf-monnier+emacs@gnu.org>
2011-05-06 14:41 ` Paul Eggert
2011-05-06 15:03 ` Eli Zaretskii
[not found] ` <83vcxnlvl9.fsf@gnu.org>
2011-05-06 17:13 ` Stefan Monnier
[not found] ` <jwv8vuj21q0.fsf-monnier+emacs@gnu.org>
2011-05-06 19:57 ` Eli Zaretskii
[not found] ` <83k4e3lhzp.fsf@gnu.org>
2011-05-07 3:18 ` Stefan Monnier
[not found] ` <jwvr58byz9s.fsf-monnier+emacs@gnu.org>
2011-05-07 7:55 ` Eli Zaretskii
-- strict thread matches above, loose matches on Subject: below --
2011-05-01 18:19 bug#8601: * 2 -> * 4 typo fix in detect_coding_charset Paul Eggert
2011-05-01 19:06 ` Andreas Schwab
2011-05-01 19:25 ` Paul Eggert
2011-05-06 7:29 ` bug#8601: Merged fixes for 8600, 8601, 8602, and (partially) for 8545 Paul Eggert
2020-09-14 12:37 ` bug#8545: " Lars Ingebrigtsen
2020-09-14 18:41 ` Eli Zaretskii
2020-09-16 2:01 ` Paul Eggert
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/emacs/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4DB8ABEA.3080503@cs.ucla.edu \
--to=eggert@cs.ucla.edu \
--cc=8545@debbugs.gnu.org \
--cc=eliz@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).