From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Eli Zaretskii Newsgroups: gmane.emacs.bugs Subject: bug#8545: issues with recent doprnt-related changes Date: Mon, 25 Apr 2011 12:00:44 +0300 Message-ID: <83tydmaeo3.fsf@gnu.org> References: <4DB50AB9.6060100@cs.ucla.edu> Reply-To: Eli Zaretskii NNTP-Posting-Host: lo.gmane.org X-Trace: dough.gmane.org 1303722509 5740 80.91.229.12 (25 Apr 2011 09:08:29 GMT) X-Complaints-To: usenet@dough.gmane.org NNTP-Posting-Date: Mon, 25 Apr 2011 09:08:29 +0000 (UTC) Cc: 8545@debbugs.gnu.org To: Paul Eggert Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Mon Apr 25 11:08:25 2011 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([140.186.70.17]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1QEHmK-00056K-Ky for geb-bug-gnu-emacs@m.gmane.org; Mon, 25 Apr 2011 11:08:24 +0200 Original-Received: from localhost ([::1]:54131 helo=lists2.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QEHmK-0005AV-1i for geb-bug-gnu-emacs@m.gmane.org; Mon, 25 Apr 2011 05:08:24 -0400 Original-Received: from eggs.gnu.org ([140.186.70.92]:56891) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QEHlY-0003sj-F8 for bug-gnu-emacs@gnu.org; Mon, 25 Apr 2011 05:07:37 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1QEHlX-0001ar-0R for bug-gnu-emacs@gnu.org; Mon, 25 Apr 2011 05:07:36 -0400 Original-Received: from debbugs.gnu.org ([140.186.70.43]:56746) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QEHlW-0001ah-Tj for bug-gnu-emacs@gnu.org; Mon, 25 Apr 2011 05:07:34 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.69) (envelope-from ) id 1QEHfD-0000G8-2A; Mon, 25 Apr 2011 05:01:03 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Eli Zaretskii Original-Sender: debbugs-submit-bounces@debbugs.gnu.org Resent-To: owner@debbugs.gnu.org Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Mon, 25 Apr 2011 09:01:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 8545 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: Original-Received: via spool by 8545-submit@debbugs.gnu.org id=B8545.1303722058984 (code B ref 8545); Mon, 25 Apr 2011 09:01:02 +0000 Original-Received: (at 8545) by debbugs.gnu.org; 25 Apr 2011 09:00:58 +0000 Original-Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QEHf7-0000Fp-ML for submit@debbugs.gnu.org; Mon, 25 Apr 2011 05:00:58 -0400 Original-Received: from mtaout22.012.net.il ([80.179.55.172]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QEHf4-0000Fb-Rs for 8545@debbugs.gnu.org; Mon, 25 Apr 2011 05:00:56 -0400 Original-Received: from conversion-daemon.a-mtaout22.012.net.il by a-mtaout22.012.net.il (HyperSendmail v2007.08) id <0LK700100AC6JZ00@a-mtaout22.012.net.il> for 8545@debbugs.gnu.org; Mon, 25 Apr 2011 12:00:48 +0300 (IDT) Original-Received: from HOME-C4E4A596F7 ([77.127.55.52]) by a-mtaout22.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0LK7000E3AD4L3D0@a-mtaout22.012.net.il>; Mon, 25 Apr 2011 12:00:48 +0300 (IDT) In-reply-to: <4DB50AB9.6060100@cs.ucla.edu> X-012-Sender: halo1@inter.net.il X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list Resent-Date: Mon, 25 Apr 2011 05:01:03 -0400 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 140.186.70.43 X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Original-Sender: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.bugs:45985 Archived-At: > Date: Sun, 24 Apr 2011 22:46:33 -0700 > From: Paul Eggert > CC: Eli Zaretskii > > This is a followup to Bug#8435. Eli invited me to review the recent > doprnt-related changes, so here's a quick review: Thanks for the review and the comments. > * doprnt returns size_t. But Stefan wrote that he prefers sizes to be > signed values, and doprnt always returns a value that can fit in > EMACS_INT. So shouldn't doprnt return EMACS_INT, as it did before? I made it return size_t because all the related variables in verror are size_t, and I didn't want to mix signed with unsigned. AFAIU, the preference to use signed is for those values that come from Lisp or go back to the Lisp level, which is not the case here. But I will let Stefan comment on this. Changing doprnt to return a signed value, and making the respective changes in verror, would be trivial, and I won't mind doing that, if that's the verdict. > * doprnt supports only a small subset of the standard printf formats, > but this subset is not documented. It's unclear what the subset is. > Or it's a superset of the subset, with %S and %l? Anyway, this > should be documented clearly in the lead comment. I added such a documentation. > * I suggest that every feature in doprnt be a feature that is actually > needed and used; this will simplify maintainance. I agree, but I didn't add any features, except the support for %ld, which is surely needed for error messages that show EMACS_INT values. All the rest was already there in the original code of doprnt. I see no reason to remove that code just because no error message currently uses some of those features. According to "bzr annotate", most of the related code in doprnt survived largely untouched since the 1990s, except some cleanup. This is no guarantee of being bug-free, of course, but it does have _some_ weight in my eyes. > * Format strings never include embedded null bytes, so there's > no need for doprnt to support that. Potentially, someone could call `error' with its first argument taken from a Lisp string, which could include null characters. But again, this feature was there to begin with, and I see no particular need to remove it. > * If the format string is too long, the alloca inside doprnt will > crash Emacs on some hosts. You are right. I modified doprnt to use SAFE_ALLOCA instead. > I suggest removing the alloca, > instituting a fixed size limit on format specifiers, and documenting > that limit. Since user code cannot ever supply one of these > formats, that should be good enough. GNU coding standards frown on arbitrary limits, so I didn't want to take that route, what with SAFE_ALLOCA readily available and easy to use. > * The width features of doprnt (e.g., %25s) are never used. Again, an old feature that I see no reasons to remove. And, since doprnt produces error messages meant to be displayed, I find that this feature actually makes sense. > - doprnt mishandles format specifications such as %0.0.0d. > It passes them off to printf, and this results in undefined > behavior, near as I can tell. Since both error and verror are now marked as ATTRIBUTE_FORMAT_PRINTF, the compiler will detect such invalid formats and flag them. If the warning is disregarded, the result of such a format is just a somewhat illegible message. In any case, vsnprintf would do the same, right? > - doprnt uses atoi (&fmtcpy[1]), but surely this isn't right if > there are flags such as '-'. Why not? In that case, atoi will produce a negative value for `width', which is already handled by the code. If I'm missing something, please point out the specific problems with that. > - Quite possibly there are other problems in this area, but I > didn't want to spend further time reviewing a never-used feature. I did read that code. It looked solid to me, but if you or someone else see specific problems, please point them out. > * In this code, in verror: > > used = doprnt (buffer, size, m, m + mlen, ap); > > /* Note: the -1 below is because `doprnt' returns the number of bytes > excluding the terminating null byte, and it always terminates with a > null byte, even when producing a truncated message. */ > if (used < size - 1) > break; > > I don't see the reason for the "- 1". If you replace this with: > > used = doprnt (buffer, size, m, m + mlen, ap); > > if (used < size) > break; > > the code should still work, because, when used < size, the buffer > should be properly null-terminated. If it isn't then there's something > wrong with doprnt, no? As the comment says, doprnt always null-terminates the result, even if the result is truncated, and it never returns a value larger than the buffer size it was given. (In that, it differs from vsnprintf, which can return larger values.) When doprnt does truncate the output string, it returns `size - 1'; if we compare against `size', we will happily bail out of the loop, and never try to enlarge the buffer. I saw no reason to enhance doprnt to continue processing the format string and the arguments once the buffer is exhausted. So I modified verror instead to DTRT. > * In this code, in verror: > > else if (size < size_max - 1) > size = size_max - 1; > > there's no need for the "- 1"s. Just use this: > > else if (size < size_max) > size = size_max; I made that change, thanks. The reason I originally limited to `size_max - 1' is that the games you play with the maximum size, viz.: size_t size_max = min (MOST_POSITIVE_FIXNUM, min (INT_MAX, SIZE_MAX - 1)) + 1; are neither clear nor commented. E.g., why the second `min'? could INT_MAX be ever larger than SIZE_MAX-1? if so, what does that mean in terms of relation between `int' and `size_t' on such a platform? I'm only familiar with a very small number of architectures, where all these tricks are unnecessary. When I see such code, it makes me dizzy and unsure of what I may be missing. So I opted for a safer way out (since error messages as long as SIZE_MAX are only theoretically possible), that would not risk overflowing signed values into the sign bit. Perhaps in the future you could comment such obscure code to make it understandable by mere mortals such as myself. > * This code in verror: > > if (buffer == buf) > buffer = (char *) xmalloc (size); > else > buffer = (char *) xrealloc (buffer, size); > > uses xrealloc, which is unnecessarily expensive, as it may copy the > buffer's contents even though they are entirely garbage here. Use > this instead, to avoid the useless copy: > > if (buffer != buf) > xfree (buffer); > buffer = (char *) xmalloc (size); You are right, I made that change. I believe this takes care of all the imminent problems you discovered in your review. Nevertheless, I will leave this bug report open for a while, to allow you and others to come up with more problems and suggestions for improvements. Thanks again for taking the time to review and comment.