From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Paul Eggert Newsgroups: gmane.emacs.bugs Subject: bug#8545: issues with recent doprnt-related changes Date: Mon, 25 Apr 2011 23:02:25 -0700 Organization: UCLA Computer Science Department Message-ID: <4DB65FF1.5010003@cs.ucla.edu> References: <4DB50AB9.6060100@cs.ucla.edu> <83tydmaeo3.fsf@gnu.org> NNTP-Posting-Host: lo.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Trace: dough.gmane.org 1303798069 27327 80.91.229.12 (26 Apr 2011 06:07:49 GMT) X-Complaints-To: usenet@dough.gmane.org NNTP-Posting-Date: Tue, 26 Apr 2011 06:07:49 +0000 (UTC) Cc: 8545@debbugs.gnu.org To: Eli Zaretskii Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Tue Apr 26 08:07:45 2011 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([140.186.70.17]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1QEbR3-00026g-8R for geb-bug-gnu-emacs@m.gmane.org; Tue, 26 Apr 2011 08:07:45 +0200 Original-Received: from localhost ([::1]:41916 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QEbR2-0008U9-MQ for geb-bug-gnu-emacs@m.gmane.org; Tue, 26 Apr 2011 02:07:44 -0400 Original-Received: from eggs.gnu.org ([140.186.70.92]:49682) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QEbR0-0008U4-5m for bug-gnu-emacs@gnu.org; Tue, 26 Apr 2011 02:07:43 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1QEbQy-0006B3-QJ for bug-gnu-emacs@gnu.org; Tue, 26 Apr 2011 02:07:42 -0400 Original-Received: from debbugs.gnu.org ([140.186.70.43]:38641) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QEbQy-0006Az-Oc for bug-gnu-emacs@gnu.org; Tue, 26 Apr 2011 02:07:40 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.69) (envelope-from ) id 1QEbMT-0004fm-VY; Tue, 26 Apr 2011 02:03:02 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Paul Eggert Original-Sender: debbugs-submit-bounces@debbugs.gnu.org Resent-To: owner@debbugs.gnu.org Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Tue, 26 Apr 2011 06:03:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 8545 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: Original-Received: via spool by 8545-submit@debbugs.gnu.org id=B8545.130379775517929 (code B ref 8545); Tue, 26 Apr 2011 06:03:01 +0000 Original-Received: (at 8545) by debbugs.gnu.org; 26 Apr 2011 06:02:35 +0000 Original-Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QEbM2-0004f8-BA for submit@debbugs.gnu.org; Tue, 26 Apr 2011 02:02:34 -0400 Original-Received: from smtp.cs.ucla.edu ([131.179.128.62]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QEbM0-0004ex-6x for 8545@debbugs.gnu.org; Tue, 26 Apr 2011 02:02:33 -0400 Original-Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id B604E39E80F5; Mon, 25 Apr 2011 23:02:26 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Original-Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Len4N+F1+bZG; Mon, 25 Apr 2011 23:02:26 -0700 (PDT) Original-Received: from [192.168.1.10] (pool-71-189-109-235.lsanca.fios.verizon.net [71.189.109.235]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id E0ED639E8083; Mon, 25 Apr 2011 23:02:25 -0700 (PDT) User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.14) Gecko/20110223 Thunderbird/3.1.8 In-Reply-To: <83tydmaeo3.fsf@gnu.org> X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list Resent-Date: Tue, 26 Apr 2011 02:03:01 -0400 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 140.186.70.43 X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Original-Sender: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.bugs:46009 Archived-At: On 04/25/11 02:00, Eli Zaretskii wrote: >> * Format strings never include embedded null bytes, so there's >> no need for doprnt to support that. > > Potentially, someone could call `error' with its first argument taken > from a Lisp string, which could include null characters. But again, > this feature was there to begin with, and I see no particular need to > remove it. The feature is buggy, because the code does not check fmt versus fmt_end every time it increases fmt; it checks only sometimes. Hence one can construct examples where doprnt will overrun the format, e.g., by having '%l' at the end of the format. If doprnt were written to expect a null-terminated string, which is what all its callers pass anyway, it would be simpler and easier to maintain and would not have this problem. "%l" is a strange case anyway, since one cannot reliably use "%l" as an alias for "%d". For example, the format "%dx" prints an integer followed by an 'x', but if you try to use "%lx" instead, it doesn't work. At least, we should remove "%l" as a format specifier, as it's a rightly-unused feature and it's just asking for trouble to try to support it. This should also fix the format-overrun bug mentioned earlier. >> * If the format string is too long, the alloca inside doprnt will >> crash Emacs on some hosts. > > You are right. I modified doprnt to use SAFE_ALLOCA instead. There's no need for alloca or SAFE_ALLOCA or xmalloc or any dynamic allocator. Instead, convert any width and precision values to integers, and use "*". For example, if the caller specifies this: "%012345.6789g", 3.14 pass this to sprintf: "%0*.*g", 12345, 6789, 3.14 That way, the format string itself has easily-bounded size and the code never needs to use alloca or xmalloc or whatever. > Since both error and verror are now marked as ATTRIBUTE_FORMAT_PRINTF, > the compiler will detect such invalid formats and flag them. Sure, but in that case why maintain code to implement the two formats (%S and %l) that are flagged invalid and are never used and should never be used? >> - doprnt uses atoi (&fmtcpy[1]), but surely this isn't right if >> there are flags such as '-'. > > Why not? In that case, atoi will produce a negative value for > `width', which is already handled by the code. If I'm missing > something, please point out the specific problems with that. I don't see how the negative value is handled correctly. %-10s means to print a string right-justified, but the code surely treats it as if it were %0s. And other flags are possible, e.g., atoi will parse "%0-3d" as if the width were zero, but the width is 3 (the "0" is a flag). A quick second scan found a minor bug in size parsing: the expression "n >= SIZE_MAX / 10" should be "n > SIZE_MAX / 10". > The reason I originally limited to `size_max - 1' is that the games > you play with the maximum size, viz.: > > size_t size_max = > min (MOST_POSITIVE_FIXNUM, min (INT_MAX, SIZE_MAX - 1)) + 1; > > are neither clear nor commented. E.g., why the second `min'? could > INT_MAX be ever larger than SIZE_MAX-1? if so, what does that mean in > terms of relation between `int' and `size_t' on such a platform? The C Standard allows INT_MAX to be larger than SIZE_MAX - 1, yes. I don't know of any current targets with that property, but it didn't hurt to be safe. The second 'min' was needed because vsnprintf can't create a string longer than INT_MAX bytes. Since doprnt doesn't have that silly limit, the above line should be changed to something like the following (this time with a comment :-): /* Limit the string to sizes that both Emacs and size_t can represent. */ size_t size_max = min (MOST_POSITIVE_FIXNUM + 1, SIZE_MAX); > You are right, I made that change. Thanks, can you make a similar change inside doprint? It also uses xrealloc where xfree+xmalloc would be better. One other thing, the documentation says that lower-case l is a flag, but it's a length modifer and not a flag. It must be specified after the precision (if given) and immediately before the conversion specifier character, and it cannot be intermixed with flags like 0 and - and +.