From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Paul Eggert Newsgroups: gmane.emacs.devel,gmane.comp.gnu.utils.bugs Subject: Re: diff-mode misinterprets empty lines. Date: Tue, 04 Dec 2007 23:35:08 -0800 Message-ID: <87ir3d1tn7.fsf@penguin.cs.ucla.edu> References: <85ir3l767y.fsf@lola.goethe.zz> NNTP-Posting-Host: lo.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: ger.gmane.org 1196840136 27346 80.91.229.12 (5 Dec 2007 07:35:36 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Wed, 5 Dec 2007 07:35:36 +0000 (UTC) Cc: jim@meyering.net, bug-gnu-utils@gnu.org, rms@gnu.org, emacs-devel@gnu.org To: Stefan Monnier Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Wed Dec 05 08:35:45 2007 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([199.232.76.165]) by lo.gmane.org with esmtp (Exim 4.50) id 1IzonG-0008G1-Jg for ged-emacs-devel@m.gmane.org; Wed, 05 Dec 2007 08:35:43 +0100 Original-Received: from localhost ([127.0.0.1] helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Izomz-0000QP-R2 for ged-emacs-devel@m.gmane.org; Wed, 05 Dec 2007 02:35:25 -0500 Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1Izomu-0000Ma-F7 for emacs-devel@gnu.org; Wed, 05 Dec 2007 02:35:20 -0500 Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1Izomt-0000LB-VJ for emacs-devel@gnu.org; Wed, 05 Dec 2007 02:35:20 -0500 Original-Received: from [199.232.76.173] (helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Izomt-0000L0-4O; Wed, 05 Dec 2007 02:35:19 -0500 Original-Received: from kiwi.cs.ucla.edu ([131.179.128.19]) by monty-python.gnu.org with esmtps (TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.60) (envelope-from ) id 1Izomn-0003Ef-VX; Wed, 05 Dec 2007 02:35:14 -0500 Original-Received: from penguin.cs.ucla.edu (Penguin.CS.UCLA.EDU [131.179.64.200]) by kiwi.cs.ucla.edu (8.13.8+Sun/8.13.8/UCLACS-6.0) with ESMTP id lB57Z8Oe020679; Tue, 4 Dec 2007 23:35:09 -0800 (PST) Original-Received: from eggert by penguin.cs.ucla.edu with local (Exim 4.63) (envelope-from ) id 1Izomi-0004A5-EX; Tue, 04 Dec 2007 23:35:08 -0800 In-Reply-To: (Stefan Monnier's message of "Thu\, 29 Nov 2007 11\:09\:41 -0500") User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.1 (gnu/linux) X-detected-kernel: by monty-python.gnu.org: Solaris 10 (beta) X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.devel:84718 gmane.comp.gnu.utils.bugs:15302 Archived-At: Stefan Monnier writes: > Could you please revert this change? I did so, by installing the following patch into diffutils. The old behavior (with trailing blanks) is now the default, and the new behavior (without them) is enabled only if the new --suppress-blank-empty option is given. However, the next POSIX version specifies a "diff -u" format in which those trailing blanks are optional, and apparently some "diff" implementations in the wild omit them even though no official or test version of GNU diff ever did. So, as a practical matter, programs that parse "diff" output should not assume the trailing blanks are present, regardless of what GNU diff does. 2007-12-04 Paul Eggert * NEWS: New diff option --suppress-blank-empty (no longer -u default). * doc/diff.texi (Detailed Unified): Trailing spaces are no longer omitted. (Trailing Blanks): New section. (diff Options) Mention new option. * src/diff.h (suppress_blank_entry): New decl. * src/context.c (pr_unidiff_hunk): Support --suppress-blank-empty. * src/util.c (print_1_line): Likewise. * src/diff.c (longopts, main, option_help_msgid): Likewise. (SUPPRESS_BLANK_EMPTY_OPTION): New constant. Index: NEWS =================================================================== RCS file: /cvsroot/diffutils/diffutils/NEWS,v retrieving revision 1.27 diff -u -p -r1.27 NEWS --- NEWS 19 Jul 2007 17:45:26 -0000 1.27 +++ NEWS 5 Dec 2007 07:24:10 -0000 @@ -1,7 +1,6 @@ User-visible changes since 2.8.7 (in "version" 2.8.7-cvs): -* When reporting an unchanged empty line, diff -u now outputs an - empty line instead of a line consisting of a single space character. +* New diff option --suppress-blank-empty. * Bring back support for `diff -NUM', where NUM is a number, even when conforming to POSIX 1003.1-2001. This change reverts to Index: doc/diff.texi =================================================================== RCS file: /cvsroot/diffutils/diffutils/doc/diff.texi,v retrieving revision 1.33 diff -u -p -r1.33 diff.texi --- doc/diff.texi 15 Aug 2007 19:33:36 -0000 1.33 +++ doc/diff.texi 5 Dec 2007 07:24:11 -0000 @@ -842,8 +842,7 @@ line numbers look like @samp{@var{start} its end line number appears. An empty hunk is considered to end at the line that precedes the hunk. -Lines common to both files begin with a space character, except that -the space is omitted if its line is empty. Lines +The lines common to both files begin with a space character. The lines that actually differ between the two files have one of the following indicator characters in the left print column: @@ -1854,8 +1853,9 @@ specified patterns. These adjustments can be applied to any output format. @menu -* Tabs:: Preserving the alignment of tab stops. -* Pagination:: Page numbering and time-stamping @command{diff} output. +* Tabs:: Preserving the alignment of tab stops. +* Trailing Blanks:: Suppressing blanks before empty output lines. +* Pagination:: Page numbering and time-stamping @command{diff} output. @end menu @node Tabs @@ -1890,6 +1890,29 @@ output format, which does not have a spa type indicator character. Select this method with the @option{-T} or @option{--initial-tab} option. +@node Trailing Blanks +@section Omitting trailing blanks +@cindex trailing blanks +When outputting lines in normal or context format, or outputting an +unchanged line in unified format, @command{diff} normally outputs a +blank just before each line. If the line is empty, the output of +@command{diff} therefore contains trailing blanks even though the +input does not contain them. For example, when outputting an +unchanged empty line in context format, @command{diff} normally +outputs a line with two leading spaces. + +Some text editors and email agents routinely delete trailing blanks, +so it can be a problem to deal with diff output files that contain +them. You can avoid this problem with the +@option{--suppress-blank-empty} option. It causes @command{diff} to +omit trailing blanks at the end of output lines in normal, context, +and unified format, unless the trailing blanks were already present in +the input. This changes the output format slightly, so that output +lines are guaranteed to never end in a blank unless an input line ends +in a blank. This format is less likely to be munged by text editors +or by transmission via email. It is accepted by @acronym{GNU} +@command{patch} as well. + @node Pagination @section Paginating @command{diff} Output @cindex paginating @command{diff} output @@ -3878,6 +3901,11 @@ normal. @xref{Tabs}. Assume that tab stops are set every @var{columns} (default 8) print columns. @xref{Tabs}. +@item --suppress-blank-empty +Suppress any blanks before newlines when printing the representation +of an empty line, when outputting normal, context, or unified format. +@xref{Trailing Blanks}. + @item --to-file=@var{file} Compare each operand to @var{file}; @var{file} may be a directory. Index: src/diff.h =================================================================== RCS file: /cvsroot/diffutils/diffutils/src/diff.h,v retrieving revision 1.28 diff -u -p -r1.28 diff.h --- src/diff.h 19 Jul 2007 17:45:29 -0000 1.28 +++ src/diff.h 5 Dec 2007 07:24:11 -0000 @@ -150,6 +150,9 @@ XTERN size_t tabsize; without changing the characters in it (-T). */ XTERN bool initial_tab; +/* Do not output an initial space or tab before the text of an empty line. */ +XTERN bool suppress_blank_empty; + /* Remove trailing carriage returns from input. */ XTERN bool strip_trailing_cr; Index: src/context.c =================================================================== RCS file: /cvsroot/diffutils/diffutils/src/context.c,v retrieving revision 1.23 diff -u -p -r1.23 context.c --- src/context.c 19 Jul 2007 17:45:28 -0000 1.23 +++ src/context.c 5 Dec 2007 07:24:11 -0000 @@ -350,7 +350,7 @@ pr_unidiff_hunk (struct change *hunk) if (!next || i < next->line0) { char const *const *line = &files[0].linbuf[i++]; - if (**line != '\n') + if (! (suppress_blank_empty && **line == '\n')) putc (initial_tab ? '\t' : ' ', out); print_1_line (NULL, line); j++; @@ -362,10 +362,11 @@ pr_unidiff_hunk (struct change *hunk) k = next->deleted; while (k--) { + char const * const *line = &files[0].linbuf[i++]; putc ('-', out); - if (initial_tab) + if (initial_tab && ! (suppress_blank_empty && **line == '\n')) putc ('\t', out); - print_1_line (NULL, &files[0].linbuf[i++]); + print_1_line (NULL, line); } /* Then output the inserted part. */ @@ -373,10 +374,11 @@ pr_unidiff_hunk (struct change *hunk) k = next->inserted; while (k--) { + char const * const *line = &files[1].linbuf[j++]; putc ('+', out); - if (initial_tab) + if (initial_tab && ! (suppress_blank_empty && **line == '\n')) putc ('\t', out); - print_1_line (NULL, &files[1].linbuf[j++]); + print_1_line (NULL, line); } /* We're done with this hunk, so on to the next! */ Index: src/util.c =================================================================== RCS file: /cvsroot/diffutils/diffutils/src/util.c,v retrieving revision 1.37 diff -u -p -r1.37 util.c --- src/util.c 15 Aug 2007 19:32:22 -0000 1.37 +++ src/util.c 5 Dec 2007 07:24:11 -0000 @@ -506,7 +506,8 @@ print_script (struct change *script, /* Print the text of a single line LINE, flagging it with the characters in LINE_FLAG (which say whether - the line is inserted, deleted, changed, etc.). */ + the line is inserted, deleted, changed, etc.). LINE_FLAG must not + end in a blank, unless it is a single blank. */ void print_1_line (char const *line_flag, char const *const *line) @@ -517,12 +518,25 @@ print_1_line (char const *line_flag, cha /* If -T was specified, use a Tab between the line-flag and the text. Otherwise use a Space (as Unix diff does). - Print neither space nor tab if line-flags are empty. */ + Print neither space nor tab if line-flags are empty. + But omit trailing blanks if requested. */ if (line_flag && *line_flag) { - flag_format = initial_tab ? "%s\t" : "%s "; - fprintf (out, flag_format, line_flag); + char const *flag_format_1 = flag_format = initial_tab ? "%s\t" : "%s "; + char const *line_flag_1 = line_flag; + + if (suppress_blank_empty && **line == '\n') + { + flag_format_1 = "%s"; + + /* This hack to omit trailing blanks takes advantage of the + fact that the only way that LINE_FLAG can end in a blank + is when LINE_FLAG consists of a single blank. */ + line_flag_1 += *line_flag_1 == ' '; + } + + fprintf (out, flag_format_1, line_flag_1); } output_1_line (base, limit, flag_format, line_flag); Index: src/diff.c =================================================================== RCS file: /cvsroot/diffutils/diffutils/src/diff.c,v retrieving revision 1.46 diff -u -p -r1.46 diff.c --- src/diff.c 15 Aug 2007 19:32:22 -0000 1.46 +++ src/diff.c 5 Dec 2007 07:24:11 -0000 @@ -110,6 +110,7 @@ enum NORMAL_OPTION, SDIFF_MERGE_ASSIST_OPTION, STRIP_TRAILING_CR_OPTION, + SUPPRESS_BLANK_EMPTY_OPTION, SUPPRESS_COMMON_LINES_OPTION, TABSIZE_OPTION, TO_FILE_OPTION, @@ -187,6 +188,7 @@ static struct option const longopts[] = {"speed-large-files", 0, 0, 'H'}, {"starting-file", 1, 0, 'S'}, {"strip-trailing-cr", 0, 0, STRIP_TRAILING_CR_OPTION}, + {"suppress-blank-empty", 0, 0, SUPPRESS_BLANK_EMPTY_OPTION}, {"suppress-common-lines", 0, 0, SUPPRESS_COMMON_LINES_OPTION}, {"tabsize", 1, 0, TABSIZE_OPTION}, {"text", 0, 0, 'a'}, @@ -565,6 +567,10 @@ main (int argc, char **argv) strip_trailing_cr = true; break; + case SUPPRESS_BLANK_EMPTY_OPTION: + suppress_blank_empty = true; + break; + case SUPPRESS_COMMON_LINES_OPTION: suppress_common_lines = true; break; @@ -886,6 +892,7 @@ static char const * const option_help_ms N_("-t --expand-tabs Expand tabs to spaces in output."), N_("-T --initial-tab Make tabs line up by prepending a tab."), N_("--tabsize=NUM Tab stops are every NUM (default 8) print columns."), + N_("--suppress-blank-empty Suppress space or tab before empty output lines."), "", N_("-r --recursive Recursively compare any subdirectories found."), N_("-N --new-file Treat absent files as empty."),