unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Paul Eggert <eggert@CS.UCLA.EDU>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: jim@meyering.net, bug-gnu-utils@gnu.org, rms@gnu.org,
	emacs-devel@gnu.org
Subject: Re: diff-mode misinterprets empty lines.
Date: Tue, 04 Dec 2007 23:35:08 -0800	[thread overview]
Message-ID: <87ir3d1tn7.fsf@penguin.cs.ucla.edu> (raw)
In-Reply-To: <jwv4pf5598v.fsf-monnier+emacs@gnu.org> (Stefan Monnier's message of "Thu\, 29 Nov 2007 11\:09\:41 -0500")

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> Could you please revert this change?

I did so, by installing the following patch into diffutils.  The old
behavior (with trailing blanks) is now the default, and the new
behavior (without them) is enabled only if the new
--suppress-blank-empty option is given.

However, the next POSIX version specifies a "diff -u" format in which
those trailing blanks are optional, and apparently some "diff"
implementations in the wild omit them even though no official or test
version of GNU diff ever did.  So, as a practical matter, programs
that parse "diff" output should not assume the trailing blanks are
present, regardless of what GNU diff does.

2007-12-04  Paul Eggert  <eggert@cs.ucla.edu>

	* NEWS: New diff option --suppress-blank-empty (no longer -u default).
	* doc/diff.texi (Detailed Unified): Trailing spaces are no longer
	omitted.
	(Trailing Blanks): New section.
	(diff Options) Mention new option.
	* src/diff.h (suppress_blank_entry): New decl.
	* src/context.c (pr_unidiff_hunk): Support --suppress-blank-empty.
	* src/util.c (print_1_line): Likewise.
	* src/diff.c (longopts, main, option_help_msgid): Likewise.
	(SUPPRESS_BLANK_EMPTY_OPTION): New constant.

Index: NEWS
===================================================================
RCS file: /cvsroot/diffutils/diffutils/NEWS,v
retrieving revision 1.27
diff -u -p -r1.27 NEWS
--- NEWS	19 Jul 2007 17:45:26 -0000	1.27
+++ NEWS	5 Dec 2007 07:24:10 -0000
@@ -1,7 +1,6 @@
 User-visible changes since 2.8.7 (in "version" 2.8.7-cvs):

-* When reporting an unchanged empty line, diff -u now outputs an
-  empty line instead of a line consisting of a single space character.
+* New diff option --suppress-blank-empty.

 * Bring back support for `diff -NUM', where NUM is a number,
   even when conforming to POSIX 1003.1-2001.  This change reverts to
Index: doc/diff.texi
===================================================================
RCS file: /cvsroot/diffutils/diffutils/doc/diff.texi,v
retrieving revision 1.33
diff -u -p -r1.33 diff.texi
--- doc/diff.texi	15 Aug 2007 19:33:36 -0000	1.33
+++ doc/diff.texi	5 Dec 2007 07:24:11 -0000
@@ -842,8 +842,7 @@ line numbers look like @samp{@var{start}
 its end line number appears.  An empty hunk is considered to end at
 the line that precedes the hunk.

-Lines common to both files begin with a space character, except that
-the space is omitted if its line is empty.  Lines
+The lines common to both files begin with a space character.  The lines
 that actually differ between the two files have one of the following
 indicator characters in the left print column:

@@ -1854,8 +1853,9 @@ specified patterns.
 These adjustments can be applied to any output format.

 @menu
-* Tabs::       Preserving the alignment of tab stops.
-* Pagination:: Page numbering and time-stamping @command{diff} output.
+* Tabs::            Preserving the alignment of tab stops.
+* Trailing Blanks:: Suppressing blanks before empty output lines.
+* Pagination::      Page numbering and time-stamping @command{diff} output.
 @end menu

 @node Tabs
@@ -1890,6 +1890,29 @@ output format, which does not have a spa
 type indicator character.  Select this method with the @option{-T} or
 @option{--initial-tab} option.

+@node Trailing Blanks
+@section Omitting trailing blanks
+@cindex trailing blanks
+When outputting lines in normal or context format, or outputting an
+unchanged line in unified format, @command{diff} normally outputs a
+blank just before each line.  If the line is empty, the output of
+@command{diff} therefore contains trailing blanks even though the
+input does not contain them.  For example, when outputting an
+unchanged empty line in context format, @command{diff} normally
+outputs a line with two leading spaces.
+
+Some text editors and email agents routinely delete trailing blanks,
+so it can be a problem to deal with diff output files that contain
+them.  You can avoid this problem with the
+@option{--suppress-blank-empty} option.  It causes @command{diff} to
+omit trailing blanks at the end of output lines in normal, context,
+and unified format, unless the trailing blanks were already present in
+the input.  This changes the output format slightly, so that output
+lines are guaranteed to never end in a blank unless an input line ends
+in a blank.  This format is less likely to be munged by text editors
+or by transmission via email.  It is accepted by @acronym{GNU}
+@command{patch} as well.
+
 @node Pagination
 @section Paginating @command{diff} Output
 @cindex paginating @command{diff} output
@@ -3878,6 +3901,11 @@ normal.  @xref{Tabs}.
 Assume that tab stops are set every @var{columns} (default 8) print
 columns.  @xref{Tabs}.

+@item --suppress-blank-empty
+Suppress any blanks before newlines when printing the representation
+of an empty line, when outputting normal, context, or unified format.
+@xref{Trailing Blanks}.
+
 @item --to-file=@var{file}
 Compare each operand to @var{file}; @var{file} may be a directory.

Index: src/diff.h
===================================================================
RCS file: /cvsroot/diffutils/diffutils/src/diff.h,v
retrieving revision 1.28
diff -u -p -r1.28 diff.h
--- src/diff.h	19 Jul 2007 17:45:29 -0000	1.28
+++ src/diff.h	5 Dec 2007 07:24:11 -0000
@@ -150,6 +150,9 @@ XTERN size_t tabsize;
    without changing the characters in it (-T).  */
 XTERN bool initial_tab;

+/* Do not output an initial space or tab before the text of an empty line.  */
+XTERN bool suppress_blank_empty;
+
 /* Remove trailing carriage returns from input.  */
 XTERN bool strip_trailing_cr;

Index: src/context.c
===================================================================
RCS file: /cvsroot/diffutils/diffutils/src/context.c,v
retrieving revision 1.23
diff -u -p -r1.23 context.c
--- src/context.c	19 Jul 2007 17:45:28 -0000	1.23
+++ src/context.c	5 Dec 2007 07:24:11 -0000
@@ -350,7 +350,7 @@ pr_unidiff_hunk (struct change *hunk)
       if (!next || i < next->line0)
 	{
 	  char const *const *line = &files[0].linbuf[i++];
-	  if (**line != '\n')
+	  if (! (suppress_blank_empty && **line == '\n'))
 	    putc (initial_tab ? '\t' : ' ', out);
 	  print_1_line (NULL, line);
 	  j++;
@@ -362,10 +362,11 @@ pr_unidiff_hunk (struct change *hunk)
 	  k = next->deleted;
 	  while (k--)
 	    {
+	      char const * const *line = &files[0].linbuf[i++];
 	      putc ('-', out);
-	      if (initial_tab)
+	      if (initial_tab && ! (suppress_blank_empty && **line == '\n'))
 		putc ('\t', out);
-	      print_1_line (NULL, &files[0].linbuf[i++]);
+	      print_1_line (NULL, line);
 	    }

 	  /* Then output the inserted part. */
@@ -373,10 +374,11 @@ pr_unidiff_hunk (struct change *hunk)
 	  k = next->inserted;
 	  while (k--)
 	    {
+	      char const * const *line = &files[1].linbuf[j++];
 	      putc ('+', out);
-	      if (initial_tab)
+	      if (initial_tab && ! (suppress_blank_empty && **line == '\n'))
 		putc ('\t', out);
-	      print_1_line (NULL, &files[1].linbuf[j++]);
+	      print_1_line (NULL, line);
 	    }

 	  /* We're done with this hunk, so on to the next! */
Index: src/util.c
===================================================================
RCS file: /cvsroot/diffutils/diffutils/src/util.c,v
retrieving revision 1.37
diff -u -p -r1.37 util.c
--- src/util.c	15 Aug 2007 19:32:22 -0000	1.37
+++ src/util.c	5 Dec 2007 07:24:11 -0000
@@ -506,7 +506,8 @@ print_script (struct change *script,
 \f
 /* Print the text of a single line LINE,
    flagging it with the characters in LINE_FLAG (which say whether
-   the line is inserted, deleted, changed, etc.).  */
+   the line is inserted, deleted, changed, etc.).  LINE_FLAG must not
+   end in a blank, unless it is a single blank.  */

 void
 print_1_line (char const *line_flag, char const *const *line)
@@ -517,12 +518,25 @@ print_1_line (char const *line_flag, cha

   /* If -T was specified, use a Tab between the line-flag and the text.
      Otherwise use a Space (as Unix diff does).
-     Print neither space nor tab if line-flags are empty.  */
+     Print neither space nor tab if line-flags are empty.
+     But omit trailing blanks if requested.  */

   if (line_flag && *line_flag)
     {
-      flag_format = initial_tab ? "%s\t" : "%s ";
-      fprintf (out, flag_format, line_flag);
+      char const *flag_format_1 = flag_format = initial_tab ? "%s\t" : "%s ";
+      char const *line_flag_1 = line_flag;
+
+      if (suppress_blank_empty && **line == '\n')
+	{
+	  flag_format_1 = "%s";
+
+	  /* This hack to omit trailing blanks takes advantage of the
+	     fact that the only way that LINE_FLAG can end in a blank
+	     is when LINE_FLAG consists of a single blank.  */
+	  line_flag_1 += *line_flag_1 == ' ';
+	}
+
+      fprintf (out, flag_format_1, line_flag_1);
     }

   output_1_line (base, limit, flag_format, line_flag);
Index: src/diff.c
===================================================================
RCS file: /cvsroot/diffutils/diffutils/src/diff.c,v
retrieving revision 1.46
diff -u -p -r1.46 diff.c
--- src/diff.c	15 Aug 2007 19:32:22 -0000	1.46
+++ src/diff.c	5 Dec 2007 07:24:11 -0000
@@ -110,6 +110,7 @@ enum
   NORMAL_OPTION,
   SDIFF_MERGE_ASSIST_OPTION,
   STRIP_TRAILING_CR_OPTION,
+  SUPPRESS_BLANK_EMPTY_OPTION,
   SUPPRESS_COMMON_LINES_OPTION,
   TABSIZE_OPTION,
   TO_FILE_OPTION,
@@ -187,6 +188,7 @@ static struct option const longopts[] =
   {"speed-large-files", 0, 0, 'H'},
   {"starting-file", 1, 0, 'S'},
   {"strip-trailing-cr", 0, 0, STRIP_TRAILING_CR_OPTION},
+  {"suppress-blank-empty", 0, 0, SUPPRESS_BLANK_EMPTY_OPTION},
   {"suppress-common-lines", 0, 0, SUPPRESS_COMMON_LINES_OPTION},
   {"tabsize", 1, 0, TABSIZE_OPTION},
   {"text", 0, 0, 'a'},
@@ -565,6 +567,10 @@ main (int argc, char **argv)
 	  strip_trailing_cr = true;
 	  break;

+	case SUPPRESS_BLANK_EMPTY_OPTION:
+	  suppress_blank_empty = true;
+	  break;
+
 	case SUPPRESS_COMMON_LINES_OPTION:
 	  suppress_common_lines = true;
 	  break;
@@ -886,6 +892,7 @@ static char const * const option_help_ms
   N_("-t  --expand-tabs  Expand tabs to spaces in output."),
   N_("-T  --initial-tab  Make tabs line up by prepending a tab."),
   N_("--tabsize=NUM  Tab stops are every NUM (default 8) print columns."),
+  N_("--suppress-blank-empty  Suppress space or tab before empty output lines."),
   "",
   N_("-r  --recursive  Recursively compare any subdirectories found."),
   N_("-N  --new-file  Treat absent files as empty."),

  reply	other threads:[~2007-12-05  7:35 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-11-29  1:03 diff-mode misinterprets empty lines Richard Stallman
2007-11-29  9:26 ` David Kastrup
2007-11-29 16:09   ` Stefan Monnier
2007-12-05  7:35     ` Paul Eggert [this message]
2007-12-05 10:17       ` Jim Meyering
2007-12-05 10:58         ` David Kastrup
2007-12-05 11:27           ` Jim Meyering
2007-12-05 12:33             ` Andreas Schwab
2007-12-05 12:39               ` Jim Meyering
2007-12-05 14:59             ` David Kastrup
2007-12-05 17:45               ` Paul Eggert
2007-12-05 18:12                 ` David Kastrup
2007-12-06  0:54                   ` Paul Eggert
2007-12-06 10:11                     ` Andreas Schwab
2007-12-05 21:04                 ` Juanma Barranquero
2007-12-06 15:39                   ` Stefan Monnier
2008-01-06  0:15                     ` Glenn Morris
2008-01-06 18:09                       ` Richard Stallman
2008-01-14 21:08                       ` Stefan Monnier
2008-01-14 21:38                         ` Glenn Morris
2008-01-14 22:46                           ` Glenn Morris
2008-01-14 23:35                             ` Diffs between %s and %s end here (was: diff-mode misinterprets empty lines.) Reiner Steib
2008-01-15  3:29                               ` Diffs between %s and %s end here Miles Bader
2008-01-16  8:13                                 ` Glenn Morris
2008-01-15  0:09                             ` diff-mode misinterprets empty lines Dan Nicolaescu
2008-01-29 18:37                         ` Chong Yidong
2008-02-19 16:32                           ` Stefan Monnier
2008-02-19 20:44                             ` Stefan Monnier
2007-12-06  2:11               ` Richard Stallman
2007-12-05 17:48         ` Paul Eggert
2007-12-05 17:50           ` Jim Meyering
2007-11-29 22:31   ` Richard Stallman
2007-11-29 23:12     ` David Kastrup
2007-11-30  2:03     ` Stefan Monnier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87ir3d1tn7.fsf@penguin.cs.ucla.edu \
    --to=eggert@cs.ucla.edu \
    --cc=bug-gnu-utils@gnu.org \
    --cc=emacs-devel@gnu.org \
    --cc=jim@meyering.net \
    --cc=monnier@iro.umontreal.ca \
    --cc=rms@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).