From: Reuben Thomas
Newsgroups: gmane.emacs.bugs
Subject: bug#25157: 26.0.50; whitespace-cleanup does not remove single trailing empty line anymore
Date: Tue, 20 Dec 2016 18:37:29 +0000
To: Noam Postavsky
Cc: Mark Karpov, 25157@debbugs.gnu.org
References: <8760mr26dr.fsf@openmailbox.org> <87eg13b4jk.fsf@users.sourceforge.net>
In-Reply-To: <87eg13b4jk.fsf@users.sourceforge.net>

On 20 December 2016 at 04:36, <npostavs@users.sourceforge.net> wrote:

> tags 25157 confirmed
> quit
>
> Mark Karpov <markkarpov@openmailbox.org> writes:
>
> > The ‘whitespace-cleanup’ command does not remove single trailing empty
> > line anymore.

I can reproduce this; sorry!

> -(defcustom whitespace-empty-at-eob-regexp "^\\([ \t\n]+\\)"
> +(defcustom whitespace-empty-at-eob-regexp "^\\([ \t\n]*\\(\n\\{2,\\}\\|[ \t]+\\)\\)\\'"
>
> I don't quite understand why this more complicated expression is
> necessary.  Reuben, can you explain?

With the previous regexp, whitespace-cleanup would remove a single newline at the end of a buffer.

I think I tried to be a bit too clever.

Thinking again, what we require is:

Match at the end of the buffer, either:

a. A mix of spaces and tabs, or

b. Optional whitespace followed by a newline followed by whitespace.

These two categories are not mutually exclusive (which is fine, and avoids being too clever).

The point is that if there are any newlines, there must be at least two.

Also note that the regexp does not need to be anchored at the start of a line (I'm not sure why I thought it did).

So, I think a correct regexp, directly translating the above, is:

\\([ \t]+\\|\\([ \t\n]*\n[ \t\n]+\\)\\)\\'
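
One way to sanity-check this regexp is to try it on a few strings with `string-match`, where `\\'` matches the end of the string just as it matches the end of the buffer in a buffer search. The following is a minimal sketch only: the names `my-eob-regexp` and `my-eob-match` and the sample strings are made up for illustration and are not part of whitespace.el.

```elisp
;; Hypothetical name; holds the candidate regexp under discussion.
(defconst my-eob-regexp "\\([ \t]+\\|\\([ \t\n]*\n[ \t\n]+\\)\\)\\'")

(defun my-eob-match (string)
  "Return the trailing text of STRING matched by `my-eob-regexp', or nil."
  (when (string-match my-eob-regexp string)
    (match-string 0 string)))

(my-eob-match "foo\n")      ; single final newline: no match (nil)
(my-eob-match "foo  \t")    ; trailing spaces and tabs: case (a)
(my-eob-match "foo\n\n")    ; one empty line at the end: case (b)
(my-eob-match "foo\t\n\t")  ; blanks around a newline: case (b)
```

In the last three cases the whole run of trailing whitespace is matched, including every newline in it.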

However, there's still a problem: while this regexp will not match a single newline at the end of a buffer, when it does match any number of newlines (with or without extra space), it will remove all of them, whereas it should leave a single newline.

I can't see a way around this purely in the regexp, because if for example the end of the buffer is:

\t\n\t

then the regexp should match (and this one does), but whitespace-cleanup should leave a newline.

So I think a further change to the code is needed to whitespace-cleanup: when whitespace-empty-at-eob-regexp is matched, it should check match-string, and if it contains a newline, it should insert a newline in the buffer after deleting the matched string.
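
The change described above could look roughly like the following. This is only a sketch of the idea, not the actual code in whitespace.el: the function name `my-cleanup-eob` is hypothetical, and the regexp is inlined here rather than read from `whitespace-empty-at-eob-regexp`.

```elisp
(defun my-cleanup-eob ()
  "Delete whitespace at end of buffer, keeping one newline if any was there.
Hypothetical helper illustrating the idea; not part of whitespace.el."
  (save-excursion
    (goto-char (point-min))
    (when (re-search-forward
           "\\([ \t]+\\|\\([ \t\n]*\n[ \t\n]+\\)\\)\\'" nil t)
      ;; Remember whether the matched text contained a newline before
      ;; deleting it.  `string-match-p' leaves the match data intact.
      (let ((had-newline (string-match-p "\n" (match-string 0))))
        (delete-region (match-beginning 0) (match-end 0))
        ;; Point is now at the new end of the buffer; if a newline was
        ;; removed, put a single one back.
        (when had-newline
          (insert "\n"))))))

;; Usage: the buffer ends in tab, newline, tab.
(with-temp-buffer
  (insert "foo\t\n\t")
  (my-cleanup-eob)
  (buffer-string))   ; => "foo\n"
```

A buffer that ends in exactly one newline is left untouched, because the regexp does not match it at all.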

Given my previous error of reasoning, I'm submitting the above for your consideration before I prepare a patch!

-- 
http://rrt.sc3d.org