From: Alan Mackenzie
Newsgroups: gmane.emacs.devel
Subject: Re: Clean-up of forward-paragraph [Re: Beginingless paragraphs: second stab at a patch.]
Date: Fri, 21 Oct 2005 20:09:52 +0000 (GMT)

On Fri, 21 Oct 2005, Richard M. Stallman wrote:

> The current implementation doesn't test for paragraph-s\(tart\|eparate\)
> on the same line as the fill-prefix.  Should it?

>I think it is important to see what past versions of Emacs did--for
>instance, Emacs 19, before the support for a left margin was added.
>If it was always done this way, then I think we should document it
>clearly and not change it.  There are no users asking for changes in
>this, and changing it would be risky.  However, if the past behavior
>was confused or conflicting, we need to figure out which past behavior
>to be compatible with.

I think we are agreed, here:  (i) The current implementation of
forward-paragraph doesn't test for p-start/separate on the same lines as
fill-prefixes;  (ii) No users are clamouring for this facility;  (iii) Any
major modes which need this sort of thing can do so by setting
p-start/separate appropriately (as CC Mode does).  Let's leave it the way
it already is!  ;-)

>Regarding your proposed definition of paragraphs, I am concerned about
>possible incompatibilities.

I'd like to stress I'm NOT trying to change the definition of paragraphs,
merely to formulate the existing definition, which is to some extent
embodied in forward-paragraph rather than being totally explicit.

>In the "new" cases, those of use-hard-newlines and nonempty left margin,
>we are not particularly bound by compatibility.  However, in the other
>cases we are.

> (iv) A @dfn{divider (line)} is a line which is either a separator or
> has the fill-prefix (after any left margin) and is otherwise only
> whitespace.
> [This definition only applies when the fill-prefix is
> non-null.]

>I think that together with (vii) are very hard to understand.

By "divider line" I was trying to say "a separator line when there's a
fill-prefix".  I didn't make a good job of it.  Sorry.  I've revised this
formulation extensively, removing this confusing term.  See below.

> (vi) When `use-hard-newlines' is non-nil, all paragraph boundaries
> are at hard BOLs.  A paragraph starts at a non-separator line, and
> ends at the next hard BOL.  Here, fill-prefix and paragraph-start
> are ignored.

>Does this make some unstated assumption about how separator lines and
>hard newlines relate to each other?  Perhaps it is just that the text
>is confusing.

The existing code tests `use-hard-newlines', which it considers
equivalent to Longlines Mode being enabled.  I don't think there're any
hidden assumptions there.  Merely that, conceptually, only hard newlines
are "real" newlines, since soft newlines are as fickle as SCO lawyers,
shifting around hither and thither as the text changes.  Thus, the only
meaningful place to look for a separator is just after a hard newline.

Is there any meaning for `use-hard-newlines' other than "Longlines Mode is
enabled"?

> (ix) If there happens to be a blank line before a paragraph start,
> this line is NOT regarded as being part of the paragraph.  [This is
> the problem which was at the heart of this thread.]

>I am not quite sure what that means in concrete terms.  As I said
>before, that blank line MUST be part of the following paragraph.
>That is essential for compatibility.

The problem which started me off on all this was that of a blank line
belonging to two paragraphs, as in the following file:

------------------------------------------------------
1st Line [starter]
asdf

1st Line [starter]
asdf
-
Local Variables:
paragraph-separate: "-"
paragraph-start: "1st Line\\|-"
End:
-----------------------------------------------------

I think I understand now what's going on.
The Emacs manual (page "Paragraphs") says:

    When you wish to operate on a paragraph, you can use the command
    `M-h' (`mark-paragraph') to set the region around it.  ....  If there
    are blank lines preceding the first line of the paragraph, one of
    these blank lines is included in the region.

The idea here is that you can do M-h C-w to kill a paragraph, move
somewhere else with M-{ and M-}, then insert it again with C-y.  All this
without having the hassle of manually deleting/inserting blank lines.

This has been implemented as (forward-paragraph -1) moving to the blank
line.  This is a misfeature, IMAO, no matter how long it may have been so.
Surely `mark-paragraph' should be doing the job of including this blank
line, not forward-paragraph.  This blank line is NOT itself part of the
paragraph.

[Suggestions for Emacs 23: "blank line" in the above should be generalised
to "separator line".  We should make the definition of paragraph-separate
explicitly state that it matches AT MOST a single line, so that separators
can be found reliably whilst searching backwards.
forward/backward-paragraph should be supplemented by (or even superseded
by) beginning/end-of-paragraph, which would work like b/e-of-defun.  The
"blank line preceding the paragraph" should be moved into
`mark-paragraph'.]

I discovered this whilst writing @dfn{Paragraphs} in Elisp's
searching.texi.  I wanted to write "Paragraphs don't overlap.", and felt
constrained to qualify it with "@footnote{In certain obscure circumstances
it is possible for a blank line to be both the last line of one paragraph
and the first line of the next.}".  I now think I should just ignore this
obscure case in the documentation, and fix the code somehow and sometime
for Emacs 23.

>I think that at present we should probably stick to fixing anything
>which is most obviously a bug.  For instance, all paragraph beginnings
>and ends should be at BOL; when it fails to do that, that is worth
>fixing.
>Bigger changes should wait for after the release.

OK.  There are several bugs in forward-paragraph.  I will fix them.  The
easiest way to fix them is with a thoroughgoing refactoring of the
function (which I have already done).  I suspect, though, you will prefer
the basic structure of the code to be left unchanged.  Please confirm or
deny this!

Here is my formulation of paragraph boundaries, thoroughly revised and
incorporating the comments you've made:

Note:  Items enclosed in braces are purely for clarification.

#########################################################################
DEFINITIONS:
(i) A @dfn{hard BOL} is the beginning of a line following a hard newline.
(ii) A @dfn{separator (line)} is a line which separates paragraphs
  without being part of one.
(iii) A @dfn{starter (line)} is a line which, when present, always begins
  a paragraph.  {Note that not every paragraph need begin with a starter.}
(iv) {A line cannot be both a starter and a separator.}

SPECIAL HANDLING OF A PRECEDING BLANK LINE:
(v) In all of the following, if there should happen to be a blank line
  immediately preceding the beginning of a paragraph, this beginning will
  be modified to include the blank line.  A "blank line" here is one
  which contains only whitespace, and no more than a left margin's worth
  of it.

SPECIAL STUFF ABOUT BOB/EOB:
(vi) The beginning and end of (the accessible portion of) the buffer
  always count as paragraph boundaries.  [From this point on, "ALL
  paragraph boundaries" disregards BOB and EOB.]
(vii) All other paragraph boundaries are at BOL, even when there is a
  left margin.  {This is so that M-h C-w will always grab complete
  paragraphs.}

CORE OF PARAGRAPH DEFINITION:
(viii) A paragraph starts either at a starter, or at a line which isn't a
  separator, yet follows one.
(ix) A paragraph ends at a starter or a separator line.

WHEN use-hard-newlines IS NON-NIL {"Longlines Mode"}:
(x) {All paragraph boundaries are at hard BOLs.
  A single "long line" is regarded as a paragraph.}
(xi) A separator is a line at a hard BOL which matches
  paragraph-separate at its left margin.
(xii) A starter is any line at a hard BOL which isn't a separator.
(xiii) {fill-prefix and paragraph-start play no role here.}

OTHERWISE, WHEN fill-prefix IS NON-NULL:
(xiv) A separator is a line which either:
  (a) matches paragraph-separate at its left margin; or
  (b) contains a valid fill-prefix and is otherwise blank (WS is allowed).
(xv) A starter is a line which isn't a separator and lacks a valid
  fill-prefix.
(xvi) {paragraph-start plays no role, here.}

OTHERWISE, (use-hard-newlines AND fill-prefix ARE BOTH NULL):
(xvii) A separator is a line which matches paragraph-separate at its left
  margin.
(xviii) A starter is a line which isn't a separator and matches
  paragraph-start at its left margin.
#########################################################################

-- 
Alan Mackenzie (Munich, Germany)
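
Illustration of the last case above (use-hard-newlines and fill-prefix
both null).  This is only a minimal sketch in Python, not Emacs Lisp, and
it is NOT the forward-paragraph code.  It models items (vi), (viii),
(xvii) and (xviii) and deliberately leaves out the blank-line adjustment
of item (v).  The names classify and paragraph_starts are hypothetical,
the two patterns are Python regexps standing in for paragraph-separate
and paragraph-start, the left margin is assumed to be zero, and the
buffer text is an assumed layout of the test file quoted earlier in this
message.

```python
import re

# Hypothetical stand-ins for paragraph-separate and paragraph-start,
# written as Python regexps (not Emacs regexps) and matched at column 0,
# i.e. assuming a left margin of zero.
PARAGRAPH_SEPARATE = r"-"
PARAGRAPH_START = r"1st Line|-"


def classify(line, separate=PARAGRAPH_SEPARATE, start=PARAGRAPH_START):
    """Return 'separator', 'starter' or 'other' for a single line."""
    if re.match(separate, line):
        return "separator"  # item (xvii); tested first, so item (iv) holds
    if re.match(start, line):
        return "starter"    # item (xviii)
    return "other"


def paragraph_starts(lines):
    """Indices of the lines that begin a paragraph, per items (vi) and (viii)."""
    kinds = [classify(line) for line in lines]
    starts = []
    for i, kind in enumerate(kinds):
        if kind == "starter":
            starts.append(i)
        elif kind == "other" and (i == 0 or kinds[i - 1] == "separator"):
            # A non-separator line at the start of the buffer, or one
            # directly following a separator, also begins a paragraph.
            starts.append(i)
    return starts


# Assumed layout of the test file, without its Local Variables section.
LINES = ["1st Line [starter]", "asdf", "", "1st Line [starter]", "asdf", "-"]
print(paragraph_starts(LINES))
```

In this model the blank line (index 2) is classed as "other", so it ends
up as the last line of the first paragraph.  Applying item (v) on top of
that would also pull it into the second paragraph, which is the overlap
discussed above.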