* shouldn't `delete-blank-lines' treat form-feed as whitespace?
From: Drew Adams @ 2010-02-28 20:44 UTC
  To: emacs-devel

`delete-blank-lines' treats SPC, TAB, and newline as whitespace.
Shouldn't it also treat form-feed (aka \f, aka ^L) as whitespace?





* Re: shouldn't `delete-blank-lines' treat form-feed as whitespace?
From: Juri Linkov @ 2010-03-03 19:32 UTC
  To: Drew Adams; +Cc: emacs-devel

> `delete-blank-lines' treats SPC, TAB, and newline as whitespace.
> Shouldn't it also treat form-feed (aka \f, aka ^L) as whitespace?

Maybe it should depend on the current buffer's syntax?  I mean using
"[:space:]" in `delete-blank-lines'.  This has a problem: e.g. in Lisp
mode \n has the `endcomment' syntax instead of `whitespace'.

-- 
Juri Linkov
http://www.jurta.org/emacs/




* RE: shouldn't `delete-blank-lines' treat form-feed as whitespace?
From: Drew Adams @ 2010-03-03 19:54 UTC
  To: 'Juri Linkov'; +Cc: emacs-devel

> > `delete-blank-lines' treats SPC, TAB, and newline as whitespace.
> > Shouldn't it also treat form-feed (aka \f, aka ^L) as whitespace?
> 
> Maybe it should depend on the current buffer's syntax?  I mean using
> "[:space:]" in `delete-blank-lines'.

Yes, I suppose the notion of whitespace is mode-dependent, and [:space:] should
capture that notion appropriately for each mode. Good point.

> This has a problem: e.g. in Lisp mode \n has the
> `endcomment' syntax instead of `whitespace'.

Hm. Then we seem to have a choice:

1. Always consider \n, \t, \f, SPC, etc. as whitespace for purposes of
`delete-blank-lines', _in addition_ to whatever other characters might
correspond to [:space:] for the given mode.

2. Use only [:space:], always, as the sole "whitespace" criterion for
`delete-blank-lines'.

Your \n as `endcomment' example seems to show that the notion of "whitespace"
for syntax table purposes is too restrictive to use as the criterion for things
like `delete-blank-lines'. So [:space:] would not be an adequate (sole)
criterion.

We could add an optional arg to `delete-blank-lines' that would override
whatever default behavior we decide on, but the default choice is important, as
it affects existing calls to `delete-blank-lines'.

I'm leaning toward #1 above. It seems to me that `delete-blank-lines' uses a
notion of blank line, and that should always include things like \n, even if
that has `endcomment' syntax.
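
A minimal sketch of option 1 combined with the optional-argument idea.
The helper `my-blank-line-p' and its EXTRA-CHARS argument are
hypothetical, not part of Emacs.  It keeps the fixed SPC/TAB notion of
a blank line and lets the caller add characters such as \f, without
consulting the syntax table at all:

```elisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-blank-line-p (&optional extra-chars)
  "Return non-nil if the current line holds only SPC, TAB, or EXTRA-CHARS.
EXTRA-CHARS is a string of additional characters to treat as blank,
for example \"\\f\".  The syntax table is not consulted."
  (save-excursion
    (beginning-of-line)
    ;; Skip over every character the caller considers blank.  Newline
    ;; is not in the set, so the skip cannot leave the current line.
    (skip-chars-forward (concat " \t" extra-chars))
    ;; The line is blank if nothing but blank characters preceded its end.
    (eolp)))
```

On a line holding only ^L, `(my-blank-line-p)' returns nil while
`(my-blank-line-p "\f")' returns t, so existing callers would see no
change unless they opt in.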






* Re: shouldn't `delete-blank-lines' treat form-feed as whitespace?
From: David De La Harpe Golden @ 2010-03-03 21:35 UTC
  To: Drew Adams; +Cc: emacs-devel

Drew Adams wrote:
> `delete-blank-lines' treats SPC, TAB, and newline as whitespace.
> Shouldn't it also treat form-feed (aka \f, aka ^L) as whitespace?
> 


While it's logically formally a whitespace class character, I'm quite 
unconvinced it would "feel right" to me.   formfeed is usually put in 
much more rarely and with more forethought than space/tab/newline.

It's also not "blank" in one possibly important sense: in emacs 
"out-of-box" it shows up as a quite visible "^L", whereas 
SPC/TAB/newline are invisible unless you turn on whitespace.

So delete-blank-lines would end up deleting lines that, from a naive 
viewpoint, look like they "have something on them".

Imagine you've got a text file with "page breaks" in it (represented as 
formfeeds as is/was the convention (see "C-x [" / "C-x ]" !)):

^L
alpha
bravo
charlie[]



            	


^L
delta
epsilon


Say I hit C-x C-o where the point [] is. I wouldn't just delete those 
stray blank lines on the first page, I'd suddenly merge two pages.
So I strongly suspect changing it would annoy people who still sprinkle 
^L through their code or other files for pagination.  Something I guess
I personally don't do so much anymore (got a laser printer not a dot 
matrix...), but anyway.
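
The scenario above can be replayed in a temporary buffer to see what
the stock command does today.  This is an illustration only: the
function name `my-page-example-after-delete' is hypothetical, and the
inserted text is a reconstruction of the example, not the exact bytes
of the message.

```elisp
;; Illustration only: rebuild the two-page example and run the stock
;; `delete-blank-lines' with point just after "charlie".
(defun my-page-example-after-delete ()
  "Return the example text after `delete-blank-lines' at \"charlie\"."
  (with-temp-buffer
    (insert "\f\nalpha\nbravo\ncharlie\n"
            "\n\n\n    \t\n\n\n"        ; the stray blank lines
            "\f\ndelta\nepsilon\n")
    (goto-char (point-min))
    (search-forward "charlie")
    (delete-blank-lines)
    (buffer-string)))
```

With the current definition, which skips only SPC, TAB, and newline,
the result keeps both form feeds: the blank lines go, and the ^L line
still separates the two pages.  Treating \f as blank would remove that
line as well, which is the page merge described above.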






* RE: shouldn't `delete-blank-lines' treat form-feed as whitespace?
From: Drew Adams @ 2010-03-03 22:21 UTC
  To: 'David De La Harpe Golden'; +Cc: emacs-devel

> > `delete-blank-lines' treats SPC, TAB, and newline as whitespace.
> > Shouldn't it also treat form-feed (aka \f, aka ^L) as whitespace?
> 
> While it's logically formally a whitespace class character, I'm quite 
> unconvinced it would "feel right" to me.   formfeed is usually put in 
> much more rarely and with more forethought than space/tab/newline.
> 
> It's also not "blank" in one possibly important sense: in emacs 
> "out-of-box" it shows up as a quite visible "^L", whereas 
> SPC/TAB/newline are invisible unless you turn on whitespace.
> 
> So delete-blank-lines would end up deleting lines that, from a naive 
> viewpoint, look like they "have something on them".
> 
> Imagine you've got a text file with "page breaks" in it 
> (represented as 
> formfeeds as is/was the convention (see "C-x [" / "C-x ]" !)):
> 
> ^L
> alpha
> bravo
> charlie[]
> 
> 
> 
>             	
> 
> 
> ^L
> delta
> epsilon
> 
> 
> Say I hit C-x C-o where the point [] is. I wouldn't just delete those 
> stray blank lines on the first page, I'd suddenly merge two pages.
> So I strongly suspect changing it would annoy people who 
> still sprinkle 
> ^L through their code or other files for pagination.  
> Something I guess
> I personally don't do so much anymore (got a laser printer not a dot 
> matrix...), but anyway.

Your point is a good one.

Actually, the place where I ran into this, and where I thought it would be
appropriate for \f to be considered blank lines and be deleted, was somewhat
similar.

I use an even more noticeable (much more noticeable) display artifact than a
visible `^L' (see http://www.emacswiki.org/emacs/PrettyControlL). And I do use
^L in my libraries to separate various code and commentary sections (into
"pages").

I use `finder-commentary' in some of my code, and it leaves such a ^L at the end
(it comes before "Change log"), not considering it to be a blank line. (I also
tweak `finder-commentary' to `delete-blank-lines' at the top and bottom.)
Getting rid of that was the use case that prompted my post.

So while I agree with you for the use case you mention, in that somewhat similar
use case I think it does make sense to remove ^L lines as "blank".

Perhaps the criterion I'm looking for is ^L followed or preceded by only
whitespace lines (including ^L lines). With that as criterion, the only "pages"
that would be dropped (merged, if you like) would be blank ones.

For interactive use (`C-x C-o') I don't see a problem with deleting lines that
contain only ^Ls - we have undo. For programmatic use, it's less clear that we
should always delete ^L lines as being blank. That's why I mentioned possibly
adding an optional arg etc.

Being able to specify via such an arg just what to consider as whitespace
("blank") would give code such as (my tweaked) `finder-commentary' an easy way
to trim off leading and trailing whitespace lines, including lines with ^L.






* Re: shouldn't `delete-blank-lines' treat form-feed as whitespace?
From: Miles Bader @ 2010-03-04  1:40 UTC
  To: David De La Harpe Golden; +Cc: Drew Adams, emacs-devel

David De La Harpe Golden <david@harpegolden.net> writes:
> While it's logically formally a whitespace class character, I'm quite
> unconvinced it would "feel right" to me.   formfeed is usually put in
> much more rarely and with more forethought than space/tab/newline.

Yes I agree -- the formfeed character, when it is used, is for document
structuring; it's not "whitespace" in the normal sense.

It's sort of like the next-level above newline in a hierarchy.

This suggests that perhaps there should be a command
`delete-blank-pages', which would delete pages containing only blank
lines (including the terminating formfeed).

[Drew, wouldn't the latter command address your use?]
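
A rough sketch of what such a command might look like.  No
`delete-blank-pages' exists in Emacs; the name `my-delete-blank-pages'
is hypothetical, and the sketch assumes pages are delimited by a form
feed at the start of a line rather than by the value of
`page-delimiter':

```elisp
;; Hypothetical command for illustration; not part of Emacs.
(defun my-delete-blank-pages ()
  "Delete pages that contain only blank lines, with their closing form feed.
A page is taken to start at a form feed at the beginning of a line.
Text before the first form feed is never touched."
  (interactive "*")
  (save-excursion
    (goto-char (point-min))
    (while (re-search-forward "^\f" nil t)
      ;; Point is just after a form feed that opens a page.  While the
      ;; page consists solely of SPC/TAB lines and ends at another form
      ;; feed, drop its contents together with that closing form feed.
      (while (looking-at "\\(?:[ \t]*\n\\)+\f")
        (delete-region (match-beginning 0) (match-end 0))))))
```

Applied to "^L alpha ^L <blank lines> ^L delta", this leaves
"^L alpha ^L delta": the empty page disappears, but pages with text
are never merged.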

-Miles

-- 
Arrest, v. Formally to detain one accused of unusualness.




* Re: shouldn't `delete-blank-lines' treat form-feed as whitespace?
From: Stefan Monnier @ 2010-03-04  4:28 UTC
  To: Drew Adams; +Cc: 'Juri Linkov', emacs-devel

> Yes, I suppose the notion of whitespace is mode-dependent, and
> [:space:] should capture that notion appropriately for each
> mode. Good point.

It's actually even more delicate than that.

There's basically the notion of "blank" for characters that have no
associated semantics in the corresponding language.  And then there's
the notion of "blank" for characters which *users* consider as having
no semantics.

Usually the first includes form-feed and other such things and is best
handled by forward-comment.  Usually the second only includes \s, \t,
\n, \r but doesn't include form-feed.

syntax-tables usually use the `space' syntax for \s, \t, form-feed, \r,
and sometimes \n (but not always because of newline-terminated
comments).  So using the `space' syntax is usually not a good choice
because of the \n issue and because it doesn't handle comments.
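
The \n issue can be checked directly.  The helper below is a
hypothetical illustration (`my-space-class-report' is not part of
Emacs); it relies on the fact that `[[:space:]]' is resolved against
the current buffer's syntax table:

```elisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-space-class-report (mode chars)
  "In a temporary buffer using major MODE, test CHARS against [[:space:]].
Return an alist of (CHAR . MATCHED), where MATCHED is t or nil."
  (with-temp-buffer
    (funcall mode)
    (mapcar (lambda (c)
              ;; [[:space:]] consults the current buffer's syntax table,
              ;; so the answer can differ from one major mode to the next.
              (cons c (and (string-match-p "[[:space:]]" (string c)) t)))
            chars)))
```

Evaluating `(my-space-class-report #'emacs-lisp-mode '(?\s ?\t ?\f ?\n))'
should report \n as not matching, since Lisp modes give newline
`endcomment' syntax, which is why the `space' syntax alone cannot
define a blank line.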


        Stefan




* RE: shouldn't `delete-blank-lines' treat form-feed as whitespace?
From: Drew Adams @ 2010-03-04  6:21 UTC
  To: 'Miles Bader', 'David De La Harpe Golden'; +Cc: emacs-devel

> > While it's logically formally a whitespace class character, 
> > I'm quite unconvinced it would "feel right" to me. 
> > formfeed is usually put in much more rarely and with more
> > forethought than space/tab/newline.
> 
> Yes I agree -- the formfeed character, when it is used, is 
> for document structuring; it's not "whitespace" in the normal
> sense. It's sort of like the next-level above newline in a hierarchy.
> 
> This suggests that perhaps there should be a command
> `delete-blank-pages', which would delete pages containing only blank
> lines (including the terminating formfeed).
> 
> [Drew, wouldn't the latter command address your use?]

Dunno. I hadn't really thought that much about this all 'round. I was just
thinking that in some sense (contexts) \f-only lines are blank.

Obviously, for any given bit of code (context), one can easily delete the
whitespace or non-whitespace one wants - no special need for a ready-made
function to do that. So no, I don't think we need a `delete-blank-pages'
function.

And we probably don't need to have `delete-blank-lines' always treat \f (or
always \n or \t..., for that matter) as whitespace. I guess I was thinking
either (a) `delete-blank-lines' should generally delete \f-only lines also or
(b) we might add an optional arg to `delete-blank-lines', to tell it what we
mean by "blank" at the point of call.

I agree now that (a) is not a great idea. (b) is probably not very useful
either. Chalk this up to (piddling) thinking out loud.





* Re: shouldn't `delete-blank-lines' treat form-feed as whitespace?
From: Eli Zaretskii @ 2010-03-05  8:48 UTC
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Wed, 03 Mar 2010 23:28:36 -0500
> Cc: 'Juri Linkov' <juri@jurta.org>, emacs-devel@gnu.org
> 
> There's basically the notion of "blank" for characters that have no
> associated semantics in the corresponding language.  And then there's
> the notion of "blank" for characters which *users* consider as having
> no semantics.
> 
> Usually the first includes form-feed and other such things and is best
> handled by forward-comment.  Usually the second only includes \s, \t,
> \n, \r but doesn't include form-feed.

Btw, as long as we are on this subject: `delete-blank-lines' should
also support Unicode characters whose meaning is whitespace.  Right
now, it's strictly ASCII.




* Re: shouldn't `delete-blank-lines' treat form-feed as whitespace?
From: Juri Linkov @ 2010-03-06 17:57 UTC
  To: Eli Zaretskii; +Cc: Stefan Monnier, emacs-devel

> Btw, as long as we are on this subject: `delete-blank-lines' should
> also support Unicode characters whose meaning is whitespace.  Right
> now, it's strictly ASCII.

Does e.g. NBSP (#xa0) have whitespace meaning wrt `delete-blank-lines'?

-- 
Juri Linkov
http://www.jurta.org/emacs/




* Re: shouldn't `delete-blank-lines' treat form-feed as whitespace?
From: Eli Zaretskii @ 2010-03-06 18:56 UTC
  To: Juri Linkov; +Cc: monnier, emacs-devel

> From: Juri Linkov <juri@jurta.org>
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>,  emacs-devel@gnu.org
> Date: Sat, 06 Mar 2010 19:57:47 +0200
> 
> > Btw, as long as we are on this subject: `delete-blank-lines' should
> > also support Unicode characters whose meaning is whitespace.  Right
> > now, it's strictly ASCII.
> 
> Does e.g. NBSP (#xa0) have whitespace meaning wrt `delete-blank-lines'?

No, not AFAICS: as currently written, `delete-blank-lines' considers
only spaces, tabs, and newlines.  See its code.




* Re: shouldn't `delete-blank-lines' treat form-feed as whitespace?
From: Jonathan Yavner @ 2010-03-06 22:34 UTC
  To: emacs-devel; +Cc: juri, eliz

>> Btw, as long as we are on this subject: `delete-blank-lines' should
>> also support Unicode characters whose meaning is whitespace.  Right
>> now, it's strictly ASCII.
>
> Does e.g. NBSP (#xa0) have whitespace meaning wrt
> `delete-blank-lines'?

http://www.unicode.org/reports/tr14/#DescriptionOfProperties

No.  NBSP is not a line-breaking character, so delete-blank-lines should 
not delete it.  However, U+2009 (THIN SPACE) is considered to be a 
"breaking space" and thus arguably should be deleted.
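
One way to sketch this distinction in Lisp, with loud caveats: the
predicate `my-unicode-blank-char-p' is hypothetical, and it uses the
`noBreak' compatibility tag in a character's decomposition as a stand-in
for the TR14 line-breaking classes, which is an approximation and not
the same property.

```elisp
;; Hypothetical predicate for illustration; not part of Emacs.
(defun my-unicode-blank-char-p (char &optional breaking-only)
  "Return t if CHAR is SPC, TAB, or in Unicode general category Zs.
With BREAKING-ONLY non-nil, reject characters whose decomposition
carries the `noBreak' tag, such as U+00A0 NO-BREAK SPACE.  The tag is
only an approximation of the TR14 line-breaking classes."
  (and (or (memq char '(?\s ?\t))
           (eq (get-char-code-property char 'general-category) 'Zs))
       (not (and breaking-only
                 ;; Decomposition is a list whose first element may be a
                 ;; formatting tag symbol such as `noBreak' or `compat'.
                 (eq (car-safe (get-char-code-property char 'decomposition))
                     'noBreak)))
       t))
```

Under this sketch `(my-unicode-blank-char-p #xa0)' is t but
`(my-unicode-blank-char-p #xa0 t)' is nil, while U+2009 passes either
way and \f is rejected in both cases.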




* Re: shouldn't `delete-blank-lines' treat form-feed as whitespace?
From: Eli Zaretskii @ 2010-03-07  4:14 UTC
  To: Jonathan Yavner; +Cc: juri, emacs-devel

> From: Jonathan Yavner <jyavner@rogers.com>
> Date: Sat, 6 Mar 2010 17:34:27 -0500
> Cc: juri@jurta.org, eliz@gnu.org
> 
> >> Btw, as long as we are on this subject: `delete-blank-lines' should
> >> also support Unicode characters whose meaning is whitespace.  Right
> >> now, it's strictly ASCII.
> >
> > Does e.g. NBSP (#xa0) have whitespace meaning wrt
> > `delete-blank-lines'?
> 
> http://www.unicode.org/reports/tr14/#DescriptionOfProperties
> 
> No.  NBSP is not a line-breaking character, so delete-blank-lines should 
> not delete it.

How is the line-breaking property relevant to this function?  The doc
string says just "delete all surrounding blank lines".  It doesn't
mention line-breaking at all.  The question is, should a line
consisting only of NBSP characters be considered a blank line.




* Re: shouldn't `delete-blank-lines' treat form-feed as whitespace?
From: Jonathan Yavner @ 2010-03-07  4:34 UTC
  To: Eli Zaretskii; +Cc: juri, emacs-devel

> How is the line-breaking property relevant to this function?

If it isn't, why are we talking about FF and VT?  They don't display as 
"blank" in Emacs.

If the point is to close up blank lines, this is sort of like line-
breaking in reverse, so only line-break characters should be deleted.

Function delete-trailing-whitespace specifically states that it does not 
delete FF.  It also doesn't delete VT.  So why should
delete-blank-lines delete these?




* Re: shouldn't `delete-blank-lines' treat form-feed as whitespace?
From: Stephen J. Turnbull @ 2010-03-07  5:08 UTC
  To: Eli Zaretskii; +Cc: Jonathan Yavner, juri, emacs-devel

Eli Zaretskii writes:

 > How is the line-breaking property relevant to this function?  The doc
 > string says just "delete all surrounding blank lines".  It doesn't
 > mention line-breaking at all.  The question is, should a line
 > consisting only of NBSP characters be considered a blank line.

My understanding of the Unicode standard is that whitespace characters
are blanks.  Interestingly enough, however, in many cases they are
treated as non-blank characters that happen to be extremely
conservative of ink. :-)  E.g., one often sees the "<TAG>&nbsp;</TAG>"
idiom in HTML (especially in TD elements).

Nevertheless, I think NBSP ought to be treated as a blank for the
purpose of `delete-blank-lines'.





* Re: shouldn't `delete-blank-lines' treat form-feed as whitespace?
From: Eli Zaretskii @ 2010-03-07 17:42 UTC
  To: Jonathan Yavner; +Cc: juri, emacs-devel

> From: Jonathan Yavner <jyavner@rogers.com>
> Date: Sat, 6 Mar 2010 23:34:16 -0500
> Cc: emacs-devel@gnu.org, juri@jurta.org
> 
> > How is the line-breaking property relevant to this function?
> 
> If it isn't, why are we talking about FF and VT?  They don't display as 
> "blank" in Emacs.
> 
> If the point is to close up blank lines, this is sort of like line-
> breaking in reverse, so only line-break characters should be deleted.
> 
> Function delete-trailing-whitespace specifically states that it does not 
> delete FF.  It also doesn't delete VT.  So why should
> delete-blank-lines delete these?

I have no opinion about FF and VT.  I was talking about NBSP and its
ilk.




* Re: shouldn't `delete-blank-lines' treat form-feed as whitespace?
From: Stephen J. Turnbull @ 2010-03-07 18:11 UTC
  To: Jonathan Yavner; +Cc: juri, Eli Zaretskii, emacs-devel

Jonathan Yavner writes:

 > Function delete-trailing-whitespace specifically states that it does not 
 > delete FF.  It also doesn't delete VT.

In Unicode, FF at least has line-breaking semantics.  I believe VT
does as well.  So they are not deleted because they aren't trailing
whitespace; they break the trail themselves.

 > So why should delete-blank-lines delete these?

Because they create vertical whitespace, and delete-blank-lines is
intended to close it up.  I can see an argument for not deleting them,
too, so I would want to hear from users about their use cases.





