unofficial mirror of emacs-devel@gnu.org 
* Improvements to `(emacs)File Variables'
@ 2004-11-14 19:02 Reiner Steib
  2004-11-14 21:12 ` Stefan Monnier
  0 siblings, 1 reply; 27+ messages in thread
From: Reiner Steib @ 2004-11-14 19:02 UTC
  Cc: Simon Krahnke

Hi,

`unibyte: t' is not mentioned at all in (info "(emacs)File Variables").

Additionally, the paragraph starting with »Two "variable names" have
special meanings ...« is misleading because, besides `mode' and `eval',
`unibyte' and `coding' also have special meanings:

,----[ (info "(emacs)File Variables") ]
|    Two "variable names" have special meanings in a local variables
| list: a value for the variable `mode' really sets the major mode, and a
| value for the variable `eval' is simply evaluated as an expression and
| the value is ignored.  `mode' and `eval' are not real variables;
`----
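
As a rough illustration of the point (a Python sketch, not Emacs code and not part of the proposed patch): a first-line cookie can be split into entries, with the four names discussed here treated as pseudo-variables.  The function name `parse_first_line_cookie` and the `SPECIAL_NAMES` set are made up for this example, and the parsing is deliberately simplified: it ignores the bare `-*- modename -*-` form and any `;` inside a value.

```python
import re

# Names this thread identifies as pseudo-variables rather than real variables.
SPECIAL_NAMES = {"mode", "eval", "coding", "unibyte"}

def parse_first_line_cookie(line):
    """Split a `-*- name: value; ... -*-` cookie into (special, ordinary) dicts."""
    special, ordinary = {}, {}
    match = re.search(r"-\*-(.*?)-\*-", line)
    if match is None:
        return special, ordinary
    for entry in match.group(1).split(";"):
        name, sep, value = entry.partition(":")
        if not sep:
            continue  # skip fragments without a colon, e.g. after a trailing ";"
        target = special if name.strip() in SPECIAL_NAMES else ordinary
        target[name.strip()] = value.strip()
    return special, ordinary

print(parse_first_line_cookie(";; -*-unibyte: t;-*-"))
```

With this split, `unibyte` and `coding` land in the same bucket as `mode` and `eval`, which is the grouping the patch below tries to make explicit in the manual.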

How about the following patches?  Okay to install?

2004-11-14  Reiner Steib  <Reiner.Steib@gmx.de>

	* custom.texi (File Variables): Add `unibyte' and make it more
	clear that `unibyte' and `coding' are special.  Suggested by Simon
	Krahnke <overlord@gmx.li>.

	* mule.texi (Enabling Multibyte): Refer to File Variables.
	Suggested by Simon Krahnke <overlord@gmx.li>.

--8<---------------cut here---------------start------------->8---
--- custom.texi	21 Sep 2004 12:47:25 +0200	1.67
+++ custom.texi	14 Nov 2004 19:39:36 +0100	
@@ -961,7 +961,8 @@
   You can also specify the coding system for a file in this way: just
 specify a value for the ``variable'' named @code{coding}.  The ``value''
 must be a coding system name that Emacs recognizes.  @xref{Coding
-Systems}.
+Systems}.  @w{@samp{unibyte: t}} specifies unibyte loading for a
+particular Lisp file.  @xref{Enabling Multibyte}.
 
   The @code{eval} pseudo-variable, described below, can be specified in
 the first line as well.
@@ -1022,14 +1023,15 @@
 # End:
 @end example
 
-  Two ``variable names'' have special meanings in a local variables
+  Some ``variable names'' have special meanings in a local variables
 list: a value for the variable @code{mode} really sets the major mode,
 and a value for the variable @code{eval} is simply evaluated as an
-expression and the value is ignored.  @code{mode} and @code{eval} are
-not real variables; setting variables named @code{mode} and @code{eval}
-in any other context has no special meaning.  @emph{If @code{mode} is
-used to set a major mode, it should be the first ``variable'' in the
-list.}  Otherwise, the entries that precede it in the list of the local
+expression and the value is ignored.  @code{coding}, @code{unibyte},
+@code{mode} and @code{eval} are not real variables; setting variables
+named @code{coding}, @code{unibyte}, @code{mode} and @code{eval} in any
+other context has no special meaning.  @emph{If @code{mode} is used to
+set a major mode, it should be the first ``variable'' in the list.}
+Otherwise, the entries that precede it in the list of the local
 variables are likely to be ignored, since most modes kill all local
 variables as part of their initialization.
--8<---------------cut here---------------end--------------->8---

(Not reformatted yet, to make the change clearer:)
--8<---------------cut here---------------start------------->8---
--- mule.texi	04 Mar 2004 18:23:24 +0100	1.68
+++ mule.texi	14 Nov 2004 19:52:46 +0100	
@@ -199,7 +199,8 @@
 file, @file{.emacs}, and the initialization files of Emacs packages
 such as Gnus.  However, you can specify unibyte loading for a
 particular Lisp file, by putting @w{@samp{-*-unibyte: t;-*-}} in a
-comment on the first line.  Then that file is always loaded as unibyte
+comment on the first line (@pxref{File Variables}).
+Then that file is always loaded as unibyte
 text, even if you did not start Emacs with @samp{--unibyte}.  The
 motivation for these conventions is that it is more reliable to always
 load any particular Lisp file in the same way.  However, you can load
--8<---------------cut here---------------end--------------->8---

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/

* Re: Improvements to `(emacs)File Variables'
  2004-11-14 19:02 Improvements to `(emacs)File Variables' Reiner Steib
@ 2004-11-14 21:12 ` Stefan Monnier
  2004-11-14 23:26   ` Miles Bader
  0 siblings, 1 reply; 27+ messages in thread
From: Stefan Monnier @ 2004-11-14 21:12 UTC
  Cc: Simon Krahnke

> `unibyte: t' is not mentioned at all in (info "(emacs)File Variables").

BTW, what is the point of this `unibyte: t' marker?
Isn't `coding: binary' doing the same?

I think we should make the `unibyte: t' marker obsolete/deprecated.


        Stefan

* Re: Improvements to `(emacs)File Variables'
  2004-11-14 21:12 ` Stefan Monnier
@ 2004-11-14 23:26   ` Miles Bader
  2004-11-14 23:46     ` Stefan Monnier
  2004-11-16 16:49     ` Richard Stallman
  0 siblings, 2 replies; 27+ messages in thread
From: Miles Bader @ 2004-11-14 23:26 UTC
  Cc: Simon Krahnke, emacs-devel

On Sun, Nov 14, 2004 at 04:12:39PM -0500, Stefan Monnier wrote:
> I think we should make the `unibyte: t' marker obsolete/deprecated.

I think we should make the whole _concept_ of "unibyte" deprecated...

Actually I thought it was, at least informally.

-Miles
-- 
.Numeric stability is probably not all that important when you're guessing.

* Re: Improvements to `(emacs)File Variables'
  2004-11-14 23:26   ` Miles Bader
@ 2004-11-14 23:46     ` Stefan Monnier
  2004-11-14 23:55       ` Miles Bader
  2004-11-16 16:49     ` Richard Stallman
  1 sibling, 1 reply; 27+ messages in thread
From: Stefan Monnier @ 2004-11-14 23:46 UTC
  Cc: Simon Krahnke, emacs-devel

> I think we should make the whole _concept_ of "unibyte" deprecated...

At the user-level, yes that was pretty much my understanding as well, which
is why the unibyte:t marker should be considered obsolete.

At the elisp level, unibyte strings and buffers are still very handy
to manipulate undecoded data (such as when talking to an NNTP server).
I don't see any reason why this should ever be deprecated.


        Stefan

* Re: Improvements to `(emacs)File Variables'
  2004-11-14 23:46     ` Stefan Monnier
@ 2004-11-14 23:55       ` Miles Bader
  2004-11-15  0:18         ` Stefan Monnier
  0 siblings, 1 reply; 27+ messages in thread
From: Miles Bader @ 2004-11-14 23:55 UTC
  Cc: emacs-devel, Simon Krahnke, Miles Bader

On Sun, Nov 14, 2004 at 06:46:59PM -0500, Stefan Monnier wrote:
> > I think we should make the whole _concept_ of "unibyte" deprecated...
> 
> At the user-level, yes that was pretty much my understanding as well, which
> is why the unibyte:t marker should be considered obsolete.
> 
> At the elisp level, unibyte strings and buffers are still very handy
> to manipulate undecoded data (such as when talking to an NNTP server).
> I don't see any reason why this should ever be deprecated.

I think the lisp/C level are a complete mess also, but it's probably too
painful and too much work to fix it.

-Miles
-- 
`There are more things in heaven and earth, Horatio,
 Than are dreamt of in your philosophy.'

* Re: Improvements to `(emacs)File Variables'
  2004-11-14 23:55       ` Miles Bader
@ 2004-11-15  0:18         ` Stefan Monnier
  2004-11-15  4:53           ` Miles Bader
  0 siblings, 1 reply; 27+ messages in thread
From: Stefan Monnier @ 2004-11-15  0:18 UTC
  Cc: Simon Krahnke, emacs-devel

>> > I think we should make the whole _concept_ of "unibyte" deprecated...
>> 
>> At the user-level, yes that was pretty much my understanding as well, which
>> is why the unibyte:t marker should be considered obsolete.
>> 
>> At the elisp level, unibyte strings and buffers are still very handy
>> to manipulate undecoded data (such as when talking to an NNTP server).
>> I don't see any reason why this should ever be deprecated.

> I think the lisp/C level are a complete mess also, but it's probably too
> painful and too much work to fix it.

But are you still talking about the *concept* of unibyte, then?


        Stefan

* Re: Improvements to `(emacs)File Variables'
  2004-11-15  0:18         ` Stefan Monnier
@ 2004-11-15  4:53           ` Miles Bader
  2004-11-15  5:15             ` Stefan Monnier
  2004-11-16 16:48             ` Richard Stallman
  0 siblings, 2 replies; 27+ messages in thread
From: Miles Bader @ 2004-11-15  4:53 UTC
  Cc: Simon Krahnke, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:
>> I think the lisp/C level are a complete mess also, but it's probably too
>> painful and too much work to fix it.
>
> But are you still talking about the *concept* of unibyte, then?

I'm not sure.  "Unibyte" as used in emacs seems (to me) to imply several
things:  (1) of course, a single byte per character, (2) the concept of
strings/buffers whose encoding is "unknown".

If you were to consistently treat (2) as in fact meaning an explicit
"binary" encoding, maybe it would be useful, but my impression is that
at least historically, people/code have _not_ always done this, leading
to lots and lots of confusion.  I suppose much of the reason is that
people want the efficiency gain of (1), and either don't realize the
problems caused by (2) or think they can kludge around it.

As I've posted before, I think "unibyte" strings/buffers should be only
an optimization, and should have an explicit (8-bit) encoding associated
with them, so that any conversions to/from multibyte can automatically
do the correct thing; one of these encodings could of course be "binary",
which maybe would allow the historical usage of unibyte to be preserved.

[Note that I only vaguely understand Emacs unibyte stuff, so the above
may simply be a result of my confusion.]

-Miles
-- 
`Cars give people wonderful freedom and increase their opportunities.
 But they also destroy the environment, to an extent so drastic that
 they kill all social life' (from _A Pattern Language_)

* Re: Improvements to `(emacs)File Variables'
  2004-11-15  4:53           ` Miles Bader
@ 2004-11-15  5:15             ` Stefan Monnier
  2004-11-16 16:48             ` Richard Stallman
  1 sibling, 0 replies; 27+ messages in thread
From: Stefan Monnier @ 2004-11-15  5:15 UTC
  Cc: Simon Krahnke, emacs-devel

> I'm not sure.  "Unibyte" as used in emacs seems (to me) to imply several
> things:  (1) of course, a single byte per character, (2) the concept of
> strings/buffers whose encoding is "unknown".

> If you were to consistently treat (2) as in fact meaning an explicit
> "binary" encoding, maybe it would be useful, but my impression is that
> at least historically, people/code have _not_ always done this, leading
> to lots and lots of confusion.  I suppose much of the reason is that
> people want the efficiency gain of (1), and either don't realize the
> problems caused by (2) or think they can kludge around it.

> As I've posted before, I think "unibyte" strings/buffers should be only
> an optimization, and should have an explicit (8-bit) encoding associated
> with them, so that any conversions to/from multibyte can automatically
> do the correct thing; one of these encoding could of course be "binary",
> which maybe would allow the historical usage of unibyte to be preserved.

I'd tend to disagree on the idea of associating an encoding with
unibyte buffers.  I think a large part of the problem is that people with
a unibyte background (i.e. latin-1 mostly) typically confuse the notion of
character and byte and mix things up hopelessly.

In Emacs-20, automatic conversion between unibyte and multibyte was provided
mostly as a way to work "correctly" even with confused code which didn't
understand that there's more than 256 characters in this world.

It made sense at the time to avoid alienating too many Emacs coders.
But to get things right, the first thing we need to do is to make it very
clear that there is no way to automatically convert between unibyte
and multibyte.  Such a conversion should only be doable via
(en|de)code-coding-foo functions, thus forcing anyone who wants to go down
that path to actually provide a coding system explicitly and thus to think
of what coding system should be used.

After all, autoconversion can only work for 8-bit encodings, so any code that
uses autoconversion falls into one of two cases:
1 - the code somehow knows that all the possible encodings it might need to
    use there are 8bit.  Most likely, it's the case where there's only ever
    one encoding used.
2 - the code *doesn't* know, but just assumes (probably without even being
    aware of it) that all encodings are 8bit.  Thus it will break if used
    in China, Japan, ...
Situation 2 is a bug.  Situation 1 seems rather unusual.  My conclusion is
that autoconversion is harmful.
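
The argument can be illustrated outside Emacs.  The following Python sketch is only an analogy (Python's `bytes` and `str` are not Emacs unibyte and multibyte strings), and the sample text and variable names are invented for the example.  It shows situation 2 silently producing the wrong text under an assumed 8-bit encoding, an explicit decode giving the intended text, and a mixed operation being rejected rather than autoconverted.

```python
# Undecoded data, e.g. as it might arrive from a network server.
raw = "日本語".encode("euc-jp")

# Situation 2: silently assume an 8-bit encoding.  Latin-1 maps every byte
# to exactly one character, so this never fails; it just yields wrong text.
assumed = raw.decode("latin-1")

# Explicit conversion: the caller has to name the coding system.
explicit = raw.decode("euc-jp")

# No automatic conversion: mixing bytes and text raises instead of guessing.
try:
    raw + "text"
    mixing_rejected = False
except TypeError:
    mixing_rejected = True

print(len(raw), len(assumed), explicit == "日本語", mixing_rejected)
```

The explicit call is the only place where the choice of coding system is visible, which is the property argued for above.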

I've hacked my own local Emacs to "disallow" autoconversion
(i.e. auto-conversion from unibyte->multibyte is allowed and generates
eight-bit-control and eight-bit-graphic chars; auto-conversion from
multibyte to unibyte is allowed but only for ascii, eight-bit-graphic, and
eight-bit-control chars, any other char causes an error).  It actually works
fairly well.  The main problems I encounter have to do with regexp matching
where the regexp is multibyte and the text is unibyte.


        Stefan

* Re: Improvements to `(emacs)File Variables'
  2004-11-15  4:53           ` Miles Bader
  2004-11-15  5:15             ` Stefan Monnier
@ 2004-11-16 16:48             ` Richard Stallman
  1 sibling, 0 replies; 27+ messages in thread
From: Richard Stallman @ 2004-11-16 16:48 UTC
  Cc: emacs-devel, monnier, overlord

    As I've posted before, I think "unibyte" strings/buffers should be only
    an optimization, and should have an explicit (8-bit) encoding associated
    with them, so that any conversions to/from multibyte can automatically
    do the correct thing; one of these encodings could of course be "binary",
    which maybe would allow the historical usage of unibyte to be preserved.

It might be possible to make this work, but the current handling is
not broken, so I don't want to make such a big change in it.

There are more important parts of Emacs to improve, even after the
next release.

* Re: Improvements to `(emacs)File Variables'
  2004-11-14 23:26   ` Miles Bader
  2004-11-14 23:46     ` Stefan Monnier
@ 2004-11-16 16:49     ` Richard Stallman
  2004-11-16 16:59       ` Stefan Monnier
  1 sibling, 1 reply; 27+ messages in thread
From: Richard Stallman @ 2004-11-16 16:49 UTC
  Cc: emacs-devel, monnier, overlord

    I think we should make the whole _concept_ of "unibyte" deprecated...

That would be too drastic--and unnecessary.
There are probably thousands of users who use it,
if not tens of thousands.  This isn't broken;
it works smoothly enough now, and no longer requires
a lot of maintenance.

So let's please not think of changing it.

* Re: Improvements to `(emacs)File Variables'
  2004-11-16 16:49     ` Richard Stallman
@ 2004-11-16 16:59       ` Stefan Monnier
  2004-11-18  2:55         ` Richard Stallman
  0 siblings, 1 reply; 27+ messages in thread
From: Stefan Monnier @ 2004-11-16 16:59 UTC
  Cc: emacs-devel, overlord, Miles Bader

>     I think we should make the whole _concept_ of "unibyte" deprecated...
> That would be too drastic--and unnecessary.
> There are probably thousands of users who use it,
> if not tens of thousands.  This isn't broken;
> it works smoothly enough now, and no longer requires
> a lot of maintenance.

> So let's please not think of changing it.

I do not intend to change the code (other than in my own local Emacs build,
obviously), but whether or not unibyte is deprecated is important when
deciding how to change the manual.

I think the concept of unibyte and multibyte thingies should be kept within
the Elisp manual, but should be avoided when possible in the Emacs manual.


        Stefan

* Re: Improvements to `(emacs)File Variables'
  2004-11-16 16:59       ` Stefan Monnier
@ 2004-11-18  2:55         ` Richard Stallman
  2004-11-18 16:47           ` Stefan Monnier
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Stallman @ 2004-11-18  2:55 UTC
  Cc: miles, overlord, emacs-devel

I will not deprecate unibyte mode without polling the users.

* Re: Improvements to `(emacs)File Variables'
  2004-11-18  2:55         ` Richard Stallman
@ 2004-11-18 16:47           ` Stefan Monnier
  2004-11-18 17:07             ` Simon Krahnke
  2004-11-19  2:25             ` Improvements to `(emacs)File Variables' Richard Stallman
  0 siblings, 2 replies; 27+ messages in thread
From: Stefan Monnier @ 2004-11-18 16:47 UTC
  Cc: miles, overlord, emacs-devel

> I will not deprecate unibyte mode without polling the users.

Let's not forget that the original question in this thread is whether we
should *add* more documentation for the "unibyte:t" cookie.  I think we can
answer "no" without formally declaring the feature obsolete (after all, it
is already documented at other places, so we don't need to advertise it
further).


        Stefan

* Re: Improvements to `(emacs)File Variables'
  2004-11-18 16:47           ` Stefan Monnier
@ 2004-11-18 17:07             ` Simon Krahnke
  2004-11-18 18:04               ` Stefan Monnier
  2004-11-19  2:25             ` Improvements to `(emacs)File Variables' Richard Stallman
  1 sibling, 1 reply; 27+ messages in thread
From: Simon Krahnke @ 2004-11-18 17:07 UTC
  Cc: overlord, miles, rms, emacs-devel

On Thu, Nov 18, 2004 at 11:47:31AM -0500, Stefan Monnier wrote:

> Let's not forget that the original question in this thread is whether we
> should *add* more documentation for the "unibyte:t" cookie.  I think we can
> answer "no" without formally declaring the feature obsolete (after all, it
> is already documented at other places, so we don't need to advertise it
> further).

Let's not forget the user who finds a "unibyte:t" cookie and has no
idea what it might mean.  Where in the documentation is he supposed to
find it?

mfg,                     simon .... l

* Re: Improvements to `(emacs)File Variables'
  2004-11-18 17:07             ` Simon Krahnke
@ 2004-11-18 18:04               ` Stefan Monnier
  2004-11-19  1:23                 ` Info-search-whitespace (Was: Improvements to `(emacs)File Variables') Juri Linkov
  0 siblings, 1 reply; 27+ messages in thread
From: Stefan Monnier @ 2004-11-18 18:04 UTC
  Cc: overlord, miles, rms, emacs-devel

>> Let's not forget that the original question in this thread is whether we
>> should *add* more documentation for the "unibyte:t" cookie.  I think we can
>> answer "no" without formally declaring the feature obsolete (after all, it
>> is already documented at other places, so we don't need to advertise it
>> further).

> Let's not forget the user that finds an "unibyte:t" cookie and has no
> idea what that might mean. Where in the documentation he is supposed to
> find it?

Go to the Emacs manual and do a M-s search for "unibyte: *t".


        Stefan

* Info-search-whitespace (Was: Improvements to `(emacs)File Variables')
  2004-11-18 18:04               ` Stefan Monnier
@ 2004-11-19  1:23                 ` Juri Linkov
  2004-11-19  5:06                   ` Info-search-whitespace Stefan Monnier
  2004-11-19  7:15                   ` Info-search-whitespace (Was: Improvements to `(emacs)File Variables') Eli Zaretskii
  0 siblings, 2 replies; 27+ messages in thread
From: Juri Linkov @ 2004-11-19  1:23 UTC
  Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:
> Go to the Emacs manual and do a M-s search for "unibyte: *t".

Out of curiosity I looked at what the Emacs manual says about
unibyte mode and noticed a problem: Emacs hangs on the Info search
string "unibyte: *t".  This is because Info-search expands it with
Info-search-whitespace to "unibyte:\\(?:\\s-+\\)*t" and on the line

* --unibyte:                             Initial Options.     (line 101)

Emacs goes into a long loop because it tries to match whitespace
\\(?:\\s-+\\)* in extremely many ways.
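
To get a feel for "extremely many ways": the expanded group `\(?:\s-+\)*` can carve a run of n whitespace characters into consecutive non-empty pieces in 2^(n-1) ways, and a backtracking matcher that fails just after the run may end up trying a large share of them.  The Python sketch below only counts those splits; it does not model the Emacs regexp engine, and the function name and the run length of 30 are assumptions made for the example.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def ways_to_split(n):
    """Count the ways to cut a run of n characters into non-empty pieces."""
    if n == 0:
        return 1  # the group repeats zero times
    # The first piece takes k characters; the remainder is split recursively.
    return sum(ways_to_split(n - k) for k in range(1, n + 1))

for n in (1, 2, 3, 10, 30):
    print(n, ways_to_split(n))
```

The count doubles with every additional space, so a menu line padded with a few dozen spaces is enough to make such a search appear to hang.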

Perhaps Info-search should search for a string only, not a regexp:
i.e. preprocess a search string with `regexp-quote' before replacing
whitespace with Info-search-whitespace and re-searching for it.

Anyway, the standalone Info reader already doesn't search for a regexp
with the `s' key, and the Emacs Info reader should not differ from it.
Of course, this doesn't preclude adding a new command
`Info-search-regexp' (without keybinding) to the Emacs Info reader,
which won't expand whitespace to Info-search-whitespace and will
use a search regexp exactly as specified by the user.
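
The two behaviours being compared can be sketched as follows.  This is a Python model under stated assumptions, not the Info-search code: Python regexp syntax stands in for Emacs syntax (so `(?:\s+)` plays the role of `\(?:\s-+\)`), `re.escape` stands in for `regexp-quote`, and the function names are hypothetical.

```python
import re

WHITESPACE_GROUP = r"(?:\s+)"  # Python spelling of the whitespace regexp

def expand_whitespace(pattern):
    """Current behaviour: treat PATTERN as a regexp and expand runs of spaces."""
    return re.sub(r" +", lambda _m: WHITESPACE_GROUP, pattern)

def expand_quoted(text):
    """Proposed behaviour: quote TEXT as a literal string, then expand spaces."""
    words = [re.escape(word) for word in text.split(" ") if word]
    return WHITESPACE_GROUP.join(words)

query = "unibyte: *t"
print(expand_whitespace(query))  # the "*" ends up applying to the whole group
print(expand_quoted(query))      # the "*" is matched literally
```

In the quoted variant the user's `*` can no longer wrap the whitespace group in a second repetition operator, so the nested repetition behind the hang does not arise; the price is that `s` stops accepting regexps.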

-- 
Juri Linkov
http://www.jurta.org/emacs/

* Re: Improvements to `(emacs)File Variables'
  2004-11-18 16:47           ` Stefan Monnier
  2004-11-18 17:07             ` Simon Krahnke
@ 2004-11-19  2:25             ` Richard Stallman
  2004-11-29 19:04               ` Reiner Steib
  1 sibling, 1 reply; 27+ messages in thread
From: Richard Stallman @ 2004-11-19  2:25 UTC
  Cc: miles, overlord, emacs-devel

    Let's not forget that the original question in this thread is whether we
    should *add* more documentation for the "unibyte:t" cookie.  I think we can
    answer "no" without formally declaring the feature obsolete (after all, it
    is already documented at other places, so we don't need to advertise it
    further).

I agree, on that question.

* Re: Info-search-whitespace
  2004-11-19  1:23                 ` Info-search-whitespace (Was: Improvements to `(emacs)File Variables') Juri Linkov
@ 2004-11-19  5:06                   ` Stefan Monnier
  2004-11-19 17:48                     ` Info-search-whitespace Juri Linkov
  2004-11-19 20:04                     ` Info-search-whitespace Richard Stallman
  2004-11-19  7:15                   ` Info-search-whitespace (Was: Improvements to `(emacs)File Variables') Eli Zaretskii
  1 sibling, 2 replies; 27+ messages in thread
From: Stefan Monnier @ 2004-11-19  5:06 UTC
  Cc: emacs-devel

> Info-search-whitespace to "unibyte:\\(?:\\s-+\\)*t" and on the line

Huh!?  The \(?:..\) is actually both useless and harmful here and should be
removed (if possible).


        Stefan

* Re: Info-search-whitespace (Was: Improvements to `(emacs)File Variables')
  2004-11-19  1:23                 ` Info-search-whitespace (Was: Improvements to `(emacs)File Variables') Juri Linkov
  2004-11-19  5:06                   ` Info-search-whitespace Stefan Monnier
@ 2004-11-19  7:15                   ` Eli Zaretskii
  1 sibling, 0 replies; 27+ messages in thread
From: Eli Zaretskii @ 2004-11-19  7:15 UTC
  Cc: emacs-devel

> From: Juri Linkov <juri@jurta.org>
> Date: Fri, 19 Nov 2004 03:23:29 +0200
> Cc: emacs-devel@gnu.org
> 
> Anyway, the standalone Info reader already doesn't search for a regexp
> with the `s' key, and the Emacs Info reader should not differ from it.

AFAIK, the standalone reader never searched for a regexp with the `s'
command (and isn't even linked with the regexp library).  IIRC, the
standalone reader was written _after_ the one in Emacs, not before it,
and was originally trying to emulate Emacs as much as was feasible;
evidently, the regexp-based search was one of those places where full
compatibility was deemed infeasible.

* Re: Info-search-whitespace
  2004-11-19  5:06                   ` Info-search-whitespace Stefan Monnier
@ 2004-11-19 17:48                     ` Juri Linkov
  2004-11-19 20:04                     ` Info-search-whitespace Richard Stallman
  1 sibling, 0 replies; 27+ messages in thread
From: Juri Linkov @ 2004-11-19 17:48 UTC
  Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:
>> Info-search-whitespace to "unibyte:\\(?:\\s-+\\)*t" and on the line
>
> The \(?:..\) is actually both useless and harmful here and should be
> removed (if possible).

The default value of Info-search-whitespace-regexp is the same as the
default value of search-whitespace-regexp used in regexp isearch.  And
C-M-s suffers from the same deficiency: Emacs hangs while re-searching
for an innocent-looking regexp like "unibyte: *t" on relatively long
whitespace regions even in non-Info buffers.

Removing the grouping constructs from the default values of both
variables would solve both problems.  I see no reason not to do so.
It seems Emacs interprets an expanded regexp "unibyte:\\s-+*t" as
"unibyte:\\s-*t" which works fine.

-- 
Juri Linkov
http://www.jurta.org/emacs/

* Re: Info-search-whitespace
  2004-11-19  5:06                   ` Info-search-whitespace Stefan Monnier
  2004-11-19 17:48                     ` Info-search-whitespace Juri Linkov
@ 2004-11-19 20:04                     ` Richard Stallman
  2004-11-19 20:41                       ` Info-search-whitespace David Kastrup
  1 sibling, 1 reply; 27+ messages in thread
From: Richard Stallman @ 2004-11-19 20:04 UTC
  Cc: juri, emacs-devel

    Huh!?  The \(?:..\) is actually both useless and harmful here and should be
    removed (if possible).

It isn't useless, since it prevents confusion of back-references to
explicit groups.

I think the only way to make this feature work really right
is to add a regexp feature that substitutes something else 
for bunches of spaces, but does it only when the spaces
are not inside something else.  It was easy to do, so I did it.

* Re: Info-search-whitespace
  2004-11-19 20:04                     ` Info-search-whitespace Richard Stallman
@ 2004-11-19 20:41                       ` David Kastrup
  2004-11-21 15:39                         ` Info-search-whitespace Richard Stallman
  0 siblings, 1 reply; 27+ messages in thread
From: David Kastrup @ 2004-11-19 20:41 UTC
  Cc: juri, Stefan Monnier, emacs-devel

Richard Stallman <rms@gnu.org> writes:

>     Huh!?  The \(?:..\) is actually both useless and harmful here
>     and should be removed (if possible).
>
> It isn't useless, since it prevents confusion of back-references to
> explicit groups.

He said removed, not replaced by an explicit group.  The shy group
encloses a single item and thus is completely redundant.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

* Re: Info-search-whitespace
  2004-11-19 20:41                       ` Info-search-whitespace David Kastrup
@ 2004-11-21 15:39                         ` Richard Stallman
  2004-11-21 16:09                           ` Info-search-whitespace David Kastrup
  2004-11-22  0:18                           ` Info-search-whitespace Stefan Monnier
  0 siblings, 2 replies; 27+ messages in thread
From: Richard Stallman @ 2004-11-21 15:39 UTC
  Cc: juri, monnier, emacs-devel

    He said removed, not replaced by an explicit group.  The shy group
    encloses a single item and thus is completely redundant.

Without the shy group, it would give wrong results when followed by ?.
The + and the ? would combine to give the wrong meaning.
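
A small illustration of the point, using Python's `re` module as a stand-in (Emacs regexps likewise treat `+?` as a non-greedy operator, but the syntax below is Python's and the sample strings are made up).  If a user types a space followed by `?`, the ungrouped expansion turns into `\s+?`, which still requires at least one whitespace character, whereas the grouped expansion makes the whitespace optional.

```python
import re

ungrouped = r"a\s+?b"    # space expanded to \s+, then followed by the user's ?
grouped = r"a(?:\s+)?b"  # space expanded to (?:\s+), then followed by ?

for text in ("ab", "a   b"):
    print(repr(text),
          bool(re.fullmatch(ungrouped, text)),
          bool(re.fullmatch(grouped, text)))
```

Only the grouped form accepts the input with no whitespace at all, which is presumably what a user typing " ?" intends.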

* Re: Info-search-whitespace
  2004-11-21 15:39                         ` Info-search-whitespace Richard Stallman
@ 2004-11-21 16:09                           ` David Kastrup
  2004-11-22  0:18                           ` Info-search-whitespace Stefan Monnier
  1 sibling, 0 replies; 27+ messages in thread
From: David Kastrup @ 2004-11-21 16:09 UTC
  Cc: juri, monnier, emacs-devel

Richard Stallman <rms@gnu.org> writes:

>     He said removed, not replaced by an explicit group.  The shy group
>     encloses a single item and thus is completely redundant.
>
> Without the shy group, it would give wrong results when followed by ?.
> The + and the ? would combine to give the wrong meaning.

I have here from the past discussion:

> Info-search-whitespace to "unibyte:\\(?:\\s-+\\)*t" and on the line

This appears to me to be perfectly equivalent to

"unibyte:\\s-*t"

Where is my mistake?
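
One way to probe the question empirically, sketched in Python (its `re` syntax stands in for Emacs regexps, so `(?:\s+)*` corresponds to `\(?:\s-+\)*` and `\s*` to `\s-*`; the sample strings and the helper name `span` are invented).  On short inputs both patterns find the same match, which suggests that any difference lies in how much work the matcher does rather than in what is matched.  A handful of samples is of course not a proof.

```python
import re

nested = re.compile(r"unibyte:(?:\s+)*t")
flat = re.compile(r"unibyte:\s*t")

def span(pattern, text):
    """Return the (start, end) of the first match, or None if there is none."""
    m = pattern.search(text)
    return m.span() if m else None

samples = ["unibyte:t", "-*- unibyte: t -*-", "unibyte: \t t", "unibyte: x"]
same = all(span(nested, s) == span(flat, s) for s in samples)
print(same)
```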

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

* Re: Info-search-whitespace
  2004-11-21 15:39                         ` Info-search-whitespace Richard Stallman
  2004-11-21 16:09                           ` Info-search-whitespace David Kastrup
@ 2004-11-22  0:18                           ` Stefan Monnier
  2004-11-23 16:30                             ` Info-search-whitespace Richard Stallman
  1 sibling, 1 reply; 27+ messages in thread
From: Stefan Monnier @ 2004-11-22  0:18 UTC
  Cc: juri, emacs-devel

>     He said removed, not replaced by an explicit group.  The shy group
>     encloses a single item and thus is completely redundant.

> Without the shy group, it would give wrong results when followed by ?.
> The + and the ? would combine to give the wrong meaning.

Then regex.c should maybe try and recognize \(?:f+\)* and optimize it like it
optimizes f+*.


        Stefan

* Re: Info-search-whitespace
  2004-11-22  0:18                           ` Info-search-whitespace Stefan Monnier
@ 2004-11-23 16:30                             ` Richard Stallman
  0 siblings, 0 replies; 27+ messages in thread
From: Richard Stallman @ 2004-11-23 16:30 UTC
  Cc: juri, emacs-devel

    Then regex.c should maybe try and recognize \(?:f+\)* and optimize it like it
    optimizes f+*.

That would be a good change, if someone wants to do it.

* Re: Improvements to `(emacs)File Variables'
  2004-11-19  2:25             ` Improvements to `(emacs)File Variables' Richard Stallman
@ 2004-11-29 19:04               ` Reiner Steib
  0 siblings, 0 replies; 27+ messages in thread
From: Reiner Steib @ 2004-11-29 19:04 UTC
  Cc: emacs-devel, monnier, overlord

On Fri, Nov 19 2004, Richard Stallman wrote:

>     Let's not forget that the original question in this thread is
>     whether we should *add* more documentation for the "unibyte:t"
>     cookie.  I think we can answer "no" without formally declaring
>     the feature obsolete (after all, it is already documented at
>     other places, so we don't need to advertise it further).
>
> I agree, on that question.

I have installed the suggested changes.

[ I understood that you agree to add documentation of "unibyte:t",
please correct me if I was wrong. ]

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/

Thread overview: 27+ messages
2004-11-14 19:02 Improvements to `(emacs)File Variables' Reiner Steib
2004-11-14 21:12 ` Stefan Monnier
2004-11-14 23:26   ` Miles Bader
2004-11-14 23:46     ` Stefan Monnier
2004-11-14 23:55       ` Miles Bader
2004-11-15  0:18         ` Stefan Monnier
2004-11-15  4:53           ` Miles Bader
2004-11-15  5:15             ` Stefan Monnier
2004-11-16 16:48             ` Richard Stallman
2004-11-16 16:49     ` Richard Stallman
2004-11-16 16:59       ` Stefan Monnier
2004-11-18  2:55         ` Richard Stallman
2004-11-18 16:47           ` Stefan Monnier
2004-11-18 17:07             ` Simon Krahnke
2004-11-18 18:04               ` Stefan Monnier
2004-11-19  1:23                 ` Info-search-whitespace (Was: Improvements to `(emacs)File Variables') Juri Linkov
2004-11-19  5:06                   ` Info-search-whitespace Stefan Monnier
2004-11-19 17:48                     ` Info-search-whitespace Juri Linkov
2004-11-19 20:04                     ` Info-search-whitespace Richard Stallman
2004-11-19 20:41                       ` Info-search-whitespace David Kastrup
2004-11-21 15:39                         ` Info-search-whitespace Richard Stallman
2004-11-21 16:09                           ` Info-search-whitespace David Kastrup
2004-11-22  0:18                           ` Info-search-whitespace Stefan Monnier
2004-11-23 16:30                             ` Info-search-whitespace Richard Stallman
2004-11-19  7:15                   ` Info-search-whitespace (Was: Improvements to `(emacs)File Variables') Eli Zaretskii
2004-11-19  2:25             ` Improvements to `(emacs)File Variables' Richard Stallman
2004-11-29 19:04               ` Reiner Steib
