unofficial mirror of emacs-devel@gnu.org 
* dashes and non-breaking spaces
@ 2004-12-28 13:21 Paul Pogonyshev
  2004-12-29 21:08 ` Stefan Monnier
  2005-01-03  4:31 ` Richard Stallman
  0 siblings, 2 replies; 31+ messages in thread
From: Paul Pogonyshev @ 2004-12-28 13:21 UTC (permalink / raw)


2004-12-21  Richard M. Stallman  <rms@gnu.org>

	* xdisp.c (get_next_display_element): Display codes 8a0 and 8ad
	specially as `\ ' and `\-'.

This breaks my Wikipedia mode pretty badly.  While I understand the change
for non-breaking space, escaping the dash like this looks really improper
with variable-pitch fonts.  Can we back out the dash-thing or make it
customizable in some way?

I also had non-breaking space highlighted with a special face that had a
very-light-gray background (the "normal" being white), so that they were
recognizable, yet the text was perfectly readable.  Maybe it is possible to
build it into Emacs somewhere on a low level?  I.e. non-breaking space
would be displayed without the backslash, but with a different face.

I'm not arguing about defaults here, but I really want some variables to
control these things.

Paul
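
A minimal sketch of the face-based display described above, done with a
buffer-local display table.  It assumes a current Unicode-based Emacs,
where the no-break space is character #xa0 (the build discussed in this
thread lists it as code 8a0).  `my-nobreak-space-face' and
`my-show-nobreak-space-with-face' are hypothetical names, not part of
Emacs.

```elisp
;; Hypothetical face: a light-gray background, as in the description above.
(defface my-nobreak-space-face
  '((t :background "gray90"))
  "Face for no-break spaces shown without an escape prefix.")

(defun my-show-nobreak-space-with-face ()
  "Display U+00A0 in the current buffer as a space in `my-nobreak-space-face'."
  (let ((table (or buffer-display-table (make-display-table))))
    ;; A display-table entry is a vector of glyphs shown in place of the
    ;; character; `make-glyph-code' attaches a face to a glyph.
    (aset table #xa0
          (vector (make-glyph-code ?\s 'my-nobreak-space-face)))
    (setq buffer-display-table table)))
```

Calling the function from a mode hook would apply it per buffer, leaving
other buffers with the default display.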


* Re: dashes and non-breaking spaces
  2004-12-28 13:21 dashes and non-breaking spaces Paul Pogonyshev
@ 2004-12-29 21:08 ` Stefan Monnier
  2005-01-02 15:24   ` Paul Pogonyshev
                     ` (2 more replies)
  2005-01-03  4:31 ` Richard Stallman
  1 sibling, 3 replies; 31+ messages in thread
From: Stefan Monnier @ 2004-12-29 21:08 UTC (permalink / raw)
  Cc: rms, emacs-devel

> 2004-12-21  Richard M. Stallman  <rms@gnu.org>
> 	* xdisp.c (get_next_display_element): Display codes 8a0 and 8ad
> 	specially as `\ ' and `\-'.

I think this change is wrong.  I understand the need to see such differences
at times, but not always.  It should be set using display-tables via some 
minor mode (similar to the minor mode that shows trailing whitespace).


        Stefan


* Re: dashes and non-breaking spaces
  2004-12-29 21:08 ` Stefan Monnier
@ 2005-01-02 15:24   ` Paul Pogonyshev
  2005-01-04 20:53   ` Karl Eichwalder
  2005-01-07 13:59   ` Kim F. Storm
  2 siblings, 0 replies; 31+ messages in thread
From: Paul Pogonyshev @ 2005-01-02 15:24 UTC (permalink / raw)
  Cc: rms, emacs-devel

Stefan Monnier wrote:
> > 2004-12-21  Richard M. Stallman  <rms@gnu.org>
> > 	* xdisp.c (get_next_display_element): Display codes 8a0 and 8ad
> > 	specially as `\ ' and `\-'.
>
> I think this change is wrong.  I understand the need to see such
> differences at times, but not always.  It should be set using
> display-tables via some minor mode (similar to the minor mode that shows
> trailing whitespace).

So, can we do something about it?

Paul


* Re: dashes and non-breaking spaces
  2004-12-28 13:21 dashes and non-breaking spaces Paul Pogonyshev
  2004-12-29 21:08 ` Stefan Monnier
@ 2005-01-03  4:31 ` Richard Stallman
  1 sibling, 0 replies; 31+ messages in thread
From: Richard Stallman @ 2005-01-03  4:31 UTC (permalink / raw)
  Cc: emacs-devel

    This breaks my Wikipedia mode pretty badly.  While I understand the change
    for non-breaking space, escaping the dash like this looks really improper
    with variable-pitch fonts.  Can we back out the dash-thing or make it
    customizable in some way?

You can customize it by making a display table.  I did not want to
have a display table by default, but using them for specific
customizations is what they are for.
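
A display table of the kind suggested here could look like the sketch
below.  It assumes a current Unicode-based Emacs, where the two
characters are #xa0 and #xad rather than the codes 8a0 and 8ad named in
the ChangeLog entry; `my-plain-nobreak-display' is a hypothetical name.

```elisp
(defun my-plain-nobreak-display ()
  "Show U+00A0 and U+00AD as an ordinary space and hyphen in this buffer."
  (let ((table (or buffer-display-table (make-display-table))))
    ;; Each entry is a vector of glyphs displayed instead of the
    ;; character; a bare character code is the simplest kind of glyph.
    (aset table #xa0 (vector ?\s))
    (aset table #xad (vector ?-))
    ;; `buffer-display-table' is per-buffer, so other buffers keep
    ;; the default escape display.
    (setq buffer-display-table table)))
```

A mode such as the Wikipedia mode mentioned above could call this from
its mode function or hook.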


* Re: dashes and non-breaking spaces
  2004-12-29 21:08 ` Stefan Monnier
  2005-01-02 15:24   ` Paul Pogonyshev
@ 2005-01-04 20:53   ` Karl Eichwalder
  2005-01-05  5:46     ` Juri Linkov
  2005-01-07 13:59   ` Kim F. Storm
  2 siblings, 1 reply; 31+ messages in thread
From: Karl Eichwalder @ 2005-01-04 20:53 UTC (permalink / raw)
  Cc: emacs-devel, rms, Paul Pogonyshev

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> 2004-12-21  Richard M. Stallman  <rms@gnu.org>
>> 	* xdisp.c (get_next_display_element): Display codes 8a0 and 8ad
>> 	specially as `\ ' and `\-'.
>
> I think this change is wrong.  I understand the need to see such differences
> at times, but not always.

Yes, it is useful to make these codes visible in edit modes; displaying
it in most viewing modes like « mail » and « news readers » is
annoying : some writers with a French background use
« no-breaking space » very often !

Using blue for the backslash is also arguable; a single slash in blue is
hard to distinguish from the surrounding text in black.  In the past I used
an IndianRed underscore for the no-breaking space.

-- 
http://www.gnu.franken.de/ke/                           |      ,__o
                                                        |    _-\_<,
                                                        |   (*)/'(*)
Key fingerprint = F138 B28F B7ED E0AC 1AB4  AA7F C90A 35C3 E9D0 5D1C


* Re: dashes and non-breaking spaces
  2005-01-04 20:53   ` Karl Eichwalder
@ 2005-01-05  5:46     ` Juri Linkov
  2005-01-06  4:53       ` Richard Stallman
  0 siblings, 1 reply; 31+ messages in thread
From: Juri Linkov @ 2005-01-05  5:46 UTC (permalink / raw)
  Cc: pogonyshev, monnier, emacs-devel

Karl Eichwalder <ke@gnu.franken.de> writes:
> Yes, it is useful to make these codes visible in edit modes; displaying
> it in most viewing modes like « mail » and « news readers » is
> annoying

I agree.  It is useful only in editing modes.  This is like highlighting
trailing whitespace which is useful only for writable files, i.e. where
whitespace deletions can be saved.  There should be a user option
specifying the predicate to activate the mode, with the default value
like `buffer-read-only'.

> Using blue for the backslash is also arguable; a single slash in blue is
> hard to distinguish from the surrounding text in black.  In the past I used
> an IndianRed underscore for the no-breaking space.

Yes, blue is one of the most unsuitable colors.  It seems everyone already
agreed on the following colors for escape glyphs:

(defface escape-glyph
  '((((class color) (min-colors 88) (background light))
     :foreground "dark red")
    (((class color) (min-colors 88) (background dark))
     :foreground "tan1")
    (((class color) (min-colors 8))
     :foreground "red"))
  "Face for characters displayed as ^-sequences or \\-sequences."
  :group 'basic-faces)

If these colors are unsuitable for no-break spaces then perhaps
a better face is `trailing-whitespace', or maybe it will require
a new face.

-- 
Juri Linkov
http://www.jurta.org/emacs/


* Re: dashes and non-breaking spaces
  2005-01-05  5:46     ` Juri Linkov
@ 2005-01-06  4:53       ` Richard Stallman
  2005-01-12  2:02         ` Juri Linkov
  0 siblings, 1 reply; 31+ messages in thread
From: Richard Stallman @ 2005-01-06  4:53 UTC (permalink / raw)
  Cc: pogonyshev, emacs-devel, monnier, ke

    Yes, blue is one of the most unsuitable colors.  It seems everyone already
    agreed on the following colors for escape glyphs:

When did we even discuss them?


* Re: dashes and non-breaking spaces
  2004-12-29 21:08 ` Stefan Monnier
  2005-01-02 15:24   ` Paul Pogonyshev
  2005-01-04 20:53   ` Karl Eichwalder
@ 2005-01-07 13:59   ` Kim F. Storm
  2005-01-07 23:04     ` Richard Stallman
  2 siblings, 1 reply; 31+ messages in thread
From: Kim F. Storm @ 2005-01-07 13:59 UTC (permalink / raw)
  Cc: emacs-devel, rms, Paul Pogonyshev

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> 2004-12-21  Richard M. Stallman  <rms@gnu.org>
>> 	* xdisp.c (get_next_display_element): Display codes 8a0 and 8ad
>> 	specially as `\ ' and `\-'.
>
> I think this change is wrong.  I understand the need to see such differences
> at times, but not always.  It should be set using display-tables via some 
> minor mode (similar to the minor mode that shows trailing whitespace).

I agree 100%

What about adding a show-nonbreak-escape option or some such?
If so, should it be on or off by default?

-- 
Kim F. Storm <storm@cua.dk> http://www.cua.dk


* Re: dashes and non-breaking spaces
  2005-01-07 13:59   ` Kim F. Storm
@ 2005-01-07 23:04     ` Richard Stallman
  2005-01-09  2:17       ` Kim F. Storm
  0 siblings, 1 reply; 31+ messages in thread
From: Richard Stallman @ 2005-01-07 23:04 UTC (permalink / raw)
  Cc: emacs-devel, monnier, pogonyshev

    What about adding a show-nonbreak-escape option or some such?

I wouldn't mind that option.  I think it should be enabled by default.


* Re: dashes and non-breaking spaces
  2005-01-07 23:04     ` Richard Stallman
@ 2005-01-09  2:17       ` Kim F. Storm
  0 siblings, 0 replies; 31+ messages in thread
From: Kim F. Storm @ 2005-01-09  2:17 UTC (permalink / raw)
  Cc: emacs-devel, monnier, pogonyshev

Richard Stallman <rms@gnu.org> writes:

>     What about adding a show-nonbreak-escape option or some such?
>
> I wouldn't mind that option.  I think it should be enabled by default.

I have implemented that option.

I have also fixed the problems with merging the escape-glyph face
with the current face / region face.

I also fixed it so that if e.g. the escape glyph defined by
a display table has an embedded face, that face is used for the
following characters just like the escape-glyph face.

I have not done anything about merging faces in display table glyphs
in general.  However, it should be fairly trivial with the new
merge_into_realized_face function I have just added.

-- 
Kim F. Storm <storm@cua.dk> http://www.cua.dk


* Re: dashes and non-breaking spaces
  2005-01-06  4:53       ` Richard Stallman
@ 2005-01-12  2:02         ` Juri Linkov
  2005-01-12  4:41           ` Miles Bader
  0 siblings, 1 reply; 31+ messages in thread
From: Juri Linkov @ 2005-01-12  2:02 UTC (permalink / raw)
  Cc: monnier, emacs-devel

Richard Stallman <rms@gnu.org> writes:
>     Yes, blue is one of the most unsuitable colors.  It seems everyone already
>     agreed on the following colors for escape glyphs:
>
> When did we even discuss them?

Below is your message on this subject.  And it seems everyone agreed
on new colors (with the exception of the colors for 8-color terminals).

From: Richard Stallman <rms@gnu.org>
Subject: Re: [Emacs-trunk-diffs] Changes to emacs/lisp/faces.el
To: Juri Linkov <juri@jurta.org>
Cc: storm@cua.dk, occitan@esperanto.org, emacs-devel@gnu.org
Date: Sat, 25 Dec 2004 10:12:48 -0500
Reply-To: rms@gnu.org

    The current `blue' is one of the most unsuitable colors.  It stands out
    very much.  Blue is mostly used for highlighting the important parts
    of the buffer: in programming modes it highlights function names,
    in Dired - file names.  Using it for escape sequences makes them
    too distracting.

    I highly recommend `dark red' for numerous reasons:

If people generally agree with you, I'll change it.


* Re: dashes and non-breaking spaces
  2005-01-12  2:02         ` Juri Linkov
@ 2005-01-12  4:41           ` Miles Bader
  2005-01-12  6:39             ` Karl Eichwalder
  0 siblings, 1 reply; 31+ messages in thread
From: Miles Bader @ 2005-01-12  4:41 UTC (permalink / raw)
  Cc: emacs-devel, rms, monnier

>     I highly recommend `dark red' for numerous reasons:
> 
> If people generally agree with you, I'll change it.

Um, I don't agree.  On a black background, the current cyan seems fine
to me; the amount of "stand outness" looks about right.  I think the
sort of glyphs being highlighted by this code are quite unusual, and
_should_ stand out at least a bit.

[No comment on the light-background appearance.]

Looking at past threads, there seems to be no particular consensus for
changing the color (though if it's a very old discussion I might not
have it around any more).

-Miles


* Re: dashes and non-breaking spaces
  2005-01-12  4:41           ` Miles Bader
@ 2005-01-12  6:39             ` Karl Eichwalder
  2005-01-12 20:58               ` Miles Bader
  2005-01-13 20:29               ` Richard Stallman
  0 siblings, 2 replies; 31+ messages in thread
From: Karl Eichwalder @ 2005-01-12  6:39 UTC (permalink / raw)
  Cc: Juri Linkov, emacs-devel, rms, monnier, miles

Miles Bader <snogglethorpe@gmail.com> writes:

> Um, I don't agree.  On a black background, the current cyan seems fine
> to me; the amount of "stand outness" looks about right.  I think the
> sort of glyphs being highlighted by this code are quite unusual, and
> _should_ stand out at least a bit.

Yes, it should stand out, but not everywhere.  Reading arbitrary mail
messages I am not interested in such technical details. In Gnus by
default, it must not be highlighted at all; in modes like xml or tex it
ought to be as visible as possible (because there it probably is an
error you would like to correct as fast as possible).

-- 
http://www.gnu.franken.de/ke/                           |      ,__o
                                                        |    _-\_<,
                                                        |   (*)/'(*)
Key fingerprint = F138 B28F B7ED E0AC 1AB4  AA7F C90A 35C3 E9D0 5D1C


* Re: dashes and non-breaking spaces
  2005-01-12  6:39             ` Karl Eichwalder
@ 2005-01-12 20:58               ` Miles Bader
  2005-01-12 21:58                 ` Stefan Monnier
  2005-01-13 20:29               ` Richard Stallman
  1 sibling, 1 reply; 31+ messages in thread
From: Miles Bader @ 2005-01-12 20:58 UTC (permalink / raw)
  Cc: Juri Linkov, emacs-devel, rms, monnier, miles

On Wed, 12 Jan 2005 07:39:11 +0100, Karl Eichwalder <ke@gnu.franken.de> wrote:
> > Um, I don't agree.  On a black background, the current cyan seems fine
> > to me; the amount of "stand outness" looks about right.  I think the
> > sort of glyphs being highlighted by this code are quite unusual, and
> > _should_ stand out at least a bit.
> 
> Yes, it should stand out, but not everywhere.  Reading arbitrary mail
> messages I am not interested in such technical details. In Gnus by
> default, it must not be highlighted at all; in modes like xml or tex it
> ought to be as visible as possible (because there it probably is an
> error you would like to correct as fast as possible).

Well what's being discussed is the _default_, and as I said, I think
the current colors (on a black background) are just about right ---
they stand out enough to be somewhat noticeable, but not so much as to
interfere with reading.

[I disagree about Gnus BTW:  I see highlighted escapes in messages
_now_, and really like it; usually it's some dimwit using microsoft
"quotes" etc., and seeing them earlier allows me to run
`gnus-summary-treat-dumbquotes' before I start reading.]

-Miles


* Re: dashes and non-breaking spaces
  2005-01-12 20:58               ` Miles Bader
@ 2005-01-12 21:58                 ` Stefan Monnier
  2005-01-12 22:07                   ` Miles Bader
  0 siblings, 1 reply; 31+ messages in thread
From: Stefan Monnier @ 2005-01-12 21:58 UTC (permalink / raw)
  Cc: Juri Linkov, Karl Eichwalder, emacs-devel, rms, miles

> [I disagree about Gnus BTW:  I see highlighted escapes in messages
> _now_, and really like it; usually it's some dimwit using microsoft
> "quotes" etc., and seeing them earlier allows me to run
> `gnus-summary-treat-dumbquotes' before I start reading.]

Agreed for escape sequences in the 0-31 and 128-255 regions.
But not for the "\ " and "\-" which seem to be often used in email
that is properly labelled (at least the ones written in French).

I.e. agreed for the ones for which there is no accepted "normal" way to
display them, but not for the ones for which there is a well-defined way
they *should* be displayed.


        Stefan


* Re: dashes and non-breaking spaces
  2005-01-12 21:58                 ` Stefan Monnier
@ 2005-01-12 22:07                   ` Miles Bader
  2005-01-12 22:30                     ` Stefan Monnier
  0 siblings, 1 reply; 31+ messages in thread
From: Miles Bader @ 2005-01-12 22:07 UTC (permalink / raw)
  Cc: Juri Linkov, Karl Eichwalder, emacs-devel, rms, miles

On Wed, 12 Jan 2005 16:58:46 -0500, Stefan Monnier
<monnier@iro.umontreal.ca> wrote:
> Agreed for escape sequences in the 0-31 and 128-255 regions.
> But not for the "\ " and "\-" which seem to be often used in email
> that is properly labelled (at least the ones written in French).
> 
> I.e. agreed for the ones for which there is no accepted "normal" way to
> display them, but not for the ones for which there is a well-defined way
> they *should* be displayed.

Sure -- but I thought the plan was to have a variable or use
display-table tweaks to simply display those characters as normal " "
and "-" in such contexts.  I.e. when they're displayed as "escape
characters", there's no reason to treat them differently from other
escapes.

-Miles


* Re: dashes and non-breaking spaces
  2005-01-12 22:07                   ` Miles Bader
@ 2005-01-12 22:30                     ` Stefan Monnier
  2005-01-13  9:29                       ` David Kastrup
  0 siblings, 1 reply; 31+ messages in thread
From: Stefan Monnier @ 2005-01-12 22:30 UTC (permalink / raw)
  Cc: Juri Linkov, Karl Eichwalder, emacs-devel, rms, miles

> Sure -- but I thought the plan was to have a variable or use
> display-table tweaks to simply display those characters as normal " "
> and "-" in such contexts.  I.e. when they're displayed as "escape
> characters", there's no reason to treat them differently from other
> escapes.

Yes, sorry, I was confused.
I guess what I wanted to say is not that the "\ " and "\-" shouldn't be
highlighted, but that Gnus should set `show-nonbreak-escape' to nil.


        Stefan
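
A sketch of what such a per-mode setting might look like from user
configuration.  Assumptions: released Emacs versions appear to expose
the option discussed here under the name `nobreak-char-display', so the
code checks for both names; `my-hide-nobreak-escapes' is a hypothetical
helper; and `gnus-article-mode-hook' is used as the place to run it.

```elisp
(defun my-hide-nobreak-escapes ()
  "Turn off the special display of no-break characters in this buffer."
  (cond
   ;; Name assumed for released Emacs versions.
   ((boundp 'nobreak-char-display)
    (set (make-local-variable 'nobreak-char-display) nil))
   ;; Name used for the option in this thread.
   ((boundp 'show-nonbreak-escape)
    (set (make-local-variable 'show-nonbreak-escape) nil))))

;; Apply it to article buffers only; editing buffers keep the default.
(add-hook 'gnus-article-mode-hook #'my-hide-nobreak-escapes)
```

Making the variable buffer-local keeps the escapes visible in editing
modes, which matches the editing versus viewing split argued for above.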


* Re: dashes and non-breaking spaces
  2005-01-12 22:30                     ` Stefan Monnier
@ 2005-01-13  9:29                       ` David Kastrup
  2005-01-13 10:37                         ` Miles Bader
  0 siblings, 1 reply; 31+ messages in thread
From: David Kastrup @ 2005-01-13  9:29 UTC (permalink / raw)
  Cc: rms, emacs-devel, Juri Linkov, Karl Eichwalder, snogglethorpe,
	miles

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Sure -- but I thought the plan was to have a variable or use
>> display-table tweaks to simply display those characters as normal "
>> " and "-" in such contexts.  I.e. when they're displayed as "escape
>> characters", there's no reason to treat them differently from other
>> escapes.
>
> Yes, sorry, I was confused.  I guess what I wanted to say is not
> that the "\ " and "\-" shouldn't be highlighted, but that Gnus
> should set `show-nonbreak-escape' to nil.

Shouldn't that be the general setting of text modes?

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum


* Re: dashes and non-breaking spaces
  2005-01-13  9:29                       ` David Kastrup
@ 2005-01-13 10:37                         ` Miles Bader
  2005-01-14 11:33                           ` Richard Stallman
  0 siblings, 1 reply; 31+ messages in thread
From: Miles Bader @ 2005-01-13 10:37 UTC (permalink / raw)
  Cc: rms, emacs-devel, Juri Linkov, Stefan Monnier, Karl Eichwalder,
	miles

> > Yes, sorry, I was confused.  I guess what I wanted to say is not
> > that the "\ " and "\-" shouldn't be highlighted, but that Gnus
> > should set `show-nonbreak-escape' to nil.
> 
> Shouldn't that be the general setting of text modes?

I don't think so -- it basically should show them as escapes when
editing, and as if they were normal spaces/dashes when "viewing". 
Normal text modes are for editing.

-Miles


* Re: dashes and non-breaking spaces
  2005-01-12  6:39             ` Karl Eichwalder
  2005-01-12 20:58               ` Miles Bader
@ 2005-01-13 20:29               ` Richard Stallman
  2005-01-13 21:11                 ` Karl Eichwalder
  2005-01-13 21:39                 ` Paul Pogonyshev
  1 sibling, 2 replies; 31+ messages in thread
From: Richard Stallman @ 2005-01-13 20:29 UTC (permalink / raw)
  Cc: juri, snogglethorpe, emacs-devel, monnier, miles

    Yes, it should stand out, but not everywhere.  Reading arbitrary mail
    messages I am not interested in such technical details. In Gnus by
    default, it must not be highlighted at all;

The word "must" makes this statement so strong that it surprises me.
Why do you think this is such a great problem?


* Re: dashes and non-breaking spaces
  2005-01-13 20:29               ` Richard Stallman
@ 2005-01-13 21:11                 ` Karl Eichwalder
  2005-01-13 21:24                   ` Stefan Monnier
  2005-01-15  0:12                   ` Richard Stallman
  2005-01-13 21:39                 ` Paul Pogonyshev
  1 sibling, 2 replies; 31+ messages in thread
From: Karl Eichwalder @ 2005-01-13 21:11 UTC (permalink / raw)
  Cc: juri, snogglethorpe, emacs-devel, monnier, miles

Richard Stallman <rms@gnu.org> writes:

> The word "must" makes this statement so strong that it surprises me.
> Why do you think this is such a great problem?

I often receive broken UTF-8 messages.  People quote UTF-8 texts with
German "umlauts" and treat these texts as iso-8859-1 encoded resulting
in escapes instead of 'ß', for example:

    > 'Die Stadt heiÃ\237t "Lutherstadt Wittenberg".'
                 heißt (corrected)

"\237" and some other escapes can often occur and then the text is
disturbed with many blue fragments.

Messages written in French often contain non-breaking spaces
intentionally and legitimately; adding a special marker to them destroys
the text (Stefan Monnier explained it in a previous message).
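
The "\237" in the example above can be reproduced with a short sketch,
assuming a current Unicode-based Emacs: encode the word as UTF-8, then
decode those bytes as iso-8859-1.  The variable name `my-mojibake' is
hypothetical.

```elisp
;; "hei\u00dft" is "heißt"; U+00DF becomes the two UTF-8 bytes #xC3 #x9F.
(defvar my-mojibake
  (decode-coding-string (encode-coding-string "hei\u00dft" 'utf-8)
                        'iso-8859-1)
  "The word \"heißt\" after a UTF-8 to iso-8859-1 mix-up.")

;; Read as iso-8859-1, byte #xC3 is "Ã" and byte #x9F is the control
;; character U+009F, whose octal code is 237.
(append my-mojibake nil)   ; the character codes, as a list
```

The list should contain #xc3 followed by #x9f where the single "ß" used
to be, and the second of these is what gets displayed as an escape.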

-- 
http://www.gnu.franken.de/ke/                           |      ,__o
                                                        |    _-\_<,
                                                        |   (*)/'(*)
Key fingerprint = F138 B28F B7ED E0AC 1AB4  AA7F C90A 35C3 E9D0 5D1C


* Re: dashes and non-breaking spaces
  2005-01-13 21:11                 ` Karl Eichwalder
@ 2005-01-13 21:24                   ` Stefan Monnier
  2005-01-13 21:59                     ` Karl Eichwalder
  2005-01-15  0:12                   ` Richard Stallman
  1 sibling, 1 reply; 31+ messages in thread
From: Stefan Monnier @ 2005-01-13 21:24 UTC (permalink / raw)
  Cc: juri, snogglethorpe, emacs-devel, rms, miles

>> The word "must" makes this statement so strong that it surprises me.
>> Why do you think this is such a great problem?

> I often receive broken UTF-8 messages.  People quote UTF-8 texts with
> German "umlauts" and treat these texts as iso-8859-1 encoded resulting
> in escapes instead of 'ß', for example:

>> 'Die Stadt heiÃ\237t "Lutherstadt Wittenberg".'
>                  heißt (corrected)

> "\237" and some other escapes can often occur and then the text is
> disturbed with many blue fragments.

Well, to me whether the \237 is written in blue or not doesn't make
any difference: it's garbage.  Do you really think that displaying the
escape sequence in the normal face rather than in blue makes the text much
less "disturbed"?

> Messages written in French often contain non-breaking spaces
> intentionally and legitimately; adding a special marker to them destroys
> the text (Stefan Monnier explained it in a previous message).

Right, and this is a separate concern and is controlled by
a different variable.


        Stefan


* Re: dashes and non-breaking spaces
  2005-01-13 20:29               ` Richard Stallman
  2005-01-13 21:11                 ` Karl Eichwalder
@ 2005-01-13 21:39                 ` Paul Pogonyshev
  2005-01-15  0:12                   ` Richard Stallman
  1 sibling, 1 reply; 31+ messages in thread
From: Paul Pogonyshev @ 2005-01-13 21:39 UTC (permalink / raw)
  Cc: juri, miles, snogglethorpe, monnier, emacs-devel

Richard Stallman wrote:
>     Yes, it should stand out, but not everywhere.  Reading arbitrary mail
>     messages I am not interested in such technical details. In Gnus by
>     default, it must not be highlighted at all;
>
> The word "must" makes this statement so strong that it surprises me.
> Why do you think this is such a great problem?

Because it makes text harder to read.  You generally don't care if there
is a non-breaking space or a space in a mail your buddy sent to you, you
just read the message and don't want various extra symbols or highlights
to get in your way.

Paul


* Re: dashes and non-breaking spaces
  2005-01-13 21:24                   ` Stefan Monnier
@ 2005-01-13 21:59                     ` Karl Eichwalder
  0 siblings, 0 replies; 31+ messages in thread
From: Karl Eichwalder @ 2005-01-13 21:59 UTC (permalink / raw)
  Cc: juri, snogglethorpe, emacs-devel, rms, miles

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> Do you really think that displaying the escape sequence in the normal
> face rather than in blue makes the text much less "disturbed"?

Sure.  It demands special attention, or rather it distracts your attention
from the new text (you are interested in the new text, not in the
wrongly encoded quotations).

-- 
http://www.gnu.franken.de/ke/                           |      ,__o
                                                        |    _-\_<,
                                                        |   (*)/'(*)
Key fingerprint = F138 B28F B7ED E0AC 1AB4  AA7F C90A 35C3 E9D0 5D1C


* Re: dashes and non-breaking spaces
  2005-01-13 10:37                         ` Miles Bader
@ 2005-01-14 11:33                           ` Richard Stallman
  0 siblings, 0 replies; 31+ messages in thread
From: Richard Stallman @ 2005-01-14 11:33 UTC (permalink / raw)
  Cc: emacs-devel, juri, monnier, ke, miles

    > Shouldn't that be the general setting of text modes?

    I don't think so -- it basically should show them as escapes when
    editing, and as if they were normal spaces/dashes when "viewing". 
    Normal text modes are for editing.

I agree.


* Re: dashes and non-breaking spaces
  2005-01-13 21:11                 ` Karl Eichwalder
  2005-01-13 21:24                   ` Stefan Monnier
@ 2005-01-15  0:12                   ` Richard Stallman
  2005-01-15  6:47                     ` Karl Eichwalder
  1 sibling, 1 reply; 31+ messages in thread
From: Richard Stallman @ 2005-01-15  0:12 UTC (permalink / raw)
  Cc: juri, snogglethorpe, emacs-devel, monnier, miles


    I often receive broken UTF-8 messages.  People quote UTF-8 texts with
    German "umlauts" and treat these texts as iso-8859-1 encoded resulting
    in escapes instead of 'ß', for example:

It would be interesting to investigate why those messages were decoded
incorrectly.  Maybe we could improve the decoding heuristics.


* Re: dashes and non-breaking spaces
  2005-01-13 21:39                 ` Paul Pogonyshev
@ 2005-01-15  0:12                   ` Richard Stallman
  0 siblings, 0 replies; 31+ messages in thread
From: Richard Stallman @ 2005-01-15  0:12 UTC (permalink / raw)
  Cc: emacs-devel, juri, monnier, ke, snogglethorpe, miles

    Because it makes text harder to read.  You generally don't care if there
    is a non-breaking space or a space in a mail your buddy sent to you, you
    just read the message and don't want various extra symbols or highlights
    to get in your way.

Ok, I agree.  (I still wouldn't use the word "must", though.)


* Re: dashes and non-breaking spaces
  2005-01-15  0:12                   ` Richard Stallman
@ 2005-01-15  6:47                     ` Karl Eichwalder
  2005-01-15 14:05                       ` Benjamin Riefenstahl
  0 siblings, 1 reply; 31+ messages in thread
From: Karl Eichwalder @ 2005-01-15  6:47 UTC (permalink / raw)
  Cc: juri, snogglethorpe, emacs-devel, monnier, miles

Richard Stallman <rms@gnu.org> writes:

>     I often receive broken UTF-8 messages.  People quote UTF-8 texts with
>     German "umlauts" and treat these texts as iso-8859-1 encoded resulting
>     in escapes instead of 'ß', for example:
>
> It would be interesting to investigate why those messages were decoded
> incorrectly.

Broken by the mail program of my mail partner.  His mail program treats
all mails as iso-8859-1 or windows-1252 encoded, and the encodings get
mixed when he replies.

> Maybe we could improve the decoding heuristics.

I think it isn't worth the trouble (too many false positives?).  If you
want to try: If an iso-8859-1 labeled text contains escapes, most
probably it is windows-1252 encoded; if there are still escapes and the
text is quoted, try to treat it as UTF-8.

-- 
http://www.gnu.franken.de/ke/                           |      ,__o
                                                        |    _-\_<,
                                                        |   (*)/'(*)
Key fingerprint = F138 B28F B7ED E0AC 1AB4  AA7F C90A 35C3 E9D0 5D1C


* Re: dashes and non-breaking spaces
  2005-01-15  6:47                     ` Karl Eichwalder
@ 2005-01-15 14:05                       ` Benjamin Riefenstahl
  2005-01-15 15:36                         ` Karl Eichwalder
  0 siblings, 1 reply; 31+ messages in thread
From: Benjamin Riefenstahl @ 2005-01-15 14:05 UTC (permalink / raw)
  Cc: rms, emacs-devel

Hi Karl, all,


Karl Eichwalder writes:
> Broken by the mail program of my mail partner.  His mail program
> treats all mails as iso-8859-1 or windows-1252 encoded, and the
> encodings get mixed when he replies.
>
> [...]
>
> I think it isn't worth the trouble (too many false positives?).  If
> you want to try: If an iso-8859-1 labeled text contains escapes,
> most probably it is windows-1252 encoded; if there are still escapes
> and the text is quoted, try to treat it as UTF-8.

Those are not "escapes" strictly speaking.  If you decode UTF-8 as
cp1252 or latin-1 you just get sequences of unusual non-ASCII
characters.

If the problem occurs regularly with texts marked as iso-8859-1, you
can try UTF-8 first and then fall back to cp1252.

First try to decode the text as UTF-8.  Because UTF-8 follows some
very strict rules, it's possible to check for these rules, and then
the probability of mistaking any non-UTF-8 text for UTF-8 is very low in
general (< 1%, I believe, even for short texts).  This is even more
so for latin-1 or cp1252 texts, because these encode languages where
sequences of non-ASCII characters are rare in the first place.

If the text is not UTF-8, just treat it as cp1252.  Encoding-wise all
texts that are latin-1 can be displayed as cp1252 without any
problems.


benny
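
The procedure above could be sketched as follows, assuming a current
Unicode-based Emacs in which bytes that are not valid UTF-8 survive
decoding as raw-byte characters (codes #x3fff80 and up).  The names
`my-raw-byte-p' and `my-decode-utf8-or-cp1252' are hypothetical, and
this is only an illustration, not what Emacs or Gnus does.

```elisp
(defun my-raw-byte-p (ch)
  "Return non-nil if CH is one of Emacs's raw 8-bit byte characters."
  (>= ch #x3fff80))

(defun my-decode-utf8-or-cp1252 (bytes)
  "Decode unibyte string BYTES as UTF-8, falling back to windows-1252."
  (let ((decoded (decode-coding-string bytes 'utf-8))
        (invalid nil))
    ;; Any raw-byte character means the input broke the UTF-8 rules.
    (mapc (lambda (ch) (when (my-raw-byte-p ch) (setq invalid t)))
          decoded)
    (if invalid
        (decode-coding-string bytes 'windows-1252)
      decoded)))

(my-decode-utf8-or-cp1252 "hei\303\237t")   ; UTF-8 bytes for "heißt"
(my-decode-utf8-or-cp1252 "hei\337t")       ; cp1252 bytes for "heißt"
```

Both calls should come out as "heißt": the first because the bytes are
valid UTF-8, the second through the fallback.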


* Re: dashes and non-breaking spaces
  2005-01-15 14:05                       ` Benjamin Riefenstahl
@ 2005-01-15 15:36                         ` Karl Eichwalder
  2005-01-15 17:30                           ` Benjamin Riefenstahl
  0 siblings, 1 reply; 31+ messages in thread
From: Karl Eichwalder @ 2005-01-15 15:36 UTC (permalink / raw)
  Cc: rms, emacs-devel

Benjamin Riefenstahl <b.riefenstahl@turtle-trading.net> writes:

> Those are not "escapes" strictly speaking.  If you decode UTF-8 as
> cp1252 or latin-1 you just get sequences of unusual non-ASCII
> characters.

The point is that Emacs treats them as escapes...

> If the problem occurs regularly with texts marked as iso-8859-1, you
> can try UTF-8 first and then fall back to cp1252.

That's not the problem.  The problem is that Emacs now applies colors
to them.  (And as I said, those messages are broken; only a part (the
quoted text) is "wrong".)

-- 
http://www.gnu.franken.de/ke/                           |      ,__o
                                                        |    _-\_<,
                                                        |   (*)/'(*)
Key fingerprint = F138 B28F B7ED E0AC 1AB4  AA7F C90A 35C3 E9D0 5D1C

* Re: dashes and non-breaking spaces
  2005-01-15 15:36                         ` Karl Eichwalder
@ 2005-01-15 17:30                           ` Benjamin Riefenstahl
  0 siblings, 0 replies; 31+ messages in thread
From: Benjamin Riefenstahl @ 2005-01-15 17:30 UTC (permalink / raw)
  Cc: rms, emacs-devel

Hi Karl,


> Benjamin Riefenstahl <b.riefenstahl@turtle-trading.net> writes:
>> Those are not "escapes" strictly speaking.  If you decode UTF-8 as
>> cp1252 or latin-1 you just get sequences of unusual non-ASCII
>> characters.

Karl Eichwalder writes:
> The point is that Emacs treats them as escapes...

What does "escape" mean here?  There are no escapes of any kind in
latin-1 in the non-ASCII region, as I understand it.  If this is just
another term for "control character", the only valid control
character, if you want to call it that, is \u00A0, NBSP (see subject).

In email, if there are characters in a latin-1 text from the reserved
C1 region (\u0080-\u009F), that is just an indication that it's not
actually latin-1, but that windows-1252 (cp1252) is used.  That
confusion is extremely common.

If the non-ASCII bytes in there form valid UTF-8 sequences, that
means that it *is* UTF-8 (with a very high degree of certainty).
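These two indications can be turned into a rough classifier.  The
Python sketch below is only an illustration under the assumptions
stated in the comments; the name `classify` and its return labels are
invented for this example.

```python
def classify(raw: bytes) -> str:
    """Guess what a body labelled latin-1 really contains."""
    if all(b < 0x80 for b in raw):
        return "ascii"        # nothing to decide
    try:
        raw.decode("utf-8")   # strict check of the UTF-8 rules
        return "utf-8"
    except UnicodeDecodeError:
        pass
    # Bytes in the reserved C1 range 0x80-0x9F suggest windows-1252.
    if any(0x80 <= b <= 0x9F for b in raw):
        return "cp1252"
    return "latin-1"
```

The UTF-8 check comes first because UTF-8 continuation bytes
(0x80-0xBF) overlap the C1 range, so valid UTF-8 would otherwise be
reported as cp1252.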

> (And as I said, those messages are broken; only a part (the quoted
> text) is "wrong".)

Well, that could mean that the sender has seen it in this form and
wants you to see it this way.  Or it can mean that the sender was too
lazy to correct it.

If you wanted to fix it in Emacs, you'd have to treat each quoted
block separately.  The algorithm that I gave would still be
reasonable.
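As a rough sketch of what "each quoted block separately" could look
like (in Python, not Emacs Lisp; all names are hypothetical, and
grouping by the number of leading ">" marks is an assumption about
how blocks would be delimited):

```python
from itertools import groupby


def decode_guess(raw: bytes) -> str:
    """Try UTF-8, then cp1252; latin-1 accepts anything that is left."""
    for enc in ("utf-8", "cp1252"):
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    return raw.decode("latin-1")


def quote_depth(line: bytes) -> int:
    """Count the leading '>' marks of a line, skipping blanks between them."""
    depth = 0
    for b in line:
        if b == ord(">"):
            depth += 1
        elif b in b" \t":
            continue
        else:
            break
    return depth


def decode_by_block(body: bytes) -> str:
    """Decode each run of lines with the same quote depth on its own."""
    lines = body.splitlines(keepends=True)
    return "".join(
        decode_guess(b"".join(block))
        for _, block in groupby(lines, key=quote_depth)
    )
```

A body whose quoted part is UTF-8 and whose new text is latin-1 then
decodes correctly in both parts, whereas decoding the whole body in
one go garbles the quoted part.  Text that was decoded wrongly and
then re-encoded by the sender is not repaired by this.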


benny

Thread overview: 31+ messages
2004-12-28 13:21 dashes and non-breaking spaces Paul Pogonyshev
2004-12-29 21:08 ` Stefan Monnier
2005-01-02 15:24   ` Paul Pogonyshev
2005-01-04 20:53   ` Karl Eichwalder
2005-01-05  5:46     ` Juri Linkov
2005-01-06  4:53       ` Richard Stallman
2005-01-12  2:02         ` Juri Linkov
2005-01-12  4:41           ` Miles Bader
2005-01-12  6:39             ` Karl Eichwalder
2005-01-12 20:58               ` Miles Bader
2005-01-12 21:58                 ` Stefan Monnier
2005-01-12 22:07                   ` Miles Bader
2005-01-12 22:30                     ` Stefan Monnier
2005-01-13  9:29                       ` David Kastrup
2005-01-13 10:37                         ` Miles Bader
2005-01-14 11:33                           ` Richard Stallman
2005-01-13 20:29               ` Richard Stallman
2005-01-13 21:11                 ` Karl Eichwalder
2005-01-13 21:24                   ` Stefan Monnier
2005-01-13 21:59                     ` Karl Eichwalder
2005-01-15  0:12                   ` Richard Stallman
2005-01-15  6:47                     ` Karl Eichwalder
2005-01-15 14:05                       ` Benjamin Riefenstahl
2005-01-15 15:36                         ` Karl Eichwalder
2005-01-15 17:30                           ` Benjamin Riefenstahl
2005-01-13 21:39                 ` Paul Pogonyshev
2005-01-15  0:12                   ` Richard Stallman
2005-01-07 13:59   ` Kim F. Storm
2005-01-07 23:04     ` Richard Stallman
2005-01-09  2:17       ` Kim F. Storm
2005-01-03  4:31 ` Richard Stallman
