unofficial mirror of help-gnu-emacs@gnu.org
* Customizing coding priority
@ 2007-01-17  0:09 Sven Bretfeld
  2007-01-17  0:26 ` Peter Dyballa
  2007-01-17  4:16 ` Eli Zaretskii
  0 siblings, 2 replies; 26+ messages in thread
From: Sven Bretfeld @ 2007-01-17  0:09 UTC (permalink / raw)


Dear List

How can I change the coding priority list? With my
language-environment set to utf-8 the command M-x
describe-coding-system (default, current choices) gives:

Priority order for recognizing coding systems when reading files:
  1. mule-utf-8 (alias: utf-8)
  2. iso-latin-1 (alias: iso-8859-1 latin-1)
  3. iso-2022-jp (alias: junet)
  4. iso-2022-7bit 
  5. iso-2022-7bit-lock (alias: iso-2022-int-1)
  6. iso-2022-8bit-ss2 
  7. emacs-mule 
  8. raw-text 
  9. japanese-shift-jis (alias: shift_jis sjis)
  10. chinese-big5 (alias: big5 cn-big5)
  11. no-conversion (alias: binary)

The problem is that vm doesn't recognize mails written in iso-8859-15
correctly and therefore encodes my replies in iso-2022-jp instead.

I think that the problem could be solved if I tell Emacs to use
iso-8859-15 as 3rd priority for recognizing the coding system of
files/mails. 

I have tried adding:

'(set-coding-priority-list '((utf-8) (iso-8859-1) (iso-8859-15)))

to the custom-set-variables. But that doesn't change anything.

Can anybody help me?

Thanks very much
Sven

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Customizing coding priority
  2007-01-17  0:09 Customizing coding priority Sven Bretfeld
@ 2007-01-17  0:26 ` Peter Dyballa
  2007-01-17  4:16 ` Eli Zaretskii
  1 sibling, 0 replies; 26+ messages in thread
From: Peter Dyballa @ 2007-01-17  0:26 UTC (permalink / raw)
  Cc: help-gnu-emacs


On 17.01.2007 at 01:09, Sven Bretfeld wrote:

> How can I change the coding priority list?

prefer-coding-system ?

It can be used more often than once ...

--
With peaceful greetings

   Pete

Nothing beats an electric toilet brush!


* Re: Customizing coding priority
  2007-01-17  0:09 Customizing coding priority Sven Bretfeld
  2007-01-17  0:26 ` Peter Dyballa
@ 2007-01-17  4:16 ` Eli Zaretskii
  2007-01-17  7:59   ` Sven Bretfeld
  1 sibling, 1 reply; 26+ messages in thread
From: Eli Zaretskii @ 2007-01-17  4:16 UTC (permalink / raw)


> From: Sven Bretfeld <sven.bretfeld@relwi.unibe.ch>
> Date: Wed, 17 Jan 2007 01:09:37 +0100
> 
> The problem is that vm doesn't recognize Mails written in iso-8859-15
> correctly

Does that mean these mails are displayed incorrectly?

> and therefore encodes my replies in iso-2022-jp instead. 

I don't use VM, but can't you use "C-x RET f" to force the encoding of
your reply before you send it?


* Re: Customizing coding priority
  2007-01-17  4:16 ` Eli Zaretskii
@ 2007-01-17  7:59   ` Sven Bretfeld
  2007-01-17 18:35     ` Eli Zaretskii
  2007-01-17 20:38     ` Sven Bretfeld
  0 siblings, 2 replies; 26+ messages in thread
From: Sven Bretfeld @ 2007-01-17  7:59 UTC (permalink / raw)
  Cc: help-gnu-emacs

 > > The problem is that vm doesn't recognize Mails written in iso-8859-15
 > > correctly
 > 
 > Does that mean these mails are displayed incorrectly?

The crazy thing is that they are displayed correctly when I read
them. They are still correct when I write my reply. But the recipient
of the reply (as well as my FCC-Box) receives something like this sample:

 > lieber sven
 > hast du letzten donnerstag einen neuen text verteilt? wenn ja, 
 > k^[,bv^[(Bnntest du mir den vielleicht mailen? wenn nein, k^[,bv^[(Bnntest du mir 
 > mitteilen, bis zu welchem satz ihr gekommen seid?
 > vielen dank und sch^[,bv^[(Bne gr^[,b|^[(Bsse
 > maria
^[,Adv|^[(B

The last line outside the citation contains just three German umlauts
(I use this example for testing). The broken characters in the
citation are umlauts too. The headers of these replies say text/plain;
iso-2022-jp was the Content-type. I've tried C-x RET iso-8859-15 as
well as Peter's suggestion M-x prefer-coding-system iso-8859-15. But
the problem is still exactly the same. What else can it be?
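
The debris in the sample above looks like ISO 2022 designation sequences rather than random damage. The following is a minimal Python sketch of how such text could be read back, assuming that `^[` stands for the ESC character, that `ESC , A` and `ESC , b` switch to the right-hand halves of ISO 8859-1 and ISO 8859-15, and that `ESC ( B` switches back to ASCII. The names `decode_7bit` and `DESIGNATIONS` are made up for this illustration and are not part of Emacs or VM.

```python
ESC = "\x1b"

# Designation sequences seen in the sample. Reading "ESC , F" as
# "designate a 96-character set to G0" is an assumption; the final
# bytes A and b are taken to mean ISO 8859-1 and ISO 8859-15.
DESIGNATIONS = {
    ESC + ",A": "iso8859-1",
    ESC + ",b": "iso8859-15",
    ESC + "(B": None,  # back to plain ASCII
}


def decode_7bit(text):
    """Decode 7-bit text that switches charsets with escape sequences."""
    out = []
    charset = None  # None means plain ASCII
    i = 0
    while i < len(text):
        seq = text[i:i + 3]
        if seq in DESIGNATIONS:
            charset = DESIGNATIONS[seq]
            i += 3
            continue
        ch = text[i]
        if charset is None:
            out.append(ch)
        else:
            # Set the high bit to get the 8-bit position in the charset.
            out.append(bytes([ord(ch) | 0x80]).decode(charset))
        i += 1
    return "".join(out)


print(decode_7bit("k\x1b,bv\x1b(Bnntest"))  # könntest
print(decode_7bit("\x1b,Adv|\x1b(B"))       # äöü
```

Under these assumptions the quoted umlauts are tagged as ISO 8859-15 and the typed ones as ISO 8859-1, even though both decode to the same letters.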

In the meantime I found out that I can send a correctly encoded reply
to the message cited above if I don't add any umlauts in my own
answer. The citation stays intact then. This is not really useful in
practice, but maybe it could help to trace the problem.

Thanks for your help
Sven


* Re: Customizing coding priority
  2007-01-17  7:59   ` Sven Bretfeld
@ 2007-01-17 18:35     ` Eli Zaretskii
  2007-01-17 22:44       ` Sven Bretfeld
  2007-01-17 20:38     ` Sven Bretfeld
  1 sibling, 1 reply; 26+ messages in thread
From: Eli Zaretskii @ 2007-01-17 18:35 UTC (permalink / raw)


> From: Sven Bretfeld <sven.bretfeld@relwi.unibe.ch>
> Date: Wed, 17 Jan 2007 08:59:44 +0100
> Cc: help-gnu-emacs@gnu.org
> 
> The last line outside the citation contains just three German umlauts
> (I use this example for testing). The broken characters in the
> citation are umlauts too. The headers of these replies say text/plain;
> iso-2022-jp was the Content-type. I've tried C-x RET iso-8859-15 as
> well as Peter's suggestion M-x prefer-coding-system iso-8859-15. But
> the problem is still exactly the same. What else can it be?

How did you insert those umlauts, exactly?  And what does Emacs
display if you go to one of those characters and type "C-u C-x ="?


* Re: Customizing coding priority
  2007-01-17  7:59   ` Sven Bretfeld
  2007-01-17 18:35     ` Eli Zaretskii
@ 2007-01-17 20:38     ` Sven Bretfeld
  1 sibling, 0 replies; 26+ messages in thread
From: Sven Bretfeld @ 2007-01-17 20:38 UTC (permalink / raw)
  Cc: help-gnu-emacs

 
>> The last line outside the citation contains just three German umlauts
>> (I use this example for testing). The broken characters in the
>> citation are umlauts too. The headers of these replies say text/plain;
>> iso-2022-jp was the Content-type. I've tried C-x RET iso-8859-15 as
>> well as Peter's suggestion M-x prefer-coding-system iso-8859-15. But
>> the problem is still exactly the same. What else can it be?

> How did you insert those umlauts, exactly? And what does Emacs
> display if you go to one of those characters and type "C-u C-x ="?

I've already tried this today. It's quite interesting. The umlauts in
the quotations are displayed in a font different from the rest
(somewhat smaller and bolder). They are said to be part of the
iso-8859-15 charset when I type C-u C-x = (just as expected). All
other characters are plain (ASCII (ISO646 IRV)). When I add my own
text after the quotation all umlauts I type belong to
latin-iso8859-1. When I send this text it is garbled again. Then I
found out that, if I yank umlauts from the quotation to my own text
(instead of typing them myself), the sent message stays intact and is
displayed fine after receiving it again, albeit the Content-type
header says "charset unknown".

To my understanding this means that Emacs is unable to translate
umlauts belonging to iso-8859-15 to the default charset I use for
umlauts when writing replies, i.e. iso-8859-1. Therefore the umlauts
in quotations are kept untranslated in the coding system of the
original sender, i.e. iso-8859-15. This mixture of iso-8859-15 in
quotations and iso-8859-1 in my part seems to disturb Emacs so that it
encodes the entire umlauts of my reply as iso-2022-jp when sending it.

So I've tried to avoid iso-8859-1 completely in replies and use
umlauts belonging to iso-8859-15 also for my own part of the text. But
this doesn't work for any reason. I've tried to change the coding
system and the input method of the reply buffer manually to
iso-8859-15 (using C-x RET f and C-x RET C-\) but umlauts I type
always come as iso-8859-1.

By the way, I also tried to start Emacs with an almost empty .emacs
file. The problem stays. So it seems not to depend on any user
specific configuration.

It's really strange. Can it be a simple bug in vm? But it seems to
work for other people.

Thanks for your help

Sven

(Sorry that I have destroyed the threading. I received Eli's last
message on another account while experimenting with my problem. I
forgot the keep option in fetchmailrc and had to copy and paste the
message.)


* Re: Customizing coding priority
  2007-01-17 18:35     ` Eli Zaretskii
@ 2007-01-17 22:44       ` Sven Bretfeld
  2007-01-18  4:20         ` Eli Zaretskii
  0 siblings, 1 reply; 26+ messages in thread
From: Sven Bretfeld @ 2007-01-17 22:44 UTC (permalink / raw)
  Cc: help-gnu-emacs

Eli Zaretskii writes:
 > > The last line outside the citation contains just three German umlauts
 > > (I use this example for testing). The broken characters in the
 > > citation are umlauts too. The headers of these replies say text/plain;
 > > iso-2022-jp was the Content-type. I've tried C-x RET iso-8859-15 as
 > > well as Peter's suggestion M-x prefer-coding-system iso-8859-15. But
 > > the problem is still exactly the same. What else can it be?
 > 
 > How did you insert those umlauts, exactly?  And what does Emacs
 > display if you go to one of those characters and type "C-u C-x ="?

I've already tried this today. It's quite interesting. The umlauts in
the quotations are displayed in a font different from the rest
(somewhat smaller and bolder). These are said to be part of the
iso-8859-15 charset when I type C-u C-x = (just as expected). All
other characters are plain ascii (ASCII (ISO646 IRV)). It is different
outside of quotations. When I add my own text after a quotation all
umlauts I type belong to latin-iso8859-1. When I send this text it is
garbled again. After a while I found out that, if I yank umlauts
from the quotation to my own text (instead of typing them myself) and
send this message to myself, it stays intact and is displayed fine,
albeit the Content-type header says "charset unknown".

To my understanding this means that Emacs is unable to translate
umlauts belonging to iso-8859-15 to the default charset I use for
umlauts when writing replies, i.e. iso-8859-1. Therefore the umlauts
in quotations are kept untranslated in the coding system of the
original sender, i.e. iso-8859-15. This mixture of iso-8859-15 in
quotations and iso-8859-1 in the text typed by myself seems to disturb
Emacs so that it encodes the entire umlauts of my reply as iso-2022-jp
when sending it.

I concluded that I have to avoid iso-8859-1 completely in replies and
use umlauts belonging to iso-8859-15 also for my own part of the
text. But this doesn't work for some reason. I've tried to change the
coding system and the input method of the reply buffer manually to
iso-8859-15 (using C-x RET f and C-x RET C-\) but the umlauts I type
always come as iso-8859-1.

By the way, I've also tried to start Emacs with an almost empty .emacs
file. The problem remains. So it seems not to depend on any
user-specific configuration.

It's really strange. Can it be a simple bug in vm? But it seems to
work for other people.

Thanks for your help

Sven


* Re: Customizing coding priority
  2007-01-17 22:44       ` Sven Bretfeld
@ 2007-01-18  4:20         ` Eli Zaretskii
  2007-01-18  4:58           ` Tom Rauchenwald
                             ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Eli Zaretskii @ 2007-01-18  4:20 UTC (permalink / raw)


> From: Sven Bretfeld <sven.bretfeld@relwi.unibe.ch>
> Date: Wed, 17 Jan 2007 23:44:42 +0100
> Cc: help-gnu-emacs@gnu.org
>  > 
>  > How did you insert those umlauts, exactly?  And what does Emacs
>  > display if you go to one of those characters and type "C-u C-x ="?
> 
> I've already tried this today. It's quite interesting. The umlauts in
> the quotations are displayed in a font different from the rest
> (somewhat smaller and bolder). These are said to be part of the
> iso-8859-15 charset when I type C-u C-x = (just as expected). All
> other characters are plain ascii (ASCII (ISO646 IRV)). It is different
> outside of quotations. When I add my own text after a quotation all
> umlauts I type belong to latin-iso8859-1.

You didn't answer my first question: how were these umlauts produced?
Did you copy them from another text, perhaps?  And how does the way you
produced those umlauts differ from the way you type the
latin-iso8859-1 characters after the quotation?

Anyway, this mixing of latin-iso8859-1 and iso-8859-15 _is_, most
probably, your problem.  Assuming you use Emacs 21.x (is that right?),
Emacs is trying to do what it cannot do in v21.x: encode 8859-1 and
8859-15 characters in the same message.  That is why you get
iso-2022-jp encoding.

> To my understanding this means that Emacs is unable to translate
> umlauts belonging to iso-8859-15 to the default charset I use for
> umlauts when writing replies, i.e. iso-8859-1.

In v21.x, it cannot.

> Therefore the umlauts
> in quotations are kept untranslated in the coding system of the
> original sender, i.e. iso-8859-15. This mixture of iso-8859-15 in
> quotations and iso-8859-1 in the text typed by myself seems to disturb
> Emacs so that it encodes the entire umlauts of my reply as iso-2022-jp
> when sending it.

Correct.

> I concluded that I have to avoid iso-8859-1 completely in replies and
> use umlauts belonging to iso-8859-15 also for my own part of the
> text. But this doesn't work for some reason. I've tried to change the
> coding system and the input method of the reply buffer manually to
> iso-8859-15 (using C-x RET f and C-x RET C-\) but the umlauts I type
> always come as iso-8859-1.

C-x RET f does not affect the characters you type.  You should try to
set an input method (with C-u C-\) that produces iso-8859-15
characters, if there is such an input method in Emacs 21.x.


* Re: Customizing coding priority
  2007-01-18  4:20         ` Eli Zaretskii
@ 2007-01-18  4:58           ` Tom Rauchenwald
  2007-01-18 10:02             ` Peter Dyballa
  2007-01-18 16:12           ` Sven Bretfeld
  2007-01-18 17:31           ` Reiner Steib
  2 siblings, 1 reply; 26+ messages in thread
From: Tom Rauchenwald @ 2007-01-18  4:58 UTC (permalink / raw)


Eli Zaretskii <eliz@gnu.org> writes:

> Anyway, this mixing of latin-iso8859-1 and iso-8859-15 _is_, most
> probably, your problem.  Assuming you use Emacs 21.x (is that right?),
> Emacs is trying to do what it cannot do in v21.x: encode 8859-1 and
> 8859-15 characters in the same message.  That is why you get
> iso-2022-jp encoding.

Are you sure about this? Both charsets are basically the same; for the
German-speaking area the only difference I can think of is the
addition of the euro sign. So for a few umlauts it doesn't matter
which charset is used.

Tom


* Re: Customizing coding priority
  2007-01-18  4:58           ` Tom Rauchenwald
@ 2007-01-18 10:02             ` Peter Dyballa
  0 siblings, 0 replies; 26+ messages in thread
From: Peter Dyballa @ 2007-01-18 10:02 UTC (permalink / raw)
  Cc: help-gnu-emacs


On 18.01.2007 at 05:58, Tom Rauchenwald wrote:

> Eli Zaretskii writes:
>
>> Anyway, this mixing of latin-iso8859-1 and iso-8859-15 _is_, most
>> probably, your problem.  Assuming you use Emacs 21.x (is that right?),
>> Emacs is trying to do what it cannot do in v21.x: encode 8859-1 and
>> 8859-15 characters in the same message.  That is why you get
>> iso-2022-jp encoding.
>
> Are you sure about this? Both charsets are basically the same, for the
> german-speaking area the only difference i can think of is the
> addition of the euro-sign. So for a few umlauts it doesn't matter
> which charset is used.
>

The umlauts are the same. The differences are these:

;   oct   dec   hex    UCS2    UTF-8
;=====================================
¤ = 244 = 164 = A4 = U+00A4 =    C2 A4 : CURRENCY SIGN
-----------------------------------------------------------------------------
€ = 244 = 164 = A4 = U+20AC = E2 82 AC : EURO SIGN

¦ = 246 = 166 = A6 = U+00A6 =    C2 A6 : BROKEN BAR
-----------------------------------------------------------------------------
Š = 246 = 166 = A6 = U+0160 =    C5 A0 : LATIN CAPITAL LETTER S WITH CARON

¨ = 250 = 168 = A8 = U+00A8 =    C2 A8 : DIAERESIS
-----------------------------------------------------------------------------
š = 250 = 168 = A8 = U+0161 =    C5 A1 : LATIN SMALL LETTER S WITH CARON

´ = 264 = 180 = B4 = U+00B4 =    C2 B4 : ACUTE ACCENT
-----------------------------------------------------------------------------
Ž = 264 = 180 = B4 = U+017D =    C5 BD : LATIN CAPITAL LETTER Z WITH CARON

¸ = 270 = 184 = B8 = U+00B8 =    C2 B8 : CEDILLA
-----------------------------------------------------------------------------
ž = 270 = 184 = B8 = U+017E =    C5 BE : LATIN SMALL LETTER Z WITH CARON

¼ = 274 = 188 = BC = U+00BC =    C2 BC : VULGAR FRACTION ONE QUARTER
½ = 275 = 189 = BD = U+00BD =    C2 BD : VULGAR FRACTION ONE HALF
¾ = 276 = 190 = BE = U+00BE =    C2 BE : VULGAR FRACTION THREE QUARTERS
-----------------------------------------------------------------------------
Œ = 274 = 188 = BC = U+0152 =    C5 92 : LATIN CAPITAL LIGATURE OE
œ = 275 = 189 = BD = U+0153 =    C5 93 : LATIN SMALL LIGATURE OE
Ÿ = 276 = 190 = BE = U+0178 =    C5 B8 : LATIN CAPITAL LETTER Y WITH DIAERESIS
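
The list of differences can be reproduced mechanically. Here is a small Python sketch, assuming Python's built-in `iso8859-1` and `iso8859-15` codecs as the reference tables; the function name `differing_positions` is made up for this illustration.

```python
def differing_positions(enc_a="iso8859-1", enc_b="iso8859-15"):
    """Return (byte, char_in_a, char_in_b) for each 8-bit position that differs."""
    diffs = []
    for byte in range(0x80, 0x100):
        raw = bytes([byte])
        a, b = raw.decode(enc_a), raw.decode(enc_b)
        if a != b:
            diffs.append((byte, a, b))
    return diffs


for byte, a, b in differing_positions():
    print(f"0x{byte:02X}: {a} (U+{ord(a):04X}) -> {b} (U+{ord(b):04X})")

# The umlauts sit at the same byte positions in both charsets.
print("äöü".encode("iso8859-1") == "äöü".encode("iso8859-15"))
```

The loop reports the same eight positions as the table, and the final comparison shows that ä, ö and ü encode to identical bytes in both charsets.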



--
With peaceful greetings

   Pete

A lot of us are working harder than we want, at things we don't like  
to do. Why? ...In order to afford the sort of existence we don't care  
to live.
                                    -- Bradford Angier


* Re: Customizing coding priority
  2007-01-18  4:20         ` Eli Zaretskii
  2007-01-18  4:58           ` Tom Rauchenwald
@ 2007-01-18 16:12           ` Sven Bretfeld
  2007-01-18 16:32             ` Peter Dyballa
  2007-01-18 21:36             ` Eli Zaretskii
  2007-01-18 17:31           ` Reiner Steib
  2 siblings, 2 replies; 26+ messages in thread
From: Sven Bretfeld @ 2007-01-18 16:12 UTC (permalink / raw)
  Cc: help-gnu-emacs


Hi Eli, hi list

Eli Zaretskii writes:
 > You didn't answer my first question: how were these umlauts produced?
 > Did you copy them from another text, perhaps?  And how does the way you
 > produced those umlauts differ from the way you type the
 > latin-iso8859-1 characters after the quotation?

I hope I get your question right. The umlauts that are encoded in
iso-8859-15 appear in the mail when I execute the function
vm-reply-include-text. It is bound to the R key in a
Mailbox-Summary buffer in vm. It sets up a reply to the email at
point, citing the content of the original. The cited text keeps
umlauts encoded in iso-8859-15. Thus, it is an automatic function
not under direct control of the user. But I think this must be the
place where a possible solution has to start. Namely, I need to
tell Emacs to translate iso-8859-15 encoded characters to iso-8859-1
when executing vm-reply-include-text. Regrettably, I have no idea how
to do that.

Anyway, what Peter and Tom remark sounds strange. There must be some
difference in the umlauts of the two coding systems, at least for
Emacs. Because the iso-8859-15 umlauts of the cited text always look
different from the ones I type in iso-8859-1. The former are displayed
in another font. Maybe it has to do with my general KDE-settings? I
live in Switzerland and we don't need the Euro character at
all. Possibly iso-8859-15 isn't installed on the system (anyway, I use
utf-8 as default for all KDE programs and for Emacs). Does Emacs
inherit parts of its own coding configuration from KDE? Just because I
wonder why iso-8859-15 does not appear in the list when I execute
describe-coding-system. Also, when I change the coding system using
M-x prefer-coding-system iso-8859-15 and type äöü, these characters
are described as belonging to iso-8859-1 when I check them with C-u
C-x =. Maybe iso-8859-15 is not supported fully with my present
Emacs configuration? Can this be the problem?

Sorry, guys. I really like to have this problem solved. I like vm as my
Email client and I don't want to change. I experimented with mutt
yesterday but I didn't fall in love with it as much as I did with
vm. Emacs rules!

Eli, do you think the problem wouldn't exist in Emacs 22? I use 21
from the standard package of Debian Etch.

Thank you for helping me

Sven



* Re: Customizing coding priority
  2007-01-18 16:12           ` Sven Bretfeld
@ 2007-01-18 16:32             ` Peter Dyballa
  2007-01-18 18:27               ` Reiner Steib
  2007-01-18 21:36             ` Eli Zaretskii
  1 sibling, 1 reply; 26+ messages in thread
From: Peter Dyballa @ 2007-01-18 16:32 UTC (permalink / raw)
  Cc: help-gnu-emacs


On 18.01.2007 at 17:12, Sven Bretfeld wrote:

> Anyway, what Peter and Tom remark sounds strange. There must be some
> difference in the umlauts of the two coding systems, at least for
> Emacs.

Yes, in GNU Emacs there is. The characters don't stand alone; each
carries an encoding (charset) attribute. Unicode Emacs 23.0.0 finally
does away with this.


'locale -a' should display which encodings your system knows. But
this should not have any influence on GNU Emacs: it has its own ELisp
files to handle these encodings. And internally all files are held in
a common encoding, which is then presented to the user in the
encoding used in this buffer.


Probably vm has some restrictions. GNU Emacs 22 won't solve the  
problem, GNU Emacs 23 might! (I simply gave up and switched to a  
different MUA.)

--
With peaceful greetings

   Pete

UNIX is user friendly, it's just picky about who its friends are.


* Re: Customizing coding priority
  2007-01-18  4:20         ` Eli Zaretskii
  2007-01-18  4:58           ` Tom Rauchenwald
  2007-01-18 16:12           ` Sven Bretfeld
@ 2007-01-18 17:31           ` Reiner Steib
  2007-01-18 18:15             ` Peter Dyballa
  2 siblings, 1 reply; 26+ messages in thread
From: Reiner Steib @ 2007-01-18 17:31 UTC (permalink / raw)


On Thu, Jan 18 2007, Eli Zaretskii wrote:

>> From: Sven Bretfeld <sven.bretfeld@relwi.unibe.ch>
[...]
>> To my understanding this means that Emacs is unable to translate
>> umlauts belonging to iso-8859-15 to the default charset I use for
>> umlauts when writing replies, i.e. iso-8859-1.
>
> In v21.x, it cannot.

Emacs 21.3 and 21.4 both include `unify-8859-on-encoding-mode' and it
is turned *on* by default.  (I didn't follow the complete thread, so I
might have missed some detail.)

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/


* Re: Customizing coding priority
  2007-01-18 17:31           ` Reiner Steib
@ 2007-01-18 18:15             ` Peter Dyballa
  2007-01-18 18:46               ` Reiner Steib
  0 siblings, 1 reply; 26+ messages in thread
From: Peter Dyballa @ 2007-01-18 18:15 UTC (permalink / raw)
  Cc: help-gnu-emacs


On 18.01.2007 at 18:31, Reiner Steib wrote:

> Emacs 21.3 and 21.4 both include `unify-8859-on-encoding-mode' and it
> is turned *on* by default.  (I didn't follow the complete thread, so I
> might have missed some detail.)

It's also on in my GNU Emacs 22.0.92 – ISO 8859-1 and ISO 8859-15  
have only US ASCII characters in common ...

--
With peaceful greetings

   Pete

Rain is saved up in cloud banks.


* Re: Customizing coding priority
  2007-01-18 16:32             ` Peter Dyballa
@ 2007-01-18 18:27               ` Reiner Steib
  0 siblings, 0 replies; 26+ messages in thread
From: Reiner Steib @ 2007-01-18 18:27 UTC (permalink / raw)


On Thu, Jan 18 2007, Peter Dyballa wrote:

> Am 18.01.2007 um 17:12 schrieb Sven Bretfeld:
>
>> Anyway, what Peter and Tom remark sounds strange. There must be some
>> difference in the umlauts of the two coding systems, at least for
>> Emacs.
>
> Yes, there is in GNU Emacs. The characters don't stand alone, they have some
> encoding attribute. Unicode Emacs 23.0.0 cancels this,  finally.

Your usage of "GNU Emacs" and "Unicode Emacs" here is wrong or at
least misleading, IMHO.  See also (info "(efaq)Difference between
Emacs and XEmacs").

> Probably vm has some restrictions. GNU Emacs 22 won't solve the problem, 

NACK.  `unify-8859-on-encoding-mode' and `unify-8859-on-decoding-mode'
are sufficient to solve this problem WRT iso-8859-{1,15} even in
Emacs 21.[34].

> GNU Emacs 23 might! (I simply gave up and switched to a different
> MUA.)

Even with Emacs 21.4, there are no problems with charsets when using
Gnus.  `rs-ucs-coding-system.el'[1] might be interesting if you need
support for windows-12xx and/or additional iso-8859 charsets (-6, -10,
-11, -13, -16); see the table in [1].

Bye, Reiner.

[1] <http://theotp1.physik.uni-ulm.de/~ste/comp/emacs/misc/rs-ucs-coding-system.el>
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/


* Re: Customizing coding priority
  2007-01-18 18:15             ` Peter Dyballa
@ 2007-01-18 18:46               ` Reiner Steib
  2007-01-18 22:14                 ` Sven Bretfeld
                                   ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Reiner Steib @ 2007-01-18 18:46 UTC (permalink / raw)


On Thu, Jan 18 2007, Peter Dyballa wrote:

> Am 18.01.2007 um 18:31 schrieb Reiner Steib:
>
>> Emacs 21.3 and 21.4 both include `unify-8859-on-encoding-mode' and it
>> is turned *on* by default.  (I didn't follow the complete thread, so I
>> might have missed some detail.)
>
> It's also on in my GNU Emacs 22.0.92 – 

Sure.  I didn't say that it isn't.

> ISO 8859-1 and ISO 8859-15 have only US ASCII characters in common

Large parts of the non-ASCII range are identical, cf. iso_8859-1(7)
and iso_8859-15(7).  I guess you meant something different, because in
article <F7CEE765-EBF8-41F1-AA96-ED03DB2DD0E7@Web.DE> you listed the
difference (exactly eight positions).

Bye, Reiner.

P.S.: Your X-Image-Url gives a 404 error.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/


* Re: Customizing coding priority
  2007-01-18 16:12           ` Sven Bretfeld
  2007-01-18 16:32             ` Peter Dyballa
@ 2007-01-18 21:36             ` Eli Zaretskii
  1 sibling, 0 replies; 26+ messages in thread
From: Eli Zaretskii @ 2007-01-18 21:36 UTC (permalink / raw)


> From: Sven Bretfeld <sven.bretfeld@relwi.unibe.ch>
> Date: Thu, 18 Jan 2007 17:12:48 +0100
> Cc: help-gnu-emacs@gnu.org
> 
> Eli Zaretskii writes:
>  > You didn't answer my first question: how were these umlauts produced?
>  > Did you copy them from another text, perhaps?  And how does the way you
>  > produced those umlauts differ from the way you type the
>  > latin-iso8859-1 characters after the quotation?
> 
> I hope I get your question right. The umlauts that are encoded in
> iso-8859-15 appear in the mail by executing the function
> vm-reply-include-text. It is bound to the R-key when in an
> Mailbox-Summary buffer in vm. It sets up an answer to the email under
> the point citing the content of the original. The cited text keeps
> umlauts encoded in iso-8859-15.

Yes, this answers my question.  I understand that these characters
come from the mail to which you reply.

> Anyway, what Peter and Tom remark sounds strange. There must be some
> difference in the umlauts of the two coding systems, at least for
> Emacs. Because the iso-8859-15 umlauts of the cited text always look
> different from the ones I type in iso-8859-1. The former are displayed
> in another font.

Do you have something related in your ~/.emacs init file?  Does this
problem go away if you invoke Emacs with "emacs -q --no-site-file"?
(If doing so prevents you from using VM, then leave only the
VM-related customizations on .emacs and comment out everything else.)

> Eli, do you think the problem wouldn't exist in Emacs 22?

I don't know.  I need to understand your problem first.  But if it's
easy for you to try Emacs 22, I recommend doing so.


* Re: Customizing coding priority
  2007-01-18 18:46               ` Reiner Steib
@ 2007-01-18 22:14                 ` Sven Bretfeld
  2007-01-18 22:20                   ` Sven Bretfeld
  2007-01-19  0:24                 ` Peter Dyballa
       [not found]                 ` <mailman.3276.1169158455.2155.help-gnu-emacs@gnu.org>
  2 siblings, 1 reply; 26+ messages in thread
From: Sven Bretfeld @ 2007-01-18 22:14 UTC (permalink / raw)


This is what Emacs tells me when I hit C-u C-x = with the point above
an 


* Re: Customizing coding priority
  2007-01-18 22:14                 ` Sven Bretfeld
@ 2007-01-18 22:20                   ` Sven Bretfeld
  2007-01-19 10:23                     ` Eli Zaretskii
  0 siblings, 1 reply; 26+ messages in thread
From: Sven Bretfeld @ 2007-01-18 22:20 UTC (permalink / raw)
  Cc: help-gnu-emacs


Oh, I'm sorry. In my last posting there was an iso-8859-15 encoded ö
in the cited output of C-u C-x =, that of course produced the problem
we are talking about. Now you can see what happens. Here
is the "cleaned" text:

Sven Bretfeld writes:
 > This is what Emacs tells me when I hit C-u C-x = with the point above
 > an ö encoded with iso-8859-15:
 > 
 >   character: ö (07566, 3958, 0xf76)
 >     charset: latin-iso8859-15
 > 	     (Right-Hand Part of Latin Alphabet 9 (ISO/IEC 8859-15): ISO-IR-203)
 >  code point: 118
 >      syntax: word
 >    category: l:Latin  
 > buffer code: 0x8E 0xF6
 >   file code: not encodable by coding system no-conversion
 >        font: -Misc-Fixed-Medium-R-Normal--20-200-75-75-C-100-ISO8859-15
 > 
 > Here is the same for an ö encoded with iso-8859-1:
 > 
 >   character: ö (04366, 2294, 0x8f6)
 >     charset: latin-iso8859-1
 > 	     (Right-Hand Part of Latin Alphabet 1 (ISO/IEC 8859-1): ISO-IR-100)
 >  code point: 118
 >      syntax: word
 >    category: l:Latin  
 > buffer code: 0x81 0xF6
 >   file code: 0x81 0xF6 (encoded by coding system raw-text)
 >        font: -Adobe-Courier-Medium-R-Normal--24-240-75-75-M-150-ISO8859-1
 > 
 > I cannot make much of it. But it doesn't look the same to me. Maybe
 > somebody can see any hint to the problem here.
 > 
 > I have inserted 
 > 
 >  (require 'ucs-tables) 
 >  (unify-8859-on-encoding-mode 1)
 > 
 > in my .emacs file. But it didn't solve the problem. Maybe there is a
 > mistake or a shortcoming in the vm package. What I found is a piece of
 > code in the file /usr/share/emacs/site-lisp/vm/vm-vars.el that looks
 > relevant to me, since it seems not to include a translation rule for
 > iso-8859-15 at all:
 > 
 > (defvar vm-mime-mule-charset-to-charset-alist
 >   '(
 >     (latin-iso8859-1    "iso-8859-1")
 >     (latin-iso8859-2    "iso-8859-2")
 >     (latin-iso8859-3    "iso-8859-3")
 >     (latin-iso8859-4    "iso-8859-4")
 >     (cyrillic-iso8859-5 "iso-8859-5")
 >     (arabic-iso8859-6   "iso-8859-6")
 >     (greek-iso8859-7    "iso-8859-7")
 >     (hebrew-iso8859-8   "iso-8859-8")
 >     (latin-iso8859-9    "iso-8859-9")
 >     (japanese-jisx0208  "iso-2022-jp")
 >     (korean-ksc5601     "iso-2022-kr")
 >     (chinese-gb2312     "iso-2022-jp")
 >     (sisheng            "iso-2022-jp")
 >     (thai-tis620        "iso-2022-jp")
 >    )
 >   "Alist that maps MULE character sets to matching MIME character sets.")
 > 
 > I've tried adding (latin-iso8859-15 "iso-8859-15") to the list, but
 > that didn't help.
 > 
 > Thanks again
 > 
 > Sven
 > 
 > 
 > 
 > _______________________________________________
 > help-gnu-emacs mailing list
 > help-gnu-emacs@gnu.org
 > http://lists.gnu.org/mailman/listinfo/help-gnu-emacs

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Customizing coding priority
  2007-01-18 18:46               ` Reiner Steib
  2007-01-18 22:14                 ` Sven Bretfeld
@ 2007-01-19  0:24                 ` Peter Dyballa
  2007-01-19  9:37                   ` Reiner Steib
  2007-01-19 10:40                   ` Eli Zaretskii
       [not found]                 ` <mailman.3276.1169158455.2155.help-gnu-emacs@gnu.org>
  2 siblings, 2 replies; 26+ messages in thread
From: Peter Dyballa @ 2007-01-19  0:24 UTC (permalink / raw)
  Cc: help-gnu-emacs


Am 18.01.2007 um 19:46 schrieb Reiner Steib:

>> ISO 8859-1 and ISO 8859-15 have only US ASCII characters in common
>
> Large parts of the non-ASCII range are identical, cf. iso_8859-1(7)
> and iso_8859-15(7).  I guess you meant something different, because in
> article <F7CEE765-EBF8-41F1-AA96-ED03DB2DD0E7@Web.DE> you listed the
> difference (exactly eight positions).

Only the 7 bit US ASCII characters are equal. In the 8 bit area  
compare-windows finds every 8 bit character (for example ä and ä, or  
ö and ö) different ...

--
Mit friedvollen Grüßen

   Pete

   Basic, n.:
A programming language.  Related to certain social diseases in
that those who have it will not admit it in polite company.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Customizing coding priority
  2007-01-19  0:24                 ` Peter Dyballa
@ 2007-01-19  9:37                   ` Reiner Steib
  2007-01-19 10:40                   ` Eli Zaretskii
  1 sibling, 0 replies; 26+ messages in thread
From: Reiner Steib @ 2007-01-19  9:37 UTC (permalink / raw)


On Fri, Jan 19 2007, Peter Dyballa wrote:

> Only the 7 bit US ASCII characters are equal. In the 8 bit area
> compare-windows finds every 8 bit character (for example ä and ä, or
> ö and ö) different ...

Sorry, I didn't realize that you refer to Emacs buffers here.  I was
talking about the definition of the two character sets.
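
For the definitions themselves, the differing positions can be listed with
a few lines of Lisp. This is only a sketch: it assumes Emacs 23 or later,
where `unibyte-string' exists and both coding systems decode to Unicode, so
it says nothing about how an older Emacs represents the characters in a
buffer.

```elisp
;; Sketch: collect the byte values in 0xA0..0xFF that decode to different
;; characters under ISO 8859-1 and ISO 8859-15.
;; Assumes Emacs 23 or later (`unibyte-string', Unicode-based buffers).
(let (diffs)
  (dotimes (i 96)
    (let* ((byte (+ #xA0 i))
           (s (unibyte-string byte))
           (l1 (decode-coding-string s 'iso-8859-1))
           (l9 (decode-coding-string s 'iso-8859-15)))
      ;; Record only the positions where the two decodings disagree.
      (unless (string= l1 l9)
        (push (format "0x%02X: %s -> %s" byte l1 l9) diffs))))
  (nreverse diffs))
```

Evaluated in *scratch*, this should report only the eight positions
mentioned above (0xA4, 0xA6, 0xA8, 0xB4, 0xB8, 0xBC, 0xBD and 0xBE); all
other 8-bit positions decode to the same character in both sets.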

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Customizing coding priority
  2007-01-18 22:20                   ` Sven Bretfeld
@ 2007-01-19 10:23                     ` Eli Zaretskii
  0 siblings, 0 replies; 26+ messages in thread
From: Eli Zaretskii @ 2007-01-19 10:23 UTC (permalink / raw)


> From: Sven Bretfeld <sven.bretfeld@relwi.unibe.ch>
> Date: Thu, 18 Jan 2007 23:20:47 +0100
> Cc: help-gnu-emacs@gnu.org
> 
>   character: ö (04366, 2294, 0x8f6)
>     charset: latin-iso8859-1
> 	     (Right-Hand Part of Latin Alphabet 1 (ISO/IEC 8859-1): ISO-IR-100)
>  code point: 118
>      syntax: word
>    category: l:Latin  
> buffer code: 0x81 0xF6
>   file code: 0x81 0xF6 (encoded by coding system raw-text)
                                                   ^^^^^^^^
This ``raw-text'' thingy might be a sign of the problem, as well as
this:

> buffer code: 0x8E 0xF6
>   file code: not encodable by coding system no-conversion
                                              ^^^^^^^^^^^^^
It's not normal for an email buffer to use any of these two
``encodings''.  What does Emacs tell you if you type
"M-: buffer-file-coding-system RET" in the buffer where you compose
such problematic email messages, the ones that mix iso-8859-1 and
iso-8859-15 characters and end up being encoded in iso-2022-jp?

Anyway, I'm beginning to think that maybe this is some bug in VM.  Did
you consider asking on the VM mailing list?

> in my .emacs file. But it didn't solve the problem. Maybe there is a
> mistake or a shortcoming in the vm-package. What I found is a piece of
> code in the file /usr/share/emacs/site-lisp/vm/vm-vars.el that looks
> relevant to me, since it seems not to include a translation rule for
> iso-8859-15 at all:

That might also be a problem.  But I asked you to try to remove from
your .emacs everything that is not required to use VM per se.  Did you
try that?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Customizing coding priority
  2007-01-19  0:24                 ` Peter Dyballa
  2007-01-19  9:37                   ` Reiner Steib
@ 2007-01-19 10:40                   ` Eli Zaretskii
  1 sibling, 0 replies; 26+ messages in thread
From: Eli Zaretskii @ 2007-01-19 10:40 UTC (permalink / raw)


> From: Peter Dyballa <Peter_Dyballa@Web.DE>
> Date: Fri, 19 Jan 2007 01:24:48 +0100
> Cc: help-gnu-emacs@gnu.org
> 
> Only the 7 bit US ASCII characters are equal. In the 8 bit area  
> compare-windows finds every 8 bit character (for example ä and ä, or  
> ö and ö) different ...

Of course, due to unify-8859-on-encoding, what I see in this mail of
yours are pairs of identical characters, so your point doesn't get
across...

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Customizing coding priority
       [not found]                 ` <mailman.3276.1169158455.2155.help-gnu-emacs@gnu.org>
@ 2007-01-19 14:04                   ` Piet van Oostrum
  2007-01-19 17:10                     ` [SOLVED] " Sven Bretfeld
  0 siblings, 1 reply; 26+ messages in thread
From: Piet van Oostrum @ 2007-01-19 14:04 UTC (permalink / raw)


>>>>> Sven Bretfeld <sven.bretfeld@relwi.unibe.ch> (SB) wrote:

>SB> I have inserted 

>SB>  (require 'ucs-tables) 
>SB>  (unify-8859-on-encoding-mode 1)

>SB> in my .emacs file. But it didn't solve the problem. Maybe there is a
>SB> mistake or a shortcoming in the vm-package. 

The standard VM just doesn't have the code to encode messages with
characters from mixed charsets properly. If it can't find a single charset
in the message that encodes all characters it chooses iso-2022-jp which is
what you see in your message. This is an encoding that switches between
other encodings with escape sequences. But most people outside Japan will
not be able to read it.

There are two solutions for it AFAIK.

One is a small piece of code I wrote when the Euro was introduced, because
I experienced the same problems when using the €-sign in VM. You just put
this in your .emacs file. I am quite sure it won't work with XEmacs (never
tried) but with Emacs 22 it does work. It follows below. It presupposes
that unify-8859-on-encoding-mode was enabled.

The other possibility is to download Robert Widhopf-Fenk's version of VM
from http://www.robf.de/Hacking/elisp. It contains more robust code that
also works on XEmacs. Maybe you can load only vm-mime.el, but I am not sure
if it works with all the other files. I am using his whole package without
problems. 

,----
| (defun vm-sort-coding-systems-predicate (a b)
|   (> (length (memq a vm-coding-system-priorities))
|      (length (memq b vm-coding-system-priorities))))
| 
| (setq vm-coding-system-priorities 
|       '(iso-latin-1 iso-latin-9 mule-utf-8 mac-roman)
| ;      '(iso-latin-1 iso-latin-9 windows-1252 mule-utf-8 mac-roman)
|       mm-coding-system-priorities vm-coding-system-priorities)
| 
| ; The next line is for a noautoload vm.elc. Otherwise use "vm-mime".
| ;(eval-after-load "vm"
| ; The next line is for an autoload (default) vm.elc. Otherwise use "vm".
| (eval-after-load "vm-mime"
| '(defun vm-determine-proper-charset (beg end)
|   (save-excursion
|     (save-restriction
|       (narrow-to-region beg end)
|       (catch 'done
| 	(goto-char (point-min))
| 	(if (or vm-xemacs-mule-p 
| 		(and vm-fsfemacs-mule-p enable-multibyte-characters))
| 	    (let ((charsets (delq 'compound-text (find-coding-systems-region
| 					  (point-min) (point-max)))))
| 	      (cond ((equal charsets '(undecided))
| 		     "us-ascii")
| 		    (t
| 		     (setq charsets 
| 			   (sort charsets 'vm-sort-coding-systems-predicate))
| 		     (while charsets
| 		       (let ((cs (coding-system-get (pop charsets) 'mime-charset)))
| 			 (if cs
| 			     (throw 'done (symbol-name cs))))))))
| 	  (and (re-search-forward "[^\000-\177]" nil t)
| 	       (throw 'done (or vm-mime-8bit-composition-charset
| 				"iso-8859-1")))
| 	  (throw 'done vm-mime-7bit-composition-charset)))))))
| 
| ; This is only necessary for incoming mail in utf-7 or from Windows
| (require 'utf-7)
| (eval-after-load "vm"
|   '(setq vm-mime-mule-charset-to-coding-alist 
|       (cons (quote ("utf-7" utf-7)) 
| 	    ;code below is to accept mail from those morons that send 
| 	    ; latin1 or windows-1252 characters without a charset declaration
| 	    ; (or with charset=ascii)
| 	    (cons (quote ("us-ascii" windows-1252)) 
| 		  (cons (quote ("iso-8859-1" windows-1252)) 
| 			vm-mime-mule-charset-to-coding-alist)))))
`----
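
As a rough illustration of how the sort predicate above orders candidate
coding systems: the sketch below repeats the predicate and the priority
list from the snippet so it can be evaluated on its own. The candidate
list passed to `sort' is made up for the example; in the real code it
comes from `find-coding-systems-region'.

```elisp
;; Same priority list and predicate as in the snippet above, repeated so
;; this can be evaluated standalone (e.g. in *scratch* with C-x C-e).
(defvar vm-coding-system-priorities
  '(iso-latin-1 iso-latin-9 mule-utf-8 mac-roman))

(defun vm-sort-coding-systems-predicate (a b)
  ;; The earlier A appears in the priority list, the longer the tail
  ;; returned by `memq'; unlisted coding systems give nil (length 0).
  (> (length (memq a vm-coding-system-priorities))
     (length (memq b vm-coding-system-priorities))))

;; Hypothetical candidate list, for illustration only.
(sort (list 'mule-utf-8 'emacs-mule 'iso-latin-9)
      'vm-sort-coding-systems-predicate)
;; => (iso-latin-9 mule-utf-8 emacs-mule)
```

Coding systems that are not in the priority list sort last, so they are
only tried when none of the preferred ones can encode the message.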

-- 
Piet van Oostrum <piet@cs.uu.nl>
URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4]
Private email: piet@vanoostrum.org

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [SOLVED] Customizing coding priority
  2007-01-19 14:04                   ` Piet van Oostrum
@ 2007-01-19 17:10                     ` Sven Bretfeld
  2007-01-19 22:45                       ` Lennart Borgman (gmail)
  0 siblings, 1 reply; 26+ messages in thread
From: Sven Bretfeld @ 2007-01-19 17:10 UTC (permalink / raw)
  Cc: help-gnu-emacs

Piet van Oostrum writes:
 > 
 > There are two solutions for it AFAIK.

Wow!!! I almost cannot believe it. It works! Thank you very much, Piet, for
sharing the code. 

I hope that other newcomers to vm will find this thread or at least a
pointer to the vm-version of Robert Widhopf-Fenk. I haven't found it with
Google, and I think I've tried every possible combination of search
terms.

Thanks to all

Sven

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [SOLVED] Customizing coding priority
  2007-01-19 17:10                     ` [SOLVED] " Sven Bretfeld
@ 2007-01-19 22:45                       ` Lennart Borgman (gmail)
  0 siblings, 0 replies; 26+ messages in thread
From: Lennart Borgman (gmail) @ 2007-01-19 22:45 UTC (permalink / raw)
  Cc: Piet van Oostrum, help-gnu-emacs

Sven Bretfeld wrote:
> Piet van Oostrum writes:
>  > 
>  > There are two solutions for it AFAIK.
> 
> Wow!!! I almost cannot believe it. It works! Thank you very much, Piet, for
> sharing the code. 
> 
> I hope that other newcomers to vm will find this thread or at least a
> pointer to the vm-version of Robert Widhopf-Fenk. I haven't found it with
> Google, and I think I've tried every possible combination of search
> terms.
> 
> Thanks to all
> 
> Sven


Maybe tell about it on EmacsWiki?

^ permalink raw reply	[flat|nested] 26+ messages in thread
