* Customizing coding priority
From: Sven Bretfeld @ 2007-01-17  0:09 UTC

Dear List

How can I change the coding priority list? With my language-environment
set to utf-8 the command M-x describe-coding-system (default, current
choices) gives:

Priority order for recognizing coding systems when reading files:
  1. mule-utf-8 (alias: utf-8)
  2. iso-latin-1 (alias: iso-8859-1 latin-1)
  3. iso-2022-jp (alias: junet)
  4. iso-2022-7bit
  5. iso-2022-7bit-lock (alias: iso-2022-int-1)
  6. iso-2022-8bit-ss2
  7. emacs-mule
  8. raw-text
  9. japanese-shift-jis (alias: shift_jis sjis)
  10. chinese-big5 (alias: big5 cn-big5)
  11. no-conversion (alias: binary)

The problem is that vm doesn't recognize mails written in iso-8859-15
correctly and therefore encodes my replies in iso-2022-jp instead. I
think that the problem could be solved if I tell Emacs to use
iso-8859-15 as 3rd priority for recognizing the coding system of
files/mails. I have tried adding:

 '(set-coding-priority-list '((utf-8) (iso-8859-1) (iso-8859-15)))

to the custom-set-variables. But that doesn't change anything.

Can anybody help me?

Thanks very much

Sven
* Re: Customizing coding priority
From: Peter Dyballa @ 2007-01-17  0:26 UTC
Cc: help-gnu-emacs

On 17.01.2007 at 01:09, Sven Bretfeld wrote:

> How can I change the coding priority list?

prefer-coding-system? It can be used more than once ...

--
With peaceful greetings

   Pete

Nothing beats an electric toilet brush!
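A minimal sketch of how that suggestion could look in ~/.emacs, assuming
the stock coding-system names. `prefer-coding-system` puts its argument
at the front of the detection priority list, so the call made last ends
up first. By contrast, `custom-set-variables` only assigns variables, so
an entry there that looks like a function call would not be expected to
change the priority list.

```elisp
;; Sketch for ~/.emacs. Later calls take precedence over earlier ones,
;; so list the coding systems from lowest to highest priority.
(prefer-coding-system 'iso-8859-15)
(prefer-coding-system 'iso-8859-1)
(prefer-coding-system 'utf-8)

;; Check the result with: M-x describe-coding-system RET RET
```

This only influences how Emacs guesses encodings when reading text; it
may not change which charset a mail client such as VM picks for
outgoing messages.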
* Re: Customizing coding priority
From: Eli Zaretskii @ 2007-01-17  4:16 UTC

> From: Sven Bretfeld <sven.bretfeld@relwi.unibe.ch>
> Date: Wed, 17 Jan 2007 01:09:37 +0100
>
> The problem is that vm doesn't recognize Mails written in iso-8859-15
> correctly

Does that mean these mails are displayed incorrectly?

> and therefore encodes my replies in iso-2022-jp instead.

I don't use VM, but can't you use "C-x RET f" to force the encoding of
your reply before you send it?
* Re: Customizing coding priority
From: Sven Bretfeld @ 2007-01-17  7:59 UTC
Cc: help-gnu-emacs

> > The problem is that vm doesn't recognize Mails written in iso-8859-15
> > correctly
>
> Does that mean these mails are displayed incorrectly?

The crazy thing is that they are displayed correctly when I read
them. They are still correct when I write my reply. But the recipient
of the reply (as well as my FCC-Box) receives something like this
sample:

> lieber sven
> hast du letzten donnerstag einen neuen text verteilt? wenn ja,
> k^[,bv^[(Bnntest du mir den vielleicht mailen? wenn nein, k^[,bv^[(Bnntest du mir
> mitteilen, bis zu welchem satz ihr gekommen seid?
> vielen dank und sch^[,bv^[(Bne gr^[,b|^[(Bsse
> maria

^[,Adv|^[(B

The last line outside the citation just contains three German umlauts
(I use this example for testing). The broken characters in the
citation are umlauts too. The headers of these replies say text/plain;
iso-2022-jp was the Content-type. I've tried C-x RET iso-8859-15 as
well as Peter's suggestion M-x prefer-coding-system iso-8859-15. But
the problem is still exactly the same. What else can it be?

In the meantime I found out that I can send a correctly encoded reply
to the above cited message if I don't add any umlauts in my own
answer. The citation stays intact then. This is not really useful in
practice, but maybe it could help to trace the problem.

Thanks for your help

Sven
* Re: Customizing coding priority
From: Eli Zaretskii @ 2007-01-17 18:35 UTC

> From: Sven Bretfeld <sven.bretfeld@relwi.unibe.ch>
> Date: Wed, 17 Jan 2007 08:59:44 +0100
> Cc: help-gnu-emacs@gnu.org
>
> The last line outside the citation just contains three German umlauts
> (I use this example for testing). The broken characters in the
> citation are umlauts too. The headers of these replies say text/plain;
> iso-2022-jp was the Content-type. I've tried C-x RET iso-8859-15 as
> well as Peter's suggestion M-x prefer-coding-system iso-8859-15. But
> the problem is still exactly the same. What else can it be?

How did you insert those umlauts, exactly? And what does Emacs
display if you go to one of those characters and type "C-u C-x ="?
* Re: Customizing coding priority
From: Sven Bretfeld @ 2007-01-17 22:44 UTC
Cc: help-gnu-emacs

Eli Zaretskii writes:

> > The last line outside the citation just contains three German umlauts
> > (I use this example for testing). The broken characters in the
> > citation are umlauts too. The headers of these replies say text/plain;
> > iso-2022-jp was the Content-type. I've tried C-x RET iso-8859-15 as
> > well as Peter's suggestion M-x prefer-coding-system iso-8859-15. But
> > the problem is still exactly the same. What else can it be?
>
> How did you insert those umlauts, exactly? And what does Emacs
> display if you go to one of those characters and type "C-u C-x ="?

I've already tried this today. It's quite interesting. The umlauts in
the quotations are displayed in a font different from the rest
(somewhat smaller and bolder). These are said to be part of the
iso-8859-15 charset when I type C-u C-x = (just as expected). All
other characters are plain ascii (ASCII (ISO646 IRV)). It is different
outside of quotations. When I add my own text after a quotation all
umlauts I type belong to latin-iso8859-1. When I send this text it
gets mangled again.

After a while I found out that, if I yank umlauts from the quotation
to my own text (instead of typing them myself) and send this message
to myself, it stays intact and is displayed fine, although the
Content-type header says "charset unknown".

To my understanding this means that Emacs is unable to translate
umlauts belonging to iso-8859-15 to the default charset I use for
umlauts when writing replies, i.e. iso-8859-1. Therefore the umlauts
in quotations are kept untranslated in the coding system of the
original sender, i.e. iso-8859-15. This mixture of iso-8859-15 in
quotations and iso-8859-1 in the text typed by myself seems to disturb
Emacs so that it encodes all the umlauts of my reply as iso-2022-jp
when sending it.

I concluded that I have to avoid iso-8859-1 completely in replies and
use umlauts belonging to iso-8859-15 also for my own part of the
text. But this doesn't work for some reason. I've tried to change the
coding system and the input method of the reply buffer manually to
iso-8859-15 (using C-x f RET and C-x RET C-\) but the umlauts I type
always come as iso-8859-1.

By the way, I've also tried to start Emacs with an almost empty .emacs
file. The problem remains. So it seems not to depend on any
user-specific configuration.

It's really strange. Can it be a simple bug in vm? But it seems to
work for other people.

Thanks for your help

Sven
* Re: Customizing coding priority
From: Eli Zaretskii @ 2007-01-18  4:20 UTC

> From: Sven Bretfeld <sven.bretfeld@relwi.unibe.ch>
> Date: Wed, 17 Jan 2007 23:44:42 +0100
> Cc: help-gnu-emacs@gnu.org
>
> > How did you insert those umlauts, exactly? And what does Emacs
> > display if you go to one of those characters and type "C-u C-x ="?
>
> I've already tried this today. It's quite interesting. The umlauts in
> the quotations are displayed in a font different from the rest
> (somewhat smaller and bolder). These are said to be part of the
> iso-8859-15 charset when I type C-u C-x = (just as expected). All
> other characters are plain ascii (ASCII (ISO646 IRV)). It is different
> outside of quotations. When I add my own text after a quotation all
> umlauts I type belong to latin-iso8859-1.

You didn't answer my first question: how were these umlauts produced?
Did you copy them from another text, perhaps? And how does the way you
produced those umlauts differ from the way you type the
latin-iso8859-1 characters after the quotation?

Anyway, this mixing of latin-iso8859-1 and iso-8859-15 _is_, most
probably, your problem. Assuming you use Emacs 21.x (is that right?),
Emacs is trying to do what it cannot do in v21.x: encode 8859-1 and
8859-15 characters in the same message. That is why you get
iso-2022-jp encoding.

> To my understanding this means that Emacs is unable to translate
> umlauts belonging to iso-8859-15 to the default charset I use for
> umlauts when writing replies, i.e. iso-8859-1.

In v21.x, it cannot.

> Therefore the umlauts
> in quotations are kept untranslated in the coding system of the
> original sender, i.e. iso-8859-15. This mixture of iso-8859-15 in
> quotations and iso-8859-1 in the text typed by myself seems to disturb
> Emacs so that it encodes all the umlauts of my reply as iso-2022-jp
> when sending it.

Correct.

> I concluded that I have to avoid iso-8859-1 completely in replies and
> use umlauts belonging to iso-8859-15 also for my own part of the
> text. But this doesn't work for some reason. I've tried to change the
> coding system and the input method of the reply buffer manually to
> iso-8859-15 (using C-x f RET and C-x RET C-\) but the umlauts I type
> always come as iso-8859-1.

C-x RET f does not affect the characters you type. You should try to
set an input method (with C-u C-\) that produces iso-8859-15
characters, if there is such an input method in Emacs 21.x.
* Re: Customizing coding priority
From: Tom Rauchenwald @ 2007-01-18  4:58 UTC

Eli Zaretskii <eliz@gnu.org> writes:

> Anyway, this mixing of latin-iso8859-1 and iso-8859-15 _is_, most
> probably, your problem. Assuming you use Emacs 21.x (is that right?),
> Emacs is trying to do what it cannot do in v21.x: encode 8859-1 and
> 8859-15 characters in the same message. That is why you get
> iso-2022-jp encoding.

Are you sure about this? Both charsets are basically the same; for the
German-speaking area the only difference I can think of is the
addition of the euro sign. So for a few umlauts it doesn't matter
which charset is used.

Tom
* Re: Customizing coding priority
From: Peter Dyballa @ 2007-01-18 10:02 UTC
Cc: help-gnu-emacs

On 18.01.2007 at 05:58, Tom Rauchenwald wrote:

> Eli Zaretskii writes:
>
>> Anyway, this mixing of latin-iso8859-1 and iso-8859-15 _is_, most
>> probably, your problem. Assuming you use Emacs 21.x (is that right?),
>> Emacs is trying to do what it cannot do in v21.x: encode 8859-1 and
>> 8859-15 characters in the same message. That is why you get
>> iso-2022-jp encoding.
>
> Are you sure about this? Both charsets are basically the same, for the
> german-speaking area the only difference i can think of is the
> addition of the euro-sign. So for a few umlauts it doesn't matter
> which charset is used.

The umlauts are the same. The differences are these (in each group the
ISO 8859-1 characters are above the rule, the ISO 8859-15 characters
below it):

;     oct   dec   hex   UCS2     UTF-8
;=====================================
¤ = 244 = 164 = A4 = U+00A4 = C2 A4    : CURRENCY SIGN
-----------------------------------------------------------------------------
€ = 244 = 164 = A4 = U+20AC = E2 82 AC : EURO SIGN

¦ = 246 = 166 = A6 = U+00A6 = C2 A6    : BROKEN BAR
-----------------------------------------------------------------------------
Š = 246 = 166 = A6 = U+0160 = C5 A0    : LATIN CAPITAL LETTER S WITH CARON

¨ = 250 = 168 = A8 = U+00A8 = C2 A8    : DIAERESIS
-----------------------------------------------------------------------------
š = 250 = 168 = A8 = U+0161 = C5 A1    : LATIN SMALL LETTER S WITH CARON

´ = 264 = 180 = B4 = U+00B4 = C2 B4    : ACUTE ACCENT
-----------------------------------------------------------------------------
Ž = 264 = 180 = B4 = U+017D = C5 BD    : LATIN CAPITAL LETTER Z WITH CARON

¸ = 270 = 184 = B8 = U+00B8 = C2 B8    : CEDILLA
-----------------------------------------------------------------------------
ž = 270 = 184 = B8 = U+017E = C5 BE    : LATIN SMALL LETTER Z WITH CARON

¼ = 274 = 188 = BC = U+00BC = C2 BC    : VULGAR FRACTION ONE QUARTER
½ = 275 = 189 = BD = U+00BD = C2 BD    : VULGAR FRACTION ONE HALF
¾ = 276 = 190 = BE = U+00BE = C2 BE    : VULGAR FRACTION THREE QUARTERS
-----------------------------------------------------------------------------
Œ = 274 = 188 = BC = U+0152 = C5 92    : LATIN CAPITAL LIGATURE OE
œ = 275 = 189 = BD = U+0153 = C5 93    : LATIN SMALL LIGATURE OE
Ÿ = 276 = 190 = BE = U+0178 = C5 B8    : LATIN CAPITAL LETTER Y WITH DIAERESIS

--
With peaceful greetings

   Pete

A lot of us are working harder than we want, at things we don't like
to do. Why? ...In order to afford the sort of existence we don't care
to live.
                                -- Bradford Angier
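The table can be checked from within Emacs by decoding the same byte
under both coding systems. This is only a sketch to evaluate in
*scratch* with C-x C-e; the remarks about results assume a
Unicode-based Emacs (23 or later), and the Emacs 21 discussed in this
thread may behave differently.

```elisp
;; "\244" is the single byte 0xA4, "\366" the single byte 0xF6,
;; both written as octal escapes.
(decode-coding-string "\244" 'iso-8859-1)    ; currency sign
(decode-coding-string "\244" 'iso-8859-15)   ; euro sign

;; 0xF6 is o-umlaut in both character sets. On a Unicode-based Emacs
;; the two decoded strings compare equal, so this returns t.
(string= (decode-coding-string "\366" 'iso-8859-1)
         (decode-coding-string "\366" 'iso-8859-15))
```

On Emacs 21, where each character also carries its charset, the last
form would presumably return nil unless unification on decoding is
enabled, which matches what is reported elsewhere in this thread.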
* Re: Customizing coding priority
From: Sven Bretfeld @ 2007-01-18 16:12 UTC
Cc: help-gnu-emacs

Hi Eli, hi list

Eli Zaretskii writes:

> You didn't answer my first question: how these umlauts were produced.
> Did you copy them from another text, perhaps? And how is the way you
> produced those umlauts differs from the way you type the
> latin-iso8859-1 characters after the quotation?

I hope I get your question right. The umlauts that are encoded in
iso-8859-15 appear in the mail by executing the function
vm-reply-include-text. It is bound to the R key when in a
Mailbox-Summary buffer in vm. It sets up an answer to the email under
the point, citing the content of the original. The cited text keeps
umlauts encoded in iso-8859-15. Thus, it is an automatic function not
under direct control of the user. But I think this must be the place
from where a possible solution has to set out. Namely, I need to tell
Emacs to translate iso-8859-15 encoded characters to iso-8859-1 when
executing vm-reply-include-text. Regrettably, I have no idea how to do
that.

Anyway, what Peter and Tom remark sounds strange. There must be some
difference in the umlauts of the two coding systems, at least for
Emacs. Because the iso-8859-15 umlauts of the cited text always look
different from the ones I type in iso-8859-1. The former are displayed
in another font.

Maybe it has to do with my general KDE settings? I live in Switzerland
and we don't need the Euro character at all. Possibly iso-8859-15
isn't installed on the system (anyway, I use utf-8 as default for all
KDE programs and for Emacs). Does Emacs inherit parts of its own
coding configuration from KDE? I ask because I wonder why iso-8859-15
does not appear in the list when I execute describe-coding-system.
Also, when I change the coding system using M-x prefer-coding-system
iso-8859-15 and type äöü, these characters are described as belonging
to iso-8859-1 when I check them with C-u C-x =. Maybe iso-8859-15 is
not supported fully with my present Emacs configuration? Can this be
the problem?

Sorry, guys. I really would like to have this problem solved. I like
vm as my email client and I don't want to change. I experimented with
mutt yesterday but I didn't fall in love with it as much as I did with
vm. Emacs rules!

Eli, do you think the problem wouldn't exist in Emacs 22? I use 21
from the standard package of Debian Etch.

Thank you for helping me

Sven

_______________________________________________
help-gnu-emacs mailing list
help-gnu-emacs@gnu.org
http://lists.gnu.org/mailman/listinfo/help-gnu-emacs
* Re: Customizing coding priority
From: Peter Dyballa @ 2007-01-18 16:32 UTC
Cc: help-gnu-emacs

On 18.01.2007 at 17:12, Sven Bretfeld wrote:

> Anyway, what Peter and Tom remark sounds strange. There must be some
> difference in the umlauts of the two coding systems, at least for
> Emacs.

Yes, there is in GNU Emacs. The characters don't stand alone, they
have some encoding attribute. Unicode Emacs 23.0.0 cancels this,
finally.

'locale -a' should display which encodings your system knows. But this
should not have any influence on GNU Emacs: it has its own ELisp files
to handle these encodings. And internally all files are held in a
common encoding, which is then presented to the user by the encoding
used in this buffer.

Probably vm has some restrictions. GNU Emacs 22 won't solve the
problem, GNU Emacs 23 might! (I simply gave up and switched to a
different MUA.)

--
With peaceful greetings

   Pete

UNIX is user friendly, it's just picky about who its friends are.
* Re: Customizing coding priority
From: Reiner Steib @ 2007-01-18 18:27 UTC

On Thu, Jan 18 2007, Peter Dyballa wrote:

> On 18.01.2007 at 17:12, Sven Bretfeld wrote:
>
>> Anyway, what Peter and Tom remark sounds strange. There must be some
>> difference in the umlauts of the two coding systems, at least for
>> Emacs.
>
> Yes, there is in GNU Emacs. The characters don't stand alone, they have some
> encoding attribute. Unicode Emacs 23.0.0 cancels this, finally.

Your usage of "GNU Emacs" and "Unicode Emacs" here is wrong or at
least misleading, IMHO. See also
(info "(efaq)Difference between Emacs and XEmacs").

> Probably vm has some restrictions. GNU Emacs 22 won't solve the problem,

NACK. `unify-8859-on-encoding-mode' and `unify-8859-on-decoding-mode'
are sufficient to solve this problem WRT iso-8859-{1,15} even in
Emacs 21.[34].

> GNU Emacs 23 might! (I simply gave up and switched to a different
> MUA.)

Even with Emacs 21.4, there are no problems with charsets when using
Gnus. `rs-ucs-coding-system.el'[1] might be interesting if you need
support for windows-12xx and/or additional iso-8859 charsets (-6, -10,
-11, -13, -16); see the table in [1].

Bye, Reiner.

[1] <http://theotp1.physik.uni-ulm.de/~ste/comp/emacs/misc/rs-ucs-coding-system.el>
--
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/
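For the record, enabling both modes named above could look roughly like
this in ~/.emacs. It is a guarded sketch, assuming Emacs 21.3 to 22.x,
where these modes come from the ucs-tables library; the guards are
there so the lines do nothing harmful on versions that no longer ship
it.

```elisp
;; Load ucs-tables if this Emacs has it; the third argument makes
;; `require' return nil instead of signalling an error when it is absent.
(require 'ucs-tables nil t)

;; Treat the Latin characters shared by the iso-8859 sets as the same
;; characters, both when encoding (saving, sending) and when decoding
;; (reading) text.
(when (fboundp 'unify-8859-on-encoding-mode)
  (unify-8859-on-encoding-mode 1))
(when (fboundp 'unify-8859-on-decoding-mode)
  (unify-8859-on-decoding-mode 1))
```

Whether this alone helps depends on how the mail client chooses the
outgoing charset, so it should be read as a starting point rather than
a confirmed fix.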
* Re: Customizing coding priority
From: Eli Zaretskii @ 2007-01-18 21:36 UTC

> From: Sven Bretfeld <sven.bretfeld@relwi.unibe.ch>
> Date: Thu, 18 Jan 2007 17:12:48 +0100
> Cc: help-gnu-emacs@gnu.org
>
> Eli Zaretskii writes:
> > You didn't answer my first question: how these umlauts were produced.
> > Did you copy them from another text, perhaps? And how is the way you
> > produced those umlauts differs from the way you type the
> > latin-iso8859-1 characters after the quotation?
>
> I hope I get your question right. The umlauts that are encoded in
> iso-8859-15 appear in the mail by executing the function
> vm-reply-include-text. It is bound to the R-key when in an
> Mailbox-Summary buffer in vm. It sets up an answer to the email under
> the point citing the content of the original. The cited text keeps
> umlauts encoded in iso-8859-15.

Yes, this answers my question. I understand that these characters
come from the mail to which you reply.

> Anyway, what Peter and Tom remark sounds strange. There must be some
> difference in the umlauts of the two coding systems, at least for
> Emacs. Because the iso-8859-15 umlauts of the cited text alway look
> different from the ones I type in iso-8859-1. The former are displayed
> in another font.

Do you have something related in your ~/.emacs init file? Does this
problem go away if you invoke Emacs with "emacs -q --no-site-file"?
(If doing so prevents you from using VM, then leave only the
VM-related customizations in .emacs and comment out everything else.)

> Eli, do you think the problem wouldn't exist in Emacs 22?

I don't know. I need to understand your problem first. But if it's
easy for you to try Emacs 22, I recommend doing so.
* Re: Customizing coding priority
From: Reiner Steib @ 2007-01-18 17:31 UTC

On Thu, Jan 18 2007, Eli Zaretskii wrote:

>> From: Sven Bretfeld <sven.bretfeld@relwi.unibe.ch>
[...]
>> To my understanding this means that Emacs is unable to translate
>> umlauts belonging to iso-8859-15 to the default charset I use for
>> umlauts when writing replies, i.e. iso-8859-1.
>
> In v21.x, it cannot.

Emacs 21.3 and 21.4 both include `unify-8859-on-encoding-mode' and it
is turned *on* by default. (I didn't follow the complete thread, so I
might have missed some detail.)

Bye, Reiner.
--
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/
* Re: Customizing coding priority
From: Peter Dyballa @ 2007-01-18 18:15 UTC
Cc: help-gnu-emacs

On 18.01.2007 at 18:31, Reiner Steib wrote:

> Emacs 21.3 and 21.4 both include `unify-8859-on-encoding-mode' and it
> is turned *on* by default. (I didn't follow the complete thread, so I
> might have missed some detail.)

It's also on in my GNU Emacs 22.0.92 – ISO 8859-1 and ISO 8859-15 have
only US ASCII characters in common ...

--
With peaceful greetings

   Pete

Rain is saved up in cloud banks.
* Re: Customizing coding priority
From: Reiner Steib @ 2007-01-18 18:46 UTC

On Thu, Jan 18 2007, Peter Dyballa wrote:

> On 18.01.2007 at 18:31, Reiner Steib wrote:
>
>> Emacs 21.3 and 21.4 both include `unify-8859-on-encoding-mode' and it
>> is turned *on* by default. (I didn't follow the complete thread, so I
>> might have missed some detail.)
>
> It's also on in my GNU Emacs 22.0.92 –

Sure. I didn't say that it isn't.

> ISO 8859-1 and ISO 8859-15 have only US ASCII characters in common

Large parts of the non-ASCII range are identical, cf. iso_8859-1(7)
and iso_8859-15(7). I guess you meant something different, because in
article <F7CEE765-EBF8-41F1-AA96-ED03DB2DD0E7@Web.DE> you listed the
difference (exactly eight positions).

Bye, Reiner.

P.S.: Your X-Image-Url gives a 404 error.
--
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/
* Re: Customizing coding priority
From: Sven Bretfeld @ 2007-01-18 22:14 UTC

This is what Emacs tells me when I hit C-u C-x = with the point above
an
* Re: Customizing coding priority
From: Sven Bretfeld @ 2007-01-18 22:20 UTC
Cc: help-gnu-emacs

Oh, I'm sorry. In my last posting there was an iso-8859-15 encoded ö
in the cited output of C-u C-x =, which of course produced the problem
we are talking about. Now you can see what happens. Here is the
"cleaned" text:

Sven Bretfeld writes:

> This is what Emacs tells me when I hit C-u C-x = with the point above
> an ö encoded with iso-8859-15:
>
>   character: ö (07566, 3958, 0xf76)
>     charset: latin-iso8859-15
>              (Right-Hand Part of Latin Alphabet 9 (ISO/IEC 8859-15): ISO-IR-203)
>  code point: 118
>      syntax: word
>    category: l:Latin
> buffer code: 0x8E 0xF6
>   file code: not encodable by coding system no-conversion
>        font: -Misc-Fixed-Medium-R-Normal--20-200-75-75-C-100-ISO8859-15
>
> Here is the same for an ö encoded with iso-8859-1:
>
>   character: ö (04366, 2294, 0x8f6)
>     charset: latin-iso8859-1
>              (Right-Hand Part of Latin Alphabet 1 (ISO/IEC 8859-1): ISO-IR-100)
>  code point: 118
>      syntax: word
>    category: l:Latin
> buffer code: 0x81 0xF6
>   file code: 0x81 0xF6 (encoded by coding system raw-text)
>        font: -Adobe-Courier-Medium-R-Normal--24-240-75-75-M-150-ISO8859-1
>
> I cannot make much of it. But it doesn't look the same to me. Maybe
> somebody can see a hint to the problem here.
>
> I have inserted
>
> (require 'ucs-tables)
> (unify-8859-on-encoding-mode 1)
>
> in my .emacs file. But it didn't solve the problem. Maybe there is a
> mistake or a shortcoming in the VM package. What I found is a piece of
> code in the file /usr/share/emacs/site-lisp/vm/vm-vars.el that looks
> relevant to me, since it seems not to include a translation rule for
> iso-8859-15 at all:
>
> (defvar vm-mime-mule-charset-to-charset-alist
>   '(
>     (latin-iso8859-1    "iso-8859-1")
>     (latin-iso8859-2    "iso-8859-2")
>     (latin-iso8859-3    "iso-8859-3")
>     (latin-iso8859-4    "iso-8859-4")
>     (cyrillic-iso8859-5 "iso-8859-5")
>     (arabic-iso8859-6   "iso-8859-6")
>     (greek-iso8859-7    "iso-8859-7")
>     (hebrew-iso8859-8   "iso-8859-8")
>     (latin-iso8859-9    "iso-8859-9")
>     (japanese-jisx0208  "iso-2022-jp")
>     (korean-ksc5601     "iso-2022-kr")
>     (chinese-gb2312     "iso-2022-jp")
>     (sisheng            "iso-2022-jp")
>     (thai-tis620        "iso-2022-jp")
>    )
>   "Alist that maps MULE character sets to matching MIME character sets.")
>
> I've tried adding (latin-iso8859-15 "iso-8859-15") to the list, but
> that didn't help.
>
> Thanks again
>
> Sven
* Re: Customizing coding priority
From: Eli Zaretskii @ 2007-01-19 10:23 UTC

> From: Sven Bretfeld <sven.bretfeld@relwi.unibe.ch>
> Date: Thu, 18 Jan 2007 23:20:47 +0100
> Cc: help-gnu-emacs@gnu.org
>
>   character: ö (04366, 2294, 0x8f6)
>     charset: latin-iso8859-1
>              (Right-Hand Part of Latin Alphabet 1 (ISO/IEC 8859-1): ISO-IR-100)
>  code point: 118
>      syntax: word
>    category: l:Latin
> buffer code: 0x81 0xF6
>   file code: 0x81 0xF6 (encoded by coding system raw-text)
                                                   ^^^^^^^^
This ``raw-text'' thingy might be a sign of the problem, as well as
this:

> buffer code: 0x8E 0xF6
>   file code: not encodable by coding system no-conversion
                                              ^^^^^^^^^^^^^
It's not normal for an email buffer to use either of these two
``encodings''. What does Emacs tell you if you type

   M-: buffer-file-coding-system RET

in the buffer where you compose such problematic email messages, the
ones that mix iso-8859-1 and iso-8859-15 characters and end up being
encoded in iso-2022-jp?

Anyway, I'm beginning to think that maybe this is some bug in VM. Did
you consider asking on the VM mailing list?

> in my .emacs file. But it didn't solve the problem. Maybe there is a
> mistake or a shortcoming in the vm-pakage. What I found is a piece of
> code in the file /usr/share/emacs/site-lisp/vm/vm-vars.el that looks
> relevant to me, since it seems not to include a translation rule for
> iso-8859-15 at all:

That might also be a problem. But I asked you to try to remove from
your .emacs everything that is not required to use VM per se. Did you
try that?
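A small helper along these lines could collect that information from
the reply buffer in one step. The command name
`my-report-buffer-charsets` is hypothetical and not part of Emacs or
VM; `find-charset-region`, `find-coding-systems-region` and
`buffer-file-coding-system` are standard Emacs.

```elisp
(defun my-report-buffer-charsets ()
  "Show charsets, usable coding systems and the buffer's coding system.
Hypothetical diagnostic helper, not part of Emacs or VM."
  (interactive)
  (message "charsets: %S  coding systems: %S  buffer-file-coding-system: %S"
           ;; Every charset that occurs in the buffer text.
           (find-charset-region (point-min) (point-max))
           ;; Coding systems able to encode all of the buffer text.
           (find-coding-systems-region (point-min) (point-max))
           ;; The coding system Emacs would use to save this buffer.
           buffer-file-coding-system))
```

Run it with M-x my-report-buffer-charsets in the composition buffer. If
the first list names both latin-iso8859-1 and latin-iso8859-15, the
buffer contains the mixture discussed above.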
* Re: Customizing coding priority
From: Peter Dyballa @ 2007-01-19  0:24 UTC
Cc: help-gnu-emacs

On 18.01.2007 at 19:46, Reiner Steib wrote:

>> ISO 8859-1 and ISO 8859-15 have only US ASCII characters in common
>
> Large parts of the non-ASCII range are identical, cf. iso_8859-1(7)
> and iso_8859-15(7). I guess you meant something different, because in
> article <F7CEE765-EBF8-41F1-AA96-ED03DB2DD0E7@Web.DE> you listed the
> difference (exactly eight positions).

Only the 7 bit US ASCII characters are equal. In the 8 bit area
compare-windows finds every 8 bit character (for example ä and ä, or
ö and ö) different ...

--
With peaceful greetings

   Pete

Basic, n.:
        A programming language. Related to certain social diseases in
        that those who have it will not admit it in polite company.
* Re: Customizing coding priority
From: Reiner Steib @ 2007-01-19  9:37 UTC

On Fri, Jan 19 2007, Peter Dyballa wrote:

> Only the 7 bit US ASCII characters are equal. In the 8 bit area
> compare-windows finds every 8 bit character (for example ä and ä, or
> ö and ö) different ...

Sorry, I didn't realize that you refer to Emacs buffers here. I was
talking about the definition of the two character sets.

Bye, Reiner.
--
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/
* Re: Customizing coding priority
From: Eli Zaretskii @ 2007-01-19 10:40 UTC

> From: Peter Dyballa <Peter_Dyballa@Web.DE>
> Date: Fri, 19 Jan 2007 01:24:48 +0100
> Cc: help-gnu-emacs@gnu.org
>
> Only the 7 bit US ASCII characters are equal. In the 8 bit area
> compare-windows finds every 8 bit character (for example ä and ä, or
> ö and ö) different ...

Of course, due to unify-8859-on-encoding, what I see in this mail of
yours are pairs of identical characters, so your point doesn't get
across...
[parent not found: <mailman.3276.1169158455.2155.help-gnu-emacs@gnu.org>]
* Re: Customizing coding priority [not found] ` <mailman.3276.1169158455.2155.help-gnu-emacs@gnu.org> @ 2007-01-19 14:04 ` Piet van Oostrum 2007-01-19 17:10 ` [SOLVED] " Sven Bretfeld 0 siblings, 1 reply; 26+ messages in thread From: Piet van Oostrum @ 2007-01-19 14:04 UTC (permalink / raw) >>>>> Sven Bretfeld <sven.bretfeld@relwi.unibe.ch> (SB) wrote: >SB> I have inserted >SB> (require 'ucs-tables) >SB> (unify-8859-on-encoding-mode 1) >SB> in my .emacs file. But it didn't solve the problem. Maybe there is a >SB> mistake or a shortcoming in the vm package. The standard VM just doesn't have the code to properly encode messages with characters from mixed charsets. If it can't find a single charset that encodes all characters in the message, it chooses iso-2022-jp, which is what you see in your message. This is an encoding that switches between other encodings with escape sequences. But most people outside Japan will not be able to read it. There are two solutions for it AFAIK. One is a small piece of code I wrote when the Euro was introduced, because I experienced the same problems when using the €-sign in VM. You just put this in your .emacs file. I am quite sure it won't work with XEmacs (never tried) but with Emacs 22 it does work. It follows below. It presupposes that unify-8859-on-encoding-mode is enabled. The other possibility is to download Robert Widhopf-Fenk's version of VM from http://www.robf.de/Hacking/elisp. It contains more robust code that also works on XEmacs. Maybe you can load only vm-mime.el, but I am not sure if it works with all the other files. I am using his whole package without problems.
,----
| (defun vm-sort-coding-systems-predicate (a b)
|   (> (length (memq a vm-coding-system-priorities))
|      (length (memq b vm-coding-system-priorities))))
|
| (setq vm-coding-system-priorities
|       '(iso-latin-1 iso-latin-9 mule-utf-8 mac-roman)
| ;      '(iso-latin-1 iso-latin-9 windows-1252 mule-utf-8 mac-roman)
|       mm-coding-system-priorities vm-coding-system-priorities)
|
| ; The next line is for a noautoload vm.elc. Otherwise use "vm-mime".
| ;(eval-after-load "vm"
| ; The next line is for an autoload (default) vm.elc. Otherwise use "vm".
| (eval-after-load "vm-mime"
|   '(defun vm-determine-proper-charset (beg end)
|      (save-excursion
|        (save-restriction
|          (narrow-to-region beg end)
|          (catch 'done
|            (goto-char (point-min))
|            (if (or vm-xemacs-mule-p
|                    (and vm-fsfemacs-mule-p enable-multibyte-characters))
|                (let ((charsets (delq 'compound-text (find-coding-systems-region
|                                                      (point-min) (point-max)))))
|                  (cond ((equal charsets '(undecided))
|                         "us-ascii")
|                        (t
|                         (setq charsets
|                               (sort charsets 'vm-sort-coding-systems-predicate))
|                         (while charsets
|                           (let ((cs (coding-system-get (pop charsets) 'mime-charset)))
|                             (if cs
|                                 (throw 'done (symbol-name cs))))))))
|              (and (re-search-forward "[^\000-\177]" nil t)
|                   (throw 'done (or vm-mime-8bit-composition-charset
|                                    "iso-8859-1")))
|              (throw 'done vm-mime-7bit-composition-charset)))))))
|
| ; This is only necessary for incoming mail in utf-7 or from Windows
| (require 'utf-7)
| (eval-after-load "vm"
|   '(setq vm-mime-mule-charset-to-coding-alist
|          (cons (quote ("utf-7" utf-7))
| ;code below is to accept mail from those morons that send
| ; latin1 or windows-1252 characters without a charset declaration
| ; (or with charset=ascii)
|          (cons (quote ("us-ascii" windows-1252))
|                (cons (quote ("iso-8859-1" windows-1252))
|                      vm-mime-mule-charset-to-coding-alist)))))
`----
-- Piet van Oostrum <piet@cs.uu.nl> URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4] Private email: piet@vanoostrum.org ^ permalink raw reply [flat|nested] 26+ messages in thread
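The sorting predicate in Piet's code ranks a coding system by the length of the tail that memq returns for it, so entries nearer the front of the priority list sort first and unlisted entries sort last. The following standalone sketch illustrates just that idea outside VM; the names my-demo-priorities, my-demo-priority-p and my-demo-sort-by-priority are hypothetical and exist only for this example.

```elisp
;; Hypothetical names for illustration only; VM itself uses
;; `vm-coding-system-priorities' as shown in the message above.
(defvar my-demo-priorities '(iso-latin-1 iso-latin-9 mule-utf-8)
  "Example priority list, most preferred coding system first.")

(defun my-demo-priority-p (a b)
  "Return non-nil if A should sort before B per `my-demo-priorities'."
  ;; `memq' returns the tail of the list starting at the element, or nil.
  ;; An earlier element has a longer tail; an unlisted one has length 0.
  (> (length (memq a my-demo-priorities))
     (length (memq b my-demo-priorities))))

(defun my-demo-sort-by-priority (coding-systems)
  "Return a sorted copy of CODING-SYSTEMS, most preferred first."
  ;; `sort' is destructive on lists, so work on a copy.
  (sort (copy-sequence coding-systems) #'my-demo-priority-p))

(my-demo-sort-by-priority '(mule-utf-8 iso-2022-jp iso-latin-9 iso-latin-1))
;; => (iso-latin-1 iso-latin-9 mule-utf-8 iso-2022-jp)
```

Because iso-2022-jp is not in the example list, it ends up last, which matches the intent of only falling back to it when nothing preferred can encode the message.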
* Re: [SOLVED] Customizing coding priority 2007-01-19 14:04 ` Piet van Oostrum @ 2007-01-19 17:10 ` Sven Bretfeld 2007-01-19 22:45 ` Lennart Borgman (gmail) 0 siblings, 1 reply; 26+ messages in thread From: Sven Bretfeld @ 2007-01-19 17:10 UTC (permalink / raw) Cc: help-gnu-emacs Piet van Oostrum writes: > > There are two solutions for it AFAIK. Wow!!! I almost cannot believe it. It works! Thank you very much, Piet, for sharing the code. I hope that other newcomers to vm will find this thread or at least a hint to the vm-version of Robert Widhopf-Fenk. I haven't found it with Google, and I think I've tried every possible combination of search terms. Thanks to all Sven ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [SOLVED] Customizing coding priority 2007-01-19 17:10 ` [SOLVED] " Sven Bretfeld @ 2007-01-19 22:45 ` Lennart Borgman (gmail) 0 siblings, 0 replies; 26+ messages in thread From: Lennart Borgman (gmail) @ 2007-01-19 22:45 UTC (permalink / raw) Cc: Piet van Oostrum, help-gnu-emacs Sven Bretfeld wrote: > Piet van Oostrum writes: > > > > There are two solutions for it AFAIK. > > Wow!!! I almost cannot believe it. It works! Thank you very much, Piet, for > sharing the code. > > I hope that other newcomers to vm will find this thread or at least a > hint to the vm-version of Robert Widhopf-Fenk. I haven't found it with > Google, and I think I've tried every possible combination of search > terms. > > Thanks to all > > Sven Maybe mention it on EmacsWiki? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Customizing coding priority 2007-01-17 7:59 ` Sven Bretfeld 2007-01-17 18:35 ` Eli Zaretskii @ 2007-01-17 20:38 ` Sven Bretfeld 1 sibling, 0 replies; 26+ messages in thread From: Sven Bretfeld @ 2007-01-17 20:38 UTC (permalink / raw) Cc: help-gnu-emacs >> The last line outside the citation just contains three German umlauts >> (I use this example for testing). The broken characters in the >> citation are umlauts too. The headers of these replies say text/plain; >> iso-2022-jp was the Content-type. I've tried C-x RET iso-8859-15 as >> well as Peter's suggestion M-x prefer-coding-system iso-8859-15. But >> the problem is still exactly the same. What else can it be? > How did you insert those umlauts, exactly? And what does Emacs > display if you go to one of those characters and type "C-u C-x ="? I've already tried this today. It's quite interesting. The umlauts in the quotations are displayed in a font different from the rest (somewhat smaller and bolder). They are said to be part of the iso-8859-15 charset when I type C-u C-x = (just as expected). All other characters are plain (ASCII (ISO646 IRV)). When I add my own text after the quotation, all umlauts I type belong to latin-iso8859-1. When I send this text it is garbled again. Then I found out that, if I yank umlauts from the quotation to my own text (instead of typing them myself), the sent message stays intact and is displayed fine after receiving it again, although the Content-type header says "charset unknown". As I understand it, this means that Emacs is unable to translate umlauts belonging to iso-8859-15 to the default charset I use for umlauts when writing replies, i.e. iso-8859-1. Therefore the umlauts in quotations are kept untranslated in the coding system of the original sender, i.e. iso-8859-15. This mixture of iso-8859-15 in quotations and iso-8859-1 in my part seems to disturb Emacs so that it encodes all the umlauts in my reply as iso-2022-jp when sending it. 
So I've tried to avoid iso-8859-1 completely in replies and use umlauts belonging to iso-8859-15 also for my own part of the text. But this doesn't work for some reason. I've tried to change the coding system and the input method of the reply buffer manually to iso-8859-15 (using C-x RET f and C-x RET C-\) but umlauts I type always come as iso-8859-1. By the way, I also tried to start Emacs with an almost empty .emacs file. The problem stays. So it seems not to depend on any user-specific configuration. It's really strange. Can it be a simple bug in vm? But it seems to work for other people. Thanks for your help Sven (Sorry that I have destroyed the threading. I received Eli's last message on another account while experimenting with my problem. I forgot the keep option in fetchmailrc and had to copy and paste the message.) ^ permalink raw reply [flat|nested] 26+ messages in thread
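The manual inspection with C-u C-x = described above can also be done for a whole region at once. The following is a minimal sketch, not VM code: it reports which charsets occur in a region and which MIME charsets Emacs considers able to encode all of it, using the standard functions find-charset-region, find-coding-systems-region and coding-system-get. The function name my-region-encodability is hypothetical, and the :mime-charset property keyword assumes Emacs 23 or later (older versions use the symbol mime-charset, as in Piet's code).

```elisp
;; Sketch only; `my-region-encodability' is a made-up helper name.
;; Assumes Emacs 23 or later for the `:mime-charset' property keyword.
(defun my-region-encodability (beg end)
  "Describe how the text between BEG and END could be encoded.
Return a plist with :charsets (charsets present in the region) and
:mime-charsets (MIME charsets able to encode the whole region)."
  (let ((codings (find-coding-systems-region beg end)))
    (list :charsets (find-charset-region beg end)
          :mime-charsets
          (if (equal codings '(undecided))
              ;; `(undecided)' means the text has no multibyte characters.
              (list 'us-ascii)
            ;; Map each coding system to its MIME name, dropping
            ;; coding systems without one and removing duplicates.
            (delete-dups
             (delq nil
                   (mapcar (lambda (cs)
                             (coding-system-get cs :mime-charset))
                           codings)))))))
```

Calling (my-region-encodability (point-min) (point-max)) in a reply buffer before sending shows whether any single-byte charset such as iso-8859-1 or iso-8859-15 covers the whole text; if neither appears in the result, a mailer that insists on one charset has to fall back to something else.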