* garbage chars when pasting French chars into emacs
From: ken @ 2012-02-01 20:41 UTC
To: GNU Emacs List

Just to be comprehensive I'll state at the outset that I'm using Linux
(CentOS 5.7), so this is the environment emacs is working in. From a
shell I get this:

$ set|grep -i lang
LANG=en_US.UTF-8

Now I pull up a webpage with some French on it:
<http://www.wikilivres.info/wiki/Maurice_Merleau-Ponty>. Examining the
source code of this page, I see at the top:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

So this page is presented in UTF-8.

Firefox is also set to present pages in UTF-8: View -> Character
Encoding -> UTF-8

But when I copy and paste the text from "Francais" to "invisible, 1964)"
inclusive, many of the characters aren't rendered correctly; I get
"garbage" characters in their stead, e.g., the second-to-last line
appears something like this:

* L^[$(B!G^[$(C)+^[(Bil et l^[$(B!G^[(Besprit, Gallimard, 1960

Other lines are improperly rendered also.

I'd like to fix this. And if possible understand why this doesn't work,
so I might be able to diagnose these problems for myself.

BTW, I'm using

GNU Emacs 21.4.1 (i686-redhat-linux-gnu, X toolkit, Xaw3d scroll bars)
of 2011-04-28 on builder10.centos.org

Yes, it's an older version, but it's the latest from the CentOS 5.7
distribution. (Blame Red Hat.)

Thanks for your help.

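Editorial sketch, not part of the original message: the "garbage" in the sample line looks like ISO-2022 escape sequences rather than random noise. The Python snippet below assumes that `^[` stands for the ESC byte and that the text can be read as `iso2022_jp_2` (an assumption based on the character-set switches visible in the line, not something the poster confirmed). It is an illustration outside Emacs, not a description of what Emacs 21 does internally.

```python
# The sample line from the message, with ^[ written as the ESC byte (0x1b).
# Assumption: the pasted text is ISO-2022 encoded.
raw = b"* L\x1b$(B!G\x1b$(C)+\x1b(Bil et l\x1b$(B!G\x1b(Besprit, Gallimard, 1960"

# ESC $ ( B  designates JIS X 0208 (Japanese); the byte pair "!G" is a
#            right single quotation mark in that character set.
# ESC $ ( C  designates KS C 5601 (Korean); the byte pair ")+" is the
#            oe ligature in that character set.
# ESC ( B    switches back to plain ASCII.
decoded = raw.decode("iso2022_jp_2")
print(decoded)
```

Read this way, the apostrophe and the ligature were not lost in the paste; they arrived tagged as members of Japanese and Korean character sets, and the escape sequences are the markers for those switches.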
* Re: garbage chars when pasting French chars into emacs
From: Eli Zaretskii @ 2012-02-01 21:23 UTC
To: help-gnu-emacs

> Date: Wed, 01 Feb 2012 15:41:42 -0500
> From: ken <gebser@mousecar.com>
>
> [...]
> But when I copy and paste the text from "Francais" to "invisible, 1964)"
> inclusive, many of the characters aren't rendered correctly; I get
> "garbage" characters in their stead, e.g., the second-to-last line
> appears something like this:
>
> * L^[$(B!G^[$(C)+^[(Bil et l^[$(B!G^[(Besprit, Gallimard, 1960
>
> Other lines are improperly rendered also.
>
> I'd like to fix this. And if possible understand why this doesn't work,
> so I might be able to diagnose these problems for myself.

What is your value of selection-coding-system? Try setting it to
something like ctext-with-extensions.

* Re: garbage chars when pasting French chars into emacs
From: ken @ 2012-02-02 2:39 UTC
To: GNU Emacs List

On 02/01/2012 04:23 PM Eli Zaretskii wrote:
> [...]
> What is your value of selection-coding-system? Try setting it to
> something like ctext-with-extensions.

Thanks, Eli,

Immediately prior to doing the copy-and-paste I ran all of these:

(set-language-environment 'UTF-8)
(set-default-coding-systems 'utf-8)
(setq file-name-coding-system 'utf-8)
(setq default-buffer-file-coding-system 'utf-8)
(setq coding-system-for-write 'utf-8)
(set-keyboard-coding-system 'utf-8)
(set-terminal-coding-system 'utf-8)
(set-clipboard-coding-system 'utf-8)
(set-selection-coding-system 'utf-8)
(prefer-coding-system 'utf-8)
(modify-coding-system-alist 'process "\\*shell\\*\\'" 'utf-8-unix)

Following your advice, I ran

(set-selection-coding-system 'ctext-with-extensions)

and then did the same copy-and-paste again. This got more of the
characters correct, but not all of them. So we're a lot closer....
Got another suggestion?

* Re: garbage chars when pasting French chars into emacs
From: Eli Zaretskii @ 2012-02-02 3:55 UTC
To: help-gnu-emacs

> Date: Wed, 01 Feb 2012 21:39:22 -0500
> From: ken <gebser@mousecar.com>
>
> > What is your value of selection-coding-system? Try setting it to
> > something like ctext-with-extensions.
>
> Thanks, Eli,
>
> Immediately prior to doing the copy-and-paste I ran all of these:
>
> (set-language-environment 'UTF-8)
> (set-default-coding-systems 'utf-8)
> (setq file-name-coding-system 'utf-8)
> (setq default-buffer-file-coding-system 'utf-8)
> (setq coding-system-for-write 'utf-8)
> (set-keyboard-coding-system 'utf-8)
> (set-terminal-coding-system 'utf-8)
> (set-clipboard-coding-system 'utf-8)
> (set-selection-coding-system 'utf-8)
> (prefer-coding-system 'utf-8)
> (modify-coding-system-alist 'process "\\*shell\\*\\'" 'utf-8-unix)

Not a good idea, I'm afraid: the UTF-8 support in Emacs 21 left a lot
to be desired.

> Following your advice, I ran
>
> (set-selection-coding-system 'ctext-with-extensions)
>
> and then did the same copy-and-paste again. This got more of the
> characters correct, but not all of them.

Can you show the original text, and then what you have after pasting?
I need to see which characters aren't pasted correctly.

* Re: garbage chars when pasting French chars into emacs
From: ken @ 2012-02-02 20:00 UTC
To: GNU Emacs List

On 02/01/2012 10:55 PM Eli Zaretskii wrote:
> [...]
>> Following your advice, I ran
>>
>> (set-selection-coding-system 'ctext-with-extensions)
>>
>> and then did the same copy-and-paste again. This got more of the
>> characters correct, but not all of them.

Looking again, I see my eyes must have been malfunctioning for the text
pasted into emacs is rendered correctly. However, when I try to save
the text, I'm presented with the minibuffer message "Select coding
system (default iso-2022-jp-2): ". At the same time a second buffer
opens under the first giving the options for coding system:

===================================================================
These default coding systems were tried:
  mule-utf-8-unix iso-latin-1
However, none of them safely encodes the target text.

Select one of the following safe coding systems:
  iso-2022-jp-2 x-ctext iso-2022-7bit raw-text emacs-mule no-conversion
  ctext-no-compositions iso-2022-8bit-ss2 iso-2022-7bit-lock
  iso-2022-7bit-ss2 tibetan-iso-8bit-with-esc thai-tis620-with-esc
  lao-with-esc korean-iso-8bit-with-esc hebrew-iso-8bit-with-esc
  greek-iso-8bit-with-esc iso-latin-9-with-esc iso-latin-8-with-esc
  iso-latin-5-with-esc iso-latin-4-with-esc iso-latin-3-with-esc
  iso-latin-2-with-esc iso-latin-1-with-esc
  in-is13194-devanagari-with-esc cyrillic-iso-8bit-with-esc
  chinese-iso-8bit-with-esc japanese-iso-8bit-with-esc
===================================================================

Entering "utf-8" into the minibuffer, of course, doesn't work. Frankly,
I'd like to save into utf-8, this to avoid having problems with this
same file at a later time when I add more text to it.

> Can you show the original text, and then what you have after pasting?
> I need to see which characters aren't pasted correctly.

The original text must have been edited out somewhere in the thread.
It's at <http://www.wikilivres.info/wiki/Maurice_Merleau-Ponty>, from
"Francais" through the end of the unordered list, ie, the line ending
with "et l’invisible, 1964". Here it is:

==========================================================
Français

    * La Structure du comportement, 1942
    * La Phénoménologie de la perception, 1945
    * Humanisme et terreur, 1947
    * Sens et non-sens, 1948
    * Les Sciences de l’homme et la phénoménologie
    * Les Relations avec autrui chez l’enfant
    * Éloge de la philosophie, leçon inaugurale faite au collège de
      France, le jeudi 15 janvier 1953
    * Signes, 1960
    * L’œil et l’esprit, Gallimard, 1960
    * Le visible et l’invisible, 1964
==========================================================

Thanks again for your help.

* Re: garbage chars when pasting French chars into emacs
From: Eli Zaretskii @ 2012-02-03 7:31 UTC
To: help-gnu-emacs

> Date: Thu, 02 Feb 2012 15:00:37 -0500
> From: ken <gebser@mousecar.com>
>
> Looking again, I see my eyes must have been malfunctioning for the text
> pasted into emacs is rendered correctly. However, when I try to save
> the text, I'm presented with the minibuffer message "Select coding
> system (default iso-2022-jp-2): ". At the same time a second buffer
> opens under the first giving the options for coding system:
>
> ===================================================================
> These default coding systems were tried:
>   mule-utf-8-unix iso-latin-1
> However, none of them safely encodes the target text.
>
> Select one of the following safe coding systems:
>   iso-2022-jp-2 x-ctext iso-2022-7bit raw-text emacs-mule no-conversion
>   ctext-no-compositions iso-2022-8bit-ss2 iso-2022-7bit-lock
>   iso-2022-7bit-ss2 tibetan-iso-8bit-with-esc thai-tis620-with-esc
>   lao-with-esc korean-iso-8bit-with-esc hebrew-iso-8bit-with-esc
>   greek-iso-8bit-with-esc iso-latin-9-with-esc iso-latin-8-with-esc
>   iso-latin-5-with-esc iso-latin-4-with-esc iso-latin-3-with-esc
>   iso-latin-2-with-esc iso-latin-1-with-esc
>   in-is13194-devanagari-with-esc cyrillic-iso-8bit-with-esc
>   chinese-iso-8bit-with-esc japanese-iso-8bit-with-esc
> ===================================================================
>
> Entering "utf-8" into the minibuffer, of course, doesn't work. Frankly,
> I'd like to save into utf-8

You can't, not with Emacs 21. In that version, the same character in
different character sets was treated as 2 different characters. Also,
the mule-utf-8 character set didn't include the Latin-1 characters.

The only suggestion I have is to try iso-latin-1-with-esc (you will
see above that this is one of the possibilities suggested by Emacs),
it should at least produce a Latin-1 encoded file, which will be
easier on you later.

You really need to upgrade your Emacs, if you want to use UTF-8.

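Editorial sketch, not part of the original message: one way to see why plain iso-latin-1 was rejected in the save prompt is to check which characters of the pasted text have no Latin-1 code point at all. The Python snippet below uses a few lines of the sample text quoted in this thread; the helper name `not_in_charset` is hypothetical, and the check is an illustration outside Emacs, not the test Emacs 21 itself performs.

```python
# A few lines of the pasted text quoted in the thread.
sample = (
    "La Phénoménologie de la perception, 1945\n"
    "Éloge de la philosophie, leçon inaugurale\n"
    "L’œil et l’esprit, Gallimard, 1960\n"
)


def not_in_charset(text, encoding):
    """Return the distinct characters of text that encoding cannot represent.

    Hypothetical helper for this illustration; characters are listed in
    order of first appearance.
    """
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


latin1_missing = not_in_charset(sample, "latin-1")
# Print the code points so the result is readable on any terminal.
print([f"U+{ord(c):04X}" for c in latin1_missing])
```

The accented letters (é, É, ç) all fit in Latin-1; only the typographic apostrophe and the œ ligature do not. That would be consistent with the advice above: a "-with-esc" coding system can keep most of the file as Latin-1 and escape into another character set for the few characters that need it.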
* different distro [was: Re: garbage chars when pasting French chars into emacs]
From: ken @ 2012-02-06 18:01 UTC
To: GNU Emacs List

On 02/03/2012 02:31 AM Eli Zaretskii wrote:
> [...]
>> Entering "utf-8" into the minibuffer, of course, doesn't work. Frankly,
>> I'd like to save into utf-8
>
> You can't, not with Emacs 21. In that version, the same character in
> different character sets was treated as 2 different characters. Also,
> the mule-utf-8 character set didn't include the Latin-1 characters.
>
> The only suggestion I have is to try iso-latin-1-with-esc (you will
> see above that this is one of the possibilities suggested by Emacs),
> it should at least produce a Latin-1 encoded file, which will be
> easier on you later.
>
> You really need to upgrade your Emacs, if you want to use UTF-8.

Thanks, Eli, for all the good advice.

I was using v.22 previously, but was hoping to stay with the standard
(CentOS/Red Hat) distribution. This situation and your insight have
convinced me I've got to use a more recent emacs. I'm thinking now of
doing an edgier distro, some rpm/yum based thing like Fedora.
Suggestions appreciated.

Thanks again to all for the great tips.

* Re: different distro [was: Re: garbage chars when pasting French chars into emacs]
From: Peter Dyballa @ 2012-02-06 20:15 UTC
To: gebser; +Cc: GNU Emacs List

On 6.2.2012 at 19:01, ken wrote:

> I'm thinking now of doing an edgier distro, some rpm/yum based thing
> like Fedora. Suggestions appreciated.

Fedora is a good choice, although it bugged me at first that copy &
paste like yours also inserted "control chars".

--
Greetings

  Pete

Encryption, n.:
	A powerful algorithmic encoding technique employed in the
	creation of computer manuals.

* Re: garbage chars when pasting French chars into emacs
From: Philipp Haselwarter @ 2012-02-01 21:29 UTC
To: help-gnu-emacs

Works perfectly for me (GNU Emacs 24).

Do you use emacs in a terminal or in graphic mode?

Is building a newer emacs an option? If it's a single user workstation
or if you're the only emacs user it shouldn't be hard.

--
Philipp Haselwarter