* garbage chars when pasting French chars into emacs
@ 2012-02-01 20:41 ken
2012-02-01 21:23 ` Eli Zaretskii
2012-02-01 21:29 ` garbage chars when pasting French chars into emacs Philipp Haselwarter
0 siblings, 2 replies; 9+ messages in thread
From: ken @ 2012-02-01 20:41 UTC
To: GNU Emacs List
Just to be comprehensive I'll state at the outset that I'm using Linux
(CentOS 5.7), so this is the environment emacs is working in. From a
shell I get this:
$ set|grep -i lang
LANG=en_US.UTF-8
Now I pull up a webpage with some French on it:
<http://www.wikilivres.info/wiki/Maurice_Merleau-Ponty>. Examining the
source code of this page, I see at the top:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
So this page is presented in UTF-8.
Firefox is also set to present pages in UTF-8: View -> Character
Encoding -> UTF-8
But when I copy and paste the text from "Francais" to "invisible, 1964)"
inclusive, many of the characters aren't rendered correctly; I get
"garbage" characters in their stead, e.g., the second-to-last line
appears something like this:
* L^[$(B!G^[$(C)+^[(Bil et l^[$(B!G^[(Besprit, Gallimard, 1960
Other lines are improperly rendered also.
I'd like to fix this. And if possible understand why this doesn't work,
so I might be able to diagnose these problems for myself.
BTW, I'm using GNU Emacs 21.4.1 (i686-redhat-linux-gnu, X toolkit, Xaw3d
scroll bars) of 2011-04-28 on builder10.centos.org
Yes, it's an older version, but it's the latest from the CentOS 5.7
distribution. (Blame Red Hat.)
Thanks for your help.
* Re: garbage chars when pasting French chars into emacs
From: Eli Zaretskii @ 2012-02-01 21:23 UTC
To: help-gnu-emacs
> Date: Wed, 01 Feb 2012 15:41:42 -0500
> From: ken <gebser@mousecar.com>
>
> Just to be comprehensive I'll state at the outset that I'm using Linux
> (CentOS 5.7), so this is the environment emacs is working in. From a
> shell I get this:
>
> $ set|grep -i lang
> LANG=en_US.UTF-8
>
> Now I pull up a webpage with some French on it:
> <http://www.wikilivres.info/wiki/Maurice_Merleau-Ponty>. Examining the
> source code of this page, I see at the top:
>
> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
>
> So this page is presented in UTF-8.
>
> Firefox is also set to present pages in UTF-8: View -> Character
> Encoding -> UTF-8
>
> But when I copy and paste the text from "Francais" to "invisible, 1964)"
> inclusive, many of the characters aren't rendered correctly; I get
> "garbage" characters in their stead, e.g., the second-to-last line
> appears something like this:
>
> * L^[$(B!G^[$(C)+^[(Bil et l^[$(B!G^[(Besprit, Gallimard, 1960
>
> Other lines are improperly rendered also.
>
> I'd like to fix this. And if possible understand why this doesn't work,
> so I might be able to diagnose these problems for myself.
What is your value of selection-coding-system? Try setting it to
something like ctext-with-extensions.
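A note on why the sample line looks the way it does: the "^[" pairs are
most likely the ESC character in caret notation, and what follows each
one looks like an ISO 2022 charset designation, the mechanism that X
compound text (ctext) is built on. Under that assumption, "$(B" switches
to JIS X 0208, "$(C" to KS X 1001, and "(B" back to ASCII, so the
two-character codes in between are ordinary characters that happen to
live in Japanese and Korean charsets. The Python sketch below is not
Emacs code; decode_iso2022 and DESIGNATIONS are names made up for
illustration, and it handles only the three designations seen in the
sample.

```python
# Decode the sample line by hand, treating "^[" as the ESC byte (0x1B).
# Assumption: the escape sequences follow ISO 2022, where
#   ESC $ ( B  selects JIS X 0208 (two bytes per character),
#   ESC $ ( C  selects KS X 1001  (two bytes per character),
#   ESC ( B    selects plain ASCII.

ESC = "\x1b"

# Hypothetical lookup table: designation -> (bytes per char, Python codec).
DESIGNATIONS = {
    "$(B": (2, "euc_jp"),
    "$(C": (2, "euc_kr"),
    "(B": (1, "ascii"),
}


def decode_iso2022(text: str) -> str:
    """Decode a 7-bit ISO 2022 string that uses only the designations above."""
    out = []
    width, codec = 1, "ascii"
    i = 0
    while i < len(text):
        if text[i] == ESC:
            for seq, (w, c) in DESIGNATIONS.items():
                if text.startswith(seq, i + 1):
                    width, codec = w, c
                    i += 1 + len(seq)
                    break
            else:
                raise ValueError(f"unsupported escape sequence at index {i}")
            continue
        chunk = text[i:i + width]
        if width == 2:
            # The EUC codecs store the same two bytes with the high bit set.
            raw = bytes(ord(ch) | 0x80 for ch in chunk)
        else:
            raw = chunk.encode("ascii")
        out.append(raw.decode(codec))
        i += width
    return "".join(out)


garbled = "L^[$(B!G^[$(C)+^[(Bil et l^[$(B!G^[(Besprit, Gallimard, 1960"
decoded = decode_iso2022(garbled.replace("^[", ESC))
print(decoded)
```

If the assumptions hold, the result is the line as it appears on the web
page, with the right single quote taken from the Japanese set and the oe
ligature from the Korean one. That would suggest the characters
themselves survived the transfer and that the problem lies in how the
selection was decoded.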
* Re: garbage chars when pasting French chars into emacs
From: Philipp Haselwarter @ 2012-02-01 21:29 UTC
To: help-gnu-emacs
Works perfectly for me (GNU Emacs 24).
Do you use emacs in a terminal or in graphic mode?
Is building a newer emacs an option? If it's a single user workstation
or if you're the only emacs user it shouldn't be hard.
--
Philipp Haselwarter
* Re: garbage chars when pasting French chars into emacs
From: ken @ 2012-02-02 2:39 UTC
To: GNU Emacs List
On 02/01/2012 04:23 PM Eli Zaretskii wrote:
>> Date: Wed, 01 Feb 2012 15:41:42 -0500
>> From: ken <gebser@mousecar.com>
>>
>> Just to be comprehensive I'll state at the outset that I'm using Linux
>> (CentOS 5.7), so this is the environment emacs is working in. From a
>> shell I get this:
>>
>> $ set|grep -i lang
>> LANG=en_US.UTF-8
>>
>> Now I pull up a webpage with some French on it:
>> <http://www.wikilivres.info/wiki/Maurice_Merleau-Ponty>. Examining the
>> source code of this page, I see at the top:
>>
>> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
>>
>> So this page is presented in UTF-8.
>>
>> Firefox is also set to present pages in UTF-8: View -> Character
>> Encoding -> UTF-8
>>
>> But when I copy and paste the text from "Francais" to "invisible, 1964)"
>> inclusive, many of the characters aren't rendered correctly; I get
>> "garbage" characters in their stead, e.g., the second-to-last line
>> appears something like this:
>>
>> * L^[$(B!G^[$(C)+^[(Bil et l^[$(B!G^[(Besprit, Gallimard, 1960
>>
>> Other lines are improperly rendered also.
>>
>> I'd like to fix this. And if possible understand why this doesn't work,
>> so I might be able to diagnose these problems for myself.
>
> What is your value of selection-coding-system? Try setting it to
> something like ctext-with-extensions.
Thanks, Eli,
Immediately prior to doing the copy-and-paste I ran all of these:
(set-language-environment 'UTF-8)
(set-default-coding-systems 'utf-8)
(setq file-name-coding-system 'utf-8)
(setq default-buffer-file-coding-system 'utf-8)
(setq coding-system-for-write 'utf-8)
(set-keyboard-coding-system 'utf-8)
(set-terminal-coding-system 'utf-8)
(set-clipboard-coding-system 'utf-8)
(set-selection-coding-system 'utf-8)
(prefer-coding-system 'utf-8)
(modify-coding-system-alist 'process "\\*shell\\*\\'" 'utf-8-unix)
Following your advice, I ran
(set-selection-coding-system 'ctext-with-extensions)
and then did the same copy-and-paste again. This got more of the
characters correct, but not all of them. So we're a lot closer.... Got
another suggestion?
* Re: garbage chars when pasting French chars into emacs
From: Eli Zaretskii @ 2012-02-02 3:55 UTC
To: help-gnu-emacs
> Date: Wed, 01 Feb 2012 21:39:22 -0500
> From: ken <gebser@mousecar.com>
>
> > What is your value of selection-coding-system? Try setting it to
> > something like ctext-with-extensions.
>
> Thanks, Eli,
>
> Immediately prior to doing the copy-and-paste I ran all of these:
>
> (set-language-environment 'UTF-8)
> (set-default-coding-systems 'utf-8)
> (setq file-name-coding-system 'utf-8)
> (setq default-buffer-file-coding-system 'utf-8)
> (setq coding-system-for-write 'utf-8)
> (set-keyboard-coding-system 'utf-8)
> (set-terminal-coding-system 'utf-8)
> (set-clipboard-coding-system 'utf-8)
> (set-selection-coding-system 'utf-8)
> (prefer-coding-system 'utf-8)
> (modify-coding-system-alist 'process "\\*shell\\*\\'" 'utf-8-unix)
Not a good idea, I'm afraid: the UTF-8 support in Emacs 21 left a lot
to be desired.
> Following your advice, I ran
>
> (set-selection-coding-system 'ctext-with-extensions)
>
> and then did the same copy-and-paste again. This got more of the
> characters correct, but not all of them.
Can you show the original text, and then what you have after pasting?
I need to see which characters aren't pasted correctly.
* Re: garbage chars when pasting French chars into emacs
From: ken @ 2012-02-02 20:00 UTC
To: GNU Emacs List
On 02/01/2012 10:55 PM Eli Zaretskii wrote:
>> Date: Wed, 01 Feb 2012 21:39:22 -0500
>> From: ken <gebser@mousecar.com>
>>
>>> What is your value of selection-coding-system? Try setting it to
>>> something like ctext-with-extensions.
>> Thanks, Eli,
>>
>> Immediately prior to doing the copy-and-paste I ran all of these:
>>
>> (set-language-environment 'UTF-8)
>> (set-default-coding-systems 'utf-8)
>> (setq file-name-coding-system 'utf-8)
>> (setq default-buffer-file-coding-system 'utf-8)
>> (setq coding-system-for-write 'utf-8)
>> (set-keyboard-coding-system 'utf-8)
>> (set-terminal-coding-system 'utf-8)
>> (set-clipboard-coding-system 'utf-8)
>> (set-selection-coding-system 'utf-8)
>> (prefer-coding-system 'utf-8)
>> (modify-coding-system-alist 'process "\\*shell\\*\\'" 'utf-8-unix)
>
> Not a good idea, I'm afraid: the UTF-8 support in Emacs 21 left a lot
> to be desired.
>
>> Following your advice, I ran
>>
>> (set-selection-coding-system 'ctext-with-extensions)
>>
>> and then did the same copy-and-paste again. This got more of the
>> characters correct, but not all of them.
Looking again, I see my eyes must have been malfunctioning, for the text
pasted into emacs is rendered correctly. However, when I try to save
the text, I'm presented with the minibuffer message "Select coding
system (default iso-2022-jp-2): ". At the same time a second buffer
opens under the first giving the options for coding system:
===================================================================
These default coding systems were tried:
mule-utf-8-unix iso-latin-1
However, none of them safely encodes the target text.
Select one of the following safe coding systems:
iso-2022-jp-2 x-ctext iso-2022-7bit raw-text emacs-mule no-conversion
ctext-no-compositions iso-2022-8bit-ss2 iso-2022-7bit-lock
iso-2022-7bit-ss2 tibetan-iso-8bit-with-esc thai-tis620-with-esc
lao-with-esc korean-iso-8bit-with-esc hebrew-iso-8bit-with-esc
greek-iso-8bit-with-esc iso-latin-9-with-esc iso-latin-8-with-esc
iso-latin-5-with-esc iso-latin-4-with-esc iso-latin-3-with-esc
iso-latin-2-with-esc iso-latin-1-with-esc
in-is13194-devanagari-with-esc cyrillic-iso-8bit-with-esc
chinese-iso-8bit-with-esc japanese-iso-8bit-with-esc
===================================================================
Entering "utf-8" into the minibuffer, of course, doesn't work. Frankly,
I'd like to save in UTF-8 to avoid problems with this same file later,
when I add more text to it.
> Can you show the original text, and then what you have after pasting?
> I need to see which characters aren't pasted correctly.
The original text must have been edited out somewhere in the thread.
It's at <http://www.wikilivres.info/wiki/Maurice_Merleau-Ponty>, from
"Francais" through the end of the unordered list, i.e., the line ending
with "et l’invisible, 1964".
Here it is:
==========================================================
Français
* La Structure du comportement, 1942
* La Phénoménologie de la perception, 1945
* Humanisme et terreur, 1947
* Sens et non-sens, 1948
* Les Sciences de l’homme et la phénoménologie
* Les Relations avec autrui chez l’enfant
* Éloge de la philosophie, leçon inaugurale faite au collège de
France, le jeudi 15 janvier 1953
* Signes, 1960
* L’œil et l’esprit, Gallimard, 1960
* Le visible et l’invisible, 1964
==========================================================
Thanks again for your help.
* Re: garbage chars when pasting French chars into emacs
From: Eli Zaretskii @ 2012-02-03 7:31 UTC
To: help-gnu-emacs
> Date: Thu, 02 Feb 2012 15:00:37 -0500
> From: ken <gebser@mousecar.com>
>
> >> Following your advice, I ran
> >>
> >> (set-selection-coding-system 'ctext-with-extensions)
> >>
> >> and then did the same copy-and-paste again. This got more of the
> >> characters correct, but not all of them.
>
> Looking again, I see my eyes must have been malfunctioning for the text
> pasted into emacs is rendered correctly. However, when I try to save
> the text, I'm presented with the minibuffer message "Select coding
> system (default iso-2022-jp-2): ". At the same time a second buffer
> opens under the first giving the options for coding system:
>
> ===================================================================
> These default coding systems were tried:
> mule-utf-8-unix iso-latin-1
> However, none of them safely encodes the target text.
>
> Select one of the following safe coding systems:
> iso-2022-jp-2 x-ctext iso-2022-7bit raw-text emacs-mule no-conversion
> ctext-no-compositions iso-2022-8bit-ss2 iso-2022-7bit-lock
> iso-2022-7bit-ss2 tibetan-iso-8bit-with-esc thai-tis620-with-esc
> lao-with-esc korean-iso-8bit-with-esc hebrew-iso-8bit-with-esc
> greek-iso-8bit-with-esc iso-latin-9-with-esc iso-latin-8-with-esc
> iso-latin-5-with-esc iso-latin-4-with-esc iso-latin-3-with-esc
> iso-latin-2-with-esc iso-latin-1-with-esc
> in-is13194-devanagari-with-esc cyrillic-iso-8bit-with-esc
> chinese-iso-8bit-with-esc japanese-iso-8bit-with-esc
> ===================================================================
>
> Entering "utf-8" into the minibuffer, of course, doesn't work. Frankly,
> I'd like to save into utf-8
You can't, not with Emacs 21. In that version, the same character in
different character sets was treated as 2 different characters. Also,
the mule-utf-8 character set didn't include the Latin-1 characters.
The only suggestion I have is to try iso-latin-1-with-esc (you will
see above that this is one of the possibilities suggested by Emacs);
it should at least produce a Latin-1 encoded file, which will be
easier on you later.
You really need to upgrade your Emacs, if you want to use UTF-8.
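Independently of the Emacs 21 limitations described above, two
characters in the sample text have no Latin-1 code point at all, which
is consistent with iso-latin-1 appearing among the "tried" coding
systems but not among the "safe" ones. The Python sketch below
illustrates this with Python's own codecs standing in for the Emacs
coding systems (an assumption made for illustration; unencodable is a
hypothetical helper, and the sample is an excerpt of the list posted
earlier in the thread).

```python
# Excerpt of the sample text, as it appears on the web page.
sample = """Français
* La Phénoménologie de la perception, 1945
* Les Sciences de l’homme et la phénoménologie
* Éloge de la philosophie, leçon inaugurale faite au collège de France
* L’œil et l’esprit, Gallimard, 1960
"""


def unencodable(text, codec):
    """Return the sorted distinct characters of text that codec cannot encode."""
    bad = set()
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            bad.add(ch)
    return sorted(bad)


# Compare Latin-1, Latin-9 and UTF-8 on the same text.
for codec in ("latin-1", "iso8859-15", "utf-8"):
    report = [f"U+{ord(c):04X} {c}" for c in unencodable(sample, codec)]
    print(codec, report)
```

The accented letters (ç, é, É, è) all fit in Latin-1; the typographic
apostrophe (U+2019) and the oe ligature (U+0153) do not. Latin-9 adds
the ligature but still lacks the apostrophe, so of the three only UTF-8
covers the whole text. This may be why Emacs offers only the "-with-esc"
variants of the Latin coding systems here.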
* different distro [was: Re: garbage chars when pasting French chars into emacs]
From: ken @ 2012-02-06 18:01 UTC
To: GNU Emacs List
On 02/03/2012 02:31 AM Eli Zaretskii wrote:
>> Date: Thu, 02 Feb 2012 15:00:37 -0500
>> From: ken <gebser@mousecar.com>
>>
>>>> ....
>>
>> Select one of the following safe coding systems:
>> iso-2022-jp-2 x-ctext iso-2022-7bit raw-text emacs-mule no-conversion
>> ctext-no-compositions iso-2022-8bit-ss2 iso-2022-7bit-lock
>> iso-2022-7bit-ss2 tibetan-iso-8bit-with-esc thai-tis620-with-esc
>> lao-with-esc korean-iso-8bit-with-esc hebrew-iso-8bit-with-esc
>> greek-iso-8bit-with-esc iso-latin-9-with-esc iso-latin-8-with-esc
>> iso-latin-5-with-esc iso-latin-4-with-esc iso-latin-3-with-esc
>> iso-latin-2-with-esc iso-latin-1-with-esc
>> in-is13194-devanagari-with-esc cyrillic-iso-8bit-with-esc
>> chinese-iso-8bit-with-esc japanese-iso-8bit-with-esc
>> ===================================================================
>>
>> Entering "utf-8" into the minibuffer, of course, doesn't work. Frankly,
>> I'd like to save into utf-8
>
> You can't, not with Emacs 21. In that version, the same character in
> different character sets was treated as 2 different characters. Also,
> the mule-utf-8 character set didn't include the Latin-1 characters.
>
> The only suggestion I have is to try iso-latin-1-with-esc (you will
> see above that this is one of the possibilities suggested by Emacs),
> it should at least produce a Latin-1 encoded file, which will be
> easier on you later.
>
> You really need to upgrade your Emacs, if you want to use UTF-8.
Thanks, Eli, for all the good advice.
I was using v.22 previously, but was hoping to stay with the standard
(CentOS/Red Hat) distribution. This situation and your insight have
convinced me I've got to use a more recent emacs. I'm thinking now of
doing an edgier distro, some rpm/yum based thing like Fedora.
Suggestions appreciated.
Thanks again to all for the great tips.
* Re: different distro [was: Re: garbage chars when pasting French chars into emacs]
From: Peter Dyballa @ 2012-02-06 20:15 UTC
To: gebser; +Cc: GNU Emacs List
On 6.2.2012 at 19:01, ken wrote:
> I'm thinking now of doing an edgier distro, some rpm/yum based thing like Fedora. Suggestions appreciated.
Fedora is a good choice, although at first I was bugged when a copy & paste like yours also inserted "control chars".
--
Greetings
Pete
Encryption, n.:
A powerful algorithmic encoding technique employed in the creation of computer manuals.