* two utf-8 questions
From: B. T. Raven @ 2006-01-28 19:30 UTC (permalink / raw)
1)
Even though the following is in my .emacs:
(setq default-buffer-file-coding-system 'utf-8)
when I type 'C-x RET f' I see the prompt
"Coding system for visited file (default, nil)" instead of "(default, utf-8)".
There are contexts where I need to specify utf-8 here, or else the file
won't be saved in the correct format. This seems to be true even if the
file header ;; -*- coding: utf-8 -*- is present. Why is this?
2)
Has it been pretty much decided that copypasting Unicode using the
clipboard between emacs and MS apps is impossible for OS versions earlier
than W2000? Even though most of what I read on emacs-devel is Greek to me,
I try to glean whatever I can from postings relating to w32. After many
months of lurking there I am beginning to suspect that some of the
following settings are particularly inappropriate:
(set-language-environment 'UTF-8)
(set-default-coding-systems 'utf-8)
(setq file-name-coding-system 'utf-8)
(setq default-buffer-file-coding-system 'utf-8)
(setq coding-system-for-write 'utf-8)
(set-keyboard-coding-system 'utf-8)
(set-terminal-coding-system 'utf-8)
(set-clipboard-coding-system 'utf-8)
(set-selection-coding-system 'utf-8)
(prefer-coding-system 'utf-8)
(modify-coding-system-alist 'process
"[cC][mM][dD][pP][rR][oO][xX][yY]" 'utf-8-dos)
What could the clipboard and selection variables be set to in order to
give me a better chance of copypasting Unicode successfully?
Thanks,
Ed
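Before changing any of the settings listed above, it may help to see which coding systems Emacs is actually using. The following is only a sketch and not part of the original post: the function name my-report-coding-settings is invented for illustration, and the variables it reads are assumed to exist in the Emacs version at hand.

```elisp
;; Hypothetical helper (name invented for this sketch): show the coding
;; systems currently in effect for this buffer, for new buffers, for the
;; clipboard/selection, and for the locale.
(defun my-report-coding-settings ()
  "Display the coding-related settings discussed in this thread."
  (interactive)
  (message "buffer: %S | default: %S | selection: %S | locale: %S | for-write: %S"
           buffer-file-coding-system
           (default-value 'buffer-file-coding-system)
           ;; Guarded because not every build defines this variable.
           (and (boundp 'selection-coding-system) selection-coding-system)
           locale-coding-system
           coding-system-for-write))
```

Run it with M-x my-report-coding-settings in the buffer that misbehaves. A non-nil "for-write" value means a global override is in place; the Emacs Lisp manual advises binding coding-system-for-write with let around a single operation rather than setting it with setq.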
* Re: two utf-8 questions
From: Peter Dyballa @ 2006-01-28 23:13 UTC (permalink / raw)
Cc: help-gnu-emacs
On 28.01.2006 at 19:30, B. T. Raven wrote:
> 1)
>
> Even though the following is in my .emacs:
>
> (setq default-buffer-file-coding-system 'utf-8)
>
> when I type 'C-x ret f' I see the prompt:
>
> Coding system for visited file (default, nil) instead of (default, utf-8)
>
Maybe you also need: (prefer-coding-system 'utf-8)
>
>
> 2)
>
> Has it been pretty much decided that copypasting Unicode using the
> clipboard between emacs and MS apps is impossible for OS versions
> earlier than W2000?
I don't think so, but it's natural to assume that every Losedows from
the last millennium has no idea of Unicode.
> After many
> months of lurking there I am beginning to suspect that some of the
> following settings are particularly inappropriate:
>
> (set-language-environment 'UTF-8)
> (set-default-coding-systems 'utf-8)
> (setq file-name-coding-system 'utf-8)
> (setq default-buffer-file-coding-system 'utf-8)
> (setq coding-system-for-write 'utf-8)
> (set-keyboard-coding-system 'utf-8)
> (set-terminal-coding-system 'utf-8)
> (set-clipboard-coding-system 'utf-8)
> (set-selection-coding-system 'utf-8)
> (prefer-coding-system 'utf-8)
> (modify-coding-system-alist 'process
> "[cC][mM][dD][pP][rR][oO][xX][yY]" 'utf-8-dos)
>
> What could the clipboard and selection variables be set to in order to
> give me a better chance of copypasting Unicode successfully?
>
I would think it's some CP125x code page (1250: Central European,
1251: Cyrillic, 1252: Western European, often called "ANSI" and close
to ISO 8859-1, i.e. ISO Latin-1). And it's probably best to set this as
a default -- or how do you print or email a UTF-8 text from this Losedows?
--
Greetings
Pete
When confronted with actual numbers, a mathematician is at a loss.
(Steffen Hokland)
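The code-page suggestion above could be tried roughly as follows. This is a sketch under assumptions: code page 1252 is assumed to be the right one for a Western European Windows setup, and which coding-system names are defined depends on the Emacs version, so each candidate is checked with coding-system-p before use.

```elisp
;; Pick the first Windows code-page coding system this Emacs knows about,
;; falling back to iso-8859-1.  The candidate names are assumptions; the
;; `coding-system-p' checks keep the form from failing on a missing one.
(let ((cs (cond ((coding-system-p 'windows-1252) 'windows-1252)
                ((coding-system-p 'cp1252) 'cp1252)
                (t 'iso-8859-1))))
  (set-selection-coding-system cs)
  ;; Return the chosen coding system so it shows in the echo area.
  cs)
```

A single-byte code page can only carry the characters it contains, so this does not make arbitrary Unicode survive the clipboard; it only widens the set slightly compared with plain ISO 8859-1.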
* Re: two utf-8 questions
From: B. T. Raven @ 2006-01-29 5:06 UTC (permalink / raw)
"Peter Dyballa" <Peter_Dyballa@Web.DE> wrote in message
news:mailman.81.1138502554.3044.help-gnu-emacs@gnu.org...
>
> On 28.01.2006 at 19:30, B. T. Raven wrote:
>
> > 1)
> >
> > Even though the following is in my .emacs:
> >
> > (setq default-buffer-file-coding-system 'utf-8)
> >
> > when I type 'C-x ret f' I see the prompt:
> >
> > Coding system for visited file (default, nil) instead of (default, utf-8)
> >
>
> Maybe you need a: (prefer-coding-system 'utf-8)
I already have that (see below). If I respond to the prompt with a bare
carriage return, the '-u' in the mode line changes to '--' and the file
won't be saved with utf-8 encoding. It seems that shouldn't happen.
>
> >
> >
> > 2)
> >
> > Has it been pretty much decided that copypasting Unicode using the
> > clipboard between emacs and MS apps is impossible for OS versions
> > earlier than W2000?
>
> I don't think so, but it's natural to assume that every Losedows from
> the last millennium has no idea of Unicode.
>
> > After many
> > months of lurking there I am beginning to suspect that some of the
> > following settings are particularly inappropriate:
> >
> > (set-language-environment 'UTF-8)
> > (set-default-coding-systems 'utf-8)
> > (setq file-name-coding-system 'utf-8)
> > (setq default-buffer-file-coding-system 'utf-8)
> > (setq coding-system-for-write 'utf-8)
> > (set-keyboard-coding-system 'utf-8)
> > (set-terminal-coding-system 'utf-8)
> > (set-clipboard-coding-system 'utf-8)
> > (set-selection-coding-system 'utf-8)
> > (prefer-coding-system 'utf-8)
> > (modify-coding-system-alist 'process
> > "[cC][mM][dD][pP][rR][oO][xX][yY]" 'utf-8-dos)
> >
> > What could the clipboard and selection variables be set to in order to
> > give me a better chance of copypasting Unicode successfully?
> >
>
> I would think it's some CP125x code page (1250: Central European,
> 1251: Cyrillic, 1252: Western European, often called "ANSI" and close
> to ISO 8859-1, i.e. ISO Latin-1). And it's probably best to set this as
> a default -- or how do you print or email a UTF-8 text from this Losedows?
By filtering files saved as utf-8 through OpenOffice. I have changed
clipboard and selection variables back to 'iso-8859-1. That way I can
copypaste at least the more common Western European diacritics. I have the
same ver. 21.3 on a w2000 laptop but I can't transfer arbitrary utf-8
encoded characters via the clipboard there either.
>
> --
> Greetings
>
> Pete
>
> When confronted with actual numbers, a mathematician is at a loss.
> (Steffen Hokland)
This happens to teen-age boys with girls, too.
Thanks,
Ed.
* Re: two utf-8 questions
From: François Gannaz @ 2006-01-29 9:50 UTC (permalink / raw)
On Sat 28 Jan 19:30, B. T. Raven wrote:
> 1)
>
> Even though the following is in my .emacs:
>
> (setq default-buffer-file-coding-system 'utf-8)
>
> when I type 'C-x ret f' I see the prompt:
>
> Coding system for visited file (default, nil) instead of (default, utf-8)
>
> There are contexts where I need to specify utf-8 here, or else the file
> won't be saved in the correct format. This seems to be true even if the
> file header ;; -*- coding: utf-8 -*- is present. Why is this?
Emacs' support for utf-8 is still a bit mysterious to me. You might try a
few other settings:
(setq-default enable-multibyte-characters t)
(setq locale-coding-system 'utf-8)
(setq coding-category-list '(coding-category-utf-8
coding-category-iso-8-1 coding-category-iso-8-2
coding-category-iso-7-tight coding-category-iso-7
coding-category-iso-7-else coding-category-iso-8-else
coding-category-sjis coding-category-utf-16-le coding-category-utf-16-be
coding-category-emacs-mule coding-category-raw-text coding-category-big5
coding-category-ccl coding-category-binary))
Have you also tried to see what happens without your init file
(run "emacs -q"), and then evaluated your utf-8 settings? There might be
some package you use that breaks the Unicode support.
> 2)
>
> Has it been pretty much decided that copypasting Unicode using the
> clipboard between emacs and MS apps is impossible for OS versions earlier
> than W2000? Even though most of what I read on emacs-devel is Greek to me,
> I try to glean whatever I can from postings relating to w32. After many
> months of lurking there I am beginning to suspect that some of the
> following settings are particularly inappropriate:
>
> (set-language-environment 'UTF-8)
> (set-default-coding-systems 'utf-8)
> (setq file-name-coding-system 'utf-8)
> (setq default-buffer-file-coding-system 'utf-8)
> (setq coding-system-for-write 'utf-8)
> (set-keyboard-coding-system 'utf-8)
> (set-terminal-coding-system 'utf-8)
> (set-clipboard-coding-system 'utf-8)
> (set-selection-coding-system 'utf-8)
> (prefer-coding-system 'utf-8)
> (modify-coding-system-alist 'process
> "[cC][mM][dD][pP][rR][oO][xX][yY]" 'utf-8-dos)
>
> What could the clipboard and selection variables be set to in order to
> give me a better chance of copypasting Unicode successfully?
I don't know the Windows world well, but I'm sure that Win98 and earlier
don't handle Unicode. Of course, some apps can use utf-8, but the system
itself won't be able to handle it. For other versions like WinME, I have
no clue. Is it even possible to copypaste Unicode between MS apps?
--
François Gannaz
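The "emacs -q" test suggested above might look like this. It is a sketch, assuming only the two settings shown are needed to reproduce the problem; the forms are meant to be evaluated by hand in the *scratch* buffer of a session started with emacs -q.

```elisp
;; Evaluate in *scratch* after starting Emacs with `emacs -q', so that
;; nothing from the init file or from extra packages is involved.
(set-language-environment "UTF-8")
(prefer-coding-system 'utf-8)

;; Inspect the language environment and the coding system a new file
;; buffer would get by default.
(list current-language-environment
      (default-value 'buffer-file-coding-system))
```

If files save correctly in that session, adding the remaining init-file settings back a few at a time should narrow down which one interferes.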
* Re: two utf-8 questions
From: Peter Dyballa @ 2006-01-29 10:00 UTC (permalink / raw)
Cc: help-gnu-emacs
On 29.01.2006 at 05:06, B. T. Raven wrote:
>>> when I type 'C-x ret f' I see the prompt:
>>>
>>> Coding system for visited file (default, nil) instead of (default, utf-8)
>>>
>>
>> Maybe you need a: (prefer-coding-system 'utf-8)
>
> I already have that (see below). If I respond to the prompt with a bare
> carriage return the '-u' in the mode line changes to -- and the file won't
> be saved with utf-8 encoding. It seems that shouldn't happen.
Once you've set GNU Emacs to handle file contents as UTF-8, you should
not need C-x RET f etc. for your files. Emacs should stick to this
encoding automatically when you open them with v or e in dired-mode
or with C-x C-f. For me a header line like
;;; -*- mode: Text; coding: utf-8; -*-
works fine and converts even stubborn file contents. Another way is
to use at the file's end:
%
% Local Variables:
% mode: LaTeX
% fill-column: 160
% coding: utf-8-emacs-dos
% End:
%
%%
--
Greetings
Pete
"Computers are good at following instructions,
but not at reading your mind."
D. E. Knuth, The TeXbook, Addison-Wesley 1984, 1986, 1996, p. 9
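For an Emacs Lisp file, the two ways of tagging the encoding described above could look like the sketch below. The file name example.el and the variable example-greeting are hypothetical; the tag name used in both places is coding, which is the one the Emacs manual documents for this purpose.

```elisp
;;; example.el --- hypothetical file name  -*- coding: utf-8 -*-

;; A string with non-ASCII characters, stored in the file as UTF-8.
(defvar example-greeting "Grüße"
  "Sample text used only to give the file some non-ASCII content.")

;; Alternatively, leave out the cookie on the first line and end the
;; file with a local variables list instead:
;;
;; Local Variables:
;; coding: utf-8
;; End:
```

Either form is read when the file is visited; whether it is also honoured on saving depends on the Emacs version in use.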
* Re: two utf-8 questions
From: Stefan Monnier @ 2006-02-03 23:39 UTC (permalink / raw)
> 1)
> Even though the following is in my .emacs:
> (setq default-buffer-file-coding-system 'utf-8)
> when I type 'C-x ret f' I see the prompt:
> Coding system for visited file (default, nil) instead of (default, utf-8)
I don't think it has ever said anything other than nil.
> There are contexts where I need to specify utf-8 here, or else the file
> won't be saved in the correct format. This seems to be true even if the
> file header ;; -*- coding: utf-8 -*- is present. Why is this?
This varies a lot between different versions of Emacs (because it's still
being continuously improved). Also it depends on many more details.
So it's hard to give a useful answer without knowing the version of Emacs
you're using, the locale under which you're running, how much you've munged
the default Mule setup, and then how you've loaded the file, what operations
you've done in between and which command gave you the above problem.
As for obeying the `coding' tag when saving: this has only been added as
a new feature recently in Emacs-CVS. In earlier code, the `coding' tag is
only used when opening a file (and it's checked for consistency when
saving).
Stefan
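Since older versions consult the coding tag only when a file is opened, one way to be explicit at save time is to set the buffer's coding system first. A minimal sketch, with the command name my-save-as-utf-8 invented for illustration:

```elisp
;; Hypothetical command: the programmatic equivalent of typing
;; `C-x RET f utf-8 RET' followed by `C-x C-s'.
(defun my-save-as-utf-8 ()
  "Mark the current buffer to be saved as UTF-8, then save it."
  (interactive)
  ;; Sets `buffer-file-coding-system' for this buffer only.
  (set-buffer-file-coding-system 'utf-8)
  (save-buffer))
```

This changes only the buffer it is run in; it does not alter the default for other buffers.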
* Re: two utf-8 questions
From: B. T. Raven @ 2006-02-05 18:12 UTC (permalink / raw)
"Stefan Monnier" <monnier@iro.umontreal.ca> wrote in message
news:87slr0kp0b.fsf-monnier+gnu.emacs.help@gnu.org...
> > 1)
> > Even though the following is in my .emacs:
>
> > (setq default-buffer-file-coding-system 'utf-8)
>
> > when I type 'C-x ret f' I see the prompt:
>
> > Coding system for visited file (default, nil) instead of (default, utf-8)
>
> I think it never said anything else than nil.
Then it appears that there is no way to set the default (at least on my
w32 build, GNU Emacs 21.3.1 (i386-mingw-windows98.2222) of 2004-03-10 on
NYAUMO)
>
> > There are contexts where I need to specify utf-8 here, or else the file
> > won't be saved in the correct format. This seems to be true even if the
> > file header ;; -*- coding: utf-8 -*- is present. Why is this?
>
> This varies a lot between different versions of Emacs (because it's still
> being continuously improved). Also it depends on many more details.
> So it's hard to give a useful answer without knowing the version of Emacs
> you're using, the locale under which you're running, how much you've munged
> the default Mule setup, and then how you've loaded the file, what operations
> you've done in between and which command gave you the above problem.
>
> As for obeying the `coding' tag when saving: this has only been added as
> a new feature recently in Emacs-CVS. In earlier code, the `coding' tag is
> only used when opening a file (and it's checked for consistency when
> saving).
>
>
> Stefan
As far as I know, these are the only settings in my .emacs that might have
munged the default setup:
(set-language-environment 'UTF-8)
(set-default-coding-systems 'utf-8)
(setq file-name-coding-system 'utf-8)
(setq default-buffer-file-coding-system 'utf-8)
(setq coding-system-for-write 'utf-8)
(set-keyboard-coding-system 'utf-8)
(set-terminal-coding-system 'utf-8)
(set-clipboard-coding-system 'iso-8859-1)
(set-selection-coding-system 'iso-8859-1)
(prefer-coding-system 'utf-8)
(modify-coding-system-alist 'process
"[cC][mM][dD][pP][rR][oO][xX][yY]" 'utf-8-dos)
and via custom:
;; may be synonymous with set-language-environment above
'(current-language-environment "UTF-8")
'(unify-8859-on-encoding-mode t nil (ucs-tables))
Apparently the clipboard and selection coding systems affect the
locale-coding-system value which shows as iso-8859-1. Still, a buffer that
correctly shows unicode glyphs and which has -u in the mode line should be
saved with the utf-8 encoding. When I issue C-x C-s, C-x k, and then C-x
C-f with the same file name, the utf-8 encoding has been lost. I would be
satisfied with a setup which could handle only the utf-8 encoding if that
were possible by updating some of my *.el files without installing the
newer CVS binaries. But I suppose that's not advisable. Thanks for your
input anyway.
Ed
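The save, kill, and revisit sequence described above can be reduced to a small round-trip check. This is a sketch, not a diagnosis: the sample text and the temporary file prefix are arbitrary, and the coding-system variables are bound with let for just these operations rather than set globally.

```elisp
;; Write a short non-ASCII string to a temporary file as UTF-8, read it
;; back as UTF-8, and compare the result with the original string.
(let ((file (make-temp-file "utf8-roundtrip"))
      (text "naïve café"))
  ;; Force UTF-8 for this one write only.
  (let ((coding-system-for-write 'utf-8))
    (with-temp-file file
      (insert text)))
  (prog1
      ;; Force UTF-8 for this one read only, then compare.
      (let ((coding-system-for-read 'utf-8))
        (with-temp-buffer
          (insert-file-contents file)
          (string= (buffer-string) text)))
    ;; Clean up the temporary file whatever the comparison gave.
    (delete-file file)))
```

A non-nil result suggests that encoding and decoding work when UTF-8 is requested explicitly, which would point at coding-system detection or the init-file settings rather than at the UTF-8 support itself.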
* Re: two utf-8 questions
From: Stefan Monnier @ 2006-02-06 4:14 UTC (permalink / raw)
> Apparently the clipboard and selection coding systems affect the
> locale-coding-system value which shows as iso-8859-1. Still, a buffer
> that correctly shows unicode glyphs and which has -u in the mode line
> should be saved with the utf-8 encoding.
The -u just says "this file will be saved in utf-8 if all goes well".
But if the buffer contains chars which Emacs doesn't know how to save in
utf-8, it will use something else (and update the mode line
correspondingly).
> When I issue C-x C-s,
What happened at that step: did it complain? Did the "-u" part change?
If not, then the file was properly saved using the utf-8 encoding.
> C-x k, and then C-x C-f with the same file name, the utf-8 encoding has
> been lost.
You mean that utf-8 encoding in the file is not properly recognized and the
non-ascii chars appear as "garbage"?
That sounds like a bug I don't know about, or some special case due to
special unencodable bytes in your original buffer. Give us more hard data
(file name, file contents, ...).
And start by removing most of your .emacs munging.
Just keep: (set-language-environment 'UTF-8)
Stefan
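Put together, the advice above amounts to a one-line init file plus a way to look for characters that cannot be saved as UTF-8. The sketch below assumes that unencodable-char-position is available, which may not hold for a release as old as 21.3; the command name my-first-non-utf-8-char is invented for illustration.

```elisp
;; Minimal init, as suggested above.
(set-language-environment "UTF-8")

;; Hypothetical command: move point to the first character in the buffer
;; that cannot be encoded in UTF-8, if there is one.
(defun my-first-non-utf-8-char ()
  "Go to the first character that utf-8 cannot encode, if any."
  (interactive)
  (let ((pos (unencodable-char-position (point-min) (point-max) 'utf-8)))
    (if pos
        (progn
          (goto-char pos)
          (message "Character at position %d cannot be encoded in utf-8" pos))
      (message "Every character in this buffer can be encoded in utf-8"))))
```

If the command reports a position, that character is a likely reason for the mode line changing from -u on save.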