* Opening file in UTF-8 mode automatically
From: spamfilteraccount @ 2007-11-24 16:38 UTC
To: help-gnu-emacs
I need to edit some UTF-8 files and it's very annoying emacs doesn't
detect it automatically (I have to reopen them as utf-8 manually) and
sometimes I notice it only after I already edited and saved the file
which messes up the formatting.
I tried prefer-coding-system utf-8, but it didn't help.
I can't put lisp code into the files, because they are data files.
Is there a definitive way to do it? The BOM is at the beginning of the
files, so Emacs could detect it automatically.
It's emacs 22.
* Re: Opening file in UTF-8 mode automatically
From: Peter Dyballa @ 2007-11-24 18:26 UTC
To: PT; +Cc: help-gnu-emacs
On 24.11.2007 at 17:38, spamfilteraccount wrote:
> I tried prefer-coding-system utf-8, but it didn't help.
It might help to set LANG and LC_CTYPE to a UTF-8 value. Another
step would be to avoid set-language-environment. Both measures,
together with (prefer-coding-system 'utf-8), work fine for me.
Finally, if the files' extension is distinctive enough, you could use
something like this to bind the file name extension to a particular
file encoding:
(add-to-list 'file-coding-system-alist '("\\.tex\\'" . utf-8))
This can be done temporarily.
If your files really use a particular mark at the beginning (EF BB
BF), you could augment magic-mode-alist, but this does not directly
set the file's encoding.
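A minimal sketch of both variants, assuming a hypothetical ".dat" extension for the data files (the thread never names the real one) and a made-up command name, my-find-file-utf-8. The first form is meant for the init file; the second forces UTF-8 for a single visit only:

```elisp
;; Permanent: decode every file ending in .dat (hypothetical extension) as UTF-8.
(modify-coding-system-alist 'file "\\.dat\\'" 'utf-8)

;; Temporary: force UTF-8 for one visit, leaving the global settings alone.
;; `my-find-file-utf-8' is an illustrative name, not part of Emacs.
(defun my-find-file-utf-8 (file)
  "Visit FILE, decoding it as UTF-8 regardless of what auto-detection says."
  (interactive "fFind file (as UTF-8): ")
  (let ((coding-system-for-read 'utf-8))
    (find-file file)))
```

Note that the second form only affects files that are not already visited; an existing buffer is simply reselected.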
--
Greetings
Pete
What has been understood no longer exists. (Paul Eluard)
* Re: Opening file in UTF-8 mode automatically
From: Reiner Steib @ 2007-11-24 18:38 UTC
To: help-gnu-emacs
On Sat, Nov 24 2007, spamfilteraccount@gmail.com wrote:
> I need to edit some UTF-8 files and it's very annoying emacs doesn't
> detect it automatically (I have to reopen them as utf-8 manually) and
> sometimes I notice it only after I already edited and saved the file
> which messes up the formatting.
>
> I tried prefer-coding-system utf-8, but it didn't help.
>
> I can't put lisp code into the files, because they are data files.
>
> Is there a definitive way to do it? The BOM is at the beginning of the
> files, so Emacs could detect it automatically.
>
> It's emacs 22.
`auto-coding-regexp-alist' in Emacs 22 already has an entry for the
BOM: ("\\`\xEF\xBB\xBF" . utf-8).
Either you have removed this entry or there is a bug. If the latter,
please report it as a bug (M-x report-emacs-bug RET) along with a
small gzipped sample document and a recipe starting from "emacs -Q" to
reproduce the problem.
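To check whether the entry is still in place, something like the following can be evaluated with M-: or in *scratch*. It is only a sketch: it matches every regexp in the alist against the three BOM bytes and returns the entries that hit. The coding system named in the result may differ between Emacs versions.

```elisp
;; Collect every (REGEXP . CODING-SYSTEM) entry whose regexp matches a UTF-8 BOM.
;; An empty result (nil) means the BOM entry has been removed from the alist.
(let ((bom "\xEF\xBB\xBF"))
  (delq nil
        (mapcar (lambda (entry)
                  (and (string-match (car entry) bom) entry))
                auto-coding-regexp-alist)))
```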
Bye, Reiner.
--
,,,
(o o)
---ooO-(_)-Ooo--- | PGP key available | http://rsteib.home.pages.de/
* Re: Opening file in UTF-8 mode automatically
From: Eli Zaretskii @ 2007-11-24 20:07 UTC
To: help-gnu-emacs
> From: "spamfilteraccount@gmail.com" <spamfilteraccount@gmail.com>
> Date: Sat, 24 Nov 2007 08:38:20 -0800 (PST)
>
> I need to edit some UTF-8 files and it's very annoying emacs doesn't
> detect it automatically (I have to reopen them as utf-8 manually) and
> sometimes I notice it only after I already edited and saved the file
> which messes up the formatting.
What is your locale?
* Re: Opening file in UTF-8 mode automatically
From: Eli Zaretskii @ 2007-11-24 20:09 UTC
To: help-gnu-emacs
> From: Peter Dyballa <Peter_Dyballa@Web.DE>
> Date: Sat, 24 Nov 2007 19:26:15 +0100
> Cc: help-gnu-emacs@gnu.org
>
> On 24.11.2007 at 17:38, spamfilteraccount wrote:
>
> > I tried prefer-coding-system utf-8, but it didn't help.
>
>
> It might help to set LANG and LC_CTYPE to some UTF-8 value.
On POSIX systems, perhaps. On Windows, the results of setting these
might be different, as the locale settings are not done via
environment variables there.
* Re: Opening file in UTF-8 mode automatically
From: spamfilteraccount @ 2007-11-25 5:49 UTC
To: help-gnu-emacs
On Nov 24, 7:38 pm, Reiner Steib <reinersteib+gm...@imap.cc> wrote:
>
> `auto-coding-regexp-alist' in Emacs 22 already has an entry for the
> BOM: ("\\`\xEF\xBB\xBF" . utf-8).
>
Yep, it's there, I checked it, but it doesn't trigger opening the file
in utf-8 mode.
Is there any other setting that auto-coding-regexp-alist needs in
order to work, or should this single setting be enough? I don't have
anything else set.
I checked mule.el (seemingly the only .el file where this variable is
used), but it's not clear to me what should trigger it. Apparently
it's not triggered by find-file-hook or anything similar that I would
expect.
* Re: Opening file in UTF-8 mode automatically
From: spamfilteraccount @ 2007-11-25 5:51 UTC
To: help-gnu-emacs
On Nov 24, 9:07 pm, Eli Zaretskii <e...@gnu.org> wrote:
> What is your locale?
None set. It's an English Windows. I'd like to solve this problem from
emacs alone.
* Re: Opening file in UTF-8 mode automatically
From: Reiner Steib @ 2007-11-25 11:33 UTC
To: help-gnu-emacs
On Sun, Nov 25 2007, spamfilteraccount@gmail.com wrote:
> On Nov 24, 7:38 pm, Reiner Steib <reinersteib+gm...@imap.cc> wrote:
>> `auto-coding-regexp-alist' in Emacs 22 already has an entry for the
>> BOM: ("\\`\xEF\xBB\xBF" . utf-8).
>
> Yep, it's there, I checked it, but it doesn't trigger opening the file
> in utf-8 mode.
Works for me:
$ echo -e '\xEF\xBB\xBF BOM test' > /tmp/BOM.txt
$ cvs-EMACS_22_BASE/i686$ LC_ALL=C ./src/emacs -Q /tmp/BOM.txt
==> u -- mule-utf-8-unix in the mode line.
So we need a (small) sample file[1] and a recipe to reproduce the
problem starting from emacs -Q. `M-x report-emacs-bug RET'
additionally provides useful information for the developers.
[1] E.g. the first few lines of your text file.
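The same test can be driven from Lisp, which may be handy on Windows where the shell line above is not available. A sketch, assuming only a writable temporary directory; the file name prefix "bom-test" is arbitrary:

```elisp
(let ((file (make-temp-file "bom-test"))
      ;; Write the bytes exactly as inserted, without any encoding step.
      (coding-system-for-write 'no-conversion))
  (with-temp-file file
    (set-buffer-multibyte nil)          ; keep \xEF\xBB\xBF as three raw bytes
    (insert "\xEF\xBB\xBF BOM test\n"))
  (with-current-buffer (find-file-noselect file)
    ;; A utf-8 variant here suggests that BOM detection works in this session.
    (message "Detected coding system: %s" buffer-file-coding-system)))
```

Running it once under "emacs -Q" and once in the normal session should show whether the init files make a difference. The temporary file is left behind and can be deleted afterwards.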
Bye, Reiner.
* Re: Opening file in UTF-8 mode automatically
From: Eli Zaretskii @ 2007-11-25 20:59 UTC
To: help-gnu-emacs
> From: "spamfilteraccount@gmail.com" <spamfilteraccount@gmail.com>
> Date: Sat, 24 Nov 2007 21:51:18 -0800 (PST)
>
> On Nov 24, 9:07 pm, Eli Zaretskii <e...@gnu.org> wrote:
>
> > What is your locale?
>
> None set.
That's not possible on Windows, AFAIK.
What does Emacs produce under "Important settings" in the *mail*
buffer if you type "M-x report-emacs-bug RET foo RET"?
* Re: Opening file in UTF-8 mode automatically
From: Xah Lee @ 2007-11-25 22:49 UTC
To: help-gnu-emacs
Not sure about auto-detecting, but I work with UTF-8 daily and never
have a problem.
I do, however, set my Emacs to use UTF-8 by default.
To set your file encoding in Emacs, use the menu "Options→Mule
(Multilingual Environment)→Set Language Environment".
After choosing the language environment, be sure to also run the menu
command "Options→Save Options" so that Emacs remembers your settings.
or
Alt+x set-language-environment UTF-8
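The same thing can go into the init file instead of going through the menu. A minimal sketch; both calls are standard Emacs functions, but whether they are sufficient depends on the rest of the setup:

```elisp
;; Same effect as M-x set-language-environment RET UTF-8 RET.
(set-language-environment "UTF-8")

;; Give UTF-8 the highest priority when Emacs has to guess an encoding.
(prefer-coding-system 'utf-8)
```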
See also:
* Emacs and Unicode tips
http://xahlee.org/emacs/emacs_n_unicode.html
Xah
xah@xahlee.org
http://xahlee.org/
On Nov 24, 8:38 am, "spamfilteracco...@gmail.com"
<spamfilteracco...@gmail.com> wrote:
> I need to edit some UTF-8 files and it's very annoying emacs doesn't
> detect it automatically (I have to reopen them as utf-8 manually) and
> sometimes I notice it only after I already edited and saved the file
> which messes up the formatting.
>
> I tried prefer-coding-system utf-8, but it didn't help.
>
> I can't put lisp code into the files, because they are data files.
>
> Is there a definitive way to do it? The BOM is at the beginning of the
> files, so Emacs could detect it automatically.
>
> It's emacs 22.
* Re: Opening file in UTF-8 mode automatically
From: spamfilteraccount @ 2007-11-27 7:12 UTC
To: help-gnu-emacs
On Nov 25, 12:33 pm, Reiner Steib <reinersteib+gm...@imap.cc> wrote:
> On Sun, Nov 25 2007, spamfilteracco...@gmail.com wrote:
> > On Nov 24, 7:38 pm, Reiner Steib <reinersteib+gm...@imap.cc> wrote:
> >> `auto-coding-regexp-alist' in Emacs 22 already has an entry for the
> >> BOM: ("\\`\xEF\xBB\xBF" . utf-8).
>
> > Yep, it's there, I checked it, but it doesn't trigger opening the file
> > in utf-8 mode.
>
> Works for me:
>
If anyone can tell me what code should trigger utf-8 mode via
auto-coding-regexp-alist, then I can debug the problem myself.
I'm quite knowledgeable in Emacs Lisp, so if it's in the Lisp part
then I can surely debug it.
I took a cursory look at mule.el and saw that auto-coding-regexp-alist
is used there, but I haven't seen any apparent mechanism tying it
to file opening (no find-file-hook or anything like that).
Can anyone give me some pointers on this?
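One way to approach this, offered as a sketch rather than a definitive answer: no hook is involved. insert-file-contents calls the function stored in set-auto-coding-function (set-auto-coding by default, defined in mule.el), and that function consults auto-coding-regexp-alist among other things. The lookup can be exercised by hand on a buffer holding a BOM; the file name "BOM.txt" below is only a placeholder.

```elisp
(with-temp-buffer
  (set-buffer-multibyte nil)            ; work on raw bytes, as when reading a file
  (insert "\xEF\xBB\xBF BOM test\n")
  (goto-char (point-min))
  ;; Arguments: a file name and the number of bytes to examine after point.
  ;; A non-nil coding system as the result suggests the regexp lookup works.
  (set-auto-coding "BOM.txt" (- (point-max) (point-min))))
```

If this returns nil in the affected session, stepping through set-auto-coding with edebug should show where the lookup gives up.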
* Re: Opening file in UTF-8 mode automatically
From: spamfilteraccount @ 2007-11-27 7:14 UTC
To: help-gnu-emacs
On Nov 25, 9:59 pm, Eli Zaretskii <e...@gnu.org> wrote:
> > From: "spamfilteracco...@gmail.com" <spamfilteracco...@gmail.com>
> > Date: Sat, 24 Nov 2007 21:51:18 -0800 (PST)
>
> > On Nov 24, 9:07 pm, Eli Zaretskii <e...@gnu.org> wrote:
>
> > > What is your locale?
>
> > None set.
>
> That's not possible on Windows, AFAIK.
>
> What does Emacs produce under "Important settings" in the *mail*
> buffer if you type "M-x report-emacs-bug RET foo RET"?
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: HUN
locale-coding-system: cp1252
default-enable-multibyte-characters: nil
* Re: Opening file in UTF-8 mode automatically
From: spamfilteraccount @ 2007-11-27 7:15 UTC
To: help-gnu-emacs
On Nov 25, 11:49 pm, Xah Lee <x...@xahlee.org> wrote:
> Not sure about auto-detecting, but i work with utf-8 daily and never
> have a problem.
>
> I do, however, set my emacs to use utf-8 by default.
>
> To set your file encoding in emacs, use the menu "Options→Mule
> (Multilingual Environment)→Set Language Environment".
>
> After you've pulled the menu, be sure to also pull the menu command
> "Options→Save Options" so that emacs remembers your settings.
>
> or
>
> Alt+x set-language-environment UTF-8
>
>
Didn't work. The file is still not opened in utf-8 mode.
* Re: Opening file in UTF-8 mode automatically
From: Zhang Wei @ 2007-11-27 7:30 UTC
To: help-gnu-emacs
"spamfilteraccount@gmail.com" <spamfilteraccount@gmail.com> writes:
[...]
> default-enable-multibyte-characters: nil
I think this variable should be set to "t" to open a utf-8 file.
* Re: Opening file in UTF-8 mode automatically
From: Reiner Steib @ 2007-11-27 17:41 UTC
To: help-gnu-emacs
On Tue, Nov 27 2007, Zhang Wei wrote:
> "spamfilteraccount@gmail.com" <spamfilteraccount@gmail.com> writes:
>
> [...]
>
>> default-enable-multibyte-characters: nil
>
> I think this variable should be set to "t" to open a utf-8 file.
It is t by default. So the OP should check his init files and remove
the code that disables it. See e.g.
,----[ (info "(emacs)Enabling Multibyte") ]
| To turn off multibyte character support by default, start Emacs with
| the `--unibyte' option (*note Initial Options::), or set the
| environment variable `EMACS_UNIBYTE'. You can also customize
| `enable-multibyte-characters' or, equivalently, directly set the
| variable `default-enable-multibyte-characters' to `nil' in your init
| file to have basically the same effect as `--unibyte'.
`----
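A quick way to see which state a running session is in, as a sketch. It uses default-value rather than default-enable-multibyte-characters, on the assumption that the former works across Emacs versions:

```elisp
;; nil means new buffers start out unibyte, which is the state reported
;; in the "Important settings" output earlier in the thread.
(default-value 'enable-multibyte-characters)

;; Non-nil means the environment variable from the manual excerpt is set.
(getenv "EMACS_UNIBYTE")

;; Buffer-local state of whatever buffer this is evaluated in.
enable-multibyte-characters
```

Each form can be evaluated separately with C-x C-e or M-:.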
Bye, Reiner.
* Re: Opening file in UTF-8 mode automatically
From: Eli Zaretskii @ 2007-11-27 21:47 UTC
To: help-gnu-emacs
> From: Zhang Wei <id.brep@gmail.com>
> Date: Tue, 27 Nov 2007 15:30:23 +0800
>
> "spamfilteraccount@gmail.com" <spamfilteraccount@gmail.com> writes:
>
> [...]
>
> > default-enable-multibyte-characters: nil
>
> I think this variable should be set to "t" to open a utf-8 file.
Yes, of course!
To the OP: why are you running Emacs in the unibyte mode? Does the
problem go away if you invoke Emacs with "emacs -Q"?
* Re: Opening file in UTF-8 mode automatically
From: Eli Zaretskii @ 2007-11-27 21:48 UTC
To: help-gnu-emacs
> From: "spamfilteraccount@gmail.com" <spamfilteraccount@gmail.com>
> Date: Mon, 26 Nov 2007 23:14:14 -0800 (PST)
>
> On Nov 25, 9:59 pm, Eli Zaretskii <e...@gnu.org> wrote:
> > > From: "spamfilteracco...@gmail.com" <spamfilteracco...@gmail.com>
> > > Date: Sat, 24 Nov 2007 21:51:18 -0800 (PST)
> >
> > > On Nov 24, 9:07 pm, Eli Zaretskii <e...@gnu.org> wrote:
> >
> > > > What is your locale?
> >
> > > None set.
> >
> > That's not possible on Windows, AFAIK.
> >
> > What does Emacs produce under "Important settings" in the *mail*
> > buffer if you type "M-x report-emacs-bug RET foo RET"?
>
> Important settings:
> value of $LC_ALL: nil
> value of $LC_COLLATE: nil
> value of $LC_CTYPE: nil
> value of $LC_MESSAGES: nil
> value of $LC_MONETARY: nil
> value of $LC_NUMERIC: nil
> value of $LC_TIME: nil
> value of $LANG: HUN
> locale-coding-system: cp1252
So, as you see, yours is the Hungarian locale with codepage 1252 as
the locale-native encoding.
* Re: Opening file in UTF-8 mode automatically
From: spamfilteraccount @ 2007-11-28 6:54 UTC
To: help-gnu-emacs
On Nov 27, 10:47 pm, Eli Zaretskii <e...@gnu.org> wrote:
> > From: Zhang Wei <id.b...@gmail.com>
> > Date: Tue, 27 Nov 2007 15:30:23 +0800
>
> > "spamfilteracco...@gmail.com" <spamfilteracco...@gmail.com> writes:
>
> > [...]
>
> > > default-enable-multibyte-characters: nil
>
> > I think this variable should be set to "t" to open a utf-8 file.
>
> Yes, of course!
>
> To the OP: why are you running Emacs in the unibyte mode?
I don't know. :)
I checked, and I did start Emacs with --unibyte. I set it ages ago
and completely forgot about it. I didn't edit UTF-encoded files in the
past, so it was never a problem.
I removed the --unibyte option and now everything is working fine.
UTF-8 files are opened in UTF-8 mode automatically.
Thanks for the help everyone.