From: "William Xue" <william.xue@gmail.com>
To: "Stefan Monnier" <monnier@iro.umontreal.ca>
Cc: emacs-devel@gnu.org
Subject: Re: can not decode 0x93 and 0x94 to correct char
Date: Sat, 29 Sep 2007 23:30:32 +0800
Message-ID: <op.tze9c6tohkv0w5@smiling>
In-Reply-To: <jwvodflk2qu.fsf-monnier+emacs@gnu.org>

On Sat, 29 Sep 2007 21:47:37 +0800, Stefan Monnier
<monnier@iro.umontreal.ca> wrote:
>> ; for cp1258
>> (prefer-coding-system 'windows-1258)
>> ; for displaying utf-8 encoded file
>> (prefer-coding-system 'utf-8-emacs)
>> ; for displaying chinese characters
>> (prefer-coding-system 'gb2312)
>
>> It would be a little problem. Because if I changed the gb2312 to gb18030
>> or gbk, the first setting (prefer-coding-system 'windows-1258) would
>> be failed.
>
> I'm not sure what you mean by "would be failed", but when you use
If I change gb2312 to gb18030 or gbk, the bytes \223 and \224 (0x93 and
0x94), which are the left and right quotation marks in cp1258, are no
longer decoded correctly. So I don't think this is the right solution
for my situation. And if somebody also wants to decode Japanese, French,
Russian, and so on, it becomes too complex.
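
To make the problem concrete, here is a small sketch that can be evaluated with M-: or in *scratch*. It assumes an Emacs in which both the `windows-1258' and `gbk' coding systems are defined; the comments describe what I expect, not output I am quoting from anywhere.

```elisp
;; "\223\224" is a unibyte string holding the two bytes 0x93 0x94.

;; Read as cp1258, each byte is one character: the left and right
;; double quotation marks.
(decode-coding-string "\223\224" 'windows-1258)

;; Read as gbk, 0x93 is a valid lead byte and 0x94 a valid trail byte,
;; so the same two bytes form a single two-byte character.
(decode-coding-string "\223\224" 'gbk)
```

With gb2312 the byte 0x93 lies outside the lead-byte range (0xA1-0xF7), which would explain why the detection only goes wrong once gbk or gb18030 is preferred.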
> prefer-coding-system, you have to realize that it's not quite as simple
> as it sounds:
> - first, the three statements above mean to try (in this order) first
> gb2312, then utf-8, then windows-1258.
> - second, this order should not be chosen exclusively based on how often
> you expect to use each of those encodings, because it depends a lot on
> the frequency of false positives. E.g. utf-8 should usually be first,
> because it has very few false positives (if the auto-detect decides
> it's utf-8, then it's very unlikely that the file isn't utf-8).
> OTOH windows-1258 should *not* be first because it has many false
> positives: any file without a 0 byte in it is a valid windows-1258
> file.
>
> The second point is the main reason why the order of detection of coding
> systems when reading a file should be the same as the order of
> preference to choose a coding system to use when writing a file.
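
A minimal sketch of how I read the two points above, as an init-file fragment. It assumes that the last `prefer-coding-system' call gets the highest priority, as described in the first point, and it uses plain `utf-8' in place of the `utf-8-emacs' from my original settings; I have not verified that this order fixes my files.

```elisp
;; Each call puts its coding system in front of the ones preferred
;; earlier, so the statements run from lowest to highest priority.
(prefer-coding-system 'windows-1258) ; many false positives: tried last
(prefer-coding-system 'gb2312)       ; rejects bytes such as 0x93 as lead bytes
(prefer-coding-system 'utf-8)        ; very few false positives: tried first

;; In recent Emacs versions the resulting order can be inspected with
;; M-: (coding-system-priority-list)
```
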
Thanks!
>
>
> Stefan
--
Yours,
WilliamX