From: "William Xue" <william.xue@gmail.com>
To: "Stefan Monnier" <monnier@iro.umontreal.ca>
Cc: emacs-devel@gnu.org
Subject: Re: can not decode 0x93 and 0x94 to correct char
Date: Sat, 29 Sep 2007 23:30:32 +0800
Message-ID: <op.tze9c6tohkv0w5@smiling>
In-Reply-To: <jwvodflk2qu.fsf-monnier+emacs@gnu.org>

On Sat, 29 Sep 2007 21:47:37 +0800, Stefan Monnier  
<monnier@iro.umontreal.ca> wrote:

>> ; for cp1258
>> (prefer-coding-system 'windows-1258)
>> ; for displaying utf-8 encoded file
>> (prefer-coding-system 'utf-8-emacs)
>> ; for displaying chinese characters
>> (prefer-coding-system 'gb2312)
>
>> It would be a little problem. Because if I changed the gb2312 to gb18030
>> or gbk, the first setting (prefer-coding-system 'windows-1258) would
>> be failed.
>
> I'm not sure what you mean by "would be failed", but when you use

If I change gb2312 to gb18030 or gbk, the characters \223 and \224, which
are the left and right quotation marks in cp1258, are not decoded correctly.
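
To make the clash concrete, here is a minimal sketch that decodes the same
two bytes under each coding system. The names my-quote-bytes and
my-decode-with are hypothetical helpers made up for this illustration, and
the sketch assumes an Emacs in which the windows-1258, gbk and gb18030
coding systems are all defined.

```elisp
;; Hypothetical sample: the two bytes 0x93 0x94, written with octal
;; escapes so that the string literal is unibyte (raw bytes).
(defvar my-quote-bytes "\223\224"
  "Bytes 0x93 0x94, as they appear in the cp1258 file.")

(defun my-decode-with (coding-system)
  "Decode `my-quote-bytes' with CODING-SYSTEM and return the result."
  (decode-coding-string my-quote-bytes coding-system))

;; windows-1258 maps 0x93 and 0x94 to the left and right double
;; quotation marks (U+201C and U+201D).
(my-decode-with 'windows-1258)

;; In gbk and gb18030, 0x93 is a valid lead byte and 0x94 a valid trail
;; byte, so the same pair is also a well-formed two-byte sequence there.
(my-decode-with 'gbk)
(my-decode-with 'gb18030)
```

gb2312 in its EUC form only uses bytes from 0xA1 upward, which would explain
why the pair is not mistaken for Chinese text under gb2312 but is under gbk
or gb18030.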

So I don't think this is the right solution for this situation. If somebody
also wants to decode Japanese, French, Russian, and so on, it becomes too
complex.

> prefer-coding-system, you have to realize that it's not quite as simple as
> it sounds:
> - first, the three statements above mean to try (in this order) first
>   gb2312, then utf-8, then windows-1258.
> - second, this order should not be chosen exclusively based on how often
>   you expect to use each of those encodings.  Because it depends a lot on
>   the frequency of false positives.  E.g. utf-8 should usually be first,
>   because it has very few false positives (if the auto-detect decides it's
>   utf-8, then it's very unlikely that the file isn't utf-8).
>   OTOH windows-1258 should *not* be first because it has many false
>   positives: any file without a 0 byte in it is a valid windows-1258 file.
>
> The second point is the main reason why the order of detection of coding
> systems when reading a file should be the same as the order of preference to
> choose a coding system to use when writing a file.
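
A minimal sketch of the ordering described above, assuming only the three
coding systems from this thread. Each call to prefer-coding-system puts its
argument at the front of the detection list, so the last call is tried
first. The final form assumes an Emacs that provides
coding-system-priority-list.

```elisp
;; Later calls take priority over earlier ones, so list the coding
;; system with the most false positives first and the most reliable last.
(prefer-coding-system 'windows-1258) ; tried last: many false positives
(prefer-coding-system 'gb2312)       ; tried second
(prefer-coding-system 'utf-8)        ; tried first: few false positives

;; Inspect the resulting detection order.
(coding-system-priority-list)
```

This only illustrates the ordering rule. It does not by itself make gbk or
gb18030 coexist with windows-1258, because the bytes \223 \224 are valid in
all three.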

Thanks!

>
>
>         Stefan

-- 
Yours,
WilliamX
