From: Eli Zaretskii <eliz@gnu.org>
To: help-gnu-emacs@gnu.org
Subject: Re: Automatic recognition of some specific coding systems
Date: Thu, 26 Feb 2015 18:36:04 +0200
Message-ID: <83ioeo6363.fsf@gnu.org>
In-Reply-To: <DUB124-W1891E783BC3ED13641049EA8170@phx.gbl>
> From: Jürgen Hartmann <juergen_hartmann_@hotmail.com>
> Date: Thu, 26 Feb 2015 00:23:50 +0100
>
> > Try this:
> >
> > (set-coding-system-priority 'utf-8 'cp850)
>
> After doing this, the coding systems
>
> utf-8
> cp850
>
> get correctly recognized, but
>
> latin-9-unix
>
> gets wrongly recognized as cp850-unix encoded.
>
> If I modify the lisp expression to
>
> (set-coding-system-priority 'utf-8 'latin-9)
>
> it is utf-8 and latin-9 that are properly recognized while the test
> file
>
> cp850-dos
>
> gets detected as iso-latin-9-dos encoded.

I feared that might be the result.

> If I pass all three coding systems to set-coding-system-priority,
>
> (set-coding-system-priority 'utf-8 'latin-9 'cp850) or
> (set-coding-system-priority 'utf-8 'cp850 'latin-9)
>
> it turns out that the function set-coding-system-priority ignores the third
> coding system in these cases, because it belongs to the same coding
> category as the coding system named in the second place. The source
> code src/coding.c comments this in the lines 9972 and 9973 like this:
>
> /* Ignore this coding system because a coding system of the
> same category already had a higher priority. */

Yes, I know.  That's why I only mentioned 2 of them.

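A minimal sketch for checking this behavior in a scratch buffer, assuming an Emacs 23 or later where coding-system-priority-list and coding-system-category are available.  Evaluate the forms one at a time and inspect the results; no particular output is claimed here, since it depends on the Emacs version and the session's prior settings.

```elisp
;; Request all three systems.  Per the src/coding.c comment quoted
;; above, the third is expected to be ignored when it falls into the
;; same category as the second.
(set-coding-system-priority 'utf-8 'latin-9 'cp850)

;; Show the resulting priority order, highest priority first.
(coding-system-priority-list)

;; Compare the detection categories of the two 8-bit systems.
;; If both forms return the same symbol, only one of the two can be
;; promoted by set-coding-system-priority.
(list (coding-system-category 'latin-9)
      (coding-system-category 'cp850))
```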
It looks like what you want is beyond the current capabilities of
Emacs's auto-detection of encoding.  See below for some alternatives.

Having said that...

> By the way, could you verify, that this is possible with Emacs 22.3
> with the customization described in my previous post?

...no, it doesn't work for me.  The latin-9 file is decoded using my
locale's encoding (which isn't latin-9), and the cp850 file is still
raw-text.

So I think some other factors are at work on your system.  Your
locale's encoding is certainly one of them, but I think there must
be something else, either in your customizations or somewhere else.

In general, even if Emacs 22.3 was capable of doing the job, I think it
was by sheer luck, and it is fragile anyway, since the same
customizations don't work for me (and AFAIU, aren't supposed to work).

So I would suggest exploring alternative ways of doing this reliably
in Emacs 24.  Some possibilities you may wish to explore:

 . Put a 'coding: cp850' cookie in the cp850 files

 . If the names of the cp850 files all match some common pattern, you
   can use modify-coding-system-alist to tell Emacs to decode them by
   cp850

 . Similarly, if the cp850 files' contents match some common regexp,
   you can customize auto-coding-regexp-alist to force their decoding
   by cp850

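The three alternatives above could look roughly as follows.  For the first one, the cookie is a line such as "-*- coding: cp850 -*-" placed on the first line of each cp850 file, so it needs no Lisp at all.  The other two are sketched below for an init file; the ".850.txt" file-name suffix and the "REPORT-DOS" leading signature are hypothetical and stand in for whatever your cp850 files actually have in common.

```elisp
;; File-name pattern: decode every visited file whose name ends in
;; ".850.txt" (a hypothetical naming convention) as cp850.
(modify-coding-system-alist 'file "\\.850\\.txt\\'" 'cp850)

;; Content pattern: decode as cp850 any file whose first bytes match
;; the regexp.  "REPORT-DOS" is a hypothetical signature; the regexp
;; is anchored with \\` because it is tested against the beginning of
;; the file's contents.
(add-to-list 'auto-coding-regexp-alist
             '("\\`REPORT-DOS" . cp850))
```

Both settings bypass the heuristic detection for the matching files only; everything else is still auto-detected according to the priority list.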
Of course, you can always turn the tables, and do the above for
latin-9, while keeping cp850 in the set-coding-system-priority call.  It
all depends on which one of these 2 lends itself better to one of these
methods.

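As a sketch of the reversed arrangement, assuming the latin-9 files are the ones that share a recognizable name (the ".l9.txt" suffix here is hypothetical):

```elisp
;; Let auto-detection prefer utf-8 and cp850 ...
(set-coding-system-priority 'utf-8 'cp850)

;; ... and pin the latin-9 files explicitly by file name instead.
(modify-coding-system-alist 'file "\\.l9\\.txt\\'" 'latin-9)
```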
I believe that if one of these alternatives can do the job for you,
the result will be much more reliable.