From: "Jürgen Hartmann" <juergen_hartmann_@hotmail.com>
To: "help-gnu-emacs@gnu.org" <help-gnu-emacs@gnu.org>
Subject: RE: Automatic recognition of some specific coding systems
Date: Thu, 26 Feb 2015 23:34:05 +0100
Message-ID: <DUB124-W30B80440023C24E4E559A0A8140@phx.gbl>
In-Reply-To: <83ioeo6363.fsf@gnu.org>

@Eli Zaretskii: Thank you very much for your profound assessment:

> It looks like what you want is beyond the current capabilities of
> Emacs's auto-detection of encoding.  See below for some alternatives.
>  
> Having said that...
>  
>> By the way, could you verify, that this is possible with Emacs 22.3
>> with the customization described in my previous post?
>  
> ...no, it doesn't work for me.  The latin-9 file is decoded using my
> locale's encoding (which isn't latin-9), and cp850 file is still
> raw-text.

Oops, this is an important finding indeed.

> So I think some other factor(s) is/are at work on your system.  Your
> locale's encoding is certainly one of them, but I think there should
> be something else, either in your customizations or somewhere else.

I just repeated the tests with Emacs 22.3 using the POSIX locale,

   LC_ALL=C ./emacs -q

and you are right: the cp850 file was recognized as raw-text now. The
locale I used before was

   de_DE.UTF-8

The more I get involved in this topic, the more I see that it is much
more complex than I thought at first glance.

> In general, even if Emacs 22.3 was capable to do the job, I think it
> was by sheer luck, and is anyway fragile, since the same
> customizations don't work for me (and AFAIU, aren't supposed to work).
> So I would suggest to explore alternative ways of doing this in Emacs
> 24 reliably.

This sounds reasonable to me. Besides the aspect of reliability, which
is of course the most important one, doing so might also yield a
solution that is likely to survive future updates.

> Some possibilities you may wish to explore:
>  
>   . Put a 'coding: cp850' cookie in the cp850 files

I would rather avoid altering the files' content for this technical reason.

>   . If the names of the cp850 files all match some common pattern, you
>     can use modify-coding-system-alist to tell Emacs to decode them by
>     cp850

Unfortunately, in my case there is no such pattern in the file names
that would allow one to tell which coding the respective file might use.
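
For reference, a minimal sketch of what such a file-name rule might
look like in an init file. The ".dos.txt" suffix is purely
hypothetical; as said, my own files share no such naming convention.

```elisp
;; Hypothetical: decode every file whose name ends in ".dos.txt"
;; as cp850.  The suffix is an assumption made up for illustration.
(modify-coding-system-alist 'file "\\.dos\\.txt\\'" 'cp850)
```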

>   . Similarly, if the cp850 files' contents match some common regexp,
>     you can customize auto-coding-regexp-alist to force their decoding
>     by cp850

That one might do the trick: In my case the only files (at least in
the big picture) that use the DOS EOL variant are those encoded with
cp850 and vice versa. So one could think about a regular expression
that matches this unique EOL pattern.
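
A first, untested sketch of this idea, assuming that every file whose
first line ends in CR LF is a cp850 file (which holds for my files, but
certainly not in general):

```elisp
;; Assumption: only the cp850 files use DOS line endings.
;; The regexp matches when the first line of the file ends in CR LF;
;; such files are then decoded as cp850 with DOS EOL conversion.
(add-to-list 'auto-coding-regexp-alist
             '("\\`[^\n]*\r\n" . cp850-dos))
```

The obvious drawback is that any other file with DOS line endings, for
example a UTF-8 file written on Windows, would be decoded as cp850 as
well.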

> Of course, you can always turn the table, and do the above for
> latin-9, while keeping cp850 in set-coding-system-priority call.  It
> all depends which one of these 2 lends itself better to one of these
> methods.
>  
> I believe that if one of these alternatives can do the job for you,
> the result will be much more reliable.

I also think so.

So I have to play around a little to get acquainted with the
construction of regular expressions for Emacs. I will be back when I
have gained a deeper insight or, better still, a concrete solution.

Meanwhile, I would like to thank you, Eli Zaretskii, very much for the
time and effort you spent providing me with this thorough analysis and
your valuable suggestions.

Juergen

 		 	   		  

