From: Jürgen Hartmann
Newsgroups: gmane.emacs.help
Subject: RE: Automatic recognition of some specific coding systems
Date: Thu, 26 Feb 2015 23:34:05 +0100
To: "help-gnu-emacs@gnu.org"

@Eli Zaretskii: Thank you very much for your profound assessment:

> It looks like what you want is beyond the current capabilities of
> Emacs's auto-detection of encoding.  See below for some alternatives.
>
> Having said that...
>
>> By the way, could you verify, that this is possible with Emacs 22.3
>> with the customization described in my previous post?
>
> ...no, it doesn't work for me.  The latin-9 file is decoded using my
> locale's encoding (which isn't latin-9), and cp850 file is still
> raw-text.

Oops, this is an important finding indeed.

> So I think some other factor(s) is/are at work on your system.  Your
> locale's encoding is certainly one of them, but I think there should
> be something else, either in your customizations or somewhere else.

I just repeated the tests with Emacs 22.3 using the POSIX locale,

   LC_ALL=C ./emacs -q

and you are right: the cp850 file was recognized as raw-text now.
The locale I used before was

   de_DE.UTF-8

The more I get involved in this topic, the more I see that it is much
more complex than I thought at first glance.

> In general, even if Emacs 22.3 was capable to do the job, I think it
> was by sheer luck, and is anyway fragile, since the same
> customizations don't work for me (and AFAIU, aren't supposed to work).
> So I would suggest to explore alternative ways of doing this in Emacs
> 24 reliably.

This sounds reasonable to me. Besides the aspect of reliability, which
is of course the most important one, doing so might also yield a
solution that is likely to survive future updates.

> Some possibilities you may wish to explore:
>
>   . Put a 'coding: cp850' cookie in the cp850 files

I would rather avoid altering the files' content for this technical reason.

>   . If the names of the cp850 files all match some common pattern, you
>     can use modify-coding-system-alist to tell Emacs to decode them by
>     cp850

Unfortunately, in my case there is no such pattern in the file names
that would allow one to tell which coding the respective file might use.

>   . Similarly, if the cp850 files' contents match some common regexp,
>     you can customize auto-coding-regexp-alist to force their decoding
>     by cp850

That one might do the trick: In my case the only files (at least in
the big picture) that use the DOS EOL variant are those encoded with
cp850 and vice versa.
So one could think about a regular expression
that matches this unique EOL pattern.

> Of course, you can always turn the table, and do the above for
> latin-9, while keeping cp850 in set-coding-system-priority call.  It
> all depends which one of these 2 lends itself better to one of these
> methods.
>
> I believe that if one of these alternatives can do the job for you,
> the result will be much more reliable.

I also think so.

So, I have to play around a little bit to get acquainted with the
construction of regular expressions for Emacs. I will be back when I
have gained a deeper insight or, at best, a concrete solution.

Meanwhile I would like to thank you, Eli Zaretskii, very much for the
time and effort that you spent to provide me with this thorough
analysis and your valuable suggestions.

Juergen
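
The EOL idea discussed above could be sketched roughly as follows. This
is an untested starting point, not a verified solution. It assumes that
a bare CR LF pair is specific enough to identify the cp850 files, and
that such a pair occurs within the part of the file that Emacs examines
when it consults auto-coding-regexp-alist. The regexp itself is only an
illustration.

```elisp
;; Hypothetical sketch for an init file: decode any file in which a
;; CR LF pair is found as cp850.
;; Assumption: only the cp850 files use DOS line endings, as described
;; in the message above.  Any other file that happens to contain the
;; bytes CR LF (including binary files) would be matched as well.
(add-to-list 'auto-coding-regexp-alist
             '("\r\n" . cp850))
```

Files without a CR LF pair do not match this entry and should fall
through to the usual detection, so a latin-9 preference set elsewhere
would still apply to them. The coding system Emacs actually chose for a
buffer can be checked with C-h C (describe-coding-system).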