From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jürgen Hartmann
Newsgroups: gmane.emacs.help
Subject: RE: Automatic recognition of some specific coding systems
Date: Tue, 24 Feb 2015 23:30:49 +0100
References: <83fv9v6u5o.fsf@gnu.org>
To: "help-gnu-emacs@gnu.org"

Thank you, Eli Zaretskii, for your speedy answer:

>> Is this conclusion correct?
>
> No, I don't think so.  There's no direct relation between categories
> and recognition of encoding.

I am very glad to hear that.

> If you have specific problems, i.e. if Emacs doesn't recognize the
> encoding of some file(s), please post the details.  (I'd suggest to
> try in "emacs -Q" first, because some problems might be caused by your
> customizations that need to be removed or adapted to the new version.)
> Then people here could review the problems and advise you about
> possible solutions, or ask you to file a bug report.
>
> But in general, there shouldn't be any regressions in recognizing
> encodings.

OK. I will try to give a specific example - it is rather artificial
but representative:

Consider a utf-8-unix encoded text file, meaningfully named
utf-8-unix, that just contains the seven German special characters

   äöüßÄÖÜ   ("a"o"u"s"A"O"U)

in one single line followed by a newline character.
Now we make two copies of this file and recode them to the other
coding systems of interest:

   cp utf-8-unix latin-9-unix
   recode ..l9 latin-9-unix

   cp utf-8-unix cp850-dos
   recode ..pc cp850-dos

Visiting all three files in an Emacs session that was freshly started
by means of

   emacs -Q

- thank you for that important hint - yields a perfect recognition of
the respective coding in the case of

   utf-8-unix
   latin-9-unix   (recognized as latin-1-unix, equivalent here)

but the recognition fails for the cp850-dos encoded file, as it is
recognized as

   raw-text-dos

encoded and its contents are displayed as

   \204\224\201\341\216\231\232

Looking at the contents of the variable coding-category-list, it has
the form

   (coding-category-utf-8 coding-category-charset ...
    coding-category-raw-text ...)

where the values of the variables coding-category-utf-8 and
coding-category-charset are utf-8 and iso-latin-1 respectively.

If I start again with a new Emacs session (emacs -Q), but this time
performing the commands

   prefer-coding-system cp850
   prefer-coding-system utf-8

prior to visiting the files, the codings of

   utf-8-unix
   cp850-dos

are recognized correctly, while the file latin-9-unix is recognized as
cp850-unix encoded and its contents are displayed as some cryptic symbols.

The coding-category-list and the variable coding-category-utf-8 have
the same values as before, but the variable coding-category-charset
contains cp850 this time.

So my problem is to find a configuration of Emacs 24.4 that yields a
correct automatic recognition of all three coding systems

   utf-8-unix
   latin-9-unix and
   cp850-dos

when the files of the example above are visited. One has to keep in
mind that this was perfectly possible with Emacs 22.3, as I just
verified again.

Sorry for the rather long post, but I hope that I could state my
problem more precisely.

Juergen
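
For readers who want to check the byte values in the example without recode, here is a minimal sketch in Python 3. It is an illustration only, not part of the original test. It assumes that Python's "iso-8859-15" and "cp850" codecs produce the same bytes for these seven characters as the recode targets l9 and pc. The helper names encode_samples and octal_escapes are made up for this sketch.

```python
# Sketch: rebuild the byte content of the three test files with Python's
# standard codecs. Assumption: "iso-8859-15" and "cp850" stand in for the
# recode targets "l9" and "pc" used in the post.

TEXT = "äöüßÄÖÜ"


def encode_samples(text=TEXT):
    """Return the byte content of each test file, keyed by file name."""
    return {
        "utf-8-unix": (text + "\n").encode("utf-8"),
        "latin-9-unix": (text + "\n").encode("iso-8859-15"),
        "cp850-dos": (text + "\r\n").encode("cp850"),  # DOS line ending
    }


def octal_escapes(data):
    """Render bytes >= 0x80 as backslash-octal, other bytes as characters."""
    return "".join("\\%03o" % b if b >= 0x80 else chr(b) for b in data)


def decodes_as(data, encoding):
    """Return True if the bytes are a valid sequence in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


samples = encode_samples()
cp850_body = samples["cp850-dos"].rstrip(b"\r\n")
latin9_body = samples["latin-9-unix"].rstrip(b"\n")

# The cp850 bytes, shown the way the raw-text buffer displays them.
print(octal_escapes(cp850_body))

# Neither 8-bit file is valid UTF-8, so UTF-8 can be ruled out from the bytes.
print(decodes_as(cp850_body, "utf-8"), decodes_as(latin9_body, "utf-8"))

# Each 8-bit file decodes without error under the other 8-bit charset,
# so the bytes alone do not reveal which of the two was intended.
print(decodes_as(cp850_body, "iso-8859-15"), decodes_as(latin9_body, "cp850"))
```

The first line printed is \204\224\201\341\216\231\232, the same sequence shown for the cp850-dos file above. The last line prints True True: every byte of either file is also a valid character in the other 8-bit charset, which is consistent with the observation that whichever of the two is preferred claims both files.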