From: Vincent Lefevre
Newsgroups: gmane.emacs.bugs
Subject: bug#13505: Bug#696026: emacs24: file corruption on saving
Date: Sun, 20 Jan 2013 23:10:08 +0100
Message-ID: <20130120221007.GG2695@xvii.vinc17.org>
References: <20121215223809.GA7549@xvii.vinc17.org> <877gn8ijgn.fsf@trouble.defaultvalue.org> <83obgjpzod.fsf@gnu.org> <20130120212508.GF2695@xvii.vinc17.org> <83bocjpm81.fsf@gnu.org>
In-Reply-To: <83bocjpm81.fsf@gnu.org>
To: Eli Zaretskii
Cc: 696026-forwarded@bugs.debian.org, 696026@bugs.debian.org, rlb@defaultvalue.org, 13505@debbugs.gnu.org

On 2013-01-20 23:40:14 +0200, Eli Zaretskii wrote:
> > Date: Sun, 20 Jan 2013 22:25:08 +0100
> > From: Vincent Lefevre
> > Cc: Rob Browning, Kenichi Handa,
> >  13505@debbugs.gnu.org, 696026-forwarded@bugs.debian.org,
> >  696026@bugs.debian.org
> >
> > On 2013-01-20 18:49:38 +0200, Eli Zaretskii wrote:
> > > Personally, I don't think there's a bug here. It's a cockpit error.
> >
> > Perhaps it isn't a bug at save time. But then, selecting a lossy
> > encoding by default when visiting the file is the bug (and really
> > a regression), particularly if this isn't clearly told to the user.
>
> The encoding isn't lossy.

You said:

| The original encoded form of the characters as found on disk at
| visit time _cannot_ be recovered by saving with raw-text, because
| that encoded form is lost without a trace when the file is _visited_
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| and decoded into the internal representation.

This is what lossy means. By contrast, the utf-8 encoding doesn't seem
to be lossy: Emacs seems to handle files with invalid UTF-8 sequences
without any loss. So this encoding is safe, even if Emacs wrongly
guesses the encoding.

> In any case, I don't really understand your proposal. Suppose the
> file was indeed encoded in in-is13194-devanagari, would you argue then
> that selecting it would be incorrect or undesirable behavior?

If Emacs modifies the contents when saving the file, it would be
incorrect.

> > Actually this is related, since the lossy encoding becomes a real
> > problem only at save time (and for copy-paste I assume, though the
> > file doesn't get overwritten by that).
>
> It is only a problem when you try to save or otherwise output it
> (e.g., send in an email).
>
> But what you should do then is "C-x RET r raw-text RET", and recover.
> That is the only way to avoid corruption in files that use
> inconsistent encoding.

But Emacs should clearly tell the user what to do after C-x C-s and
clearly say when there can be data loss. Currently it says:

"These default coding systems were tried to encode text
in the buffer `file1':
  (in-is13194-devanagari-unix (2 . 2376) (3 . 4194176) (4 . 4194201)
  (5 . 2341) (6 . 2314) (12 . 2364))
  (utf-8-unix (3 . 4194176) (4 . 4194201))
However, each of them encountered characters it couldn't encode:
  in-is13194-devanagari-unix cannot encode these: [...]
  utf-8-unix cannot encode these: [...]"

This shouldn't be regarded as a problem by the user, because if Emacs
could read and interpret the file (and such characters have not been
added by the user), it should be able to save it.

Then Emacs says: "Select one of the safe coding systems listed
below [...]", but doesn't say that something has already been lost.
So, the words "safe coding systems" are really misleading.

-- 
Vincent Lefèvre - Web: 100% accessible validated (X)HTML - Blog:
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
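
Illustration of the "lossy" versus "safe" distinction argued above. This
is only a rough analogy in Python, not Emacs code: the sample bytes, the
helper name round_trip, and the choice of Python's "surrogateescape" and
"replace" error handlers are all assumptions made for the sketch. It
decodes a byte string containing one invalid UTF-8 byte and writes it
back, once with a handler that preserves undecodable bytes and once with
one that discards them at decode time.

```python
# Analogy only: Python codec error handlers, not Emacs coding systems.
# The sample data is hypothetical: one stray Latin-1 byte (0xE9)
# followed by a valid UTF-8 sequence (0xC3 0xA9).
raw = b"caf\xe9 \xc3\xa9\n"


def round_trip(data: bytes, errors: str) -> bytes:
    """Decode as UTF-8 with the given error handler, then re-encode."""
    text = data.decode("utf-8", errors=errors)
    return text.encode("utf-8", errors=errors)


# "surrogateescape" keeps each undecodable byte as a lone surrogate,
# so encoding restores the original byte: nothing is lost.
lossless = round_trip(raw, "surrogateescape")

# "replace" substitutes U+FFFD at decode time; the original byte is
# gone before any save is attempted, so no later choice can recover it.
lossy = round_trip(raw, "replace")

print(lossless == raw)
print(lossy == raw)
```

In the second case the loss happens when the data is read, not when it is
written, which is the situation the quoted paragraph describes for visit
time.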