From: David Kastrup
Newsgroups: gmane.emacs.devel
Subject: Re: Coding system robustness?
Date: Sat, 19 Mar 2005 10:10:07 +0100
To: Kenichi Handa
Cc: Stefan Monnier, emacs-devel@gnu.org
References: <87wts43jxx.fsf-monnier+emacs@gnu.org> <200503190108.KAA22411@etlken.m17n.org>
In-Reply-To: <200503190108.KAA22411@etlken.m17n.org> (Kenichi Handa's message of "Sat, 19 Mar 2005 10:08:16 +0900 (JST)")

Kenichi Handa writes:

> In article <87wts43jxx.fsf-monnier+emacs@gnu.org>, Stefan Monnier writes:
>
>>> I'd like to know whether coding systems in general are supposed to be
>>> robust, meaning that decoding some random byte string into the coding
>>> system and reencoding it is guaranteed to deliver the same byte string
>>> again?
>
>> AFAIK, (encode-coding-string (decode-coding-string STR 'foo) 'foo)
>> should always return STR, otherwise it's a bug.
>> With the introduction of eight-bit-*, this should be true of "all"
>> coding-systems in Emacs-21,
>
> No.  Redundant escape sequences in iso-2022 based coding
> systems are just ignored.  For instance,
>
>   (decode-coding-string "\e(J" 'iso-2022-jp) => ""
>
> And we can't recover "\e(J" on encoding.

Ok, making the problem somewhat more confined: suppose I have a file
that was written _by_ _Emacs_ in some coding system.  Externally, I
chop parts of it into pieces (not dropping material) without taking
multibyte boundaries into account, decode these pieces (with
interspersed ASCII) using the original coding system, encode the
result again to a unibyte string, properly replace the ASCII-fied
pieces with the original material, and decode with the original
coding system (phew).  I am pretty sure that I have round-trip
behavior then, right?

Well, almost.
On escape-based coding systems, I don't see in the first place how one
can encode/decode string parts in isolation, so I am afraid that it is
not really feasible to promise anything.  Do the escapes at least
start fresh on every line?

I am just being curious here; there is no actual chance that I am
going to support such a coding system, and I don't see how I sensibly
could.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum
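
The round-trip question in this message can be probed with a small Emacs Lisp sketch. The helper name `my-round-trip-p` is hypothetical (it is not part of Emacs), and the chopped UTF-8 pieces are made-up sample data. Only `decode-coding-string` and `encode-coding-string` come from the thread. The results may depend on the Emacs version, so none are asserted here beyond what the quoted message reports.

```elisp
;; Hypothetical helper (not part of Emacs): does BYTES survive a
;; decode/encode round trip through CODING-SYSTEM?
(defun my-round-trip-p (bytes coding-system)
  "Return non-nil if BYTES is unchanged after decoding and re-encoding."
  (let ((decoded (decode-coding-string bytes coding-system)))
    (string= bytes (encode-coding-string decoded coding-system))))

;; Plain ASCII in an escape-based coding system.
(my-round-trip-p "abc" 'iso-2022-jp)

;; The case from the quoted message: the redundant escape sequence is
;; reported to decode to "", so re-encoding cannot reproduce it.
(my-round-trip-p "\e(J" 'iso-2022-jp)

;; Sample data (an assumption for illustration): the UTF-8 bytes of
;; "café!" chopped in the middle of the two-byte sequence for "é".
;; Each piece is checked in isolation.
(mapcar (lambda (piece) (my-round-trip-p piece 'utf-8))
        '("caf\303" "\251!"))
```

Checking each piece separately mirrors the chopping scenario: if every piece round-trips on its own, concatenating the re-encoded pieces gives back the original bytes. For a stateful, escape-based coding system that per-piece check is exactly what cannot be relied on.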