From: David Kastrup <dak@gnu.org>
Cc: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
Subject: Re: Coding system robustness?
Date: Sat, 19 Mar 2005 10:10:07 +0100
Message-ID: <x5ll8kdms0.fsf@lola.goethe.zz>
In-Reply-To: <200503190108.KAA22411@etlken.m17n.org> (Kenichi Handa's message of "Sat, 19 Mar 2005 10:08:16 +0900 (JST)")

Kenichi Handa <handa@m17n.org> writes:

> In article <87wts43jxx.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
>>>  I'd like to know whether coding systems in general are supposed to be
>>>  robust, meaning that decoding some random byte string into the coding
>>>  system and reencoding it is guaranteed to deliver the same byte string
>>>  again?
>
>> AFAIK, (encode-coding-string (decode-coding-string STR 'foo) 'foo)
>> should always return STR, otherwise it's a bug.
>> With the introduction of eight-bit-*, this should be true of "all"
>> coding-systems in Emacs-21,
>
> No.  Redundant escape sequences in iso-2022 based coding
> systems are just ignored.  For instance,
>
>   (decode-coding-string "\e(J" 'iso-2022-jp) => ""
>
> And we can't recover "\e(J" on encoding.

Ok, making the problem somewhat more confined: if I have a file that
is written _by_ _Emacs_ in some coding system, and then externally I
chop parts of it into pieces (not dropping material) without taking
into account multibyte boundaries, convert these pieces (with
interspersed ASCII) into the original decoding, encode it again to a
unibyte string, properly replace the ASCII-fied pieces with the
original material and decode to the original decoding (phew), I am
pretty sure that I have round-trip behavior, right?

Well, almost.  On escape-based coding systems I don't see in the first
place that one can encode/decode string parts in isolation, so I am
afraid that it is not really feasible to promise anything.  Do the
escapes at least start fresh every line?  I am just being curious
here; there is no actual chance that I am going to support such a
coding system, and I don't see how I sensibly could.
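
For anyone wanting to check the round-trip property for a particular
byte string, a minimal sketch follows.  The function name
`my-coding-round-trip-p' is made up for illustration; it only wraps
the `decode-coding-string' / `encode-coding-string' pair quoted above.

```elisp
;; Hypothetical helper, not part of Emacs: tests whether a unibyte
;; string survives decoding and re-encoding in one coding system.
(defun my-coding-round-trip-p (bytes coding-system)
  "Return non-nil if BYTES is unchanged after a decode/encode cycle.
BYTES is a unibyte string and CODING-SYSTEM a coding system symbol."
  (string= bytes
           (encode-coding-string
            (decode-coding-string bytes coding-system)
            coding-system)))

;; Plain ASCII needs no escape sequences, so it should round-trip.
(my-coding-round-trip-p "abc" 'iso-2022-jp)

;; The redundant escape from the example above: since it decodes to
;; the empty string, this check is expected to fail.
(my-coding-round-trip-p "\e(J" 'iso-2022-jp)
```

This checks a complete string only; it says nothing about pieces cut
at arbitrary byte positions, which is the harder case asked about
above.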

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum
