From: David Kastrup <dak@gnu.org>
Cc: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
Subject: Re: Coding system robustness?
Date: Sat, 19 Mar 2005 10:10:07 +0100
Message-ID: <x5ll8kdms0.fsf@lola.goethe.zz>
In-Reply-To: <200503190108.KAA22411@etlken.m17n.org> (Kenichi Handa's message of "Sat, 19 Mar 2005 10:08:16 +0900 (JST)")

Kenichi Handa <handa@m17n.org> writes:

> In article <87wts43jxx.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
>>> I'd like to know whether coding systems in general are supposed to be
>>> robust, meaning that decoding some random byte string into the coding
>>> system and reencoding it is guaranteed to deliver the same byte string
>>> again?
>
>> AFAIK, (encode-coding-string (decode-coding-string STR 'foo) 'foo)
>> should always return STR, otherwise it's a bug.
>> With the introduction of eight-bit-*, this should be true of "all"
>> coding-systems in Emacs-21,
>
> No. Redundant escape sequences in iso-2022 based coding
> systems are just ignored. For instance,
>
> (decode-coding-string "\e(J" 'iso-2022-jp) => ""
>
> And we can't recover "\e(J" on encoding.
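
The round-trip property under discussion can be checked directly. The following is only a minimal sketch: `my-coding-round-trip-p` is a hypothetical helper name, not an Emacs function, and it assumes BYTES is already a unibyte string. It uses nothing beyond `decode-coding-string` and `encode-coding-string`, the two functions quoted above.

```elisp
;; Hypothetical helper for illustration; the name is not part of Emacs.
(defun my-coding-round-trip-p (bytes coding-system)
  "Return non-nil if unibyte string BYTES survives a decode/encode round trip.
BYTES is decoded with CODING-SYSTEM and the result is encoded again
with the same CODING-SYSTEM; the outcome is compared to BYTES."
  (string= bytes
           (encode-coding-string
            (decode-coding-string bytes coding-system)
            coding-system)))

;; Plain ASCII carries no escape sequences, so it should round-trip.
(my-coding-round-trip-p "abc" 'iso-2022-jp)

;; The redundant escape sequence from the example above.  If decoding
;; drops it as described, this comparison yields nil.
(my-coding-round-trip-p "\e(J" 'iso-2022-jp)
```
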

Ok, making the problem somewhat more confined: if I have a file that
is written _by_ _Emacs_ in some coding system, and then externally I
chop parts of it into pieces (not dropping material, but not taking
multibyte boundaries into account), convert these pieces (with
interspersed ASCII) into the original decoding, encode it again to a
unibyte string, properly replace the ASCII-fied pieces with the
original material and decode to the original decoding (phew), I am
pretty sure that I have round-trip behavior, right?

Well, almost.  On escape-based coding systems I don't see how one can
encode/decode string parts in isolation in the first place, so I am
afraid that it is not really feasible to promise anything.  Do the
escapes at least start fresh every line?  I am just being curious
here: there is no actual chance that I am going to support such a
coding system, and I don't see how I sensibly could.

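
The concern about decoding pieces in isolation can be made concrete with a small experiment. This is a sketch under stated assumptions: `my-split-decode-matches-p` is a hypothetical helper, POS is a byte offset chosen by the caller (no larger than the encoded length), and the code makes no claim about how any particular coding system behaves. It only compares decoding the whole byte string against decoding the two halves separately.

```elisp
;; Hypothetical helper for illustration; the name is not part of Emacs.
(defun my-split-decode-matches-p (text coding-system pos)
  "Return non-nil if splitting the encoded TEXT at byte POS is harmless.
TEXT is encoded with CODING-SYSTEM, the bytes are cut at offset POS,
each half is decoded on its own, and the concatenation is compared
with the result of decoding the whole byte string at once."
  (let* ((bytes (encode-coding-string text coding-system))
         (whole (decode-coding-string bytes coding-system))
         (head  (decode-coding-string (substring bytes 0 pos) coding-system))
         (tail  (decode-coding-string (substring bytes pos) coding-system)))
    (string= whole (concat head tail))))

;; ASCII text has no multibyte sequences and no escapes, so any split
;; point should be harmless.
(my-split-decode-matches-p "hello world" 'iso-2022-jp 5)

;; With Japanese text, a cut after the opening escape sequence leaves
;; the second half without the state it needs, so a mismatch is expected.
(my-split-decode-matches-p "日本語" 'iso-2022-jp 5)
```
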
--
David Kastrup, Kriemhildstr. 15, 44793 Bochum