From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Eli Zaretskii <eliz@gnu.org>
Newsgroups: gmane.emacs.devel
Subject: Re: Emacs Lisp's future
Date: Mon, 06 Oct 2014 19:47:11 +0300
Message-ID: <838uktm9gw.fsf@gnu.org>
References: <54193A70.9020901@member.fsf.org>
	<87k34qo4c1.fsf@fencepost.gnu.org> <54257C22.2000806@yandex.ru>
	<83iokato6x.fsf@gnu.org> <87wq8pwjen.fsf@uwakimon.sk.tsukuba.ac.jp>
	<837g0ptnlj.fsf@gnu.org> <87r3yxwdr6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<87tx3tmi3t.fsf@fencepost.gnu.org> <834mvttgsf.fsf@gnu.org>
	<jwvoau19n3n.fsf-monnier+emacs@gnu.org>
	<87lhp5m99w.fsf@fencepost.gnu.org>
	<jwviok99jki.fsf-monnier+emacs@gnu.org>
	<87h9ztm5oa.fsf@fencepost.gnu.org>
	<jwvd2ah9hve.fsf-monnier+emacs@gnu.org>
	<87d2ahm3nw.fsf@fencepost.gnu.org>
	<jwv1tqx9ea3.fsf-monnier+emacs@gnu.org>
	<E1XYNnY-0005Zo-Kz@fencepost.gnu.org> <871tqneyvl.fsf@netris.org>
	<E1XatgY-00062K-7y@fencepost.gnu.org>
	<87d2a54t1m.fsf@yeeloong.lan> <83lhotme1e.fsf@gnu.org>
	<871tql17uw.fsf@yeeloong.lan>
Reply-To: Eli Zaretskii <eliz@gnu.org>
NNTP-Posting-Host: plane.gmane.org
X-Trace: ger.gmane.org 1412614093 32287 80.91.229.3 (6 Oct 2014 16:48:13 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Mon, 6 Oct 2014 16:48:13 +0000 (UTC)
Cc: dak@gnu.org, rms@gnu.org, dmantipov@yandex.ru, emacs-devel@gnu.org,
	handa@gnu.org, monnier@iro.umontreal.ca, stephen@xemacs.org
To: Mark H Weaver <mhw@netris.org>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Mon Oct 06 18:48:05 2014
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17])
	by plane.gmane.org with esmtp (Exim 4.69)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1XbBS3-00038D-2T
	for ged-emacs-devel@m.gmane.org; Mon, 06 Oct 2014 18:47:59 +0200
Original-Received: from localhost ([::1]:53055 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1XbBS2-00077N-P1
	for ged-emacs-devel@m.gmane.org; Mon, 06 Oct 2014 12:47:58 -0400
Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:53072)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <eliz@gnu.org>) id 1XbBRL-0006Hz-4F
	for emacs-devel@gnu.org; Mon, 06 Oct 2014 12:47:19 -0400
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <eliz@gnu.org>) id 1XbBRG-0003Kh-NX
	for emacs-devel@gnu.org; Mon, 06 Oct 2014 12:47:15 -0400
Original-Received: from mtaout27.012.net.il ([80.179.55.183]:33415)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <eliz@gnu.org>)
	id 1XbBRB-0003Go-Nb; Mon, 06 Oct 2014 12:47:06 -0400
Original-Received: from conversion-daemon.mtaout27.012.net.il by mtaout27.012.net.il
	(HyperSendmail v2007.08) id <0ND100O007M2IN00@mtaout27.012.net.il>;
	Mon, 06 Oct 2014 19:41:43 +0300 (IDT)
Original-Received: from HOME-C4E4A596F7 ([87.69.4.28]) by mtaout27.012.net.il
	(HyperSendmail v2007.08) with ESMTPA id
	<0ND100K3Z7PJJY40@mtaout27.012.net.il>;
	Mon, 06 Oct 2014 19:41:43 +0300 (IDT)
In-reply-to: <871tql17uw.fsf@yeeloong.lan>
X-012-Sender: halo1@inter.net.il
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x
X-Received-From: 80.179.55.183
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:175024
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/175024>

> From: Mark H Weaver <mhw@netris.org>
> Cc: dak@gnu.org,  rms@gnu.org,  dmantipov@yandex.ru,  emacs-devel@gnu.org,  handa@gnu.org,  monnier@iro.umontreal.ca,  stephen@xemacs.org
> Date: Mon, 06 Oct 2014 12:27:35 -0400
> 
> > The obvious solution is to encode the raw bytes internally in a UTF-8
> > compatible way.  Which is what Emacs does in its buffers and strings,
> > as I'm sure you know.  Can't Guile do something similar?
> 
> I'm afraid you've misunderstood, or perhaps I've failed to explain it
> clearly.

I think I did understand your perfectly clear explanation.

> It doesn't matter how these raw bytes are encoded internally.  No matter
> what mechanism we use to accomplish it, propagating invalid byte
> sequences by default is bad security policy.

How can we be responsible for byte streams that originated outside?
That's the responsibility of the source.  And if there is a consumer,
then it is their responsibility not to trip upon such bytes.

But how can you refuse to copy such bytes when you are just a pipe
that is expected not to change anything it wasn't told to?

Btw, Emacs doesn't expose the internal representation of these bytes
easily to Lisp programs.  That is, whenever any program tries to
access the character at that position, it gets the original raw byte
that was there before the string was read from outside.  A Lisp
program needs some very tricky and deliberate techniques to access the
internal representation of such bytes.  (It isn't "overlong", btw: we
just represent the 128 raw bytes as codepoints in the 0x3fffXX range
and encode them in UTF-8 with 5 bytes.)
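
To make the codepoint mapping concrete, here is a rough sketch in
Python.  It is only an illustration, not the Emacs source: the
function names are made up, and it assumes that the XX in 0x3fffXX is
the value of the raw byte itself and that the 128 bytes in question
are 0x80..0xFF.

```python
# Illustrative sketch only; names and exact range are assumptions.
RAW_BYTE_BASE = 0x3FFF00  # assumed: codepoint = 0x3FFF00 + byte value


def raw_byte_to_codepoint(byte: int) -> int:
    """Map a raw byte (0x80..0xFF) to a codepoint in the 0x3fffXX range."""
    if not 0x80 <= byte <= 0xFF:
        raise ValueError("only the 128 non-ASCII byte values are remapped")
    return RAW_BYTE_BASE + byte


def codepoint_to_raw_byte(codepoint: int) -> int:
    """Recover the original raw byte from its reserved codepoint."""
    if not 0x3FFF80 <= codepoint <= 0x3FFFFF:
        raise ValueError("not a raw-byte codepoint")
    return codepoint - RAW_BYTE_BASE
```

The point of the sketch is that the mapping is trivially reversible,
so a program that asks for the character at that position can be
handed the original byte back.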

> The Unicode standard requires that all UTF-8 codecs refuse to accept,
> produce, or propagate invalid byte sequences, including the troublesome
> overlong encodings.

What Emacs does is interpret each byte of such invalid byte sequences
as a separate raw byte, and represent each one of them internally as
described above.  Emacs cannot "refuse to propagate" the original
sequence, because users of an editor expect it not to alter any part
of the input that wasn't explicitly modified by the user or commands
she invoked.
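
For comparison, the same "keep every invalid byte and give it back
unchanged" policy can be sketched with Python's "surrogateescape"
error handler.  This is an analogy, not what Emacs does internally:
Python maps each undecodable byte to a lone surrogate in the
U+DC80..U+DCFF range rather than to the codepoints described above,
and the helper names here are hypothetical.

```python
# Analogy only: Python's surrogateescape handler, not Emacs's mechanism.

def decode_preserving(data: bytes) -> str:
    """Decode UTF-8, turning each undecodable byte into a lone surrogate."""
    return data.decode("utf-8", errors="surrogateescape")


def encode_preserving(text: str) -> bytes:
    """Encode back to bytes, restoring the escaped raw bytes unchanged."""
    return text.encode("utf-8", errors="surrogateescape")


# Valid UTF-8 mixed with an overlong-looking pair (C0 AF) and a stray FF.
original = b"caf\xc3\xa9 \xc0\xaf \xff end"
text = decode_preserving(original)

# The byte stream survives the round trip untouched.
assert encode_preserving(text) == original
```

Each byte of the invalid sequence is carried as its own unit, so
nothing the user did not touch is altered on the way out.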

> I'm not one for blindly following standards, but in my opinion this
> is the default policy we should adopt.

So just passing a string unaltered through a Guile program would
change that string?  That sounds like an unpleasant surprise for
users, at least for Emacs users.  Emacs was there around v20.x, and
we still carry the scars.  It would be unwise, IMO, for Guile to
repeat those same mistakes.