From: David Kastrup
Newsgroups: gmane.emacs.devel
Subject: Re: Emacs Lisp's future
Date: Tue, 07 Oct 2014 20:56:31 +0200
Message-ID: <87y4sr909s.fsf@fencepost.gnu.org>
References: <54193A70.9020901@member.fsf.org>
 <87lhp5m99w.fsf@fencepost.gnu.org> <87h9ztm5oa.fsf@fencepost.gnu.org>
 <87d2ahm3nw.fsf@fencepost.gnu.org> <871tqneyvl.fsf@netris.org>
 <87d2a54t1m.fsf@yeeloong.lan> <83lhotme1e.fsf@gnu.org>
 <871tql17uw.fsf@yeeloong.lan> <838uktm9gw.fsf@gnu.org>
 <87h9zgarvp.fsf@fencepost.gnu.org> <87mw97rjwm.fsf@yeeloong.lan>
 <8761fvn8io.fsf@yeeloong.lan> <87egujahw6.fsf@fencepost.gnu.org>
 <87wq8bd8w2.fsf@netris.org>
In-Reply-To: <87wq8bd8w2.fsf@netris.org> (Mark H. Weaver's message of
 "Tue, 07 Oct 2014 14:36:45 -0400")
To: Mark H Weaver
Cc: Richard Stallman, Andreas Schwab, dmantipov@yandex.ru,
 emacs-devel@gnu.org, handa@gnu.org, monnier@iro.umontreal.ca,
 Eli Zaretskii, stephen@xemacs.org
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4.50 (gnu/linux)

Mark H Weaver writes:

> Relying on users to explicitly sanitize the result of decoding UTF-8
> to check for "raw bytes", and to explicitly check for "raw bytes"
> before encoding UTF-8 (as if that term didn't already have a
> well-known meaning that excludes arbitrary byte sequences) is a recipe
> for security holes.

You are calling "application programmers" here "users" and call them
incapable of designing their application. Any application in need of
sanitizing will not stop in its requirements at UTF-8 sanitization. You
cannot successfully cater for clueless application programmers.

And nobody says that GUILE should _crash_ when provided non-sanitized
UTF-8.
It has to be able to deal with everything thrown at it. And you
want it to _not_ do that by default. That means that _any_ programmer
wanting to do his own verification will not be able to use _any_ module
provided by someone else which does not explicitly override the
defaults, since then modules he has no control over will refuse to
cooperate.

GUILE is an extension language and system. It should _not_ do policing.
Every attempt at policing makes it impossible to design the policing
into the place where it makes sense.

Worse, it leads to sloppy code since then people start to consider an
internal UTF-8 based encoding to be identical to an external UTF-8
encoding, making it _impossible_ to design byte-transparent workflows.
That is the current state of GUILE 2, and as an application programmer
I can testify that it is a huge headache, both in practice and
conceptually.

I am glad that Emacs started its history with a multibyte encoding
incompatible with any external encoding since that has given it lots of
impetus to get that distinction right.

With the "we don't want to cater for raw bytes by default" attitude
you'll never get away in a reasonably reliable manner from the "our code
will not deal with raw bytes" situation you have now with regard to
string manipulation. It took Emacs years to get this into a really
reliable and good state, with many more active users of multibyte
character sets than GUILE has.

-- 
David Kastrup
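
Editorial illustration, not part of the original message: the
"byte-transparent workflow" argued for above can be sketched in Python,
whose "surrogateescape" error handler is used here only as an analogy
for a raw-byte representation inside strings. It is not how Emacs or
GUILE implement this, and the function names are hypothetical. Decoding
accepts arbitrary input, encoding restores it byte for byte, and the
sanitizing check is a separate step the application calls where it
makes sense.

```python
# Illustration only: Python's "surrogateescape" handler stands in for the
# raw-byte representation discussed in the message. All function names
# below are hypothetical and chosen for this sketch.


def decode_transparent(data: bytes) -> str:
    """Decode UTF-8, keeping undecodable bytes as lone surrogates."""
    return data.decode("utf-8", errors="surrogateescape")


def encode_transparent(text: str) -> bytes:
    """Encode to bytes, restoring any escaped raw bytes unchanged."""
    return text.encode("utf-8", errors="surrogateescape")


def has_raw_bytes(text: str) -> bool:
    """Explicit policy check: does the string carry escaped raw bytes?

    surrogateescape maps each undecodable byte 0x80..0xFF to the lone
    surrogate U+DC80..U+DCFF, so that range is what we look for.
    """
    return any(0xDC80 <= ord(ch) <= 0xDCFF for ch in text)


# Valid UTF-8 ("café") followed by two stray bytes that are not UTF-8.
raw = b"caf\xc3\xa9 \xff\xfe tail"
text = decode_transparent(raw)

# The round trip is byte-exact even though the input was not valid UTF-8.
assert encode_transparent(text) == raw

# Sanitizing is a separate, explicit decision made by the application.
assert has_raw_bytes(text)
assert not has_raw_bytes(decode_transparent("café".encode("utf-8")))
```

In this sketch a strict `text.encode("utf-8")` would raise
`UnicodeEncodeError` for the string above, which is the kind of
refuse-by-default behaviour the message argues should not be forced on
every module.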