From: David Kastrup
Newsgroups: gmane.emacs.devel
Subject: Re: Emacs Lisp's future
Date: Sun, 12 Oct 2014 18:50:42 +0200
Message-ID: <87zjd1z0yl.fsf@fencepost.gnu.org>
In-Reply-To: <87zjd1ny1h.fsf@uwakimon.sk.tsukuba.ac.jp> (Stephen J. Turnbull's message of "Sun, 12 Oct 2014 23:49:14 +0900")
Cc: emacs-devel@gnu.org
To: "Stephen J. Turnbull"

"Stephen J. Turnbull" writes:

> Sigh. It is *Emacs* that assumes the world is full of valid data,

Nonsense. It would not need to _carefully_ _deal_ with data not
fitting an encoding if it assumed that. It _carefully_ decodes
non-representable data into a code page reserved for non-representable
data. It will deal _properly_ with that data while it is under control
of its strings (not upper/lowercasing it or mixing it up with other
stuff) and will carefully repackage it when encoding it.

As a consequence, it is easy to apply _any_ strategy to your data. If
you want to clean out characters that are invalid for your application,
any respective positive or negative character and coding ranges in a
regexp pattern will carefully deal with it.

> and happily shovels any hazmat it receives on to the next user or
> program without validation.

Emacs has no way to know what input is valid for the next user or
program. An application programmed in Elisp may know, and it has _all_
the tools to deal _gracefully_ with it since Emacs' string processing
will _not_ get confused by data it decoded itself and will preserve all
information.

> And you're right, it *is* a security problem. Not just denial of
> service, either. You say that behavior is what Emacs users want, and
> maybe it is. Because most of the time the data is "nearly" valid and
> the defects are "insignificant", and hardly a security problem. It's
> the "worse is better" philosophy.[1]

No, it is the "clueless is useless" philosophy. Don't second-guess
other systems. Do your job properly, regardless of what is thrown at
you. Don't be the weakest link in the chain.

Emacs cannot be a verification engine if it has no clue what it should
be verifying.
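
The decode, preserve, and re-encode behavior described above, with the application choosing its own cleanup, can be sketched outside Emacs. This is an illustrative analogue only, not Emacs' implementation: it assumes Python's `surrogateescape` error handler as a stand-in for the reserved range, and the helper names (`decode_lossless`, `strip_invalid`, and so on) are hypothetical.

```python
import re

# Assumption: Python's "surrogateescape" handler plays the role of the
# reserved range. Each undecodable byte 0x80..0xFF is mapped to a lone
# surrogate U+DC80..U+DCFF, so a plain character range finds them all.
RAW_BYTE = re.compile("[\udc80-\udcff]")


def decode_lossless(data: bytes) -> str:
    """Decode UTF-8, parking invalid bytes in the reserved range."""
    return data.decode("utf-8", errors="surrogateescape")


def encode_lossless(text: str) -> bytes:
    """Re-encode, turning parked characters back into the original bytes."""
    return text.encode("utf-8", errors="surrogateescape")


def is_clean(text: str) -> bool:
    """True if the string carries no parked invalid bytes."""
    return RAW_BYTE.search(text) is None


def strip_invalid(text: str) -> str:
    """One application-level strategy: drop the invalid bytes."""
    return RAW_BYTE.sub("", text)


def replace_invalid(text: str, marker: str = "\ufffd") -> str:
    """Another strategy: substitute a visible replacement character."""
    return RAW_BYTE.sub(marker, text)


# Valid UTF-8 ("café") followed by two bytes that are never valid in UTF-8.
raw = b"caf\xc3\xa9 \xff\xfe ok"
text = decode_lossless(raw)
```

Nothing is lost while the data is held as a string, so `encode_lossless(text)` gives back `raw` byte for byte. Validation, stripping, or replacement is then a choice made by the code that knows what the next consumer accepts. Printing `text` directly may fail on some terminals, because lone surrogates cannot be encoded without the same error handler.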
If you know what you want, you can get it. Regardless of what you want.

libunistring (which is what GUILE currently uses for UTF-8 processing)
has a _closed_ set of recovery strategies. As it stands, it is useless
for implementing Emacs-like behavior because "encode invalid bytes into
something libunistring can deal with transparently" is not part of its
recovery strategies.

Once you _have_ a useful encoding into the space of properly working
strings, _any_ recovery strategy is easy to implement on top of that.

For a platform, being forced to a closed set of behaviors is an
extremely limiting choice.

> But the rest of the software development world is going in the
> opposite direction. "In God we trust. All others, present photo ID."
> Maybe they have figured something out? Heck, even Emacs is moving in
> the direction of defending *itself* from invalid data in other ways
> (thank you, Ted Z!)

You don't need to defend yourself from something you are equipped to
deal with.

-- 
David Kastrup