From: Mark H Weaver
Newsgroups: gmane.emacs.devel
Subject: Re: Emacs Lisp's future
Date: Tue, 07 Oct 2014 14:36:45 -0400
Message-ID: <87wq8bd8w2.fsf@netris.org>
In-Reply-To: <87egujahw6.fsf@fencepost.gnu.org> (David Kastrup's message of "Tue, 07 Oct 2014 19:50:33 +0200")
To: David Kastrup
Cc: Richard Stallman, Andreas Schwab, dmantipov@yandex.ru, emacs-devel@gnu.org, handa@gnu.org, monnier@iro.umontreal.ca, Eli Zaretskii, stephen@xemacs.org

David Kastrup writes:

> Mark H Weaver writes:
>
>> Andreas Schwab writes:
>>
>>> Mark H Weaver writes:
>>>
>>>> However, if the overlong sequence came from the network, and Emacs
>>>> propagates it unchanged to internal subsystems[*] (e.g. via command-line
>>>> arguments to subprocesses), that's not good.  It exposes another program
>>>> to invalid input -- a program that might not be designed for exposure to
>>>> possible attacks via overlong encodings.
>>>
>>> At least it doesn't make it worse (it is unchanged from the situation if
>>> you remove Emacs as a filter).
>>
>> In the case of mere "filtering", you might be right in some cases.
>>
>> However, the case I'm worried about is where some small piece of the
>> hostile input is extracted and passed as an argument to another program.
>> In cases like this it doesn't make sense to think of emacs as a
>> "filter", and you'd never be able to "remove" it.
>>
>> It's like saying that a web application that passes unsanitized input to
>> an SQL query "doesn't make it worse", and that the situation is
>> unchanged from if you provided public access to the SQL database.
>
> If GUILE or Emacs is supposed to sanitize input, you tell it to sanitize
> input.  That's different from GUILE/Emacs deciding over your head what
> is good for your application.

I've already said more than once that I agree Guile and Emacs should
provide the *option* to handle invalid byte sequences transparently, if
explicitly requested to do so, and furthermore that this is appropriate
default behavior when editing files.

What I'm saying is that in most other cases, the codecs should be
strict, and therefore this should be the default behavior of the
underlying functions.  When users call an Emacs function to decode
UTF-8, it should report an error if that input isn't actually UTF-8.
Conversely, when encoding UTF-8, the output should be UTF-8 and not some
arbitrary byte sequence.

Relying on users to explicitly sanitize the result of decoding UTF-8 to
check for "raw bytes", and to explicitly check for "raw bytes" before
encoding UTF-8 (as if that term didn't already have a well-known meaning
that excludes arbitrary byte sequences) is a recipe for security holes.

      Mark
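
Editorial illustration, not part of the original message: the
"strict by default, transparent only on request" design argued for above
can be sketched in Python, whose UTF-8 codec happens to work this way.
This is only an analogy, not Emacs or Guile code.  The function names
are hypothetical, and Python's "surrogateescape" error handler stands in
for Emacs's "raw bytes" as an assumed rough equivalent.

```python
# Overlong (invalid) two-byte encoding of "/" (U+002F).
# The only valid UTF-8 encoding of "/" is the single byte b"/".
OVERLONG_SLASH = b"\xc0\xaf"


def decode_strict(data: bytes) -> str:
    """Default behavior: raise UnicodeDecodeError unless data is valid UTF-8."""
    return data.decode("utf-8")  # errors="strict" is Python's default


def decode_transparent(data: bytes) -> str:
    """Explicit opt-in: keep invalid bytes as lone surrogates so they round-trip."""
    return data.decode("utf-8", errors="surrogateescape")


def encode_strict(text: str) -> bytes:
    """Default behavior: the output is valid UTF-8, or UnicodeEncodeError is raised."""
    return text.encode("utf-8")


def encode_transparent(text: str) -> bytes:
    """Explicit opt-in: turn escaped surrogates back into the original raw bytes."""
    return text.encode("utf-8", errors="surrogateescape")


if __name__ == "__main__":
    try:
        decode_strict(OVERLONG_SLASH)
    except UnicodeDecodeError as err:
        print("strict decode rejected the overlong sequence:", err.reason)

    # File-editing style handling: the caller asked for byte-exact round-tripping.
    text = decode_transparent(OVERLONG_SLASH)
    print("round-trips:", encode_transparent(text) == OVERLONG_SLASH)

    try:
        encode_strict(text)  # escaped bytes cannot leak out as "UTF-8"
    except UnicodeEncodeError as err:
        print("strict encode refused escaped bytes:", err.reason)
```

In this sketch a caller that extracts a fragment of hostile input and
passes it to a subprocess gets an error from the default functions
instead of silently forwarding the overlong bytes.  Only code that
explicitly selects the transparent variants, such as a file editor, ever
sees them.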