From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Dynamic loading progress
Date: Sun, 22 Nov 2015 21:43:25 +0200
Message-ID: <83ziy5252a.fsf@gnu.org>
References: <83k2ptq5t3.fsf@gnu.org> <87h9kxx60e.fsf@lifelogs.com>
 <877flswse5.fsf@lifelogs.com> <8737wgw7kf.fsf@lifelogs.com>
 <87io5bv1it.fsf@lifelogs.com> <87egfzuwca.fsf@lifelogs.com>
 <876118u6f2.fsf@lifelogs.com> <8737w3qero.fsf@lifelogs.com>
 <831tbn9g9j.fsf@gnu.org> <878u5upw7o.fsf@lifelogs.com>
 <83ziya8xph.fsf@gnu.org> <83y4du80xo.fsf@gnu.org>
 <837fld6lps.fsf@gnu.org> <83si3z4s5n.fsf@gnu.org>
 <83mvu74nhm.fsf@gnu.org> <83d1v34hba.fsf@gnu.org>
 <83egfh3o7n.fsf@gnu.org>
Reply-To: Eli Zaretskii
To: Philipp Stephani
Cc: aurelien.aptel+emacs@gmail.com, tzz@lifelogs.com, emacs-devel@gnu.org
Xref: news.gmane.org gmane.emacs.devel:195059

> From: Philipp Stephani
> Date: Sun, 22 Nov 2015 19:10:44 +0000
> Cc: tzz@lifelogs.com, aurelien.aptel+emacs@gmail.com, emacs-devel@gnu.org
>
> It is only used in one place: the internal representation of
> characters in buffers and strings. Emacs _never_ lets this internal
> representation leak outside.
>
> If I run in scratch:
>
> (with-temp-buffer
>   (insert #x3fff40)
>   (describe-char (point-min)))

Emacs will never find such "byte" in any text. So this feature is not
really relevant to the issue at hand.

> Then the resulting help buffer says "buffer code: #xF8 #x8F #xBF #xBD #x80", is
> that not considered a leak?

No. You created this yourself, and got what you asked for.

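For reference, the five bytes quoted above are what a UTF-8-style layout yields once it is extended past the 4-byte forms. The sketch below is only an illustration in Python, not Emacs code: the function name `encode_extended` is hypothetical, and it assumes the plain 5-byte pattern (a `111110xx` lead byte followed by four `10xxxxxx` continuation bytes) for a value such as #x3fff40.

```python
def encode_extended(code_point: int) -> bytes:
    """UTF-8-style encoding extended with a 5-byte form (illustrative only)."""
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    if code_point <= 0x7F:
        return bytes([code_point])
    # (lead-byte prefix, number of continuation bytes, largest encodable value)
    forms = [
        (0xC0, 1, 0x7FF),
        (0xE0, 2, 0xFFFF),
        (0xF0, 3, 0x1FFFFF),
        (0xF8, 4, 0x3FFFFFF),
    ]
    for prefix, n_cont, limit in forms:
        if code_point <= limit:
            # The lead byte carries the top bits; each continuation byte carries 6 bits.
            lead = prefix | (code_point >> (6 * n_cont))
            tail = [0x80 | ((code_point >> (6 * i)) & 0x3F)
                    for i in range(n_cont - 1, -1, -1)]
            return bytes([lead] + tail)
    raise ValueError("code point too large for a 5-byte form")


# The value inserted in the quoted example:
print(" ".join(f"#x{b:02X}" for b in encode_extended(0x3FFF40)))
# prints: #xF8 #x8F #xBF #xBD #x80
```

For values up to U+10FFFF (surrogates excepted) the same function agrees with ordinary UTF-8, so the 5-byte form only appears for code points a module would have to construct deliberately.
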
More generally, can you imagine a real-life situation where a string
with such "bytes" could be received from a module, as part of a C
'char *' string?

> You are suggesting to expose the internal representation to outside
> application code, which predictably will cause that representation to
> leak into Lisp. That'd be a disaster. We had something like that
> back in the Emacs 20 era, and it took many years to plug those leaks.
> We would be making a grave mistake to go back there.
>
> I don't suggest leaking anything that isn't already leaked. The extension of
> the codespace to 22 bits is well documented.

I don't think it's reasonable to request that module authors read all
that stuff and understand it, before they can write a simple module
that manipulates non-ASCII text. Writing such modules shouldn't be that
hard.

> Returning raw bytes means that encoding and decoding isn't a perfect roundtrip:
>
> (decode-coding-string (encode-coding-string (string #x3fffc2 #x3fffbb)
> 'utf-8-unix) 'utf-8-unix)
> "»"

If you start with raw bytes, not large integers, then the roundtrip
will be perfect.

> What are the exact differences between the approaches? As far as I can see
> differences exist only for the following points:
> - Accepting invalid sequences. I consider that a bug in general-purpose APIs,
> including decode-coding-string. However, given that Emacs already extends the
> Unicode codespace and therefore has to accept some invalid sequences anyway, it
> might be OK if it's clearly documented.
> - Emitting raw bytes instead of extended sequences. Though I'm not a fan of
> this it might be unavoidable to be able to treat strings transparently (which
> is desirable).

Then I think we agree after all.

Thanks.
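
The round-trip point above can be illustrated outside Emacs. The following Python sketch is only an analogy: Python's `surrogateescape` error handler stands in for Emacs's raw-byte characters, and none of this is the module API. It shows that bytes which are not valid UTF-8 survive decode-then-encode unchanged, whereas a string built by hand from the escape values (the counterpart of starting from large integers) encodes to bytes that decode as an ordinary character.

```python
# Bytes that are not valid UTF-8 (Latin-1 style 0xE9, plus a stray 0xFF).
raw = b"caf\xe9 \xff"

# Starting from bytes: decode, then encode. Undecodable bytes are carried
# through as escape code points and restored on the way out.
text = raw.decode("utf-8", errors="surrogateescape")
assert text.encode("utf-8", errors="surrogateescape") == raw

# Starting from the escape values themselves, by analogy with
# (string #x3fffc2 #x3fffbb): the two escapes encode to the bytes C2 BB,
# which happen to be the valid UTF-8 sequence for U+00BB.
handmade = "\udcc2\udcbb"
encoded = handmade.encode("utf-8", errors="surrogateescape")
decoded = encoded.decode("utf-8", errors="surrogateescape")

print(encoded)              # prints: b'\xc2\xbb'
print(decoded == handmade)  # prints: False
```

In this analogy the asymmetry only shows up when the text side is constructed directly from escape values; data that arrives as bytes and leaves as bytes comes back unchanged.
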