From: David Kastrup
Newsgroups: gmane.emacs.devel
Subject: Re: Dynamic loading progress
Date: Sat, 21 Nov 2015 11:45:48 +0100
Message-ID: <87wptb3a1v.fsf@fencepost.gnu.org>
To: Philipp Stephani
Cc: aurelien.aptel+emacs@gmail.com, Eli Zaretskii, tzz@lifelogs.com,
    emacs-devel@gnu.org

Philipp Stephani writes:

> Eli Zaretskii wrote on Sat, 21 Nov 2015 at 10:30:
>
>> This is what my comments were about.  I think that you, by contrast,
>> are talking about the encoding of the _input_ strings, in this case
>> the 'documentation' argument to module_make_function and 'str'
>> argument to module_make_string.  My assumption was that these
>> arguments will always have to be in UTF-8 encoding; if that
>> assumption is true, then no decoding via code_convert_string_norecord
>> is necessary, since make_multibyte_string will DTRT.  We can (and
>> probably should) document the fact that all non-ASCII strings must be
>> UTF-8 encoded as a requirement of the emacs-module interface.
>>
>
> Or rather, an extension to UTF-8 capable of encoding surrogate code
> points and numbers that are not code points, as described in
> https://www.gnu.org/software/emacs/manual/html_node/elisp/Text-Representations.html
> .

That's mostly irrelevant for fixed strings, as valid UTF-8 is encoded
as itself and invalid UTF-8 is encoded as well-processable invalid
UTF-8.  Apart from strings generated with `string-as-multibyte' (what
kind of terrible idea is that?) or unibyte strings, all Emacs strings
are well-processable (meaning that they are represented by a start byte
and extension bytes, with the start byte encoding the byte count 1-6 in
the typical UTF-8 manner, even if UTF-8 itself does not go beyond the
Unicode range encodable in 4 bytes).

> Yes, provided the internal Emacs encoding is stable.

It has changed several times in the past.  It's a reasonably good bet
that the basics of its current UTF-8 scheme will stick around.

-- 
David Kastrup
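
[Illustration, not part of the original message.]  The "start byte
plus extension bytes" layout described above can be sketched roughly as
follows.  This is not Emacs code: the function names sequence_length
and split_sequences are hypothetical, and the 1-6 byte range is taken
from the message as written rather than checked against the Emacs
sources.  The sketch only shows how a byte string can be cut into
sequences by looking at each start byte, without knowing what the
sequences mean.

```python
from typing import List


def sequence_length(lead: int) -> int:
    """Return the sequence length announced by a start byte.

    Returns 0 if the byte cannot start a sequence (an extension
    byte 10xxxxxx, or 0xFE/0xFF).  Hypothetical helper.
    """
    if not 0 <= lead <= 0xFF:
        raise ValueError("not a byte value")
    if lead < 0x80:
        return 1  # plain ASCII: a one-byte sequence
    # Count the leading 1 bits; in the UTF-8 manner this is the
    # total number of bytes in the sequence.
    ones = 0
    mask = 0x80
    while lead & mask:
        ones += 1
        mask >>= 1
    # One leading 1 bit marks an extension byte; 2 to 6 mark start
    # bytes (the 1-6 range mentioned in the message).
    return ones if 2 <= ones <= 6 else 0


def split_sequences(data: bytes) -> List[bytes]:
    """Split data into start-byte/extension-byte sequences.

    Raises ValueError if the data is not "well-processable" in the
    sense used above.  Hypothetical helper.
    """
    sequences = []
    i = 0
    while i < len(data):
        n = sequence_length(data[i])
        if n == 0:
            raise ValueError(f"byte at offset {i} is not a start byte")
        chunk = data[i:i + n]
        # Every byte after the start byte must look like 10xxxxxx.
        if len(chunk) != n or any((b & 0xC0) != 0x80 for b in chunk[1:]):
            raise ValueError(f"truncated or malformed sequence at offset {i}")
        sequences.append(chunk)
        i += n
    return sequences
```

For example, split_sequences("a€".encode("utf-8")) yields a one-byte
sequence followed by a three-byte one, and a five-byte sequence
starting with 0xF8 is split off in the same way even though it lies
outside what standard UTF-8 allows.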