From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Dynamic loading progress
Date: Sat, 21 Nov 2015 15:23:37 +0200
Message-ID: <83d1v34hba.fsf@gnu.org>
To: Philipp Stephani
Cc: aurelien.aptel+emacs@gmail.com, tzz@lifelogs.com, emacs-devel@gnu.org

> From: Philipp Stephani
> Date: Sat, 21 Nov 2015 12:11:45 +0000
> Cc: tzz@lifelogs.com, aurelien.aptel+emacs@gmail.com, emacs-devel@gnu.org
>
>     No, we cannot, or rather should not. It is unreasonable to expect
>     external modules to know the intricacies of the internal
>     representation. Most Emacs hackers don't.
>
> Fine with me, but how would we then represent Emacs strings that are not valid
> Unicode strings? Just raise an error?

No need to raise an error. Strings that are returned to modules
should be encoded into UTF-8. That encoding already takes care of
these situations: it either produces the UTF-8 encoding of the
equivalent Unicode characters, or outputs raw bytes. We are using
this all the time when we save files or send stuff over the network.

>     No, I meant strict UTF-8, not its Emacs extension.
>
> That would be possible and provide a clean interface.
> However, Emacs strings
> are extended, so we'd need to specify how they interact with UTF-8 strings.
>
> * If a module passes a char sequence that's not a valid UTF-8 string, but a
> valid Emacs multibyte string, what should happen? Error, undefined behavior,
> silently accepted?

We are quite capable of quietly accepting such strings, so that is
what I would suggest. Doing so would be in line with what Emacs does
when such invalid sequences come from other sources, like files.

> * If copy_string_contents is passed an Emacs string that is not a valid Unicode
> string, what should happen?

How can that happen? The Emacs string comes from the Emacs bowels, so
it must be a "valid" string by Emacs standards. Or maybe I don't
understand what you mean by "invalid Unicode string".

In any case, we already deal with any such problems when we save a
buffer to a file, or send it over the network. This isn't some new
problem we need to cope with.

> OK, then we can use that, of course. The question of handling invalid UTF-8
> strings is still open, though, as make_multibyte_string doesn't enforce valid
> UTF-8.

It doesn't enforce valid UTF-8 because it can handle invalid UTF-8 as
well. That's by design.

> If it's the contract of make_multibyte_string that it will always accept UTF-8,
> then that should be added as a comment to that function. Currently I don't see
> it documented anywhere.

That part of the documentation is only revealed to veteran Emacs
hackers, subject to swearing not to reveal that to the uninitiated and
to some blood-letting that seals the oath ;-)
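
As a rough illustration of the two behaviors discussed in this message (rejecting anything that is not strict UTF-8, versus quietly accepting invalid sequences and writing the raw bytes back out unchanged), here is a small Python sketch. It is an analogy only: Python's `surrogateescape` error handler stands in for the raw-byte handling described above, the helper names are hypothetical, and nothing here is Emacs code or a description of how make_multibyte_string or copy_string_contents are implemented.

```python
# Illustration only: Python's codecs stand in for the behavior described
# in the message. None of these helpers exist in Emacs.


def is_strict_utf8(data: bytes) -> bool:
    """Return True if data is well-formed ("strict") UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def accept_leniently(data: bytes) -> str:
    """Decode without raising; undecodable bytes are kept as escape code points."""
    return data.decode("utf-8", errors="surrogateescape")


def write_back(text: str) -> bytes:
    """Encode again, turning the escaped code points back into the original raw bytes."""
    return text.encode("utf-8", errors="surrogateescape")


valid = "h\u00e9llo".encode("utf-8")
invalid = b"abc\xff\xfedef"  # 0xFF and 0xFE never occur in well-formed UTF-8
```

Under these assumptions, `is_strict_utf8(invalid)` is false, yet `write_back(accept_leniently(invalid))` gives back the same bytes that went in, which is the "quietly accept, then output raw bytes" behavior suggested above. A strict interface would instead have to signal an error for `invalid`.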