From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] Allow inserting non-BMP characters
Date: Tue, 26 Dec 2017 22:22:36 +0200
Message-ID: <83zi65grxv.fsf@gnu.org>
References: <20171225210115.13789-1-phst@google.com> <83d132hz9e.fsf@gnu.org> <834lodii55.fsf@gnu.org>
To: Philipp Stephani
Cc: phst@google.com, emacs-devel@gnu.org
In-reply-to: (message from Philipp Stephani on Tue, 26 Dec 2017 18:50:46 +0000)

> From: Philipp Stephani
> Date: Tue, 26 Dec 2017 18:50:46 +0000
> Cc: emacs-devel@gnu.org, phst@google.com
>
> I don't think we have a policy to prefer inline functions to macros,
> and I don't think we should have such a policy. We use inline
> functions when that's necessary, but we don't in general prefer them.
> They have their own problems, see the comments in lisp.h for some of
> that.
>
> Thanks, the only discussion I saw there was about some performance issues:

Let me make it more clear:

Macros are faster in non-optimized builds (that's why we have such a
complex setup with them in lisp.h).

Macros are also better for debugging, especially when you debug a core
file. Invoking a function when debugging needs a running process, so
it cannot be done when debugging a core file. And sometimes the
compiler doesn't keep a non-inlined version of an inline function, so
it cannot be called from the debugger at all.

On the downside, macros can be less readable when they are complex,
and have additional problems when you need local variables. (None of
that is relevant to the issue at hand.)

>
> and don't seem to be correct either (what about a value such as 0x11DC00?).
>
> ??? They are correct for UTF-16 sequences, which are 16-bit numbers.
> If you need to augment them by testing the high-order bits to be zero
> in your case, that's okay, but I don't see any need for introducing
> similar but different functionality.
>
> I'd be OK with using the macros since they already exist, but I wouldn't want to touch them without converting
> them to functions first, and for using them in nsterm.m I'd have to move them around.

You don't need to convert the macros to anything, just add a test that
you need, as in

  if (c < 0xFFFF && UTF_16_HIGH_SURROGATE_P (c))
    ...

>
> No new macros please if we can avoid it. Functions are strictly better.
>
> Sorry, I disagree. Each has its advantages, and on balance I find
> macros to be slightly better, certainly not worse. There's no need to
> avoid them in C.
>
> I disagree, see e.g. https://gcc.gnu.org/onlinedocs/cpp/Macro-Pitfalls.html and many other sources.
> Sometimes macros are unavoidable, but not here.

See above.

> Yes, but why do you first copy the input into a separate buffer? Why
> not convert each UTF-16 sequence separately, as you go through the
> loop?
>
> Message (method) invocations in Objective-C have high overhead because they are late-bound. Therefore it is
> advisable to minimize the number of messages sent.
> https://developer.apple.com/documentation/foundation/nsstring/1408720-getcharacters?language=objc also
> indicates that a (properly implemented) getCharacters call is faster than calling characterAtIndex in a loop.

Is that a fact, or should we measure that?
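To make the 0x11DC00 point concrete, here is a minimal sketch in C of the "just add a test" approach suggested above. The two macro definitions are stand-ins written for this sketch; I am assuming the real ones mask the value in a similar way, so check coding.h before relying on that. The guarded_* helper names are hypothetical and not part of Emacs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the macros discussed in this thread.  The mask-based
   definitions are an assumption made for this sketch.  */
#define UTF_16_HIGH_SURROGATE_P(val) (((val) & 0xFC00) == 0xD800)
#define UTF_16_LOW_SURROGATE_P(val) (((val) & 0xFC00) == 0xDC00)

/* Hypothetical helper: check that the value fits in 16 bits first,
   then reuse the existing macro, as in the snippet in the message.  */
static bool
guarded_high_surrogate_p (uint32_t c)
{
  return c < 0xFFFF && UTF_16_HIGH_SURROGATE_P (c);
}

/* Same idea for the low (trailing) surrogate.  */
static bool
guarded_low_surrogate_p (uint32_t c)
{
  return c < 0xFFFF && UTF_16_LOW_SURROGATE_P (c);
}
```

Under these assumed definitions, a value wider than 16 bits such as 0x11DC00 satisfies the bare low-surrogate macro, because the mask only looks at the low-order bits, while the guarded test rejects it. For genuine 16-bit UTF-16 code units the two tests agree.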