Eli Zaretskii wrote on Tue, Dec 26, 2017 at 17:11:

> > From: Philipp Stephani
> > Date: Tue, 26 Dec 2017 10:35:42 +0000
> > Cc: emacs-devel@gnu.org, phst@google.com
> >
> > > Suggest to move surrogates_to_codepoint to coding.c, and then use
> > > the macros UTF_16_HIGH_SURROGATE_P and UTF_16_LOW_SURROGATE_P
> > > defined there.
> >
> > Hmm, I'd rather go the other way round and remove these macros
> > later.  They are macros, thus worse than functions,
>
> I don't think we have a policy to prefer inline functions to macros,
> and I don't think we should have such a policy.  We use inline
> functions when that's necessary, but we don't in general prefer them.
> They have their own problems, see the comments in lisp.h for some of
> that.

Thanks; the only discussion I saw there was about some performance
issues:

    Some operations are so commonly executed that they are implemented
    as macros, not functions, because otherwise runtime performance
    would suffer too much when compiling with GCC without optimization.
    FIXME: Remove the lisp_h_OP macros, and define just the inline OP
    functions, once "gcc -Og" (new to GCC 4.8) works well enough for
    Emacs developers.  Maybe in the year 2020.  See Bug#11935.

That bug says that GCC 4.8 should be available in Debian stable to
support -Og.  Debian stable now already has GCC 6.3
(https://wiki.debian.org/DebianStretch#Packages_.26_versions), so maybe
it's time to compile development versions with -Og and try to reapply
the original patch.  Five years have passed, and compilers have gotten
a lot better.

In any case, the new functions are certainly not commonly executed.
They are currently only executed when converting non-BMP keystrokes on
macOS, which is rare enough.

> > and don't seem to be correct either (what about a value such as
> > 0x11DC00?).
>
> ???  They are correct for UTF-16 sequences, which are 16-bit numbers.
> If you need to augment them by testing the high-order bits to be zero
> in your case, that's okay, but I don't see any need for introducing
> similar but different functionality.

I'd be OK with using the macros since they already exist, but I
wouldn't want to touch them without converting them to functions first,
and for using them in nsterm.m I'd have to move them around.

> > No new macros please if we can avoid it.  Functions are strictly
> > better.
>
> Sorry, I disagree.  Each has its advantages, and on balance I find
> macros to be slightly better, certainly not worse.  There's no need
> to avoid them in C.

I disagree; see e.g.
https://gcc.gnu.org/onlinedocs/cpp/Macro-Pitfalls.html and many other
sources.  Sometimes macros are unavoidable, but not here.

> > I don't care much whether they are in character.h or coding.h, but
> > char_surrogate_p is already in character.h.
>
> char_surrogate_p should have used the coding.h macros as well.

> > > > +  USE_SAFE_ALLOCA;
> > > > +  unichar *utf16_buffer;
> > > > +  SAFE_NALLOCA (utf16_buffer, 1, len);
> > >
> > > Maximum length of a UTF-16 sequence is known in advance, so why
> > > do you need SAFE_NALLOCA here?  Couldn't you use a buffer of
> > > fixed length instead?
> >
> > The text being inserted can be arbitrarily long.  Even single
> > characters (i.e. extended grapheme clusters) can be arbitrarily
> > long.
>
> Yes, but why do you first copy the input into a separate buffer?  Why
> not convert each UTF-16 sequence separately, as you go through the
> loop?

Message (method) invocations in Objective-C have high overhead because
they are late-bound, so it is advisable to minimize the number of
messages sent.
https://developer.apple.com/documentation/foundation/nsstring/1408720-getcharacters?language=objc
also indicates that a (properly implemented) getCharacters call is
faster than calling characterAtIndex in a loop.
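For reference, here is a minimal C sketch of the surrogate
classification and pair combination discussed above.  The names mirror
the UTF_16_HIGH_SURROGATE_P/UTF_16_LOW_SURROGATE_P macros in coding.c
and the patch's surrogates_to_codepoint, but the bodies below are
illustrative stand-ins, not the actual Emacs code (the real signatures
are not shown in this thread).  It also shows the point about non-16-bit
values: without an explicit range check, a value such as 0x11DC00 would
pass the pure bit-mask test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative inline-function stand-ins for the
   UTF_16_HIGH_SURROGATE_P and UTF_16_LOW_SURROGATE_P macros in
   coding.c.  The bit-mask test is only meaningful for 16-bit code
   units, so wider values are rejected explicitly; without that guard,
   0x11DC00 & 0xFC00 == 0xDC00 would misclassify it as a low
   surrogate.  */
static bool
utf16_high_surrogate_p (uint32_t c)
{
  return c <= 0xFFFF && (c & 0xFC00) == 0xD800;
}

static bool
utf16_low_surrogate_p (uint32_t c)
{
  return c <= 0xFFFF && (c & 0xFC00) == 0xDC00;
}

/* Combine a high/low surrogate pair into a Unicode scalar value,
   roughly what a function like the patch's surrogates_to_codepoint
   would do.  */
static uint32_t
surrogates_to_codepoint (uint32_t high, uint32_t low)
{
  assert (utf16_high_surrogate_p (high));
  assert (utf16_low_surrogate_p (low));
  return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
}
```

For example, the pair 0xD83D/0xDE00 combines to U+1F600.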
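One concrete pitfall from the GCC page cited above is duplication of
side effects: a macro whose parameter appears more than once in the
expansion evaluates its argument more than once.  A small sketch, using
a hypothetical SURROGATE_P macro (not the Emacs one) next to an inline
function doing the same test:

```c
#include <assert.h>
#include <stdbool.h>

/* A hypothetical surrogate test written as a macro.  Because the
   parameter c appears twice in the expansion, an argument with side
   effects is evaluated twice.  */
#define SURROGATE_P(c) ((c) >= 0xD800 && (c) <= 0xDFFF)

/* The same test as an inline function evaluates its argument exactly
   once.  */
static inline bool
surrogate_p (unsigned int c)
{
  return c >= 0xD800 && c <= 0xDFFF;
}

static int calls;

/* A reader with a side effect: it counts its invocations.  */
static unsigned int
read_unit (void)
{
  ++calls;
  return 0xD83D;  /* a high surrogate */
}
```

SURROGATE_P (read_unit ()) bumps the counter twice, while
surrogate_p (read_unit ()) bumps it once; this is the kind of surprise
that motivates preferring functions where performance permits.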