From: Philipp Stephani
> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Tue, 26 Dec 2017 10:35:42 +0000
> Cc: emacs-devel@gnu.org, phst@google.com
>
>  Suggest to move surrogates_to_codepoint to coding.c, and then use the
>  macros UTF_16_HIGH_SURROGATE_P and UTF_16_LOW_SURROGATE_P defined
>  there.
>
> Hmm, I'd rather go the other way round and remove these macros later. They are macros, thus worse than
> functions,
I don't think we have a policy to prefer inline functions to macros,
and I don't think we should have such a policy.  We use inline
functions when that's necessary, but we don't in general prefer them.
They have their own problems, see the comments in lisp.h for some of
that.
> and don't seem to be correct either (what about a value such as 0x11DC00?).
??? They are correct for UTF-16 sequences, which are 16-bit numbers.
If you need to augment them by testing the high-order bits to be zero
in your case, that's okay, but I don't see any need for introducing
similar but different functionality.
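
A minimal sketch of that augmentation, for illustration only.  It
assumes the coding.h macros are mask-based tests on the low 16 bits;
the macro and function names below are hypothetical stand-ins, not
the actual Emacs definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the coding.h macros (assumed shape):
   they inspect only bits 10..15, so they are meant for 16-bit
   UTF-16 code units.  */
#define HIGH_SURROGATE_P(val) (((val) & 0xFC00) == 0xD800)
#define LOW_SURROGATE_P(val) (((val) & 0xFC00) == 0xDC00)

/* Wider variants for callers holding values that may exceed 16 bits:
   first require the high-order bits to be zero, then reuse the
   16-bit test.  */
static bool
wide_high_surrogate_p (uint_least32_t c)
{
  return (c >> 16) == 0 && HIGH_SURROGATE_P (c);
}

static bool
wide_low_surrogate_p (uint_least32_t c)
{
  return (c >> 16) == 0 && LOW_SURROGATE_P (c);
}

/* Combine a valid high/low surrogate pair into one code point.  */
static uint_least32_t
combine_surrogates (uint_least32_t high, uint_least32_t low)
{
  return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
}
```

With these definitions the mask-only test accepts 0x11DC00, while the
wide variant rejects it and still accepts 0xDC00.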
> No new macros please if we can avoid it. Functions are strictly better.
Sorry, I disagree.  Each has its advantages, and on balance I find
macros to be slightly better, certainly not worse.  There's no need to
avoid them in C.
> I don't care much whether they are in character.h or coding.h, but char_surrogate_p is already in character.h.
char_surrogate_p should have used the coding.h macros as well.
>  > +  USE_SAFE_ALLOCA;
>  > +  unichar *utf16_buffer;
>  > +  SAFE_NALLOCA (utf16_buffer, 1, len);
>
>  Maximum length of a UTF-16 sequence is known in advance, so why do you
>  need SAFE_NALLOCA here?  Couldn't you use a buffer of fixed length
>  instead?
>
> The text being inserted can be arbitrarily long. Even single characters (i.e. extended grapheme clusters) can
> be arbitrarily long.
Yes, but why do you first copy the input into a separate buffer?  Why
not convert each UTF-16 sequence separately, as you go through the
loop?
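
A rough sketch of such a loop, under stated assumptions: the input is
read one code unit at a time through an accessor, so no temporary
copy of the whole text is needed.  All names here (utf16_unit,
unit_at_fn, decode_utf16_incrementally, and the two helpers) are
hypothetical and not part of the patch; unpaired surrogates are
simply passed through unchanged.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint_least16_t utf16_unit;  /* hypothetical stand-in for unichar */

/* Accessor returning the code unit at INDEX in an opaque SOURCE.  */
typedef utf16_unit (*unit_at_fn) (const void *source, size_t index);

/* Walk LEN code units of SOURCE, combining surrogate pairs on the
   fly and handing each resulting code point to EMIT.  Return the
   number of code points emitted.  */
static size_t
decode_utf16_incrementally (const void *source, size_t len,
                            unit_at_fn unit_at,
                            void (*emit) (uint_least32_t c, void *data),
                            void *data)
{
  size_t count = 0;
  for (size_t i = 0; i < len; i++)
    {
      uint_least32_t c = unit_at (source, i);
      if ((c & 0xFC00) == 0xD800 && i + 1 < len)
        {
          uint_least32_t low = unit_at (source, i + 1);
          if ((low & 0xFC00) == 0xDC00)
            {
              /* Valid pair: combine it and consume the low unit.  */
              c = 0x10000 + ((c - 0xD800) << 10) + (low - 0xDC00);
              i++;
            }
        }
      emit (c, data);
      count++;
    }
  return count;
}

/* Example accessor for a source that is a plain array of units.  */
static utf16_unit
array_unit_at (const void *source, size_t index)
{
  return ((const utf16_unit *) source)[index];
}

/* Example sink that records up to 8 code points.  */
struct codepoint_log
{
  uint_least32_t items[8];
  size_t used;
};

static void
log_codepoint (uint_least32_t c, void *data)
{
  struct codepoint_log *log = data;
  if (log->used < 8)
    log->items[log->used++] = c;
}
```

The only state carried between iterations is the loop index, so the
working storage stays fixed however long the inserted text is.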