all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: "Mattias Engdegård" <mattiase@acm.org>
To: Alan Third <alan@idiocy.org>
Cc: 42904@debbugs.gnu.org
Subject: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS
Date: Thu, 20 Aug 2020 11:27:01 +0200	[thread overview]
Message-ID: <E703798B-9BB3-4C3D-927C-52BE5A6CD192@acm.org> (raw)
In-Reply-To: <20200818172824.GA90575@breton.holly.idiocy.org>

[-- Attachment #1: Type: text/plain, Size: 1329 bytes --]

18 aug. 2020 kl. 19.28 skrev Alan Third <alan@idiocy.org>:

> Looks good to me. The only thought I have is that perhaps we should
> consider extending NSString to handle these lisp strings rather than
> making it a separate function? We could provide a method to convert to
> a lisp string as well, although that's not as complex.
> 
> I believe using categories would do it without us having to create a
> new EmacsString class or similar.

Fun, I hadn't done that before! Of course we should.

As it happens I just enjoyed the HOPL paper about the history of Objective-C (https://dl.acm.org/doi/10.1145/3386332). An excellent read in general, and it has some history about how the categories came about.

Here is an updated patch: it is now self-contained and does not change anything outside the NS backend.

There is a minor imperfection: the incoming name string can actually be miscoded if it contains both non-ASCII characters and raw bytes. As an example, consider

 (rename-buffer "aéb\300")

In xdisp.c:12497, the Lisp name string is created using make_string which decides that the above multibyte string should really be unibyte, and that confuses the converter. It is of no great consequence, but it makes the result look messier than it should have: "a��b��c" instead of "aéb�c".


[-- Attachment #2: 0001-Fix-NS-crash-on-invalid-frame-title-string-bug-42904.patch --]
[-- Type: application/octet-stream, Size: 4140 bytes --]

From 18c2f2d8aee5ca6708232477d12dc1d884c75235 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Tue, 18 Aug 2020 12:58:12 +0200
Subject: [PATCH] Fix NS crash on invalid frame title string (bug#42904)

Instead of blindly assuming that all Emacs strings are valid UTF-8,
which they are not, use a more careful conversion going via UTF-16
which is what NSString uses internally.  Unpaired surrogates will
still go through to the NSString objects, but the NS libs handle them
gracefully.

* src/nsterm.h (EmacsString): New category.
* src/nsfns.m (all_nonzero_ascii): New helper function.
([NSString stringWithLispString:]): New method.
(ns_set_name_internal): Use new conversion method.
---
 src/nsfns.m  | 65 +++++++++++++++++++++++++++++++++++++++++-----------
 src/nsterm.h |  5 ++++
 2 files changed, 56 insertions(+), 14 deletions(-)

diff --git a/src/nsfns.m b/src/nsfns.m
index 628233ea0d..8357b8c290 100644
--- a/src/nsfns.m
+++ b/src/nsfns.m
@@ -398,29 +398,66 @@ Turn the input menu (an NSMenu) into a lisp list for tracking on lisp side.
         [NSString stringWithUTF8String: SSDATA (arg)]];
 }
 
+/* Whether N bytes at STR are in the [0,127] range.  */
+static bool
+all_nonzero_ascii (unsigned char *str, ptrdiff_t n)
+{
+  for (ptrdiff_t i = 0; i < n; i++)
+    if (str[i] < 1 || str[i] > 127)
+      return false;
+  return true;
+}
+
+@implementation NSString (EmacsString)
+/* Make an NSString from a Lisp string.  */
++ (NSString *)stringWithLispString:(Lisp_Object)string
+{
+  /* Shortcut for the common case.  */
+  if (all_nonzero_ascii (SDATA (string), SBYTES (string)))
+    return [NSString stringWithCString: SSDATA (string)
+                              encoding: NSASCIIStringEncoding];
+  string = string_to_multibyte (string);
+
+  /* Now the string is multibyte; convert to UTF-16.  */
+  unichar *chars = xmalloc (4 * SCHARS (string));
+  unichar *d = chars;
+  const unsigned char *s = SDATA (string);
+  const unsigned char *end = s + SBYTES (string);
+  while (s < end)
+    {
+      int c = string_char_advance (&s);
+      /* We pass unpaired surrogates through, because they are typically
+         handled fairly well by the NS libraries (displayed with distinct
+         glyphs etc).  */
+      if (c <= 0xffff)
+        *d++ = c;
+      else if (c <= 0x10ffff)
+        {
+          *d++ = 0xd800 + (c & 0x3ff);
+          *d++ = 0xdc00 + ((c - 0x10000) >> 10);
+        }
+      else
+        *d++ = 0xfffd;          /* Not valid for UTF-16.  */
+    }
+  NSString *str = [NSString stringWithCharacters: chars
+                                          length: d - chars];
+  xfree (chars);
+  return str;
+}
+@end
+
 static void
 ns_set_name_internal (struct frame *f, Lisp_Object name)
 {
-  Lisp_Object encoded_name, encoded_icon_name;
-  NSString *str;
   NSView *view = FRAME_NS_VIEW (f);
-
-
-  encoded_name = ENCODE_UTF_8 (name);
-
-  str = [NSString stringWithUTF8String: SSDATA (encoded_name)];
-
+  NSString *str = [NSString stringWithLispString: name];
 
   /* Don't change the name if it's already NAME.  */
   if (! [[[view window] title] isEqualToString: str])
     [[view window] setTitle: str];
 
-  if (!STRINGP (f->icon_name))
-    encoded_icon_name = encoded_name;
-  else
-    encoded_icon_name = ENCODE_UTF_8 (f->icon_name);
-
-  str = [NSString stringWithUTF8String: SSDATA (encoded_icon_name)];
+  if (STRINGP (f->icon_name))
+    str = [NSString stringWithLispString: f->icon_name];
 
   if ([[view window] miniwindowTitle]
       && ! [[[view window] miniwindowTitle] isEqualToString: str])
diff --git a/src/nsterm.h b/src/nsterm.h
index a511fef5b9..ab868ed344 100644
--- a/src/nsterm.h
+++ b/src/nsterm.h
@@ -361,6 +361,11 @@ #define NS_DRAW_TO_BUFFER 1
 
 @end
 
+
+@interface NSString (EmacsString)
++ (NSString *)stringWithLispString:(Lisp_Object)string;
+@end
+
 /* ==========================================================================
 
    The Emacs application
-- 
2.21.1 (Apple Git-122.3)


  reply	other threads:[~2020-08-20  9:27 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-17 14:11 bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS Mattias Engdegård
2020-08-17 14:54 ` Andrii Kolomoiets
2020-08-17 15:55   ` Mattias Engdegård
2020-08-17 15:55 ` Eli Zaretskii
2020-08-17 16:11   ` Mattias Engdegård
2020-08-17 17:05     ` Eli Zaretskii
2020-08-17 18:48       ` Mattias Engdegård
2020-08-17 19:56         ` Alan Third
2020-08-18  8:07           ` Mattias Engdegård
2020-08-18  8:43             ` Alan Third
2020-08-18 11:48               ` Mattias Engdegård
2020-08-18 12:22                 ` Eli Zaretskii
2020-08-18 17:28                 ` Alan Third
2020-08-20  9:27                   ` Mattias Engdegård [this message]
2020-08-20 13:24                     ` Eli Zaretskii
2020-08-20 18:46                       ` Mattias Engdegård
2020-08-20 19:13                         ` Eli Zaretskii
2020-08-21  9:39                           ` Mattias Engdegård
2020-08-21 13:26                             ` Eli Zaretskii
2020-08-21 14:53                               ` Mattias Engdegård
2020-08-21 15:27                                 ` Eli Zaretskii
2020-08-21 15:50                                   ` Mattias Engdegård
2020-08-23 17:23                                     ` Mattias Engdegård
2020-08-20 13:24                     ` Alan Third
2020-08-20 17:44                       ` Mattias Engdegård
2020-08-18 12:24               ` Eli Zaretskii
2020-08-18 14:11                 ` Mattias Engdegård
2020-08-18 14:40                   ` Eli Zaretskii
2020-08-18 15:21                     ` Mattias Engdegård

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=E703798B-9BB3-4C3D-927C-52BE5A6CD192@acm.org \
    --to=mattiase@acm.org \
    --cc=42904@debbugs.gnu.org \
    --cc=alan@idiocy.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.