unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Gerry Agbobada <gagbobada@gmail.com>
To: emacs-devel@gnu.org
Subject: Question about composite.c
Date: Mon, 20 Jan 2020 23:17:19 +0100	[thread overview]
Message-ID: <CAHhDRB85NHEjmKKnJ=LzMBbvZZNtyJmANVbzEjJdrx5iF3v3-Q@mail.gmail.com> (raw)

#+TITLE: empty_string_issue

The rest of this file has been edited with org-mode, to allow simple tangling to
create an elisp file to try to setup the bug I had ; and have
highlighting for the
diff I tried.

* Issue
While trying to use elisp code to get ligatures working on a build of emacs-27
branch, I noticed that a few ligatures from the table was freezing emacs.

I wasn't able to reproduce properly using =emacs -q=, but I was able to
reproduce it in 2 different major modes using a heavily customized emacs (Doom
Emacs). I was able to patch emacs C code to make my issue stop though; so I
mainly want to know if that is a logic error I was able to find out or just
something I should keep investigating on the elisp side.

* Recipe
** Emacs version
I'm playing with emacs-27 builds =--with-harfbuzz=.

The version is 27.0.60 with harfbuzz feature.

I think I was on commit =5841240= but I'm not 100% positive since I don't have
the machine easily available. I know I pulled emacs-27 between
Fri, 10 Jan 2020 08:00:00 GMT and Fri, 10 Jan 2020 11:00:00 GMT.

The =src/composite.c= file I was able to single out didn't change since then it
seems.

** Set up a buffer
This is extracted from microsoft/cascadia-code#153 on Github
[[https://github.com/microsoft/cascadia-code/issues/153][Link to the issue]]
#+BEGIN_SRC emacs-lisp :tangle yes
(defvar composition-ligature-table (make-char-table nil))
(require 'composite)

(let ((alist
       '(
         (?* . ".\\(?:\\(\\*\\*\\|[*>]\\)[*>]?\\)")
         )))
  (dolist (char-regexp alist)
    (set-char-table-range composition-ligature-table (car char-regexp)
                          `([,(cdr char-regexp) 0 font-shape-gstring]))))

(set-char-table-parent composition-ligature-table composition-function-table)

(setq-local composition-function-table composition-ligature-table)
#+END_SRC

** Test fonts
- Write a lot of * : *****
- Delete asterisks one by one (=DEL=)
- EXPECTED : asterisks get deleted one by one
- ACTUAL : emacs freezes and after modifying the =src/composite.c= file I found
  out the error is =Attempt to shape unibyte text= from this
[[https://github.com/emacs-mirror/emacs/blob/b651939aaf06f4f1ddcd1f750bb8825546027b96/src/composite.c#L1749][if
branch]] with an
  *empty string*.

I think my error may come from having a composition-table where a replacement
triggered by =prettify-symbols= occurs before the regex for
=composition-ligature-table= happens, so there's only an empty string for
replacement and there's an error because it doesn't pass the test.

* Patch to "make it work"
With this patch I was able to stop having the infinite freeze/stuttering
I think the issue is that :
- when the font does not have a ligature for the matching pattern
- it somehow passes an empty string =""= in this if-statement
- =(if "" 'not-nil)= return ='not-nil= so it goes to the else branch
- =STRING_MULTIBYTE("")= is false because an empty string doesn't have the
  marker
- composite errors and loops

Or maybe the error comes from having a composition-table where a replacement
triggered by =prettify-symbols= occurs before the regex for
=composition-ligature-table= happens, so there's only an empty string for
replacement and there's an error because it doesn't pass the test.

NOTE : I know this is a "dirty" patch from thw warning I got from the compiler
(i.e. using strlen on the =string= variable here is not clean.) but I feel it
conveys what I mean in a better way, and to be honest I don't know what the
clean version is.

NOTE 2 : this patch also shows how I was able to find out which if-branch
triggered. I guess the reason for not displaying the bad string is to avoid
side-effects in the minibuffer.

#+BEGIN_SRC diff :tangle no
diff --git a/src/composite.c b/src/composite.c
index 53e6930b5f..1151721d61 100644
--- a/src/composite.c
+++ b/src/composite.c
@@ -1735,7 +1735,7 @@ Otherwise (for terminal display), FONT-OBJECT
must be a terminal ID, a
   if (NILP (string))
     {
       if (NILP (BVAR (current_buffer, enable_multibyte_characters)))
-       error ("Attempt to shape unibyte text");
+       error ("Attempt to shape unibyte text \"%s\" in non multibyte
buffer", string);
       validate_region (&from, &to);
       frompos = XFIXNAT (from);
       topos = XFIXNAT (to);
@@ -1745,8 +1745,8 @@ Otherwise (for terminal display), FONT-OBJECT
must be a terminal ID, a
     {
       CHECK_STRING (string);
       validate_subarray (string, from, to, SCHARS (string), &frompos, &topos);
-      if (! STRING_MULTIBYTE (string))
-       error ("Attempt to shape unibyte text");
+      if (strlen(string) != 0 && ! STRING_MULTIBYTE (string))
+       error ("Attempt to shape unibyte text \"%s\"", string);
       frombyte = string_char_to_byte (string, frompos);
     }
#+END_SRC

* Question
I guess the only question is : what's supposed to happen when =string= is an
empty lisp string in this condition ?

Gerry AGBOBADA



             reply	other threads:[~2020-01-20 22:17 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-20 22:17 Gerry Agbobada [this message]
2020-01-21 18:15 ` Question about composite.c Eli Zaretskii
2020-01-21 18:57   ` Gerry Agbobada
2020-01-21 19:15     ` Eli Zaretskii
2020-01-22  8:55       ` Gerry Agbobada
2020-04-14 22:43         ` Gerry Agbobada
2020-04-15  1:10           ` Noam Postavsky
2020-04-15  7:39             ` Gerry Agbobada
2020-04-25 12:44           ` Eli Zaretskii
2020-04-25 13:06             ` Gerry Agbobada
2020-04-25 13:29               ` Gerry Agbobada
2020-04-25 13:53                 ` Gerry Agbobada
2020-04-25 15:23                   ` Eli Zaretskii
2020-04-25 15:32                     ` Gerry Agbobada

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHhDRB85NHEjmKKnJ=LzMBbvZZNtyJmANVbzEjJdrx5iF3v3-Q@mail.gmail.com' \
    --to=gagbobada@gmail.com \
    --cc=emacs-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).