unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#16457: 24.3.50; crash rendering Arabic Uthmani script
@ 2014-01-15 17:24 Dmitry Antipov
  2014-01-15 17:41 ` Eli Zaretskii
  2014-01-19 13:45 ` K. Handa
  0 siblings, 2 replies; 14+ messages in thread
From: Dmitry Antipov @ 2014-01-15 17:24 UTC (permalink / raw)
  To: 16457

[-- Attachment #1: Type: text/plain, Size: 4249 bytes --]

Run 'emacs -Q', then visit attached file and eval '(move-to-column 10)' ==>

#0  0x00000034d9e0f62b in raise (sig=11) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1  0x00000000005644fb in terminate_due_to_signal (sig=11, backtrace_limit=40) at ../../trunk/src/emacs.c:378
#2  0x000000000058e5c8 in handle_fatal_signal (sig=11) at ../../trunk/src/sysdep.c:1628
#3  0x000000000058e59d in deliver_thread_signal (sig=11, handler=0x58e5ae <handle_fatal_signal>) at ../../trunk/src/sysdep.c:1602
#4  0x000000000058e5fe in deliver_fatal_thread_signal (sig=11) at ../../trunk/src/sysdep.c:1640
#5  <signal handler called>
#6  0x000000000056134d in PSEUDOVECTOR_TYPEP (a=0x400000000d000040, code=14) at ../../trunk/src/lisp.h:2377
#7  0x00000000005613bd in PSEUDOVECTORP (a=..., code=14) at ../../trunk/src/lisp.h:2391
#8  0x00000000005614d4 in SUB_CHAR_TABLE_P (a=...) at ../../trunk/src/lisp.h:2449
#9  0x00000000004fb65a in char_table_ref (table=..., c=4195104) at ../../trunk/src/chartab.c:251
#10 0x00000000005605f8 in CHAR_TABLE_REF (ct=..., idx=4195104) at ../../trunk/src/lisp.h:1466
#11 0x00000000006846d3 in composition_compute_stop_pos (cmp_it=0x7fffa4063870, charpos=14, bytepos=26, endpos=101, string=...)
     at ../../trunk/src/composite.c:1039
#12 0x00000000005c14ad in scan_for_column (endpos=0x7fffa4063988, goalcol=0x7fffa4063978, prevcol=0x7fffa4063980)
     at ../../trunk/src/indent.c:601
#13 0x00000000005c26a7 in Fmove_to_column (column=..., force=...) at ../../trunk/src/indent.c:989
#14 0x000000000060a88c in eval_sub (form=...) at ../../trunk/src/eval.c:2179
#15 0x0000000000609f7e in Feval (form=..., lexical=...) at ../../trunk/src/eval.c:1994
#16 0x000000000060bfe0 in Ffuncall (nargs=3, args=0x7fffa4063cc8) at ../../trunk/src/eval.c:2809
#17 0x0000000000655807 in exec_byte_code (bytestr=..., vector=..., maxdepth=..., args_template=..., nargs=2, args=0x7fffa4064528)
     at ../../trunk/src/bytecode.c:919
#18 0x000000000060c7fe in funcall_lambda (fun=..., nargs=2, arg_vector=0x7fffa4064518) at ../../trunk/src/eval.c:2974
#19 0x000000000060c196 in Ffuncall (nargs=3, args=0x7fffa4064510) at ../../trunk/src/eval.c:2855
#20 0x000000000060b1e8 in Fapply (nargs=2, args=0x7fffa4064620) at ../../trunk/src/eval.c:2345
#21 0x000000000060b881 in apply1 (fn=..., arg=...) at ../../trunk/src/eval.c:2579
#22 0x0000000000602e28 in Fcall_interactively (function=..., record_flag=..., keys=...) at ../../trunk/src/callint.c:378
#23 0x000000000060c00c in Ffuncall (nargs=4, args=0x7fffa4064a98) at ../../trunk/src/eval.c:2813
#24 0x0000000000655807 in exec_byte_code (bytestr=..., vector=..., maxdepth=..., args_template=..., nargs=1, args=0x7fffa4065340)
     at ../../trunk/src/bytecode.c:919
#25 0x000000000060c7fe in funcall_lambda (fun=..., nargs=1, arg_vector=0x7fffa4065338) at ../../trunk/src/eval.c:2974
#26 0x000000000060c196 in Ffuncall (nargs=2, args=0x7fffa4065330) at ../../trunk/src/eval.c:2855
#27 0x000000000060b8eb in call1 (fn=..., arg1=...) at ../../trunk/src/eval.c:2605
#28 0x0000000000569169 in command_loop_1 () at ../../trunk/src/keyboard.c:1552
#29 0x0000000000608523 in internal_condition_case (bfun=0x568a4f <command_loop_1>, handlers=..., hfun=0x568225 <cmd_error>)
     at ../../trunk/src/eval.c:1345
#30 0x00000000005686ed in command_loop_2 (ignore=...) at ../../trunk/src/keyboard.c:1170
#31 0x00000000006079a6 in internal_catch (tag=..., func=0x5686ca <command_loop_2>, arg=...) at ../../trunk/src/eval.c:1109
#32 0x00000000005686a1 in command_loop () at ../../trunk/src/keyboard.c:1149
#33 0x0000000000567d51 in recursive_edit_1 () at ../../trunk/src/keyboard.c:777
#34 0x0000000000567f21 in Frecursive_edit () at ../../trunk/src/keyboard.c:841
#35 0x0000000000565e85 in main (argc=2, argv=0x7fffa4065838) at ../../trunk/src/emacs.c:1637

In GNU Emacs 24.3.50.25 (x86_64-unknown-linux-gnu, GTK+ Version 3.10.6)
  of 2014-01-15 on localhost.localdomain
Repository revision: 116032 rgm@gnu.org-20140115084938-1ot2l5bk0x6xgol7
Windowing system distributor `Fedora Project', version 11.0.11404000
System Description:	Fedora release 20 (Heisenbug)

Configured using:
  `configure --prefix=/not/exists --enable-gcc-warnings
  --enable-check-lisp-object-type --enable-checking 'CFLAGS=-O0 -g3''

Dmitry

[-- Attachment #2: uthmani-test.txt --]
[-- Type: text/plain, Size: 190 bytes --]

يُخَٰدِعُونَ ٱللَّهَ وَٱلَّذِينَ ءَامَنُوا۟ وَمَا يَخْدَعُونَ إِلَّآ أَنفُسَهُمْ وَمَا يَشْعُرُونَ


* bug#16457: 24.3.50; crash rendering Arabic Uthmani script
  2014-01-15 17:24 bug#16457: 24.3.50; crash rendering Arabic Uthmani script Dmitry Antipov
@ 2014-01-15 17:41 ` Eli Zaretskii
  2014-01-15 21:44   ` Glenn Morris
  2014-01-19 13:45 ` K. Handa
  1 sibling, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2014-01-15 17:41 UTC (permalink / raw)
  To: Dmitry Antipov, Kenichi Handa; +Cc: 16457

> Date: Wed, 15 Jan 2014 21:24:54 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> 
> Run 'emacs -Q', then visit attached file and eval '(move-to-column 10)' ==>
> 
> #0  0x00000034d9e0f62b in raise (sig=11) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
> #1  0x00000000005644fb in terminate_due_to_signal (sig=11, backtrace_limit=40) at ../../trunk/src/emacs.c:378
> #2  0x000000000058e5c8 in handle_fatal_signal (sig=11) at ../../trunk/src/sysdep.c:1628
> #3  0x000000000058e59d in deliver_thread_signal (sig=11, handler=0x58e5ae <handle_fatal_signal>) at ../../trunk/src/sysdep.c:1602
> #4  0x000000000058e5fe in deliver_fatal_thread_signal (sig=11) at ../../trunk/src/sysdep.c:1640
> #5  <signal handler called>
> #6  0x000000000056134d in PSEUDOVECTOR_TYPEP (a=0x400000000d000040, code=14) at ../../trunk/src/lisp.h:2377
> #7  0x00000000005613bd in PSEUDOVECTORP (a=..., code=14) at ../../trunk/src/lisp.h:2391
> #8  0x00000000005614d4 in SUB_CHAR_TABLE_P (a=...) at ../../trunk/src/lisp.h:2449
> #9  0x00000000004fb65a in char_table_ref (table=..., c=4195104) at ../../trunk/src/chartab.c:251
> #10 0x00000000005605f8 in CHAR_TABLE_REF (ct=..., idx=4195104) at ../../trunk/src/lisp.h:1466
> #11 0x00000000006846d3 in composition_compute_stop_pos (cmp_it=0x7fffa4063870, charpos=14, bytepos=26, endpos=101, string=...)
>      at ../../trunk/src/composite.c:1039
> #12 0x00000000005c14ad in scan_for_column (endpos=0x7fffa4063988, goalcol=0x7fffa4063978, prevcol=0x7fffa4063980)
>      at ../../trunk/src/indent.c:601
> #13 0x00000000005c26a7 in Fmove_to_column (column=..., force=...) at ../../trunk/src/indent.c:989

Looks similar to #15984, although this is a GUI session.







* bug#16457: 24.3.50; crash rendering Arabic Uthmani script
  2014-01-15 17:41 ` Eli Zaretskii
@ 2014-01-15 21:44   ` Glenn Morris
  2014-01-16  8:01     ` Dmitry Antipov
  2014-01-17 13:51     ` K. Handa
  0 siblings, 2 replies; 14+ messages in thread
From: Glenn Morris @ 2014-01-15 21:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Dmitry Antipov, 16457

Eli Zaretskii wrote:

> Looks similar to #15984, although this is a GUI session.

http://lists.gnu.org/archive/html/emacs-diffs/2014-01/msg00176.html

was supposed [1] to fix #15984, and this comes from a later revision.


[1] Well, personally I have no idea if it was or not. No comment was
posted to #15984's bug report, but it is referenced in the commit.






* bug#16457: 24.3.50; crash rendering Arabic Uthmani script
  2014-01-15 21:44   ` Glenn Morris
@ 2014-01-16  8:01     ` Dmitry Antipov
  2014-01-16 10:07       ` Dmitry Antipov
  2014-01-16 17:33       ` Eli Zaretskii
  2014-01-17 13:51     ` K. Handa
  1 sibling, 2 replies; 14+ messages in thread
From: Dmitry Antipov @ 2014-01-16  8:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 16457

I'm not familiar with composition sequences in detail, but there is a hint.

For the uthmani-test.txt, the following code in set_iterator_to_next:

  /* Composition created while scanning forward.  */
  /* Update IT's char/byte positions to point to the first
     character of the next grapheme cluster, or to the
     character visually after the current composition.  */
  for (i = 0; i < it->cmp_it.nchars; i++)
    bidi_move_to_visually_next (&it->bidi_it);
  IT_BYTEPOS (*it) = it->bidi_it.bytepos;
  IT_CHARPOS (*it) = it->bidi_it.charpos;

advances IT from charpos:bytepos 11:21 to 13:25.  But the following fragment
from scan_for_column:

  /* Check composition sequence.  */
  if (cmp_it.id >= 0
      || (scan == cmp_it.stop_pos
          && composition_reseat_it (&cmp_it, scan, scan_byte, end,
                                    w, NULL, Qnil)))
    composition_update_it (&cmp_it, scan, scan_byte, Qnil);
  if (cmp_it.id >= 0)
    {
      scan += cmp_it.nchars;
      scan_byte += cmp_it.nbytes;

advances SCAN:SCAN_BYTE from 11:21 to 13:24.  So the byte position becomes invalid
and FETCH_CHAR_ADVANCE decodes invalid byte sequence to invalid character C.
Finally, CHAR_TABLE_REF (Vcomposition_function_table, C) goes out of bounds.
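
To illustrate why a byte position that is off by one is fatal, here is a minimal sketch. It assumes plain UTF-8 and 0-based array offsets rather than Emacs's internal encoding and 1-based buffer positions, and both helper names are hypothetical. Arabic letters and marks occupy two bytes each, so an offset one short of the true boundary lands on a continuation byte, from which no valid character can be decoded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: nonzero if BYTE is a UTF-8 continuation byte
   (bit pattern 10xxxxxx), i.e. not a valid place to start decoding.  */
static int
is_continuation_byte (unsigned char byte)
{
  return (byte & 0xC0) == 0x80;
}

/* Hypothetical helper: nonzero if BYTEPOS is a character boundary in
   BUF, which holds LEN bytes.  The end of the buffer counts as a
   boundary.  */
static int
at_char_boundary (const unsigned char *buf, size_t len, size_t bytepos)
{
  assert (bytepos <= len);
  return bytepos == len || !is_continuation_byte (buf[bytepos]);
}
```

With the bytes D9 86 D9 8E (code points 1606 and 1614), offsets 0, 2 and 4 are boundaries while 1 and 3 are not.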

Dmitry







* bug#16457: 24.3.50; crash rendering Arabic Uthmani script
  2014-01-16  8:01     ` Dmitry Antipov
@ 2014-01-16 10:07       ` Dmitry Antipov
  2014-01-16 17:33         ` Eli Zaretskii
  2014-01-16 17:33       ` Eli Zaretskii
  1 sibling, 1 reply; 14+ messages in thread
From: Dmitry Antipov @ 2014-01-16 10:07 UTC (permalink / raw)
  To: 16457

On 01/16/2014 12:01 PM, Dmitry Antipov wrote:

> I'm not familiar with composition sequences in detail, but there is a hint.

And there is one more hint: with revision 116000 reverted, this crash goes away.

Dmitry







* bug#16457: 24.3.50; crash rendering Arabic Uthmani script
  2014-01-16  8:01     ` Dmitry Antipov
  2014-01-16 10:07       ` Dmitry Antipov
@ 2014-01-16 17:33       ` Eli Zaretskii
  2014-01-17  7:34         ` Dmitry Antipov
  1 sibling, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2014-01-16 17:33 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: 16457

> Date: Thu, 16 Jan 2014 12:01:04 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> CC: 16457@debbugs.gnu.org
> 
> I'm not familiar with composition sequences in detail

The compositions stuff is under-documented.  I provide some
information I know of below.

> For the uthmani-test.txt, the following code in set_iterator_to_next:
> 
>   /* Composition created while scanning forward.  */
>   /* Update IT's char/byte positions to point to the first
>      character of the next grapheme cluster, or to the
>      character visually after the current composition.  */
>   for (i = 0; i < it->cmp_it.nchars; i++)
>     bidi_move_to_visually_next (&it->bidi_it);
>   IT_BYTEPOS (*it) = it->bidi_it.bytepos;
>   IT_CHARPOS (*it) = it->bidi_it.charpos;
> 
> advances IT from charpos:bytepos 11:21 to 13:25.  But the following fragment
> from scan_for_column:
> 
>   /* Check composition sequence.  */
>   if (cmp_it.id >= 0
>       || (scan == cmp_it.stop_pos
>           && composition_reseat_it (&cmp_it, scan, scan_byte, end,
>                                     w, NULL, Qnil)))
>     composition_update_it (&cmp_it, scan, scan_byte, Qnil);
>   if (cmp_it.id >= 0)
>     {
>       scan += cmp_it.nchars;
>       scan_byte += cmp_it.nbytes;
> 
> advances SCAN:SCAN_BYTE from 11:21 to 13:24.  So the byte position becomes invalid
> and FETCH_CHAR_ADVANCE decodes invalid byte sequence to invalid character C.
> Finally, CHAR_TABLE_REF (Vcomposition_function_table, C) goes out of bounds.

In effect, you are saying that cmp_it.nbytes above is incorrect.

This is really strange.  First, I cannot reproduce the crash on
MS-Windows, so the problem might be related to the shaping engine
being used (I presume yours is libotf and libm17n).  (I tried on both
Windows XP and on Windows 7, which have very different versions of
Uniscribe, and they both work fine.)

Moreover, set_iterator_to_next uses the same code from composite.c
that scan_for_column does, so it is unclear to me how the former
works, while the latter doesn't.

Specifically, cmp_it.nbytes is computed in composition_update_it as
the sum of byte-widths of all the characters being composed:

      cmp_it->width = 0;
      for (i = cmp_it->nchars - 1; i >= 0; i--)
	{
	  c = XINT (LGSTRING_CHAR (gstring, cmp_it->from + i));
	  cmp_it->nbytes += CHAR_BYTES (c);
	  cmp_it->width += CHAR_WIDTH (c);
	}
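
As a rough sketch of the summation above: this assumes plain UTF-8 lengths instead of Emacs's CHAR_BYTES (which also covers Emacs-internal characters), leaves out the width computation, and uses the hypothetical names `utf8_char_bytes` and `cluster_nbytes`.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes needed to encode code point C in standard UTF-8.
   Assumption: C is in the Unicode range 0..0x10FFFF.  */
static int
utf8_char_bytes (int c)
{
  assert (c >= 0 && c <= 0x10FFFF);
  if (c < 0x80)
    return 1;
  if (c < 0x800)
    return 2;
  if (c < 0x10000)
    return 3;
  return 4;
}

/* Sum of the encoded lengths of NCHARS code points starting at
   CHARS[FROM], walking backwards like the loop quoted above.  */
static int
cluster_nbytes (const int *chars, size_t from, size_t nchars)
{
  int nbytes = 0;
  for (size_t i = nchars; i-- > 0; )
    nbytes += utf8_char_bytes (chars[from + i]);
  return nbytes;
}
```

For the two characters 1606 and 1614 this yields 4 bytes, which matches the expected advance from byte position 21 to 25.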

And the characters in the LGSTRING object are simply copied from the
buffer in fill_gstring_header, when LGSTRING is created:

  for (i = 0; i < len; i++)
    {
      int c;

      if (NILP (string))
	FETCH_CHAR_ADVANCE_NO_CHECK (c, from, from_byte);
      else
	FETCH_STRING_CHAR_ADVANCE_NO_CHECK (c, string, from, from_byte);
      ASET (header, i + 1, make_number (c));
    }
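
The copying step can be sketched in plain C as follows. This is an illustration only: it decodes standard UTF-8 from a byte array, whereas the real macros work on Emacs buffers and strings, and `fetch_char_advance` and `fill_header` are hypothetical stand-ins. Like the _NO_CHECK variants, the decoder trusts that it starts on a character boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Decode one UTF-8 character starting at BUF[*BYTEPOS], advance
   *CHARPOS by one and *BYTEPOS by the encoded length, and return the
   code point.  No validation is done: the caller must guarantee that
   *BYTEPOS is a character boundary and the sequence is well formed.  */
static int
fetch_char_advance (const unsigned char *buf, size_t *charpos,
                    size_t *bytepos)
{
  unsigned char b0 = buf[*bytepos];
  size_t len;
  int c;

  if (b0 < 0x80)
    { c = b0; len = 1; }
  else if (b0 < 0xE0)
    { c = b0 & 0x1F; len = 2; }
  else if (b0 < 0xF0)
    { c = b0 & 0x0F; len = 3; }
  else
    { c = b0 & 0x07; len = 4; }

  /* Each continuation byte contributes its low six bits.  */
  for (size_t k = 1; k < len; k++)
    c = (c << 6) | (buf[*bytepos + k] & 0x3F);

  *charpos += 1;
  *bytepos += len;
  return c;
}

/* Copy LEN characters, starting at byte offset FROM_BYTE of BUF,
   into HEADER as code points.  */
static void
fill_header (const unsigned char *buf, size_t from_byte, size_t len,
             int *header)
{
  size_t from = 0;
  assert (header != NULL);
  for (size_t i = 0; i < len; i++)
    header[i] = fetch_char_advance (buf, &from, &from_byte);
}
```

Feeding it the bytes D9 86 D9 8E fills the header with 1606 and 1614.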

Could you please trace through these fragments and see what goes wrong
there?  Specifically, what characters (which Unicode codepoints) are
being composed, and what are the contents of the cmp_it structure in
scan_for_column when it advances from 11:21 to 13:24.  (Granted, here
I see it advance from 11:21 to 13:25, as expected.)

Also, what does "C-u C-x =" report when you put the cursor in column
10?

Some more details:

The LGSTRING object is created when Emacs encounters for the first
time a group of characters that should be composed together.  The
structure of LGSTRING is described in the comments to
composition-get-gstring.  Emacs recognizes the character compositions
in composition_reseat_it, which calls autocmp_chars, which calls
composition-get-gstring, which collects the characters to be composed
by calling fill_gstring_header, as shown in the fragment above.

The LGSTRING object is then cached, such that later references to it
use the cached data, instead of computing it from scratch.  The cmp_it
structure holds an ID of the LGSTRING which can be used to look it up
in the cache.  When composition_update_it is called, it simply uses
the information already stored in LGSTRING to advance past the
composed characters.
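
The store-once, look-up-by-ID idea can be sketched like this. It is a deliberate simplification: Emacs keeps the cache in a hash table keyed by the gstring header, while this version uses a fixed array, and every name and size limit here is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_ENTRIES = 16, MAX_CHARS = 32 };

/* Hypothetical cached entry: just the characters of one gstring.  */
struct cached_gstring
{
  size_t nchars;
  int chars[MAX_CHARS];
};

struct gstring_cache
{
  size_t count;
  struct cached_gstring entries[MAX_ENTRIES];
};

/* Store a copy of CHARS in CACHE and return its ID, or -1 if the
   cache is full or the input is too long.  */
static int
cache_put (struct gstring_cache *cache, const int *chars, size_t nchars)
{
  if (cache->count >= MAX_ENTRIES || nchars > MAX_CHARS)
    return -1;
  struct cached_gstring *entry = &cache->entries[cache->count];
  entry->nchars = nchars;
  memcpy (entry->chars, chars, nchars * sizeof *chars);
  return (int) cache->count++;
}

/* Return the entry stored under ID, or NULL if ID is unknown.  */
static const struct cached_gstring *
cache_get (const struct gstring_cache *cache, int id)
{
  if (id < 0 || (size_t) id >= cache->count)
    return NULL;
  return &cache->entries[id];
}
```

Because later lookups return the stored copy rather than rescanning the buffer, any index used against that copy has to be in the same index space as when it was stored.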

So to understand why it crashes for you, we need to find out why the
nbytes value stored by fill_gstring_header somehow became incorrect.

Btw, does the problem go away if you disable cache-long-scans?






* bug#16457: 24.3.50; crash rendering Arabic Uthmani script
  2014-01-16 10:07       ` Dmitry Antipov
@ 2014-01-16 17:33         ` Eli Zaretskii
  0 siblings, 0 replies; 14+ messages in thread
From: Eli Zaretskii @ 2014-01-16 17:33 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: 16457

> Date: Thu, 16 Jan 2014 14:07:56 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> CC: Eli Zaretskii <eliz@gnu.org>, Kenichi Handa <handa@gnu.org>
> 
> On 01/16/2014 12:01 PM, Dmitry Antipov wrote:
> 
> > I'm not familiar with composition sequences in detail, but there is a hint.
> 
> And there is one more hint: with revision 116000 reverted, this crash goes away.

I'd prefer to fix this bug without reintroducing another one ;-)






* bug#16457: 24.3.50; crash rendering Arabic Uthmani script
  2014-01-16 17:33       ` Eli Zaretskii
@ 2014-01-17  7:34         ` Dmitry Antipov
  2014-01-17  9:10           ` Eli Zaretskii
  0 siblings, 1 reply; 14+ messages in thread
From: Dmitry Antipov @ 2014-01-17  7:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 16457

On 01/16/2014 09:33 PM, Eli Zaretskii wrote:

> This is really strange.  First, I cannot reproduce the crash on
> MS-Windows, so the problem might be related to the shaping engine
> being used (I presume yours is libotf and libm17n).  (I tried on both
> Windows XP and on Windows 7, which have very different versions of
> Uniscribe, and they both work fine.)

Yes, with '--without-m17n-flt' it doesn't crash.

> Specifically, cmp_it.nbytes is computed in composition_update_it as
> the sum of byte-widths of all the characters being composed:
>
>        cmp_it->width = 0;
>        for (i = cmp_it->nchars - 1; i >= 0; i--)
> 	{
> 	  c = XINT (LGSTRING_CHAR (gstring, cmp_it->from + i));
> 	  cmp_it->nbytes += CHAR_BYTES (c);
> 	  cmp_it->width += CHAR_WIDTH (c);
> 	}

I'm trying this:

=== modified file 'src/composite.c'
--- src/composite.c     2014-01-12 23:23:55 +0000
+++ src/composite.c     2014-01-17 07:16:11 +0000
@@ -24,6 +24,7 @@

  #include <config.h>

+#include <stdio.h>
  #include "lisp.h"
  #include "character.h"
  #include "buffer.h"
@@ -1410,9 +1411,16 @@
        cmp_it->nchars = LGLYPH_TO (glyph) + 1 - from;
        cmp_it->nbytes = 0;
        cmp_it->width = 0;
+
+      fprintf (stderr, "%s: from %d, nchars %d, header %p is:\n", __func__,
+              cmp_it->from, cmp_it->nchars, XPNTR (LGSTRING_HEADER (gstring)));
+      debug_print (LGSTRING_HEADER (gstring));
+
        for (i = cmp_it->nchars - 1; i >= 0; i--)
         {
           c = XINT (LGSTRING_CHAR (gstring, cmp_it->from + i));
+         fprintf (stderr, " at %d: char %d, %d bytes\n",
+                  cmp_it->from + i, c, CHAR_BYTES (c));
           cmp_it->nbytes += CHAR_BYTES (c);
           cmp_it->width += CHAR_WIDTH (c);
         }

And now I see an illegal access beyond the end of the gstring header:

;; OK
composition_update_it: from 0, nchars 1, header 0x100c958 is:
[#<font-object "-unknown-PakType Naqsh-normal-normal-normal-*-13-*-*-*-*-0-iso10646-1"> 1648 1583 1616 1593 1615 1608 1606 1614]
  at 0: char 1648, 2 bytes

;; OK
composition_update_it: from 2, nchars 2, header 0x100c958 is:
[#<font-object "-unknown-PakType Naqsh-normal-normal-normal-*-13-*-*-*-*-0-iso10646-1"> 1648 1583 1616 1593 1615 1608 1606 1614]
  at 3: char 1593, 2 bytes
  at 2: char 1616, 2 bytes

;; OK
composition_update_it: from 4, nchars 2, header 0x100c958 is:
[#<font-object "-unknown-PakType Naqsh-normal-normal-normal-*-13-*-*-*-*-0-iso10646-1"> 1648 1583 1616 1593 1615 1608 1606 1614]
  at 5: char 1608, 2 bytes
  at 4: char 1615, 2 bytes

;; OK
composition_update_it: from 6, nchars 1, header 0x100c958 is:
[#<font-object "-unknown-PakType Naqsh-normal-normal-normal-*-13-*-*-*-*-0-iso10646-1"> 1648 1583 1616 1593 1615 1608 1606 1614]
  at 6: char 1606, 2 bytes

;; BAD
composition_update_it: from 7, nchars 2, header 0x100c958 is:
[#<font-object "-unknown-PakType Naqsh-normal-normal-normal-*-13-*-*-*-*-0-iso10646-1"> 1648 1583 1616 1593 1615 1608 1606 1614]
  at 8: char 2, 1 bytes
  at 7: char 1614, 2 bytes

IIUC 2 is the garbage at (presumably invalid) position 8.
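
The header printed above holds eight characters, so valid indices are 0 through 7 and the "BAD" call reads index 8. A small bounds-checked sketch of the same summation shows the failure; it assumes plain UTF-8 lengths, and the function names are hypothetical. A stray value of 2 counts as one byte, so the cluster comes out as 3 bytes instead of 4, which would be consistent with the observed advance to byte 24 rather than 25.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes needed to encode code point C in standard UTF-8.
   Assumption: C is in the Unicode range 0..0x10FFFF.  */
static int
utf8_char_bytes (int c)
{
  assert (c >= 0 && c <= 0x10FFFF);
  if (c < 0x80)
    return 1;
  if (c < 0x800)
    return 2;
  if (c < 0x10000)
    return 3;
  return 4;
}

/* Byte length of NCHARS characters starting at index FROM of a
   header holding LEN characters, or -1 if the range runs past the
   end of the header.  */
static int
checked_cluster_nbytes (const int *chars, size_t len, size_t from,
                        size_t nchars)
{
  if (from > len || nchars > len - from)
    return -1;
  int nbytes = 0;
  for (size_t i = 0; i < nchars; i++)
    nbytes += utf8_char_bytes (chars[from + i]);
  return nbytes;
}
```

Called with length 8, start index 7 and two characters, the check rejects the request instead of reading past the end.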

> And the characters in the LGSTRING object are simply copied from the
> buffer in fill_gstring_header, when LGSTRING is created:
>
>    for (i = 0; i < len; i++)
>      {
>        int c;
>
>        if (NILP (string))
> 	FETCH_CHAR_ADVANCE_NO_CHECK (c, from, from_byte);
>        else
> 	FETCH_STRING_CHAR_ADVANCE_NO_CHECK (c, string, from, from_byte);
>        ASET (header, i + 1, make_number (c));
>      }

AFAICS gstring header is correct here.

> Btw, does the problem go away if you disable cache-long-scans?

No.

Dmitry







* bug#16457: 24.3.50; crash rendering Arabic Uthmani script
  2014-01-17  7:34         ` Dmitry Antipov
@ 2014-01-17  9:10           ` Eli Zaretskii
  2014-01-17 11:16             ` Dmitry Antipov
  0 siblings, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2014-01-17  9:10 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: 16457

> Date: Fri, 17 Jan 2014 11:34:11 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> CC: 16457@debbugs.gnu.org
> 
> On 01/16/2014 09:33 PM, Eli Zaretskii wrote:
> 
> > This is really strange.  First, I cannot reproduce the crash on
> > MS-Windows, so the problem might be related to the shaping engine
> > being used (I presume yours is libotf and libm17n).  (I tried on both
> > Windows XP and on Windows 7, which have very different versions of
> > Uniscribe, and they both work fine.)
> 
> Yes, with '--without-m17n-flt' it doesn't crash.

Can you show the same results of debugging printouts in a
"--without-m17n-flt" build?

> ;; BAD
> composition_update_it: from 7, nchars 2, header 0x100c958 is:
> [#<font-object "-unknown-PakType Naqsh-normal-normal-normal-*-13-*-*-*-*-0-iso10646-1"> 1648 1583 1616 1593 1615 1608 1606 1614]
>   at 8: char 2, 1 bytes
>   at 7: char 1614, 2 bytes
> 
> IIUC 2 is the garbage at (presumably invalid) position 8.

What I see on my system is that the CHARACTERs part of the header is
12 characters long:

 1610 1615 1582 1614 1648 1583 1616 1593 1615 1608 1606 1614

and the value of cmp_it->from + i never goes beyond 11, which is OK.

Also, note that the indices into the header seem to be off-by-one in
your case: the characters to compose for buffer position 11 are 1606
and 1614, whereas in your case 1606 is used for the previous buffer
position.  Also, the index 1 is nowhere to be seen.

So what does that mean? that cmp_it->nchars here

      cmp_it->nchars = LGLYPH_TO (glyph) + 1 - from;

is incorrect in your case?  Or that the gstring header becomes
corrupted somehow?

> > And the characters in the LGSTRING object are simply copied from the
> > buffer in fill_gstring_header, when LGSTRING is created:
> >
> >    for (i = 0; i < len; i++)
> >      {
> >        int c;
> >
> >        if (NILP (string))
> > 	FETCH_CHAR_ADVANCE_NO_CHECK (c, from, from_byte);
> >        else
> > 	FETCH_STRING_CHAR_ADVANCE_NO_CHECK (c, string, from, from_byte);
> >        ASET (header, i + 1, make_number (c));
> >      }
> 
> AFAICS gstring header is correct here.

Can you show the gstring header at that point in the build that
crashes?

Also, if you manually move point to buffer position 11, what column
number do you see there?






* bug#16457: 24.3.50; crash rendering Arabic Uthmani script
  2014-01-17  9:10           ` Eli Zaretskii
@ 2014-01-17 11:16             ` Dmitry Antipov
  2014-01-17 12:03               ` Eli Zaretskii
  0 siblings, 1 reply; 14+ messages in thread
From: Dmitry Antipov @ 2014-01-17 11:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 16457

[-- Attachment #1: Type: text/plain, Size: 2822 bytes --]

On 01/17/2014 01:10 PM, Eli Zaretskii wrote:

> Can you show the same results of debugging printouts in a
> "--without-m17n-flt" build?

Results are essentially empty - I don't see anything. IOW, composition_update_it
is never called, at least when I do (move-to-column 10). Since this doesn't crash,
I can do M-x describe-char at column 10, which is:

              position: 17 of 100 (16%), column: 10
             character: ّ (displayed as ّ) (codepoint 1617, #o3121, #x651)
     preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0x0651
                script: arabic
                syntax: w 	which means: word
              category: b:Arabic
              to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
           buffer code: #xD9 #x91
             file code: #xD9 #x91 (encoded by coding system utf-8-unix)
               display: by this font (glyph code)
     xft:-unknown-DejaVu Sans Mono-normal-normal-normal-*-15-*-*-*-m-0-iso10646-1 (#x47C)

Character code properties: customize what to show
   name: ARABIC SHADDA
   old-name: ARABIC SHADDAH
   general-category: Mn (Mark, Nonspacing)
   decomposition: (1617) ('ّ')

> So what does that mean? that cmp_it->nchars here
>
>        cmp_it->nchars = LGLYPH_TO (glyph) + 1 - from;
>
> is incorrect in your case?  Or that the gstring header becomes
> corrupted somehow?

No idea - this needs more tracing.

> Can you show the gstring header at that point in the build that
> crashes?

Note that the original gstring header is copied in composition_gstring_put_cache,
so it's better to print both:

@@ -675,9 +676,13 @@
      }

    copy = Fmake_vector (make_number (len + 2), Qnil);
+  fprintf (stderr, "%s: original header %p is: ", __func__, XPNTR (header));
+  debug_print (header);
    LGSTRING_SET_HEADER (copy, Fcopy_sequence (header));
    for (i = 0; i < len; i++)
      LGSTRING_SET_GLYPH (copy, i, Fcopy_sequence (LGSTRING_GLYPH (gstring, i)));
+  fprintf (stderr, "%s: copy %p is: ", __func__, XPNTR (LGSTRING_HEADER (copy)));
+  debug_print (LGSTRING_HEADER (copy));
    i = hash_put (h, LGSTRING_HEADER (copy), copy, hash);
    LGSTRING_SET_ID (copy, make_number (i));
    return copy;

Result is attached. For the moment, I assume that the gstring header is valid,
but some values within cmp_it (in composition_update_it at least) aren't.

> Also, if you manually move point to buffer position 11, what column
> number do you see there?

I can't move to 11 by advancing the cursor because it crashes earlier.

BTW, there is one more glitch - when I do 'emacs -Q uthmani-test.txt',
then 'M-x column-number-mode', then [left-arrow] a few times, the column
number in the mode line becomes incorrect and is shown as 1003 or something
like that.

Dmitry



[-- Attachment #2: bug16457_debug.log.gz --]
[-- Type: application/gzip, Size: 2009 bytes --]

[-- Attachment #3: bug16457_column_number.png --]
[-- Type: image/png, Size: 25296 bytes --]


* bug#16457: 24.3.50; crash rendering Arabic Uthmani script
  2014-01-17 11:16             ` Dmitry Antipov
@ 2014-01-17 12:03               ` Eli Zaretskii
  0 siblings, 0 replies; 14+ messages in thread
From: Eli Zaretskii @ 2014-01-17 12:03 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: 16457

> Date: Fri, 17 Jan 2014 15:16:56 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> CC: 16457@debbugs.gnu.org
> 
> On 01/17/2014 01:10 PM, Eli Zaretskii wrote:
> 
> > Can you show the same results of debugging printouts in a
> > "--without-m17n-flt" build?
> 
> Results are essentially empty - I don't see anything. IOW, composition_update_it
> is never called, at least when I do (move-to-column 10).

That means we never try to compose the characters, so this is not very
useful to debug the problem.

> > Also, if you manually move point to buffer position 11, what column
> > number do you see there?
> 
> I can't move to 11 by advancing the cursor because it crashes earlier.

So redisplay also crashes.  At least this is consistent, as the same
code is used both for move-to-column and for display.

> BTW, there is one more glitch - when I do 'emacs -Q uthmani-test.txt',
> then 'M-x column-number-mode', then [left-arrow] a few times, the column
> number in the mode line becomes incorrect and is shown as 1003 or something
> like that.

I think this is part of the same problem.






* bug#16457: 24.3.50; crash rendering Arabic Uthmani script
  2014-01-15 21:44   ` Glenn Morris
  2014-01-16  8:01     ` Dmitry Antipov
@ 2014-01-17 13:51     ` K. Handa
  1 sibling, 0 replies; 14+ messages in thread
From: K. Handa @ 2014-01-17 13:51 UTC (permalink / raw)
  To: Glenn Morris; +Cc: dmantipov, 16457

In article <7obnzcor73.fsf@fencepost.gnu.org>, Glenn Morris <rgm@gnu.org> writes:

> Eli Zaretskii wrote:
> > Looks similar to #15984, although this is a GUI session.

> http://lists.gnu.org/archive/html/emacs-diffs/2014-01/msg00176.html

> was supposed [1] to fix #15984, and this comes from a later revision.

> [1] Well, personally I have no idea if it was or not. No comment was
> posted to #15984's bug report, but it is referenced in the commit.

I've just sent a mail to 15984@debbugs.gnu.org about the
committing of that patch.  And, at that time, I found that
the mails exchanged for bug#15984 were not CC:ed to
15984@debbugs.gnu.org.  So, I attached some key mails among
them to the mail to 15984@debbugs.gnu.org.

As for bug#16457, I have not yet checked anything.

---
Kenichi Handa
handa@gnu.org






* bug#16457: 24.3.50; crash rendering Arabic Uthmani script
  2014-01-15 17:24 bug#16457: 24.3.50; crash rendering Arabic Uthmani script Dmitry Antipov
  2014-01-15 17:41 ` Eli Zaretskii
@ 2014-01-19 13:45 ` K. Handa
  2014-01-19 16:00   ` Dmitry Antipov
  1 sibling, 1 reply; 14+ messages in thread
From: K. Handa @ 2014-01-19 13:45 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: 16457

In article <52D6C466.9080909@yandex.ru>, Dmitry Antipov <dmantipov@yandex.ru> writes:

> Run 'emacs -Q', then visit attached file and eval '(move-to-column 10)' ==>

> #0  0x00000034d9e0f62b in raise (sig=11) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
> #1  0x00000000005644fb in terminate_due_to_signal (sig=11, backtrace_limit=40) at ../../trunk/src/emacs.c:378
> #2  0x000000000058e5c8 in handle_fatal_signal (sig=11) at ../../trunk/src/sysdep.c:1628
> #3  0x000000000058e59d in deliver_thread_signal (sig=11, handler=0x58e5ae <handle_fatal_signal>) at ../../trunk/src/sysdep.c:1602
> #4  0x000000000058e5fe in deliver_fatal_thread_signal (sig=11) at ../../trunk/src/sysdep.c:1640
> #5  <signal handler called>
> #6  0x000000000056134d in PSEUDOVECTOR_TYPEP (a=0x400000000d000040, code=14) at ../../trunk/src/lisp.h:2377
> #7  0x00000000005613bd in PSEUDOVECTORP (a=..., code=14) at ../../trunk/src/lisp.h:2391
> #8  0x00000000005614d4 in SUB_CHAR_TABLE_P (a=...) at ../../trunk/src/lisp.h:2449

I can't reproduce it.  I tried to set Arabic font to 'DejaVu
Sans Mono' and 'PakType Naqsh' (they are shown in your
bug16457_debug.log), but Emacs didn't crash.  ???

But, by checking my latest change again, I noticed a silly
mistake :-(, and just installed the following patch.

=== modified file 'src/composite.c'
--- src/composite.c	2014-01-12 23:23:55 +0000
+++ src/composite.c	2014-01-19 13:24:59 +0000
@@ -1412,7 +1412,7 @@
       cmp_it->width = 0;
       for (i = cmp_it->nchars - 1; i >= 0; i--)
 	{
-	  c = XINT (LGSTRING_CHAR (gstring, cmp_it->from + i));
+	  c = XINT (LGSTRING_CHAR (gstring, from + i));
 	  cmp_it->nbytes += CHAR_BYTES (c);
 	  cmp_it->width += CHAR_WIDTH (c);
 	}

Could you please try again with the latest trunk code?
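
The one-line change swaps which index is used to address the characters. Here is a hedged sketch of why that matters, assuming (the thread does not spell this out) that `cmp_it->from` counts glyphs while the local `from` counts characters. The `struct glyph` layout, the sample data and the function names are invented for illustration, and plain UTF-8 lengths stand in for CHAR_BYTES.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical glyph record: it covers characters FROM..TO
   (inclusive) of the character array.  */
struct glyph
{
  size_t from;
  size_t to;
};

/* Number of bytes needed to encode code point C in standard UTF-8.
   Assumption: C is in the Unicode range 0..0x10FFFF.  */
static int
utf8_char_bytes (int c)
{
  assert (c >= 0 && c <= 0x10FFFF);
  if (c < 0x80)
    return 1;
  if (c < 0x800)
    return 2;
  if (c < 0x10000)
    return 3;
  return 4;
}

/* Byte length of the cluster that starts at glyph index GLYPH_IDX.
   If USE_GLYPH_INDEX is nonzero, reproduce the mistake of addressing
   the characters with the glyph index instead of the character
   index.  */
static int
cluster_nbytes (const int *chars, const struct glyph *glyphs,
                size_t glyph_idx, int use_glyph_index)
{
  size_t from = glyphs[glyph_idx].from;   /* character index */
  size_t nchars = glyphs[glyph_idx].to + 1 - from;
  size_t base = use_glyph_index ? glyph_idx : from;
  int nbytes = 0;
  for (size_t i = 0; i < nchars; i++)
    nbytes += utf8_char_bytes (chars[base + i]);
  return nbytes;
}
```

With characters {1604, 1575, 65} and two glyphs covering characters 0..1 and 2..2, the cluster at glyph index 1 is 1 byte long. Indexing with the glyph index instead reads character 1575 and reports 2 bytes, so the byte position drifts away from the character position.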

---
Kenichi Handa
handa@gnu.org






* bug#16457: 24.3.50; crash rendering Arabic Uthmani script
  2014-01-19 13:45 ` K. Handa
@ 2014-01-19 16:00   ` Dmitry Antipov
  0 siblings, 0 replies; 14+ messages in thread
From: Dmitry Antipov @ 2014-01-19 16:00 UTC (permalink / raw)
  To: K. Handa; +Cc: 16457

On 01/19/2014 05:45 PM, K. Handa wrote:

> But, by checking my latest change again, I noticed a silly
> mistake :-(, and just installed the following patch.
>
> === modified file 'src/composite.c'
> --- src/composite.c	2014-01-12 23:23:55 +0000
> +++ src/composite.c	2014-01-19 13:24:59 +0000
> @@ -1412,7 +1412,7 @@
>         cmp_it->width = 0;
>         for (i = cmp_it->nchars - 1; i >= 0; i--)
>   	{
> -	  c = XINT (LGSTRING_CHAR (gstring, cmp_it->from + i));
> +	  c = XINT (LGSTRING_CHAR (gstring, from + i));
>   	  cmp_it->nbytes += CHAR_BYTES (c);
>   	  cmp_it->width += CHAR_WIDTH (c);
>   	}
>
> Could you please try again with the latest trunk code?

Now it doesn't crash, and column number looks correct. Thanks.

Dmitry






