* bug#36507: 27.0.50; Crash on evaluating invalid UTF-8 byte sequence on MacOS
From: Stefan Kangas @ 2019-07-05 2:04 UTC
To: 36507
When evaluating the following expression, I get a crash under "emacs -Q"
compiled from current master.
(decode-coding-string "\xE3\x32\x9A\x36" 'chinese-gb18030)
The same expression runs without problems in batch mode on the same
system; it is now on master in test/lisp/bookmark-tests.el:281.
The expression was suggested in Bug#36452, where
Eli Zaretskii <eliz@gnu.org> writes:
> Please add to that text something that doesn't yield valid
> UTF-8 byte sequence. For example, these two strings:
>
> (decode-coding-string "\xE3\x32\x9A\x36" 'chinese-gb18030)
I think the issue as such is beyond me, but I can reproduce this every time.
Please let me know if you need help testing or more information.
Before the crash, I get this output:
Thread 1 received signal SIGSEGV, Segmentation fault.
0x00007fff8ddbd326 in CFCharacterSetIsLongCharacterMember () from /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation
Here is the stack trace, and report-emacs-bug info below:
(gdb) bt
#0 0x00007fff8ddbd326 in CFCharacterSetIsLongCharacterMember () from /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation
#1 0x0000000100437f31 in macfont_has_char (font=XIL(0x101937625), c=2246732) at macfont.m:2727
#2 0x000000010030bd9d in font_has_char (f=0x102849430, font=XIL(0x101937625), c=2246732) at font.c:3002
#3 0x00000001003d62c5 in fontset_find_font (fontset=XIL(0x10482b515), c=2246732, face=0x101163460, charset_id=-1, fallback=true) at fontset.c:676
#4 0x00000001003ccc19 in fontset_font (fontset=XIL(0x10189f835), c=2246732, face=0x101163460, id=-1) at fontset.c:799
#5 0x00000001003cc48c in face_for_char (f=0x102849430, face=0x101163460, c=2246732, pos=4, object=XIL(0)) at fontset.c:989
#6 0x00000001001a1fd3 in FACE_FOR_CHAR (f=0x102849430, face=0x101163460, character=2246732, pos=4, object=XIL(0)) at ./dispextern.h:1846
#7 0x0000000100040f2d in get_next_display_element (it=0x7fff5fbfd980) at xdisp.c:7447
#8 0x000000010004396b in move_it_in_display_line_to (it=0x7fff5fbfd980, to_charpos=42, to_x=-1, op=MOVE_TO_POS) at xdisp.c:8933
#9 0x000000010003f618 in move_it_to (it=0x7fff5fbfd980, to_charpos=42, to_x=-1, to_y=-1, to_vpos=-1, op=8) at xdisp.c:9683
#10 0x000000010005f2df in resize_mini_window (w=0x101836220, exact_p=true) at xdisp.c:11447
#11 0x000000010005be65 in resize_mini_window_1 (a1=4320354848, exactly=XIL(0xb970)) at xdisp.c:11364
#12 0x000000010005bce9 in with_echo_area_buffer (w=0x101836220, which=0, fn=0x10005be10 <resize_mini_window_1>, a1=4320354848, a2=XIL(0xb970)) at xdisp.c:11086
#13 0x000000010005b7cc in resize_echo_area_exactly () at xdisp.c:11342
#14 0x00000001001ac700 in command_loop_1 () at keyboard.c:1484
#15 0x00000001002d43af in internal_condition_case (bfun=0x1001ab850 <command_loop_1>, handlers=XIL(0x4c50), hfun=0x1001ca1a0 <cmd_error>) at eval.c:1352
#16 0x00000001001ca081 in command_loop_2 (ignore=XIL(0)) at keyboard.c:1091
#17 0x00000001002d3508 in internal_catch (tag=XIL(0xbfd0), func=0x1001ca050 <command_loop_2>, arg=XIL(0)) at eval.c:1113
#18 0x00000001001aab25 in command_loop () at keyboard.c:1070
#19 0x00000001001aa927 in recursive_edit_1 () at keyboard.c:714
#20 0x00000001001aad76 in Frecursive_edit () at keyboard.c:786
#21 0x00000001001a7e27 in main (argc=2, argv=0x7fff5fbffad8) at emacs.c:2103
[New Thread 0x20db of process 22966]
[New Thread 0x2203 of process 22966]
[New Thread 0x145b of process 22966]
(gdb) xbacktrace
(gdb)
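For reference, the character argument 2246732 that appears in frames #1 through #6 is 0x22484C in hexadecimal. The sketch below classifies such a value against the two limits defined in Emacs's src/character.h, MAX_UNICODE_CHAR (0x10FFFF) and MAX_CHAR (0x3FFFFF). The enum and the function name are hypothetical and are not part of Emacs.

```c
#include <stdbool.h>

/* Limits as defined in Emacs's src/character.h. */
#define MAX_UNICODE_CHAR 0x10FFFF
#define MAX_CHAR 0x3FFFFF

/* Hypothetical classification, for illustration only. */
typedef enum
{
  CHAR_INVALID,        /* negative, or larger than MAX_CHAR */
  CHAR_UNICODE,        /* 0 .. MAX_UNICODE_CHAR */
  CHAR_BEYOND_UNICODE  /* MAX_UNICODE_CHAR + 1 .. MAX_CHAR */
} char_class;

/* Classify an Emacs character code C by range. */
static char_class
classify_char (int c)
{
  if (c < 0 || c > MAX_CHAR)
    return CHAR_INVALID;
  if (c <= MAX_UNICODE_CHAR)
    return CHAR_UNICODE;
  return CHAR_BEYOND_UNICODE;
}
```

Under these definitions, classify_char (2246732) returns CHAR_BEYOND_UNICODE: the value is a valid Emacs character code, but it is not a Unicode code point.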
In GNU Emacs 27.0.50 (build 1, x86_64-apple-darwin15.6.0, NS
appkit-1404.47 Version 10.11.6 (Build 15G22010))
of 2019-07-05 built on Stefans-MBP
Repository revision: 44f199648b0c986a0ac7608f4e9d803c619ae2d6
Repository branch: master
Windowing system distributor 'Apple', version 10.3.1404
System Description: Mac OS X 10.11.6
Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Configured using:
'configure --without-makeinfo --enable-checking=yes,glyphs
--enable-check-lisp-object-type 'CFLAGS=-O0 -g3''
Configured features:
NOTIFY KQUEUE ACL GNUTLS LIBXML2 ZLIB TOOLKIT_SCROLL_BARS NS THREADS
PDUMPER LCMS2 GMP
Important settings:
value of $LANG: en_SE@calendar=iso8601.UTF-8
locale-coding-system: utf-8-unix
Major mode: Lisp Interaction
Minor modes in effect:
tooltip-mode: t
global-eldoc-mode: t
eldoc-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
transient-mark-mode: t
Load-path shadows:
None found.
Features:
(shadow sort mail-extr emacsbug message rmc puny dired dired-loaddefs
format-spec rfc822 mml easymenu mml-sec password-cache epa derived epg
epg-config gnus-util rmail rmail-loaddefs text-property-search time-date
seq byte-opt gv bytecomp byte-compile cconv mm-decode mm-bodies
mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader cl-loaddefs
cl-lib sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils
elec-pair tooltip eldoc electric uniquify ediff-hook vc-hooks
lisp-float-type mwheel term/ns-win ns-win ucs-normalize mule-util
term/common-win tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode elisp-mode lisp-mode
prog-mode register page menu-bar rfn-eshadow isearch timer select
scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932
hebrew greek romanian slovak czech european ethiopic indian cyrillic
chinese composite charscript charprop case-table epa-hook jka-cmpr-hook
help simple abbrev obarray minibuffer cl-preloaded nadvice loaddefs
button faces cus-face macroexp files text-properties overlay sha1 md5
base64 format env code-pages mule custom widget hashtable-print-readable
backquote threads kqueue cocoa ns lcms2 multi-tty make-network-process
emacs)
Memory information:
((conses 16 44089 5693)
(symbols 48 5808 1)
(strings 32 15104 1574)
(string-bytes 1 497022)
(vectors 16 9842)
(vector-slots 8 115136 11088)
(floats 8 17 25)
(intervals 56 183 0)
(buffers 992 11))

From: YAMAMOTO Mitsuharu @ 2019-07-05 2:22 UTC
To: Stefan Kangas; +Cc: 36507
On Fri, 05 Jul 2019 11:04:21 +0900,
Stefan Kangas wrote:
>
> When evaluating the following expression, I get a crash under "emacs -Q"
> compiled from current master.
>
> (decode-coding-string "\xE3\x32\x9A\x36" 'chinese-gb18030)
>
> This expression is tested in batch mode with no problems on the same
> system, now on master in test/lisp/bookmark-tests.el:281.
>
> The expression was suggested in Bug#36452, where
>
> Eli Zaretskii <eliz@gnu.org> writes:
> > Please add to that text something that doesn't yield valid
> > UTF-8 byte sequence. For example, these two strings:
> >
> > (decode-coding-string "\xE3\x32\x9A\x36" 'chinese-gb18030)
>
> I think the issue as such is beyond me, but I can reproduce this every time.
> Please let me know if you need help testing or more information.
>
> Before crash, I get this output:
> Thread 1 received signal SIGSEGV, Segmentation fault.
> 0x00007fff8ddbd326 in CFCharacterSetIsLongCharacterMember () from
> /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation
Please try the patch below.
YAMAMOTO Mitsuharu
mituharu@math.s.chiba-u.ac.jp
diff --git a/src/macfont.m b/src/macfont.m
index f736fbf0e1e..2b7f963fd61 100644
--- a/src/macfont.m
+++ b/src/macfont.m
@@ -2076,7 +2076,7 @@ static int macfont_variation_glyphs (struct font *, int c,
ptrdiff_t j;
for (j = 0; j < ASIZE (chars); j++)
- if (TYPE_RANGED_FIXNUMP (UTF32Char, AREF (chars, j))
+ if (RANGED_FIXNUMP (0, AREF (chars, j), MAX_UNICODE_CHAR)
&& CFCharacterSetIsLongCharacterMember (desc_charset,
XFIXNAT (AREF (chars, j))))
break;
@@ -2710,6 +2710,9 @@ So we use CTFontDescriptorCreateMatchingFontDescriptor (no
int result;
CFCharacterSetRef charset;
+ if (c < 0 || c > MAX_UNICODE_CHAR)
+ return false;
+
block_input ();
if (FONT_ENTITY_P (font))
{
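The guard added by the second hunk can be illustrated in isolation. The following is a minimal sketch, not Emacs or CoreFoundation code: toy_charset and its functions are hypothetical stand-ins for a character set that only covers the Unicode range, so an unchecked lookup with a larger value would read outside the bitmap. Whether CFCharacterSetIsLongCharacterMember fails for a similar reason is an assumption; the thread only shows that it crashes for c=2246732.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value as in Emacs's src/character.h. */
#define MAX_UNICODE_CHAR 0x10FFFF

/* Hypothetical character set: one bit per Unicode code point. */
typedef struct
{
  uint8_t bits[(MAX_UNICODE_CHAR + 1) / 8];
} toy_charset;

/* Add code point C to SET.  The caller must pass C <= MAX_UNICODE_CHAR. */
static void
toy_charset_add (toy_charset *set, uint32_t c)
{
  set->bits[c >> 3] |= (uint8_t) (1u << (c & 7));
}

/* Unchecked lookup: indexes past the bitmap if C > MAX_UNICODE_CHAR. */
static bool
toy_charset_member_unchecked (const toy_charset *set, uint32_t c)
{
  return (set->bits[c >> 3] >> (c & 7)) & 1;
}

/* Checked lookup, mirroring the range test added to macfont_has_char:
   reject anything outside 0 .. MAX_UNICODE_CHAR before the lookup.  */
static bool
toy_has_char (const toy_charset *set, int c)
{
  if (c < 0 || c > MAX_UNICODE_CHAR)
    return false;
  return toy_charset_member_unchecked (set, (uint32_t) c);
}
```

With this guard, toy_has_char returns false for 2246732 and for negative values without touching the bitmap, while lookups inside the Unicode range behave as before.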

From: Stefan Kangas @ 2019-07-05 11:36 UTC
To: YAMAMOTO Mitsuharu; +Cc: 36507
YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp> writes:
> > > (decode-coding-string "\xE3\x32\x9A\x36" 'chinese-gb18030)
> >
> > I think the issue as such is beyond me, but I can reproduce this every time.
> > Please let me know if you need help testing or more information.
> >
> > Before crash, I get this output:
> > Thread 1 received signal SIGSEGV, Segmentation fault.
> > 0x00007fff8ddbd326 in CFCharacterSetIsLongCharacterMember () from
> > /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation
>
> Please try the patch below.
The patch works; I no longer get the crash. The return value is now:
"#(" " 0 1 (charset gb18030-4-byte-ext-2))"
Note that the " " is a visually wide white space character that I
can't copy to other programs for some reason. It is here replaced
with a space. Not sure if this is expected or not.
Thank you for providing a fix so swiftly.
Best regards,
Stefan Kangas

From: YAMAMOTO Mitsuharu @ 2019-07-06 5:26 UTC
To: Stefan Kangas; +Cc: 36507-done
On Fri, 05 Jul 2019 20:36:34 +0900,
Stefan Kangas wrote:
>
> YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp> writes:
> > > > (decode-coding-string "\xE3\x32\x9A\x36" 'chinese-gb18030)
> > >
> > > I think the issue as such is beyond me, but I can reproduce this every time.
> > > Please let me know if you need help testing or more information.
> > >
> > > Before crash, I get this output:
> > > Thread 1 received signal SIGSEGV, Segmentation fault.
> > > 0x00007fff8ddbd326 in CFCharacterSetIsLongCharacterMember () from
> > > /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation
> >
> > Please try the patch below.
>
> The patch works; I no longer get the crash. The return value is now:
>
> "#(" " 0 1 (charset gb18030-4-byte-ext-2))"
Thanks. I pushed the patch to master and the emacs-26 branch as
0e15bd11dc0 and f0db687a285, respectively. (I forgot to add the bug
ID to the commit log for the former.) Closing the bug.
> Note that the " " is a visually wide white space character that I
> can't copy to other programs for some reason. It is here replaced
> with a space. Not sure if this is expected or not.
On the Mac port, from which macfont.m originally came, the character
is displayed as a boxed hexadecimal code. So this would be a separate
issue specific to the NS port.
YAMAMOTO Mitsuharu
mituharu@math.s.chiba-u.ac.jp