unofficial mirror of emacs-devel@gnu.org 
* HTML Mode and Turkish Locale - Segfault
@ 2006-11-26 20:58 Cafer Şimşek
  2006-11-27  6:37 ` Eli Zaretskii
  0 siblings, 1 reply; 10+ messages in thread
From: Cafer Şimşek @ 2006-11-26 20:58 UTC (permalink / raw)


Hi,

I'm new to this list, so I'm sorry if this subject has been posted
before.

I've googled it and found no results.

Emacs crashes randomly (segfault) in html-mode when using the
tr_TR.UTF-8 locale. I've tried the en_US.UTF-8 locale and it seems to
work there.

I've tried with both (from CVS and from Debian Repository)

Version: 22.0.91.1

(sorry for my bad English)

Best Regards.


-- 
Cafer 'cfb' Şimşek
http://cafer.org


* Re: HTML Mode and Turkish Locale - Segfault
  2006-11-26 20:58 HTML Mode and Turkish Locale - Segfault Cafer Şimşek
@ 2006-11-27  6:37 ` Eli Zaretskii
  2006-11-27  8:19   ` Kenichi Handa
  0 siblings, 1 reply; 10+ messages in thread
From: Eli Zaretskii @ 2006-11-27  6:37 UTC (permalink / raw)
  Cc: emacs-devel

> From: cfb@cafer.org (Cafer Şimşek)
> Date: Sun, 26 Nov 2006 22:58:29 +0200
> 
> Emacs crashes randomly (segfault) in html-mode when using the
> tr_TR.UTF-8 locale. I've tried the en_US.UTF-8 locale and it seems to
> work there.
> 
> I've tried with both (from CVS and from Debian Repository)
> 
> Version: 22.0.91.1

Thank you for your report.

However, there's not enough information in this for us to try to find
out what is wrong.  Please use "M-x report-emacs-bug RET" to provide
the information.  Also, since this is a segfault, please run GDB on
the core file, type the command "bt" inside GDB, and post the
resulting backtrace here.

Thanks.


* Re: HTML Mode and Turkish Locale - Segfault
  2006-11-27  6:37 ` Eli Zaretskii
@ 2006-11-27  8:19   ` Kenichi Handa
  2006-11-27 13:24     ` Eli Zaretskii
  2006-11-28  1:17     ` regex.c bug? - " Kenichi Handa
  0 siblings, 2 replies; 10+ messages in thread
From: Kenichi Handa @ 2006-11-27  8:19 UTC (permalink / raw)
  Cc: cfb, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 3183 bytes --]

In article <uu00lz96t.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

> > From: cfb@cafer.org (Cafer Şimşek)
> > Date: Sun, 26 Nov 2006 22:58:29 +0200
> > 
> > Emacs crashes randomly (segfault) in html-mode when using the
> > tr_TR.UTF-8 locale. I've tried the en_US.UTF-8 locale and it seems to
> > work there.
> > 
> > I've tried with both (from CVS and from Debian Repository)
> > 
> > Version: 22.0.91.1

> Thank you for your report.

> However, there's not enough information in this for us to try to find
> out what is wrong.  Please use "M-x report-emacs-bug RET" to provide
> the information.  Also, since this is a segfault, please run GDB on
> the core file, type the command "bt" inside GDB, and post the
> resulting backtrace here.

I can reproduce it with the following scenario (on Debian
testing) with the attached temp.html.  But I have not yet
found what is wrong.  I suspect that case-table handling has
a problem because it happens only in tr_TR.UTF-8.

(gdb) set env LANG=tr_TR.UTF-8
(gdb) run -Q temp.html

ESC : (garbage-collect) RET

Then Emacs crashes like this:

Program received signal SIGSEGV, Segmentation fault.
mark_object (arg=139689009) at alloc.c:5717
(gdb) bt
#0  mark_object (arg=139689009) at alloc.c:5717
#1  0x0813ab66 in mark_object (arg=139272845) at alloc.c:5825
#2  0x0813ab66 in mark_object (arg=141201765) at alloc.c:5825
#3  0x0813af6e in mark_object (arg=138980860) at alloc.c:5700
[...]
#119 0x0813aa7f in mark_object (arg=139883241) at alloc.c:5714
#120 0x0813af6e in mark_object (arg=137465060) at alloc.c:5700
#121 0x0813e8ff in Fgarbage_collect () at alloc.c:5156
#122 0x081522b3 in Feval (form=141197693) at eval.c:2325
#123 0x08152da7 in Ffuncall (nargs=2, args=0xafcfabb0) at eval.c:2997
#124 0x0817d61a in Fbyte_code (bytestr=136311491, vector=136311508, maxdepth=40) at bytecode.c:679
#125 0x08152844 in funcall_lambda (fun=136311436, nargs=2, arg_vector=0xafcface4) at eval.c:3184
#126 0x08152c5b in Ffuncall (nargs=3, args=0xafcface0) at eval.c:3054
#127 0x08154523 in Fapply (nargs=2, args=0xafcfad30) at eval.c:2485
#128 0x08154654 in apply1 (fn=137689233, arg=141197565) at eval.c:2749
#129 0x0814fdf7 in Fcall_interactively (function=137689233, record_flag=137464009, keys=137504524) at callint.c:406
#130 0x080f09c3 in Fcommand_execute (cmd=137689233, record_flag=137464009, keys=137464009, special=137464009) at keyboard.c:9867
#131 0x080fc00a in command_loop_1 () at keyboard.c:1858
#132 0x0815187b in internal_condition_case (bfun=0x80fbc90 <command_loop_1>, handlers=137508713, hfun=0x80f66a0 <cmd_error>) at eval.c:1481
#133 0x080f5a7e in command_loop_2 () at keyboard.c:1326
#134 0x0815193c in internal_catch (tag=137504921, func=0x80f5a50 <command_loop_2>, arg=137464009) at eval.c:1222
#135 0x080f64ee in command_loop () at keyboard.c:1305
#136 0x080f6878 in recursive_edit_1 () at keyboard.c:1003
#137 0x080f6966 in Frecursive_edit () at keyboard.c:1064
#138 0x080ecbb2 in main (argc=1526726658, argv=0xafcfb5c4) at emacs.c:1794

Lisp Backtrace:
"garbage-collect" (0x2)
"eval" (0x86a817d)
"eval-expression" (0x86a817d)
"call-interactively" (0x834f891)
(gdb) 
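
The suspicion that the Turkish locale's case handling is involved can be illustrated outside Emacs with a small C sketch. It is only an illustration under stated assumptions: Emacs uses its own case tables rather than the C library, and the helper names `lower_in_locale` and `show_lowercase_I` are hypothetical. The sketch asks the C library to lower-case 'I' in the "C" locale and in tr_TR.UTF-8, and reports when a locale is not installed.

```c
#include <locale.h>
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

/* Lower-case CH under the named locale.  Returns 0 and stores the
   result in *OUT on success, or -1 if the locale cannot be selected
   (for example because it is not installed). */
static int lower_in_locale(const char *locale_name, wint_t ch, wint_t *out)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    *out = towlower(ch);
    setlocale(LC_CTYPE, "C");   /* go back to the default "C" locale */
    return 0;
}

/* Print how 'I' is lower-cased in the "C" and Turkish locales. */
static void show_lowercase_I(void)
{
    const char *names[] = { "C", "tr_TR.UTF-8" };
    for (int i = 0; i < 2; ++i) {
        wint_t lower;
        if (lower_in_locale(names[i], L'I', &lower) == 0)
            printf("%-12s 'I' -> U+%04lX\n", names[i], (unsigned long) lower);
        else
            printf("%-12s locale not available\n", names[i]);
    }
}
```

Calling `show_lowercase_I()` prints one line per locale. On systems whose C library implements the Turkish casing rules, the tr_TR.UTF-8 line should show U+0131 (dotless i) rather than U+0069, which is a code point that no longer fits in a single byte.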

---
Kenichi Handa
handa@m17n.org


[-- Attachment #2: Type: text/html, Size: 208 bytes --]

[-- Attachment #3: Type: text/plain, Size: 142 bytes --]

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel


* Re: HTML Mode and Turkish Locale - Segfault
  2006-11-27  8:19   ` Kenichi Handa
@ 2006-11-27 13:24     ` Eli Zaretskii
  2006-11-28  1:17     ` regex.c bug? - " Kenichi Handa
  1 sibling, 0 replies; 10+ messages in thread
From: Eli Zaretskii @ 2006-11-27 13:24 UTC (permalink / raw)
  Cc: cfb, emacs-devel

> From: Kenichi Handa <handa@m17n.org>
> Date: Mon, 27 Nov 2006 17:19:19 +0900
> Cc: cfb@cafer.org, emacs-devel@gnu.org
> 
> ESC : (garbage-collect) RET
> 
> Then Emacs crashes like this:
> 
> Program received signal SIGSEGV, Segmentation fault.
> mark_object (arg=139689009) at alloc.c:5717
> (gdb) bt
> #0  mark_object (arg=139689009) at alloc.c:5717
> #1  0x0813ab66 in mark_object (arg=139272845) at alloc.c:5825

Sounds like the techniques described in etc/DEBUG should be used to
debug this, as this is a crash inside GC.


* regex.c bug? - Re: HTML Mode and Turkish Locale - Segfault
  2006-11-27  8:19   ` Kenichi Handa
  2006-11-27 13:24     ` Eli Zaretskii
@ 2006-11-28  1:17     ` Kenichi Handa
  2006-11-28  6:49       ` Stefan Monnier
  1 sibling, 1 reply; 10+ messages in thread
From: Kenichi Handa @ 2006-11-28  1:17 UTC (permalink / raw)
  Cc: eliz, cfb, emacs-devel

It seems that I have found the cause of this crash.

Currently we have this code in regex.c.

			if (multibyte)
			  SET_RANGE_TABLE_WORK_AREA_BIT (range_table_work,
							 re_wctype_to_bit (cc));

                        for (ch = 0; ch < 1 << BYTEWIDTH; ++ch)
			  {
			    int translated = TRANSLATE (ch);
			    if (re_iswctype (btowc (ch), cc))
			      SET_LIST_BIT (translated);
			  }

In tr_TR.UTF-8, 'I' is translated to #x51051 (U+0131).  But,
it seems that SET_LIST_BIT assumes that the argument is less
than 256 (or 128).  So, I've just installed the following
change.

@@ -2939,7 +2939,8 @@
                         for (ch = 0; ch < 1 << BYTEWIDTH; ++ch)
 			  {
 			    int translated = TRANSLATE (ch);
-			    if (re_iswctype (btowc (ch), cc))
+			    if (translated < (1 << BYTEWIDTH)
+				&& re_iswctype (btowc (ch), cc))
 			      SET_LIST_BIT (translated);
 			  }

If translated is set to a multibyte character, I think the
above SET_RANGE_TABLE_WORK_AREA_BIT handles such a case.
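
The idea behind the guard can be shown with a small standalone sketch. This is not the regex.c code: `translate_turkish`, `set_list_bit_checked`, `fill_upper_bitmap`, and `BITMAP_BYTES` are hypothetical names, and 0x131 merely stands in for whatever multibyte code the real translation table yields. The point is that a bitmap sized for 1 << BYTEWIDTH characters must never be indexed with a larger translated value.

```c
#include <stdbool.h>
#include <string.h>

#define BYTEWIDTH 8
/* One bit per single-byte character: 256 bits = 32 bytes. */
#define BITMAP_BYTES ((1 << BYTEWIDTH) / BYTEWIDTH)

/* Hypothetical stand-in for TRANSLATE: maps 'I' to a code above 255,
   as a Turkish case table would (here U+0131, dotless i). */
static int translate_turkish(int ch)
{
    return ch == 'I' ? 0x131 : ch;
}

/* Set the bit for C only if it fits in the bitmap.
   Returns true if the bit was set, false if C was out of range. */
static bool set_list_bit_checked(unsigned char *bitmap, int c)
{
    if (c < 0 || c >= (1 << BYTEWIDTH))
        return false;               /* would write past the bitmap */
    bitmap[c / BYTEWIDTH] |= (unsigned char) (1 << (c % BYTEWIDTH));
    return true;
}

/* Clear BITMAP (BITMAP_BYTES long), set the bits of the translated
   upper-case ASCII letters, and return how many translated characters
   were skipped because they do not fit in a single byte. */
static int fill_upper_bitmap(unsigned char *bitmap)
{
    int skipped = 0;
    memset(bitmap, 0, BITMAP_BYTES);
    for (int ch = 'A'; ch <= 'Z'; ++ch)
        if (!set_list_bit_checked(bitmap, translate_turkish(ch)))
            ++skipped;
    return skipped;
}
```

In this sketch the only skipped character is the translated 'I'. Without the range check, the same call would write to byte 0x131 / 8 = 38 of a 32-byte bitmap, which is the kind of out-of-bounds write that can surface much later, for instance during garbage collection.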

Stefan, could you please confirm that my guess above is
correct?

---
Kenichi Handa
handa@m17n.org

In article <E1Gobhv-0007ci-6R@etlken.m17n.org>, Kenichi Handa <handa@m17n.org> writes:

> [2  <text/html; US-ASCII (7bit)>]
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
> <html>

> <head>
>    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
>    <title> Sample </title>
> </head>

> <body>
> </body>
> </html>

* Re: regex.c bug? - Re: HTML Mode and Turkish Locale - Segfault
  2006-11-28  1:17     ` regex.c bug? - " Kenichi Handa
@ 2006-11-28  6:49       ` Stefan Monnier
  2006-11-29 18:15         ` Cafer Şimşek
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Monnier @ 2006-11-28  6:49 UTC (permalink / raw)
  Cc: eliz, cfb, emacs-devel

> In tr_TR.UTF-8, 'I' is translated to #x51051 (U+0131).  But,
> it seems that SET_LIST_BIT assumes that the argument is less
> than 256 (or 128).  So, I've just installed the following
> change.

> @@ -2939,7 +2939,8 @@
>                          for (ch = 0; ch < 1 << BYTEWIDTH; ++ch)
>  			  {
>  			    int translated = TRANSLATE (ch);
> -			    if (re_iswctype (btowc (ch), cc))
> +			    if (translated < (1 << BYTEWIDTH)
> +				&& re_iswctype (btowc (ch), cc))
>  			      SET_LIST_BIT (translated);
>  			  }

> If translated is set to a multibyte character, I think the
> above SET_RANGE_TABLE_WORK_AREA_BIT handles such a case.

> Stefan, could you please confirm that my guess above is
> correct?

That looks correct, yes.  Thank you,


        Stefan


* Re: regex.c bug? - Re: HTML Mode and Turkish Locale - Segfault
  2006-11-28  6:49       ` Stefan Monnier
@ 2006-11-29 18:15         ` Cafer Şimşek
  2006-11-29 19:46           ` Andreas Schwab
  2006-11-30  2:09           ` Kenichi Handa
  0 siblings, 2 replies; 10+ messages in thread
From: Cafer Şimşek @ 2006-11-29 18:15 UTC (permalink / raw)
  Cc: eliz, emacs-devel, cfb, Kenichi Handa

I'm still getting a segfault.

Program received signal SIGSEGV, Segmentation fault.
0x080e400a in re_set_syntax ()
(gdb)

I want to help to fix it, so how can I compile Emacs with debug
symbols?

Best Regards.

-- 
maybe you want to lost (lene)

Cafer 'cfb' Şimşek
http://cafer.org


* Re: regex.c bug? - Re: HTML Mode and Turkish Locale - Segfault
  2006-11-29 18:15         ` Cafer Şimşek
@ 2006-11-29 19:46           ` Andreas Schwab
  2006-11-30  2:09           ` Kenichi Handa
  1 sibling, 0 replies; 10+ messages in thread
From: Andreas Schwab @ 2006-11-29 19:46 UTC (permalink / raw)
  Cc: eliz, Kenichi Handa, Stefan Monnier, emacs-devel

cfb@cafer.org (Cafer Şimşek) writes:

> I'm still getting a segfault.
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x080e400a in re_set_syntax ()
> (gdb)
>
> I want to help to fix it, so how can I compile Emacs with debug
> symbols?

Just add -g to CFLAGS, either during configuring (./configure CFLAGS=-g
...) or when building (make CFLAGS=-g).  Actually, the default CFLAGS
should already include -g.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: regex.c bug? - Re: HTML Mode and Turkish Locale - Segfault
  2006-11-29 18:15         ` Cafer Şimşek
  2006-11-29 19:46           ` Andreas Schwab
@ 2006-11-30  2:09           ` Kenichi Handa
  2006-12-04  1:30             ` Cafer Simsek
  1 sibling, 1 reply; 10+ messages in thread
From: Kenichi Handa @ 2006-11-30  2:09 UTC (permalink / raw)
  Cc: eliz, emacs-devel, monnier, cfb

In article <87r6vm3yrw.fsf@medic.epidio.net>, cfb@cafer.org (Cafer Şimşek) writes:

> I'm still getting a segfault.

I've just installed a similar fix at another place that uses
SET_LIST_BIT.  So, please try the latest code again.

---
Kenichi Handa
handa@m17n.org


* Re: regex.c bug? - Re: HTML Mode and Turkish Locale - Segfault
  2006-11-30  2:09           ` Kenichi Handa
@ 2006-12-04  1:30             ` Cafer Simsek
  0 siblings, 0 replies; 10+ messages in thread
From: Cafer Simsek @ 2006-12-04  1:30 UTC (permalink / raw)
  Cc: eliz, monnier, emacs-devel

Kenichi Handa <handa@m17n.org> writes:

> I've just installed a similar fix at another place that uses
> SET_LIST_BIT.  So, please try the latest code again.

OK, thank you very much. But I found a problem in nnimap (maybe Gnus)
that I think is related to the same problem. In the Turkish locale
(tr_TR.UTF-8) it shows me the message

  "error in process filter: In imap-parse-body 2"

in the minibuffer and then keeps waiting when I try to open a group
from the *Group* buffer. In other locales (e.g. en_US.UTF-8) the
problem does not occur.

Best Regards.

-- 
rahmetli de spam yapardı

Cafer 'cfb' Simsek
http://cafer.org
