* 23.0.60; Segmentation fault loading auto-lang.el
@ 2008-03-30 18:38 intrigeri
  0 siblings, 0 replies; 6+ messages in thread
From: intrigeri @ 2008-03-30 18:38 UTC (permalink / raw)
  To: bug-gnu-emacs; +Cc: Colin Marquardt, rfrancoise

Hello,

At first glance:
- download http://www.marquardt-home.de/auto-lang.el to ~/.elisp/
- run emacs -Q
- M-x load-file
- choose file ~/.elisp/auto-lang.el
=> Emacs segfaults (same result with emacs -Q -nw)

Trying harder:
- download http://www.marquardt-home.de/auto-lang.el to ~/.elisp/
- run emacs -Q
- C-x C-f ~/.elisp/auto-lang.el
- select region from the beginning of the file to, and including, line 1398
- eval-region
=> Emacs evaluates the region just fine
- then eval the next sexp: (defvar al-german-common-8bit-regexp ... )
=> Emacs segfaults (same result with emacs -Q -nw)

I know that auto-lang.el is not part of GNU Emacs, but Emacs should
not segfault when loading an arbitrary *.el file.

I’m running Romain Françoise’s emacs-snapshot Debian package, based on
Emacs CVS (2008-03-28):

In GNU Emacs 23.0.60.1 (i486-pc-linux-gnu, GTK+ Version 2.12.9)
 of 2008-03-28 on elegiac, modified by Debian
 (emacs-snapshot package, version 1:20080328-1)
configured using `configure  '--build' 'i486-linux-gnu' '--host' 'i486-linux-gnu' '--prefix=/usr' '--sharedstatedir=/var/lib' '--libexecdir=/usr/lib' '--localstatedir=/var' '--infodir=/usr/share/info' '--mandir=/usr/share/man' '--with-pop=yes' '--enable-locallisppath=/etc/emacs-snapshot:/etc/emacs:/usr/local/share/emacs/23.0.60/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/23.0.60/site-lisp:/usr/share/emacs/site-lisp:/usr/share/emacs/23.0.60/leim' '--with-x=yes' '--with-x-toolkit=gtk' 'build_alias=i486-linux-gnu' 'host_alias=i486-linux-gnu' 'CFLAGS=-DDEBIAN -DSITELOAD_PURESIZE_EXTRA=5000 -g -O2''

Important settings:
  value of $LC_ALL: fr_FR.UTF-8
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: UTF-8
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: fr_FR.UTF-8
  value of $XMODIFIERS: nil
  locale-coding-system: utf-8-unix
  default-enable-multibyte-characters: t

Major mode: Lisp Interaction

Minor modes in effect:
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  global-auto-composition-mode: t
  auto-composition-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
ESC x r e p o r t - e m TAB RET

Recent messages:
("emacs" "-Q")

Bye,
-- 
  intrigeri <intrigeri@boum.org>





* Re: 23.0.60; Segmentation fault loading auto-lang.el
@ 2008-04-08  5:29 Chong Yidong
  2008-04-08  6:52 ` Kenichi Handa
  0 siblings, 1 reply; 6+ messages in thread
From: Chong Yidong @ 2008-04-08  5:29 UTC (permalink / raw)
  To: emacs-devel; +Cc: intrigeri, 103

> - download http://www.marquardt-home.de/auto-lang.el to ~/.elisp/
> - run emacs -Q
> - M-x load-file
> - choose file ~/.elisp/auto-lang.el
> => Emacs segfaults (same result with emacs -Q -nw)

This is due to an infinite nesting depth in regexp-opt, which can be
tracked down to the following problem:

(let ((str (string-as-unibyte "ä")))
  (string-match (char-to-string (string-to-char str)) str))

evaluates to 0 in Emacs 22, and to nil in Emacs 23.  It turns out that
this screws up the use of all-completions in regexp-opt-group.

Anyone have any idea what's going on here?
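
A minimal sketch of why that mismatch is fatal, assuming regexp-opt-group
splits its input with the `all-completions' / `nthcdr' pair found in
regexp-opt.el.  The function name and the word list are hypothetical and
not taken from auto-lang.el:

```elisp
;; Hypothetical helper mirroring the prefix split in `regexp-opt-group'.
(defun demo-old-split (strings)
  "Return (HALF1-LENGTH HALF2-IS-WHOLE-LIST) for the char-based split."
  (let* ((char  (char-to-string (string-to-char (car strings))))
         (half1 (all-completions char strings))
         (half2 (nthcdr (length half1) strings)))
    (list (length half1) (eq half2 strings))))

;; Unibyte strings written with octal escapes (the bytes of "ähnlich"
;; and "älter" in Emacs 23's internal representation).
(demo-old-split (list "\303\244hnlich" "\303\244lter"))
```

If `half1' comes back empty, `half2' is the whole list again, so every
recursive call sees the same input and the nesting never ends.  On
Emacs 23 or later the call above should return (0 t); for an ASCII-only
list such as ("abc" "abd") it returns (2 nil).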





* Re: 23.0.60; Segmentation fault loading auto-lang.el
  2008-04-08  5:29 23.0.60; Segmentation fault loading auto-lang.el Chong Yidong
@ 2008-04-08  6:52 ` Kenichi Handa
  2008-04-08 16:50   ` Chong Yidong
  0 siblings, 1 reply; 6+ messages in thread
From: Kenichi Handa @ 2008-04-08  6:52 UTC (permalink / raw)
  To: Chong Yidong; +Cc: intrigeri, 103, emacs-devel


In article <87r6dg3oe2.fsf@stupidchicken.com>, Chong Yidong <cyd@stupidchicken.com> writes:

> > - download http://www.marquardt-home.de/auto-lang.el to ~/.elisp/
> > - run emacs -Q
> > - M-x load-file
> > - choose file ~/.elisp/auto-lang.el
> > => Emacs segfaults (same result with emacs -Q -nw)

> This is due to an infinite nesting depth in regexp-opt, which can be
> tracked down to the following problem:

> (let ((str (string-as-unibyte "ä")))
>   (string-match (char-to-string (string-to-char str)) str))

> evaluates to 0 in Emacs 22, and to nil in Emacs 23.  It turns out that
> this screws up the use of all-completions in regexp-opt-group.

> Anyone have any idea what's going on here?

(string-as-unibyte "ä") => "\303\244"
(string-to-char "\303\244") => 195 (because ?\303 == 195)
(char-to-string 195) => "Ã" (because 195==0xC3 U+00C3=='Ã')
(string-match "Ã" "ä") => nil (obvious)

Any Lisp program that depends on the result of
string-as-unibyte (thus Emacs' internal character
representation) won't work in Emacs 23.

---
Kenichi Handa
handa@ni.aist.go.jp





* Re: 23.0.60; Segmentation fault loading auto-lang.el
  2008-04-08  6:52 ` Kenichi Handa
@ 2008-04-08 16:50   ` Chong Yidong
  2008-04-09  1:42     ` Stefan Monnier
  2008-04-09  2:19     ` Kenichi Handa
  0 siblings, 2 replies; 6+ messages in thread
From: Chong Yidong @ 2008-04-08 16:50 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: intrigeri, 103, emacs-devel

Kenichi Handa <handa@m17n.org> writes:

> In article <87r6dg3oe2.fsf@stupidchicken.com>, Chong Yidong <cyd@stupidchicken.com> writes:
>
>> > - download http://www.marquardt-home.de/auto-lang.el to ~/.elisp/
>> > - run emacs -Q
>> > - M-x load-file
>> > - choose file ~/.elisp/auto-lang.el
>> > => Emacs segfaults (same result with emacs -Q -nw)
>
>> This is due to an infinite nesting depth in regexp-opt, which can be
>> tracked down to the following problem:
>
>> (let ((str (string-as-unibyte "ä")))
>>   (string-match (char-to-string (string-to-char str)) str))
>
>> evaluates to 0 in Emacs 22, and to nil in Emacs 23.  It turns out that
>> this screws up the use of all-completions in regexp-opt-group.
>
>> Anyone have any idea what's going on here?
>
> (string-as-unibyte "ä") => "\303\244"
> (string-to-char "\303\244") => 195 (because ?\303 == 195)
> (char-to-string 195) => "Ã" (because 195==0xC3 U+00C3=='Ã')
> (string-match "Ã" "ä") => nil (obvious)
>
> Any Lisp program that depends on the result of
> string-as-unibyte (thus Emacs' internal character
> representation) won't work in Emacs 23.

I see.  However, maybe the following change to regexp-opt-group in
regexp-opt.el would make things a little more predictable.  What do you
think?

*** trunk/lisp/emacs-lisp/regexp-opt.el.~1.37.~	2008-03-14 17:17:34.000000000 -0400
--- trunk/lisp/emacs-lisp/regexp-opt.el	2008-04-08 12:46:49.000000000 -0400
***************
*** 226,232 ****
  
  	      ;; Otherwise, divide the list into those that start with a
  	      ;; particular letter and those that do not, and recurse on them.
! 	      (let* ((char (char-to-string (string-to-char (car strings))))
  		     (half1 (all-completions char strings))
  		     (half2 (nthcdr (length half1) strings)))
  		(concat open-group
--- 226,232 ----
  
  	      ;; Otherwise, divide the list into those that start with a
  	      ;; particular letter and those that do not, and recurse on them.
! 	      (let* ((char (substring (car strings) 0 1))
  		     (half1 (all-completions char strings))
  		     (half2 (nthcdr (length half1) strings)))
  		(concat open-group
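
To illustrate what the patched line buys, here is a small sketch (the
helper name and sample words are hypothetical).  `substring' returns a
string with the same representation as its argument, so the prefix and
the list it is matched against always agree:

```elisp
;; Hypothetical helper mirroring the patched split in `regexp-opt-group'.
(defun demo-new-split (strings)
  "Return (PREFIX-IS-MULTIBYTE HALF1-LENGTH) for the substring-based split."
  (let* ((char  (substring (car strings) 0 1))
         (half1 (all-completions char strings)))
    (list (multibyte-string-p char) (length half1))))

(demo-new-split (list "\303\244hnlich" "zebra"))   ; unibyte input
(demo-new-split (list "ähnlich" "zebra"))          ; multibyte input
```

Assuming the file is decoded correctly, the first call should return
(nil 1) and the second (t 1): in both cases the first word completes
its own prefix, so the list shrinks on every step.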





* Re: 23.0.60; Segmentation fault loading auto-lang.el
  2008-04-08 16:50   ` Chong Yidong
@ 2008-04-09  1:42     ` Stefan Monnier
  2008-04-09  2:19     ` Kenichi Handa
  1 sibling, 0 replies; 6+ messages in thread
From: Stefan Monnier @ 2008-04-09  1:42 UTC (permalink / raw)
  To: Chong Yidong; +Cc: intrigeri, emacs-devel, 103, Kenichi Handa

>>> (let ((str (string-as-unibyte "ä")))
>>> (string-match (char-to-string (string-to-char str)) str))
>> 
>>> evaluates to 0 in Emacs 22, and to nil in Emacs 23.  It turns out that
>>> this screws up the use of all-completions in regexp-opt-group.
>> 
>>> Anyone have any idea what's going on here?
>> 
>> (string-as-unibyte "ä") => "\303\244"
>> (string-to-char "\303\244") => 195 (because ?\303 == 195)
>> (char-to-string 195) => "Ã" (because 195==0xC3 U+00C3=='Ã')
>> (string-match "Ã" "ä") => nil (obvious)
>> 
>> Any Lisp program that depends on the result of
>> string-as-unibyte (thus Emacs' internal character
>> representation) won't work in Emacs 23.

Notice that the problem is unrelated to string-as-unibyte:

   (string-match (char-to-string (string-to-char str)) str)

this should intuitively always return 0.  Of course, once you replace
`char-to-string' with just `string', you may be reminded that Emacs-23
introduced `unibyte-string', which leads you to the key: if `str' is
unibyte, you need to do

   (string-match (unibyte-string (string-to-char str)) str)

In Emacs-22, `string' used a heuristic to decide whether to build
a unibyte or multibyte string, and more importantly, the character
representing byte code 209 had code 209, whereas in Emacs-23, we have
the strange situation that byte 209 is character 4194257.

So an integer <256 needs to be accompanied by some contextual info
that says whether it represents a char or a byte, otherwise you get
ambiguity that leads to bugs.  And string-to-char returns either a byte
or a char depending on whether the string was unibyte or multibyte.
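
The distinction can be checked directly.  A small sketch, assuming
Emacs 23 or later (where `unibyte-string' exists); the function name is
made up for illustration:

```elisp
;; Hypothetical demo function, not part of Emacs.
(defun demo-byte-vs-char ()
  "Compare the char-based and byte-based round trips on a unibyte string."
  (let* ((str  "\303\244")             ; unibyte: two raw bytes
         (code (string-to-char str)))  ; 195, a byte here rather than a character
    (list code
          (string-match (char-to-string code) str)   ; multibyte "Ã" vs. raw bytes
          (string-match (unibyte-string code) str)   ; raw byte vs. raw bytes
          (unibyte-char-to-multibyte 209))))         ; byte 209 as a character code

(demo-byte-vs-char)
```

This should evaluate to (195 nil 0 4194257): the same integer 195 fails
to match when treated as a character and matches when treated as a
byte, and byte 209 maps to character 4194257 as described above.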

> I see.  However, maybe the following change to regexp-opt-group in
> regexp-opt.el would make things a little more predictable.  What do you
> think?

Yes, it looks like a good fix.  Maybe `substring-no-properties' would
be even better.


        Stefan







* Re: 23.0.60; Segmentation fault loading auto-lang.el
  2008-04-08 16:50   ` Chong Yidong
  2008-04-09  1:42     ` Stefan Monnier
@ 2008-04-09  2:19     ` Kenichi Handa
  1 sibling, 0 replies; 6+ messages in thread
From: Kenichi Handa @ 2008-04-09  2:19 UTC (permalink / raw)
  To: Chong Yidong; +Cc: intrigeri, 103, emacs-devel

In article <87skxwl29o.fsf@stupidchicken.com>, Chong Yidong <cyd@stupidchicken.com> writes:

> > Any Lisp program that depends on the result of
> > string-as-unibyte (thus Emacs' internal character
> > representation) won't work in Emacs 23.

> I see.  However, maybe the following change to regexp-opt-group in
> regexp-opt.el would make things a little more predictable.  What do you
> think?

I agree because that change will avoid a unibyte string
being changed to multibyte by accident.

But I've just downloaded auto-lang.el and found that it
contains code like this:

(string-as-multibyte
 (regexp-opt
  (mapcar 'string-as-unibyte
	  (append
	   al-german-common-words
	   al-german-8bit-words
	   nil))))

All of them should be changed to this simple form:
  (regexp-opt (append al-german-common-words al-german-8bit-words))

The German case above works only by chance, but
al-danish-common-words doesn't.  You'll see peculiar 8-bit
codes in it.

Also, the file should have a coding tag.
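
As a sketch of the simple form, with short hypothetical word lists
standing in for al-german-common-words and al-german-8bit-words (their
real contents are not shown in this thread):

```elisp
;; Hypothetical stand-ins for the word lists in auto-lang.el.
(defvar demo-german-common-words '("und" "oder" "nicht"))
(defvar demo-german-8bit-words   '("für" "über" "schön"))

;; Build the regexp straight from the multibyte strings, with no
;; string-as-unibyte / string-as-multibyte round trip.
(defvar demo-german-regexp
  (regexp-opt (append demo-german-common-words demo-german-8bit-words)))

(string-match demo-german-regexp "das ist schön")
```

Provided the source file is decoded correctly, which is what a coding
tag is for, the last form should return 8, the position of "schön".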

---
Kenichi Handa
handa@ni.aist.go.jp








