Crashes with non-default language environments

unofficial mirror of emacs-devel@gnu.org 

* Crashes with non-default language environments
@ 2008-02-07 19:54 Juri Linkov
  2008-02-09 22:17 ` Juri Linkov
  0 siblings, 1 reply; 16+ messages in thread
From: Juri Linkov @ 2008-02-07 19:54 UTC (permalink / raw)
  To: emacs-pretest-bug

In GNU Emacs 23.0.60 (x86_64-unknown-linux-gnu, GTK+ Version 2.12.5)
trying to open an article in Gnus causes an Emacs crash.

This crash is reproducible with a small .emacs file that contains
only the necessary settings to start Gnus, and a line to set
the language environment `(set-language-environment 'cyrillic-koi8)'.

The backtrace is below:

Breakpoint 1, abort () at emacs.c:432
432     {
(gdb) bt
#0  abort () at emacs.c:432
#1  0x0000000000570f1f in Fbyte_code (bytestr=10713489, vector=0,
    maxdepth=<value optimized out>) at bytecode.c:1673
#2  0x0000000000547078 in funcall_lambda (fun=37761044, nargs=0,
    arg_vector=0x7fff71b3a078) at eval.c:3212
#3  0x0000000000547427 in Ffuncall (nargs=1, args=<value optimized out>)
    at eval.c:3082
#4  0x0000000000571897 in Fbyte_code (bytestr=<value optimized out>, vector=0,
    maxdepth=<value optimized out>) at bytecode.c:679
#5  0x0000000000547078 in funcall_lambda (fun=34745972, nargs=0,
    arg_vector=0x7fff71b3a288) at eval.c:3212
#6  0x0000000000547427 in Ffuncall (nargs=1, args=<value optimized out>)
    at eval.c:3082
#7  0x0000000000548ac5 in run_hook_with_args (nargs=1, args=0x7fff71b3a280,
    cond=to_completion) at eval.c:2684
#8  0x0000000000548c03 in Frun_hooks (nargs=1, args=<value optimized out>)
    at eval.c:2547
#9  0x00000000005476dc in Ffuncall (nargs=2, args=<value optimized out>)
    at eval.c:3006
#10 0x0000000000548e36 in Fapply (nargs=2, args=0x7fff71b3a428) at eval.c:2458
#11 0x00000000005476dc in Ffuncall (nargs=3, args=<value optimized out>)
    at eval.c:3006
#12 0x0000000000571897 in Fbyte_code (bytestr=<value optimized out>,
    vector=11289745, maxdepth=<value optimized out>) at bytecode.c:679
#13 0x0000000000547078 in funcall_lambda (fun=20476148, nargs=1,
    arg_vector=0x7fff71b3a5d8) at eval.c:3212
#14 0x0000000000547427 in Ffuncall (nargs=2, args=<value optimized out>)
    at eval.c:3082
#15 0x0000000000571897 in Fbyte_code (bytestr=<value optimized out>, vector=0,
    maxdepth=<value optimized out>) at bytecode.c:679
#16 0x0000000000547078 in funcall_lambda (fun=32606148, nargs=1,
    arg_vector=0x7fff71b3a798) at eval.c:3212
#17 0x0000000000547427 in Ffuncall (nargs=2, args=<value optimized out>)
    at eval.c:3082
#18 0x0000000000571897 in Fbyte_code (bytestr=<value optimized out>,
    vector=33559377, maxdepth=<value optimized out>) at bytecode.c:679
#19 0x0000000000547078 in funcall_lambda (fun=34589764, nargs=2,
    arg_vector=0x7fff71b3a958) at eval.c:3212
#20 0x0000000000547427 in Ffuncall (nargs=3, args=<value optimized out>)
    at eval.c:3082
#21 0x0000000000571897 in Fbyte_code (bytestr=<value optimized out>,
    vector=10973281, maxdepth=<value optimized out>) at bytecode.c:679
#22 0x0000000000547078 in funcall_lambda (fun=34586116, nargs=2,
    arg_vector=0x7fff71b3ab28) at eval.c:3212
#23 0x0000000000547427 in Ffuncall (nargs=3, args=<value optimized out>)
    at eval.c:3082
#24 0x0000000000571897 in Fbyte_code (bytestr=<value optimized out>,
    vector=13838577, maxdepth=<value optimized out>) at bytecode.c:679
#25 0x0000000000547078 in funcall_lambda (fun=34608756, nargs=0,
    arg_vector=0x7fff71b3ad18) at eval.c:3212
#26 0x0000000000547427 in Ffuncall (nargs=1, args=<value optimized out>)
    at eval.c:3082
#27 0x0000000000548ac5 in run_hook_with_args (nargs=1, args=0x7fff71b3ad10,
    cond=to_completion) at eval.c:2684
#28 0x0000000000548c03 in Frun_hooks (nargs=1, args=<value optimized out>)
    at eval.c:2547
#29 0x00000000005476dc in Ffuncall (nargs=2, args=<value optimized out>)
    at eval.c:3006
#30 0x0000000000548e36 in Fapply (nargs=2, args=0x7fff71b3aeb8) at eval.c:2458
#31 0x00000000005476dc in Ffuncall (nargs=3, args=<value optimized out>)
    at eval.c:3006
#32 0x0000000000571897 in Fbyte_code (bytestr=<value optimized out>,
    vector=11289745, maxdepth=<value optimized out>) at bytecode.c:679
#33 0x0000000000547078 in funcall_lambda (fun=20476148, nargs=1,
    arg_vector=0x7fff71b3b068) at eval.c:3212
#34 0x0000000000547427 in Ffuncall (nargs=2, args=<value optimized out>)
    at eval.c:3082
#35 0x0000000000571897 in Fbyte_code (bytestr=<value optimized out>, vector=1,
    maxdepth=<value optimized out>) at bytecode.c:679
#36 0x0000000000547078 in funcall_lambda (fun=34804052, nargs=2,
    arg_vector=0x7fff71b3b218) at eval.c:3212
#37 0x0000000000547427 in Ffuncall (nargs=3, args=<value optimized out>)
    at eval.c:3082
#38 0x0000000000571897 in Fbyte_code (bytestr=<value optimized out>,
    vector=33713169, maxdepth=<value optimized out>) at bytecode.c:679
#39 0x0000000000547078 in funcall_lambda (fun=33714644, nargs=2,
    arg_vector=0x7fff71b3b3c8) at eval.c:3212
#40 0x0000000000547427 in Ffuncall (nargs=3, args=<value optimized out>)
    at eval.c:3082
#41 0x0000000000571897 in Fbyte_code (bytestr=<value optimized out>,
    vector=33713217, maxdepth=<value optimized out>) at bytecode.c:679
#42 0x0000000000547078 in funcall_lambda (fun=33715236, nargs=3,
    arg_vector=0x7fff71b3b578) at eval.c:3212
#43 0x0000000000547427 in Ffuncall (nargs=4, args=<value optimized out>)
    at eval.c:3082
#44 0x0000000000571897 in Fbyte_code (bytestr=<value optimized out>,
    vector=34388996, maxdepth=<value optimized out>) at bytecode.c:679
#45 0x0000000000547078 in funcall_lambda (fun=34389444, nargs=1,
    arg_vector=0x7fff71b3b778) at eval.c:3212
#46 0x0000000000547427 in Ffuncall (nargs=2, args=<value optimized out>)
    at eval.c:3082
#47 0x00000000005445a3 in Fcall_interactively (function=33337041,
    record_flag=10713489, keys=140735100991320) at callint.c:842
#48 0x000000000054761e in Ffuncall (nargs=4, args=<value optimized out>)
    at eval.c:3031
#49 0x00000000005477d4 in call3 (fn=<value optimized out>,
    arg1=<value optimized out>, arg2=0, arg3=37738854) at eval.c:2851
#50 0x00000000004e9aa9 in command_loop_1 () at keyboard.c:1910
#51 0x0000000000545e8f in internal_condition_case (
    bfun=0x4e9710 <command_loop_1>, handlers=10800753,
    hfun=0x4e3eb0 <cmd_error>) at eval.c:1494
#52 0x00000000004e327a in command_loop_2 () at keyboard.c:1370
#53 0x0000000000545fa7 in internal_catch (tag=<value optimized out>,
    func=0x4e3260 <command_loop_2>, arg=10713489) at eval.c:1230
#54 0x00000000004e3cf3 in command_loop () at keyboard.c:1349
#55 0x00000000004e408a in recursive_edit_1 () at keyboard.c:958
#56 0x00000000004e41df in Frecursive_edit () at keyboard.c:1020
#57 0x00000000004d7a92 in main (argc=3, argv=0x7fff71b3c218) at emacs.c:1794

Lisp Backtrace:
0x2403014 PVEC_COMPILED
"gnus-summary-highlight-line" (0x71b3a288)
"run-hooks" (0x71b3a430)
"apply" (0x71b3a428)
"gnus-run-hooks" (0x71b3a5d8)
"gnus-summary-update-line" (0x71b3a798)
"gnus-summary-update-mark" (0x71b3a958)
"gnus-summary-mark-article" (0x71b3ab28)
"gnus-summary-mark-read-and-unread-as-read" (0x71b3ad18)
"run-hooks" (0x71b3aec0)
"apply" (0x71b3aeb8)
"gnus-run-hooks" (0x71b3b068)
"gnus-article-prepare" (0x71b3b218)
"gnus-summary-display-article" (0x71b3b3c8)
"gnus-summary-select-article" (0x71b3b578)
"gnus-summary-scroll-up" (0x71b3b778)
"call-interactively" (0x71b3b9b8)

-- 
Juri Linkov
http://www.jurta.org/emacs/





* Re: Crashes with non-default language environments
  2008-02-07 19:54 Crashes with non-default language environments Juri Linkov
@ 2008-02-09 22:17 ` Juri Linkov
  2008-02-10  2:09   ` Stefan Monnier
  0 siblings, 1 reply; 16+ messages in thread
From: Juri Linkov @ 2008-02-09 22:17 UTC (permalink / raw)
  To: emacs-pretest-bug

> In GNU Emacs 23.0.60 (x86_64-unknown-linux-gnu, GTK+ Version 2.12.5)
> trying to open an article in Gnus causes an Emacs crash.
>
> This crash is reproducible with a small .emacs file that contains
> only the necessary settings to start Gnus, and a line to set
> the language environment `(set-language-environment 'cyrillic-koi8)'.

This crash is caused by the corrupt byte-code produced by
`byte-compile-lapcode'.  `string-make-unibyte' at the end of this
function produces different bytecode strings in different
language environments.  This problem can be narrowed down to:

(setq bytes '(135 213 135 212 0 192 131 87 13 11 135 211 0 184 131 86 12
11 135 210 0 176 131 61 25 14 8 135 209 0 167 131 61 25 14 8 0 167 131
87 13 11 135 208 0 152 131 61 25 14 8 0 152 131 86 12 11 135 207 0 137
131 61 24 14 8 135 206 0 128 131 61 24 14 8 0 128 131 87 13 11 135 205
0 113 131 61 24 14 8 0 113 131 86 12 11 135 204 0 98 131 61 23 14 8 0 96
132 61 22 14 8 135 203 0 82 131 61 23 14 8 0 80 132 61 22 14 8 0 82 131
87 13 11 135 202 0 60 131 61 23 14 8 0 58 132 61 22 14 8 0 60 131 86 12
11 135 201 0 38 131 10 135 200 0 32 131 87 13 11 0 32 131 10 135 199
0 20 131 86 12 11 0 20 131 10 135 198 0 8 131 61 9 8))
(setq bytestr (string-make-unibyte (concat (nreverse bytes))))
(aref bytestr 172) => 176

that produces correct bytecode in the default language environment, but
after evaluating (set-language-environment 'cyrillic-koi8), the bytecode
string differs by one byte:

(aref bytestr 172) => 156
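
To locate such a mismatch in general, one can compare the converted
string against the original byte list position by position.  A minimal
sketch follows; `my-byte-mismatches' is a hypothetical helper name, not
part of Emacs, and the three-byte inputs are made up for illustration
(the altered second byte mirrors the 176 -> 156 change reported above).
It assumes an Emacs that provides `unibyte-string'.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-byte-mismatches (bytes str)
  "Return a list of (INDEX EXPECTED ACTUAL) where STR differs from BYTES.
BYTES is a list of integers; STR must be at least as long as BYTES."
  (let ((i 0)
        (result nil))
    (dolist (expected bytes)
      (let ((actual (aref str i)))
        (unless (= expected actual)
          (push (list i expected actual) result)))
      (setq i (1+ i)))
    (nreverse result)))

;; A faithful conversion reports no mismatches.
(my-byte-mismatches '(0 176 255) (unibyte-string 0 176 255))
;; => nil

;; A string with one altered byte reports its position and both values.
(my-byte-mismatches '(0 176 255) (unibyte-string 0 156 255))
;; => ((1 176 156))
```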

-- 
Juri Linkov
http://www.jurta.org/emacs/





* Re: Crashes with non-default language environments
  2008-02-09 22:17 ` Juri Linkov
@ 2008-02-10  2:09   ` Stefan Monnier
  2008-02-10 22:48     ` Juri Linkov
  0 siblings, 1 reply; 16+ messages in thread
From: Stefan Monnier @ 2008-02-10  2:09 UTC (permalink / raw)
  To: Juri Linkov; +Cc: emacs-pretest-bug

> This crash is caused by the corrupt byte-code produced by
> `byte-compile-lapcode'.  `string-make-unibyte' at the end of this
> function produces different bytecode strings in different
> language environments.  This problem can be narrowed down to:

Shouldn't it be string-to-unibyte instead?


        Stefan





* Re: Crashes with non-default language environments
  2008-02-10  2:09   ` Stefan Monnier
@ 2008-02-10 22:48     ` Juri Linkov
  2008-02-11  1:39       ` Stefan Monnier
  2008-02-12 11:23       ` Kenichi Handa
  0 siblings, 2 replies; 16+ messages in thread
From: Juri Linkov @ 2008-02-10 22:48 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-pretest-bug

>> This crash is caused by the corrupt byte-code produced by
>> `byte-compile-lapcode'.  `string-make-unibyte' at the end of this
>> function produces different bytecode strings in different
>> language environments.  This problem can be narrowed down to:
>
> Shouldn't it be string-to-unibyte instead?

I've just checked that `string-as-unibyte' produces even worse results
than `string-make-unibyte'.  It replaces every byte in the original
string with 2-byte sequences.

The change to use `string-make-unibyte' came from the Unicode branch:

2008-02-02  Kenichi Handa  <handa@m17n.org>

	* emacs-lisp/bytecomp.el (byte-compile-lapcode): Be sure to
	return a unibyte string.

Maybe, this change is correct, but the bug is in the definition of the
language environment, I can't say for sure.  Comparing results of calling
`string-make-unibyte' on 256 bytes in different language environments
gives only 6 differences:

\240 -> \232
\251 -> \277
\260 -> \234
\262 -> \235
\267 -> \236
\367 -> \237

-- 
Juri Linkov
http://www.jurta.org/emacs/


* Re: Crashes with non-default language environments
  2008-02-10 22:48     ` Juri Linkov
@ 2008-02-11  1:39       ` Stefan Monnier
  2008-02-11  1:56         ` Miles Bader
  2008-02-12 11:23       ` Kenichi Handa
  1 sibling, 1 reply; 16+ messages in thread
From: Stefan Monnier @ 2008-02-11  1:39 UTC (permalink / raw)
  To: Juri Linkov; +Cc: emacs-pretest-bug

>>> This crash is caused by the corrupt byte-code produced by
>>> `byte-compile-lapcode'.  `string-make-unibyte' at the end of this
>>> function produces different bytecode strings in different
>>> language environments.  This problem can be narrowed down to:
>> 
>> Shouldn't it be string-to-unibyte instead?

> I've just checked that `string-as-unibyte' produces even worse results
> than `string-make-unibyte'.  It replaces every byte in the original
> string with 2-byte sequences.

Of course, string-AS-unibyte is the worst of all three.  But nobody
suggested to use that one.  I just suggested to replace
string-MAKE-unibyte by string-TO-unibyte.

string-TO-unibyte ~= (encode-coding-string STR 'binary)
string-AS-unibyte ~= (encode-coding-string STR 'emacs-internal)
                     (unicode or emacs-mule, depending on the Emacs version)
string-MAKE-unibyte ~= (encode-coding-string STR locale-coding-system)
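
The distinction can be sketched as follows.  This is an illustration
only: it assumes a recent Emacs in which `string-to-unibyte' is
available, `my-unibyte-demo' is a hypothetical name, and the explicit
coding systems `utf-8' and `koi8-r' merely stand in for "internal" and
"locale" encodings.

```elisp
;; Illustration only; assumes a recent Emacs with `string-to-unibyte'.
(defun my-unibyte-demo ()
  "Return a list showing how different conversions treat their input."
  (let ((raw (string-to-multibyte "\260")) ; one eight-bit char, byte 176
        (zhe "\u0436"))                    ; CYRILLIC SMALL LETTER ZHE
    (list
     ;; Raw bytes come back unchanged; no coding system is consulted.
     (aref (string-to-unibyte raw) 0)
     ;; A real character is not binary data, so an error is signaled.
     (condition-case nil
         (string-to-unibyte zhe)
       (error 'not-binary))
     ;; With an explicit coding system the result depends on that system.
     (length (encode-coding-string zhe 'utf-8))
     (length (encode-coding-string zhe 'koi8-r)))))

(my-unibyte-demo)
;; => (176 not-binary 2 1)
```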


        Stefan





* Re: Crashes with non-default language environments
  2008-02-11  1:39       ` Stefan Monnier
@ 2008-02-11  1:56         ` Miles Bader
  2008-02-11  3:02           ` Stefan Monnier
  0 siblings, 1 reply; 16+ messages in thread
From: Miles Bader @ 2008-02-11  1:56 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Juri Linkov, emacs-pretest-bug

Stefan Monnier <monnier@iro.umontreal.ca> writes:
> Of course, string-AS-unibyte is the worst of all three.  But nobody
> suggested to use that one.  I just suggested to replace
> string-MAKE-unibyte by string-TO-unibyte.

Where's this string-to-unibyte function?  My emacs doesn't have it...

-Miles

-- 
`Suppose Korea goes to the World Cup final against Japan and wins,' Moon said.
`All the past could be forgiven.'   [NYT]





* Re: Crashes with non-default language environments
  2008-02-11  1:56         ` Miles Bader
@ 2008-02-11  3:02           ` Stefan Monnier
  2008-02-11  4:11             ` Miles Bader
  2008-02-12 11:41             ` Kenichi Handa
  0 siblings, 2 replies; 16+ messages in thread
From: Stefan Monnier @ 2008-02-11  3:02 UTC (permalink / raw)
  To: Miles Bader; +Cc: Juri Linkov, emacs-pretest-bug

>> Of course, string-AS-unibyte is the worst of all three.  But nobody
>> suggested to use that one.  I just suggested to replace
>> string-MAKE-unibyte by string-TO-unibyte.

> Where's this string-to-unibyte function?  My emacs doesn't have it...

Oh, that's right, we still don't have it.  We only have the 3 variants
on the uni->multi, but not on the multi->uni.
I guess now is a good time to introduce it.


        Stefan





* Re: Crashes with non-default language environments
  2008-02-11  3:02           ` Stefan Monnier
@ 2008-02-11  4:11             ` Miles Bader
  2008-02-11 14:06               ` Stefan Monnier
  2008-02-12 11:41             ` Kenichi Handa
  1 sibling, 1 reply; 16+ messages in thread
From: Miles Bader @ 2008-02-11  4:11 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Juri Linkov, emacs-pretest-bug

Stefan Monnier <monnier@iro.umontreal.ca> writes:
>>> Of course, string-AS-unibyte is the worst of all three.  But nobody
>>> suggested to use that one.  I just suggested to replace
>>> string-MAKE-unibyte by string-TO-unibyte.
>
>> Where's this string-to-unibyte function?  My emacs doesn't have it...
>
> Oh, that's right, we still don't have it.  We only have the 3 variants
> on the uni->multi, but not on the multi->uni.
> I guess now is a good time to introduce it.

I find the names of all these functions incredibly confusing though...

What's really wanted here, is something like
vector-to-raw-string-dont-you-dare-do-any-encoding, right?
[As the bytecode engine wants raw bytes with the same numbers, which
just happened to be inside a string]

-Miles

-- 
Politeness, n. The most acceptable hypocrisy.





* Re: Crashes with non-default language environments
  2008-02-11  4:11             ` Miles Bader
@ 2008-02-11 14:06               ` Stefan Monnier
  2008-02-11 15:16                 ` Miles Bader
  2008-02-11 21:27                 ` Juri Linkov
  0 siblings, 2 replies; 16+ messages in thread
From: Stefan Monnier @ 2008-02-11 14:06 UTC (permalink / raw)
  To: Miles Bader; +Cc: Juri Linkov, emacs-pretest-bug

>>>> Of course, string-AS-unibyte is the worst of all three.  But nobody
>>>> suggested to use that one.  I just suggested to replace
>>>> string-MAKE-unibyte by string-TO-unibyte.
>> 
>>> Where's this string-to-unibyte function?  My emacs doesn't have it...
>> 
>> Oh, that's right, we still don't have it.  We only have the 3 variants
>> on the uni->multi, but not on the multi->uni.
>> I guess now is a good time to introduce it.

> I find the names of all these functions incredibly confusing though...

Agreed.

> What's really wanted here, is something like
> vector-to-raw-string-dont-you-dare-do-any-encoding, right?
> [As the bytecode engine wants raw bytes with the same numbers, which
> just happened to be inside a string]

The problem is that "no encoding" means different things to
different people.  At some point I proposed to just throw out all those
functions, and force people to use encode/decode-coding-string instead,
which forces them to think a bit about what they're doing.

Oh, and throw away the `no-conversion' coding-system, of course, since
it has the same problem.


        Stefan





* Re: Crashes with non-default language environments
  2008-02-11 14:06               ` Stefan Monnier
@ 2008-02-11 15:16                 ` Miles Bader
  2008-02-11 16:51                   ` Stefan Monnier
  2008-02-11 21:27                 ` Juri Linkov
  1 sibling, 1 reply; 16+ messages in thread
From: Miles Bader @ 2008-02-11 15:16 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Juri Linkov, emacs-pretest-bug

Stefan Monnier <monnier@iro.umontreal.ca> writes:
>> What's really wanted here, is something like
>> vector-to-raw-string-dont-you-dare-do-any-encoding, right?
>> [As the bytecode engine wants raw bytes with the same numbers, which
>> just happened to be inside a string]
>
> The problem is that "no encoding" means different things to
> different people.

Ok, perhaps
`vector-to-raw-string-whose-bytes-in-memory-should-have-exactly-the-same-values-as-the-elements-of-this-vector'.

-Miles

-- 
Logic, n. The art of thinking and reasoning in strict accordance with the
limitations and incapacities of the human misunderstanding.





* Re: Crashes with non-default language environments
  2008-02-11 15:16                 ` Miles Bader
@ 2008-02-11 16:51                   ` Stefan Monnier
  0 siblings, 0 replies; 16+ messages in thread
From: Stefan Monnier @ 2008-02-11 16:51 UTC (permalink / raw)
  To: Miles Bader; +Cc: Juri Linkov, emacs-pretest-bug

>>> What's really wanted here, is something like
>>> vector-to-raw-string-dont-you-dare-do-any-encoding, right?
>>> [As the bytecode engine wants raw bytes with the same numbers, which
>>> just happened to be inside a string]
>> 
>> The problem is that "no encoding" means different things to
>> different people.

> Ok, perhaps

> `vector-to-raw-string-whose-bytes-in-memory-should-have-exactly-the-same-values-as-the-elements-of-this-vector'.

Sounds good.  The function name may want to be a bit more specific about
what happens w.r.t. eight-bit-control and eight-bit-graphic chars as
compared to latin-1 chars (in Emacs-22, the former had values between
128 and 255 and the latter had much larger values whereas with the
unicode switch the reverse is true IIUC).


        Stefan





* Re: Crashes with non-default language environments
  2008-02-11 14:06               ` Stefan Monnier
  2008-02-11 15:16                 ` Miles Bader
@ 2008-02-11 21:27                 ` Juri Linkov
  1 sibling, 0 replies; 16+ messages in thread
From: Juri Linkov @ 2008-02-11 21:27 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-pretest-bug, Miles Bader

>>>>> Of course, string-AS-unibyte is the worst of all three.  But nobody
>>>>> suggested to use that one.  I just suggested to replace
>>>>> string-MAKE-unibyte by string-TO-unibyte.
>>>
>>>> Where's this string-to-unibyte function?  My emacs doesn't have it...
>>>
>>> Oh, that's right, we still don't have it.  We only have the 3 variants
>>> on the uni->multi, but not on the multi->uni.
>>> I guess now is a good time to introduce it.

Ah, sorry, since I've found no such function I assumed a typo.

I think adding a new function string-to-unibyte to complement
string-to-multibyte and other 2 multi->uni functions would be a good
thing for the short term even though all these names are confusing.

>> What's really wanted here, is something like
>> vector-to-raw-string-dont-you-dare-do-any-encoding, right?
>> [As the bytecode engine wants raw bytes with the same numbers, which
>> just happened to be inside a string]
>
> The problem is that "no encoding" means different things to
> different people.  At some point I proposed to just throw out all those
> functions, and force people to use encode/decode-coding-string instead,
> which forces them to think a bit about what they're doing.

Since all those uni->multi/multi->uni functions have non-descriptive
names, using encode/decode-coding-string with explicit coding will
help in writing cleaner and less error-prone code.  So I'd vote
for it.

-- 
Juri Linkov
http://www.jurta.org/emacs/





* Re: Crashes with non-default language environments
  2008-02-11  3:02           ` Stefan Monnier
  2008-02-11  4:11             ` Miles Bader
@ 2008-02-12 11:41             ` Kenichi Handa
  2008-02-12 16:29               ` Stefan Monnier
  1 sibling, 1 reply; 16+ messages in thread
From: Kenichi Handa @ 2008-02-12 11:41 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: juri, emacs-pretest-bug, miles

In article <jwvir0wci47.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>> Of course, string-AS-unibyte is the worst of all three.  But nobody
>>> suggested to use that one.  I just suggested to replace
>>> string-MAKE-unibyte by string-TO-unibyte.

> > Where's this string-to-unibyte function?  My emacs doesn't have it...

> Oh, that's right, we still don't have it.  We only have the 3 variants
> on the uni->multi, but not on the multi->uni.
> I guess now is a good time to introduce it.

But, even if we implement string-to-unibyte, it should be
used for a string containing only ascii and eight-bit chars.
And in that case, string-make-unibyte behaves exactly the
same as string-to-unibyte.

Miles Bader <miles@gnu.org> writes:
[...]
> Ok, perhaps
> `vector-to-raw-string-whose-bytes-in-memory-should-have-exactly-the-same-values-as-the-elements-of-this-vector'.

We now have
`list-to-raw-string-whose-bytes-in-memory-should-have-exactly-the-same-values-as-the-elements-of-this-list';
that is `unibyte-string'
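
A minimal sketch of its use, assuming an Emacs that provides
`unibyte-string'; the wrapper `my-bytes-to-string' is a hypothetical
name and the four-byte list is made up for illustration.

```elisp
;; Hypothetical wrapper, for illustration only.
(defun my-bytes-to-string (bytes)
  "Return a unibyte string whose Nth byte is the Nth element of BYTES.
BYTES is a list of integers in the range 0..255."
  (apply #'unibyte-string bytes))

(let ((s (my-bytes-to-string '(135 213 0 176))))
  ;; The result is unibyte, and each byte keeps its numeric value.
  (list (multibyte-string-p s) (aref s 3) (length s)))
;; => (nil 176 4)
```

No coding system is involved here, so the result does not depend on the
language environment.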

---
Kenichi Handa
handa@ni.aist.go.jp





* Re: Crashes with non-default language environments
  2008-02-12 11:41             ` Kenichi Handa
@ 2008-02-12 16:29               ` Stefan Monnier
  0 siblings, 0 replies; 16+ messages in thread
From: Stefan Monnier @ 2008-02-12 16:29 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: juri, emacs-pretest-bug, miles

>> Oh, that's right, we still don't have it.  We only have the 3 variants
>> on the uni->multi, but not on the multi->uni.
>> I guess now is a good time to introduce it.

> But, even if we implement string-to-unibyte, it should be
> used for a string containing only ascii and eight-bit chars.
> And in that case, string-make-unibyte behaves exactly the
> same as string-to-unibyte.

No it would be different: it would also signal an error if some
non-binary char is found.  (I might potentially be convinced that it's
OK to additionally accept the 128-255 latin1 chars as alternatives to
eight-bit chars, since they now get character codes 128-255).

> We now have
> `list-to-raw-string-whose-bytes-in-memory-should-have-exactly-the-same-values-as-the-elements-of-this-list';
> that is `unibyte-string'

Yes, thanks.  I didn't know about it.  It's a great addition (tho the
name might be a bit short for Miles's tastes, we may want to add
`list-to-raw-string-whose-bytes-in-memory-should-have-exactly-the-same-values-as-the-elements-of-this-list'
as an alias for it ;-).

        Stefan


* Re: Crashes with non-default language environments
  2008-02-10 22:48     ` Juri Linkov
  2008-02-11  1:39       ` Stefan Monnier
@ 2008-02-12 11:23       ` Kenichi Handa
  2008-02-12 19:29         ` Juri Linkov
  1 sibling, 1 reply; 16+ messages in thread
From: Kenichi Handa @ 2008-02-12 11:23 UTC (permalink / raw)
  To: Juri Linkov; +Cc: emacs-pretest-bug, monnier

In article <87prv4l96f.fsf@jurta.org>, Juri Linkov <juri@jurta.org> writes:

>>> This crash is caused by the corrupt byte-code produced by
>>> `byte-compile-lapcode'.  `string-make-unibyte' at the end of this
>>> function produces different bytecode strings in different
>>> language environments.  This problem can be narrowed down to:
> >
> > Shouldn't it be string-to-unibyte instead?

> I've just checked that `string-as-unibyte' produces even worse results
> than `string-make-unibyte'.  It replaces every byte in the original
> string with 2-byte sequences.

> The change to use `string-make-unibyte' came from the Unicode branch:

> 2008-02-02  Kenichi Handa  <handa@m17n.org>

> 	* emacs-lisp/bytecomp.el (byte-compile-lapcode): Be sure to
> 	return a unibyte string.

> Maybe, this change is correct, but the bug is in the definition of the
> language environment, I can't say for sure.  Comparing results of calling
> `string-make-unibyte' on 256 bytes in different language environments
> gives only 6 differences:

It was my fault.  I've just installed this change.  Could
you please try with the latest code?

Index: bytecomp.el
===================================================================
RCS file: /cvsroot/emacs/emacs/lisp/emacs-lisp/bytecomp.el,v
retrieving revision 2.229
retrieving revision 2.230
diff -u -r2.229 -r2.230
--- bytecomp.el	1 Feb 2008 16:01:26 -0000	2.229
+++ bytecomp.el	12 Feb 2008 11:21:31 -0000	2.230
@@ -864,7 +864,7 @@
 	       (setcar (cdr bytes) (logand pc 255))
 	       (setcar bytes (lsh pc -8))))
 	(setq patchlist (cdr patchlist))))
-    (string-make-unibyte (concat (nreverse bytes)))))
+    (apply 'unibyte-string (nreverse bytes))))
 
 \f
 ;;; compile-time evaluation


---
Kenichi Handa
handa@ni.aist.go.jp





* Re: Crashes with non-default language environments
  2008-02-12 11:23       ` Kenichi Handa
@ 2008-02-12 19:29         ` Juri Linkov
  0 siblings, 0 replies; 16+ messages in thread
From: Juri Linkov @ 2008-02-12 19:29 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: emacs-pretest-bug, monnier

>> 	* emacs-lisp/bytecomp.el (byte-compile-lapcode): Be sure to
>> 	return a unibyte string.
>
>> Maybe, this change is correct, but the bug is in the definition of the
>> language environment, I can't say for sure.  Comparing results of calling
>> `string-make-unibyte' on 256 bytes in different language environments
>> gives only 6 differences:
>
> It was my fault.  I've just installed this change.  Could
> you please try with the latest code?

Thank you!  The crashes are now gone.

-- 
Juri Linkov
http://www.jurta.org/emacs/




