unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Re: 23.0.60; Assuming errstring unibyte is incorrect and leads to Emacs crashes
@ 2008-08-26 17:28 Chong Yidong
  2008-08-27  1:13 ` Kenichi Handa
  0 siblings, 1 reply; 16+ messages in thread
From: Chong Yidong @ 2008-08-26 17:28 UTC (permalink / raw)
  To: emacs-devel

Could someone review the patch submitted with bug#778 for correctness?
I've excerpted part of the bug report below; the full report can be
viewed at

 http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=778


Dmitry Dzhus <dima@sphinx.net.ru> wrote:

> Looks like the code of `report_file_error' function at fileio.c assumes
> errstring is unibyte after `code_convert_string_norecord' call.
>
>     str = strerror (errorno);
>     errstring = code_convert_string_norecord (make_unibyte_string (str,
> 								 strlen (str)),
> 					    Vlocale_coding_system, 0);
>...
>     /* System error messages are capitalized.  Downcase the initial
> 	   unless it is followed by a slash.  */
> 	if (SREF (errstring, 1) != '/')
>           SSET (errstring, 0, DOWNCASE (SREF (errstring, 0)));
>
> That's not always true. 
>
> The result of `code_convert_string_norecord' results in a multibyte
> string object when `CODING_FOR_UNIBYTE (Vlocale_coding_system)' is not
> true, that is indicated by code in `decode_coding_object' function from
> coding.c:
>
>     if (EQ (dst_object, Qt)
>       || (! NILP (CODING_ATTR_POST_READ (attrs))
> 	  && NILP (dst_object)))
>     {
>       coding->dst_multibyte = !CODING_FOR_UNIBYTE (coding);
>....
>
> Looks like the issue may be resolved replacing a SSET&SREF call at
> `report_file_error' with something multibyte-aware. I've attached a
> one-line patch which does the job. It makes an assertion that a
> downcase and uppercase character both occupy the same amount of bytes.
>
> Using `Fdowncase' function instead also fixes the problem, but at a
> cost of slight overhead (the whole string gets processed, not just the
> first character (this cannot impact performance seriously though, as I
> believe those `errstring' variables are not processed often)) my patch
> does not introduce.
>
> I've tested the patch with several locales (both problematic and
> traditional ones) and couldn't spot any regression introduced with
> this patch.

*** fileio.c.~1.629.~	2008-08-06 01:41:14.000000000 +0400
--- fileio.c	2008-08-26 02:09:11.000000000 +0400
***************
*** 262,268 ****
  	/* System error messages are capitalized.  Downcase the initial
  	   unless it is followed by a slash.  */
  	if (SREF (errstring, 1) != '/')
! 	  SSET (errstring, 0, DOWNCASE (SREF (errstring, 0)));
  
  	xsignal (Qfile_error,
  		 Fcons (build_string (string), Fcons (errstring, data)));
--- 262,268 ----
  	/* System error messages are capitalized.  Downcase the initial
  	   unless it is followed by a slash.  */
  	if (SREF (errstring, 1) != '/')
! 	  CHAR_STRING (DOWNCASE (STRING_CHAR (SDATA(errstring), 0)), SDATA(errstring));
  
  	xsignal (Qfile_error,
  		 Fcons (build_string (string), Fcons (errstring, data)));




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 23.0.60; Assuming errstring unibyte is incorrect and leads to Emacs crashes
  2008-08-26 17:28 23.0.60; Assuming errstring unibyte is incorrect and leads to Emacs crashes Chong Yidong
@ 2008-08-27  1:13 ` Kenichi Handa
  2008-08-27  3:20   ` Eli Zaretskii
  0 siblings, 1 reply; 16+ messages in thread
From: Kenichi Handa @ 2008-08-27  1:13 UTC (permalink / raw)
  To: Chong Yidong; +Cc: dima, emacs-devel

In article <87y72ju1mg.fsf@cyd.mit.edu>, Chong Yidong <cyd@stupidchicken.com> writes:

> Could someone review the patch submitted with bug#778 for correctness?
> I've excerpted part of the bug report below; the full report can be
> viewed at

>  http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=778

> Dmitry Dzhus <dima@sphinx.net.ru> wrote:
[...]
> > Looks like the issue may be resolved replacing a SSET&SREF call at
> > `report_file_error' with something multibyte-aware. I've attached a
> > one-line patch which does the job. It makes an assertion that a
> > downcase and uppercase character both occupy the same amount of bytes.
> >
> > Using `Fdowncase' function instead also fixes the problem, but at a
> > cost of slight overhead (the whole string gets processed, not just the
> > first character (this cannot impact performance seriously though, as I
> > believe those `errstring' variables are not processed often)) my patch
> > does not introduce.
> >
> > I've tested the patch with several locales (both problematic and
> > traditional ones) and couldn't spot any regression introduced with
> > this patch.

Unfortunately, downcasing may change character bytes in some
locale (e.g. turkish), and if the first character is
multibyte, "SREF (errstring, 1)" doesn't give the second
character.

So, I just installed the following change.

2008-08-27  Kenichi Handa  <handa@m17n.org>

	* fileio.c (report_file_error): Fix handling of multibyte error
	string.

Index: fileio.c
===================================================================
RCS file: /cvsroot/emacs/emacs/src/fileio.c,v
retrieving revision 1.629
retrieving revision 1.630
diff -u -r1.629 -r1.630
--- fileio.c	5 Aug 2008 21:41:14 -0000	1.629
+++ fileio.c	27 Aug 2008 01:11:26 -0000	1.630
@@ -261,8 +261,14 @@
       default:
 	/* System error messages are capitalized.  Downcase the initial
 	   unless it is followed by a slash.  */
-	if (SREF (errstring, 1) != '/')
-	  SSET (errstring, 0, DOWNCASE (SREF (errstring, 0)));
+	if (! EQ (Faref (errstring, make_number (1)), make_number ('/')))
+	  {
+	    int c;
+
+	    str = (char *) SDATA (errstring);
+	    c = STRING_CHAR (str, 0);
+	    Faset (errstring, 0, make_number (DOWNCASE (c)));
+	  }
 
 	xsignal (Qfile_error,
 		 Fcons (build_string (string), Fcons (errstring, data)));

---
Kenichi Handa
handa@ni.aist.go.jp




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 23.0.60; Assuming errstring unibyte is incorrect and leads to Emacs crashes
  2008-08-27  1:13 ` Kenichi Handa
@ 2008-08-27  3:20   ` Eli Zaretskii
  2008-08-28  0:40     ` Kenichi Handa
  2008-08-31 18:36     ` Stefan Monnier
  0 siblings, 2 replies; 16+ messages in thread
From: Eli Zaretskii @ 2008-08-27  3:20 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: cyd, emacs-devel, dima

> From: Kenichi Handa <handa@m17n.org>
> Date: Wed, 27 Aug 2008 10:13:06 +0900
> Cc: dima@sphinx.net.ru, emacs-devel@gnu.org
> 
> +	if (! EQ (Faref (errstring, make_number (1)), make_number ('/')))
> +	  {
> +	    int c;
> +
> +	    str = (char *) SDATA (errstring);
> +	    c = STRING_CHAR (str, 0);
> +	    Faset (errstring, 0, make_number (DOWNCASE (c)));
> +	  }

Can we explain in a comment what is special about a slash as the
second character of the error message?




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 23.0.60; Assuming errstring unibyte is incorrect and leads to Emacs crashes
  2008-08-27  3:20   ` Eli Zaretskii
@ 2008-08-28  0:40     ` Kenichi Handa
  2008-08-28  3:20       ` Eli Zaretskii
  2008-08-31 18:36     ` Stefan Monnier
  1 sibling, 1 reply; 16+ messages in thread
From: Kenichi Handa @ 2008-08-28  0:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: cyd, emacs-devel, dima

In article <uk5e316vn.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

> > From: Kenichi Handa <handa@m17n.org>
> > Date: Wed, 27 Aug 2008 10:13:06 +0900
> > Cc: dima@sphinx.net.ru, emacs-devel@gnu.org
> > 
> > +	if (! EQ (Faref (errstring, make_number (1)), make_number ('/')))
> > +	  {
> > +	    int c;
> > +
> > +	    str = (char *) SDATA (errstring);
> > +	    c = STRING_CHAR (str, 0);
> > +	    Faset (errstring, 0, make_number (DOWNCASE (c)));
> > +	  }

> Can we explain in a comment what is special about a slash as the
> second character of the error message?

We should, but I don't know the reason.  By grepping
ChangeLog files, I found this item:

1991-02-21  Richard Stallman  (rms@mole.ai.mit.edu)

	* fileio.c (report_file_error): Don't downcase "I/O".

Is this the reason?

---
Kenichi Handa
handa@ni.aist.go.jp




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 23.0.60; Assuming errstring unibyte is incorrect and leads to Emacs crashes
  2008-08-28  0:40     ` Kenichi Handa
@ 2008-08-28  3:20       ` Eli Zaretskii
  2008-08-28  5:53         ` Kenichi Handa
  2008-08-28 20:43         ` Stephen Berman
  0 siblings, 2 replies; 16+ messages in thread
From: Eli Zaretskii @ 2008-08-28  3:20 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: cyd, emacs-devel, dima

> From: Kenichi Handa <handa@m17n.org>
> CC: cyd@stupidchicken.com, dima@sphinx.net.ru, emacs-devel@gnu.org
> Date: Thu, 28 Aug 2008 09:40:01 +0900
> 
> > Can we explain in a comment what is special about a slash as the
> > second character of the error message?
> 
> We should, but I don't know the reason.  By grepping
> ChangeLog files, I found this item:
> 
> 1991-02-21  Richard Stallman  (rms@mole.ai.mit.edu)
> 
> 	* fileio.c (report_file_error): Don't downcase "I/O".
> 
> Is this the reason?

Yes, that's what I remembered as well.  But if the reason is "I/O",
why not test for that explicitly?  Do certain locales translate "I/O"
into a different string, which still has a slash as its second
character?

(I'm having a strange feeling that we already discussed this at some
point in the past.  Perhaps searching the archives will bring some
useful hits.)




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 23.0.60; Assuming errstring unibyte is incorrect and leads to Emacs crashes
  2008-08-28  3:20       ` Eli Zaretskii
@ 2008-08-28  5:53         ` Kenichi Handa
  2008-08-28  9:23           ` Jason Rumney
  2008-08-28 20:43         ` Stephen Berman
  1 sibling, 1 reply; 16+ messages in thread
From: Kenichi Handa @ 2008-08-28  5:53 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: cyd, emacs-devel, dima

In article <uej4925bu.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

> > We should, but I don't know the reason.  By grepping
> > ChangeLog files, I found this item:
> > 
> > 1991-02-21  Richard Stallman  (rms@mole.ai.mit.edu)
> > 
> > 	* fileio.c (report_file_error): Don't downcase "I/O".
> > 
> > Is this the reason?

> Yes, that's what I remembered as well.  But if the reason is "I/O",
> why not test for that explicitly?  Do certain locales translate "I/O"
> into a different string, which still has a slash as its second
> character?

I have no idea.

> (I'm having a strange feeling that we already discussed this at some
> point in the past.  Perhaps searching the archives will bring some
> useful hits.)

I don't remember anything about that, and the oldest
emacs-devel archive is 2000-09.

---
Kenichi Handa
handa@ni.aist.go.jp




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 23.0.60; Assuming errstring unibyte is incorrect and leads to Emacs crashes
  2008-08-28  5:53         ` Kenichi Handa
@ 2008-08-28  9:23           ` Jason Rumney
  0 siblings, 0 replies; 16+ messages in thread
From: Jason Rumney @ 2008-08-28  9:23 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: Eli Zaretskii, dima, cyd, emacs-devel

Kenichi Handa wrote:
> I don't remember anything about that, and the oldest
> emacs-devel archive is 2000-09.
>   
Is emacs-hackers still archived somewhere on fencepost.gnu.org?





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 23.0.60; Assuming errstring unibyte is incorrect and leads to Emacs crashes
  2008-08-28  3:20       ` Eli Zaretskii
  2008-08-28  5:53         ` Kenichi Handa
@ 2008-08-28 20:43         ` Stephen Berman
  1 sibling, 0 replies; 16+ messages in thread
From: Stephen Berman @ 2008-08-28 20:43 UTC (permalink / raw)
  To: emacs-devel

On Thu, 28 Aug 2008 06:20:37 +0300 Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Kenichi Handa <handa@m17n.org>
>> CC: cyd@stupidchicken.com, dima@sphinx.net.ru, emacs-devel@gnu.org
>> Date: Thu, 28 Aug 2008 09:40:01 +0900
>> 
>> > Can we explain in a comment what is special about a slash as the
>> > second character of the error message?
>> 
>> We should, but I don't know the reason.  By grepping
>> ChangeLog files, I found this item:
>> 
>> 1991-02-21  Richard Stallman  (rms@mole.ai.mit.edu)
>> 
>> 	* fileio.c (report_file_error): Don't downcase "I/O".
>> 
>> Is this the reason?
>
> Yes, that's what I remembered as well.  But if the reason is "I/O",
> why not test for that explicitly?  Do certain locales translate "I/O"
> into a different string, which still has a slash as its second
> character?

"E/A" (for "Ein-/Ausgabe") is commonly used in German.

Steve Berman





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 23.0.60; Assuming errstring unibyte is incorrect and leads to Emacs crashes
  2008-08-27  3:20   ` Eli Zaretskii
  2008-08-28  0:40     ` Kenichi Handa
@ 2008-08-31 18:36     ` Stefan Monnier
  2008-09-01  6:11       ` Richard M. Stallman
  1 sibling, 1 reply; 16+ messages in thread
From: Stefan Monnier @ 2008-08-31 18:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: cyd, emacs-devel, dima, Kenichi Handa

>> +	if (! EQ (Faref (errstring, make_number (1)), make_number ('/')))
>> +	  {
>> +	    int c;
>> +
>> +	    str = (char *) SDATA (errstring);
>> +	    c = STRING_CHAR (str, 0);
>> +	    Faset (errstring, 0, make_number (DOWNCASE (c)));
>> +	  }

> Can we explain in a comment what is special about a slash as the
> second character of the error message?

Actually, I think we need an explanation for why we bother to downcase
this thing at all.


        Stefan




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 23.0.60; Assuming errstring unibyte is incorrect and leads to Emacs crashes
  2008-08-31 18:36     ` Stefan Monnier
@ 2008-09-01  6:11       ` Richard M. Stallman
  2008-09-01 18:22         ` Stefan Monnier
  0 siblings, 1 reply; 16+ messages in thread
From: Richard M. Stallman @ 2008-09-01  6:11 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: eliz, dima, cyd, handa, emacs-devel

    Actually, I think we need an explanation for why we bother to downcase
    this thing at all.

Because it isn't the beginning of a sentence.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 23.0.60; Assuming errstring unibyte is incorrect and leads to Emacs crashes
  2008-09-01  6:11       ` Richard M. Stallman
@ 2008-09-01 18:22         ` Stefan Monnier
  2008-09-02  1:09           ` Richard M. Stallman
  0 siblings, 1 reply; 16+ messages in thread
From: Stefan Monnier @ 2008-09-01 18:22 UTC (permalink / raw)
  To: rms; +Cc: eliz, dima, cyd, handa, emacs-devel

>     Actually, I think we need an explanation for why we bother to downcase
>     this thing at all.

> Because it isn't the beginning of a sentence.

Are you saying that this code is meant to fix an error in the C-level
error messages?

Or that we use them in a different context (if so, what's different
about it)?


        Stefan




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 23.0.60; Assuming errstring unibyte is incorrect and leads to Emacs crashes
  2008-09-01 18:22         ` Stefan Monnier
@ 2008-09-02  1:09           ` Richard M. Stallman
  2008-09-02 19:56             ` Stefan Monnier
  0 siblings, 1 reply; 16+ messages in thread
From: Richard M. Stallman @ 2008-09-02  1:09 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: eliz, handa, emacs-devel, cyd, dima

    > Because it isn't the beginning of a sentence.

    Are you saying that this code is meant to fix an error in the C-level
    error messages?

    Or that we use them in a different context (if so, what's different
    about it)?

Emacs uses them in a different context -- following some other text,
and a colon.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 23.0.60; Assuming errstring unibyte is incorrect and leads to Emacs crashes
  2008-09-02  1:09           ` Richard M. Stallman
@ 2008-09-02 19:56             ` Stefan Monnier
  2008-09-03  2:41               ` Richard M. Stallman
  0 siblings, 1 reply; 16+ messages in thread
From: Stefan Monnier @ 2008-09-02 19:56 UTC (permalink / raw)
  To: rms; +Cc: eliz, handa, emacs-devel, cyd, dima

>> Because it isn't the beginning of a sentence.
>     Are you saying that this code is meant to fix an error in the C-level
>     error messages?

>     Or that we use them in a different context (if so, what's different
>     about it)?

> Emacs uses them in a different context -- following some other text,
> and a colon.

I still don't understand why that's different: most C applications also
output those error message prepended with some other text and a colon.


        Stefan




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 23.0.60; Assuming errstring unibyte is incorrect and leads to Emacs crashes
  2008-09-02 19:56             ` Stefan Monnier
@ 2008-09-03  2:41               ` Richard M. Stallman
  2008-09-03 16:09                 ` Stefan Monnier
  0 siblings, 1 reply; 16+ messages in thread
From: Richard M. Stallman @ 2008-09-03  2:41 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: dima, eliz, emacs-devel, cyd, handa

    I still don't understand why that's different: most C applications also
    output those error message prepended with some other text and a colon.

Then they too should downcase the messages, to be othographically
incorrect.  The capital letters are correct if the message appears
as the start of a sentence, but not after other text and a colon.

I made Emacs do it right.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 23.0.60; Assuming errstring unibyte is incorrect and leads to Emacs crashes
  2008-09-03  2:41               ` Richard M. Stallman
@ 2008-09-03 16:09                 ` Stefan Monnier
  2008-09-04  0:10                   ` Richard M. Stallman
  0 siblings, 1 reply; 16+ messages in thread
From: Stefan Monnier @ 2008-09-03 16:09 UTC (permalink / raw)
  To: rms; +Cc: dima, eliz, emacs-devel, cyd, handa

>     I still don't understand why that's different: most C applications also
>     output those error message prepended with some other text and a colon.

> Then they too should downcase the messages, to be othographically
> incorrect.  The capital letters are correct if the message appears
> as the start of a sentence, but not after other text and a colon.

> I made Emacs do it right.

So it's fixing a bug in the C library (the library should return
non-capitalized messages, since you can reliably capitalize a message
but not the other way around).


        Stefan





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 23.0.60; Assuming errstring unibyte is incorrect and leads to Emacs crashes
  2008-09-03 16:09                 ` Stefan Monnier
@ 2008-09-04  0:10                   ` Richard M. Stallman
  0 siblings, 0 replies; 16+ messages in thread
From: Richard M. Stallman @ 2008-09-04  0:10 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: eliz, emacs-devel, cyd, handa, dima

    So it's fixing a bug in the C library (the library should return
    non-capitalized messages, since you can reliably capitalize a message
    but not the other way around).

That might have been a good idea when there was just one C library
implementation, but we cannot change them all incompatibly now.




^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2008-09-04  0:10 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-08-26 17:28 23.0.60; Assuming errstring unibyte is incorrect and leads to Emacs crashes Chong Yidong
2008-08-27  1:13 ` Kenichi Handa
2008-08-27  3:20   ` Eli Zaretskii
2008-08-28  0:40     ` Kenichi Handa
2008-08-28  3:20       ` Eli Zaretskii
2008-08-28  5:53         ` Kenichi Handa
2008-08-28  9:23           ` Jason Rumney
2008-08-28 20:43         ` Stephen Berman
2008-08-31 18:36     ` Stefan Monnier
2008-09-01  6:11       ` Richard M. Stallman
2008-09-01 18:22         ` Stefan Monnier
2008-09-02  1:09           ` Richard M. Stallman
2008-09-02 19:56             ` Stefan Monnier
2008-09-03  2:41               ` Richard M. Stallman
2008-09-03 16:09                 ` Stefan Monnier
2008-09-04  0:10                   ` Richard M. Stallman

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).