unofficial mirror of bug-gnu-emacs@gnu.org 
* No coding system used for environment variables
@ 2008-02-21 21:40 Göran Uddeborg
  2008-03-05  0:40 ` Jason Rumney
  0 siblings, 1 reply; 10+ messages in thread
From: Göran Uddeborg @ 2008-02-21 21:40 UTC
  To: bug-gnu-emacs

It seems there is no coding system applied to values of environment
variables.

I'm running a system that uses UTF-8.  My locale is sv_SE.utf8, and
Emacs uses UTF-8 as the default most of the time, for example when I
open a new file.

I do have issues with strings coming from environment variables
though.  I first discovered this in the vm mail system, since it
misinterpreted the variable MAIL which has the value
/var/spool/mail/göran encoded in UTF-8.  (In case your mailer mangles
it, the last file name component is g ö r a n.)  But it also
causes problems in various places, for example with functions relating
to the home directory.  $HOME is /home/göran (same last component as
before).

As an example, I start emacs in my home directory, and do a few
experiments in the scratch buffer (which has a "u" for coding system
in the mode line):

    default-directory
    "/home/göran/"

Looks good.  I see my ö.

    (expand-file-name "")
    "/home/göran"

Ok too.

    (expand-file-name "~")
    "/home/g\303\266ran"

Here the octal codes for a UTF-8 encoded ö are shown instead of
the ö itself.  The source of ~ is the environment variable HOME.
But if I explicitly ask for that variable:

    (getenv "HOME")
    "/home/göran"

Here I see the ö.

Let's have a bit more fun.  Here I try to expand a FILE with my own
name:

    (expand-file-name "göran")
    "/home/göran/göran"

Looks the way I expected it.  Now the same thing, explicitly saying to
put it in the home directory:

    (expand-file-name "~/göran")
    "/home/g\xc3\xb6ran/göran"

The ö in the file name is ok.  The ö in the directory name
is strange again, only this time it is shown in hex rather than octal.
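
A note on the two displays: octal \303 \266 and hex \xc3 \xb6 are the
same two bytes, the UTF-8 encoding of ö (U+00F6), so both results show
the undecoded value of HOME.  The following is a minimal C sketch of
that equivalence, not Emacs code.  `escape_non_ascii` is a hypothetical
helper invented for this illustration; it prints every byte above 0x7F
as a three-digit octal escape, roughly the way the raw bytes appear in
the results above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of Emacs): copy STR into OUT, replacing
   every byte >= 0x80 with a backslash and three octal digits.
   OUT_SIZE must be at least 4 * strlen (STR) + 1.
   Returns the number of characters written, excluding the final NUL.  */
static size_t
escape_non_ascii (const char *str, char *out, size_t out_size)
{
  size_t n = 0;

  /* Worst case: every input byte expands to four output characters.  */
  assert (out_size >= 4 * strlen (str) + 1);

  for (const unsigned char *p = (const unsigned char *) str; *p; p++)
    {
      if (*p < 0x80)
        out[n++] = (char) *p;           /* plain ASCII, copied as is */
      else
        n += (size_t) sprintf (out + n, "\\%03o", (unsigned) *p);
    }
  out[n] = '\0';
  return n;
}
```

Called on the bytes of /home/göran in UTF-8 (0xC3 0xB6 for the ö), it
produces the text /home/g\303\266ran, which matches the first strange
result.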

I first asked about this on gnu.emacs.help
(http://groups.google.se/group/gnu.emacs.help/browse_thread/thread/80258d0a17e37138/75411fce63db9b2c#75411fce63db9b2c),
since I was unsure whether it was a bug or my own lack of understanding.
Two other posters suggested that I report it as a bug.



In GNU Emacs 22.1.1 (x86_64-redhat-linux-gnu, GTK+ Version 2.12.1)
 of 2007-11-06 on xenbuilder2.fedora.redhat.com
Windowing system distributor `The X.Org Foundation', version 11.0.70101000
configured using `configure  '--build=x86_64-redhat-linux-gnu' '--host=x86_64-redhat-linux-gnu' '--target=x86_64-redhat-linux-gnu' '--program-prefix=' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib64' '--libexecdir=/usr/libexec' '--localstatedir=/var' '--sharedstatedir=/usr/com' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--with-pop' '--with-sound' '--with-gtk' 'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 'target_alias=x86_64-redhat-linux-gnu' 'CFLAGS=-DMAIL_USE_LOCKF -DSYSTEM_PURESIZE_EXTRA=16777216 -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: sv_SE.utf8
  locale-coding-system: utf-8
  default-enable-multibyte-characters: t

Major mode: Fundamental

Minor modes in effect:
  which-function-mode: t
  tooltip-mode: t
  mouse-wheel-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  unify-8859-on-encoding-mode: t
  utf-translate-cjk-mode: t
  auto-compression-mode: t
  temp-buffer-resize-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
? <return> M-< C-n C-k C-k <switch-frame> <switch-frame> 
<switch-frame> C-y <switch-frame> <next> <down> <down> 
<down> <up> <up> <up> <up> <up> <up> <up> <up> p <switch-frame> 
<switch-frame> <switch-frame> <down-mouse-2> <mouse-2> 
<backspace> C-j C-x 4 C-f . e m <tab> <return> C-s 
v m - s p o o M-< C-x C-f . v m C-g C-x C-f ~ / . v 
m <return> C-s C-g C-_ C-s v m - s p o o l - f i l 
e s C-a ; C-x C-s <help-echo> <switch-frame> <switch-frame> 
q <switch-frame> C-n C-n C-n C-n C-n C-n C-n C-n C-n 
C-n C-n C-n C-n C-n C-c C-g <switch-frame> <help-echo> 
<help-echo> C-x M q q <help-echo> C-x M n n n n n n 
n M-< C-s S E C C-a SPC <switch-frame> <help-echo> 
C-x C-f ~ / N <tab> <return> C-x c C-a C-k r p m g 
r e p SPC l h a <return> ! <help-echo> <help-echo> 
<switch-frame> <switch-frame> <help-echo> <switch-frame> 
<switch-frame> <help-echo> <switch-frame> <switch-frame> 
<switch-frame> <switch-frame> <help-echo> <switch-frame> 
d <switch-frame> <help-echo> C-u C-u C-u <f6> C-x o 
C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p 
C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p M-> 
C-x o C-x C-f <M-backspace> <M-backspace> u p d <return> 
<help-echo> <switch-frame> M-> C-p C-p C-p C-p C-p 
C-p C-p C-p C-p C-p C-p SPC n n d d d e <next> <down> 
<down> <down> <down> <down> <down> <down> <down> <down> 
<down> <down> <down> <down> <down> <down> <down> <down> 
<down> <down> <down> <down> <down> <down> <down> <down> 
<down> <down> <down> <down> <down> <left> = C-c C-c 
SPC SPC <backspace> <backspace> <down-mouse-2> <mouse-2> 
<switch-frame> <switch-frame> <switch-frame> <switch-frame> 
n <down-mouse-2> <mouse-2> s I <tab> <return> q d SPC 
<switch-frame> M-x r e p o <tab> r <tab> <return>

Recent messages:
End of message 1059 from Göran Uddeborg
Loading vm-digest...done
Decoding MIME message... done
End of message 1 from Gunilla Christensson
1 message saved to buffer INBOX
Quitting...
Decoding MIME message... done
End of message 1060 from Göran Uddeborg
Making completion list...
Loading emacsbug...done





* Re: No coding system used for environment variables
  2008-02-21 21:40 No coding system used for environment variables Göran Uddeborg
@ 2008-03-05  0:40 ` Jason Rumney
  2008-03-05  2:22   ` YAMAMOTO Mitsuharu
  0 siblings, 1 reply; 10+ messages in thread
From: Jason Rumney @ 2008-03-05  0:40 UTC
  To: Göran Uddeborg; +Cc: bug-gnu-emacs, 38-done

Version: 22.1.92

Göran Uddeborg wrote:
> It seems there is no coding system applied to values of environment
> variables.
>   

Thank you for your report. This should now be fixed for 22.2.





* Re: No coding system used for environment variables
  2008-03-05  0:40 ` Jason Rumney
@ 2008-03-05  2:22   ` YAMAMOTO Mitsuharu
  2008-03-05  8:57     ` Jason Rumney
  2008-03-05 16:25     ` Stefan Monnier
  0 siblings, 2 replies; 10+ messages in thread
From: YAMAMOTO Mitsuharu @ 2008-03-05  2:22 UTC
  To: Jason Rumney; +Cc: bug-gnu-emacs, Göran Uddeborg, 38-done

>>>>> On Wed, 05 Mar 2008 00:40:29 +0000, Jason Rumney <jasonr@gnu.org> said:

> Version: 22.1.92
>
> Göran Uddeborg wrote:
>> It seems there is no coding system applied to values of environment
>> variables.
>> 

> Thank you for your report. This should now be fixed for 22.2.

I think you mean the latest changes for fileio.c below:

2008-03-05  Jason Rumney  <jasonr@gnu.org>

        * fileio.c (Fexpand_file_name): Decode home directory names.
        (Fsubstitute_in_file_name): Decode substituted variables.

But I'd strongly suggest reverting these changes at this stage of the
pretest for the upcoming Emacs 22.2.  First, some coding systems are not
ready until some .elc files get loaded (a chicken-and-egg problem).
Second, as DECODE_FILE causes GC and string compaction in general,
some variables such as `nm' in Fexpand_file_name may not point to
valid data after that.  You may also want to see a related patch in
http://lists.gnu.org/archive/html/emacs-pretest-bug/2007-05/msg00115.html

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp





* Re: No coding system used for environment variables
  2008-03-05  2:22   ` YAMAMOTO Mitsuharu
@ 2008-03-05  8:57     ` Jason Rumney
  2008-03-05  9:16       ` YAMAMOTO Mitsuharu
  2008-03-05 10:34       ` Andreas Schwab
  2008-03-05 16:25     ` Stefan Monnier
  1 sibling, 2 replies; 10+ messages in thread
From: Jason Rumney @ 2008-03-05  8:57 UTC
  To: YAMAMOTO Mitsuharu; +Cc: bug-gnu-emacs, 38, Göran Uddeborg

YAMAMOTO Mitsuharu wrote:
> 2008-03-05 Jason Rumney <jasonr@gnu.org>
>         * fileio.c (Fexpand_file_name): Decode home directory names.
>         (Fsubstitute_in_file_name): Decode substituted variables.
>
> But I'd strongly suggest reverting these changes at this stage of the
> pretest for the upcoming Emacs 22.2.

It fixes a serious bug. Users with non-ASCII characters in their user names
get strange behaviour from filename expansion.

>   First, some coding systems are not
> ready until some .elc files get loaded (a chicken-and-egg problem).
>   

It should not present a chicken and egg problem, as no files are loaded 
during bootstrap that require expansion of ~ or environment variables.

> Second, as DECODE_FILE causes GC and string compaction in general,
> some variables such as `nm' in Fexpand_file_name may not point to
> valid data after that.

This is a problem on some systems that still do not support stack 
marking for GC protection of such variables. But I think this bug is 
important enough to fix those problems rather than revert the patch.

>   You may also want to see a related patch in
> http://lists.gnu.org/archive/html/emacs-pretest-bug/2007-05/msg00115.html
>   

Was there a problem with that patch? Why was it not installed at the time?






* Re: No coding system used for environment variables
  2008-03-05  8:57     ` Jason Rumney
@ 2008-03-05  9:16       ` YAMAMOTO Mitsuharu
  2008-03-05 10:11         ` Jason Rumney
  2008-03-05 10:34       ` Andreas Schwab
  1 sibling, 1 reply; 10+ messages in thread
From: YAMAMOTO Mitsuharu @ 2008-03-05  9:16 UTC
  To: Jason Rumney; +Cc: bug-gnu-emacs, 38, Göran Uddeborg

>>>>> On Wed, 05 Mar 2008 08:57:31 +0000, Jason Rumney <jasonr@gnu.org> said:

> YAMAMOTO Mitsuharu wrote:
>> 2008-03-05 Jason Rumney <jasonr@gnu.org> * fileio.c
>> (Fexpand_file_name): Decode home directory names.
>> (Fsubstitute_in_file_name): Decode substituted variables.
>> 
>> But I'd strongly suggest reverting these changes at this stage of
>> the pretest for the upcoming Emacs 22.2.

> It fixes a serious bug. Users with non-ASCII characters in their user
> names get strange behaviour from filename expansion.

I know, but your patch has a serious problem and leads to a regression.

>> First, some coding systems are not ready until some .elc files get
>> loaded (a chicken-and-egg problem).
>> 

> It should not present a chicken and egg problem, as no files are
> loaded during bootstrap that require expansion of ~ or environment
> variables.

I meant the startup of the dumped executable.  Users may set
EMACS_LOAD_PATH and so on.

>> Second, as DECODE_FILE causes GC and string compaction in general,
>> some variables such as `nm' in Fexpand_file_name may not point to
>> valid data after that.

> This is a problem on some systems that still do not support stack
> marking for GC protection of such variables. But I think this bug is
> important enough to fix those problems rather than revert the patch.

Relocation of string data caused by GC has nothing to do with
(semi-obsolete) GCPROs.  Believe me, it causes a real problem.
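
To illustrate the kind of failure meant here: the sketch below is a toy
model in plain C, not Emacs's allocator or GC.  The names `toy_string`,
`toy_init`, `toy_compact` and `char_at_after_relocation` are all
hypothetical; only the variable name `nm` is borrowed from the
discussion.  It assumes a "compaction" step that moves string bytes to a
new block, and shows that a raw `char *` taken before the move must be
re-derived afterwards from a saved offset.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy relocatable string: `data` may be moved by toy_compact.  */
struct toy_string
{
  char *data;
  size_t len;
};

/* Allocate a toy string holding a copy of TEXT.  Returns 0 on success.  */
static int
toy_init (struct toy_string *s, const char *text)
{
  s->len = strlen (text);
  s->data = malloc (s->len + 1);
  if (!s->data)
    return -1;
  memcpy (s->data, text, s->len + 1);
  return 0;
}

/* Simulate compaction: move the bytes to a fresh block and release the
   old one.  Any raw pointer into the old block dangles afterwards.  */
static int
toy_compact (struct toy_string *s)
{
  char *fresh = malloc (s->len + 1);
  if (!fresh)
    return -1;
  memcpy (fresh, s->data, s->len + 1);
  free (s->data);
  s->data = fresh;
  return 0;
}

/* Read the byte at OFFSET in TEXT across a relocation.  The position is
   remembered as an offset, and the pointer is recomputed after the move.
   Returns the byte value, or -1 on allocation failure.  */
static int
char_at_after_relocation (const char *text, size_t offset)
{
  struct toy_string s;
  int result;

  if (toy_init (&s, text) != 0)
    return -1;
  assert (offset < s.len);

  char *nm = s.data + offset;           /* raw pointer into string data */
  size_t saved = (size_t) (nm - s.data);

  if (toy_compact (&s) != 0)
    {
      free (s.data);
      return -1;
    }

  /* `nm` is stale here; dereferencing it would be undefined behaviour.  */
  nm = s.data + saved;
  result = (unsigned char) *nm;

  free (s.data);
  return result;
}
```

In this model the holder of the string object is never lost, so
protecting the object does not help: the raw pointer is what goes stale.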

>> You may also want to see a related patch in
>> http://lists.gnu.org/archive/html/emacs-pretest-bug/2007-05/msg00115.html
>> 

> Was there a problem with that patch? Why was it not installed at the time?

Because no expert in this area responded to the patch.

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp





* Re: No coding system used for environment variables
  2008-03-05  9:16       ` YAMAMOTO Mitsuharu
@ 2008-03-05 10:11         ` Jason Rumney
  2008-03-05 11:00           ` YAMAMOTO Mitsuharu
  0 siblings, 1 reply; 10+ messages in thread
From: Jason Rumney @ 2008-03-05 10:11 UTC
  To: YAMAMOTO Mitsuharu; +Cc: bug-gnu-emacs, 38, Göran Uddeborg

YAMAMOTO Mitsuharu wrote:

>> Was there a problem with that patch? Why was it not installed at the time?
>>     
>
> Because no expert in this area responded to the patch.
>   

OK, I reverted my change. I'll let Chong and Stefan decide whether to 
fix the bug with your safer patch now (and a similar patch for 
Fsubstitute_in_file_name) or release 22.2 with this as a known bug.






* Re: No coding system used for environment variables
  2008-03-05  8:57     ` Jason Rumney
  2008-03-05  9:16       ` YAMAMOTO Mitsuharu
@ 2008-03-05 10:34       ` Andreas Schwab
  1 sibling, 0 replies; 10+ messages in thread
From: Andreas Schwab @ 2008-03-05 10:34 UTC
  To: Jason Rumney; +Cc: bug-gnu-emacs, 38, Göran Uddeborg

Jason Rumney <jasonr@gnu.org> writes:

> YAMAMOTO Mitsuharu wrote:
>> Second, as DECODE_FILE causes GC and string compaction in general,
>> some variables such as `nm' in Fexpand_file_name may not point to
>> valid data after that.
>
> This is a problem on some systems that still do not support stack marking
> for GC protection of such variables. But I think this bug is important
> enough to fix those problems rather than revert the patch.

Only Lisp_Object variables are protected.  Failing to protect
non-Lisp_Object pointers can result in crashes.  A crash is always the
worst possible problem.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."





* Re: No coding system used for environment variables
  2008-03-05 10:11         ` Jason Rumney
@ 2008-03-05 11:00           ` YAMAMOTO Mitsuharu
  0 siblings, 0 replies; 10+ messages in thread
From: YAMAMOTO Mitsuharu @ 2008-03-05 11:00 UTC
  To: jasonr; +Cc: bug-gnu-emacs, 38, goeran

>>>>> On Wed, 05 Mar 2008 10:11:42 +0000, Jason Rumney <jasonr@gnu.org> said:

>>> Was there a problem with that patch? Why was it not installed at
>>> the time?
>> 
>> Because no expert in this area responded to the patch.

> OK, I reverted my change. 

Thanks.

> I'll let Chong and Stefan decide whether to fix the bug with your
> safer patch now (and a similar patch for Fsubstitute_in_file_name)
> or release 22.2 with this as a known bug.

I understand the bug is annoying in certain environments, but I would
like to defer any kind of fix for this bug to later versions, as
the fixes involve nontrivial issues and may have unexpected effects.

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp





* Re: No coding system used for environment variables
  2008-03-05  2:22   ` YAMAMOTO Mitsuharu
  2008-03-05  8:57     ` Jason Rumney
@ 2008-03-05 16:25     ` Stefan Monnier
  2008-03-07 12:12       ` YAMAMOTO Mitsuharu
  1 sibling, 1 reply; 10+ messages in thread
From: Stefan Monnier @ 2008-03-05 16:25 UTC
  To: YAMAMOTO Mitsuharu; +Cc: bug-gnu-emacs, Göran Uddeborg, 38-done

> But I'd strongly suggest reverting these changes at this stage of the
> pretest for the upcoming Emacs 22.2.  First, some coding systems are not
> ready until some .elc files get loaded (a chicken-and-egg problem).
> Second, as DECODE_FILE causes GC and string compaction in general,
> some variables such as `nm' in Fexpand_file_name may not point to
> valid data after that.  You may also want to see a related patch in
> http://lists.gnu.org/archive/html/emacs-pretest-bug/2007-05/msg00115.html

How 'bout installing this change on the trunk?


        Stefan





* Re: No coding system used for environment variables
  2008-03-05 16:25     ` Stefan Monnier
@ 2008-03-07 12:12       ` YAMAMOTO Mitsuharu
  0 siblings, 0 replies; 10+ messages in thread
From: YAMAMOTO Mitsuharu @ 2008-03-07 12:12 UTC
  To: Stefan Monnier; +Cc: Göran Uddeborg, bug-gnu-emacs, 38-done

>>>>> On Wed, 05 Mar 2008 11:25:59 -0500, Stefan Monnier <monnier@iro.umontreal.ca> said:

>> But I'd strongly suggest reverting these changes at this stage of
>> the pretest for the upcoming Emacs 22.2.  First, some coding systems are
>> not ready until some .elc files get loaded (a chicken-and-egg
>> problem).  Second, as DECODE_FILE causes GC and string compaction
>> in general, some variables such as `nm' in Fexpand_file_name may
>> not point to valid data after that.  You may also want to see a
>> related patch in
>> http://lists.gnu.org/archive/html/emacs-pretest-bug/2007-05/msg00115.html

> How 'bout installing this change on the trunk?

While I was looking at the code of Fsubstitute_in_file_name to
make the patch for the trunk, I noticed that the destination buffer
it allocates can be too small, in both the EMACS_22_BASE branch
and the trunk.

*** src/fileio.c.~1.580.2.10.~	Thu Mar  6 09:44:27 2008
--- src/fileio.c	Fri Mar  7 20:55:26 2008
***************
*** 2227,2233 ****
  	o = (unsigned char *) egetenv (target);
  	if (o)
  	  {
! 	    total += strlen (o);
  	    substituted = 1;
  	  }
  	else if (*p == '}')
--- 2227,2238 ----
  	o = (unsigned char *) egetenv (target);
  	if (o)
  	  {
! 	    if (STRING_MULTIBYTE (filename))
! 	      /* A unibyte character may occupy 2 bytes when converted
! 		 to multibyte.  */
! 	      total += strlen (o) * 2;
! 	    else
! 	      total += strlen (o);
  	    substituted = 1;
  	  }
  	else if (*p == '}')

As I can't install it right away, please install it on EMACS_22_BASE if
the next pretest is due out shortly (and if the patch looks good, of
course).
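
The sizing rule in the patch (reserve two bytes per input byte when the
result is multibyte) can be sketched outside Emacs.  The example below
uses Latin-1 to UTF-8 conversion as a stand-in for the unibyte to
multibyte conversion; that substitution is an assumption made for
illustration, and `latin1_to_utf8` is a hypothetical helper, not a
function from fileio.c.  The point it shows is the same: each byte above
0x7F grows to two bytes, so the destination needs 2 * strlen (src) + 1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert the Latin-1 string SRC to UTF-8 in DST.
   DST_SIZE must be at least 2 * strlen (SRC) + 1, because each byte
   >= 0x80 becomes a two-byte sequence.
   Returns the number of bytes written, excluding the final NUL.  */
static size_t
latin1_to_utf8 (const char *src, char *dst, size_t dst_size)
{
  size_t n = 0;

  /* Worst-case sizing: every source byte doubles.  */
  assert (dst_size >= 2 * strlen (src) + 1);

  for (const unsigned char *p = (const unsigned char *) src; *p; p++)
    {
      if (*p < 0x80)
        dst[n++] = (char) *p;                   /* ASCII: one byte */
      else
        {
          dst[n++] = (char) (0xC0 | (*p >> 6));    /* lead byte */
          dst[n++] = (char) (0x80 | (*p & 0x3F));  /* continuation byte */
        }
    }
  dst[n] = '\0';
  return n;
}
```

With the five-byte Latin-1 input "g\xf6ran" the output is six bytes
long, so a buffer sized with plain strlen would be one byte short before
even counting the terminator.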

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp



