unofficial mirror of help-gnu-emacs@gnu.org
* Coding system and environment variables
@ 2008-02-20  8:00 Göran Uddeborg
  2008-02-20 15:41 ` Sven Joachim
  2008-02-20 16:23 ` Piet van Oostrum
  0 siblings, 2 replies; 5+ messages in thread
From: Göran Uddeborg @ 2008-02-20  8:00 UTC
  To: help-gnu-emacs

How is the coding system decided when reading an environment variable?

I'm running a system using UTF-8.  My locale is sv_SE.utf8, and Emacs
uses UTF-8 by default most of the time, for example when I open a new
file.

I do have issues with strings coming from environment variables though.
I first discovered this in the vm mail system, since it misinterpreted
the variable MAIL which has the value /var/spool/mail/göran.  (In case
your mailer mangles it, the last file name component is "g ö r a
n".)  But it also causes problems with functions relating to the home
directory.  HOME is /home/göran (same last component as before).

As an example, I start emacs in my home directory, and do a few
experiments in the scratch buffer (which has a "u" for coding system in
the mode line):

    default-directory
    "/home/göran/"

Looks good.  I see my ö.

    (expand-file-name "")
    "/home/göran"

Ok too.

    (expand-file-name "~")
    "/home/g\303\266ran"

Here the octal codes for a UTF-8 encoded ö are shown instead of the
ö itself.  Why is this different?  The source of ~ is the
environment variable HOME.  But if I explicitly ask for that variable:

    (getenv "HOME")
    "/home/göran"

Here I see the ö.

Let's have a bit more fun.  Here I try to expand a FILE with my own
name:

    (expand-file-name "göran")
    "/home/göran/göran"

Looks the way I would expect.  Now the same thing, explicitly saying to
put it in the home directory:

    (expand-file-name "~/göran")
    "/home/g\xc3\xb6ran/göran"

The ö in the file name is ok.  The ö in the directory name is
strange again, only this time it is shown in hex rather than octal.
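
For what it is worth, the octal and hex escapes appear to be the same
two bytes, namely the UTF-8 encoding of ö.  A minimal sketch to check
this, assuming an Emacs with UTF-8 support; my-utf8-bytes is a made-up
helper name, not an Emacs function:

```elisp
;; Hypothetical helper (not part of Emacs): return the UTF-8 bytes of
;; STRING as a list of integers.
(defun my-utf8-bytes (string)
  "Return the bytes of STRING encoded as UTF-8, as a list of integers."
  (string-to-list (encode-coding-string string 'utf-8)))

(my-utf8-bytes "ö")                               ; => (195 182)
;; \303\266 (octal) and \xc3\xb6 (hex) name the same two bytes.
(equal (my-utf8-bytes "ö") (list #o303 #o266))    ; => t
(equal (my-utf8-bytes "ö") (list #xc3 #xb6))      ; => t
```

So only the notation differs between the two results; in both cases
the directory part consists of undecoded bytes.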

Can anyone explain what is going on?  And most importantly, how do I
tell Emacs that environment variables are using the UTF-8 coding system?

I've read the chapter on International Character Set Support in the info
manual, but I couldn't find any help on this in there.




* Re: Coding system and environment variables
  2008-02-20  8:00 Coding system and environment variables Göran Uddeborg
@ 2008-02-20 15:41 ` Sven Joachim
  2008-02-21 10:10   ` Göran Uddeborg
  2008-02-20 16:23 ` Piet van Oostrum
  1 sibling, 1 reply; 5+ messages in thread
From: Sven Joachim @ 2008-02-20 15:41 UTC
  To: help-gnu-emacs

On 2008-02-20 09:00 +0100, Göran Uddeborg wrote:

> How is the coding system decided when reading an environment variable?

Normally it should use your preferred choice.

> I'm running a system using UTF-8.  My locale is sv_SE.utf8, and Emacs
> uses UTF-8 by default most of the time, for example when I open a new
> file.
>
> I do have issues with strings coming from environment variables though.
> I first discovered this in the vm mail system, since it misinterpreted
> the variable MAIL which has the value /var/spool/mail/göran.  (In case
> your mailer mangles it, the last file name component is "g ö r a
> n".)  But it also causes problems with functions relating to the home
> directory.  HOME is /home/göran (same last component as before).
>
> As an example, I start emacs in my home directory, and do a few
> experiments in the scratch buffer (which has a "u" for coding system in
> the mode line):
>
>     default-directory
>     "/home/göran/"
>
> Looks good.  I see my ö.
>
>     (expand-file-name "")
>     "/home/göran"
>
> Ok too.
>
>     (expand-file-name "~")
>     "/home/g\303\266ran"

Yeah, I can reproduce this.  There seems to be something fishy when
expand-file-name expands the tilde.  But I'm not familiar with the code.

> Here the octal codes for a UTF-8 encoded ö are shown instead of the
> ö itself.  Why is this different?  The source of ~ is the
> environment variable HOME.  But if I explicitly ask for that variable:
>
>     (getenv "HOME")
>     "/home/göran"

That's okay.

> Here I see the ö.
>
> Let's have a bit more fun.  Here I try to expand a FILE with my own
> name:
>
>     (expand-file-name "göran")
>     "/home/göran/göran"
>
> Looks the way I would expect.  Now the same thing, explicitly saying to
> put it in the home directory:
>
>     (expand-file-name "~/göran")
>     "/home/g\xc3\xb6ran/göran"

Please file a bug with M-x report-emacs-bug, I think the issue should be
brought to the developers' attention.

Cheers,
       Sven



* Re: Coding system and environment variables
  2008-02-20  8:00 Coding system and environment variables Göran Uddeborg
  2008-02-20 15:41 ` Sven Joachim
@ 2008-02-20 16:23 ` Piet van Oostrum
  2008-02-21 10:14   ` Göran Uddeborg
  1 sibling, 1 reply; 5+ messages in thread
From: Piet van Oostrum @ 2008-02-20 16:23 UTC
  To: help-gnu-emacs

>>>>> Göran Uddeborg <uddeborg@carmen.se> (GU) wrote:

>GU> How is the coding system decided when reading an environment variable?
>GU> I'm running a system using UTF-8.  My locale is sv_SE.utf8, and Emacs
>GU> uses UTF-8 by default most of the time, for example when I open a new
>GU> file.

[snip]

I looked at the code, and it seems that Emacs doesn't apply
file-name-coding-system when expanding ~ to $HOME, nor when it
interpolates $XXX environment variables in a file name.  It just copies
the bytes.  I think this is a bug.  Please report it.
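
Until that is fixed, one possible workaround is to decode the result
yourself.  This is only a sketch: it assumes the returned string is
unibyte, as in the (expand-file-name "~") case, and
my-decode-file-name is a made-up name, not an Emacs function.  The
final fallback to utf-8 is also my assumption.

```elisp
;; Hypothetical helper (not part of Emacs): decode the raw bytes that
;; expand-file-name copied from $HOME.
(defun my-decode-file-name (name)
  "Decode NAME with the file name coding system if NAME is unibyte.
Multibyte strings are returned unchanged."
  (if (multibyte-string-p name)
      name
    (decode-coding-string
     name
     ;; Assumption: fall back to utf-8 when neither variable is set.
     (or file-name-coding-system
         default-file-name-coding-system
         'utf-8))))

;; Usage: wrap the call that expands the tilde.
(my-decode-file-name (expand-file-name "~"))
```

It does not handle the mixed case, such as (expand-file-name
"~/göran"), where raw bytes and decoded characters end up in the same
multibyte string.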

By the way, your posting contains 
Content-Type: text/plain; charset=utf8
That should be utf-8 (with the hyphen). Therefore your posting reads wrong.
-- 
Piet van Oostrum <piet@cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: piet@vanoostrum.org



* Re: Coding system and environment variables
  2008-02-20 15:41 ` Sven Joachim
@ 2008-02-21 10:10   ` Göran Uddeborg
  0 siblings, 0 replies; 5+ messages in thread
From: Göran Uddeborg @ 2008-02-21 10:10 UTC
  To: help-gnu-emacs

On Wed, 2008-02-20 at 16:41 +0100, Sven Joachim wrote:
> Please file a bug with M-x report-emacs-bug, I think the issue should be
> brought to the developers' attention.

I'll do that.




* Re: Coding system and environment variables
  2008-02-20 16:23 ` Piet van Oostrum
@ 2008-02-21 10:14   ` Göran Uddeborg
  0 siblings, 0 replies; 5+ messages in thread
From: Göran Uddeborg @ 2008-02-21 10:14 UTC
  To: help-gnu-emacs

On Wed, 2008-02-20 at 17:23 +0100, Piet van Oostrum wrote:
> I think this is a bug. Please report it.

I will.

> By the way, your posting contains 
> Content-Type: text/plain; charset=utf8
> That should be utf-8 (with the hyphen). Therefore your posting reads wrong.

Hm, I thought I'd worked around that, but apparently I missed one case.
Thanks for pointing it out.  I hope this one is correct.

(The underlying cause is what I believe to be a bug in Evolution:
http://bugzilla.gnome.org/show_bug.cgi?id=517244)

/Göran


