all messages for Emacs-related lists mirrored at yhetil.org
* environment variable don't get coding conversion
@ 2003-01-16 12:05 Dave Love
  2003-01-17  6:13 ` Kenichi Handa
  0 siblings, 1 reply; 27+ messages in thread
From: Dave Love @ 2003-01-16 12:05 UTC (permalink / raw)


Non-ASCII environment variables aren't useful because no coding
conversion is done on them.  I guess the environment should be kept
encoded, and setenv and getenv should be changed to convert through
`locale-coding-system'.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: environment variable don't get coding conversion
  2003-01-16 12:05 environment variable don't get coding conversion Dave Love
@ 2003-01-17  6:13 ` Kenichi Handa
  2003-01-18  0:46   ` Richard Stallman
  0 siblings, 1 reply; 27+ messages in thread
From: Kenichi Handa @ 2003-01-17  6:13 UTC (permalink / raw)
  Cc: bug-gnu-emacs

In article <rzqadi1i9ka.fsf@albion.dl.ac.uk>, Dave Love <d.love@dl.ac.uk> writes:

> Non-ASCII environment variables aren't useful because no coding
> conversion is done on them.  I guess the environment should be kept
> encoded, and setenv and getenv should be changed to convert through
> `locale-coding-system'.

It seems to me that elements of process-environment should
be decoded by locale-coding-system because
process-environment is exposed to Emacs Lisp, and there is
code that directly manipulates that variable.

---
Ken'ichi HANDA
handa@m17n.org

* Re: environment variable don't get coding conversion
  2003-01-17  6:13 ` Kenichi Handa
@ 2003-01-18  0:46   ` Richard Stallman
  2003-01-20  0:38     ` Kenichi Handa
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Stallman @ 2003-01-18  0:46 UTC (permalink / raw)
  Cc: emacs-devel

    It seems to me that elements of process-environment should
    be decoded by locale-coding-system because
    process-environment is exposed to Emacs Lisp, and there is
    code that directly manipulates that variable.

We would have to encode them again when running a subprocess.  I think
that could have the effect of altering the values, so that they are
not passed down properly from Emacs's parent process to its children.

It might be better for user programs to decode the values
if they want to.

* Re: environment variable don't get coding conversion
  2003-01-18  0:46   ` Richard Stallman
@ 2003-01-20  0:38     ` Kenichi Handa
  2003-01-20 16:46       ` Richard Stallman
  2003-01-21 18:18       ` Dave Love
  0 siblings, 2 replies; 27+ messages in thread
From: Kenichi Handa @ 2003-01-20  0:38 UTC (permalink / raw)
  Cc: emacs-devel

In article <E18Zh83-0000pl-00@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:
>     It seems to me that elements of process-environment should
>     be decoded by locale-coding-system because
>     process-environment is exposed to Emacs Lisp, and there is
>     code that directly manipulates that variable.

> We would have to encode them again when running a subprocess.  I think
> that could have the effect of altering the values, so that they are
> not passed down properly from Emacs's parent process to its children.

As far as I know, any coding system set in
locale-coding-system is stateless.  None of them uses escape
sequences.  So, decoding and encoding should yield the same
value.  If that is uncertain, we can attach the original
unibyte string as a text property (say, `encoded-string') to
each decoded string.
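
[Editorial sketch, not part of the original message.]  The round-trip
claim and the text-property fallback can be tried out roughly as
follows.  The `my-env-*' names are hypothetical helpers, not Emacs
functions, and `hebrew-iso-8bit' stands in for whatever stateless
coding system the locale would select.

```elisp
;; Hypothetical helpers for illustration only; none of these exist in Emacs.

(defun my-env-round-trip-p (bytes coding-system)
  "Return non-nil if decoding then re-encoding BYTES yields BYTES again."
  (string= bytes
           (encode-coding-string
            (decode-coding-string bytes coding-system)
            coding-system)))

(defun my-env-decode (bytes coding-system)
  "Decode BYTES and remember the original in an `encoded-string' property."
  (propertize (decode-coding-string bytes coding-system)
              'encoded-string bytes))

(defun my-env-encode (string coding-system)
  "Encode STRING, reusing bytes saved by `my-env-decode' when still valid."
  (let ((saved (and (> (length string) 0)
                    (get-text-property 0 'encoded-string string))))
    ;; Only trust the saved bytes if they still decode to the same text;
    ;; otherwise the string was edited after decoding.
    (if (and saved
             (string= (decode-coding-string saved coding-system) string))
        saved
      (encode-coding-string string coding-system))))
```

For instance, (my-env-encode (my-env-decode "\343\345\344\340"
'hebrew-iso-8bit) 'hebrew-iso-8bit) should hand back the original four
bytes.  The check against the decoded text is an added assumption here,
meant to guard against a stale property after the string is modified.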

> It might be better for user programs to decode the values
> if they want to.

User programs?  Then, what to do with M-x setenv?

---
Ken'ichi HANDA
handa@m17n.org

* Re: environment variable don't get coding conversion
  2003-01-20  0:38     ` Kenichi Handa
@ 2003-01-20 16:46       ` Richard Stallman
  2003-01-21 18:21         ` Dave Love
  2003-01-21 18:18       ` Dave Love
  1 sibling, 1 reply; 27+ messages in thread
From: Richard Stallman @ 2003-01-20 16:46 UTC (permalink / raw)
  Cc: emacs-devel

    As far as I know, any coding system set in locale-coding-system is
    stateless.  None of them uses escape sequences.  So, decoding and
    encoding should yield the same value.

If this is true, it would be safe, but is it really an issue
we need to worry about?

    > It might be better for user programs to decode the values
    > if they want to.

    User programs?  Then, what to do with M-x setenv?

In the meantime, maybe nothing needs to be done.

* Re: environment variable don't get coding conversion
  2003-01-20  0:38     ` Kenichi Handa
  2003-01-20 16:46       ` Richard Stallman
@ 2003-01-21 18:18       ` Dave Love
  2003-01-23  8:00         ` Richard Stallman
  1 sibling, 1 reply; 27+ messages in thread
From: Dave Love @ 2003-01-21 18:18 UTC (permalink / raw)
  Cc: emacs-devel

Kenichi Handa <handa@m17n.org> writes:

> As far as I know, any coding system set in
> locale-coding-system is stateless.  None of them uses escape
> sequences.

The second entry in locale-language-names (Amharic) uses
iso-2022-7bit, but I don't think that's really an issue.

> > It might be better for user programs to decode the values
> > if they want to.
> 
> User programs?  Then, what to do with M-x setenv?

setenv and getenv must receive and return decoded text one way or
another.

(It seems to me that the basic problem is that the environment string
is exposed to Lisp, presumably because it predates multilingual
issues, and really shouldn't be.  I think accessing it directly should
be deprecated.)

* Re: environment variable don't get coding conversion
  2003-01-20 16:46       ` Richard Stallman
@ 2003-01-21 18:21         ` Dave Love
  2003-01-23  7:59           ` Richard Stallman
  0 siblings, 1 reply; 27+ messages in thread
From: Dave Love @ 2003-01-21 18:21 UTC (permalink / raw)
  Cc: Kenichi Handa

Richard Stallman <rms@gnu.org> writes:

> In the meantime, maybe nothing needs to be done.

I'm not sure what that means, but something needs to be done to make
non-ASCII usable.  Note that setenv should check that its arg can be
encoded by the locale coding system however the environment is stored.

* Re: environment variable don't get coding conversion
  2003-01-21 18:21         ` Dave Love
@ 2003-01-23  7:59           ` Richard Stallman
  2003-01-23 23:04             ` Dave Love
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Stallman @ 2003-01-23  7:59 UTC (permalink / raw)
  Cc: handa

    I'm not sure what that means, but something needs to be done to make
    non-ASCII usable.

Could you show an example of what exactly you would like to do,
that doesn't work now?

* Re: environment variable don't get coding conversion
  2003-01-21 18:18       ` Dave Love
@ 2003-01-23  8:00         ` Richard Stallman
  2003-01-25  0:56           ` Kenichi Handa
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Stallman @ 2003-01-23  8:00 UTC (permalink / raw)
  Cc: handa

    (It seems to me that the basic problem is that the environment string
    is exposed to Lisp, presumably because it predates multilingual
    issues, and really shouldn't be.  I think accessing it directly should
    be deprecated.)

The general idea of Emacs is to expose data structure to the Lisp
program whenever possible.  I see no reason to go against that here.

* Re: environment variable don't get coding conversion
  2003-01-23  7:59           ` Richard Stallman
@ 2003-01-23 23:04             ` Dave Love
  2003-01-25 19:22               ` Richard Stallman
  0 siblings, 1 reply; 27+ messages in thread
From: Dave Love @ 2003-01-23 23:04 UTC (permalink / raw)
  Cc: handa

Richard Stallman <rms@gnu.org> writes:

>     I'm not sure what that means, but something needs to be done to make
>     non-ASCII usable.
> 
> Could you show an example of what exactly you would like to do,
> that doesn't work now?

I'm surprised there's any question about doing coding conversion, if
that's what you mean.

If you have a non-ASCII variable name, you can't sensibly use M-x
getenv with it; if you have your non-ASCII personal name in NAME,
Emacs gets your `user-full-name' as a unibyte string with which random
things will happen; &c.

* Re: environment variable don't get coding conversion
  2003-01-23  8:00         ` Richard Stallman
@ 2003-01-25  0:56           ` Kenichi Handa
  2003-01-25 17:09             ` Eli Zaretskii
  2003-01-26 18:23             ` Dave Love
  0 siblings, 2 replies; 27+ messages in thread
From: Kenichi Handa @ 2003-01-25  0:56 UTC (permalink / raw)
  Cc: emacs-devel

In article <E18bcHR-0007cG-00@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:

>     (It seems to me that the basic problem is that the environment string
>     is exposed to Lisp, presumably because it predates multilingual
>     issues, and really shouldn't be.  I think accessing it directly should
>     be deprecated.)

> The general idea of Emacs is to expose data structure to the Lisp
> program whenever possible.  I see no reason to go against that here.

But, I think it's also the general idea of Emacs to
decode/encode characters automatically.  As long as we expose
process-environment, the strings in it should already be
decoded.  If we want undecoded strings in it, we shouldn't
expose it.

---
Ken'ichi HANDA
handa@m17n.org

* Re: environment variable don't get coding conversion
  2003-01-25  0:56           ` Kenichi Handa
@ 2003-01-25 17:09             ` Eli Zaretskii
  2003-01-26 15:36               ` Richard Stallman
  2003-01-26 18:23             ` Dave Love
  1 sibling, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2003-01-25 17:09 UTC (permalink / raw)
  Cc: emacs-devel

> Date: Sat, 25 Jan 2003 09:56:16 +0900 (JST)
> From: Kenichi Handa <handa@m17n.org>
> 
> But, I think it's also the general idea of Emacs to
> decode/encode characters automatically.  As long as we expose
> process-environment, the strings in it should already be
> decoded.

I agree.  Text strings exposed to Lisp programs should be in the
internal representation, i.e. decoded.  Otherwise, Lisp programs would
be unable to easily use them in text-processing context.

* Re: environment variable don't get coding conversion
  2003-01-23 23:04             ` Dave Love
@ 2003-01-25 19:22               ` Richard Stallman
  2003-01-25 22:05                 ` Ehud Karni
  2003-01-26 18:22                 ` Dave Love
  0 siblings, 2 replies; 27+ messages in thread
From: Richard Stallman @ 2003-01-25 19:22 UTC (permalink / raw)
  Cc: handa

    >     I'm not sure what that means, but something needs to be done to make
    >     non-ASCII usable.
    > 
    > Could you show an example of what exactly you would like to do,
    > that doesn't work now?

    I'm surprised there's any question about doing coding conversion, if
    that's what you mean.

"Make non-ASCII usable" is not specific enough for me to understand.
I cannot be sure if there is a real problem.

Please show me a specific problem case so I can judge if there is a
significant problem.

    If you have a non-ASCII variable name, you can't sensibly use M-x
    getenv with it;

getenv could decode the variable name, if that is useful.

		    if you have your non-ASCII personal name in NAME,
    Emacs gets your `user-full-name' as a unibyte string with which random
    things will happen;

Please show specific examples so I can judge whether these are real problems
and, if so, look for solutions.

* Re: environment variable don't get coding conversion
  2003-01-25 19:22               ` Richard Stallman
@ 2003-01-25 22:05                 ` Ehud Karni
  2003-01-26 14:41                   ` Kai Großjohann
                                     ` (2 more replies)
  2003-01-26 18:22                 ` Dave Love
  1 sibling, 3 replies; 27+ messages in thread
From: Ehud Karni @ 2003-01-25 22:05 UTC (permalink / raw)
  Cc: handa

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sat, 25 Jan 2003 14:22:27 -0500, Richard Stallman <rms@gnu.org> wrote:
>
> "Make non-ASCII usable" is not specific enough for me to understand.
> I cannot be sure if there is a real problem.

I'd say "Make non-ASCII usable" means that the env var should be usable
in an Emacs subshell, i.e. getenv in a program run in the subshell will
get the proper value (usually 8-bit characters for ISO-8859-x).

> Please show me a specific problem case so I can judge if there is a
> significant problem.

We have many variables containing Hebrew strings (ISO-8859-8). They
are accessed by programs and scripts. They are represented by 8 bit
values in process-environment.
e.g. HUSER is set to "דוהא", is stored as "HUSER=\343\345\344\340".

If I do (setenv "HUTST" "דוהא") it is stored as "HUTST=דוהא". Now,
when a subshell accesses the HUTST variable it gets
ˆדˆוˆהˆא (i.e. \210 before each Hebrew character).
Here is the output of "env | grep "^HU" | cat -v" run from Emacs:
HUTST=M-^HM-cM-^HM-eM-^HM-dM-^HM-`
HUSER=M-cM-eM-dM-`

The way to overcome it in Emacs is to set the env var like this:
  (setenv "HUTST" (encode-coding-string "דוהא" 'hebrew-iso-8bit))

>     If you have a non-ASCII variable name, you can't sensibly use M-x
>     getenv with it;
>
> getenv could decode the variable name, if that is useful.

You can set non-ASCII variable name, e.g.
  (setenv (encode-coding-string "Hםש" 'hebrew-iso-8bit)
          (encode-coding-string "ינרק" 'hebrew-iso-8bit))
and you can use them in subprocesses. To use them in Emacs itself,
you have to use the encoded Hebrew name, i.e.
  (getenv (encode-coding-string "Hםש" 'hebrew-iso-8bit))

I agree that to use non-ASCII environment variable name is not
practical, so this problem is not really important, but the non-ASCII
values are used a lot, and the practical way for ISO-8859-x is to
have them in unibyte.

The `process-environment' can hold the values in multibyte but
`child_setup' and `getenv_internal' (in callproc.c) should transform
the values to unibyte (or whatever necessary for other coding systems
like CJK and utf-8).

Ehud.


- --
 Ehud Karni           Tel: +972-3-7966-561  /"\
 Mivtach - Simon      Fax: +972-3-7966-667  \ /  ASCII Ribbon Campaign
 Insurance agencies   (USA) voice mail and   X   Against   HTML   Mail
 http://www.mvs.co.il  FAX:  1-815-5509341  / \
 mailto:ehud@unix.mvs.co.il                  Better  Safe  Than  Sorry
-----BEGIN PGP SIGNATURE-----
Comment: use http://www.keyserver.net/ to get my key (and others)

iD8DBQE+Mwo2LFvTvpjqOY0RAu3ZAJ4rS0k9OUjuWnrmmCMLyQz73uQurgCdG5n+
KAZmF3+IUi+jSO8yHWUUEyo=
=TPq/
-----END PGP SIGNATURE-----

* Re: environment variable don't get coding conversion
  2003-01-25 22:05                 ` Ehud Karni
@ 2003-01-26 14:41                   ` Kai Großjohann
  2003-01-26 16:11                     ` Eli Zaretskii
  2003-01-27  2:48                     ` Kenichi Handa
  2003-01-26 18:50                   ` Dave Love
  2003-01-27  2:31                   ` Richard Stallman
  2 siblings, 2 replies; 27+ messages in thread
From: Kai Großjohann @ 2003-01-26 14:41 UTC (permalink / raw)


"Ehud Karni" <ehud@unix.mvs.co.il> writes:

> I agree that to use non-ASCII environment variable name is not
> practical, so this problem is not really important, but the non-ASCII
> values are used a lot, and the practical way for ISO-8859-x is to
> have them in unibyte.

I guess the most important question is: which coding system is used
for values of environment variables?  Emacs already has a number of
different coding systems; it's not clear to me which one of them, if
any, is appropriate for environment variables.
process-coding-system?  file-name-coding-system?

And what happens if you do (setenv foo bar) and bar is a string which
can't be encoded in the coding system specified for environment
variables?
-- 
Ambibibentists unite!

* Re: environment variable don't get coding conversion
  2003-01-25 17:09             ` Eli Zaretskii
@ 2003-01-26 15:36               ` Richard Stallman
  2003-01-26 16:08                 ` Eli Zaretskii
  2003-01-27  2:27                 ` Kenichi Handa
  0 siblings, 2 replies; 27+ messages in thread
From: Richard Stallman @ 2003-01-26 15:36 UTC (permalink / raw)
  Cc: handa

    I agree.  Text strings exposed to Lisp programs should be in the
    internal representation, i.e. decoded.  Otherwise, Lisp programs would
    be unable to easily use them in text-processing context.

This would be true ordinarily, but here there is a more important
factor.  The usual thing to do with environment variables is to pass
them through.  Making sure that is solidly reliable is the highest
priority for environment variables.  Therefore we have to leave these
strings in their original format.

* Re: environment variable don't get coding conversion
  2003-01-26 15:36               ` Richard Stallman
@ 2003-01-26 16:08                 ` Eli Zaretskii
  2003-01-27 17:41                   ` Richard Stallman
  2003-01-27  2:27                 ` Kenichi Handa
  1 sibling, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2003-01-26 16:08 UTC (permalink / raw)
  Cc: emacs-devel

> From: Richard Stallman <rms@gnu.org>
> Date: Sun, 26 Jan 2003 10:36:54 -0500
> 
> This would be true ordinarily, but here there is a more important
> factor.  The usual thing to do with environment variables is to pass
> them through.  Making sure that is solidly reliable is the highest
> priority for environment variables.

Does it really have to be more reliable than what we do with users'
precious files?  I'm not sure I see why the reliability of launching
subprocesses is more important than potential loss of information due
to incorrect decoding and encoding of user files.

In any case, at least we should IMHO consider whether getenv and
setenv need to decode and encode the environment variables' values.

> Therefore we have to leave these strings in their original format.

If this is the final decision, we probably should tell Lisp
programmers (in the ELisp manual) how to deal with such unibyte
strings in text processing.

* Re: environment variable don't get coding conversion
  2003-01-26 14:41                   ` Kai Großjohann
@ 2003-01-26 16:11                     ` Eli Zaretskii
  2003-01-27 13:18                       ` Stefan Monnier
  2003-01-27  2:48                     ` Kenichi Handa
  1 sibling, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2003-01-26 16:11 UTC (permalink / raw)
  Cc: emacs-devel

> From: kai.grossjohann@uni-duisburg.de (Kai Großjohann)
> Date: Sun, 26 Jan 2003 15:41:39 +0100
> 
> I guess the most important question is: which coding system is used
> for values of environment variables?

Something locale-dependent, I guess.

We could also try guessing the encoding, since a typical process
environment is a substantial chunk of text (so a probability of an
error is not very high, I think).

> And what happens if you do (setenv foo bar) and bar is a string which
> can't be encoded in the coding system specified for environment
> variables?

How is this different from hitting the same problem inside
write-region or save-buffer?

* Re: environment variable don't get coding conversion
  2003-01-25 19:22               ` Richard Stallman
  2003-01-25 22:05                 ` Ehud Karni
@ 2003-01-26 18:22                 ` Dave Love
  1 sibling, 0 replies; 27+ messages in thread
From: Dave Love @ 2003-01-26 18:22 UTC (permalink / raw)
  Cc: handa

Richard Stallman <rms@gnu.org> writes:

> "Make non-ASCII usable" is not specific enough for me to understand.
> I cannot be sure if there is a real problem.

Can't you leave it up to handa, then?

> Please show me a specific problem case so I can judge if there is a
> significant problem.

I don't understand how these aren't specific enough, but it's
obviously not right to treat environment variables as unibyte in a
multibyte session.

>     If you have a non-ASCII variable name, you can't sensibly use M-x
>     getenv with it;
> 
> getenv could decode the variable name, if that is useful.
> 
> 		    if you have your non-ASCII personal name in NAME,
>     Emacs gets your `user-full-name' as a unibyte string with which random
>     things will happen;
> 
> Please show specific examples so I can judge whether these are real problems
> and, if so, look for solutions.

You could try non-ASCII variable names and values in the environment
with an appropriate locale.  The solution is either to have
setenv/getenv do coding conversion, or to do conversion both on the
imported environment and the one exported to subprocesses.

* Re: environment variable don't get coding conversion
  2003-01-25  0:56           ` Kenichi Handa
  2003-01-25 17:09             ` Eli Zaretskii
@ 2003-01-26 18:23             ` Dave Love
  1 sibling, 0 replies; 27+ messages in thread
From: Dave Love @ 2003-01-26 18:23 UTC (permalink / raw)
  Cc: emacs-devel

Kenichi Handa <handa@m17n.org> writes:

> But, I think it's also the general idea of Emacs to
> decode/encode characters automatically.  As long as we expose
> process-environment, the strings in it should already be
> decoded.  If we want undecoded strings in it, we shouldn't
> expose it.

Yes, and setenv should be able to sanity-check its args.

* Re: environment variable don't get coding conversion
  2003-01-25 22:05                 ` Ehud Karni
  2003-01-26 14:41                   ` Kai Großjohann
@ 2003-01-26 18:50                   ` Dave Love
  2003-01-27  2:31                   ` Richard Stallman
  2 siblings, 0 replies; 27+ messages in thread
From: Dave Love @ 2003-01-26 18:50 UTC (permalink / raw)
  Cc: handa

[Sorry if this is a duplicate -- I've had trouble sending out over the
weekend.]

"Ehud Karni" <ehud@unix.mvs.co.il> writes:

> You have to use the encoded Hebrew name, i.e.
>   (getenv (encode-coding-string "Hםש" 'hebrew-iso-8bit))

I hope you're not suggesting people _should_ do that!

> I agree that to use non-ASCII environment variable name is not
> practical,

I'm trying to get it made practical for Emacs users.  Why should they
be restricted to ASCII names?

> so this problem is not really important, but the non-ASCII
> values are used a lot, and the practical way for ISO-8859-x is to
> have them in unibyte.

I don't know what that means, but please avoid talking about specific
charsets or classes of them like 8859.  Even if you're only concerned
with Hebrew, presumably you might also want to use it encoded as
windows-1255 or utf-8.  Anyway, things like this should simply work
for any supported encoding.

> The `process-environment' can hold the values in multibyte but 
> `child_setup' and `getenv_internal' (in callproc.c) should transform
> the values to unibyte (or whatever necessary for other coding systems
> like CJK and utf-8).

This sounds confused.  If `process-environment' is stored decoded, as
handa suggests, then it has to be encoded for subprocesses in
`locale-coding-system'.  The encoded value is implicitly unibyte.

* Re: environment variable don't get coding conversion
  2003-01-26 15:36               ` Richard Stallman
  2003-01-26 16:08                 ` Eli Zaretskii
@ 2003-01-27  2:27                 ` Kenichi Handa
  1 sibling, 0 replies; 27+ messages in thread
From: Kenichi Handa @ 2003-01-27  2:27 UTC (permalink / raw)
  Cc: emacs-devel

In article <E18coq6-0001RI-00@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:
>     I agree.  Text strings exposed to Lisp programs should be in the
>     internal representation, i.e. decoded.  Otherwise, Lisp programs would
>     be unable to easily use them in text-processing context.

> This would be true ordinarily, but here there is a more important
> factor.  The usual thing to do with environment variables is to pass
> them through.  Making sure that is solidly reliable is the highest
> priority for environment variables.  Therefore we have to leave these
> strings in their original format.

For that, as I wrote before, we can attach a text-property
that contains the original unibyte string.

---
Ken'ichi HANDA
handa@m17n.org

* Re: environment variable don't get coding conversion
  2003-01-25 22:05                 ` Ehud Karni
  2003-01-26 14:41                   ` Kai Großjohann
  2003-01-26 18:50                   ` Dave Love
@ 2003-01-27  2:31                   ` Richard Stallman
  2003-01-28 18:42                     ` Dave Love
  2 siblings, 1 reply; 27+ messages in thread
From: Richard Stallman @ 2003-01-27  2:31 UTC (permalink / raw)
  Cc: handa

    > "Make non-ASCII usable" is not specific enough for me to understand.
    > I cannot be sure if there is a real problem.

    I'd say "Make non-ASCII usable" means that the env var should be usable
    in an Emacs subshell, i.e. getenv in a program run in the subshell will
    get the proper value (usually 8-bit characters for ISO-8859-x).

That is true, now, in the normal case.  The normal case is that Emacs
inherits the environment variable.  When a subprocess of Emacs inherits it,
Emacs should make sure never to have modified it accidentally.

Are you talking about some different case? If so, could you be
specific?

    The `process-environment' can hold the values in multibyte but 
    `child_setup' and `getenv_internal' (in callproc.c) should transform
    the values to unibyte (or whatever necessary for other coding systems
    like CJK and utf-8).

That would make the normal case unreliable.  The cure would be worse
than the disease.

    The way to overcome it in Emacs is to set the env var like this:
      (setenv "HUTST" (encode-coding-string "דוהא" 'hebrew-iso-8bit))

Perhaps the setenv function (and maybe getenv too) should do this
encoding and decoding using locale-coding-system.  Would you like
to give that a try?
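
[Editorial sketch, not part of the original message.]  One way to
picture that suggestion is to keep the stored entries as encoded
unibyte "NAME=VALUE" strings and convert only at the accessor
boundary.  The sketch below works on a plain list of such entries
rather than on setenv and getenv themselves, so it does not depend on
what a given Emacs version already does inside those functions.  The
`my-env-*' names are hypothetical, and the coding system is passed
explicitly; the thread proposes `locale-coding-system' for that role.

```elisp
;; Hypothetical helpers for illustration only; none of these exist in Emacs.

(defun my-env-entry (name value coding-system)
  "Return \"NAME=VALUE\" as a unibyte string encoded in CODING-SYSTEM."
  (encode-coding-string (concat name "=" value) coding-system))

(defun my-env-lookup (name entries coding-system)
  "Return the decoded value of NAME in ENTRIES, or nil if it is absent.
ENTRIES is a list of encoded unibyte \"NAME=VALUE\" strings."
  (let ((prefix (encode-coding-string (concat name "=") coding-system))
        value)
    (dolist (entry entries)
      ;; The first matching entry wins.
      (when (and (null value) (string-prefix-p prefix entry))
        (setq value (decode-coding-string
                     (substring entry (length prefix))
                     coding-system))))
    value))
```

With the HUSER entry quoted earlier in the thread,
(my-env-lookup "HUSER" '("HUSER=\343\345\344\340") 'hebrew-iso-8bit)
should return the four Hebrew letters as a multibyte string, while the
stored entry itself is never modified, so inheritance by subprocesses
is unaffected.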

* Re: environment variable don't get coding conversion
  2003-01-26 14:41                   ` Kai Großjohann
  2003-01-26 16:11                     ` Eli Zaretskii
@ 2003-01-27  2:48                     ` Kenichi Handa
  1 sibling, 0 replies; 27+ messages in thread
From: Kenichi Handa @ 2003-01-27  2:48 UTC (permalink / raw)
  Cc: emacs-devel

In article <84r8b07z30.fsf@lucy.is.informatik.uni-duisburg.de>, kai.grossjohann@uni-duisburg.de (Kai Großjohann) writes:
> "Ehud Karni" <ehud@unix.mvs.co.il> writes:
>>  I agree that to use non-ASCII environment variable name is not
>>  practical, so this problem is not really important, but the non-ASCII
>>  values are used a lot, and the practical way for ISO-8859-x is to
>>  have them in unibyte.

> I guess the most important question is: which coding system is used
> for values of environment variables?  Emacs already has a number of
> different coding systems; it's not clear to me which one of them, if
> any, is appropriate for environment variables.
> process-coding-system?  file-name-coding-system?

We must decode them by the locale-coding-system at startup
time.  But, we have to check the values of LC_ALL, LC_CTYPE,
or LANG on encoding.  Those values may be changed by a user
by M-x setenv or by the direct modification of
process-environment (in the case that it is kept exposed).
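
[Editorial sketch, not part of the original message.]  Checking those
variables at encoding time could look roughly like this, assuming the
usual POSIX precedence (a non-empty LC_ALL overrides LC_CTYPE, which
overrides LANG).  `my-env-ctype-locale' is a hypothetical name, and
mapping the resulting locale name to a coding system is left out.

```elisp
;; Hypothetical helper for illustration only; not an Emacs function.

(defun my-env-ctype-locale (entries)
  "Return the locale name governing character type in ENTRIES, or nil.
ENTRIES is a list of \"NAME=VALUE\" strings.  The first of LC_ALL,
LC_CTYPE and LANG that is set to a non-empty value wins."
  (let (result)
    (dolist (var '("LC_ALL" "LC_CTYPE" "LANG"))
      (unless result
        (let ((prefix (concat var "=")))
          (dolist (entry entries)
            (when (and (null result)
                       (string-prefix-p prefix entry)
                       ;; An empty value counts as unset.
                       (> (length entry) (length prefix)))
              (setq result (substring entry (length prefix))))))))
    result))
```

Calling it on the current value of `process-environment' just before
starting a subprocess would pick up changes made through M-x setenv
or by direct modification of the list.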

> And what happens if you do (setenv foo bar) and bar is a string which
> can't be encoded in the coding system specified for environment
> variables?

We can signal an error.  Or, it may be ok to force encoding
it by the locale coding system (usually results in "????"),
and use the result blindly because such an env. variable is
anyway not usable in a subprocess.
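
[Editorial sketch, not part of the original message.]  The first
option, signaling an error, might be sketched as below.
`my-env-encode-checked' is a hypothetical name; the check relies on
the existing function `unencodable-char-position', given the string
as its fifth argument.

```elisp
;; Hypothetical helper for illustration only; not an Emacs function.

(defun my-env-encode-checked (string coding-system)
  "Return STRING encoded in CODING-SYSTEM as a unibyte string.
Signal an error if some character of STRING cannot be encoded."
  (let ((pos (unencodable-char-position
              0 (length string) coding-system nil string)))
    (if pos
        (error "Cannot encode %S in %s (first bad index: %d)"
               string coding-system pos)
      (encode-coding-string string coding-system))))
```

A setenv built on this would refuse, for example, a Japanese value
under a Hebrew ISO-8859-8 locale instead of silently storing bytes
that no subprocess could interpret.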

---
Ken'ichi HANDA
handa@m17n.org

* Re: environment variable don't get coding conversion
  2003-01-26 16:11                     ` Eli Zaretskii
@ 2003-01-27 13:18                       ` Stefan Monnier
  0 siblings, 0 replies; 27+ messages in thread
From: Stefan Monnier @ 2003-01-27 13:18 UTC (permalink / raw)
  Cc: kai.grossjohann

> > And what happens if you do (setenv foo bar) and bar is a string which
> > can't be encoded in the coding system specified for environment
> > variables?

What currently happens is that foo gets the internal (i.e. emacs-mule)
encoding of bar, which is more or less guaranteed to be the wrong thing
to do.  So whatever change we make, it won't be much worse than the
current state of affairs.
We could of course require foo and bar to be unibyte strings so as to
force the caller to do the encoding.


	Stefan

* Re: environment variable don't get coding conversion
  2003-01-26 16:08                 ` Eli Zaretskii
@ 2003-01-27 17:41                   ` Richard Stallman
  0 siblings, 0 replies; 27+ messages in thread
From: Richard Stallman @ 2003-01-27 17:41 UTC (permalink / raw)
  Cc: emacs-devel

    Does it really have to be more reliable than what we do with users'
    precious files?

Since it can easily be 100% reliable, it should be.

    In any case, at least we should IMHO consider whether getenv and
    setenv need to decode and encode the environment variables' values.

That's exactly what I've suggested.  That can do the user-level part
of this job in a way that cannot interfere with inheritance.
I recommend using locale-coding-system.

    > Therefore we have to leave these strings in their original format.

    If this is the final decision, we probably should tell Lisp
    programmers (in the ELisp manual) how to deal with such unibyte
    strings in text processing.

That is a good idea too.  It shouldn't take much text to do this.

* Re: environment variable don't get coding conversion
  2003-01-27  2:31                   ` Richard Stallman
@ 2003-01-28 18:42                     ` Dave Love
  0 siblings, 0 replies; 27+ messages in thread
From: Dave Love @ 2003-01-28 18:42 UTC (permalink / raw)
  Cc: handa

Richard Stallman <rms@gnu.org> writes:

> That is true, now, in the normal case.  The normal case is that Emacs
> inherits the environment variable.  When a subprocess of Emacs inherits it,
> Emacs should make sure never to have modified it accidentally.

Of course, if you modify the codeset-affecting locale variables, it's
effectively screwed anyhow if you don't recognize the change and
recode the environment.  That's actually not so unreasonable if you're
starting a shell session to test something, I think.
