unofficial mirror of emacs-devel@gnu.org 
* shell-quote-argument and multibyte
@ 2003-04-13 20:28 Lars Hansen
  2003-04-13 21:57 ` Benjamin Riefenstahl
  2003-04-13 22:02 ` Benjamin Riefenstahl
  0 siblings, 2 replies; 13+ messages in thread
From: Lars Hansen @ 2003-04-13 20:28 UTC


In dired one can specify a list of external viewers. This is a nice new
feature. However, there is a problem, at least on Windows, when there
are special characters in the file name. I don't know where the problem
should be fixed, but I do know that it disappears if string-make-unibyte
is called on ARGUMENT in shell-quote-argument.


* Re: shell-quote-argument and multibyte
  2003-04-13 20:28 Lars Hansen
@ 2003-04-13 21:57 ` Benjamin Riefenstahl
  2003-04-13 22:02 ` Benjamin Riefenstahl
  1 sibling, 0 replies; 13+ messages in thread
From: Benjamin Riefenstahl @ 2003-04-13 21:57 UTC


Hi Lars,


Lars Hansen <larsh@math.ku.dk> writes:
> In dired one can specify a list of external viewers. This is a nice
> new feature. However, there is a problem, at least on Windows, when
> there are special characters in the file name. I don't know where
> the problem should be fixed, but I do know that it disappears if
> string-make-unibyte is called on ARGUMENT in shell-quote-argument.

Isn't this a place to use file-name-coding-system?  Like

  (decode-coding-string ARGUMENT
                        (or file-name-coding-system
                            default-file-name-coding-system))

Or something similar. 


so long, benny


* Re: shell-quote-argument and multibyte
  2003-04-13 20:28 Lars Hansen
  2003-04-13 21:57 ` Benjamin Riefenstahl
@ 2003-04-13 22:02 ` Benjamin Riefenstahl
  2003-04-15 13:16   ` Kenichi Handa
  1 sibling, 1 reply; 13+ messages in thread
From: Benjamin Riefenstahl @ 2003-04-13 22:02 UTC


Hi Lars,

Lars Hansen <larsh@math.ku.dk> writes:
> In dired one can specify a list of external viewers. This is a nice
> new feature. However, there is a problem, at least on Windows, when
> there are special characters in the file name. I don't know where
> the problem should be fixed, but I do know that it disappears if
> string-make-unibyte is called on ARGUMENT in shell-quote-argument.

I would guess you rather want something like this to be generic:

  (decode-coding-string ARGUMENT
                        (or file-name-coding-system
                            default-file-name-coding-system))

so long, benny


* Re: shell-quote-argument and multibyte
  2003-04-13 22:02 ` Benjamin Riefenstahl
@ 2003-04-15 13:16   ` Kenichi Handa
  2003-04-15 17:52     ` Lars Hansen
  2003-04-17 13:54     ` Benjamin Riefenstahl
  0 siblings, 2 replies; 13+ messages in thread
From: Kenichi Handa @ 2003-04-15 13:16 UTC
  Cc: emacs-devel

In article <m3smsmm4ye.fsf@cicero.benny.turtle-trading.net>, Benjamin Riefenstahl <Benjamin.Riefenstahl@epost.de> writes:
> Hi Lars,
> Lars Hansen <larsh@math.ku.dk> writes:
>>  In dired one can specify a list of external viewers. This is a nice
>>  new feature. However, there is a problem, at least on Windows, when
>>  there are special characters in the file name. I don't know where
>>  the problem should be fixed, but I do know that it disappears if
>>  string-make-unibyte is called on ARGUMENT in shell-quote-argument.

By default, process arguments (including the file name in
the above case) are encoded by:
    (cdr default-process-coding-system)
And usually, it is the same as
default-file-name-coding-system.

So, if it doesn't work, it means that something is wrong in
setting up coding systems on Windows.

Please show me the result of C-h C RET.

Benjamin Riefenstahl <Benjamin.Riefenstahl@epost.de> writes:

> I would guess you rather want something like this to be generic:
>   (decode-coding-string ARGUMENT
>                         (or file-name-coding-system
>                             default-file-name-coding-system))

No.  At the very least, "decode" should be "encode".  And it
shouldn't be done in shell-quote-argument.  Such an encoding
should be done only for file names.
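
To make the encode/decode distinction concrete, here is a minimal
sketch.  The helper name my-encode-roundtrip-demo, the sample file
name and the choice of iso-latin-1 are assumptions made for
illustration; none of them comes from the code discussed in this
thread.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-encode-roundtrip-demo ()
  "Show that encoding yields bytes and decoding restores the characters."
  (let* ((name "sm\u00f8r.txt")           ; characters, as Emacs holds them
         ;; Encoding produces the unibyte string an external program receives.
         (bytes (encode-coding-string name 'iso-latin-1)))
    (list (multibyte-string-p name)
          (multibyte-string-p bytes)
          ;; Decoding goes the other way: bytes from outside to characters.
          (string= (decode-coding-string bytes 'iso-latin-1) name))))
```

Evaluating (my-encode-roundtrip-demo) should give (t nil t): the
original string is multibyte, the encoded one is unibyte, and decoding
the bytes with the same coding system restores the name.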

---
Ken'ichi HANDA
handa@m17n.org


* Re: shell-quote-argument and multibyte
  2003-04-15 13:16   ` Kenichi Handa
@ 2003-04-15 17:52     ` Lars Hansen
  2003-04-16  2:00       ` Kenichi Handa
  2003-04-17 13:54     ` Benjamin Riefenstahl
  1 sibling, 1 reply; 13+ messages in thread
From: Lars Hansen @ 2003-04-15 17:52 UTC
  Cc: emacs-devel

>Please show me the result of C-h C RET.

Coding system for saving this buffer:
  1 -- iso-latin-1-dos

Default coding system (for new files):
  1 -- iso-latin-1-dos

Coding system for keyboard input:
  nil
Coding system for terminal output:
  1 -- iso-latin-1 (alias: iso-8859-1 latin-1)

Defaults for subprocess I/O:
  decoding: 1 -- iso-latin-1-dos

  encoding: 1 -- iso-latin-1-unix


Priority order for recognizing coding systems when reading files:
  1. iso-latin-1 (alias: iso-8859-1 latin-1)
  2. cp850
  3. iso-2022-jp (alias: junet)
  4. iso-2022-7bit
  5. iso-2022-7bit-lock (alias: iso-2022-int-1)
  6. iso-2022-8bit-ss2
  7. emacs-mule
  8. raw-text
  9. japanese-shift-jis (alias: shift_jis sjis)
  10. chinese-big5 (alias: big5 cn-big5)
  11. no-conversion
  12. mule-utf-8 (alias: utf-8)

  Other coding systems cannot be distinguished automatically
  from these, and therefore cannot be recognized automatically
  with the present coding system priorities.

  The following are decoded correctly but recognized as iso-2022-7bit-lock:
    iso-2022-7bit-ss2 iso-2022-7bit-lock-ss2 iso-2022-cn iso-2022-cn-ext
    iso-2022-jp-2 iso-2022-kr

Particular coding systems specified for certain file names:

  OPERATION    TARGET PATTERN        CODING SYSTEM(s)
  ---------    --------------        ----------------
  File I/O      "\\.elc\\'"             (emacs-mule . emacs-mule)
                "\\.utf\\(-8\\)?\\'"    utf-8
                "\\(\\`\\|/\\)loaddefs.el\\'"
                                        (raw-text . raw-text-unix)
                "\\.tar\\'"             (no-conversion . no-conversion)
                "\\.po[tx]?\\'\\|\\.po\\."
                                        po-find-file-coding-system
                ""                      find-buffer-file-type-coding-system
  Process I/O   "[cC][mM][dD][pP][rR][oO][xX][yY]"
                                        (undecided-dos . undecided-dos)
  Network I/O    nothing specified


* Re: shell-quote-argument and multibyte
  2003-04-15 17:52     ` Lars Hansen
@ 2003-04-16  2:00       ` Kenichi Handa
  2003-04-16  6:16         ` Lars Hansen
  0 siblings, 1 reply; 13+ messages in thread
From: Kenichi Handa @ 2003-04-16  2:00 UTC
  Cc: emacs-devel

In article <3E9C46C3.1010101@math.ku.dk>, Lars Hansen <larsh@math.ku.dk> writes:
>> Please show me the result of C-h C RET.
[...]
>   Process I/O   "[cC][mM][dD][pP][rR][oO][xX][yY]"
>                                         (undecided-dos . undecided-dos)

I don't know how an external program is called on Windows,
but if it's via "cmdproxy", this line may be the culprit.

Could you try this?
  (modify-coding-system-alist 'process "[cC][mM][dD][pP][rR][oO][xX][yY]"
     '(iso-latin-1-dos . iso-latin-1-dos))
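
One way to check which pair applies to a given program name is
find-operation-coding-system.  The sketch below let-binds
process-coding-system-alist to a single entry with the same pattern as
in the C-h C output, so it does not touch the real settings.  The
helper name my-cmdproxy-coding-lookup and the program name
"cmdproxy.exe" are assumptions for illustration.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-cmdproxy-coding-lookup (pair)
  "Return the coding pair looked up for cmdproxy with PAIR installed."
  (let ((process-coding-system-alist
         (list (cons "[cC][mM][dD][pP][rR][oO][xX][yY]" pair))))
    ;; For `call-process', the lookup target is the program name.
    (find-operation-coding-system 'call-process "cmdproxy.exe")))

;; The entry reported by C-h C, then the replacement suggested above.
(my-cmdproxy-coding-lookup '(undecided-dos . undecided-dos))
(my-cmdproxy-coding-lookup '(iso-latin-1-dos . iso-latin-1-dos))
```

In each call the lookup should hand back the pair that was installed
for the pattern, which is how a cmdproxy entry can take precedence
over default-process-coding-system.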

---
Ken'ichi HANDA
handa@m17n.org


* Re: shell-quote-argument and multibyte
  2003-04-16  2:00       ` Kenichi Handa
@ 2003-04-16  6:16         ` Lars Hansen
  2003-04-16  6:28           ` Kenichi Handa
  0 siblings, 1 reply; 13+ messages in thread
From: Lars Hansen @ 2003-04-16  6:16 UTC
  Cc: emacs-devel

>I don't know how an external program is called on Windows,
>but if it's via "cmdproxy", this line may be the culprit.

External programs are (by default) called via cmdproxy.

>Could you try this?
>  (modify-coding-system-alist 'process "[cC][mM][dD][pP][rR][oO][xX][yY]"
>     '(iso-latin-1-dos . iso-latin-1-dos))

It works!


* Re: shell-quote-argument and multibyte
  2003-04-16  6:16         ` Lars Hansen
@ 2003-04-16  6:28           ` Kenichi Handa
  0 siblings, 0 replies; 13+ messages in thread
From: Kenichi Handa @ 2003-04-16  6:28 UTC
  Cc: emacs-devel

In article <3E9CF532.1090505@math.ku.dk>, Lars Hansen <larsh@math.ku.dk> writes:
>> I don't know how an external program is called on Windows,
>> but if it's via "cmdproxy", this line may be the culprit.
> External programs are (by default) called via cmdproxy.

I see.

>> Could you try this?
>>   (modify-coding-system-alist 'process "[cC][mM][dD][pP][rR][oO][xX][yY]"
>>      '(iso-latin-1-dos . iso-latin-1-dos))
> It works!

Thank you for confirming that.

To Windows port maintainers:

Why should we treat "cmdproxy" specially on Windows?  Isn't
default-process-coding-system enough?

---
Ken'ichi HANDA
handa@m17n.org


* Re: shell-quote-argument and multibyte
  2003-04-15 13:16   ` Kenichi Handa
  2003-04-15 17:52     ` Lars Hansen
@ 2003-04-17 13:54     ` Benjamin Riefenstahl
  2003-04-17 19:35       ` Kai Großjohann
  1 sibling, 1 reply; 13+ messages in thread
From: Benjamin Riefenstahl @ 2003-04-17 13:54 UTC


Hi,


Kenichi Handa <handa@m17n.org> writes:
> By default, process arguments (including the file name in the above
> case) are encoded by:
>     (cdr default-process-coding-system)
> And usually, it is the same as default-file-name-coding-system.
> 
> So, if it doesn't work, it means that something is wrong in setting
> up coding systems on Windows.

Thanks for the clarification. 

> And, [encode-coding-string] shouldn't be done in
> shell-quote-argument.  Such an encoding should be done only for file
> names.

I understood that file names were the actual concern of the OP.  But
you are right, the encoding should probably be done outside of
shell-quote-argument.

Still the connections are not entirely clear to me.  I routinely
configure tools so that they output UTF-8, so the coding system for
I/O should be UTF-8.  But that doesn't change the fact that the file
name encoding is latin-1.

Commands like shell-quote-argument, call-process or
shell-command-to-string don't know which of their arguments are file
names.  While most of the command line arguments that are not file
names will not have non-ASCII characters (options), there are of
course arguments that are free text, like e.g. CVS log messages or
verbatim scripts.

That does mean that, to do the right thing, I will currently have to
encode the file names myself, right?  Like e.g. in this fragment,
which I use for reading Word documents with an external tool:

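      ;; Decode antiword's output as UTF-8, but encode the file name
      ;; argument with the file-name coding system.
      (let ((coding-system-for-read 'utf-8)
            (filename (encode-coding-string
                       buffer-file-name
                       (or file-name-coding-system
                           default-file-name-coding-system))))
        (call-process "antiword" nil t nil "-m" "UTF-8.txt"
                      filename))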

Or is there a way to simplify this kind of thing?


so long, benny


* Re: shell-quote-argument and multibyte
  2003-04-17 13:54     ` Benjamin Riefenstahl
@ 2003-04-17 19:35       ` Kai Großjohann
  2003-04-18 12:14         ` Benjamin Riefenstahl
  0 siblings, 1 reply; 13+ messages in thread
From: Kai Großjohann @ 2003-04-17 19:35 UTC


Benjamin Riefenstahl <Benjamin.Riefenstahl@epost.de> writes:

> That does mean that, to do the right thing, I will currently have to
> encode the file names myself, right?

I have the nagging feeling that there might be problems with this.
Are there encodings that use ESC as a special character?  I think so.

Now, suppose you have a string x which is a filename.  Then you
encode that string into the corresponding filename encoding, which
just happens to be an encoding which uses ESC.  And then you pass the
whole shebang to shell-quote-argument which will then happily escape
the ESC for you.

I'm not sure that this is the desired result.
-- 
file-error; Data: (Opening input file no such file or directory ~/.signature)


* Re: shell-quote-argument and multibyte
  2003-04-17 19:35       ` Kai Großjohann
@ 2003-04-18 12:14         ` Benjamin Riefenstahl
  2003-04-18 15:35           ` Kai Großjohann
  0 siblings, 1 reply; 13+ messages in thread
From: Benjamin Riefenstahl @ 2003-04-18 12:14 UTC


Hi Kai,


kai.grossjohann@gmx.net (Kai Großjohann) writes:
> Now, suppose you have a string x which is a filename.  Then you
> encode that string into the corresponding filename encoding, which
> just happens to be an encoding which uses ESC.  And then you pass
> the whole shebang to shell-quote-argument which will then happily
> escape the ESC for you.

I may be missing something, but why is ESC especially a problem?
Shouldn't shell-quote-argument do just exactly those modifications
that the shell will undo?  So that the combination of
shell-quote-argument and the actual parsing of the command-line in the
shell is a no-op?
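
A rough way to test that expectation is sketched below.  It assumes a
POSIX-style shell with a printf command on the PATH (so it does not
cover the Windows/cmdproxy case), and the helper name
my-shell-roundtrip is hypothetical.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-shell-roundtrip (arg)
  "Return ARG as a program sees it after the shell parses its quoted form."
  (shell-command-to-string
   ;; printf %s prints its argument back without adding a newline.
   (concat "printf %s " (shell-quote-argument arg))))

(my-shell-roundtrip "a b$c")   ; space and $ are shell-special
(my-shell-roundtrip "x\ey")    ; contains an ESC character
```

If quoting and shell parsing really cancel out, each call should
return its argument unchanged, ESC included.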


so long, benny


* Re: shell-quote-argument and multibyte
  2003-04-18 12:14         ` Benjamin Riefenstahl
@ 2003-04-18 15:35           ` Kai Großjohann
  0 siblings, 0 replies; 13+ messages in thread
From: Kai Großjohann @ 2003-04-18 15:35 UTC


Benjamin Riefenstahl <Benjamin.Riefenstahl@epost.de> writes:

> I may be missing something, but why is ESC especially a problem?

I've now been thinking about it some more, and it seems that I'm the
one who's missing something.

I'm confused now :-|

Sorry for the line noise.
-- 
file-error; Data: (Opening input file no such file or directory ~/.signature)


* Re: shell-quote-argument and multibyte
@ 2003-05-11 17:51 Lars Hansen
  0 siblings, 0 replies; 13+ messages in thread
From: Lars Hansen @ 2003-05-11 17:51 UTC


Did we reach a conclusion on how to solve this problem?

One way to remove the problem is not to treat cmdproxy specially.
Kenichi Handa asked:

>To Windows port maintainers:
>
>Why should we treat "cmdproxy" specially on Windows?  Isn't
>default-process-coding-system enough?
>
but we never got an answer.
