From: "Herbert Euler" <herberteuler@hotmail.com>
Cc: emacs-devel@gnu.org
Subject: Re: Fcall_process: wrong conversion
Date: Mon, 15 May 2006 23:17:06 +0800
Message-ID: <BAY112-F23A14F71F826D50ED045A5DAA30@phx.gbl>
In-Reply-To: <k6z64k71ihm.fsf-monnier+emacs@gnu.org>
I followed these steps:
- Create a file containing UTF-16 text; either UTF-16BE or UTF-16LE
is OK. For example, create a file named "1" whose content is "a"
in UTF-16LE.
- Visit file "1" with C-x C-f.
A UTF-16 file can be interpreted either as UTF-16 text or as raw
bytes (ASCII text mixed with non-ASCII bytes). Interpreted as
UTF-16LE, the content of file "1" is "a"; interpreted as raw bytes,
it is "\377\376a^@", where "\377\376" is the byte order mark that
indicates UTF-16LE, and "a" is represented as "a^@" (^@ is \0 here).
If for some reason Emacs doesn't visit the file with the correct
encoding, one can type C-x RET r followed by the correct encoding and
RET to fix it.
- If the buffer's coding system is raw-text-unix, the content is
displayed as "\377\376a^@". Type M-x hexl-mode RET, and the correct
result is displayed (not shown here, since it's easy to
reproduce).
- If the buffer's coding system is utf-16-le, the content is
displayed as "a". Type M-x hexl-mode RET, and the message
\377?: Invalid argument
is displayed in the buffer.
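To make the two readings of file "1" concrete, here is a minimal Python sketch (Python is used only for illustration; it is not part of Emacs or hexl). It builds the bytes of a file holding "a" in UTF-16LE with a byte order mark, then prints them both as UTF-16 text and as raw bytes. The helper `show_raw` is a hypothetical name, written to mimic the "\377\376a^@" notation used above.

```python
import codecs

# Bytes of file "1": the UTF-16LE byte order mark, then "a" in UTF-16LE.
file_bytes = codecs.BOM_UTF16_LE + "a".encode("utf-16-le")

# Reading 1: decode as UTF-16. The "utf-16" codec consumes the BOM
# and uses it to choose the byte order.
as_utf16 = file_bytes.decode("utf-16")


def show_raw(data: bytes) -> str:
    """Render bytes with octal escapes, writing NUL as ^@ (hypothetical helper)."""
    out = []
    for b in data:
        if b == 0:
            out.append("^@")
        elif 32 <= b < 127:
            out.append(chr(b))
        else:
            out.append("\\%03o" % b)
    return "".join(out)


# Reading 2: the same bytes shown one by one, as a raw-text buffer would.
print(repr(as_utf16))
print(show_raw(file_bytes))
```

The first line printed is the one-character string, the second is the four-byte raw form with the byte order mark in front.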
The error appears because hexl-mode does its job as follows:
1. Store the buffer content in a temporary file.
2. Invoke "hexl" with the argument "-hex" and stdin set to the
temporary file, and put its output into the same buffer. This
is done by calling `call-process-region' (and thus
`call-process').
3. Process the output to generate the final result.
When the buffer's coding system is raw-text-unix, the code of
`Fcall_process' in callproc.c shown in my last mail does not convert
the argument "-hex", so the command actually invoked is "hexl
-hex". But if the buffer's coding system is utf-16-le, "-hex" is
converted to "\377\376-^@h^@e^@x^@", so the command invoked is
"hexl \377\376-^@h^@e^@x^@". Since "^@" is actually '\0', which
terminates a C string, "hexl" sees "\377\376-" as its first
argument. That's why the buffer shows an error message in the second
case. As a result, the subsequent code in hexl-mode cannot process
the (wrong) output correctly.
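The truncation described above can be reproduced outside Emacs. The following Python sketch is only an illustration of the byte-level effect, not the code in callproc.c: `encode_argument` and `as_c_string` are hypothetical helpers, the first assuming the argument is encoded as UTF-16LE with a leading byte order mark (as the message reports), the second modelling how a C program reads a NUL-terminated argument.

```python
import codecs


def encode_argument(arg: str) -> bytes:
    """Encode arg as UTF-16LE preceded by a BOM (assumed to match what
    the message describes for the utf-16-le coding system)."""
    return codecs.BOM_UTF16_LE + arg.encode("utf-16-le")


def as_c_string(data: bytes) -> bytes:
    """Return what a C program sees: the bytes before the first NUL."""
    return data.split(b"\x00", 1)[0]


encoded = encode_argument("-hex")
seen = as_c_string(encoded)

# The full encoded argument, then the part that survives as a C string.
print(encoded)
print(seen)

# An argument left unconverted (the raw-text-unix case) passes through whole.
print(as_c_string("-hex".encode("ascii")))
```

Every ASCII character in UTF-16LE is followed by a zero byte, so the C-string view stops right after "-", which is why "hexl" never sees "-hex".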
I hope I've described this clearly.
Regards,
Guanpeng Xu
>From: Stefan Monnier <monnier@iro.umontreal.ca>
>To: "Herbert Euler" <herberteuler@hotmail.com>
>CC: emacs-devel@gnu.org
>Subject: Re: Fcall_process: wrong conversion
>Date: Mon, 15 May 2006 10:25:27 -0400
>
> > Fcall_process in callproc.c, which is correspond to `call-process',
> > cannot handle UTF-16 (both LE or BE) correctly. Take a look at line
>
>Actually, it handles it just fine. The problem is that call-process and
>start-process both use the same coding system to encode arguments and to
>encode the data sent via stdin to the process, whereas you want them to
>be distinct.
>If you want them to be distinct, then you need to manually encode your
>arguments before passing them to call-process.
>
>I.e. the bug with hexl-mode is in hexl.el. Please report it separately
>indicating how to reproduce the problem (I don't know how to "applying
>`hexl-mode' to UTF-16 texts").
>
>
> Stefan