From: "Herbert Euler" <herberteuler@hotmail.com>
Subject: Re: Fcall_process: wrong conversion
Date: Tue, 16 May 2006 10:59:10 +0800
Message-ID: <BAY112-F7409CF46063B56C37BE1ADAA00@phx.gbl>
In-Reply-To: <k6z8xp3z31n.fsf-monnier+emacs@gnu.org>
This doesn't work. I've traced through the code, and the reason seems
to be as follows.
You changed the code in hexl.el to:
  (let ((coding-system-for-read 'raw-text)
        (coding-system-for-write buffer-file-coding-system)
        (buffer-undo-list t))
    (apply 'call-process-region (point-min) (point-max)
           (expand-file-name hexl-program exec-directory)
           t t nil
           ;; Manually encode the args, otherwise they're encoded using
           ;; coding-system-for-write (i.e. buffer-file-coding-system) which
           ;; may not be what we want (e.g. utf-16 on a non-utf-16 system).
           (mapcar (lambda (s) (encode-coding-string s locale-coding-system))
                   (split-string hexl-options)))
So when call-process is invoked, the value of `coding-system-for-write'
is not nil. In my test, it is `utf-16le-with-signature'. The part of
callproc.c that decides the coding system is lines 269 to 300:
    if (nargs >= 5)
      {
        int must_encode = 0;
        for (i = 4; i < nargs; i++)
          CHECK_STRING (args[i]);
        for (i = 4; i < nargs; i++)
          if (STRING_MULTIBYTE (args[i]))
            must_encode = 1;
        if (!NILP (Vcoding_system_for_write))
          val = Vcoding_system_for_write;
        else if (! must_encode)
          val = Qnil;
        else
          {
            args2 = (Lisp_Object *) alloca ((nargs + 1) * sizeof *args2);
            args2[0] = Qcall_process;
            for (i = 0; i < nargs; i++) args2[i + 1] = args[i];
            coding_systems = Ffind_operation_coding_system (nargs + 1, args2);
            if (CONSP (coding_systems))
              val = XCDR (coding_systems);
            else if (CONSP (Vdefault_process_coding_system))
              val = XCDR (Vdefault_process_coding_system);
            else
              val = Qnil;
          }
        val = coding_inherit_eol_type (val, Qnil);
        setup_coding_system (Fcheck_coding_system (val), &argument_coding);
      }
  }
If `Vcoding_system_for_write' is not nil, `val' is set to that value.
So at the last line of this code, the `detector', `decoder', and
`encoder' fields of `argument_coding' are set to the UTF-16 ones, and
the CODING_REQUIRE_ENCODING_MASK flag is turned on in `common_flags'
of `argument_coding'. This happens in coding.c, lines 5042 to 5059:
    else if (EQ (coding_type, Qutf_16))
      {
        val = AREF (attrs, coding_attr_utf_16_bom);
        CODING_UTF_16_BOM (coding) = (CONSP (val) ? utf_16_detect_bom
                                      : EQ (val, Qt) ? utf_16_with_bom
                                      : utf_16_without_bom);
        val = AREF (attrs, coding_attr_utf_16_endian);
        CODING_UTF_16_ENDIAN (coding) = (EQ (val, Qbig) ? utf_16_big_endian
                                         : utf_16_little_endian);
        CODING_UTF_16_SURROGATE (coding) = 0;
        coding->detector = detect_coding_utf_16;
        coding->decoder = decode_coding_utf_16;
        coding->encoder = encode_coding_utf_16;
        coding->common_flags
          |= (CODING_REQUIRE_DECODING_MASK | CODING_REQUIRE_ENCODING_MASK);
        if (CODING_UTF_16_BOM (coding) == utf_16_detect_bom)
          coding->common_flags |= CODING_REQUIRE_DETECTION_MASK;
      }
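To summarize the selection logic in the callproc.c excerpt (lines 269
to 300), here is a minimal standalone sketch of the branch order. It is
only a model: `choose_argument_coding', the `coding_name' type, and the
use of NULL for nil are names made up for this illustration, not Emacs
code, and the `operation_default' parameter stands in for whatever
`Ffind_operation_coding_system' or `Vdefault_process_coding_system'
would yield.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a Lisp coding-system symbol.
   NULL plays the role of nil in this sketch.  */
typedef const char *coding_name;

/* Simplified model of the branch order in the quoted callproc.c code.
   This is an illustration, not the Emacs implementation.  */
static coding_name
choose_argument_coding (coding_name coding_system_for_write,
                        bool any_arg_multibyte,
                        coding_name operation_default)
{
  /* Checked first: a non-nil coding-system-for-write is used
     whether or not any argument is multibyte.  */
  if (coding_system_for_write != NULL)
    return coding_system_for_write;

  /* All arguments are unibyte: no coding system is selected.  */
  if (!any_arg_multibyte)
    return NULL;

  /* Otherwise fall back to the per-operation or default setting.  */
  return operation_default;
}
```

With `coding_system_for_write' set to "utf-16le-with-signature", the
sketch returns that name even when `any_arg_multibyte' is false, which
is the situation hexl.el creates by binding the variable in `let'.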
Now go back to callproc.c, lines 410 to 427:
    if (nargs > 4)
      {
        register int i;
        struct gcpro gcpro1, gcpro2, gcpro3;
        GCPRO3 (infile, buffer, current_dir);
        argument_coding.dst_multibyte = 0;
        for (i = 4; i < nargs; i++)
          {
            argument_coding.src_multibyte = STRING_MULTIBYTE (args[i]);
            if (CODING_REQUIRE_ENCODING (&argument_coding))
              /* We must encode this argument.  */
              args[i] = encode_coding_string (&argument_coding, args[i], 1);
            new_argv[i - 3] = SDATA (args[i]);
          }
        UNGCPRO;
        new_argv[nargs - 3] = 0;
      }
`CODING_REQUIRE_ENCODING' tests the following (lines 491 to 496,
coding.h):
  /* Return 1 if the coding context CODING requires code conversion on
     encoding.  */
  #define CODING_REQUIRE_ENCODING(coding)                        \
    ((coding)->src_multibyte                                     \
     || (coding)->common_flags & CODING_REQUIRE_ENCODING_MASK    \
     || (coding)->mode & CODING_MODE_SELECTIVE_DISPLAY)
Although `argument_coding.src_multibyte' may be 0,
`argument_coding.common_flags & CODING_REQUIRE_ENCODING_MASK' must be
non-zero in this case. So `CODING_REQUIRE_ENCODING
(&argument_coding)' will return true.
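The evaluation above can be checked with a small model of the macro.
This is a sketch under assumptions: the struct `sketch_coding', the
function `sketch_require_encoding', and the two mask values are
invented for the illustration (the real definitions are in coding.h);
only the three-way OR mirrors the quoted macro.

```c
#include <stdbool.h>

/* Hypothetical mask values, chosen only for this sketch.  */
#define SKETCH_REQUIRE_ENCODING_MASK  0x01u
#define SKETCH_MODE_SELECTIVE_DISPLAY 0x02u

/* Reduced stand-in for the fields the macro looks at.  */
struct sketch_coding
{
  bool src_multibyte;
  unsigned common_flags;
  unsigned mode;
};

/* Same three-way OR as the quoted CODING_REQUIRE_ENCODING macro.  */
static bool
sketch_require_encoding (const struct sketch_coding *coding)
{
  return coding->src_multibyte
         || (coding->common_flags & SKETCH_REQUIRE_ENCODING_MASK) != 0
         || (coding->mode & SKETCH_MODE_SELECTIVE_DISPLAY) != 0;
}
```

For a unibyte argument (`src_multibyte' false) and a coding context
whose `common_flags' has the encoding mask set, the function still
returns true; it returns false only when all three conditions are
false.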
As a result, encoding the arguments beforehand with
`encode-coding-string', as your change does, does not prevent the
conversion done by `call-process': the already-encoded arguments are
encoded again with `coding-system-for-write'. Perhaps we should not
bind `coding-system-for-write' in the `let' special form in such cases.
There is another problem: if `locale-coding-system' is UTF-16, is it
correct to add the prefix "\377\376" or "\376\377" to every command
argument? If not, the current code of `call-process' is wrong, since
it always adds the prefix.
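To make the prefix concrete, here is a small sketch of what a
UTF-16LE-with-signature encoding does to one ASCII argument. The
function names and the sample argument "-hex" are assumptions for the
illustration; this is not the Emacs encoder, just the byte layout of
UTF-16LE with a byte order mark.

```c
#include <stddef.h>
#include <string.h>

/* Encode an ASCII string as UTF-16LE preceded by the byte order mark
   FF FE ("\377\376" in octal).  Returns the number of bytes written,
   or 0 if OUT is too small.  Hypothetical helper for this sketch.  */
static size_t
sketch_encode_utf16le_bom (const char *ascii, unsigned char *out,
                           size_t out_size)
{
  size_t len = strlen (ascii);
  size_t needed = 2 + 2 * len;

  if (needed > out_size)
    return 0;
  out[0] = 0xFF;
  out[1] = 0xFE;
  for (size_t i = 0; i < len; i++)
    {
      out[2 + 2 * i] = (unsigned char) ascii[i]; /* low byte first */
      out[3 + 2 * i] = 0x00;                     /* high byte is 0 for ASCII */
    }
  return needed;
}

/* Number of bytes a program sees when BYTES is passed as a
   NUL-terminated argv entry: everything up to the first zero byte.  */
static size_t
sketch_visible_argv_length (const unsigned char *bytes, size_t size)
{
  const unsigned char *nul = memchr (bytes, 0, size);
  return nul ? (size_t) (nul - bytes) : size;
}
```

Encoding "-hex" this way gives the 10 bytes FF FE 2D 00 68 00 65 00
78 00. Since argv entries are NUL-terminated C strings, a program
receiving this argument would see only the first 3 bytes, that is, the
prefix followed by "-".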
Regards,
Guanpeng Xu
>From: Stefan Monnier <monnier@iro.umontreal.ca>
>To: "Herbert Euler" <herberteuler@hotmail.com>
>CC: emacs-devel@gnu.org
>Subject: Re: Fcall_process: wrong conversion
>Date: Mon, 15 May 2006 12:06:48 -0400
>
> > - Create a file contains UTF-16 text, either UTF-16BE or UTF-16LE
> > is OK. For example, create a file contains "a" in UTF-16LE as
> > its content and name this file with "1".
>[...]
> > - In case the buffer is encoded with utf-16-le, the content is
> > displayed as "a". Type M-x hexl-mode RET, the result is
>
> > \377?: Invalid argument
>
> > displayed in the buffer.
>
>Thanks. I've installed the patch below which should fix the problem.
>Please confirm,
>
>
> Stefan
>
>
>--- hexl.el 11 avr 2006 12:45:49 -0400 1.103
>+++ hexl.el 15 mai 2006 12:02:32 -0400
>@@ -704,7 +704,12 @@
> (buffer-undo-list t))
> (apply 'call-process-region (point-min) (point-max)
> (expand-file-name hexl-program exec-directory)
>- t t nil (split-string hexl-options))
>+ t t nil
>+ ;; Manually encode the args, otherwise they're encoded using
>+ ;; coding-system-for-write (i.e. buffer-file-coding-system) which
>+ ;; may not be what we want (e.g. utf-16 on a non-utf-16 system).
>+ (mapcar (lambda (s) (encode-coding-string s locale-coding-system))
>+ (split-string hexl-options)))
> (if (> (point) (hexl-address-to-marker hexl-max-address))
> (hexl-goto-address hexl-max-address))))
>
>
>