From: "Herbert Euler"
Newsgroups: gmane.emacs.devel
Subject: Re: Fcall_process: wrong conversion
Date: Tue, 16 May 2006 10:59:10 +0800

This doesn't work.  I've traced through the code, and the reason
seems to be as follows.

You changed the code in hexl.el to:

    (let ((coding-system-for-read 'raw-text)
          (coding-system-for-write buffer-file-coding-system)
          (buffer-undo-list t))
      (apply 'call-process-region (point-min) (point-max)
             (expand-file-name hexl-program exec-directory)
             t t nil
             ;; Manually encode the args, otherwise they're encoded using
             ;; coding-system-for-write (i.e. buffer-file-coding-system) which
             ;; may not be what we want (e.g. utf-16 on a non-utf-16 system).
             (mapcar (lambda (s) (encode-coding-string s locale-coding-system))
                     (split-string hexl-options)))

So when call-process is invoked, the value of
`coding-system-for-write' is not nil.  In my test, it is
`utf-16le-with-signature'.  The part of callproc.c that decides the
coding system is lines 269 to 300:

      if (nargs >= 5)
        {
          int must_encode = 0;

          for (i = 4; i < nargs; i++)
            CHECK_STRING (args[i]);

          for (i = 4; i < nargs; i++)
            if (STRING_MULTIBYTE (args[i]))
              must_encode = 1;

          if (!NILP (Vcoding_system_for_write))
            val = Vcoding_system_for_write;
          else if (! must_encode)
            val = Qnil;
          else
            {
              args2 = (Lisp_Object *) alloca ((nargs + 1) * sizeof *args2);
              args2[0] = Qcall_process;
              for (i = 0; i < nargs; i++) args2[i + 1] = args[i];
              coding_systems = Ffind_operation_coding_system (nargs + 1, args2);
              if (CONSP (coding_systems))
                val = XCDR (coding_systems);
              else if (CONSP (Vdefault_process_coding_system))
                val = XCDR (Vdefault_process_coding_system);
              else
                val = Qnil;
            }
          val = coding_inherit_eol_type (val, Qnil);
          setup_coding_system (Fcheck_coding_system (val), &argument_coding);
        }
      }

If `Vcoding_system_for_write' is not nil, `val' is set to that value.
So at the last line of this code, the `detector', `decoder', and
`encoder' fields of `argument_coding' are set to the UTF-16 ones, and
the CODING_REQUIRE_ENCODING_MASK flag is turned on in `common_flags'
of `argument_coding'.  This happens in coding.c, lines 5042 to 5059:

      else if (EQ (coding_type, Qutf_16))
        {
          val = AREF (attrs, coding_attr_utf_16_bom);
          CODING_UTF_16_BOM (coding) = (CONSP (val) ? utf_16_detect_bom
                                        : EQ (val, Qt) ? utf_16_with_bom
                                        : utf_16_without_bom);
          val = AREF (attrs, coding_attr_utf_16_endian);
          CODING_UTF_16_ENDIAN (coding) = (EQ (val, Qbig) ? utf_16_big_endian
                                           : utf_16_little_endian);
          CODING_UTF_16_SURROGATE (coding) = 0;
          coding->detector = detect_coding_utf_16;
          coding->decoder = decode_coding_utf_16;
          coding->encoder = encode_coding_utf_16;
          coding->common_flags
            |= (CODING_REQUIRE_DECODING_MASK | CODING_REQUIRE_ENCODING_MASK);
          if (CODING_UTF_16_BOM (coding) == utf_16_detect_bom)
            coding->common_flags |= CODING_REQUIRE_DETECTION_MASK;
        }

Now go back to lines 410 to 427 of callproc.c:

      if (nargs > 4)
        {
          register int i;
          struct gcpro gcpro1, gcpro2, gcpro3;

          GCPRO3 (infile, buffer, current_dir);
          argument_coding.dst_multibyte = 0;
          for (i = 4; i < nargs; i++)
            {
              argument_coding.src_multibyte = STRING_MULTIBYTE (args[i]);
              if (CODING_REQUIRE_ENCODING (&argument_coding))
                /* We must encode this argument.  */
                args[i] = encode_coding_string (&argument_coding, args[i], 1);
              new_argv[i - 3] = SDATA (args[i]);
            }
          UNGCPRO;
          new_argv[nargs - 3] = 0;
        }

`CODING_REQUIRE_ENCODING' tests the following (lines 491 to 496,
coding.h):

    /* Return 1 if the coding context CODING requires code conversion on
       encoding.  */
    #define CODING_REQUIRE_ENCODING(coding)                         \
      ((coding)->src_multibyte                                      \
       || (coding)->common_flags & CODING_REQUIRE_ENCODING_MASK     \
       || (coding)->mode & CODING_MODE_SELECTIVE_DISPLAY)

Although `argument_coding.src_multibyte' may be 0,
`argument_coding.common_flags & CODING_REQUIRE_ENCODING_MASK' must be
non-zero in this case.  So `CODING_REQUIRE_ENCODING
(&argument_coding)' returns true.  As a result, whether or not the
arguments have already been encoded with `encode-coding-string', as
in your change, does not affect the conversion done by
`call-process': they are encoded again with UTF-16.  Perhaps we
should not bind `coding-system-for-write' in the `let' form in such
cases.

And there is another problem: if `locale-coding-system' is UTF-16, is
it correct to add the prefix "\377\376" or "\376\377" to every
command argument?  If not, the current code of `call-process' is
wrong, since it always adds the prefix.

Regards,
Guanpeng Xu

>From: Stefan Monnier
>To: "Herbert Euler"
>CC: emacs-devel@gnu.org
>Subject: Re: Fcall_process: wrong conversion
>Date: Mon, 15 May 2006 12:06:48 -0400
>
> > - Create a file contains UTF-16 text, either UTF-16BE or UTF-16LE
> >   is OK.  For example, create a file contains "a" in UTF-16LE as
> >   its content and name this file with "1".
>[...]
> > - In case the buffer is encoded with utf-16-le, the content is
> >   displayed as "a".  Type M-x hexl-mode RET, the result is
> >
> >     \377?: Invalid argument
> >
> >   displayed in the buffer.
>
>Thanks.  I've installed the patch below which should fix the problem.
>Please confirm,
>
>
>        Stefan
>
>
>--- hexl.el	11 avr 2006 12:45:49 -0400	1.103
>+++ hexl.el	15 mai 2006 12:02:32 -0400
>@@ -704,7 +704,12 @@
>          (buffer-undo-list t))
>      (apply 'call-process-region (point-min) (point-max)
>             (expand-file-name hexl-program exec-directory)
>-            t t nil (split-string hexl-options))
>+            t t nil
>+            ;; Manually encode the args, otherwise they're encoded using
>+            ;; coding-system-for-write (i.e. buffer-file-coding-system) which
>+            ;; may not be what we want (e.g. utf-16 on a non-utf-16 system).
>+            (mapcar (lambda (s) (encode-coding-string s locale-coding-system))
>+                    (split-string hexl-options)))
>      (if (> (point) (hexl-address-to-marker hexl-max-address))
>          (hexl-goto-address hexl-max-address))))
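
P.S.  To make the point about `CODING_REQUIRE_ENCODING' easier to
check outside Emacs, here is a small stand-alone model of the test.
It is only a sketch: the struct, the function names, and the mask
values are made up for illustration (the real definitions are in
coding.h), and only the shape of the condition is taken from the
macro discussed in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real masks are defined in coding.h
   and may have different values.  */
#define MODEL_REQUIRE_ENCODING_MASK 0x01u
#define MODEL_SELECTIVE_DISPLAY     0x02u

/* Only the three fields that the macro inspects.  */
struct model_coding
{
  int src_multibyte;
  unsigned common_flags;
  unsigned mode;
};

/* Same condition as CODING_REQUIRE_ENCODING: any one of the three
   terms being non-zero is enough to request encoding.  */
int
model_require_encoding (const struct model_coding *coding)
{
  assert (coding != NULL);
  return (coding->src_multibyte
          || (coding->common_flags & MODEL_REQUIRE_ENCODING_MASK)
          || (coding->mode & MODEL_SELECTIVE_DISPLAY));
}

/* Return 1 if a unibyte argument (src_multibyte == 0), such as one
   already passed through encode-coding-string, would be encoded
   again under the given common_flags.  */
int
unibyte_arg_reencoded (unsigned common_flags)
{
  struct model_coding coding = { 0, common_flags, 0 };
  return model_require_encoding (&coding);
}
```

With the mask set, as the UTF-16 setup does, the unibyte argument is
still reported as needing encoding; with no flags set it is left
alone.  That is why pre-encoding the arguments in hexl.el cannot by
itself prevent the second conversion.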