* bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark
@ 2009-08-07  8:50 Pierre Bogossian
  2009-08-08 12:20 ` Eli Zaretskii
  0 siblings, 1 reply; 19+ messages in thread
From: Pierre Bogossian @ 2009-08-07  8:50 UTC (permalink / raw)
  To: Eli Zaretskii, Pierre Bogossian, 4047

>What is the value of buffer-file-coding-system before you enter
>hexl-mode?

It can be utf-8-with-signature-dos or utf-8-with-signature-unix,
depending on the type of end-of-line used by the file.

>[...] does it help to say
>"C-x RET f utf-8-with-signature RET" before entering hexl-mode?

No, but forcing the coding system of any buffer to utf-8-with-signature
using this command and then entering hexl-mode is enough to trigger
the error. I can even reproduce it with a blank scratch buffer.

>> Unfortunately I can't test a unix version at the moment.
>
>Which means your OS is what?

Windows XP SP3.

^ permalink raw reply	[flat|nested] 19+ messages in thread
* bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark
  2009-08-07  8:50 bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark Pierre Bogossian
@ 2009-08-08 12:20 ` Eli Zaretskii
  2009-08-08 13:22   ` Eli Zaretskii
  0 siblings, 1 reply; 19+ messages in thread
From: Eli Zaretskii @ 2009-08-08 12:20 UTC (permalink / raw)
  To: Pierre Bogossian, Kenichi Handa; +Cc: 4047

> From: "Pierre Bogossian" <bogossian@mail.com>
> Date: Fri, 7 Aug 2009 09:50:54 +0100
>
> >[...] does it help to say
> >"C-x RET f utf-8-with-signature RET" before entering hexl-mode?
>
> No, but forcing the coding system of any buffer to utf-8-with-signature
> using this command and then entering hexl-mode is enough to trigger
> the error. I can even reproduce it with a blank scratch buffer.
>
> >> Unfortunately I can't test a unix version at the moment.
> >
> >Which means your OS is what?
>
> Windows XP SP3.

The problem happens on GNU/Linux as well.

I think I've identified why the problem happens, but I need help in
finding the right solution. Handa-san, can you please comment on
what's below? Of course, others are welcome to comment as well.

The cause of the problem is this: hexlify-buffer must bind
coding-system-for-write to the buffer's encoding, to force
call-process-region to use the buffer's encoding when writing the text
to the temporary file. OTOH, it needs to avoid encoding the arguments
passed to the `hexl' program by the buffer's encoding, because that
could be inappropriate for encoding command lines on the underlying
system. However, call-process-region normally uses
coding-system-for-write, if it is non-nil, to encode the arguments as
well. To resolve this contradiction, hexlify-buffer encodes the
arguments manually (by locale-coding-system), assuming that, being
unibyte strings after that encoding, they will not be encoded by
call-process-region.

But call-process (called by call-process-region) does this:

  /* If arguments are supplied, we may have to encode them.  */
  if (nargs >= 5)
    {
      int must_encode = 0;
      Lisp_Object coding_attrs;

      for (i = 4; i < nargs; i++)
        CHECK_STRING (args[i]);

      for (i = 4; i < nargs; i++)
        if (STRING_MULTIBYTE (args[i]))
          must_encode = 1;

      if (!NILP (Vcoding_system_for_write))
        val = Vcoding_system_for_write;
      else if (! must_encode)
        val = Qnil;
      else
        {
          args2 = (Lisp_Object *) alloca ((nargs + 1) * sizeof *args2);
          args2[0] = Qcall_process;
          for (i = 0; i < nargs; i++) args2[i + 1] = args[i];
          coding_systems = Ffind_operation_coding_system (nargs + 1, args2);

First, if coding-system-for-write is non-nil, it is used, even if none
of the argument strings is a multibyte string. (This particular bug
can easily be solved by making the test for must_encode before we test
that coding-system-for-write is non-nil, but I'm not sure this is the
right solution because other arguments could be multibyte strings,
which will still cause us to use coding-system-for-write for _all_
arguments.)

And second, this fragment, which actually encodes the arguments,
further down in call-process:

  if (nargs > 4)
    {
      register int i;
      struct gcpro gcpro1, gcpro2, gcpro3, gcpro4, gcpro5;

      GCPRO5 (infile, buffer, current_dir, path, error_file);
      argument_coding.dst_multibyte = 0;
      for (i = 4; i < nargs; i++)
        {
          argument_coding.src_multibyte = STRING_MULTIBYTE (args[i]);
          if (CODING_REQUIRE_ENCODING (&argument_coding))
            /* We must encode this argument.  */
            args[i] = encode_coding_string (&argument_coding, args[i], 1);
        }

encodes the argument even though argument_coding.src_multibyte is
zero. Is encode_coding_string supposed to encode unibyte strings?

^ permalink raw reply	[flat|nested] 19+ messages in thread
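The precedence described in the message above can be illustrated with a small, self-contained C sketch. This is a toy model and not Emacs source: `struct toy_arg`, `pick_arg_coding`, and the use of plain C strings as coding-system names are all hypothetical, and `fallback` merely stands in for whatever find-operation-coding-system would return.

```c
#include <stddef.h>

/* Hypothetical stand-in for a Lisp string: its bytes plus a flag
   saying whether it is a multibyte string.  */
struct toy_arg
{
  const char *data;
  int multibyte;
};

/* Toy model of how call-process chooses the coding system for its
   command-line arguments, following the quoted fragment.
   OVERRIDE models coding-system-for-write (NULL means nil).
   FALLBACK models the result of find-operation-coding-system.
   Returns the coding system name, or NULL for "do not encode".  */
const char *
pick_arg_coding (const char *override, const char *fallback,
                 const struct toy_arg *args, size_t nargs)
{
  int must_encode = 0;
  size_t i;

  for (i = 0; i < nargs; i++)
    if (args[i].multibyte)
      must_encode = 1;

  if (override != NULL)
    return override;    /* used even when no argument is multibyte */
  else if (!must_encode)
    return NULL;
  else
    return fallback;
}
```

With a single unibyte argument such as "-hex", the sketch returns the override whenever one is set, which is the behavior hexlify-buffer did not expect; with no override it returns NULL and the argument is left alone.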
* bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark 2009-08-08 12:20 ` Eli Zaretskii @ 2009-08-08 13:22 ` Eli Zaretskii 2009-08-08 14:29 ` Andreas Schwab 0 siblings, 1 reply; 19+ messages in thread From: Eli Zaretskii @ 2009-08-08 13:22 UTC (permalink / raw) To: 4047; +Cc: bogossian > Date: Sat, 08 Aug 2009 15:20:10 +0300 > From: Eli Zaretskii <eliz@gnu.org> > Cc: 4047@emacsbugs.donarmstrong.com > > The cause of the problem is this: [...] I probably should have said explicitly that the end result of all I described is that the "-hex" command-line argument to `hexl' is encoded by utf-8-with-signature, and becomes "\357\273\277-hex", which, of course, utterly confuses `hexl'. Btw, I doubt that any encoding that uses BOM can ever be appropriate for encoding command-line arguments. Maybe we should treat them specially in call-process and its ilk. ^ permalink raw reply [flat|nested] 19+ messages in thread
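As a minimal illustration of the effect described in the message above, the following C sketch prepends the UTF-8 signature (bytes EF BB BF, which are \357\273\277 in octal) to an ASCII argument. The function name `encode_with_signature` is hypothetical; it only models what a signature-prepending coding system does to a short ASCII string and is not how Emacs implements encoding.

```c
#include <stddef.h>
#include <string.h>

/* The UTF-8 signature (byte-order mark): EF BB BF in hex,
   \357\273\277 in octal.  */
static const unsigned char utf8_bom[3] = { 0xEF, 0xBB, 0xBF };

/* Toy model: "encode" the ASCII string ARG by prepending the UTF-8
   signature.  Writes the result, NUL-terminated, into OUT, which has
   room for CAP bytes.  Returns the number of bytes written, not
   counting the NUL, or 0 if OUT is too small.  */
size_t
encode_with_signature (const char *arg, unsigned char *out, size_t cap)
{
  size_t len = strlen (arg);

  if (cap < sizeof utf8_bom + len + 1)
    return 0;
  memcpy (out, utf8_bom, sizeof utf8_bom);
  memcpy (out + sizeof utf8_bom, arg, len + 1);   /* copies the NUL too */
  return sizeof utf8_bom + len;
}
```

Calling it on "-hex" yields a 7-byte string that no longer starts with '-', so a program parsing its command line would not recognize it as an option.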
* bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark 2009-08-08 13:22 ` Eli Zaretskii @ 2009-08-08 14:29 ` Andreas Schwab 2009-08-08 15:29 ` Eli Zaretskii 2009-08-10 19:45 ` Stefan Monnier 0 siblings, 2 replies; 19+ messages in thread From: Andreas Schwab @ 2009-08-08 14:29 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 4047, bogossian Eli Zaretskii <eliz@gnu.org> writes: > Btw, I doubt that any encoding that uses BOM can ever be appropriate > for encoding command-line arguments. Maybe we should treat them > specially in call-process and its ilk. The bug is that hexlify-buffer assumes that manually encoding the command line stops call-process from encoding it again, which does not work: coding-system-for-write takes absolute precedence. IMHO call-process should not use coding-system-for-write for encoding the command line, if at all there should be a separate override. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark 2009-08-08 14:29 ` Andreas Schwab @ 2009-08-08 15:29 ` Eli Zaretskii 2009-08-08 15:47 ` Andreas Schwab 2009-08-08 15:56 ` Lennart Borgman 2009-08-10 19:45 ` Stefan Monnier 1 sibling, 2 replies; 19+ messages in thread From: Eli Zaretskii @ 2009-08-08 15:29 UTC (permalink / raw) To: Andreas Schwab; +Cc: 4047, bogossian > From: Andreas Schwab <schwab@linux-m68k.org> > Cc: 4047@emacsbugs.donarmstrong.com, bogossian@mail.com > Date: Sat, 08 Aug 2009 16:29:31 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Btw, I doubt that any encoding that uses BOM can ever be appropriate > > for encoding command-line arguments. Maybe we should treat them > > specially in call-process and its ilk. > > The bug is that hexlify-buffer assumes that manually encoding the > command line stops call-process from encoding it again, which does not > work: coding-system-for-write takes absolute precedence. If encode_coding_string would leave unibyte strings alone (as I think it should, unless there's a good reason not to), the absolute precedence you mention would not matter. Or, if there _is_ a good reason for encode_coding_string's current behavior, we could avoid encoding unibyte strings in the command-line arguments (although admittedly that would be a kludge). > IMHO call-process should not use coding-system-for-write for > encoding the command line But if some of the command-line arguments are file names, say, we do need to encode them, don't we? > if at all there should be a separate override. That'd be fine by me, if there's no better alternative. ^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark 2009-08-08 15:29 ` Eli Zaretskii @ 2009-08-08 15:47 ` Andreas Schwab 2009-08-08 17:24 ` Eli Zaretskii 2009-08-08 15:56 ` Lennart Borgman 1 sibling, 1 reply; 19+ messages in thread From: Andreas Schwab @ 2009-08-08 15:47 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 4047, bogossian Eli Zaretskii <eliz@gnu.org> writes: > If encode_coding_string would leave unibyte strings alone It does if coding-system-for-write is nil. >> IMHO call-process should not use coding-system-for-write for >> encoding the command line > > But if some of the command-line arguments are file names, say, we do > need to encode them, don't we? coding-system-for-write is meant to override the coding system for write operations, but IMHO the coding system for file names is in a different category. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark 2009-08-08 15:47 ` Andreas Schwab @ 2009-08-08 17:24 ` Eli Zaretskii 2009-08-08 17:57 ` Lennart Borgman 0 siblings, 1 reply; 19+ messages in thread From: Eli Zaretskii @ 2009-08-08 17:24 UTC (permalink / raw) To: Andreas Schwab; +Cc: 4047, bogossian > From: Andreas Schwab <schwab@linux-m68k.org> > Cc: 4047@emacsbugs.donarmstrong.com, bogossian@mail.com > Date: Sat, 08 Aug 2009 17:47:27 +0200 > > coding-system-for-write is meant to override the coding system for write > operations, but IMHO the coding system for file names is in a different > category. What about strings passed to Grep? ^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark 2009-08-08 17:24 ` Eli Zaretskii @ 2009-08-08 17:57 ` Lennart Borgman 0 siblings, 0 replies; 19+ messages in thread From: Lennart Borgman @ 2009-08-08 17:57 UTC (permalink / raw) To: Eli Zaretskii, 4047; +Cc: bogossian, Andreas Schwab On Sat, Aug 8, 2009 at 7:24 PM, Eli Zaretskii<eliz@gnu.org> wrote: >> From: Andreas Schwab <schwab@linux-m68k.org> >> Cc: 4047@emacsbugs.donarmstrong.com, bogossian@mail.com >> Date: Sat, 08 Aug 2009 17:47:27 +0200 >> >> coding-system-for-write is meant to override the coding system for write >> operations, but IMHO the coding system for file names is in a different >> category. > > What about strings passed to Grep? Or arg to any program? The required coding could be different than the file name coding. ^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark 2009-08-08 15:29 ` Eli Zaretskii 2009-08-08 15:47 ` Andreas Schwab @ 2009-08-08 15:56 ` Lennart Borgman 2009-08-08 17:25 ` Eli Zaretskii 1 sibling, 1 reply; 19+ messages in thread From: Lennart Borgman @ 2009-08-08 15:56 UTC (permalink / raw) To: Eli Zaretskii, 4047; +Cc: bogossian, Andreas Schwab On Sat, Aug 8, 2009 at 5:29 PM, Eli Zaretskii<eliz@gnu.org> wrote: > But if some of the command-line arguments are file names, say, we do > need to encode them, don't we? Could not different programs (the program arg to call-process) have different requirements? At least on w32 that seems to me to be the case. ^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark 2009-08-08 15:56 ` Lennart Borgman @ 2009-08-08 17:25 ` Eli Zaretskii 0 siblings, 0 replies; 19+ messages in thread From: Eli Zaretskii @ 2009-08-08 17:25 UTC (permalink / raw) To: Lennart Borgman; +Cc: bogossian, 4047, schwab > Date: Sat, 8 Aug 2009 17:56:21 +0200 > From: Lennart Borgman <lennart.borgman@gmail.com> > Cc: Andreas Schwab <schwab@linux-m68k.org>, bogossian@mail.com > > On Sat, Aug 8, 2009 at 5:29 PM, Eli Zaretskii<eliz@gnu.org> wrote: > > > But if some of the command-line arguments are file names, say, we do > > need to encode them, don't we? > > Could not different programs (the program arg to call-process) have > different requirements? Of course, they do. But the Lisp code that invokes them should know what it is doing. ^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark 2009-08-08 14:29 ` Andreas Schwab 2009-08-08 15:29 ` Eli Zaretskii @ 2009-08-10 19:45 ` Stefan Monnier 2009-08-11 0:51 ` Kenichi Handa 1 sibling, 1 reply; 19+ messages in thread From: Stefan Monnier @ 2009-08-10 19:45 UTC (permalink / raw) To: Andreas Schwab; +Cc: 4047, bogossian >> Btw, I doubt that any encoding that uses BOM can ever be appropriate >> for encoding command-line arguments. Maybe we should treat them >> specially in call-process and its ilk. > The bug is that hexlify-buffer assumes that manually encoding the > command line stops call-process from encoding it again, which does not > work: coding-system-for-write takes absolute precedence. IMHO > call-process should not use coding-system-for-write for encoding the > command line, if at all there should be a separate override. I believe we've bumped into this problem already in the past. To me, it's clear that call-process should be careful about coding arguments, since the coding-system to use may depend on the argument and/or the command, so in general the caller will want to specify explicitly some coding system for the arguments, including a different coding system for each argument. An override var might be a good idea, but it won't cater to the case where each arg requires a different encoding, so the most important thing is to make sure that unibyte args don't get re-encoded. Unless Handa objects, I'd recommend we change encode_coding_string to be a nop on unibyte strings (tho, we may want to let it obey EOL conversions). If there are good reasons not to do that, then Fcall_process should be changed to not call encode_coding_string on unibyte strings. Stefan ^ permalink raw reply [flat|nested] 19+ messages in thread
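The proposal in the message above, making the encoder a no-op on unibyte strings, can be sketched roughly as follows. This is a toy model under stated assumptions, not the actual change to encode_coding_string: `struct toy_str` and `toy_encode` are hypothetical names, and the "encoding" step is reduced to prepending a UTF-8 signature so that the difference between the two cases is visible.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a Lisp string.  */
struct toy_str
{
  const char *data;
  size_t len;
  int multibyte;
};

/* Toy encoder.  A unibyte string is assumed to be already encoded by
   the caller, so its bytes are copied unchanged.  A multibyte string
   is "encoded" by prepending the UTF-8 signature EF BB BF.
   Returns the number of bytes written to OUT, or 0 if OUT (capacity
   CAP) is too small.  */
size_t
toy_encode (const struct toy_str *in, unsigned char *out, size_t cap)
{
  static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };
  size_t n = 0;

  if (in->multibyte)
    {
      if (cap < sizeof bom + in->len)
        return 0;
      memcpy (out, bom, sizeof bom);
      n = sizeof bom;
    }
  else if (cap < in->len)
    return 0;

  memcpy (out + n, in->data, in->len);
  return n + in->len;
}
```

Under this model a caller can pre-encode each argument with whatever coding system suits it, and the later encoding pass leaves those unibyte results untouched, which also covers the case where each argument needs a different encoding.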
* bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark 2009-08-10 19:45 ` Stefan Monnier @ 2009-08-11 0:51 ` Kenichi Handa 2009-08-14 9:02 ` Eli Zaretskii 0 siblings, 1 reply; 19+ messages in thread From: Kenichi Handa @ 2009-08-11 0:51 UTC (permalink / raw) To: Stefan Monnier, 4047; +Cc: bogossian, 4047, schwab In article <jwvljlrlapn.fsf-monnier+emacsbugreports@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes: > Unless Handa objects, I'd recommend we change encode_coding_string to be > a nop on unibyte strings (tho, we may want to let it obey EOL > conversions). I don't object to that change. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark 2009-08-11 0:51 ` Kenichi Handa @ 2009-08-14 9:02 ` Eli Zaretskii 2009-08-21 9:33 ` Eli Zaretskii 0 siblings, 1 reply; 19+ messages in thread From: Eli Zaretskii @ 2009-08-14 9:02 UTC (permalink / raw) To: Kenichi Handa, 4047; +Cc: bogossian, schwab > From: Kenichi Handa <handa@m17n.org> > Date: Tue, 11 Aug 2009 09:51:49 +0900 > Cc: bogossian@mail.com, 4047@emacsbugs.donarmstrong.com, schwab@linux-m68k.org > > In article <jwvljlrlapn.fsf-monnier+emacsbugreports@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes: > > Unless Handa objects, I'd recommend we change encode_coding_string to be > > a nop on unibyte strings (tho, we may want to let it obey EOL > > conversions). > > I don't object to that change. For strings only (i.e. in coding.h:encode_coding_string) or on the more basic level, in coding.c:encode_coding_object? ^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark 2009-08-14 9:02 ` Eli Zaretskii @ 2009-08-21 9:33 ` Eli Zaretskii 2009-08-21 12:18 ` Kenichi Handa 0 siblings, 1 reply; 19+ messages in thread From: Eli Zaretskii @ 2009-08-21 9:33 UTC (permalink / raw) To: Kenichi Handa; +Cc: schwab, 4047, bogossian > Date: Fri, 14 Aug 2009 12:02:37 +0300 > From: Eli Zaretskii <eliz@gnu.org> > CC: monnier@iro.umontreal.ca, bogossian@mail.com, schwab@linux-m68k.org > > > From: Kenichi Handa <handa@m17n.org> > > Date: Tue, 11 Aug 2009 09:51:49 +0900 > > Cc: bogossian@mail.com, 4047@emacsbugs.donarmstrong.com, schwab@linux-m68k.org > > > > In article <jwvljlrlapn.fsf-monnier+emacsbugreports@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes: > > > Unless Handa objects, I'd recommend we change encode_coding_string to be > > > a nop on unibyte strings (tho, we may want to let it obey EOL > > > conversions). > > > > I don't object to that change. > > For strings only (i.e. in coding.h:encode_coding_string) or on the > more basic level, in coding.c:encode_coding_object? Ping! ^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark 2009-08-21 9:33 ` Eli Zaretskii @ 2009-08-21 12:18 ` Kenichi Handa [not found] ` <83praof8mu.fsf@gnu.org> 0 siblings, 1 reply; 19+ messages in thread From: Kenichi Handa @ 2009-08-21 12:18 UTC (permalink / raw) To: Eli Zaretskii; +Cc: schwab, 4047, bogossian In article <83ljldh5pm.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: > > > > Unless Handa objects, I'd recommend we change encode_coding_string to be > > > > a nop on unibyte strings (tho, we may want to let it obey EOL > > > > conversions). > > > > > > I don't object to that change. > > > > For strings only (i.e. in coding.h:encode_coding_string) or on the > > more basic level, in coding.c:encode_coding_object? > Ping! At the moment, all I can say is that changing coding.h:encode_coding_string is quite safe. But, encode_coding_object is used by Lisp functions encode-coding-region and encode-coding-string, and thus the change will break some packages that use them on unibyte string/buffer. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 19+ messages in thread
[parent not found: <83praof8mu.fsf@gnu.org>]
* bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark
@ 2009-08-05 14:01 ` Pierre Bogossian
  2009-08-06 17:49   ` Eli Zaretskii
  2009-08-22 10:30   ` bug#4047: marked as done (23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark) Emacs bug Tracking System
  0 siblings, 2 replies; 19+ messages in thread
From: Pierre Bogossian @ 2009-08-05 14:01 UTC (permalink / raw)
  To: bug-gnu-emacs

Hi,

I'm testing the windows version of the new emacs 23.1.1

Here's what I noticed:

If I open a UTF8 file with a byte-order mark, and if I
try to enter hexl-mode, I get this error:
"\357\273\277-hex: No such file or directory".

The presence of the BOM is important, I can enter hexl-mode
with no problem if I remove the BOM from the file.

I did the same test with emacs 22.3.1 and it worked fine, so
this looks like a regression.

Unfortunately I can't test a unix version at the moment.

Regards,

Pierre

^ permalink raw reply	[flat|nested] 19+ messages in thread
* bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark 2009-08-05 14:01 ` Pierre Bogossian @ 2009-08-06 17:49 ` Eli Zaretskii 2009-08-22 10:30 ` bug#4047: marked as done (23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark) Emacs bug Tracking System 1 sibling, 0 replies; 19+ messages in thread From: Eli Zaretskii @ 2009-08-06 17:49 UTC (permalink / raw) To: Pierre Bogossian, 4047 > From: "Pierre Bogossian" <bogossian@mail.com> > Date: Wed, 5 Aug 2009 15:01:31 +0100 > Cc: > > If I open a UTF8 file with a byte-order mark, and if I > try to enter hexl-mode, I get this error: "\357\273\277-hex: No such file or directory". What is the value of buffer-file-coding-system before you enter hexl-mode? If it is anything but utf-8-with-signature, does it help to say "C-x RET f utf-8-with-signature RET" before entering hexl-mode? > Unfortunately I can't test a unix version at the moment. Which means your OS is what? ^ permalink raw reply [flat|nested] 19+ messages in thread
* bug#4047: marked as done (23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark)
  2009-08-05 14:01 ` Pierre Bogossian
  2009-08-06 17:49   ` Eli Zaretskii
@ 2009-08-22 10:30   ` Emacs bug Tracking System
  1 sibling, 0 replies; 19+ messages in thread
From: Emacs bug Tracking System @ 2009-08-22 10:30 UTC (permalink / raw)
  To: Eli Zaretskii

[-- Attachment #1: Type: text/plain, Size: 918 bytes --]

Your message dated Sat, 22 Aug 2009 13:25:13 +0300
with message-id <83praof8mu.fsf@gnu.org>
and subject line Re: bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark
has caused the Emacs bug report #4047,
regarding 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact
owner@emacsbugs.donarmstrong.com immediately.)

--
4047: http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=4047
Emacs Bug Tracking System
Contact owner@emacsbugs.donarmstrong.com with problems

[-- Attachment #2: Type: message/rfc822, Size: 3239 bytes --]

From: "Pierre Bogossian" <bogossian@mail.com>
To: bug-gnu-emacs@gnu.org
Subject: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark
Date: Wed, 5 Aug 2009 15:01:31 +0100
Message-ID: <20090805140131.19FBE606865@ws1-4.us4.outblaze.com>

Hi,

I'm testing the windows version of the new emacs 23.1.1

Here's what I noticed:

If I open a UTF8 file with a byte-order mark, and if I
try to enter hexl-mode, I get this error:
"\357\273\277-hex: No such file or directory".

The presence of the BOM is important, I can enter hexl-mode
with no problem if I remove the BOM from the file.

I did the same test with emacs 22.3.1 and it worked fine, so
this looks like a regression.

Unfortunately I can't test a unix version at the moment.

Regards,

Pierre

[-- Attachment #3: Type: message/rfc822, Size: 2821 bytes --]

From: Eli Zaretskii <eliz@gnu.org>
To: Kenichi Handa <handa@m17n.org>
Cc: 4047-done@emacsbugs.donarmstrong.com, monnier@iro.umontreal.ca, bogossian@mail.com, schwab@linux-m68k.org
Subject: Re: bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark
Date: Sat, 22 Aug 2009 13:25:13 +0300
Message-ID: <83praof8mu.fsf@gnu.org>

> From: Kenichi Handa <handa@m17n.org>
> CC: 4047@emacsbugs.donarmstrong.com, monnier@iro.umontreal.ca,
>         bogossian@mail.com, schwab@linux-m68k.org
> Date: Fri, 21 Aug 2009 21:18:53 +0900
>
> In article <83ljldh5pm.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:
>
> > > > > Unless Handa objects, I'd recommend we change encode_coding_string to be
> > > > > a nop on unibyte strings (tho, we may want to let it obey EOL
> > > > > conversions).
> > > >
> > > > I don't object to that change.
> > >
> > > For strings only (i.e. in coding.h:encode_coding_string) or on the
> > > more basic level, in coding.c:encode_coding_object?
>
> > Ping!
>
> At the moment, all I can say is that changing
> coding.h:encode_coding_string is quite safe. But,
> encode_coding_object is used by Lisp functions
> encode-coding-region and encode-coding-string, and thus the
> change will break some packages that use them on unibyte
> string/buffer.

I fixed this in encode-coding-string.

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread
* bug#4047: 23.1.1: hexl-mode doesn't like UTF8 files with a byte-order mark
       [not found]         ` <83praof8mu.fsf@gnu.org>
  2009-08-05 14:01           ` Pierre Bogossian
@ 2009-08-27 11:15           ` Kenichi Handa
  1 sibling, 0 replies; 19+ messages in thread
From: Kenichi Handa @ 2009-08-27 11:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: schwab, 4047, bogossian

In article <83praof8mu.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

> > At the moment, all I can say is that changing
> > coding.h:encode_coding_string is quite safe. But,
> > encode_coding_object is used by Lisp functions
> > encode-coding-region and encode-coding-string, and thus the
> > change will break some packages that use them on unibyte
> > string/buffer.

> I fixed this in encode-coding-string.

I have overlooked this part:

Stefan wrote:
> I'd recommend we change encode_coding_string to be
> a nop on unibyte strings (tho, we may want to let it obey EOL
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> conversions).
  ^^^^^^^^^^^

We surely need eol conversion in sending a unibyte string to
a process. So, I've just installed this change.

2009-08-27  Kenichi Handa  <handa@m17n.org>

        * process.c (send_process): Use encode_coding_object instead of
        encode_coding_string to perform eol-conversion even if the
        string is unibyte.

Index: process.c
===================================================================
RCS file: /cvsroot/emacs/emacs/src/process.c,v
retrieving revision 1.593
retrieving revision 1.594
diff -u -r1.593 -r1.594
--- process.c   17 Aug 2009 21:04:07 -0000      1.593
+++ process.c   27 Aug 2009 11:12:54 -0000      1.594
@@ -5721,7 +5721,8 @@
            }
          else if (STRINGP (object))
            {
-             encode_coding_string (coding, object, 1);
+             encode_coding_object (coding, object, 0, 0, SCHARS (object),
+                                   SBYTES (object), Qt);
            }
          else
            {

---
Kenichi Handa
handa@m17n.org

^ permalink raw reply	[flat|nested] 19+ messages in thread
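The point made in the message above is that a unibyte string sent to a process may skip character encoding yet still need end-of-line conversion. A minimal C sketch of such a byte-level conversion follows. It assumes, purely for illustration, that the receiving side expects CRLF line endings; `lf_to_crlf` is a hypothetical helper and not part of Emacs.

```c
#include <stddef.h>

/* Convert every LF in IN (LEN bytes) to CRLF, writing the result to
   OUT, which has room for CAP bytes.  No character encoding is done:
   all other bytes are copied unchanged, so the function is safe for
   already-encoded (unibyte) data.
   Returns the number of bytes written, or (size_t) -1 if OUT is too
   small.  */
size_t
lf_to_crlf (const char *in, size_t len, char *out, size_t cap)
{
  size_t i, n = 0;

  for (i = 0; i < len; i++)
    {
      size_t need = (in[i] == '\n') ? 2 : 1;

      if (n + need > cap)
        return (size_t) -1;
      if (in[i] == '\n')
        out[n++] = '\r';
      out[n++] = in[i];
    }
  return n;
}
```

For example, the three bytes "ls\n" become the four bytes "ls\r\n", while a string without newlines passes through byte for byte.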