* bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8")
From: Juanma Barranquero @ 2009-03-21 23:23 UTC
To: Emacs Bug Tracker

1) Create a Git repository and add a Latin-1 file with some non-ASCII
   characters. In my example, the file test.txt contains the following text:

     A few Spanish characters: áéíóúüñ

2) Execute "emacs -Q test.txt -f vc-annotate". The resulting
   *Annotate test.txt* buffer has buffer-file-coding-system
   `iso-latin-1-dos' and shows:

     ^7fb00c1 (Juanma Barranquero 2009-03-22 00:01:39 +0100 1) A few Spanish characters: áéíóúüñ

3) Set LANG to UTF-8 (for example, "set LANG=en_US.UTF-8"), and repeat
   "emacs -Q test.txt -f vc-annotate". Now the *Annotate* buffer is in
   `utf-8-dos', and shows:

     ^7fb00c1 (Juanma Barranquero 2009-03-22 00:01:39 +0100 1) A few Spanish characters: áéíóúüñ

4) Finally, after unsetting LANG or not (it is irrelevant) do

     emacs -Q --eval "(set-language-environment \"UTF-8\")" test.txt -f vc-annotate

   Now the *Annotate* buffer is in `utf-8-dos', but contains a mixture
   of utf-8 and raw bytes:

     ^7fb00c1 (Juanma Barranquero 2009-03-22 00:01:39 +0100 1) A few Spanish characters: \341\351\355\363\372\374\361

    Juanma

* bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8")
From: Stefan Monnier @ 2009-03-22 1:23 UTC
To: Juanma Barranquero; +Cc: Emacs Bug Tracker, 2741

> 4) Finally, after unsetting LANG or not (it is irrelevant) do
>   emacs -Q --eval "(set-language-environment \"UTF-8\")" test.txt -f vc-annotate
> Now the *Annotate* buffer is in `utf-8-dos', but contains a mixture
> of utf-8 and raw bytes:
> ^7fb00c1 (Juanma Barranquero 2009-03-22 00:01:39 +0100 1) A few
> Spanish characters: \341\351\355\363\372\374\361

I don't see a mixture of anything, I just see latin-1 encoded chars
decoded incorrectly because Emacs somehow decided to try and decode the
stream using the utf-8 coding-system.

But yes that's a bug. `vc-annotate' should use the main file's
coding-system to decode the annotated text, regardless of
language environment.

        Stefan
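
Stefan's diagnosis can be checked at the byte level. The following is a minimal sketch, not code from the thread: it takes the seven bytes shown in step 4 and decodes them once as Latin-1 and once as UTF-8. The `my-annotate-demo-*' variable names are hypothetical and exist only for this illustration.

```elisp
;; The seven bytes from step 4 (octal \341\351\355\363\372\374\361),
;; i.e. "áéíóúüñ" as stored in a Latin-1 file.
(defvar my-annotate-demo-bytes
  (unibyte-string #xe1 #xe9 #xed #xf3 #xfa #xfc #xf1)
  "Hypothetical demo variable: raw Latin-1 bytes of the test text.")

;; Decoding with the file's own coding system recovers the characters.
(defvar my-annotate-demo-latin-1
  (decode-coding-string my-annotate-demo-bytes 'iso-latin-1)
  "Hypothetical demo variable: the bytes decoded as Latin-1.")

;; Decoding as UTF-8 cannot work: none of these bytes forms a valid
;; UTF-8 sequence in this position, so Emacs keeps each one as a raw
;; 8-bit byte, which is what the *Annotate* buffer displays as octal
;; escapes.
(defvar my-annotate-demo-utf-8
  (decode-coding-string my-annotate-demo-bytes 'utf-8)
  "Hypothetical demo variable: the bytes decoded (wrongly) as UTF-8.")

;; The first character of the UTF-8 result belongs to the `eight-bit'
;; charset, i.e. it is a raw byte rather than a decoded character.
(defvar my-annotate-demo-first-charset
  (char-charset (aref my-annotate-demo-utf-8 0))
  "Hypothetical demo variable: charset of the first mis-decoded char.")
```

Evaluating `my-annotate-demo-latin-1' gives the readable text, while `my-annotate-demo-utf-8' holds only raw bytes, which matches the description of "latin-1 encoded chars decoded incorrectly".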

* bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8")
From: Juanma Barranquero @ 2009-03-22 1:31 UTC
To: Stefan Monnier; +Cc: 2741

On Sun, Mar 22, 2009 at 02:23, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> I don't see a mixture of anything, I just see latin-1 encoded chars
> decoded incorrectly because Emacs somehow decided to try and decode the
> stream using the utf-8 coding-system.

Whatever. What I meant is that the buffer is nominally utf-8, but
contains raw bytes.

> But yes that's a bug. `vc-annotate' should use the main file's
> coding-system to decode the annotated text, regardless of
> language environment.

It also seems to be a bug that the behavior is different between

  emacs -Q --eval "(set-language-environment \"UTF-8\")"

and

  set LANG=utf8.UTF-8
  emacs -Q

when, in both cases, `current-language-environment' is "UTF-8".

    Juanma

* bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8")
From: Juanma Barranquero @ 2009-09-09 23:18 UTC
To: Stefan Monnier; +Cc: 2741

On Sun, Mar 22, 2009 at 03:23, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> I don't see a mixture of anything, I just see latin-1 encoded chars
> decoded incorrectly because Emacs somehow decided to try and decode the
> stream using the utf-8 coding-system.

> But yes that's a bug. `vc-annotate' should use the main file's
> coding-system to decode the annotated text, regardless of
> language environment.

The following patch fixes it. The change is in `vc-annotate' and not
`vc-git-annotate-command' because the bug is not git-specific. I can
easily reproduce it with bzr, for example.

    Juanma


2009-09-09  Juanma Barranquero  <lekktu@gmail.com>

        * vc-annotate.el (vc-annotate): Use the main file's coding-system to
        decode annotated text, regardless of language environment.  (Bug#2741)


Index: vc-annotate.el
===================================================================
RCS file: /cvsroot/emacs/emacs/lisp/vc-annotate.el,v
retrieving revision 1.8
diff -u -2 -r1.8 vc-annotate.el
--- vc-annotate.el 10 Mar 2009 00:59:09 -0000 1.8
+++ vc-annotate.el 9 Sep 2009 23:11:24 -0000
@@ -376,5 +376,6 @@
         (setq temp-buffer-name (buffer-name))))
     (with-output-to-temp-buffer temp-buffer-name
-      (let ((backend (vc-backend file)))
+      (let ((backend (vc-backend file))
+            (coding-system-for-read buffer-file-coding-system))
         (vc-call-backend backend 'annotate-command file
                          (get-buffer temp-buffer-name) rev)
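
The patch works by binding `coding-system-for-read' around the backend call, so the process output is decoded with the file's coding system instead of the language-environment default. A minimal sketch of that mechanism outside of VC follows. It assumes a POSIX `printf' program is on the PATH, and `my-process-output-decoded-as' is a hypothetical helper for illustration, not part of Emacs or of the patch.

```elisp
(defun my-process-output-decoded-as (coding program &rest args)
  "Run PROGRAM with ARGS and return its output decoded as CODING.
Hypothetical helper illustrating the binding used in the patch."
  ;; `coding-system-for-read' is a special variable: while this `let'
  ;; is active it overrides whatever coding system Emacs would
  ;; otherwise choose for decoding the subprocess output.
  (let ((coding-system-for-read coding))
    (with-temp-buffer
      ;; Insert the program's standard output into the temp buffer.
      (apply #'call-process program nil t nil args)
      (buffer-string))))

;; Example call: printf emits the two bytes \341 and \351, which are
;; then decoded as Latin-1 regardless of the language environment.
;; (my-process-output-decoded-as 'iso-latin-1 "printf" "\\341\\351")
```

In the patch itself the value bound is `buffer-file-coding-system' of the visited file rather than a fixed coding system, which is why the fix is not specific to any one VC backend.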

* bug#2741: marked as done (Decoding of vc-annotate output affected by language environment)
From: Emacs bug Tracking System @ 2009-09-11 11:10 UTC
To: Juanma Barranquero

[-- Attachment #1: Type: text/plain, Size: 982 bytes --]

Your message dated Fri, 11 Sep 2009 13:02:51 +0200
with message-id <f7ccd24b0909110402t5bf8123dh6104f26a17a9c3b8@mail.gmail.com>
and subject line Re: bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8")
has caused the Emacs bug report #2741,
regarding Decoding of vc-annotate output affected by language environment
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@emacsbugs.donarmstrong.com
immediately.)

--
2741: http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=2741
Emacs Bug Tracking System
Contact owner@emacsbugs.donarmstrong.com with problems

[-- Attachment #2: Type: message/rfc822, Size: 3359 bytes --]

From: Juanma Barranquero <lekktu@gmail.com>
To: Emacs Bug Tracker <submit@emacsbugs.donarmstrong.com>
Subject: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8")
Date: Sun, 22 Mar 2009 00:23:32 +0100
Message-ID: <f7ccd24b0903211623i58b2c88ek9ff252b0dac0b@mail.gmail.com>

1) Create a Git repository and add a Latin-1 file with some non-ASCII
   characters. In my example, the file test.txt contains the following text:

     A few Spanish characters: áéíóúüñ

2) Execute "emacs -Q test.txt -f vc-annotate". The resulting
   *Annotate test.txt* buffer has buffer-file-coding-system
   `iso-latin-1-dos' and shows:

     ^7fb00c1 (Juanma Barranquero 2009-03-22 00:01:39 +0100 1) A few Spanish characters: áéíóúüñ

3) Set LANG to UTF-8 (for example, "set LANG=en_US.UTF-8"), and repeat
   "emacs -Q test.txt -f vc-annotate". Now the *Annotate* buffer is in
   `utf-8-dos', and shows:

     ^7fb00c1 (Juanma Barranquero 2009-03-22 00:01:39 +0100 1) A few Spanish characters: áéíóúüñ

4) Finally, after unsetting LANG or not (it is irrelevant) do

     emacs -Q --eval "(set-language-environment \"UTF-8\")" test.txt -f vc-annotate

   Now the *Annotate* buffer is in `utf-8-dos', but contains a mixture
   of utf-8 and raw bytes:

     ^7fb00c1 (Juanma Barranquero 2009-03-22 00:01:39 +0100 1) A few Spanish characters: \341\351\355\363\372\374\361

    Juanma

[-- Attachment #3: Type: message/rfc822, Size: 2926 bytes --]

From: Juanma Barranquero <lekktu@gmail.com>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: 2741-done@emacsbugs.donarmstrong.com
Subject: Re: bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8")
Date: Fri, 11 Sep 2009 13:02:51 +0200
Message-ID: <f7ccd24b0909110402t5bf8123dh6104f26a17a9c3b8@mail.gmail.com>

On Thu, Sep 10, 2009 at 01:18, Juanma Barranquero <lekktu@gmail.com> wrote:

> * vc-annotate.el (vc-annotate): Use the main file's coding-system to
> decode annotated text, regardless of language environment. (Bug#2741)

I've installed this change.

    Juanma

Thread overview: 5+ messages
2009-03-21 23:23 ` bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8") Juanma Barranquero
2009-03-22  1:23   ` Stefan Monnier
2009-03-22  1:31     ` Juanma Barranquero
2009-09-09 23:18     ` Juanma Barranquero
2009-09-11 11:10   ` bug#2741: marked as done (Decoding of vc-annotate output affected by language environment) Emacs bug Tracking System

Code repositories for project(s) associated with this public inbox
        https://git.savannah.gnu.org/cgit/emacs.git