unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8")
@ 2009-03-21 23:23 ` Juanma Barranquero
  2009-03-22  1:23   ` Stefan Monnier
  2009-09-11 11:10   ` bug#2741: marked as done (Decoding of vc-annotate output affected by language environment) Emacs bug Tracking System
  0 siblings, 2 replies; 5+ messages in thread
From: Juanma Barranquero @ 2009-03-21 23:23 UTC (permalink / raw)
  To: Emacs Bug Tracker

1) Create a Git repository and add a Latin-1 file with some non-ASCII
characters. In my example, the file test.txt contains the following
text:

    A few Spanish characters: áéíóúüñ

2) Execute "emacs -Q test.txt -f vc-annotate". The resulting *Annotate
test.txt* buffer has buffer-file-coding-system `iso-latin-1-dos' and
shows:

    ^7fb00c1 (Juanma Barranquero 2009-03-22 00:01:39 +0100 1) A few Spanish characters: áéíóúüñ

3) Set LANG to a UTF-8 locale (for example, "set LANG=en_US.UTF-8"), and repeat
"emacs -Q test.txt -f vc-annotate". Now the *Annotate* buffer is in
`utf-8-dos', and shows:

    ^7fb00c1 (Juanma Barranquero 2009-03-22 00:01:39 +0100 1) A few Spanish characters: áéíóúüñ

4) Finally, whether or not LANG is unset (it makes no difference), run

    emacs -Q --eval "(set-language-environment \"UTF-8\")" test.txt -f vc-annotate

  Now the *Annotate* buffer is in `utf-8-dos', but contains a mixture
of utf-8 and raw bytes:

    ^7fb00c1 (Juanma Barranquero 2009-03-22 00:01:39 +0100 1) A few Spanish characters: \341\351\355\363\372\374\361

    Juanma
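
For anyone reproducing step 1 from a POSIX shell (an assumption; the
report itself uses the Windows "set" syntax), the test file could be
created roughly as sketched below. The helper name make_latin1_sample is
hypothetical, and the Git steps (init, add, commit) are left out. The
octal escapes are the same byte values that appear in the step 4 output.

```shell
#!/bin/sh
# Hypothetical helper: write the sample line to "$1" encoded as Latin-1.
# Each octal escape is the single-byte Latin-1 code of one accented
# character from the report: á é í ó ú ü ñ.
make_latin1_sample() {
    printf 'A few Spanish characters: \341\351\355\363\372\374\361\n' > "$1"
}

make_latin1_sample test.txt

# Dump the non-ASCII tail (seven characters plus the newline) as octal
# bytes to confirm that each character occupies exactly one byte.
tail -c 8 test.txt | od -An -to1
```

Writing the bytes with printf avoids depending on the terminal or
editor encoding, which is the very thing under test here.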

* bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8")
  2009-03-21 23:23 ` bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8") Juanma Barranquero
@ 2009-03-22  1:23   ` Stefan Monnier
  2009-03-22  1:31     ` Juanma Barranquero
  2009-09-09 23:18     ` Juanma Barranquero
  2009-09-11 11:10   ` bug#2741: marked as done (Decoding of vc-annotate output affected by language environment) Emacs bug Tracking System
  1 sibling, 2 replies; 5+ messages in thread
From: Stefan Monnier @ 2009-03-22  1:23 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: Emacs Bug Tracker, 2741

> 4) Finally, after unsetting LANG or not (it is irrelevant) do

>     emacs -Q --eval "(set-language-environment \"UTF-8\")" test.txt -f
> vc-annotate

>   Now the *Annotate* buffer is in `utf-8-dos', but contains a mixture
> of utf-8 and raw bytes:

>     ^7fb00c1 (Juanma Barranquero 2009-03-22 00:01:39 +0100 1) A few
> Spanish characters: \341\351\355\363\372\374\361

I don't see a mixture of anything; I just see Latin-1 encoded chars
decoded incorrectly because Emacs somehow decided to try and decode the
stream using the utf-8 coding-system.
But yes, that's a bug.  `vc-annotate' should use the main file's
coding-system to decode the annotated text, regardless of
language environment.


        Stefan
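
The point about decoding can be checked outside Emacs. The sketch
below is only an illustration, assuming a POSIX shell with an iconv
utility available; the helper name latin1_bytes is hypothetical. It
feeds the seven bytes from the report to iconv twice, once as Latin-1
and once as UTF-8.

```shell
#!/bin/sh
# Hypothetical helper: emit the seven bytes that show up as
# \341\351\355\363\372\374\361 in the *Annotate* buffer.
latin1_bytes() {
    printf '\341\351\355\363\372\374\361'
}

# Read as Latin-1 (ISO-8859-1), every byte maps to one character.
latin1_bytes | iconv -f ISO-8859-1 -t UTF-8
echo

# Read as UTF-8, the same bytes are rejected: \341 announces a
# three-byte sequence, but \351 is not a continuation byte.
if latin1_bytes | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
    echo "valid UTF-8"
else
    echo "not valid UTF-8"
fi
```

A decoder that cannot interpret these bytes as UTF-8 has to keep them
in some other form, which would match the raw bytes shown in step 4 of
the report.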

* bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8")
  2009-03-22  1:23   ` Stefan Monnier
@ 2009-03-22  1:31     ` Juanma Barranquero
  2009-09-09 23:18     ` Juanma Barranquero
  1 sibling, 0 replies; 5+ messages in thread
From: Juanma Barranquero @ 2009-03-22  1:31 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 2741

On Sun, Mar 22, 2009 at 02:23, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> I don't see a mixture of anything, I just see latin-1 encoded chars
> decoded incorrectly because Emacs somehow decided to try and decode the
> stream using the utf-8 coding-system.

Whatever. What I meant is that the buffer is nominally utf-8, but
contains raw bytes.

> But yes that's a bug.  `vc-annotate' should use the main file's
> coding-system to decode the annotated text, regardless of
> language environment.

It also seems to be a bug that the behavior differs between

   emacs -Q --eval "(set-language-environment \"UTF-8\")"

and

  set LANG=utf8.UTF-8
  emacs -Q

when, in both cases, `current-language-environment' is "UTF-8".

    Juanma

* bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8")
  2009-03-22  1:23   ` Stefan Monnier
  2009-03-22  1:31     ` Juanma Barranquero
@ 2009-09-09 23:18     ` Juanma Barranquero
  1 sibling, 0 replies; 5+ messages in thread
From: Juanma Barranquero @ 2009-09-09 23:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 2741

On Sun, Mar 22, 2009 at 03:23, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> I don't see a mixture of anything, I just see latin-1 encoded chars
> decoded incorrectly because Emacs somehow decided to try and decode the
> stream using the utf-8 coding-system.
> But yes that's a bug.  `vc-annotate' should use the main file's
> coding-system to decode the annotated text, regardless of
> language environment.

The following patch fixes it.

The change is in `vc-annotate' and not `vc-git-annotate-command'
because the bug is not git-specific. I can easily reproduce it with
bzr, for example.

    Juanma


2009-09-09  Juanma Barranquero  <lekktu@gmail.com>

	* vc-annotate.el (vc-annotate): Use the main file's coding-system to
	decode annotated text, regardless of language environment.  (Bug#2741)


Index: vc-annotate.el
===================================================================
RCS file: /cvsroot/emacs/emacs/lisp/vc-annotate.el,v
retrieving revision 1.8
diff -u -2 -r1.8 vc-annotate.el
--- vc-annotate.el	10 Mar 2009 00:59:09 -0000	1.8
+++ vc-annotate.el	9 Sep 2009 23:11:24 -0000
@@ -376,5 +376,6 @@
 		(setq temp-buffer-name (buffer-name))))
     (with-output-to-temp-buffer temp-buffer-name
-      (let ((backend (vc-backend file)))
+      (let ((backend (vc-backend file))
+	    (coding-system-for-read buffer-file-coding-system))
         (vc-call-backend backend 'annotate-command file
                          (get-buffer temp-buffer-name) rev)

* bug#2741: marked as done (Decoding of vc-annotate output affected by language environment)
  2009-03-21 23:23 ` bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8") Juanma Barranquero
  2009-03-22  1:23   ` Stefan Monnier
@ 2009-09-11 11:10   ` Emacs bug Tracking System
  1 sibling, 0 replies; 5+ messages in thread
From: Emacs bug Tracking System @ 2009-09-11 11:10 UTC (permalink / raw)
  To: Juanma Barranquero

[-- Attachment #1: Type: text/plain, Size: 982 bytes --]

Your message dated Fri, 11 Sep 2009 13:02:51 +0200
with message-id <f7ccd24b0909110402t5bf8123dh6104f26a17a9c3b8@mail.gmail.com>
and subject line Re: bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8")
has caused the Emacs bug report #2741,
regarding Decoding of vc-annotate output affected by language environment
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
bug report if necessary, and/or fix the problem forthwith.

[-- Attachment #3: Type: message/rfc822, Size: 2926 bytes --]

From: Juanma Barranquero <lekktu@gmail.com>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: 2741-done@emacsbugs.donarmstrong.com
Subject: Re: bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8")
Date: Fri, 11 Sep 2009 13:02:51 +0200
Message-ID: <f7ccd24b0909110402t5bf8123dh6104f26a17a9c3b8@mail.gmail.com>

On Thu, Sep 10, 2009 at 01:18, Juanma Barranquero <lekktu@gmail.com> wrote:

>        * vc-annotate.el (vc-annotate): Use the main file's coding-system to
>        decode annotated text, regardless of language environment.  (Bug#2741)

I've installed this change.

    Juanma
