unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#23075: 24.5; vc-git-command should use coding-system-for-read
@ 2016-03-21 11:22 Nikolay Kudryavtsev
  2016-03-21 16:16 ` Eli Zaretskii
  0 siblings, 1 reply; 14+ messages in thread
From: Nikolay Kudryavtsev @ 2016-03-21 11:22 UTC (permalink / raw)
  To: 23075

If you have files in your repositories, that are in different coding 
system from vc-git-commits-coding-system, vc-diff would still show diffs 
for them in vc-git-commits-coding-system.

This is unnecessary, since vc-diff-internal already decides what coding 
system to use based on the first file and then sets 
coding-system-for-read accordingly.

The fix is to change vc-git-command to use coding-system-for-read 
whenever it's bound.

-- 
Best Regards,
Nikolay Kudryavtsev






^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23075: 24.5; vc-git-command should use coding-system-for-read
  2016-03-21 11:22 bug#23075: 24.5; vc-git-command should use coding-system-for-read Nikolay Kudryavtsev
@ 2016-03-21 16:16 ` Eli Zaretskii
  2016-03-21 16:54   ` Óscar Fuentes
  0 siblings, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2016-03-21 16:16 UTC (permalink / raw)
  To: Nikolay Kudryavtsev; +Cc: 23075

> From: Nikolay Kudryavtsev <nikolay.kudryavtsev@gmail.com>
> Date: Mon, 21 Mar 2016 14:22:36 +0300
> 
> If you have files in your repositories, that are in different coding 
> system from vc-git-commits-coding-system, vc-diff would still show diffs 
> for them in vc-git-commits-coding-system.
> 
> This is unnecessary, since vc-diff-internal already decides what coding 
> system to use based on the first file and then sets 
> coding-system-for-read accordingly.
> 
> The fix is to change vc-git-command to use coding-system-for-read 
> whenever it's bound.

Please show a recipe for reproducing this problem with a recent Git
version and preferably with the latest pretest of Emacs 25.1.  Please
include any files necessary for the reproduction.  Before we dive into
this tricky issue, I'd like to be sure the problem still exists (since
more than a year has passed since your original reports).

Also, this is specific to MS-Windows, right?

Thanks.





^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23075: 24.5; vc-git-command should use coding-system-for-read
  2016-03-21 16:16 ` Eli Zaretskii
@ 2016-03-21 16:54   ` Óscar Fuentes
  2016-03-21 17:21     ` Eli Zaretskii
  2016-03-21 17:57     ` Andreas Schwab
  0 siblings, 2 replies; 14+ messages in thread
From: Óscar Fuentes @ 2016-03-21 16:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 23075, Nikolay Kudryavtsev

Eli Zaretskii <eliz@gnu.org> writes:

>> If you have files in your repositories, that are in different coding 
>> system from vc-git-commits-coding-system, vc-diff would still show diffs 
>> for them in vc-git-commits-coding-system.
>> 
>> This is unnecessary, since vc-diff-internal already decides what coding 
>> system to use based on the first file and then sets 
>> coding-system-for-read accordingly.
>> 
>> The fix is to change vc-git-command to use coding-system-for-read 
>> whenever it's bound.
>
> Please show a recipe for reproducing this problem with a recent Git
> version and preferably with the latest pretest of Emacs 25.1.  Please
> include any files necessary for the reproduction.  Before we dive into
> this tricky issue, I'd like to be sure the problem still exists (since
> more than a year has passed since your original reports).
>
> Also, this is specific to MS-Windows, right?

Maybe this is related: on a Kubuntu 15.08 box with a 2 week old emacs
25.0.92.2 and git 2.5.0, if I visit etc/HELLO (on a Emacs checkout) and
do `C-x v l' and then, on the *vc-change-log* buffer, put the cursor on
the last commit (etc/HELLO: Add Armenian and Mongolian greetings.) and
press `d', the ensuing *vc-diff* buffer contains mostly garbage.





^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23075: 24.5; vc-git-command should use coding-system-for-read
  2016-03-21 16:54   ` Óscar Fuentes
@ 2016-03-21 17:21     ` Eli Zaretskii
  2016-03-21 17:57     ` Andreas Schwab
  1 sibling, 0 replies; 14+ messages in thread
From: Eli Zaretskii @ 2016-03-21 17:21 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: 23075, nikolay.kudryavtsev

> From: Óscar Fuentes <ofv@wanadoo.es>
> Cc: Nikolay Kudryavtsev <nikolay.kudryavtsev@gmail.com>,  23075@debbugs.gnu.org
> Date: Mon, 21 Mar 2016 17:54:28 +0100
> 
> Maybe this is related: on a Kubuntu 15.08 box with a 2 week old emacs
> 25.0.92.2 and git 2.5.0, if I visit etc/HELLO (on a Emacs checkout) and
> do `C-x v l' and then, on the *vc-change-log* buffer, put the cursor on
> the last commit (etc/HELLO: Add Armenian and Mongolian greetings.) and
> press `d', the ensuing *vc-diff* buffer contains mostly garbage.

Thanks, indeed it looks like the same problem.  But I'd still like to
be sure there's nothing else; AFAIR the OP has some non-standard Git
setup wrt encoding stuff.





^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23075: 24.5; vc-git-command should use coding-system-for-read
  2016-03-21 16:54   ` Óscar Fuentes
  2016-03-21 17:21     ` Eli Zaretskii
@ 2016-03-21 17:57     ` Andreas Schwab
  2016-03-21 18:05       ` Óscar Fuentes
  2016-03-21 18:20       ` Eli Zaretskii
  1 sibling, 2 replies; 14+ messages in thread
From: Andreas Schwab @ 2016-03-21 17:57 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: 23075, Nikolay Kudryavtsev

Óscar Fuentes <ofv@wanadoo.es> writes:

> Maybe this is related: on a Kubuntu 15.08 box with a 2 week old emacs
> 25.0.92.2 and git 2.5.0, if I visit etc/HELLO (on a Emacs checkout) and
> do `C-x v l' and then, on the *vc-change-log* buffer, put the cursor on
> the last commit (etc/HELLO: Add Armenian and Mongolian greetings.) and
> press `d', the ensuing *vc-diff* buffer contains mostly garbage.

Garbage in which way?  The diff output is encoded in iso-2022-7bit, but
Emacs doesn't know that since the coding cookie is missing.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."





^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23075: 24.5; vc-git-command should use coding-system-for-read
  2016-03-21 17:57     ` Andreas Schwab
@ 2016-03-21 18:05       ` Óscar Fuentes
  2016-03-21 18:22         ` Eli Zaretskii
  2016-03-21 18:20       ` Eli Zaretskii
  1 sibling, 1 reply; 14+ messages in thread
From: Óscar Fuentes @ 2016-03-21 18:05 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: 23075, Nikolay Kudryavtsev

Andreas Schwab <schwab@linux-m68k.org> writes:

> Garbage in which way?

In the sense that what is shown in the diff does not resemble what the
file contains, nor it even makes sense to a human.

> The diff output is encoded in iso-2022-7bit, but Emacs doesn't know
> that since the coding cookie is missing.

Does git always use that encoding? If not, on what it depends the
encoding used by git?





^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23075: 24.5; vc-git-command should use coding-system-for-read
  2016-03-21 17:57     ` Andreas Schwab
  2016-03-21 18:05       ` Óscar Fuentes
@ 2016-03-21 18:20       ` Eli Zaretskii
  2016-03-21 18:45         ` Nikolay Kudryavtsev
  1 sibling, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2016-03-21 18:20 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: ofv, 23075, nikolay.kudryavtsev

> From: Andreas Schwab <schwab@linux-m68k.org>
> Cc: Eli Zaretskii <eliz@gnu.org>,  23075@debbugs.gnu.org,  Nikolay Kudryavtsev <nikolay.kudryavtsev@gmail.com>
> Date: Mon, 21 Mar 2016 18:57:55 +0100
> 
> Óscar Fuentes <ofv@wanadoo.es> writes:
> 
> > Maybe this is related: on a Kubuntu 15.08 box with a 2 week old emacs
> > 25.0.92.2 and git 2.5.0, if I visit etc/HELLO (on a Emacs checkout) and
> > do `C-x v l' and then, on the *vc-change-log* buffer, put the cursor on
> > the last commit (etc/HELLO: Add Armenian and Mongolian greetings.) and
> > press `d', the ensuing *vc-diff* buffer contains mostly garbage.
> 
> Garbage in which way?  The diff output is encoded in iso-2022-7bit, but
> Emacs doesn't know that since the coding cookie is missing.

The OP seems to suggest that the correct encoding is already known,
since we visited the file itself.





^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23075: 24.5; vc-git-command should use coding-system-for-read
  2016-03-21 18:05       ` Óscar Fuentes
@ 2016-03-21 18:22         ` Eli Zaretskii
  2016-03-21 19:38           ` Óscar Fuentes
  0 siblings, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2016-03-21 18:22 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: schwab, 23075, nikolay.kudryavtsev

> From: Óscar Fuentes <ofv@wanadoo.es>
> Cc: Eli Zaretskii <eliz@gnu.org>,  23075@debbugs.gnu.org,  Nikolay Kudryavtsev <nikolay.kudryavtsev@gmail.com>
> Date: Mon, 21 Mar 2016 19:05:11 +0100
> 
> Does git always use that encoding? If not, on what it depends the
> encoding used by git?

It just spews the bytes, AFAIK.  IOW, it's the encoding of the file
whose log we are viewing.





^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23075: 24.5; vc-git-command should use coding-system-for-read
  2016-03-21 18:20       ` Eli Zaretskii
@ 2016-03-21 18:45         ` Nikolay Kudryavtsev
  2016-03-21 19:17           ` Eli Zaretskii
  2016-04-02  9:58           ` Eli Zaretskii
  0 siblings, 2 replies; 14+ messages in thread
From: Nikolay Kudryavtsev @ 2016-03-21 18:45 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 23075

Hello.

Ok, here are more detailed steps on how to repeat this this bug.
1. Clone https://github.com/sg2002/vc-git-bugs
2. Open 1251.txt in windows1251 encoding. Do some changes in it.
3. vc-diff. This would give you diff with broken encoding.
4. Evaluate vc-git-command from fix.el in the repo.
5. vc-diff again, it would give you the correct encoding as long as the 
file itself opened correctly.

Yes, there is some code that specifies encoding in vc already. See the 
first large comment in vc-diff-internal.

This issue is not platform specific, unlike 23076.

-- 
Best Regards,
Nikolay Kudryavtsev






^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23075: 24.5; vc-git-command should use coding-system-for-read
  2016-03-21 18:45         ` Nikolay Kudryavtsev
@ 2016-03-21 19:17           ` Eli Zaretskii
  2016-03-21 19:35             ` Nikolay Kudryavtsev
  2016-04-02  9:58           ` Eli Zaretskii
  1 sibling, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2016-03-21 19:17 UTC (permalink / raw)
  To: Nikolay Kudryavtsev; +Cc: 23075

> From: Nikolay Kudryavtsev <nikolay.kudryavtsev@gmail.com>
> Cc: 23075@debbugs.gnu.org
> Date: Mon, 21 Mar 2016 21:45:40 +0300
> 
> Ok, here are more detailed steps on how to repeat this this bug.
> 1. Clone https://github.com/sg2002/vc-git-bugs
> 2. Open 1251.txt in windows1251 encoding. Do some changes in it.
> 3. vc-diff. This would give you diff with broken encoding.
> 4. Evaluate vc-git-command from fix.el in the repo.
> 5. vc-diff again, it would give you the correct encoding as long as the 
> file itself opened correctly.

What if the diffs show changes in more than one file?  Then the
encoding of the file being visited is no longer as relevant as in the
use case you describe.





^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23075: 24.5; vc-git-command should use coding-system-for-read
  2016-03-21 19:17           ` Eli Zaretskii
@ 2016-03-21 19:35             ` Nikolay Kudryavtsev
  0 siblings, 0 replies; 14+ messages in thread
From: Nikolay Kudryavtsev @ 2016-03-21 19:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Óscar Fuentes, 23075, Andreas Schwab

> Then the encoding of the file being visited is no longer as relevant as in the use case you describe.
That's true, and whoever wrote this part of vc was aware of the issue, 
as you can see from that comment. The simple fix at least allow getting 
diff for the file in a different encoding, even if you have to diff that 
file alone.

I can confirm that Oscar's issue with HELLO is the same thing. What he 
described as garbage was probably the result of vc-diff encoded in utf-8 
since that's the default value of vc-git-commits-coding-system.

-- 
Best Regards,
Nikolay Kudryavtsev






^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23075: 24.5; vc-git-command should use coding-system-for-read
  2016-03-21 18:22         ` Eli Zaretskii
@ 2016-03-21 19:38           ` Óscar Fuentes
  2016-04-02 10:02             ` Eli Zaretskii
  0 siblings, 1 reply; 14+ messages in thread
From: Óscar Fuentes @ 2016-03-21 19:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: schwab, 23075, nikolay.kudryavtsev

Eli Zaretskii <eliz@gnu.org> writes:

>> Does git always use that encoding? If not, on what it depends the
>> encoding used by git?
>
> It just spews the bytes, AFAIK.  IOW, it's the encoding of the file
> whose log we are viewing.

I was expecting from HELLO to be utf-8, but you are right.

Please note that a diff may contain changes from multiple files,
including some that do not longer exists (or that exists with another
encoding). So we can't process a diff as a single unit, it must be
separated by file.

Solving this on a general way does not seem possible, but we can do our
best by using the file's current encoding, or simply guessing it. The
former can have serious performance implications.





^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23075: 24.5; vc-git-command should use coding-system-for-read
  2016-03-21 18:45         ` Nikolay Kudryavtsev
  2016-03-21 19:17           ` Eli Zaretskii
@ 2016-04-02  9:58           ` Eli Zaretskii
  1 sibling, 0 replies; 14+ messages in thread
From: Eli Zaretskii @ 2016-04-02  9:58 UTC (permalink / raw)
  To: Nikolay Kudryavtsev; +Cc: 23075-done

> From: Nikolay Kudryavtsev <nikolay.kudryavtsev@gmail.com>
> Cc: 23075@debbugs.gnu.org
> Date: Mon, 21 Mar 2016 21:45:40 +0300
> 
> 1. Clone https://github.com/sg2002/vc-git-bugs
> 2. Open 1251.txt in windows1251 encoding. Do some changes in it.
> 3. vc-diff. This would give you diff with broken encoding.
> 4. Evaluate vc-git-command from fix.el in the repo.
> 5. vc-diff again, it would give you the correct encoding as long as the 
> file itself opened correctly.
> 
> Yes, there is some code that specifies encoding in vc already. See the 
> first large comment in vc-diff-internal.
> 
> This issue is not platform specific, unlike 23076.

Thanks, this should now be fixed by commit 17b5152.

I'm closing this bug report.  Feel free to reopen if any issues remain
unresolved.





^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23075: 24.5; vc-git-command should use coding-system-for-read
  2016-03-21 19:38           ` Óscar Fuentes
@ 2016-04-02 10:02             ` Eli Zaretskii
  0 siblings, 0 replies; 14+ messages in thread
From: Eli Zaretskii @ 2016-04-02 10:02 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: schwab, 23075, nikolay.kudryavtsev

> From: Óscar Fuentes <ofv@wanadoo.es>
> Cc: schwab@linux-m68k.org,  23075@debbugs.gnu.org,  nikolay.kudryavtsev@gmail.com
> Date: Mon, 21 Mar 2016 20:38:22 +0100
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> Please note that a diff may contain changes from multiple files,
> including some that do not longer exists (or that exists with another
> encoding). So we can't process a diff as a single unit, it must be
> separated by file.

That'd require a major redesign, and IMO is out of scope of this bug
report.





^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2016-04-02 10:02 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-03-21 11:22 bug#23075: 24.5; vc-git-command should use coding-system-for-read Nikolay Kudryavtsev
2016-03-21 16:16 ` Eli Zaretskii
2016-03-21 16:54   ` Óscar Fuentes
2016-03-21 17:21     ` Eli Zaretskii
2016-03-21 17:57     ` Andreas Schwab
2016-03-21 18:05       ` Óscar Fuentes
2016-03-21 18:22         ` Eli Zaretskii
2016-03-21 19:38           ` Óscar Fuentes
2016-04-02 10:02             ` Eli Zaretskii
2016-03-21 18:20       ` Eli Zaretskii
2016-03-21 18:45         ` Nikolay Kudryavtsev
2016-03-21 19:17           ` Eli Zaretskii
2016-03-21 19:35             ` Nikolay Kudryavtsev
2016-04-02  9:58           ` Eli Zaretskii

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).