* Cyrillic VC Git commit messages
From: Nikolay Kudryavtsev @ 2014-11-21 4:30 UTC
To: help-gnu-emacs@gnu.org

Hi all.

Sometimes I work with projects that have Russian commit messages in the
git log. I've found a way to make them work, but it's kind of
counter-intuitive.

First you set

  (setq vc-git-commits-coding-system 'windows-1251)

And then in .gitconfig:

  [i18n]
  logoutputencoding = windows-1251

This works fine inside of emacs, but totally breaks git log in windows
cmd. For some reason git always expects windows-1251 (system default) for
input, but outputs windows-1252 to cmd.exe and utf-8 to emacs.

So, did I miss something? Is there another way?

--
Best Regards,
Nikolay Kudryavtsev

* Re: Cyrillic VC Git commit messages
From: Eli Zaretskii @ 2014-11-21 8:41 UTC
To: help-gnu-emacs

> From: Nikolay Kudryavtsev <nikolay.kudryavtsev@gmail.com>
> Date: Fri, 21 Nov 2014 07:30:06 +0300
>
> (setq vc-git-commits-coding-system 'windows-1251)
> And then in .gitconfig:
> [i18n]
> logoutputencoding = windows-1251
>
> This works fine inside of emacs, but totally breaks git log in windows
> cmd. For some reason git always expects windows-1251 (system default) for
> input, but outputs windows-1252 to cmd.exe and utf-8 to emacs.

It's a missing feature in vc-git.el, see

  http://lists.gnu.org/archive/html/emacs-devel/2014-11/msg01274.html

and perhaps also a bug in git.

> So, did I miss something? Is there another way?

To work around, try this in your ~/.emacs:

  (add-to-list 'process-coding-system-alist
               '("[gG][iI][tT]" windows-1251 . utf-8))

* Re: Cyrillic VC Git commit messages
From: Nikolay Kudryavtsev @ 2014-11-21 13:21 UTC
To: help-gnu-emacs

> http://lists.gnu.org/archive/html/emacs-devel/2014-11/msg01274.html

Seen this before sending my question. Taking .gitconfig settings into
account would break my workaround.

> and perhaps also a bug in git.

From what I read on the msysgit wiki, it seems that the developers
consider this a feature.

> To work around, try this in your ~/.emacs:
>
> (add-to-list 'process-coding-system-alist
>              '("[gG][iI][tT]" windows-1251 . utf-8))

This would not work, because vc-git-commits-coding-system is always used
instead. And vc-git-commits-coding-system only accepts a single coding
system.

--
Best Regards,
Nikolay Kudryavtsev

* Re: Cyrillic VC Git commit messages
From: Eli Zaretskii @ 2014-11-21 14:06 UTC
To: help-gnu-emacs

> From: Nikolay Kudryavtsev <nikolay.kudryavtsev@gmail.com>
> Date: Fri, 21 Nov 2014 16:21:28 +0300
>
> > http://lists.gnu.org/archive/html/emacs-devel/2014-11/msg01274.html
> Seen this before sending my question. Taking .gitconfig settings into
> account would break my workaround.
>
> > and perhaps also a bug in git.
> From what I read on the msysgit wiki, it seems that the developers
> consider this a feature.

Could you give a pointer to that place?

> > To work around, try this in your ~/.emacs:
> >
> > (add-to-list 'process-coding-system-alist
> >              '("[gG][iI][tT]" windows-1251 . utf-8))
> This would not work, because vc-git-commits-coding-system is always used
> instead. And vc-git-commits-coding-system only accepts a single coding
> system.

Even if you set vc-git-commits-coding-system to UTF-8?

* Re: Cyrillic VC Git commit messages
From: Nikolay Kudryavtsev @ 2014-11-21 14:38 UTC
To: help-gnu-emacs

> Could you give a pointer to that place?

It's mentioned here:

  <https://github.com/msysgit/msysgit/wiki/Git-for-Windows-Unicode-Support#Disable_commit_message_transcoding>

Couldn't find a more elaborate explanation.

> Even if you set vc-git-commits-coding-system to UTF-8?

Yes. VC is not using process-coding-system-alist at all.
vc-git-commits-coding-system is used instead.

--
Best Regards,
Nikolay Kudryavtsev

* Re: Cyrillic VC Git commit messages
From: Eli Zaretskii @ 2014-11-21 15:28 UTC
To: help-gnu-emacs

> From: Nikolay Kudryavtsev <nikolay.kudryavtsev@gmail.com>
> Date: Fri, 21 Nov 2014 17:38:51 +0300
>
> > Could you give a pointer to that place?
>
> It's mentioned here.
> <https://github.com/msysgit/msysgit/wiki/Git-for-Windows-Unicode-Support#Disable_commit_message_transcoding>
> Couldn't find a more elaborate explanation.

I see nothing there that says it's a feature. I don't even see there a
confirmation that output is always in UTF-8. Can you tell how you
decided that, or where did you see that described?

Do I understand correctly that you see Cyrillic text encoded differently
when it is sent to Emacs and to the cmd.exe window? And it sends
codepage 1252 (not 1251) to the cmd.exe window? Moreover, you seem to
say that Git outputs in UTF-8 even though you customized
i18n.logoutputencoding to be windows-1251? That'd be a real bug in Git.
How about asking about that on the msysgit mailing list?

This message:

  http://osdir.com/ml/msysgit/2009-11/msg00140.html

seems to say that the problem disappears if you use --no-pager, so maybe
the bug is in Less? There are some suggestions to play with the value
of the environment variable LESSCHARSET. (This information might be
obsolete with the current versions of msysgit.)

> VC is not using process-coding-system-alist at all.
> vc-git-commits-coding-system is used instead.

That's not true. First, vc-git-commits-coding-system is used only in 2
commands in vc-git; others use process-coding-system-alist. More
importantly, even those 2 commands bind only one of the coding systems;
the other is determined by process-coding-system-alist.

Not sure this helps you, though.

Anyway, if nothing else works for you, modify vc-git.el to use 2
variables instead of just one for input and output of logs, then you can
give each variable the value you need.

* Re: Cyrillic VC Git commit messages
From: Nikolay Kudryavtsev @ 2014-11-21 16:48 UTC
To: help-gnu-emacs

> Can you tell how you decided that, or where did you see that described?

That part implies that there is some new functionality in msysgit that
does the recoding for windows cmd.exe.

> And it sends codepage 1252 (not 1251) to the cmd.exe window?

It first decodes the message with logoutputencoding, then recodes it
with windows-1252. If you set logoutputencoding to windows-1251, like I
do, it breaks cmd.exe output.

> Moreover, you seem to say that Git outputs in UTF-8 even though you
> customized i18n.logoutputencoding to be windows-1251?

For vc log the second encoding with windows-1252 does not happen.

For the commit message, git first recodes from windows-1251 to utf-8
and then recodes to commitencoding. This behavior is shared when called
from VC and cmd.exe.

> First, vc-git-commits-coding-system is used only in 2 commands in vc-git

Yeah, but that's exactly the two commands we care about here. It sets
coding-system-for-read for log and coding-system-for-write for the
commit message.

> modify vc-git.el to use 2 variables

Thought about doing this, but first decided to see if I can get any
help. Those git "hooks" do weird things to say the least.

--
Best Regards,
Nikolay Kudryavtsev
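
The kind of breakage described in this message can be reproduced outside
Git and Emacs. What follows is a minimal Python sketch and is not taken
from the thread: the sample commit subject is made up, and the sketch
only shows what text looks like when windows-1251 bytes are read as
windows-1252. It does not establish which component (Git, a wrapper
script, the pager, or the console) performs the wrong conversion.

```python
# Hypothetical sample commit subject; any Cyrillic text shows the same effect.
message = "Исправление ошибки"

# The bytes a program would emit when asked for windows-1251 output.
raw = message.encode("cp1251")

# A consumer that assumes windows-1252 sees accented Latin letters instead.
garbled = raw.decode("cp1252")

# A consumer that assumes windows-1251 recovers the original text.
restored = raw.decode("cp1251")

print(garbled)
print(restored)
```

Both decodings succeed without an error, which is why a mismatch between
these two codepages shows up as wrong characters rather than as a failure.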

* Re: Cyrillic VC Git commit messages
From: Eli Zaretskii @ 2014-11-22 13:42 UTC
To: help-gnu-emacs

> From: Nikolay Kudryavtsev <nikolay.kudryavtsev@gmail.com>
> Date: Fri, 21 Nov 2014 19:48:47 +0300
>
> > Can you tell how you decided that, or where did you see that described?
> That part implies that there is some new functionality in msysgit that
> does the recoding for windows cmd.exe.
>
> > And it sends codepage 1252 (not 1251) to the cmd.exe window?
> It first decodes the message with logoutputencoding, then recodes it
> with windows-1252. If you set logoutputencoding to windows-1251, like I
> do, it breaks cmd.exe output.
>
> > Moreover, you seem to say that Git outputs in UTF-8 even though you
> > customized i18n.logoutputencoding to be windows-1251?
> For vc log the second encoding with windows-1252 does not happen.
>
> For the commit message, git first recodes from windows-1251 to utf-8
> and then recodes to commitencoding. This behavior is shared when called
> from VC and cmd.exe.

I looked into this some more and ran some simple tests, and I'm not sure
I see the same behavior as the one you describe.

First, preliminaries: I tried this with msysGit version 1.9.4.msysgit.2
(the latest binary release) on Windows XP SP3. I cannot easily set up a
Cyrillic locale on my machine, so I tried the Latin-1 locale,
i.e. codepage 1252, instead. Also, I only have access to a Git
repository whose commit log messages are encoded in UTF-8, so that's
what I tried.

What I see is this:

 . By default, Git outputs commit log messages in UTF-8 when redirected
   to a file and to Emacs. When it writes to the console, Git seems to
   use the WriteConsoleW API after converting text from UTF-8 to UTF-16.
   The Windows console then displays that text according to the current
   codepage, converting to the supported characters if it can, and
   displaying '?' characters if not.

 . If I set i18n.logoutputencoding = windows-1252, Git outputs commit
   log messages in that encoding in all three cases: to cmd, when
   redirected to a file, and to Emacs (I tried the "C-x v L" command to
   see that).

This behavior looks reasonable and expected, given what the
documentation says. In particular, I see no differences between the
encoding Git outputs to the console and to Emacs.

Please note that there's one more player in this game, when you invoke
Git from the cmd.exe prompt: in some versions of msysGit, when you type
a Git command at the cmd.exe prompt, what gets invoked is a git.cmd
batch file supplied by msysGit, and that batch file manipulates the
console codepage. (On my system, I disabled that manipulation, because
it interferes with Git invocations from Emacs.) So it could be that
what that batch file does is one reason for the unreasonable behavior
you describe.

If git.cmd is not the culprit, or if you run Git not through such a
batch file, then perhaps you could see what encoding Git emits in the
above 3 scenarios: to console, to file, and to Emacs. Also, please tell
how you determine the encoding in each case.

P.S. I tried to verify my observations by looking at the msysGit
sources, but I cannot find the source distribution that corresponds to
the 1.9.4.msysgit.2 binaries I installed. The download page provides a
link to "Source code", but what gets downloaded by clicking that link is
binaries without sources, which AFAIU is against the GPL.

HTH
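
One possible way to answer the "how do you determine the encoding"
question is to capture the raw bytes Git emits (for example by
redirecting the log to a file) and test them. The following is a
minimal Python sketch under that assumption; the function name
guess_encoding and the sample strings are hypothetical and do not come
from the thread.

```python
def guess_encoding(data: bytes) -> str:
    """Classify raw bytes as 'ascii', 'utf-8', or 'not utf-8'."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        # Strict decoding: any byte sequence that is invalid UTF-8 raises.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "not utf-8"


# Made-up sample text standing in for captured log output.
sample = "Исправление"
print(guess_encoding(sample.encode("utf-8")))
print(guess_encoding(sample.encode("cp1251")))
print(guess_encoding(b"fix typo"))
```

A check like this can only separate UTF-8 from single-byte codepages.
It cannot tell windows-1251 from windows-1252, because the same bytes
are valid in both; for that, the decoded text has to be inspected by
eye.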