From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#23076: 24.5; vc-git: add a new variable for log output coding system
Date: Mon, 04 Apr 2016 18:22:30 +0300
Message-ID: <83pou5o0uh.fsf@gnu.org>
References: <56EFE033.7080900@gmail.com> <56F04527.6010901@gmail.com> <83fuv4s4cr.fsf@gnu.org> <57017E45.7050605@gmail.com>
In-reply-to: <57017E45.7050605@gmail.com> (message from Nikolay Kudryavtsev on Sun, 3 Apr 2016 23:34:13 +0300)
To: Nikolay Kudryavtsev
Cc: 23076@debbugs.gnu.org

> From: Nikolay Kudryavtsev
> Cc: 23076@debbugs.gnu.org
> Date: Sun, 3 Apr 2016 23:34:13 +0300
>
> Hello Eli.
>
> Just to explain the underlying issue.

Thanks.

> With emacs -Q try committing to the same repository by copy-pasting
> the previous commit message. Then do git log from shell. Your commit
> message would get broken.
>
> This happens because git on Windows expects the commit message to be
> in your Windows "language for non-Unicode programs" encoding. Then it
> recodes from it to utf-8.

I think this conclusion is wrong.  The real reason for the problem is
that Emacs on Windows invokes subordinate programs in a way that
allows non-ASCII characters in the command-line arguments to be
encoded only in the system codepage.  And Emacs uses the -m
command-line argument to pass the commit log message to Git.

IOW, the problem is not with Git, the problem is with how Emacs on
Windows invokes it.  (For complicated reasons I won't go into, this
general problem cannot be easily fixed in Emacs.)

So any non-ASCII text encoded in some encoding other than the current
system codepage will become garbled even before it gets to Git.

> So, to be able to commit in russian we:
> 1. Change language for non-Unicode programs to russian.
> 2. (setq vc-git-commits-coding-system 'windows-1251)

This solution doesn't really work, for the reasons explained above.

> After doing this, commiting in russian would work. But now our C-x v l is broken.

"C-x v l" is broken because it uses the same value of
vc-git-commits-coding-system to read what Git outputs, whereas Git
outputs in UTF-8.

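
To illustrate the "C-x v l" mismatch just described, here is a small
sketch in Python (not Emacs Lisp, and not code from vc-git.el).  The
helper decode_git_output and the sample message are made up for the
illustration; the only assumption carried over from the discussion is
that Git emits the log in UTF-8 while the reader decodes it with
windows-1251 (called cp1251 in Python).

```python
# Illustration only: Python stands in for the decoding step that Emacs
# performs on Git's output.  decode_git_output is a hypothetical helper,
# not part of Emacs or Git.

def decode_git_output(raw: bytes, coding_system: str) -> str:
    """Decode bytes received from Git using the given coding system."""
    return raw.decode(coding_system)

message = "Привет"                     # a short Russian log message
git_output = message.encode("utf-8")   # assumption: Git prints the log in UTF-8

# Decoding with the codepage used for committing garbles the text ...
garbled = decode_git_output(git_output, "cp1251")
# ... while decoding with the encoding Git actually used recovers it.
correct = decode_git_output(git_output, "utf-8")

print(repr(garbled))
print(repr(correct))
```

Each Cyrillic letter occupies two bytes in UTF-8, so the wrongly
decoded string comes out twice as long as the original, with every
byte shown as a separate windows-1251 character.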
> We can either fix it by setting logoutputencoding in git, but this
> would break git log outside of emacs, or add a new variable to vc,
> and that's what I want.

I don't think this is the right solution, see below.

> That's a relatively recent change in git, from 2013 or 2014, so if
> you're using some really old version, everything might just work out
> of box.

I have Git 2.8.0, the latest official release.

Since the problem is (a) specific to MS-Windows, and (b) related to
encoding the command-line arguments, the solution should target the
root cause and nothing else, IMO.  Introducing a separate variable
that users would need to configure therefore does not sound like the
best idea.  Moreover, on MS-Windows any value of that additional
variable that is not exactly equal to the current system codepage
will simply fail to work.

So instead, I can suggest one of the following alternatives, to be
done only when invoking Git to commit on MS-Windows:

  1) ignore vc-git-commits-coding-system and always encode the
     command-line arguments using the system locale (in your case,
     codepage 1251); or

  2) put the log message in a temporary file, encoded in
     vc-git-commits-coding-system, then use -F instead of -m; the
     rest of the command-line arguments will be encoded in the system
     locale's codepage

The 1st solution is essentially what you wanted, but without the need
to introduce an additional variable or ask the users to configure it.

The 2nd solution is somewhat slower, but it is better, because it
allows writing log messages using any characters, not just those
representable in the current codepage.  Note that it still doesn't
solve all the problems with non-ASCII characters, because those could
be in the "author" or any of the other arguments with which we call
Git, such as the names of the files whose changes are to be committed
(as Emacs does support arbitrary characters in file names).

Comments?
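
For concreteness, alternative 2 can be sketched roughly as follows.
This is Python rather than Emacs Lisp, and it is only an illustration
under stated assumptions, not a proposed patch: the helper names
build_commit_command and representable are hypothetical, the sketch
only builds the argument list and never runs Git, and "utf-8" stands
in for whatever vc-git-commits-coding-system is set to.

```python
import os
import tempfile

def build_commit_command(message: str, coding_system: str = "utf-8"):
    """Write MESSAGE to a temporary file and return (argv, path).

    The message travels in the file, encoded in coding_system, so it
    never has to pass through the command line; only "-F <path>" does.
    """
    fd, path = tempfile.mkstemp(prefix="commit-msg-", suffix=".txt")
    with os.fdopen(fd, "wb") as f:
        f.write(message.encode(coding_system))
    return ["git", "commit", "-F", path], path

def representable(text: str, codepage: str) -> bool:
    """Return True if TEXT can be encoded in CODEPAGE without loss."""
    try:
        text.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False

# A message mixing Russian and Japanese cannot be encoded in codepage
# 1251, so it could not be passed via -m under alternative 1.
message = "Исправление: поддержка 日本語"
argv, path = build_commit_command(message)
try:
    with open(path, "rb") as f:
        stored = f.read().decode("utf-8")
    print(argv[:3], stored == message, representable(message, "cp1251"))
finally:
    os.remove(path)  # the caller is responsible for cleaning up
```

The temporary file path itself is still a command-line argument, so
the caveat about non-ASCII characters in the other arguments applies
to it as well.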