From: Eli Zaretskii
Newsgroups: gmane.emacs.help
Subject: Re: Grep Japanese characters
Date: Thu, 12 Jul 2018 16:27:00 +0300
Message-ID: <83601kfuu3.fsf@gnu.org>
References: <20180712.080255.586725992291613595.tkk@misasa.okayama-u.ac.jp> <83a7qxfa7r.fsf@gnu.org>
To: help-gnu-emacs@gnu.org
In-reply-to: (message from Yuri Khan on Thu, 12 Jul 2018 12:05:51 +0700)

> From: Yuri Khan
> Date: Thu, 12 Jul 2018 12:05:51 +0700
> Cc: help-gnu-emacs
>
> On Thu, Jul 12, 2018 at 9:41 AM Eli Zaretskii wrote:
>
> > You cannot pass UTF-8 encoded parameters to sub-programs on
> > MS-Windows. You can only use the encoding of your system codepage.
> > Sorry, it's an MS-Windows limitation.
>
> That’s not entirely accurate: using the CreateProcessW API, you could
> pass UTF-16. However, in order to make full use of arguments passed
> that way, the sub-program needs to forgo the normal “int main(int,
> char**)” signature and use “int _wmain(int, wchar_t**)”, or to call
> GetCommandLineW and parse the returned UTF-16 string. A sub-program
> that accepts arguments via the usual ‘main’ function will be limited
> to characters that are representable in the current codepage.

Not only does the sub-program need to use _wmain instead of main, it
must also internally use wchar_t data type instead of char for text
strings. Ports of GNU and Unix software generally won't do that, so
passing UTF-16 encoded text to them is not really useful, with a few
very rare exceptions.
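
The codepage limit discussed above can be illustrated with a small
portable sketch. It does not call any Windows API. The two helper
functions are hypothetical models, written only for this example: one
stands in for a program that receives its arguments through a
char-based main, the other for a program that receives UTF-16. The
codepages cp932 (Japanese) and cp1252 (Western European) are
assumptions chosen for illustration, and the replacement of
unrepresentable characters by "?" is a simplification of what the real
conversion does. (For reference, the Microsoft C runtime documents the
wide entry point under the name wmain.)

```python
def as_seen_by_narrow_main(arg: str, codepage: str) -> str:
    """Hypothetical model of an argument arriving via a char-based main().

    The text is squeezed through the given codepage; characters the
    codepage cannot represent are replaced by '?' (a simplification).
    """
    narrowed = arg.encode(codepage, errors="replace")
    return narrowed.decode(codepage)


def as_seen_by_wide_main(arg: str) -> str:
    """Hypothetical model of an argument arriving as UTF-16 (wchar_t)."""
    return arg.encode("utf-16-le").decode("utf-16-le")


pattern = "日本語"  # sample search pattern, not taken from the thread

# Japanese system codepage: the pattern is representable and survives.
print(as_seen_by_narrow_main(pattern, "cp932"))

# Western codepage: every character is lost before the program sees it.
print(as_seen_by_narrow_main(pattern, "cp1252"))

# UTF-16 path: nothing is lost, but only if the program itself works
# with wide strings from the entry point onward.
print(as_seen_by_wide_main(pattern))
```

Under these assumptions the first and third calls return the pattern
unchanged, while the second returns only question marks, which is why a
search for Japanese text fails when the system codepage cannot
represent it.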