From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Abrahamsen
Newsgroups: gmane.emacs.bugs
Subject: bug#34862: 27.0.50; Trying to update pinyin.map
Date: Thu, 14 Mar 2019 22:58:14 -0700
Message-ID: <87o96cbrwp.fsf@ericabrahamsen.net>
References: <87zhpxyvls.fsf@ericabrahamsen.net> <83ftro20gt.fsf@gnu.org>
In-Reply-To: <83ftro20gt.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 15 Mar 2019 07:03:30 +0200")
Mime-Version: 1.0
Content-Type: text/plain
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)
To: Eli Zaretskii
Cc: 34862@debbugs.gnu.org

On 03/15/19 07:03 AM, Eli Zaretskii wrote:
>> From: Eric Abrahamsen
>> Date: Thu, 14 Mar 2019 14:49:51 -0700
>>
>>
>> As discussed in bug#34215, I'm trying to update the
>> romanization-to-Chinese-character mapping in the
>> file ./leim/MISC-DIC/pinyin.map to use the more complete mapping
>> provided by the Google pinyin input method, licensed under Apache 2.0.
>> This expands the number of characters recognized by Emacs from around
>> 7,000 to around 17,000. (And increases the size of the mapping file from
>> 18K to 53K.)
>>
>> I'm running into encoding problems when adding the new characters --
>> Emacs says some of the characters can't be written using the existing
>> coding system. The original file has an encoding cookie reading coding:
>> cn-gb-2312, and describing the coding system gives me:
>>
>> chinese-iso-8bit-dos (alias: cn-gb-2312-dos euc-china-dos euc-cn-dos
>> cn-gb-dos gb2312-dos)
>>
>> The characters *can* be encoded using gb18030, and of course utf8. The
>> wikipedia page for gb18030 describes gb2312 as "legacy"[1], and says
>> gb18030 is a superset of 2312.
>>
>> Is there any reason not to go straight to utf8 for this file? If that's
>> not okay, would gb18030 be acceptable?
>
> I'm not sure I understand the encoding of which file would you like to
> change? Could you please clarify?

Sorry, I'm trying to add more characters to ./leim/MISC-DIC/pinyin.map,
which is encoded as chinese-iso-8bit-dos, and it can't accept the new
characters with that current encoding. That's the file I'd like to
change.

Thanks,
Eric
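
The encoding problem described above can be reproduced outside Emacs. The following is only a minimal sketch, not the procedure used in this thread: it assumes Python's `gb2312` codec is a reasonable stand-in for Emacs's chinese-iso-8bit, and the sample characters and helper names (`encodable`, `unencodable_chars`) are hypothetical choices for illustration, not taken from pinyin.map.

```python
def encodable(text: str, codec: str) -> bool:
    """Return True if every character in text can be encoded with codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def unencodable_chars(text: str, codec: str) -> list[str]:
    """List the distinct characters in text that codec cannot represent."""
    return sorted({ch for ch in text if not encodable(ch, codec)})


# Assumed sample: U+4E2D is part of GB2312, U+9555 was only added in
# later standards (GBK / GB18030), so the legacy codec rejects it.
sample = "\u4e2d\u9555"

for codec in ("gb2312", "gb18030", "utf-8"):
    print(codec, unencodable_chars(sample, codec))
```

Running a check like this over the candidate mapping would show which new characters force a move away from the legacy coding system; gb18030 and utf-8 should both accept the full sample.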