From: Naohiro Aota
To: Lars Ingebrigtsen
Cc: Andrew Hyatt, akirashinigami@gmail.com, Alex, 5990@debbugs.gnu.org
Newsgroups: gmane.emacs.bugs
Subject: bug#5990: 23.1; Cannot type the word 買います
Date: Tue, 31 Aug 2021 15:13:09 +0900
Message-ID: <20210831061309.ppeluqllz4jim5hp@naota-xeon>
In-Reply-To: <878s0oui53.fsf@gnus.org>
References: <834m6wk7d1.fsf@gnu.org> <87mvknkxeg.fsf@gmail.com> <878s0oui53.fsf@gnus.org>

On Thu, Aug 26, 2021 at 09:21:12PM +0200, Lars Ingebrigtsen wrote:
> Alex writes:
>
> >> If there are significant updates, we can import a new version for the
> >> next release, I think.
> >
> > I can't say anything on that front, but I tried updating the Japanese
> > dictionary to the latest version using skkdic-convert in
> > lisp/international/ja-dic-cnv.el and it still gave the "wrong" results for
> > 買います.

Background: I'm Japanese.

> In Emacs 28, I'm getting:

These are all still valid (possible) kanji conversions for "kaimasu".

> 魔居間す

(This one looks strange. I guess there is some mistake in this one.)

> 加居間す
> 過居間す
> 可居間す

All of these can be read as "ka" (加, 過, 可) + "ima" (居間) + "su" (す).

> かいます

The implementation of leim's kanji conversion is very simple, as shown
below. In short, it just finds the longest match in the dictionary,
whether or not the result looks strange to a Japanese reader.

    ;; Start with the whole kana string as the lookup key.
    (setq kkc-current-key (string-to-vector kkc-original-kana))
    (setq kkc-length-head (length kkc-current-key))
    (unwind-protect
    ...
          ;; Shorten the key by one character until a lookup succeeds.
          (while (not (kkc-lookup-key kkc-length-head nil first))
            (setq kkc-length-head (1- kkc-length-head)
                  first nil))

So the longest match for "kaimasu" is "kaima", which gives the results
above. To get "買います", we need to explicitly set the conversion
length to 2 with C-o/C-i.

This can be reproduced with code like this:

    (require 'kkc)  ; kkc-lookup-key is defined in kkc.el

    (let ((kkc-current-key "かいます"))
      (kkc-lookup-key 2)
      kkc-current-conversions)
    (1 "買い" "書い" "描い" "飼い" "画い" "欠い" "掻い" "嗅い" "交い" "畫い" "缺い" ...)
    ;; `-- Here, we have the "買います" result

    (let ((kkc-current-key "かいます"))
      (kkc-lookup-key 3)
      kkc-current-conversions)
    (1 "垣間" "加居間" "過居間" "可居間")

> So I'm not getting 買います here, either.
>
> But I guess we're just using whatever is in:
>
> http://openlab.ring.gr.jp/skk/skk/dic/SKK-JISYO.L
>
> (Our version was updated earlier this year.)  So is this something that
> we can fix on our side, or is it just what this dictionary says?

Well, other input methods such as mozc are much more intelligent: they
prefer "買います" over the three conversions above because "買います" is
the more plausible one.

IMHO, implementing such a complex algorithm is out of scope for leim.
Even with the simple algorithm, it can still serve as a "rescue" tool as
long as you set a proper conversion length. And we use other input
methods such as mozc, SKK, or tc.el anyway.

Thanks,

> --
> (domestic pets only, the antidote for overdose, milk.)
>    bloggy blog: http://lars.ingebrigtsen.no
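
The longest-match behaviour described above can also be sketched in
isolation. The following is only a minimal illustration, not the kkc.el
implementation: `toy-kkc-dictionary' and `toy-kkc-longest-match' are
hypothetical names, and the dictionary entries are made up for the
example rather than taken from SKK-JISYO.L.

```elisp
;; Hypothetical toy dictionary: kana reading -> candidate conversions.
;; Illustrative entries only; this is not the real SKK dictionary.
(defvar toy-kkc-dictionary
  '(("かいま" . ("垣間" "加居間"))
    ("かい"   . ("買い" "書い"))
    ("か"     . ("加" "可"))))

(defun toy-kkc-longest-match (kana)
  "Return (LENGTH . CANDIDATES) for the longest prefix of KANA found.
The lookup uses `toy-kkc-dictionary'.  Return nil if no prefix matches."
  (let ((len (length kana))
        (found nil))
    ;; Try the whole string first, then shrink the prefix one character
    ;; at a time until some dictionary entry matches.
    (while (and (> len 0) (not found))
      (let ((entry (assoc (substring kana 0 len) toy-kkc-dictionary)))
        (if entry
            (setq found (cons len (cdr entry)))
          (setq len (1- len)))))
    found))

(toy-kkc-longest-match "かいます")
;; => (3 "垣間" "加居間")
```

In this sketch the three-kana prefix かいま wins over the two-kana かい,
which is why the conversion length has to be shortened by hand before
買い shows up among the candidates.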