From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] Add an option to not reduce vocabulary of the Japanese
Date: Fri, 03 Jun 2022 09:12:34 +0300
Message-ID: <8335gme0ml.fsf@gnu.org>
References: <87r146o2rt.fsf@taiju.info>
In-Reply-To: <87r146o2rt.fsf@taiju.info> (message from Taiju HIGASHI on Fri, 03 Jun 2022 12:16:06 +0900)
To: Taiju HIGASHI, Kenichi Handa
Cc: emacs-devel@gnu.org

> From: Taiju HIGASHI
> CC: higashi@taiju.info
> Date: Fri, 03 Jun 2022 12:16:06 +0900
>
> The Japanese dictionary bundled with Emacs has a small vocabulary.
>
> For example, to convert "なごや" to "名古屋" (Nagoya) in Kanji, I would
> enter "なご" and convert it to "名古", then enter "や" and convert it to
> "屋".
> Because the Japanese dictionary bundled with Emacs does not have
> "名古屋".
>
> The skkdic-convert function in the ja-dic-cnv package generates the
> Japanese dictionary, but the logic includes the dictionary vocabulary
> reduction process.
>
> So I have created a patch to add an option to skip this reduction
> process. I would be happy to receive your review and feedback.

Thank you for working on this, and for your interest in Emacs. We
don't have a lot of people on board who speak Japanese, so I CC
Kenichi Handa in the hope that he could have some comments on your
patch.

Meanwhile, would you like to start the legal paperwork of assigning to
the FSF the copyright for your changes? Your changes are small, but
they are still borderline larger than we can accept without the
copyright assignment. If you agree, I will send you the form to fill
and instructions to go with the form.

> * configure.ac: Add "with-ja-dic-reduction" configure argument.

In addition to a configure-time option, I think it would be a good
idea to have a special Makefile rule to regenerate the Japanese
dictionary while skipping or not skipping the vocabulary reduction.
Is such an option available with your changes? I think it is, but
I'm not certain. So if needed, could you please add such an option to
leim/Makefile.in?

> + Does Emacs reduce the Japanese dictionary? ${with_ja_dic_reduction}

I guess this wording is better:

  Should Emacs reduce Japanese dictionary vocabulary?

> By the way, if I may be honest, I would like to remove this reduction
> process.
>
> "名古屋" (Nagoya) [0] is the name of one of Japan's major cities and is a
> proper noun.
>
> I don't think most people, myself included, recognize that the word is a
> composite of "名古" and "屋".
>
> I am Japanese, so my sense may be different, but I recognize "New York"
> as one word and "Spider-man" as one word.
> In other words, instead of converting "名古" and "屋" respectively, we
> want to convert "名古屋" as it is. It is stressful to have to separate
> the words I imagine in my head from the words I use in Kanji
> conversion. I would like to reduce that frequency at least a little.
>
> Although the skkdic-reduced-candidates function mechanically eliminates
> words that can be entered by combining them with other words, it does
> not judge the importance of words, so even frequently used words like
> "名古屋" are eliminated. That is very inconvenient.
>
> My concern is that Emacs' standard Kanji conversion engine will be
> regarded as useless.
> Despite being based on a dictionary with a sufficient vocabulary
> (SKK-JISYO.L), it generates an inconvenient dictionary by the reduction
> process.
> Most of the people who rated Emacs' standard kanji conversion engine as
> useless are probably unaware of this fact.
> I also rated the standard Emacs kanji conversion engine as
> useless. Because I did not know that fact.
> However, when I learned the facts, I realized that this was a
> misunderstanding and that I had disrespectful feelings toward Emacs.
> This is simply a disrepute due to misunderstanding.

This is something which would need an expert to respond to. I admit
that I don't even understand the issues you are describing, as I don't
read Kanji and don't speak Japanese. I hope Handa-san will comment on
that.

> The reduction of dictionaries would reduce the file size by less than
> half. While significant, how important is this in today's computing
> environment?

It isn't too important, IMO. The reduction in Emacs's memory
footprint, if that is significant, is probably more important.

> My English is not very good, so I apologize if I did not convey my
> intentions.

There's absolutely nothing wrong with your English, so no need to
apologize.

Thanks!
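
For readers who, like me, do not read Kanji, the kind of reduction the quoted message describes can be pictured with a toy sketch. This is only an illustration under assumptions: it is written in Python rather than Emacs Lisp, the dictionary layout and the names `reduce_vocabulary` and `is_composable` are hypothetical, and it is not the actual logic of skkdic-reduced-candidates in ja-dic-cnv. It only shows why a word such as "名古屋" disappears when "名古" and "屋" are both present.

```python
# Toy illustration only. NOT the real ja-dic-cnv / skkdic-reduced-candidates
# logic; the data layout and function names here are hypothetical.

def is_composable(reading, word, dictionary):
    """Return True if WORD can be produced by converting two shorter
    readings one after the other (for example "なご" + "や")."""
    for i in range(1, len(reading)):
        head, tail = reading[:i], reading[i:]
        for head_word in dictionary.get(head, []):
            for tail_word in dictionary.get(tail, []):
                if head_word + tail_word == word:
                    return True
    return False


def reduce_vocabulary(dictionary):
    """Drop every candidate that can be rebuilt from two other entries.
    Note that no notion of word importance or frequency is involved."""
    reduced = {}
    for reading, candidates in dictionary.items():
        kept = [w for w in candidates
                if not is_composable(reading, w, dictionary)]
        if kept:
            reduced[reading] = kept
    return reduced


# A three-entry toy dictionary: reading -> list of Kanji candidates.
toy = {
    "なご": ["名古"],
    "や": ["屋", "矢"],
    "なごや": ["名古屋"],
}
reduced = reduce_vocabulary(toy)
```

In this toy model the entry for "なごや" is removed because "名古" followed by "屋" reproduces it, which matches the two-step input described in the quoted message. A mechanical rule of this shape has no way to tell a common proper noun from an obscure compound.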