unofficial mirror of emacs-devel@gnu.org 
From: Eli Zaretskii <eliz@gnu.org>
To: Taiju HIGASHI <higashi@taiju.info>, Kenichi Handa <handa@gnu.org>
Cc: emacs-devel@gnu.org
Subject: Re: [PATCH] Add an option to not reduce vocabulary of the Japanese
Date: Fri, 03 Jun 2022 09:12:34 +0300
Message-ID: <8335gme0ml.fsf@gnu.org>
In-Reply-To: <87r146o2rt.fsf@taiju.info> (message from Taiju HIGASHI on Fri, 03 Jun 2022 12:16:06 +0900)

> From: Taiju HIGASHI <higashi@taiju.info>
> CC: higashi@taiju.info
> Date: Fri, 03 Jun 2022 12:16:06 +0900
> 
> The Japanese dictionary bundled with Emacs has a small vocabulary.
> 
> For example, to convert "なごや" to "名古屋" (Nagoya) in Kanji, I would
> enter "なご" and convert it to "名古", then enter "や" and convert it to
> "屋".
> Because the Japanese dictionary bundled with Emacs does not have
> "名古屋".
> 
> The skkdic-convert function in the ja-dic-cnv package generates the
> Japanese dictionary, but the logic includes the dictionary vocabulary
> reduction process.
> 
> So I have created a patch to add an option to skip this reduction
> process. I would be happy to receive your review and feedback.

Thank you for working on this, and for your interest in Emacs.

We don't have a lot of people on board who speak Japanese, so I've
CC'ed Kenichi Handa in the hope that he will have some comments on
your patch.

Meanwhile, would you like to start the legal paperwork of assigning to
the FSF the copyright for your changes?  Your changes are small, but
they are still borderline larger than we can accept without the
copyright assignment.  If you agree, I will send you the form to fill
out and the instructions that go with it.

> * configure.ac: Add "with-ja-dic-reduction" configure argument.

In addition to a configure-time option, I think it would be a good
idea to have a special Makefile rule to regenerate the Japanese
dictionary while skipping or not skipping the vocabulary reduction.
Is such an option available with your changes?  I think it is, but I'm
not certain.  So if needed, could you please add such an option to
leim/Makefile.in?

> +  Does Emacs reduce the Japanese dictionary?              ${with_ja_dic_reduction}

I guess this wording is better:

 Should Emacs reduce Japanese dictionary vocabulary?

> By the way, if I may be honest, I would like to remove this reduction
> process.
> 
> "名古屋" (Nagoya) [0] is the name of one of Japan's major cities and is a
> proper noun.
> 
> I don't think most people, myself included, recognize that the word is a
> composite of "名古" and "屋".
> 
> I am Japanese, so my sense may be different, but I recognize "New York"
> as one word and "Spider-man" as one word.
> In other words, instead of converting "名古" and "屋" respectively, we
> want to convert "名古屋" as it is. It is stressful to have to separate
> the words I imagine in my head from the words I use in Kanji
> conversion. I would like to reduce that frequency at least a little.
> 
> Although the skkdic-reduced-candidates function mechanically eliminates
> words that can be entered by combining them with other words, it does
> not judge the importance of words, so even frequently used words like
> "名古屋" are eliminated. That is very inconvenient.
> 
> My concern is that Emacs' standard Kanji conversion engine will be
> regarded as useless.
> Despite being based on a dictionary with a sufficient vocabulary
> (SKK-JISYO.L), it generates an inconvenient dictionary by the reduction
> process.
> Most of the people who rated Emacs' standard kanji conversion engine as
> useless are probably unaware of this fact.
> I also rated the standard Emacs kanji conversion engine as
> useless. Because I did not know that fact.
> However, when I learned the facts, I realized that this was a
> misunderstanding and that I had disrespectful feelings toward Emacs.
> This is simply a disrepute due to misunderstanding.

This is something which would need an expert to respond to.  I admit
that I don't even understand the issues you are describing, as I don't
read Kanji and don't speak Japanese.  I hope Handa-san will comment on
that.

> The reduction of dictionaries would reduce the file size by less than
> half. While significant, how important is this in today's computing
> environment?

It isn't too important, IMO.  The reduction in Emacs's memory
footprint, if that is significant, is probably more important.

> My English is not very good, so I apologize if I did not convey my
> intentions.

There's absolutely nothing wrong with your English, so no need to
apologize.

Thanks!



