From: Taiju HIGASHI <higashi@taiju.info>
To: Eli Zaretskii <eliz@gnu.org>
Cc: larsi@gnus.org,  handa@gnu.org,  emacs-devel@gnu.org
Subject: Re: [PATCH] Add an option to not reduce vocabulary of the Japanese
Date: Tue, 07 Jun 2022 09:47:31 +0900
Message-ID: <87v8td710c.fsf@taiju.info>
In-Reply-To: <834k0x93ra.fsf@gnu.org> (Eli Zaretskii's message of "Mon, 06 Jun 2022 19:05:13 +0300")

[-- Attachment #1: Type: text/plain, Size: 402 bytes --]


>> Based on the totality of the discussion so far, would the following
>> policy be the best?
>>
>> 1. make the build-time option selectable to install a dictionary with a
>>    reduced vocabulary
>> 2. install dictionaries without reduced vocabulary by default
>> 3. make it possible to regenerate dictionaries from make or Emacs
>>    command.
>
> I think 2+3 is the best.

I attached the v3 patch.

[-- Attachment #2: v3 patch --]
[-- Type: text/x-patch, Size: 6906 bytes --]

From 615f07c0a6c565b70838834d319ff97407d1e59c Mon Sep 17 00:00:00 2001
From: Taiju HIGASHI <higashi@taiju.info>
Date: Tue, 7 Jun 2022 09:21:10 +0900
Subject: [PATCH v3] The vocabulary in ja-dic.el should not be reduced by
 default.

* configure.ac: Add "with-small-ja-dic" configure option.
* leim/Makefile.in (${leimdir}/ja-dic/ja-dic.el): Change the build
method depending on whether or not the with-small-ja-dic option is
specified.
* lisp/international/ja-dic-cnv.el (skkdic-convert-okuri-nasi): Add
required "with-reduction" argument and call
"skkdic-reduced-candidates" only when it is non-nil.  (This is a
breaking change.)
(skkdic-convert): Add "with-reduction" optional argument.
(batch-skkdic-convert): Add "--with-reduction" command line argument.
---
 configure.ac                     |  5 +++++
 leim/Makefile.in                 |  8 +++++++-
 lisp/international/ja-dic-cnv.el | 26 ++++++++++++++++++--------
 3 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/configure.ac b/configure.ac
index 313a1436b5..3e6eab94f8 100644
--- a/configure.ac
+++ b/configure.ac
@@ -491,6 +491,7 @@ OPTION_DEFAULT_ON([threads],[don't compile with elisp threading support])
 OPTION_DEFAULT_OFF([native-compilation],[compile with Emacs Lisp native compiler support])
 OPTION_DEFAULT_OFF([cygwin32-native-compilation],[use native compilation on 32-bit Cygwin])
 OPTION_DEFAULT_ON([xinput2],[don't use version 2 of the X Input Extension for input])
+OPTION_DEFAULT_OFF([small-ja-dic],[generate a small-sized Japanese dictionary])
 
 AC_ARG_WITH([file-notification],[AS_HELP_STRING([--with-file-notification=LIB],
  [use a file notification library (LIB one of: yes, inotify, kqueue, gfile, w32, no)])],
@@ -6492,6 +6493,7 @@ AS_ECHO(["  Does Emacs use -lXaw3d?                                 ${HAVE_XAW3D
   Which dumping strategy does Emacs use?                  ${with_dumping}
   Does Emacs have native lisp compiler?                   ${HAVE_NATIVE_COMP}
   Does Emacs use version 2 of the the X Input Extension?  ${HAVE_XINPUT2}
+  Should Emacs use a small-sized Japanese dictionary?     ${with_small_ja_dic}
 "])
 
 if test -n "${EMACSDATA}"; then
@@ -6590,6 +6592,9 @@ SUBDIR_MAKEFILES_IN=`echo " ${SUBDIR_MAKEFILES}" | sed -e 's| | $(srcdir)/|g' -e
 
 AC_SUBST(SUBDIR_MAKEFILES_IN)
 
+SMALL_JA_DIC=$with_small_ja_dic
+AC_SUBST(SMALL_JA_DIC)
+
 dnl You might wonder (I did) why epaths.h is generated by running make,
 dnl rather than just letting configure generate it from epaths.in.
 dnl One reason is that the various paths are not fully expanded (see above);
diff --git a/leim/Makefile.in b/leim/Makefile.in
index 3b4216c0b8..a256ca539b 100644
--- a/leim/Makefile.in
+++ b/leim/Makefile.in
@@ -32,6 +32,12 @@ leimdir = ${srcdir}/../lisp/leim
 
 EXEEXT = @EXEEXT@
 
+SMALL_JA_DIC = @SMALL_JA_DIC@
+JA_DIC_REDUCTION_OPTION =
+ifeq ($(SMALL_JA_DIC), yes)
+	JA_DIC_REDUCTION_OPTION = --with-reduction
+endif
+
 -include ${top_builddir}/src/verbose.mk
 
 # Prevent any settings in the user environment causing problems.
@@ -134,7 +140,7 @@ generate-ja-dic: ${leimdir}/ja-dic/ja-dic.el
 ${leimdir}/ja-dic/ja-dic.el: $(srcdir)/SKK-DIC/SKK-JISYO.L
 	$(AM_V_GEN)$(RUN_EMACS) -batch -l ja-dic-cnv \
 	  --eval "(setq max-specpdl-size 5000)" \
-	  -f batch-skkdic-convert -dir "$(leimdir)/ja-dic" "$<"
+	  -f batch-skkdic-convert -dir "$(leimdir)/ja-dic" $(JA_DIC_REDUCTION_OPTION) "$<"
 
 ${srcdir}/../lisp/language/pinyin.el: ${srcdir}/MISC-DIC/pinyin.map
 	$(AM_V_GEN)${RUN_EMACS} -l titdic-cnv -f pinyin-convert $< $@
diff --git a/lisp/international/ja-dic-cnv.el b/lisp/international/ja-dic-cnv.el
index 7f7c0261dc..7451773912 100644
--- a/lisp/international/ja-dic-cnv.el
+++ b/lisp/international/ja-dic-cnv.el
@@ -295,7 +295,7 @@
       (setq skkdic-okuri-nasi-entries-count (length skkdic-okuri-nasi-entries))
       (progress-reporter-done progress))))
 
-(defun skkdic-convert-okuri-nasi (skkbuf buf)
+(defun skkdic-convert-okuri-nasi (skkbuf buf with-reduction)
   (with-current-buffer buf
     (insert ";; Setting okuri-nasi entries.\n"
 	    "(skkdic-set-okuri-nasi\n")
@@ -311,7 +311,9 @@
           (setq count (1+ count))
           (progress-reporter-update progress count)
 	  (if (setq candidates
-		    (skkdic-reduced-candidates skkbuf kana candidates))
+		    (if with-reduction
+                        (skkdic-reduced-candidates skkbuf kana candidates)
+                      candidates))
 	      (progn
 		(insert "\"" kana)
 		(while candidates
@@ -322,10 +324,12 @@
       (progress-reporter-done progress))
     (insert ")\n\n")))
 
-(defun skkdic-convert (filename &optional dirname)
+(defun skkdic-convert (filename &optional dirname with-reduction)
   "Generate Emacs Lisp file from Japanese dictionary file FILENAME.
 The format of the dictionary file should be the same as SKK dictionaries.
-Saves the output as `ja-dic-filename', in directory DIRNAME (if specified)."
+Saves the output as `ja-dic-filename', in directory DIRNAME (if specified).
+When WITH-REDUCTION is non-nil, reduce the dictionary vocabulary.
+"
   (interactive "FSKK dictionary file: ")
   (let* ((skkbuf (get-buffer-create " *skkdic-unannotated*"))
 	 (buf (get-buffer-create "*skkdic-work*")))
@@ -389,7 +393,7 @@ Saves the output as `ja-dic-filename', in directory DIRNAME (if specified)."
 	(skkdic-collect-okuri-nasi)
 
 	;; Convert okuri-nasi general entries.
-	(skkdic-convert-okuri-nasi skkbuf buf)
+	(skkdic-convert-okuri-nasi skkbuf buf with-reduction)
 
 	;; Postfix
 	(with-current-buffer buf
@@ -427,15 +431,21 @@ To get complete usage, invoke:
 	(message "To convert SKK-JISYO.L into skkdic.el:")
 	(message "  %% emacs -batch -l ja-dic-cnv -f batch-skkdic-convert SKK-JISYO.L")
 	(message "To convert SKK-JISYO.L into DIR/ja-dic.el:")
-	(message "  %% emacs -batch -l ja-dic-cnv -f batch-skkdic-convert -dir DIR SKK-JISYO.L"))
-    (let (targetdir filename)
+	(message "  %% emacs -batch -l ja-dic-cnv -f batch-skkdic-convert -dir DIR SKK-JISYO.L")
+        (message "To convert SKK-JISYO.L into skkdic.el with a reduced dictionary vocabulary:")
+        (message "  %% emacs -batch -l ja-dic-cnv -f batch-skkdic-convert --with-reduction SKK-JISYO.L"))
+    (let (targetdir filename with-reduction)
       (if (string= (car command-line-args-left) "-dir")
 	  (progn
 	    (setq command-line-args-left (cdr command-line-args-left))
 	    (setq targetdir (expand-file-name (car command-line-args-left)))
 	    (setq command-line-args-left (cdr command-line-args-left))))
+      (if (string= (car command-line-args-left) "--with-reduction")
+          (progn
+	    (setq with-reduction t)
+	    (setq command-line-args-left (cdr command-line-args-left))))
       (setq filename (expand-file-name (car command-line-args-left)))
-      (skkdic-convert filename targetdir)))
+      (skkdic-convert filename targetdir with-reduction)))
   (kill-emacs 0))
 
 
-- 
2.36.1


[-- Attachment #3: Type: text/plain, Size: 306 bytes --]


I thought that if the option were named after "reducing vocabulary",
some people might wonder what that means. So, to convey the intent more
clearly, I renamed the configure option to with-small-ja-dic and changed
its description to match.

Please let me know if the original name is better.
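
As a rough illustration of how the with-small-ja-dic setting is meant to
flow through the build, the selection logic in leim/Makefile.in can be
paraphrased in shell. This is only a sketch: the function names
ja_dic_reduction_option and ja_dic_convert_args and the example paths
are made up for illustration and are not part of the patch. It also
shows the argument order that batch-skkdic-convert parses: -dir DIR
first, then the optional --with-reduction, then the dictionary file.

```shell
#!/bin/sh
# Sketch only: mirrors the ifeq block added to leim/Makefile.in.
# ja_dic_reduction_option is a hypothetical helper, not part of the patch.
ja_dic_reduction_option() {
  # $1 stands in for the value configure substitutes for @SMALL_JA_DIC@.
  if [ "$1" = "yes" ]; then
    printf '%s\n' "--with-reduction"
  fi
}

# Print the converter arguments in the order batch-skkdic-convert parses
# them: -dir DIR, then the optional --with-reduction, then the dictionary.
ja_dic_convert_args() {
  small_ja_dic=$1
  target_dir=$2
  dictionary=$3
  option=$(ja_dic_reduction_option "$small_ja_dic")
  # ${option:+ ...} expands to nothing when no reduction was requested.
  printf '%s\n' "-f batch-skkdic-convert -dir $target_dir${option:+ $option} $dictionary"
}

# Default build (full vocabulary), then a --with-small-ja-dic build.
ja_dic_convert_args no  lisp/leim/ja-dic leim/SKK-DIC/SKK-JISYO.L
ja_dic_convert_args yes lisp/leim/ja-dic leim/SKK-DIC/SKK-JISYO.L
```

Only the second call includes --with-reduction, which corresponds to
configuring with --with-small-ja-dic.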

Thanks,
-- 
Taiju
