From mboxrd@z Thu Jan 1 00:00:00 1970
From: Taiju HIGASHI
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] Add an option to not reduce vocabulary of the Japanese
Date: Tue, 07 Jun 2022 22:08:12 +0900
Message-ID: <87tu8w4o5f.fsf@taiju.info>
References: <87r142ypnq.fsf@gnu.org> <878rqa9cm8.fsf@taiju.info>
 <87k09tubfa.fsf@gnus.org> <837d5t98qq.fsf@gnu.org>
 <87h74x96em.fsf@taiju.info> <834k0x93ra.fsf@gnu.org>
 <87v8td710c.fsf@taiju.info> <83v8tc7nrm.fsf@gnu.org>
 <877d5s63yz.fsf@taiju.info>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.1 (gnu/linux)
Cc: larsi@gnus.org, handa@gnu.org, emacs-devel@gnu.org
To: Eli Zaretskii
In-Reply-To:
 <877d5s63yz.fsf@taiju.info> (Taiju HIGASHI's message of "Tue, 07 Jun 2022 21:41:08 +0900")
Xref: news.gmane.io gmane.emacs.devel:290861

--=-=-=
Content-Type: text/plain

Taiju HIGASHI writes:

>> Thanks, but I thought we all agreed that by default the build should
>> NOT reduce the vocabulary, so that use of the option would be needed
>> only if someone would want the current behavior back?  And this v3
>> does it the other way around?  Or did I miss something?
>
> The default behavior of the build process of the v3 patch does not
> reduce the vocabulary, but it reduces the vocabulary when the
> --with-small-ja-dic option is specified.
>
> However, as you pointed out in another email, I made an
> inappropriate change in ja-dic-cnv.el, so I will fix it.

I have attached the v4 patch.
Sorry for taking up so much of your time with this simple code
change.

Please check it out.

--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment;
 filename=v4-0001-The-vocabulary-in-ja-dic.el-should-not-be-reduced.patch
Content-Description: v4 patch

>From b91102b60575904bd2726e146be4b85f33161f5d Mon Sep 17 00:00:00 2001
From: Taiju HIGASHI
Date: Tue, 7 Jun 2022 21:46:14 +0900
Subject: [PATCH v4] The vocabulary in ja-dic.el should not be reduced by
 default.

* configure.ac: Add the "with-small-ja-dic" configure option.
* leim/Makefile.in (${leimdir}/ja-dic/ja-dic.el): Change the build
method depending on whether or not the with-small-ja-dic option is
specified.
* lisp/international/ja-dic-cnv.el (skkdic-convert-okuri-nasi): Add the
"no-reduction" optional argument.  When it is non-nil, generate a
Japanese dictionary without reducing the vocabulary.
(skkdic-convert): Add the "no-reduction" optional argument.
(batch-skkdic-convert): Add the "--no-reduction" command line argument.
---
 configure.ac                     |  5 +++++
 leim/Makefile.in                 |  8 +++++++-
 lisp/international/ja-dic-cnv.el | 26 ++++++++++++++++++--------
 3 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/configure.ac b/configure.ac
index 313a1436b5..3e6eab94f8 100644
--- a/configure.ac
+++ b/configure.ac
@@ -491,6 +491,7 @@ OPTION_DEFAULT_ON([threads],[don't compile with elisp threading support])
 OPTION_DEFAULT_OFF([native-compilation],[compile with Emacs Lisp native compiler support])
 OPTION_DEFAULT_OFF([cygwin32-native-compilation],[use native compilation on 32-bit Cygwin])
 OPTION_DEFAULT_ON([xinput2],[don't use version 2 of the X Input Extension for input])
+OPTION_DEFAULT_OFF([small-ja-dic],[generate a small-sized Japanese dictionary])

 AC_ARG_WITH([file-notification],[AS_HELP_STRING([--with-file-notification=LIB],
  [use a file notification library (LIB one of: yes, inotify, kqueue, gfile, w32, no)])],
@@ -6492,6 +6493,7 @@ AS_ECHO(["  Does Emacs use -lXaw3d?                                 ${HAVE_XAW3D
   Which dumping strategy does Emacs use?                  ${with_dumping}
   Does Emacs have native lisp compiler?                   ${HAVE_NATIVE_COMP}
   Does Emacs use version 2 of the the X Input Extension?  ${HAVE_XINPUT2}
+  Should Emacs use a small-sized Japanese dictionary?     ${with_small_ja_dic}
 "])

 if test -n "${EMACSDATA}"; then
@@ -6590,6 +6592,9 @@ SUBDIR_MAKEFILES_IN=`echo " ${SUBDIR_MAKEFILES}" | sed -e 's| | $(srcdir)/|g' -e

 AC_SUBST(SUBDIR_MAKEFILES_IN)

+SMALL_JA_DIC=$with_small_ja_dic
+AC_SUBST(SMALL_JA_DIC)
+
 dnl You might wonder (I did) why epaths.h is generated by running make,
 dnl rather than just letting configure generate it from epaths.in.
 dnl One reason is that the various paths are not fully expanded (see above);
diff --git a/leim/Makefile.in b/leim/Makefile.in
index 3b4216c0b8..29b9f3b2f8 100644
--- a/leim/Makefile.in
+++ b/leim/Makefile.in
@@ -32,6 +32,12 @@ leimdir = ${srcdir}/../lisp/leim

 EXEEXT = @EXEEXT@

+SMALL_JA_DIC = @SMALL_JA_DIC@
+JA_DIC_NO_REDUCTION_OPTION = --no-reduction
+ifeq ($(SMALL_JA_DIC), yes)
+  JA_DIC_NO_REDUCTION_OPTION =
+endif
+
 -include ${top_builddir}/src/verbose.mk

 # Prevent any settings in the user environment causing problems.
@@ -134,7 +140,7 @@ generate-ja-dic: ${leimdir}/ja-dic/ja-dic.el
 ${leimdir}/ja-dic/ja-dic.el: $(srcdir)/SKK-DIC/SKK-JISYO.L
 	$(AM_V_GEN)$(RUN_EMACS) -batch -l ja-dic-cnv \
 	  --eval "(setq max-specpdl-size 5000)" \
-	  -f batch-skkdic-convert -dir "$(leimdir)/ja-dic" "$<"
+	  -f batch-skkdic-convert -dir "$(leimdir)/ja-dic" $(JA_DIC_NO_REDUCTION_OPTION) "$<"

 ${srcdir}/../lisp/language/pinyin.el: ${srcdir}/MISC-DIC/pinyin.map
 	$(AM_V_GEN)${RUN_EMACS} -l titdic-cnv -f pinyin-convert $< $@
diff --git a/lisp/international/ja-dic-cnv.el b/lisp/international/ja-dic-cnv.el
index 7f7c0261dc..f0929410ea 100644
--- a/lisp/international/ja-dic-cnv.el
+++ b/lisp/international/ja-dic-cnv.el
@@ -295,7 +295,7 @@
       (setq skkdic-okuri-nasi-entries-count (length skkdic-okuri-nasi-entries))
       (progress-reporter-done progress))))

-(defun skkdic-convert-okuri-nasi (skkbuf buf)
+(defun skkdic-convert-okuri-nasi (skkbuf buf &optional no-reduction)
   (with-current-buffer buf
     (insert ";; Setting okuri-nasi entries.\n"
             "(skkdic-set-okuri-nasi\n")
@@ -311,7 +311,9 @@
           (setq count (1+ count))
           (progress-reporter-update progress count)
           (if (setq candidates
-                    (skkdic-reduced-candidates skkbuf kana candidates))
+                    (if no-reduction
+                        candidates
+                      (skkdic-reduced-candidates skkbuf kana candidates)))
               (progn
                 (insert "\"" kana)
                 (while candidates
@@ -322,10 +324,12 @@
       (progress-reporter-done progress))
     (insert ")\n\n")))

-(defun skkdic-convert (filename &optional dirname)
+(defun skkdic-convert (filename &optional dirname no-reduction)
   "Generate Emacs Lisp file from Japanese dictionary file FILENAME.
 The format of the dictionary file should be the same as SKK dictionaries.
-Saves the output as `ja-dic-filename', in directory DIRNAME (if specified)."
+Saves the output as `ja-dic-filename', in directory DIRNAME (if specified).
+When NO-REDUCTION is non-nil, do not reduce the dictionary vocabulary.
+"
   (interactive "FSKK dictionary file: ")
   (let* ((skkbuf (get-buffer-create " *skkdic-unannotated*"))
          (buf (get-buffer-create "*skkdic-work*")))
@@ -389,7 +393,7 @@ Saves the output as `ja-dic-filename', in directory DIRNAME (if specified)."
         (skkdic-collect-okuri-nasi)

         ;; Convert okuri-nasi general entries.
-        (skkdic-convert-okuri-nasi skkbuf buf)
+        (skkdic-convert-okuri-nasi skkbuf buf no-reduction)

         ;; Postfix
         (with-current-buffer buf
@@ -427,15 +431,21 @@ To get complete usage, invoke:
         (message "To convert SKK-JISYO.L into skkdic.el:")
         (message "  %% emacs -batch -l ja-dic-cnv -f batch-skkdic-convert SKK-JISYO.L")
         (message "To convert SKK-JISYO.L into DIR/ja-dic.el:")
-        (message "  %% emacs -batch -l ja-dic-cnv -f batch-skkdic-convert -dir DIR SKK-JISYO.L"))
-    (let (targetdir filename)
+        (message "  %% emacs -batch -l ja-dic-cnv -f batch-skkdic-convert -dir DIR SKK-JISYO.L")
+        (message "To convert SKK-JISYO.L into skkdic.el without reducing the dictionary vocabulary:")
+        (message "  %% emacs -batch -l ja-dic-cnv -f batch-skkdic-convert --no-reduction SKK-JISYO.L"))
+    (let (targetdir filename no-reduction)
       (if (string= (car command-line-args-left) "-dir")
           (progn
             (setq command-line-args-left (cdr command-line-args-left))
             (setq targetdir (expand-file-name (car command-line-args-left)))
             (setq command-line-args-left (cdr command-line-args-left))))
+      (if (string= (car command-line-args-left) "--no-reduction")
+          (progn
+            (setq no-reduction t)
+            (setq command-line-args-left (cdr command-line-args-left))))
       (setq filename (expand-file-name (car command-line-args-left)))
-      (skkdic-convert filename targetdir)))
+      (skkdic-convert filename targetdir no-reduction)))
   (kill-emacs 0))

-- 
2.36.1

--=-=-=
Content-Type: text/plain

-- 
Taiju

--=-=-=--