From mboxrd@z Thu Jan 1 00:00:00 1970
From: Agustin Martin
Newsgroups: gmane.emacs.devel
Subject: Better encoding handling for aspell in ispell.el and flyspell.el
Date: Tue, 13 Nov 2007 14:04:01 +0100
Message-ID: <20071113130400.GA11508@agmartin.aq.upm.es>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="5mCyUwZo2JvN/JJP"
To: emacs-devel@gnu.org

--5mCyUwZo2JvN/JJP
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Hi,

There are some problems with the way Emacs {ispell,flyspell}.el
currently handle encodings for aspell.

For ispell.el, when the dicts list is created from the installed aspell
dicts, an --encoding string is added, but this is not done for the
default entries that have no equivalent in the parsed list. For that
reason no encoding is forced for them, and it is selected according to
the current locale, which might not match the dict encoding. For
instance, that is not done for castellano8, the old name for 8-bit
Spanish, which is not usually shipped as an aspell alias.

The attached 'ispell.el_aspell-encoding.diff' patch tries to address
this problem by adding the --encoding string when the ispell process is
started, so it is done for all dicts.

ispell.el patch [ispell.el_aspell-encoding.diff] proposed changelog entries
------------------- 8< ----------------------------------------------------
(ispell-aspell-find-dictionary): Do not set aspell encoding here.
(ispell-start-process): Explicitly set encoding here if we are using aspell.
------------------- 8< ----------------------------------------------------

The other call that should involve --encoding (currently not set) is in
flyspell.el (flyspell-large-region) function. Here, the proposed patch
also makes sure that communication with the process is done in the dict
encoding (since this only wraps the ispell process I just put a generic
regexp here) and addresses some other issues, like using
ispell-current-{personal-,}dictionary and making sure no "-d" string is
added if one is already present. Things are also rearranged in a way
closer to that of (ispell-start-process).

flyspell.el patch [flyspell.el_aspell-encoding.diff] proposed changelog entries
------------------- 8< ----------------------------------------------------
(flyspell-large-region):
 - Explicitly set encoding if we are using aspell, for process and
   for communication with the process.
 - Do not add "-d" string if one is already present.
 - Use ispell-current-dictionary and
   ispell-current-personal-dictionary.
 - Reorganize code.
------------------- 8< ----------------------------------------------------

-- 
Agustin

--5mCyUwZo2JvN/JJP
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="flyspell.el_aspell-encoding.diff"

--- flyspell.el.orig 2007-11-12 14:05:54.000000000 +0100
+++ flyspell.el 2007-11-13 13:51:50.000000000 +0100
@@ -1531,29 +1531,42 @@
       (if flyspell-issue-message-flag (message "Checking region..."))
       (set-buffer curbuf)
       (ispell-check-version)
-      (let ((c (apply 'ispell-call-process-region beg
-                      end
-                      ispell-program-name
-                      nil
-                      buffer
-                      nil
-                      (if ispell-really-aspell "list" "-l")
-                      (let (args)
-                        ;; Local dictionary becomes the global dictionary in use.
-                        (if ispell-local-dictionary
-                            (setq ispell-dictionary ispell-local-dictionary))
-                        (setq args (ispell-get-ispell-args))
-                        (if ispell-dictionary ; use specified dictionary
-                            (setq args
-                                  (append (list "-d" ispell-dictionary) args)))
-                        (if ispell-personal-dictionary ; use specified pers dict
-                            (setq args
-                                  (append args
-                                          (list "-p"
-                                                (expand-file-name
-                                                 ispell-personal-dictionary)))))
-                        (setq args (append args ispell-extra-args))
-                        args))))
+      ;; Local dictionary becomes the global dictionary in use.
+      (setq ispell-current-dictionary
+            (or ispell-local-dictionary ispell-dictionary))
+      (setq ispell-current-personal-dictionary
+            (or ispell-local-pdict ispell-personal-dictionary))
+      (let ((args (ispell-get-ispell-args))
+            (encoding (ispell-get-coding-system))
+            c)
+        (if (and ispell-current-dictionary ; use specified dictionary
+                 (not (member "-d" args))) ; only define if not overridden
+            (setq args
+                  (append (list "-d" ispell-current-dictionary) args)))
+        (if ispell-current-personal-dictionary ; use specified pers dict
+            (setq args
+                  (append args
+                          (list "-p"
+                                (expand-file-name
+                                 ispell-current-personal-dictionary)))))
+        (setq args (append args ispell-extra-args))
+        (if (and ispell-really-aspell
+                 ispell-aspell-supports-utf8)
+            (setq args
+                  (append args
+                          (list
+                           (concat "--encoding="
+                                   (symbol-name
+                                    encoding))))))
+        (let ((process-coding-system-alist (list (cons "\\.*" encoding))))
+          (setq c (apply 'ispell-call-process-region beg
+                         end
+                         ispell-program-name
+                         nil
+                         buffer
+                         nil
+                         (if ispell-really-aspell "list" "-l")
+                         args)))
         (if (eq c 0)
             (progn
               (flyspell-process-localwords buffer)

--5mCyUwZo2JvN/JJP
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="ispell.el_aspell-encoding.diff"

--- ispell.el.orig 2007-11-12 14:06:00.000000000 +0100
+++ ispell.el 2007-11-13 13:50:34.000000000 +0100
@@ -981,7 +981,7 @@
           "[^[:alpha:]]"
           (regexp-opt otherchars)
           t ; We can't tell, so set this to t
-          (list "-d" dict-name "--encoding=utf-8")
+          (list "-d" dict-name)
           nil ; aspell doesn't support this
           ;; Here we specify the encoding to use while communicating with
           ;; aspell. This doesn't apply to command line arguments, so
@@ -2508,6 +2508,13 @@
                 (append args
                         (list "-p"
                               (expand-file-name ispell-current-personal-dictionary)))))
+      (if (and ispell-really-aspell
+               ispell-aspell-supports-utf8)
+          (setq args
+                (append args
+                        (list
+                         (concat "--encoding="
+                                 (symbol-name (ispell-get-coding-system)))))))
       (setq args (append args ispell-extra-args))
 
       ;; Initially we don't know any buffer's local words.

--5mCyUwZo2JvN/JJP--
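
For readers who want to see the argument handling in isolation, here is
a minimal Emacs Lisp sketch of the idea both patches share: add "-d"
only when the dictionary's own arguments do not already carry one,
append "-p" for a personal dictionary, and for aspell append an explicit
--encoding derived from the dictionary's coding system. The function
name my-spell-build-args and its parameters are hypothetical and exist
only for this illustration; it is not the code from either patch, it
leaves out ispell-extra-args, and it does not check whether aspell
accepts a given Emacs coding-system name.

```elisp
;; Hypothetical helper for illustration only; not part of ispell.el
;; or flyspell.el.
(defun my-spell-build-args (base-args dictionary personal-dict
                                      coding-system aspell-p)
  "Return an argument list for a spell-checker process.
BASE-ARGS are the arguments attached to the dictionary entry.
DICTIONARY and PERSONAL-DICT are strings or nil.
CODING-SYSTEM is a symbol naming the dictionary encoding.
ASPELL-P is non-nil when the program is aspell."
  (let ((args (copy-sequence base-args)))
    ;; Add "-d DICTIONARY" only if BASE-ARGS does not already select one.
    (when (and dictionary (not (member "-d" args)))
      (setq args (append (list "-d" dictionary) args)))
    ;; Personal dictionary, expanded to an absolute file name.
    (when personal-dict
      (setq args (append args
                         (list "-p" (expand-file-name personal-dict)))))
    ;; For aspell, state the encoding instead of relying on the locale.
    (when (and aspell-p coding-system)
      (setq args (append args
                         (list (concat "--encoding="
                                       (symbol-name coding-system))))))
    args))

;; Illustrative calls (the dictionary names and "-B" are sample values):
(my-spell-build-args '("-B") "castellano8" nil 'iso-8859-1 t)
;; => ("-d" "castellano8" "-B" "--encoding=iso-8859-1")

(my-spell-build-args '("-B" "-d" "castellano") "castellano8" nil 'iso-8859-1 t)
;; => ("-B" "-d" "castellano" "--encoding=iso-8859-1")
```

The second call shows the "-d" guard: the entry already names a
dictionary, so none is prepended, while the encoding is still forced.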