From: Agustin Martin <agustin.martin@hispalinux.es>
To: emacs-devel@gnu.org
Subject: Re: Ispell and unibyte characters
Date: Tue, 10 Apr 2012 21:08:03 +0200
Message-ID: <20120410190803.GA13517@agmartin.aq.upm.es>
In-Reply-To: <20120328191821.GA6266@agmartin.aq.upm.es>

[-- Attachment #1: Type: text/plain, Size: 1655 bytes --]

On Wed, Mar 28, 2012 at 09:18:21PM +0200, Agustin Martin wrote:
> On Mon, Mar 26, 2012 at 04:08:06PM -0400, Eli Zaretskii wrote:
> > > Date: Mon, 26 Mar 2012 19:39:12 +0200
> > > From: Agustin Martin <agustin.martin@hispalinux.es>
> > > 
> > > Hi Eli,
> > 
> > Thanks for responding, I was beginning to think that no one is
> > interested.  In general, I find that ispell.el is in sore need of
> > modernization; at least that's my conclusion so far from playing with
> > hunspell (with which I want to replace my aging collection of Ispell
> > and its dictionaries that I have used for many years).
> > 
> > > At least for aspell ispell.el already uses utf8 as default communication
> > > encoding and [:alpha:] as CASECHARS (and ^[:alpha:] as NOT-CASECHARS). 
> > > OTHERCHARS is guessed from aspell .dat file for given dictionary.
> > 
> > The question is, why isn't this done for any modern speller?  The only
> > one I know of that cannot handle UTF-8 is Ispell.
> 
> I think the only real remaining reason is XEmacs compatibility. AFAIK
> XEmacs does not support [:alpha:].
> 
> I thought about filtering ispell-dictionary-base-alist when used from FSF
> Emacs, so it uses [:alpha:] and still keeps compatibility. I am currently a
> bit busy, but at some point I may try this for Debian and see what happens.

For the record, I am attaching what I am currently trying: post-processing
the global dictionary list while leaving local definitions in ~/.emacs
unmodified. This should also deal with [#11200: ispell.el sets incorrect
encoding for the default dictionary]. I would like to test this a bit more
and commit it if there are no problems.

-- 
Agustin

[-- Attachment #2: ispell.el_alpha-regexp.2.diff --]
[-- Type: text/x-diff, Size: 2073 bytes --]

--- ispell.el.orig	2012-04-10 20:02:51.422092761 +0200
+++ ispell.el	2012-04-10 20:18:27.464680054 +0200
@@ -783,6 +783,12 @@
 (make-obsolete-variable 'ispell-aspell-supports-utf8
                         'ispell-encoding8-command "23.1")
 
+(defvar ispell-emacs-alpha-regexp
+  (if (string-match "^[[:alpha:]]+$" "abcde")
+      "[[:alpha:]]"
+    nil)
+  "[[:alpha:]] if Emacs supports [:alpha:] regexp, nil
+otherwise (current XEmacs does not support it).")
 
 ;;; **********************************************************************
 ;;; The following are used by ispell, and should not be changed.
@@ -1179,8 +1185,7 @@
 	       (error nil))
 	     ispell-really-aspell
 	     ispell-encoding8-command
-	     ;; XEmacs does not like [:alpha:] regexps.
-	     (string-match "^[[:alpha:]]+$" "abcde"))
+	     ispell-emacs-alpha-regexp)
 	(unless ispell-aspell-dictionary-alist
 	  (ispell-find-aspell-dictionaries)))
 
@@ -1204,8 +1209,27 @@
 			    ispell-dictionary-base-alist))
 	(unless (assoc (car dict) all-dicts-alist)
 	  (add-to-list 'all-dicts-alist dict)))
-      (setq ispell-dictionary-alist all-dicts-alist))))
+      (setq ispell-dictionary-alist all-dicts-alist))
 
+    ;; If Emacs flavor supports [:alpha:] use it for global dicts.  If
+    ;; spellchecker also supports UTF-8 via command-line option use it
+    ;; in communication.  This does not affect definitions in ~/.emacs.
+    (if ispell-emacs-alpha-regexp
+        (let (tmp-dicts-alist)
+          (dolist (adict ispell-dictionary-alist)
+            (add-to-list 'tmp-dicts-alist
+                         (list
+                          (nth 0 adict)  ; dict name
+                          "[[:alpha:]]"  ; casechars
+                          "[^[:alpha:]]" ; not-casechars
+                          (nth 3 adict)  ; otherchars
+                          (nth 4 adict)  ; many-otherchars-p
+                          (nth 5 adict)  ; ispell-args
+                          (nth 6 adict)  ; extended-character-mode
+                          (if ispell-encoding8-command
+                              'utf-8
+                            (nth 7 adict)))))
+          (setq ispell-dictionary-alist tmp-dicts-alist)))))
 
 (defun ispell-valid-dictionary-list ()
   "Return a list of valid dictionaries.
