From: Agustin Martin <agustin.martin@hispalinux.es>
To: 13639@debbugs.gnu.org
Subject: bug#13639: [emacs] ispell.el: hunspell dicts autodetection under Emacs.
Date: Wed, 20 Feb 2013 18:50:45 +0100
Message-ID: <20130220175045.GA20958@agmartin.aq.upm.es>
In-Reply-To: <20130116122509.GA2209@omega.in.herr-schmitt.de>


On Thu, Jan 17, 2013 at 09:36:09PM +0200, Eli Zaretskii wrote:
> > On Thu, Jan 17, 2013 at 08:42:58PM +0200, Eli Zaretskii wrote:
> > > > Date: Thu, 17 Jan 2013 19:12:34 +0100
> > > > From: Agustin Martin <agustin.martin@hispalinux.es>
> > > > 
> > > > Sorry, I should have written WORDCHARS.
> > > 
> > > Why do we need that?
> > 
> > This is what ispell.el calls otherchars. Parsing WORDCHARS ensures that
> > both hunspell and ispell.el think about the same characters in that
> > category.
> 
> I think you are mistaken, that's not my reading of hunspell(4).

Sorry for the late reply.

(I am opening a new thread specifically about hunspell dictionary
autodetection, using the newly cloned bug report #13639 for it.)

Although the WORDCHARS description in hunspell(4)

WORDCHARS characters
   WORDCHARS extends tokenizer of Hunspell command line interface
   with additional word character. For example, dot, dash, n-dash, numbers,
   percent sign are word character in Hungarian.

is too biased toward Hungarian and does not mention the usual apostrophe,
AFAIK it refers mostly to the same thing as 'otherchars', although hunspell
may also accept those characters in positions other than the middle of a word.
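
As a rough illustration of that mapping (a sketch only, not the attached
code): the hypothetical helper `my-wordchars-to-otherchars' below takes the
text of an affix file and turns its WORDCHARS line, if there is one, into a
regexp that could serve as otherchars. The helper name and the sample affix
text are made up for this example.

```elisp
;; Sketch only: `my-wordchars-to-otherchars' is a hypothetical helper,
;; not part of ispell.el.
(defun my-wordchars-to-otherchars (affix-text)
  "Return an otherchars regexp built from the WORDCHARS line in AFFIX-TEXT.
Return the empty string when AFFIX-TEXT has no WORDCHARS line."
  (if (string-match "^WORDCHARS[ \t]+\\([^ \t\r\n]+\\)" affix-text)
      ;; Split the value into one-character strings and let
      ;; `regexp-opt' build a regexp matching any one of them.
      (regexp-opt (split-string (match-string 1 affix-text) "" t))
    ""))

;; Invented affix text: digits and the apostrophe are word characters.
(my-wordchars-to-otherchars "SET UTF-8\nWORDCHARS 0123456789'\n")
```

Returning "" when no WORDCHARS line is present means only the
"[[:alpha:]]" casechars define what counts as part of a word.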

The good news is that I have started working on hunspell dictionary
autodetection. For those curious, I am attaching my initial test suite. I am
currently integrating this into ispell.el (unfortunately slowly, due to time
constraints).

-- 
Agustin

[-- Attachment #2: hunspell-autodetect.el --]
[-- Type: text/plain, Size: 4540 bytes --]

(require 'ispell)

(setq ispell-debug t)
(setq ispell-program-name "hunspell")

(setq ispell-hunspell-dict-paths-alist nil)
(setq ispell-hunspell-dictionary-alist nil)

(defun ispell-print-if-debug (string)
  "Display STRING with `message' when `ispell-debug' is non-nil."
  (if ispell-debug
      (message "%s" string)))

(defun ispell-replace-dictionary-entry (dicts-alist new-entry)
  "Replace the entry in DICTS-ALIST whose name matches NEW-ENTRY.
Return the resulting list in the original order; DICTS-ALIST itself is
not modified.
Mostly intended to play with `ispell-dictionary-alist' and friends."
  ;; `mapcar' avoids calling `add-to-list' on a let-bound variable,
  ;; which reverses the list and breaks under lexical binding.
  (mapcar (lambda (entry)
	    (if (string= (car new-entry) (car entry))
		new-entry
	      entry))
	  dicts-alist))

(defun ispell-parse-hunspell-affix-file (dict-name)
  "Parse the hunspell affix file for DICT-NAME.
The file is located through `ispell-hunspell-dict-paths-alist'.
Return a list in `ispell-dictionary-alist' format."
  (let* ((path (cadr (assoc dict-name ispell-hunspell-dict-paths-alist)))
	 (affix-file (concat path dict-name ".aff")))
    (unless path
      (error "No matching entry for %s" dict-name))
    (if (file-exists-p affix-file)
	(with-temp-buffer
	  (insert-file-contents affix-file)
	  (let (otherchars-string otherchars-list)
	    (setq otherchars-string
		  (save-excursion
		    ;; `beginning-of-buffer' is meant for interactive use only.
		    (goto-char (point-min))
		    (if (search-forward-regexp "^WORDCHARS +" nil t)
			(buffer-substring (point)
					  (line-end-position)))))
	    ;; Remove trailing whitespace and extra stuff. Make list if non-nil.
	    (setq otherchars-list
		  (if otherchars-string
		      (split-string
		       (if (string-match " +.*$" otherchars-string)
			   (replace-match "" nil nil otherchars-string)
			 otherchars-string)
		       "" t)))

	    ;; Fill dict entry
	    (list dict-name
		  "[[:alpha:]]"
		  "[^[:alpha:]]"
		  (if otherchars-list
		      (regexp-opt otherchars-list)
		    "")
		  t                      ;; many-otherchars-p: We can't tell, set to t
		  (list "-d" dict-name)
		  nil                    ;; extended-char-mode: not supported by hunspell
		  'utf-8)))
      (error "File \"%s\" not found" affix-file))))

(defun ispell-find-hunspell-dictionaries ()
  "Parse installed hunspell dictionaries."
  (let ((hunspell-found-dicts
	 (split-string
	  (with-temp-buffer
	    (ispell-call-process ispell-program-name
				 null-device
				 t
				 nil
				 "-D")
	    (buffer-string))
	  "[\n\r]+"
	  t))
	hunspell-default-dict
	hunspell-default-dict-entry)
    (dolist (dict hunspell-found-dicts)
      (let* ((full-name (file-name-nondirectory dict))
	     (path      (file-name-directory dict))
	     (basename  (file-name-sans-extension full-name)))
	(if (string-match "\\.aff$" dict)
	    ;; Found default dictionary
	    (if hunspell-default-dict
		(error "Default dict already defined as %s, not using %s"
		       (car hunspell-default-dict) dict)
	      (setq hunspell-default-dict (list basename path)))
	  (if (and (not (assoc basename ispell-hunspell-dict-paths-alist))
		   (file-exists-p (concat dict ".aff")))
	      ;; Entry has an associated .aff file and no previous value.
	      (progn
		(ispell-print-if-debug
		 (format "++ dict-entry:%s name:%s basename:%s path:%s aff:%s"
			 dict full-name basename path (concat dict ".aff")))
		(add-to-list 'ispell-hunspell-dict-paths-alist
			     (list basename path)))
	    (ispell-print-if-debug
	     (format "-- Skipping %s" dict))))))
    ;; Parse values for default dictionary.
    (setq hunspell-default-dict (car hunspell-default-dict))
    (setq hunspell-default-dict-entry
	  (ispell-parse-hunspell-affix-file hunspell-default-dict))
    ;; Create an alist of found dicts with only names, except for default dict.
    (setq ispell-hunspell-dictionary-alist
	  (list (append (list nil) (cdr hunspell-default-dict-entry))))
    (dolist (dict (mapcar 'car ispell-hunspell-dict-paths-alist))
      (if (string= dict hunspell-default-dict)
	  (add-to-list 'ispell-hunspell-dictionary-alist
		       hunspell-default-dict-entry)
	(add-to-list 'ispell-hunspell-dictionary-alist
		     (list dict))))))

(ispell-find-hunspell-dictionaries)

(setq mylang "en_US")

(message "-- For selected language \"%s\" before: %s"
	 mylang
	 (assoc mylang ispell-hunspell-dictionary-alist))

(or (cadr (assoc mylang ispell-hunspell-dictionary-alist))
    (let ((dict-entry (ispell-parse-hunspell-affix-file mylang)))
      (setq ispell-hunspell-dictionary-alist
            (ispell-replace-dictionary-entry ispell-hunspell-dictionary-alist
                                             dict-entry))))

(message "-- For selected language \"%s\" after: %s"
	 mylang
	 (assoc mylang ispell-hunspell-dictionary-alist))


