unofficial mirror of emacs-devel@gnu.org 
From: Geoff Kuenning <geoff@cs.hmc.edu>
Cc: agustin.martin@hispalinux.es, emacs-devel@gnu.org,
	k.stevens@ieee.org, 130397@bugs.debian.org
Subject: Re: Bug 130397
Date: 29 Apr 2005 02:29:41 +0200
Message-ID: <pnifyxajuh6.fsf@bow.cs.hmc.edu>
In-Reply-To: <87sm4yt0o6.fsf@jurta.org>

For those of you who don't know, I've released ispell 3.3.00.  Having
gotten that off my plate, I'm busily working on some improvements that
will go into 3.3.01.  Number one on that list is to redo the
fixispell-a script that I whipped up a few months ago.

Juri points out:

> This approach is quite promising, but it doesn't work sufficiently well
> for non-English languages.  It loses all characters that don't belong
> to the alphabet specified in .aff file.

and:

> But there is another problem.  fixispell-a returns a list of near misses
> only for the last language in the pipe.  It would be better if it
> accumulated a list of near misses from all ispell commands in the pipe.

The former problem is best addressed using Juri's suggestion of
passing the "-w" switch to specify a superset.  In addition, in the
new release, the english.aff file includes all of Latin-1 (since
English sometimes adopts accented words and names from other
languages).  The -w switch is still needed, though, to handle things
like the apostrophe, which isn't in all non-English affix files.  I
welcome further suggestions.

The latter problem motivated me to write an entirely new program,
multispell, which does a better job of what fixispell-a attempted.
It's invoked as:

        multispell [ispell-switches] dict1 dict2 dict3

For example:

        multispell -m english deutsch francais

Multispell behaves like ispell -a, but accepts any word that any of
the mentioned dictionaries accept.  If a word is rejected, it combines
suggestions from all dictionaries.  So, for example, sending "wuld" to
the above line produces:

        & wuld 0 7 weld, wild, wold, would, Wald, wild, wund
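
[Editorial illustration, not multispell's actual code.]  The behavior
described above (accept a word if any dictionary accepts it, otherwise
concatenate every dictionary's suggestions) can be sketched in Python
roughly as follows.  The names make_checker and multi_check are
hypothetical, each "dictionary" is an in-memory stand-in for a real
ispell process, and the split of near misses between English and
German is invented to match the sample output.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical stand-in for one ispell process:
# takes a word, returns (accepted, near_misses).
Checker = Callable[[str], Tuple[bool, List[str]]]


def make_checker(words: Set[str], misses: Dict[str, List[str]]) -> Checker:
    """Build a toy checker from a word set and a table of canned near misses."""
    def check(word: str) -> Tuple[bool, List[str]]:
        if word in words:
            return True, []
        return False, list(misses.get(word, []))
    return check


def multi_check(word: str, checkers: List[Checker]) -> Tuple[bool, List[str]]:
    """Accept if ANY checker accepts; otherwise merge all suggestions in order."""
    merged: List[str] = []
    for check in checkers:
        accepted, near = check(word)
        if accepted:
            return True, []
        merged.extend(near)  # duplicates are deliberately kept here
    return False, merged


# Invented sample data, chosen only to mirror the "wuld" example.
english = make_checker({"would", "wild"}, {"wuld": ["weld", "wild", "wold", "would"]})
deutsch = make_checker({"Wald", "wild"}, {"wuld": ["Wald", "wild", "wund"]})
francais = make_checker({"oui"}, {})

checkers = [english, deutsch, francais]
print(multi_check("wuld", checkers))
print(multi_check("Wald", checkers))
```

With this toy data, "wuld" is rejected by all three checkers and the
merged list contains "wild" twice, once per dictionary that offered it.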

This brings me to a question and a discussion point.  The question is
highlighted in the above line: the word "wild" appears as a
suggestion twice, because the English and German dictionaries both
produce it.  Do people think that's a Bad Thing?  I can certainly
write code to suppress the duplicates; I'm just feeling lazy at the
moment. *grin*
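
[Editorial illustration.]  For what it's worth, suppressing the
duplicates while keeping the first occurrence's position is a small
job.  A minimal Python sketch, assuming a case-sensitive comparison
(so a capitalized and a lowercase spelling would both survive) and
using a hypothetical helper name, dedupe_suggestions:

```python
from typing import List


def dedupe_suggestions(suggestions: List[str]) -> List[str]:
    """Drop repeated suggestions, keeping the first occurrence of each.

    dict preserves insertion order (Python 3.7+), so the original
    ordering of the surviving entries is unchanged.
    """
    return list(dict.fromkeys(suggestions))


merged = ["weld", "wild", "wold", "would", "Wald", "wild", "wund"]
unique = dedupe_suggestions(merged)
print(", ".join(unique))  # weld, wild, wold, would, Wald, wund
```

Whether the count field in the reply line should then drop from 7 to
6 is a separate decision that this sketch does not address.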

The discussion point is a bit more complex.  If you invoke multispell
with:

        multispell -T latin1 -m english deutsch francais

it will fail because the English dictionary doesn't recognize "latin1"
as a valid encoding.  How do people think I should handle these
variations among affix files?  One obvious option would be to make the
-T switch be dictionary-specific in multispell, so you'd write:

        multispell -m -T list english -T latin1 deutsch -T latin1 francais
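
[Editorial illustration.]  One way to read such a command line is to
let each -T apply only to the next dictionary named.  The Python
sketch below shows that interpretation; it is not how multispell
parses arguments.  The function name parse_args is hypothetical, and
it assumes (as a simplification) that every switch other than -T is
global and takes no argument, which is not true of all ispell
switches.

```python
from typing import List, Optional, Tuple


def parse_args(argv: List[str]) -> Tuple[List[str], List[Tuple[str, Optional[str]]]]:
    """Split argv into global flags and (dictionary, string-type) pairs.

    Assumption: "-T TYPE" binds to the next dictionary only and is then
    reset; any other "-x" switch is treated as a global, argument-less flag.
    """
    global_flags: List[str] = []
    dictionaries: List[Tuple[str, Optional[str]]] = []
    pending_type: Optional[str] = None

    it = iter(argv)
    for arg in it:
        if arg == "-T":
            pending_type = next(it, None)
            if pending_type is None:
                raise ValueError("-T requires an argument")
        elif arg.startswith("-"):
            global_flags.append(arg)
        else:
            dictionaries.append((arg, pending_type))
            pending_type = None  # a -T applies to one dictionary only
    return global_flags, dictionaries


flags, dicts = parse_args(
    ["-m", "-T", "list", "english", "-T", "latin1", "deutsch", "-T", "latin1", "francais"]
)
print(flags)
print(dicts)
```

A dictionary named without a preceding -T gets None here, meaning
"use that dictionary's default string type".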

Another option would be to insist that all affix files follow a common
naming scheme, so that everybody would be willing to accept "latin1"
as an encoding name, and so forth.

From my point of view, both options are bad.  The first requires too
much intelligence on the part of ispell.el.  The second is going to be
hard to enforce.

Opinions are welcomed.
-- 
    Geoff Kuenning   geoff@cs.hmc.edu   http://www.cs.hmc.edu/~geoff/

Windows XP is the "most reliable Windows ever," which is like saying
that asparagus is "the most articulate vegetable ever."
	-- Dave Barry
