From: Sergei
Newsgroups: gmane.emacs.help
Subject: Re: Spellcheck against multiple dictionaries?
Date: Thu, 19 Mar 2009 02:30:35 -0700 (PDT)
Message-ID: <957727de-bc65-4e5a-867f-32215d0896f8@b38g2000prf.googlegroups.com>
References: <49C09110.9010105@gmx.at> <5f0660120903181236g3714f647ia568e3d02ae4fe56@mail.gmail.com>
To: help-gnu-emacs@gnu.org

---- martin:

>> I've downloaded speck.el file, but I'm not sure how do I use it.
>> I've created a test file containing mixed correct and incorrect
>> words, in Russian and English:
>>   Test тест correct очепятка incorect верно
>> Then I've done M-x speck-mode. Emacs said that Speck-mode has been
>> activated and is using ru_RU dictionary, but nothing has changed in
>> the test buffer. From your description I was expecting that the
>> incorrect words would be highlighted somehow. Am I missing
>> something?

I do not know about speck-mode, but ispell.el, at least, picks up only
what looks like a word in the currently enabled language; only such
words are recoded as the current ispell dictionary requires and passed
to the ispell process. This means that "Test" is skipped in Russian
mode (just like =%==!!.... etc.), and conversely, очепятка and верно
are skipped in a Latin-alphabet context. This is really convenient.
(Users of Latin-alphabet languages, on the other hand, will stumble
over every foreign word.)

> I don't have a Russian spell-checking engine installed so I can't
> comment your example directly.
> Suppose I have a file with the line
>   Test Test correct Duckfehler incorect richtig
> Doing M-x speck-mode here starts an Aspell process checking with my
> default language which is English, flagging the last three words as
> incorrect. I can now set the region around the word "Duckfehler"
> and type C-2 C-? to set the speck language text property of that
> word to German, which will still flag the word as incorrect but now
> with the appropriate German suggestions how to correct it.

Some formal text formats (such as HTML or XML) allow language markup,
something like

,----
| correct Duckfehler incorect richtig
`----

>> I think that the ispell-ish behavior would indeed be nice. I've
>> looked through the ispell code, and it looks like Emacs raises some
>> kind of exception if the ispell process returns "invalid"
>> status. Do you think it is possible to fallback to another
>> dictionary on such an event?

> With my Aspell engine I can write (and bind) a trivial command like
>
> (defun ispell-check-word (arg)
>   (interactive "p")
>   (if (= arg 2)
>       (ispell-change-dictionary "de_DE")
>     (ispell-change-dictionary "en_US"))
>   (ispell-word))
>
> here and probably get what you want. Note, however, that each time you
> change the language with this command, Emacs kills an old and spawns a
> new process of the Aspell engine.

Yes, because everything has to be changed: the filtering rules, the
affix grammar, the word provision.

> Changing `ispell-word' as you say seems hardly possible because in
> general there's no way to distinguish a word written incorrectly in
> language A from a word written correctly in language B. For the
> special English/Russian case you could probably investigate the
> character properties at `point' and spark the appropriate
> word-checking process.
In principle one could create a combined grammar for Russian and
English. It would in fact be a "direct sum" of the two grammars: the
alphabets are disjoint, so the word spaces are completely disjoint
too. Such a combined processor exists in TeX for combined
English-Russian hyphenation. It would also be more efficient, because
there would be no need to spawn a new process at every change from
Russian to English.

But at present it would be easier to use a two-pass approach:

1. check the Russian spelling (ignoring all Latin characters);
2. check the English spelling (ignoring all Cyrillic characters).

Both passes are faster than in a switching mode, and no extra work is
required. Besides, you could spellcheck the Russian+French or
Russian+German combinations the same way (but not
Russian+French+English, of course, since French and English share an
alphabet; Russian+German+Armenian is still possible).

--
Sergei
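
The two-pass approach described above could be sketched in Emacs Lisp
roughly as follows. This is only a sketch under stated assumptions:
`my-ispell-two-pass' is a hypothetical name, not part of ispell.el;
the dictionary names "russian" and "english" are assumed to be defined
in the local setup; and it relies on each dictionary's character
classes making ispell skip words written in the other alphabet, as
described above.

```elisp
;; Hypothetical helper, not part of ispell.el.
(require 'ispell)

(defun my-ispell-two-pass (first second)
  "Spell-check the buffer with dictionary FIRST, then with SECOND.
FIRST and SECOND must be dictionary names known to the local
ispell setup (the defaults below are assumptions)."
  (interactive
   (list (read-string "First dictionary (default russian): "
                      nil nil "russian")
         (read-string "Second dictionary (default english): "
                      nil nil "english")))
  (dolist (dict (list first second))
    ;; Switching the dictionary restarts the spell-checker process,
    ;; but only once per pass instead of once per language change.
    (ispell-change-dictionary dict)
    ;; One full pass over the buffer with the current dictionary.
    (ispell-buffer)))
```

With this loaded, M-x my-ispell-two-pass followed by RET RET accepts
both defaults. The buffer is left with the second dictionary selected.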