From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#39483: 27.0.60; ispell ignores syntax/category tables word boundaries
Date: Fri, 07 Feb 2020 20:23:33 +0200
Message-ID: <83r1z6dzoa.fsf@gnu.org>
To: "Paul W. Rankin"
Cc: 39483@debbugs.gnu.org
In-reply-to: (hello@paulwrankin.com)

> From: "Paul W. Rankin"
> Date: Sat, 08 Feb 2020 01:44:52 +1000
>
> It appears that the function `ispell-get-word' makes its own judgements
> on word boundaries, ignoring the buffer's syntax tables and character
> categories.

That is true.  And I don't really see how it can be any different,
since ispell.el must have the same notion of a word as the underlying
dictionary, otherwise you will have false positives and/or false
negatives, right?  ispell.el looks up the word characters and non-word
characters in its database, and the doc string of
ispell-dictionary-base-alist explains how.

> This becomes a problem with using `electric-quote-mode' and
> ispell, because contractions are parsed as separate words. e.g. Calling
> ispell word for "doesn’t" returns:
>
> T is correct
>
> To reproduce:
>
> 1. emacs -Q
> 2. (in *scratch*) M-x text-mode RET
> 3. enter text "doesn’t" (i.e. "doesn" C-x 8 ] "t")
> 4. M-: (modify-syntax-entry ?’ "w")
> 5. M-: (modify-category-entry ?’ ?^)
> 6. M-$ | ispell-word

The buffer syntax table has no effect on ispell.el, and shouldn't have
any effect on it.

> Attempts at workarounds:
>
> I've tried altering slot 3 of the corresponding `ispell-dictionary-base-alist'
> entries from "[']" to "['’]" to no avail.

That's the right direction, but you didn't follow it far enough.
First, ispell-dictionary-base-alist is the default value, and is used
to produce ispell-dictionary-alist, which is the one you should change
(alternatively, customize ispell-local-dictionary-alist).  More
importantly, the definition of each dictionary includes more than just
one character set: there are 3 character sets there and one parameter
for encoding the string passed to the spell-checker, and you should be
sure to set them all as appropriate for the dictionary you use.

My suggestion is to step with Edebug through ispell-get-word and see
why it doesn't consider "doesn’t" as a single word in your case.
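
For illustration only, a customized entry along the lines described
above might look like the sketch below.  The dictionary name "en_US",
the "-d" argument, the character classes, and the utf-8 coding system
are all assumptions; they have to match the spell-checker and the
dictionary actually in use, and the dictionary itself must accept ’
inside words for this to help.

```elisp
;; Hypothetical entry: every value here is an assumption to be adapted
;; to the dictionary and spell-checker actually installed.
;; Evaluate this before ispell is first used in the session.
(setq ispell-local-dictionary-alist
      '(("en_US"            ; DICTIONARY-NAME (assumed)
         "[[:alpha:]]"      ; CASECHARS: characters that make up words
         "[^[:alpha:]]"     ; NOT-CASECHARS: characters that do not
         "['’]"             ; OTHERCHARS: allowed inside a word only
         nil                ; MANY-OTHERCHARS-P: at most one per word
         ("-d" "en_US")     ; ISPELL-ARGS passed to the spell-checker
         nil                ; EXTENDED-CHARACTER-MODE
         utf-8)))           ; CHARACTER-SET: coding system for the text sent
```

The second, third, and fourth elements are the three character sets,
and the last element is the encoding parameter.  The entry can then be
selected with M-x ispell-change-dictionary.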
> Setup:
>
> GNU Emacs 27.0.60 (build 2, x86_64-apple-darwin19.3.0, NS appkit-1894.30
> Version 10.15.3 (Build 19D76)) of 2020-02-05

This omits crucial information, like the dictionary in use and the
locale-dependent settings that affect encoding.

(In any case, I don't think this list is the right place for
discussing this issue.)