From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#25230: Patch to ispell.el to simplify use of [:alpha:] for CASECHARS in built-in dictionaries
Date: Mon, 19 Dec 2016 18:23:26 +0200
Message-ID: <831sx36g75.fsf@gnu.org>
Reply-To: Eli Zaretskii
Cc: 25230@debbugs.gnu.org
To: Reuben Thomas
In-reply-to: (message from Reuben Thomas on Mon, 19 Dec 2016 12:28:57 +0000)

> From: Reuben Thomas
> Date: Mon, 19 Dec 2016 12:28:57 +0000
>
> In ispell-set-spellchecker-params, there is code that used to be run conditionally on support for POSIX
> character classes, which sets all the CASECHARS and NOT-CASECHARS entries for built-in dictionaries to
> [[:alpha:]] and [^[:alpha:]] respectively.
>
> There is no point doing this unconditionally, so instead, put these character classes directly into the initial
> values used in ispell-dictionary-base-alist. This change also makes the variable's initialization easier to read.
>
> The attached patch makes these changes.
>
> - "[A-Za-z]" "[^A-Za-z]" "[']" nil ("-B") nil iso-8859-1)
> + ;; just use a minimal regexp.
> + "[[:alpha:]]" "[^[:alpha:]]" "[']" nil ("-B") nil iso-8859-1)

You are assuming that [[:alpha:]] and [A-Za-z] are identical.  But
they are far from identical, and have not been since Emacs 25.1.  I
mentioned this in another thread today.

> ("brasileiro" ; Brazilian mode
> - "[A-Z\301\311\315\323\332\300\310\314\322\331\303\325\307\334\302\312\324a-z\341\351\355\363\372\340\350\354\362\371\343\365\347\374\342\352\364]"
> - "[^A-Z\301\311\315\323\332\300\310\314\322\331\303\325\307\334\302\312\324a-z\341\351\355\363\372\340\350\354\362\371\343\365\347\374\342\352\364]"
> - "[']" nil nil nil iso-8859-1)
> + "[[:alpha:]]" "[^[:alpha:]]" "[']" nil nil nil iso-8859-1)

Same here: [[:alpha:]] is now much broader than the set of characters
used by any single language.

In any case, these settings are for Ispell, which only supports
single-byte encodings.  We cannot use arbitrary characters with it.

IOW, I don't think this patch is in the right direction.

Thanks.
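
The difference between the two regexps, and the single-byte constraint, can be checked from Lisp. What follows is only a rough sketch: `my-alpha-check` is a hypothetical helper that is not part of ispell.el, and it assumes Emacs 25.1 or later, where [:alpha:] follows the Unicode letter categories. It uses `encode-coding-char` to ask whether a character can be encoded in iso-8859-1 at all.

```elisp
;; Hypothetical helper for illustration only; not part of ispell.el.
(defun my-alpha-check (char)
  "Return a plist saying how CHAR is classified.
:alpha        is t if CHAR matches [[:alpha:]],
:ascii-letter is t if CHAR matches [A-Za-z],
:latin-1      is t if CHAR can be encoded in iso-8859-1."
  (let ((s (char-to-string char))
        (case-fold-search nil))
    (list :alpha (and (string-match-p "[[:alpha:]]" s) t)
          :ascii-letter (and (string-match-p "[A-Za-z]" s) t)
          ;; `encode-coding-char' returns nil when CHAR is not
          ;; encodable in the given coding system.
          :latin-1 (and (encode-coding-char char 'iso-8859-1) t))))

(my-alpha-check ?a)        ; ASCII letter
(my-alpha-check ?\u00e9)   ; LATIN SMALL LETTER E WITH ACUTE
(my-alpha-check ?\u044f)   ; CYRILLIC SMALL LETTER YA
```

Under those assumptions, the Cyrillic letter should come back as matching [[:alpha:]] while being neither an ASCII letter nor encodable in iso-8859-1, which is the kind of character a single-byte Ispell dictionary entry cannot handle.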