From: Sebastian Urban
Newsgroups: gmane.emacs.bugs
Subject: bug#42602: Wrong (not-)casechars value for "polish" in ispell-dictionary-base-alist
Date: Thu, 30 Jul 2020 13:39:55 +0200
References: <2f58556a-8f0f-f923-2716-5366d66fa44d@gmail.com> <83h7tqf9h1.fsf@gnu.org>
To: Eli Zaretskii
Cc: 42602@debbugs.gnu.org
In-Reply-To: <83h7tqf9h1.fsf@gnu.org>

> I don't understand this change.  Values above octal 377 cannot be
> right in the above regexps, because they are supposed to be in
> Latin-2 encoding, which is a single-byte encoding, and so can only
> handle values below octal 400.  How did you come up with those
> values?

Basically, C-x = on a char, which gave me octal values.  I thought it
was recognising only A-z + ó/Ó and some other chars that I'm not
interested in, so I swapped those values for the ones corresponding to
the Polish chars.
That's the whole story.

> Anyway, I'm quite sure some other factor is at work here.

Well, I did some tests, e.g. switched back to the original value of
"polish" in my "pl" dictionary, and... it works.  And if I change from
iso-8859-2 to utf-8 in my "pl" (with the original value from "polish")
it doesn't work.  So, as you later wrote - wrong character encoding, I
guess.

Looking for a cause (in default settings), I think I found it in
ispell-dictionary-base-alist and ispell-dictionary-alist.  During the
"transfer" from *-base-* to ispell-dictionary-alist, the value of
CHARACTER-SET is changed in all cases from iso-* or cp1255 to utf-8,
and ispell then uses these values (from ispell-dictionary-alist) when
it "talks" with Aspell.

On the other hand, if I use Emacs 26.3 from Cygwin, everything works
out of the box; I don't even have to set "polish" as the default
dictionary.  But there, in the Cygwin command line, "env | grep LANG"
gives "LANG=pl_PL.UTF-8".

> Your Emacs is a native MinGW build, whereas Aspell seems to be
> a Cygwin build?

Both Emacses are official Win builds, and Aspell is installed through
Cygwin.

> If so, you could have incompatibility in character encoding.  What
> is your Windows locale?

"Polish" everywhere in "Control Panel" -> "Regional and Language".

> And what does M-: (getenv "LANG") RET yield inside Emacs?

"PLK"

S. U.

P.S.
> Moreover, if I type in regexp-builder "[\363\323]" it won't
> recognize ó/Ó, but it doesn't have a problem with other Polish
> chars, like "ł" ("[\502]") or "ż" ("[\574]").

In the "Character List" buffer for unicode-bmp, regexp-builder
(numbers are octal values):
- 0-177 and 400-777 - highlights chars
- 240-377 - doesn't highlight chars (it highlights them if I use a hex
  value, or insert them directly)

I didn't check the "80h-9Fh" chars.  Chars like C-a were checked by
inserting them with quoted-insert in another buffer.
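
For reference, the octal values mentioned above can be compared with
the single-byte Latin-2 values in a few lines.  This is only a minimal
sketch, assuming Python 3 and its standard "iso-8859-2" codec; the
helper name describe is made up for illustration and has nothing to do
with ispell.el itself.

```python
# Polish letters discussed in this thread.
LETTERS = "óÓłż"


def describe(ch):
    """Return (Unicode code point, ISO-8859-2 byte), both as octal strings.

    Hypothetical helper, for illustration only.
    """
    code_point = format(ord(ch), "o")
    # Latin-2 is a single-byte encoding, so the encoded result has one byte.
    latin2_byte = format(ch.encode("iso-8859-2")[0], "o")
    return code_point, latin2_byte


for ch in LETTERS:
    cp, byte = describe(ch)
    print(f"{ch}: code point \\{cp}, Latin-2 byte \\{byte}")
```

For ó/Ó the two values coincide (363 and 323), while for ł and ż the
code points (502, 574) lie above octal 377 and the Latin-2 bytes (263,
277) do not, which matches the remark that a single-byte encoding can
only hold values below octal 400.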