From mboxrd@z Thu Jan 1 00:00:00 1970
From: Nathan Trapuzzano
Newsgroups: gmane.emacs.bugs
Subject: bug#17130: 24.4.50; Deficient Unicode case folding
Date: Sat, 29 Mar 2014 10:03:32 -0400
Message-ID: <87eh1lcdaj.fsf@nbtrap.com>
References: <87txair0g7.fsf@ivytech.edu> <83fvm2fhii.fsf@gnu.org> <87ob0qrugy.fsf@nbtrap.com> <83y4ztec5l.fsf@gnu.org> <87ob0pnptc.fsf@nbtrap.com> <83d2h5du2e.fsf@gnu.org>
Cc: 17130@debbugs.gnu.org
To: Eli Zaretskii
In-Reply-To: <83d2h5du2e.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 29 Mar 2014 16:15:53 +0300")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4.50 (gnu/linux)

Eli Zaretskii writes:

>> Reading through the manual section on case tables, it seems that this
>> could be supported via the extra "canonicalize" slot:
>>
>> CANONICALIZE
>>      The canonicalize table maps all of a set of case-related
>>      characters into a particular member of that set.
>
> Not efficiently, no.  E.g., how will you find ς from σ, using this
> method?

σ, ς, and Σ would all have σ in the CANONICALIZE slot, since they all
fold to σ.  (By the way, ς should upcase to Σ--that much I know the case
tables can handle.)

> Besides, don't we also need to know that ς can only be present at the
> end of a word?

Don't think so.
AFAIK, Unicode says nothing about ordering except when it comes to
combining characters.  But even if it did prescribe such a rule, I don't
think it would have anything to do with case folding.

>> If this isn't already used for Unicode case folding, what _is_ it used
>> for?
>
> It is used for case-insensitive regexp matching, see search.c.

Right, but what I'm asking is: if Emacs doesn't do Unicode case folding,
what is the purpose of the CANONICALIZE slot except as a kind of
placeholder that gets autofilled?  Are there other kinds of case
folding--other than traditional upper/lower and Unicode--that I'm not
aware of?

I understand that Emacs autofills the CANONICALIZE slot from the other
slots, but only when the CANONICALIZE slot is not already set to
non-nil.  What if the CANONICALIZE slot on ς were set to σ?  I think
that's all that would have to happen for the Unicode folding to work.
It seems the machinery is already in place.
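
The idea of mapping σ, ς, and Σ to one representative can be illustrated
with a small toy model.  The sketch below is Python, not Emacs Lisp, and
it is not how Emacs implements case tables; the names CANONICALIZE,
canon, and equal_ignoring_case are hypothetical and exist only for this
example.  It assumes a per-character table with a fallback to simple
lowercasing for characters that have no explicit entry.

```python
# Toy model of a "canonicalize" table: every member of a case-related
# set maps to one chosen representative.  Illustration only; this is
# an assumption-laden sketch, not Emacs's actual data structure.
CANONICALIZE = {
    "Σ": "σ",  # capital sigma
    "σ": "σ",  # small sigma
    "ς": "σ",  # final sigma: the entry proposed in the message
}


def canon(ch):
    """Return the representative for ch, else fall back to lowercasing."""
    return CANONICALIZE.get(ch, ch.lower())


def equal_ignoring_case(a, b):
    """Compare two strings character by character through the table."""
    return len(a) == len(b) and all(
        canon(x) == canon(y) for x, y in zip(a, b)
    )


if __name__ == "__main__":
    # Without the explicit ς entry, simple lowercasing leaves ς
    # unchanged, so these two spellings would not compare equal.
    print(equal_ignoring_case("ΟΔΟΣ", "οδος"))
```

With the ς entry present, the two spellings compare equal under this
model; deleting that entry makes the comparison fail, which is the gap
the message describes.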