From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!.POSTED!not-for-mail
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#24603: [RFC 08/18] Support casing characters which map into multiple code points
Date: Tue, 04 Oct 2016 10:38:20 +0300
Message-ID: <838tu4o977.fsf@gnu.org>
References: <1475543441-10493-1-git-send-email-mina86@mina86.com> <1475543441-10493-8-git-send-email-mina86@mina86.com>
Reply-To: Eli Zaretskii
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Cc: 24603@debbugs.gnu.org
To: Michal Nazarewicz
In-reply-to: <1475543441-10493-8-git-send-email-mina86@mina86.com> (message from Michal Nazarewicz on Tue, 4 Oct 2016 03:10:31 +0200)
X-GNU-PR-Message: followup 24603
X-GNU-PR-Package: emacs
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.org gmane.emacs.bugs:124024

> From: Michal Nazarewicz
> Date: Tue, 4 Oct 2016 03:10:31 +0200
>
> * src/make-special-casing.py: New script to generate special-casing.h
> file from the SpecialCasing.txt data file.

Please do this without Python: use Emacs Lisp and/or the tools already
used in admin/unidata, such as awk.  Python is still not as widely
available as those tools.

> +special-casing.h: make-special-casing.py ../admin/unidata/SpecialCasing.txt
> + $(AM_V_GEN)
> + python $^ $@

Don't use the literal name of a program here; users should be able to
specify its name and/or absolute file name at build time.  See what we
do with awk, for example.

> +#include "special-casing.h"

Why not a shorter 'casing.h'?

Once again, this stores the casing rules in C, whereas I'd prefer to
have them in tables accessible from Lisp.

> @@ -194,7 +276,9 @@ casify_object (enum case_action flag, Lisp_Object obj)
> DEFUN ("upcase", Fupcase, Supcase, 1, 1, 0,
> doc: /* Convert argument to upper case and return that.
> The argument may be a character or string. The result has the same type.
> -The argument object is not altered--the value is a copy.
> +The argument object is not altered--the value is a copy. If argument
> +is a character, characters which map to multiple code points when
> +cased, e.g. ﬁ, are returned unchanged.
> See also `capitalize', `downcase' and `upcase-initials'. */)

I think this doc string should say what an application should do if it
wants to convert ﬁ into "FI".

Thanks.
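
To illustrate the distinction the quoted doc string draws, here is a
minimal sketch in C.  It is not the code from the patch: every name in
it (special_table, upcase_char, upcase_string) is hypothetical, the
table holds only two entries whose mappings are those listed in
SpecialCasing.txt, and the one-to-one case conversion is limited to
ASCII for brevity.  The table lives in C here only to keep the sketch
self-contained; this is not a suggestion about where the rules should
be stored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: one code point whose upper-case form
   consists of several code points.  */
struct special_upper
{
  uint32_t from;
  uint32_t to[3];
  size_t len;
};

static const struct special_upper special_table[] = {
  { 0x00DF, { 0x0053, 0x0053 }, 2 },	/* U+00DF (sharp s) -> "SS" */
  { 0xFB01, { 0x0046, 0x0049 }, 2 },	/* U+FB01 (fi ligature) -> "FI" */
};

/* Return the table entry for C, or NULL if C has no special mapping.  */
static const struct special_upper *
find_special (uint32_t c)
{
  size_t n = sizeof special_table / sizeof special_table[0];
  for (size_t i = 0; i < n; i++)
    if (special_table[i].from == c)
      return &special_table[i];
  return NULL;
}

/* One-to-one upcasing; ASCII only, to keep the sketch short.  */
static uint32_t
upcase_simple (uint32_t c)
{
  return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
}

/* Character in, character out: a character whose upper-case form
   needs several code points cannot be represented, so it is returned
   unchanged.  */
uint32_t
upcase_char (uint32_t c)
{
  return find_special (c) ? c : upcase_simple (c);
}

/* String in, string out: the result may be longer than the input.
   DST must have room for 3 * N code points.  Return the number of
   code points written.  */
size_t
upcase_string (const uint32_t *src, size_t n, uint32_t *dst)
{
  assert (src && dst);
  size_t out = 0;
  for (size_t i = 0; i < n; i++)
    {
      const struct special_upper *s = find_special (src[i]);
      if (s)
	for (size_t j = 0; j < s->len; j++)
	  dst[out++] = s->to[j];
      else
	dst[out++] = upcase_simple (src[i]);
    }
  return out;
}
```

In this sketch, a caller that wants "FI" out of U+FB01 has to take the
string path, i.e. wrap the character in a one-element string and call
upcase_string; upcase_char leaves it alone.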