From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case
Date: Thu, 13 Feb 2014 18:33:05 +0200
Message-ID: <83y51fq8fy.fsf@gnu.org>
References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org>
 <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org>
 <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org>
 <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org>
Reply-To: Eli Zaretskii
To: Stefan Monnier
Cc: 16731@debbugs.gnu.org
Xref: news.gmane.org gmane.emacs.bugs:85488

> From: Stefan Monnier
> Cc: Andreas Röhler, 16731@debbugs.gnu.org
> Date: Thu, 13 Feb 2014 08:37:45 -0500
>
> > How will we then be able to distinguish between lower-case characters
> > that have no upcase variant and characters that are not lower-case
> > characters at all?
>
> Right: to handle this, we need to distinguish characters that are
> lower-case without an uppercase variant from characters which are
> neither lowercase nor uppercase.
>
> We could do that by saying that the upcase table should return nil or -1
> for ß, to indicate that the upcase version is "missing".  But such
> a change will probably require carefully revising "all" the code that
> uses those tables.

Right.  I can instead suggest a much less intrusive change below.  Its
only disadvantage is that if some user or Lisp program overrides the
standard case tables, and actually _wants_ some lower-case characters
to behave as if they weren't, looking at the Unicode tables will undo
such customizations.  If this is a concern, perhaps we could compare
the case table with the standard value, and only use the Unicode
attributes when they are equal?

If the approach below is accepted, a related question is how to treat
letters whose category is Lt, i.e. "titlecase" -- do we consider such
letters upper case or don't we?

--- src/buffer.h~0 2014-01-01 09:46:07.000000000 +0200
+++ src/buffer.h 2014-02-13 18:27:32.225839000 +0200
@@ -1349,7 +1349,19 @@ downcase (int c)
 }
 
 /* True if C is upper case.  */
-INLINE bool uppercasep (int c) { return downcase (c) != c; }
+INLINE bool uppercasep (int c)
+{
+  Lisp_Object val;
+
+  if (downcase (c) != c)
+    return true;
+
+  if (NILP (Vunicode_category_table))
+    return false;
+
+  val = CHAR_TABLE_REF (Vunicode_category_table, c);
+  return INTEGERP (val) && XINT (val) == UNICODE_CATEGORY_Lu;
+}
 
 /* Upcase a character C known to be not upper case.  */
 INLINE int
@@ -1364,7 +1376,16 @@ upcase1 (int c)
 INLINE bool
 lowercasep (int c)
 {
-  return !uppercasep (c) && upcase1 (c) != c;
+  Lisp_Object val;
+
+  if (!uppercasep (c) && upcase1 (c) != c)
+    return true;
+
+  if (NILP (Vunicode_category_table))
+    return false;
+
+  val = CHAR_TABLE_REF (Vunicode_category_table, c);
+  return INTEGERP (val) && XINT (val) == UNICODE_CATEGORY_Ll;
 }
 
 /* Upcase a character C, or make no change if that cannot be done.  */
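
The patch does not implement the guard floated in the message, namely consulting the Unicode category only while the case table still has its standard value. What follows is a minimal standalone sketch of how that guard might look, not Emacs code. Every name in it (toy_case_table, toy_use_table, toy_category_of and so on) is hypothetical, the tables cover only ASCII plus U+00DF, and "equal to the standard value" is assumed to mean pointer identity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy general categories, standing in for UNICODE_CATEGORY_Lu and
   UNICODE_CATEGORY_Ll in the patch.  */
enum toy_category { TOY_OTHER, TOY_LU, TOY_LL };

/* Hypothetical case table: a pair of single-character mappings.  */
struct toy_case_table
{
  int (*down) (int);
  int (*up) (int);
};

/* Standard mappings, ASCII only.  U+00DF (sharp s) maps to itself in
   both directions because it has no single-character upper case.  */
static int
std_down (int c)
{
  return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

static int
std_up (int c)
{
  return (c >= 'a' && c <= 'z') ? c - ('a' - 'A') : c;
}

/* A customization that deliberately treats every character as caseless.  */
static int
identity (int c)
{
  return c;
}

const struct toy_case_table toy_standard_table = { std_down, std_up };
const struct toy_case_table toy_caseless_table = { identity, identity };

static const struct toy_case_table *toy_current_table = &toy_standard_table;

/* Install TABLE as the current case table.  */
void
toy_use_table (const struct toy_case_table *table)
{
  assert (table != NULL);
  toy_current_table = table;
}

/* Stand-in for a lookup in the Unicode category table.  */
static enum toy_category
toy_category_of (int c)
{
  if (c >= 'A' && c <= 'Z')
    return TOY_LU;
  if ((c >= 'a' && c <= 'z') || c == 0xDF)
    return TOY_LL;
  return TOY_OTHER;
}

/* Trust the category only while the standard table is in effect
   (assumption: identity comparison is a good enough equality test).  */
static bool
toy_category_allowed (void)
{
  return toy_current_table == &toy_standard_table;
}

/* Same shape as the patched uppercasep, plus the guard.  */
bool
toy_uppercasep (int c)
{
  if (toy_current_table->down (c) != c)
    return true;
  return toy_category_allowed () && toy_category_of (c) == TOY_LU;
}

/* Same shape as the patched lowercasep, plus the guard.  */
bool
toy_lowercasep (int c)
{
  if (!toy_uppercasep (c) && toy_current_table->up (c) != c)
    return true;
  return toy_category_allowed () && toy_category_of (c) == TOY_LL;
}
```

With the standard table in effect, toy_lowercasep (0xDF) is true even though the character has no upcase mapping. After toy_use_table (&toy_caseless_table) it is false, so a customization that wants caseless characters is not overridden by the category lookup. The sketch leaves the Lt question open.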