From: Yair F
Newsgroups: gmane.emacs.devel
Subject: Re: Placement of HEBREW MAQAF (diacritical)
Date: Fri, 30 Jul 2010 00:47:59 +0300
To: Eli Zaretskii
Cc: emacs-devel@gnu.org
In-Reply-To: <83d3u7cx5y.fsf@gnu.org>
References: <20100722191932.GF17213@isis.luna> <83d3u7cx5y.fsf@gnu.org>

On Wed, Jul 28, 2010 at 8:51 PM, Eli Zaretskii wrote:
>> From: Yair F.
> How about if you ask questions about things you don't understand?  We
> could then ask Handa-san to answer them, and use the answers as a
> starting point for documenting this in the ELisp manual.

Here are my questions:

1. Given the regular expressions below, how should
   composition-function-table be adjusted so that only those patterns
   are composed?
2. How can the patterns be optimized so that they do not slow down the
   display of letters without points, and so that
   hebrew-composable-pattern-basic takes priority over
   hebrew-composable-pattern-full performance-wise?
3.
   Is the composition of script-specific characters from the Hebrew
   block (0590) with non-script-specific marks from the combining
   marks block (0300) currently possible?
4. How should the language environment be set so that composition
   displays the correct presentation form for the given language?

I believe that if we get answers to these questions we are nearly done
with Hebrew composition.  The only missing piece would be the influence
of ZWJ, ZWNJ and CGJ on certain combinations.

Yair

(defconst hebrew-composable-no-rafe-pattern
  (concat
   "[\u05D6-\u05D9\u05DC-\u05E2\u05E5-\u05E8]" ;; base
   "\u05BC?"                                   ;; 0-1 marks of dagesh
   "[\u05B0-\u05B9\u05BB\u05C7]?")             ;; 0-1 marks of niqqud
  "Regexp matching letters that cannot accept rafe, in basic composition order.")

(defconst hebrew-composable-rafe-pattern
  (concat
   "[\u05D0-\u05D4\u05DA\u05DB\u05E4\u05E5\u05EA]" ;; base
   "[\u05BC\u05BF]?"                               ;; 0-1 marks of dagesh/rafe
   "[\u05B0-\u05B9\u05BB\u05C7]?")                 ;; 0-1 marks of niqqud
  "Regexp matching letters that can accept rafe, in basic composition order.")

(defconst hebrew-composable-vav-pattern
  (concat
   "\u05D5"                  ;; base (vav)
   "\u05BC?"                 ;; 0-1 marks of dagesh (actually shuruq)
   "[\u05B0-\u05BB\u05C7]?") ;; 0-1 marks of extended niqqud
  "Regexp matching composition order of letter Vav.")

(defconst hebrew-composable-shin-pattern
  (concat
   "\u05E9"                        ;; base (shin)
   "\u05BC?"                       ;; 0-1 marks of dagesh
   "[\u05C1\u05C2]?"               ;; 0-1 marks of shin dot
   "[\u05B0-\u05B9\u05BB\u05C7]?") ;; 0-1 marks of niqqud
  "Regexp matching composition order of letter Shin.")

(defconst hebrew-composable-cantillation
  (concat
   ;; 0 or more low-center cantillation marks
   "[\u0591\u0596\u059B\u05A3-\u05A7\u05AA\u05BD\u05C5]*"
   ;; 0-1 marks of other low marks
   "[\u0323-\u0325\u0332\u0333]?"
   ;; 0-1 marks of low-right cantillation marks
   "[\u059A\u05AD]?"
   ;; 0-1 marks of high-right cantillation marks
   "[\u059D\u05A0]?"
   ;; 0 or more high-center cantillation marks
   "[\u0593-\u0595\u0597\u0598\u059C\u059E\u05A1\u05A8\u05A9\u05AB\u05AC\u05AF\u05C4]*"
   ;; 0-1 marks of other high marks
   "[\u0305\u0307\u0308\u030A]?"
   ;; 0-1 marks of high-left cantillation marks
   "[\u0592\u0599\u05A9\u05AE]?")
  "Regexp matching composition order of cantillation and other marks.")

(defconst hebrew-maqaf-compisition "\u05BE\u05AF?"
  "Regexp matching composition order of Maqaf.")

(defconst hebrew-nun-hafukha-compisition "\u05C6\u0307?"
  "Regexp matching composition order of Nun Hafukha.")

(defconst hebrew-composable-pattern-basic
  (concat
   "\\("
   hebrew-composable-no-rafe-pattern
   "\\|"
   hebrew-composable-rafe-pattern
   "\\|"
   hebrew-composable-vav-pattern
   "\\|"
   hebrew-composable-shin-pattern
   "\\)")
  "Regexp matching a composable sequence of Hebrew characters, basic level.")

(defconst hebrew-composable-pattern-full
  (concat
   "\\("
   hebrew-composable-no-rafe-pattern hebrew-composable-cantillation
   "\\|"
   hebrew-composable-rafe-pattern hebrew-composable-cantillation
   "\\|"
   hebrew-composable-vav-pattern hebrew-composable-cantillation
   "\\|"
   hebrew-composable-shin-pattern hebrew-composable-cantillation
   "\\|"
   hebrew-maqaf-compisition
   "\\|"
   hebrew-nun-hafukha-compisition
   "\\)")
  "Regexp matching a composable sequence of Hebrew characters, full level.")

(defconst hebrew-yiddish-composable-pattern
  (concat
   "\u05D0"                             ;; alef
   "[\u05B7\u05B8]?"                    ;; qamats or patah
   "\\|"
   "[\u05D1\u05DA\u05DB\u05E4\u05EA]"   ;; base
   "[\u05BC\u05BF]?"                    ;; dagesh/rafe
   "\\|"
   "\u05D5"                             ;; base (vav)
   "\u05BC?"                            ;; 0-1 marks of dagesh (actually shuruq)
   "\\|"
   "\u05D9"                             ;; base (yod)
   "\u05B4"                             ;; This should be composed as \uFB1D
   "\\|"
   "\u05E9"                             ;; base (shin)
   "[\u05C1\u05C2]?"                    ;; 0-1 marks of shin dot
   "\\|"
   "\u05F2"                             ;; base (zwei yoden)
   "\u05B7")                            ;; This should be composed as \uFB1F
  "Regexp matching a composable sequence of Hebrew characters for the Yiddish language.")

(defconst hebrew-ladino-composable-pattern
  "[\u05D1\u05D2\u05D6\u05E4]\u05BF" ;; all should use \uFB1E instead of \u05BF
  "Regexp matching a composable sequence of Hebrew characters for the Ladino (Judezmo) language.")
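
The mark order these patterns impose can be checked outside of redisplay with plain string matching. The following is only a sketch under stated assumptions: my-hebrew-shin-pattern is a copy of the shin pattern above under a made-up name, and my-hebrew-fully-matches-p is a hypothetical helper, not an Emacs function. It says nothing about how composition-function-table will treat the same text.

```elisp
;; Hypothetical copy of the shin pattern from this message, kept under
;; a separate name so that evaluating this block leaves the real
;; constant untouched.
(defconst my-hebrew-shin-pattern
  (concat
   "\u05E9"                        ; base (shin)
   "\u05BC?"                       ; 0-1 dagesh
   "[\u05C1\u05C2]?"               ; 0-1 shin dot or sin dot
   "[\u05B0-\u05B9\u05BB\u05C7]?") ; 0-1 niqqud
  "Copy of `hebrew-composable-shin-pattern', for testing only.")

;; Hypothetical helper: anchor PATTERN at both ends of STRING so that a
;; partial match (for example, the base letter alone) does not count.
(defun my-hebrew-fully-matches-p (pattern string)
  "Return t if PATTERN matches the whole of STRING, else nil."
  (and (string-match-p (concat "\\`\\(?:" pattern "\\)\\'") string)
       t))

;; Shin + dagesh + shin dot + qamats: marks in the expected order.
(my-hebrew-fully-matches-p my-hebrew-shin-pattern
                           "\u05E9\u05BC\u05C1\u05B8") ; => t

;; Shin + qamats + dagesh: niqqud precedes dagesh, so the anchored
;; pattern cannot consume the whole string.
(my-hebrew-fully-matches-p my-hebrew-shin-pattern
                           "\u05E9\u05B8\u05BC")       ; => nil
```

The same helper can be pointed at any of the other patterns to see which mark sequences they would accept as a single cluster.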