From: Yair F
Newsgroups: gmane.emacs.devel
Subject: Re: Composing Hebrew diacriticals
Date: Tue, 18 May 2010 00:08:21 +0300
References: <83mxwlw2c0.fsf@gnu.org> <83pr12pfw6.fsf@gnu.org> <83fx1xowfj.fsf@gnu.org>
To: Kenichi Handa
Cc: emacs-devel@gnu.org

On Mon, May 17, 2010 at 7:35 AM, Kenichi Handa wrote:
> Are you using the same setting as mine which I wrote as
> below in the previous mail?

No. I was using this:

;; For automatic composition.
(defconst hebrew-composable-pattern
  (concat
   "\\("
   "[\u05D6-\u05D9\u05DC-\u05E2\u05E5-\u05E8]" ;; base
   "\u05BC?"                      ;; 0-1 marks of 1st class (dagesh)
   "[\u05B0-\u05B9\u05BB\u05C7]?" ;; 0-1 marks of 3rd class (niqqud)
   "[\u0591-\u05AF\u05BD]*"       ;; 0-2 (possibly 3) marks of 4th class
   "\\|"
   "[\u05D0-\u05D4\u05DA\u05DB\u05E4\u05E5-\u05EA]"
                                  ;; base (allows rafe)
   "[\u05BC\u05BF]?"              ;; 0-1 marks of 1st class (dagesh/rafe)
   "[\u05B0-\u05B9\u05BB\u05C7]?" ;; 0-1 marks of 3rd class (niqqud)
   "[\u0591-\u05AF\u05BD]*"       ;; 0-2 (possibly 3) marks of 4th class
   "\\|"
   "\u05D5"                       ;; base (vav)
   "\u05BC?"                      ;; 0-1 marks of 1st class (dagesh)
   "[\u05B0-\u05BB\u05C7]?"       ;; 0-1 marks of extended 3rd class (niqqud)
   "[\u0591-\u05AF\u05BD]*"       ;; 0-2 (possibly 3) marks of 4th class
   "\\|"
   "\u05E9"                       ;; base (shin)
   "\u05BC?"                      ;; 0-1 marks of 1st class (dagesh)
   "[\u05C1\u05C2]?"              ;; 0-1 marks of 2nd class (shin dot)
   "[\u05B0-\u05B9\u05BB\u05C7]?" ;; 0-1 marks of 3rd class (niqqud)
   "[\u0591-\u05AF\u05BD]*"       ;; 0-2 (possibly 3) marks of 4th class
   "\\|"
   "[\u05F1-\u05F3]"              ;; base (Yiddish ligatures)
   "[\u05B0-\u05B9\u05BB\u05C7]?" ;; 0-1 marks of 3rd class (niqqud)
   "[\u0591-\u05AF\u05BD]*"       ;; 0-2 (possibly 3) marks of 4th class
   "\\)")
  "Regexp matching a composable sequence of Hebrew characters.")

(set-char-table-range
 composition-function-table '(#x591 . #x5F4)
 (list (vector hebrew-composable-pattern 0 'font-shape-gstring)))

With your changes there is some composition, but this word doesn't
compose properly:

עַשּׁ‌ֶשׁ‌ֶת

The first Shin (U+05E9) composes with the Dagesh (U+05BC), the shin dot
(U+05C1) isn't visible, and the Segol (U+05B6) goes under the previous
base letter.

what-cursor-position gives this:

          display: composed to form "שּׁ‌ֶ" (see below)

Composed with the following character(s) "ּׁ‌ֶ" using this font:
  xft:-unknown-DejaVu Sans-normal-normal-normal-*-23-*-*-*-*-0-iso10646-1
by these glyphs:
  [0 3 0 4786 18 2 16 13 0 nil]
  [0 3 1473 1311 0 15 17 16 -14 nil]
  [0 3 1462 1300 0 5 11 -1 6 nil]

MDEBUG_FLT=3 emacs - --eval ' (message "\u05E9\u05BC\u05C1\u05B6")'

[FLT] (hebr-ff (dejavu sans)
[FLT]   (SOURCE 05E9 05BC 05C1 05B6)
[FLT]   (STAGE 0 "Hhhh" (05E9 05BC 05C1 05B6)
[FLT]     (SUBPART 0
[FLT]      (COND
[FLT]       (REGEX "^Hhh*" "Hhhh" 4
[FLT]        <0
[FLT]        :otf=hebr=ccmp+mark 4>))))
[FLT]   (RESULT (12B2 1152 0 0) (051F 0 0 0) (0514 0 0 0)))

> By the way, do you have a better font than 'dejavu sans' for
> Hebrew?

There are two major options. The first is the fonts from the Culmus
package (http://culmus.sourceforge.net/): Miriam Mono (blends with
Courier), David (serif), Nachlieli (the OpenOffice default), or most of
the others. The other option is msttcorefonts.

I'll try to approach the DejaVu designers as well.

>>>> 3. Letter Yod (U+5D9) composed with Hiriq (U+5B4) is composed into
>>>> presentation form (U+FB1D). This should only happen with specific
>>>> control (Either CGJ or ZWJ I'll check).

> Then what is the correct rendering of the sequence "\u05D9\u05B4"?

Hiriq should be rendered below the baseline, as it is under all other
letters.

> See the attached image.

I'm sorry, but the attachment was lost. Can you please resend it?

I