From: Yair F <yair.f.lists@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: emacs-devel@gnu.org, handa@m17n.org
Subject: Re: Composing Hebrew diacriticals
Date: Thu, 13 May 2010 22:46:03 +0300
Message-ID: <AANLkTikN7UToHmF1EN5elYD1lMm1Rj25w6zym6tFTdaS@mail.gmail.com>
In-Reply-To: <83aas3pvve.fsf@gnu.org>
[-- Attachment #1: Type: text/plain, Size: 867 bytes --]
On Thu, May 13, 2010 at 8:14 PM, Eli Zaretskii <eliz@gnu.org> wrote:
>> Date: Thu, 13 May 2010 01:01:38 +0300
>> From: Yair F <yair.f.lists@gmail.com>
>> Cc: handa@m17n.org, emacs-devel@gnu.org
>>
>> For Hebrew, the display is a bit different (no composition info):
>
> IIUC, this means no composition took place. Why did you expect a
> composition? If this is in stock Emacs 24.0.50, then there are no
> compositions defined for any of the Hebrew characters out of the box.
> This is why we need your work.
>
Something strange happens here, as these characters *are* composed
(shin + shin dot + qamats).
One more thing: in the attached test case, the Latin composition
sometimes occurs and sometimes does not. I haven't been able to
identify why.
All of this applies to the current trunk built with the attached
lisp/language/hebrew.el (Kubuntu/GTK/Xft).
[-- Attachment #2: hebrew-sample2.txt --]
[-- Type: text/plain, Size: 143 bytes --]
שָׁלוֹם לְמִשְׁתַּמְּשֵׁי אִמַאקְס
A "אֲעוֹלֵל 123 כַּגֶּפֶן" B.
עַשֶּׁשֶׁת
Ȧ
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: hebrew.el --]
[-- Type: text/x-emacs-lisp; name="hebrew.el", Size: 5304 bytes --]
;;; hebrew.el --- support for Hebrew -*- coding: iso-2022-7bit; no-byte-compile: t -*-
;; Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
;; Free Software Foundation, Inc.
;; Copyright (C) 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004,
;; 2005, 2006, 2007, 2008, 2009, 2010
;; National Institute of Advanced Industrial Science and Technology (AIST)
;; Registration Number H14PRO021
;; Copyright (C) 2003
;; National Institute of Advanced Industrial Science and Technology (AIST)
;; Registration Number H13PRO009
;; Keywords: multilingual, Hebrew
;; This file is part of GNU Emacs.
;; GNU Emacs is free software: you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.
;; GNU Emacs is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU General Public License for more details.
;; You should have received a copy of the GNU General Public License
;; along with GNU Emacs. If not, see <http://www.gnu.org/licenses/>.
;;; Commentary:
;; For Hebrew, the character set ISO8859-8 is supported.
;; See http://www.ecma.ch/ecma1/STAND/ECMA-121.HTM.
;; Windows-1255 is also supported.
;;; Code:
(define-coding-system 'hebrew-iso-8bit
  "ISO 2022 based 8-bit encoding for Hebrew (MIME:ISO-8859-8)."
  :coding-type 'charset
  :mnemonic ?8
  :charset-list '(iso-8859-8)
  :mime-charset 'iso-8859-8)
(define-coding-system-alias 'iso-8859-8 'hebrew-iso-8bit)
;; These are for Explicit and Implicit directionality information, as
;; defined in RFC 1556. We don't yet support directional information
;; in bidi languages, so these aliases are a lie, especially as far as
;; iso-8859-8-e is concerned. FIXME.
(define-coding-system-alias 'iso-8859-8-e 'hebrew-iso-8bit)
(define-coding-system-alias 'iso-8859-8-i 'hebrew-iso-8bit)
(set-language-info-alist
 "Hebrew" '((charset iso-8859-8)
            (coding-priority hebrew-iso-8bit)
            (coding-system hebrew-iso-8bit windows-1255 cp862)
            (nonascii-translation . iso-8859-8)
            (input-method . "hebrew")
            (unibyte-display . hebrew-iso-8bit)
            (sample-text . "Hebrew שלום")
            (documentation . "Right-to-left writing is not yet supported.")))
(set-language-info-alist
 "Windows-1255" '((coding-priority windows-1255)
                  (coding-system windows-1255)
                  (documentation . "\
Support for Windows-1255 encoding, e.g. for Yiddish.
Right-to-left writing is not yet supported.")))
(define-coding-system 'windows-1255
  "windows-1255 (Hebrew) encoding (MIME: WINDOWS-1255)"
  :coding-type 'charset
  :mnemonic ?h
  :charset-list '(windows-1255)
  :mime-charset 'windows-1255)
(define-coding-system-alias 'cp1255 'windows-1255)
(define-coding-system 'cp862
  "DOS codepage 862 (Hebrew)"
  :coding-type 'charset
  :mnemonic ?D
  :charset-list '(cp862)
  :mime-charset 'cp862)
(define-coding-system-alias 'ibm862 'cp862)
;; For automatic composition.
(defconst hebrew-composable-pattern
  (concat
   "\\("
   "[\u05D6-\u05D9\u05DC-\u05E2\u05E5-\u05E8]" ;; base
   "\u05BC?" ;; 0-1 marks of 1st class (dagesh)
   "[\u05B0-\u05B9\u05BB\u05C7]?" ;; 0-1 marks of 3rd class (niqqud)
   "[\u0591-\u05AF\u05BD]*" ;; 0-2 (possibly 3) marks of 4th class
   "\\|"
   "[\u05D0-\u05D4\u05DA\u05DB\u05E4\u05E5-\u05EA]"
   ;; base (allows rafe)
   "[\u05BC\u05BF]?" ;; 0-1 marks of 1st class (dagesh/rafe)
   "[\u05B0-\u05B9\u05BB\u05C7]?" ;; 0-1 marks of 3rd class (niqqud)
   "[\u0591-\u05AF\u05BD]*" ;; 0-2 (possibly 3) marks of 4th class
   "\\|"
   "\u05D5" ;; base (vav)
   "\u05BC?" ;; 0-1 marks of 1st class (dagesh)
   "[\u05B0-\u05BB\u05C7]?" ;; 0-1 marks of extended 3rd class (niqqud)
   "[\u0591-\u05AF\u05BD]*" ;; 0-2 (possibly 3) marks of 4th class
   "\\|"
   "\u05E9" ;; base (shin)
   "\u05BC?" ;; 0-1 marks of 1st class (dagesh)
   "[\u05C1\u05C2]?" ;; 0-1 marks of 2nd class (shin dot)
   "[\u05B0-\u05B9\u05BB\u05C7]?" ;; 0-1 marks of 3rd class (niqqud)
   "[\u0591-\u05AF\u05BD]*" ;; 0-2 (possibly 3) marks of 4th class
   "\\|"
   "[\u05F1-\u05F3]" ;; base (Yiddish ligatures)
   "[\u05B0-\u05B9\u05BB\u05C7]?" ;; 0-1 marks of 3rd class (niqqud)
   "[\u0591-\u05AF\u05BD]*" ;; 0-2 (possibly 3) marks of 4th class
   "\\)")
  "Regexp matching a composable sequence of Hebrew characters.")
;;; Handa-san suggested this; it still needs to be understood.
;; (let ((hebrew-diacriticals-list '((FROM1 . TO1) (FROM2 . TO2) ...))
;;       (regexp "[..HEBREW_BASE_CHARS..][..HEBREW_DIACRITICALS..]"))
;;   (dolist (elt hebrew-diacriticals-list)
;;     (set-char-table-range composition-function-table elt
;;       (list (vector regexp 1 'font-shape-gstring)))))
(set-char-table-range
 composition-function-table '(#x591 . #x5F4)
 (list (vector hebrew-composable-pattern 0 'font-shape-gstring)))
(provide 'hebrew)
;; arch-tag: 3ca04f32-3f1e-498e-af46-8267498ba5d9
;;; hebrew.el ends here
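A minimal sketch for experimenting with a pattern like the one above without going through redisplay. The helper name `my-leading-match-length' is hypothetical and is not part of hebrew.el or Emacs; it only reports how many characters a regexp matches at the start of a string. The pattern in the usage example copies just the shin alternative from the attachment (without the 4th-class marks), which is an assumption about what one would want to test first.

```elisp
;; Hypothetical helper for experimenting; not part of hebrew.el.
(defun my-leading-match-length (regexp string)
  "Return how many characters REGEXP matches at the start of STRING.
Return nil if REGEXP does not match there."
  (save-match-data
    ;; Anchor at the beginning of the string so that a match further
    ;; into STRING is not reported by mistake.
    (when (string-match (concat "\\`\\(?:" regexp "\\)") string)
      (match-end 0))))

;; Usage: only the shin alternative, copied from the attachment.
(let ((shin-cluster (concat "\u05E9"                          ; base (shin)
                            "\u05BC?"                         ; dagesh
                            "[\u05C1\u05C2]?"                 ; shin dot
                            "[\u05B0-\u05B9\u05BB\u05C7]?"))) ; niqqud
  (list
   ;; shin, shin dot, qamats
   (my-leading-match-length shin-cluster "\u05E9\u05C1\u05B8")
   ;; shin, qamats, shin dot
   (my-leading-match-length shin-cluster "\u05E9\u05B8\u05C1")))
;; => (3 2)
```

Because the optional marks appear in a fixed order in the regexp, the result depends on the order in which the marks are stored in the buffer: shin, shin dot, qamats matches all three characters, while shin, qamats, shin dot stops after the qamats.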