all messages for Emacs-related lists mirrored at yhetil.org
From: Yair F <yair.f.lists@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: emacs-devel@gnu.org, handa@m17n.org
Subject: Re: Composing Hebrew diacriticals
Date: Thu, 13 May 2010 22:46:03 +0300
Message-ID: <AANLkTikN7UToHmF1EN5elYD1lMm1Rj25w6zym6tFTdaS@mail.gmail.com>
In-Reply-To: <83aas3pvve.fsf@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 867 bytes --]

On Thu, May 13, 2010 at 8:14 PM, Eli Zaretskii <eliz@gnu.org> wrote:
>> Date: Thu, 13 May 2010 01:01:38 +0300
>> From: Yair F <yair.f.lists@gmail.com>
>> Cc: handa@m17n.org, emacs-devel@gnu.org
>>
>> For Hebrew, the display is a bit different (no composition info):
>
> IIUC, this means no composition took place.  Why did you expect a
> composition?  If this is in stock Emacs 24.0.50, then there are no
> compositions defined for any of the Hebrew characters out of the box.
> This is why we need your work.
>

Something strange is happening here, because these characters *are*
composed (shin + shin dot + qamats).

One more thing: in the attached test case, the Latin composition
sometimes occurs and sometimes does not. I haven't been able to
identify why.

All of this applies to the current trunk built with the attached
lisp/language/hebrew.el (Kubuntu/GTK/Xft).

[-- Attachment #2: hebrew-sample2.txt --]
[-- Type: text/plain, Size: 143 bytes --]

שָׁלוֹם לְמִשְׁתַּמְּשֵׁי אִמַאקְס

A "אֲעוֹלֵל 123 כַּגֶּפֶן" B.

עַשֶּׁשֶׁת 

Ȧ

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: hebrew.el --]
[-- Type: text/x-emacs-lisp; name="hebrew.el", Size: 5304 bytes --]

;;; hebrew.el --- support for Hebrew -*- coding: iso-2022-7bit; no-byte-compile: t -*-

;; Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
;;   Free Software Foundation, Inc.
;; Copyright (C) 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004,
;;   2005, 2006, 2007, 2008, 2009, 2010
;;   National Institute of Advanced Industrial Science and Technology (AIST)
;;   Registration Number H14PRO021

;; Copyright (C) 2003
;;   National Institute of Advanced Industrial Science and Technology (AIST)
;;   Registration Number H13PRO009

;; Keywords: multilingual, Hebrew

;; This file is part of GNU Emacs.

;; GNU Emacs is free software: you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.

;; GNU Emacs is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.

;; You should have received a copy of the GNU General Public License
;; along with GNU Emacs.  If not, see <http://www.gnu.org/licenses/>.

;;; Commentary:

;; For Hebrew, the character set ISO8859-8 is supported.
;; See http://www.ecma.ch/ecma1/STAND/ECMA-121.HTM.
;; Windows-1255 is also supported.

;;; Code:

(define-coding-system 'hebrew-iso-8bit
  "ISO 2022 based 8-bit encoding for Hebrew (MIME:ISO-8859-8)."
  :coding-type 'charset
  :mnemonic ?8
  :charset-list '(iso-8859-8)
  :mime-charset 'iso-8859-8)

(define-coding-system-alias 'iso-8859-8 'hebrew-iso-8bit)

;; These are for Explicit and Implicit directionality information, as
;; defined in RFC 1556.  We don't yet support directional information
;; in bidi languages, so these aliases are a lie, especially as far as
;; iso-8859-8-e is concerned.  FIXME.
(define-coding-system-alias 'iso-8859-8-e 'hebrew-iso-8bit)
(define-coding-system-alias 'iso-8859-8-i 'hebrew-iso-8bit)

(set-language-info-alist
 "Hebrew" '((charset iso-8859-8)
	    (coding-priority hebrew-iso-8bit)
	    (coding-system hebrew-iso-8bit windows-1255 cp862)
	    (nonascii-translation . iso-8859-8)
	    (input-method . "hebrew")
	    (unibyte-display . hebrew-iso-8bit)
	    (sample-text . "Hebrew	^[,Hylem^[(B")
	    (documentation . "Right-to-left writing is not yet supported.")))

(set-language-info-alist
 "Windows-1255" '((coding-priority windows-1255)
		  (coding-system windows-1255)
		  (documentation . "\
Support for Windows-1255 encoding, e.g. for Yiddish.
Right-to-left writing is not yet supported.")))

(define-coding-system 'windows-1255
  "windows-1255 (Hebrew) encoding (MIME: WINDOWS-1255)"
  :coding-type 'charset
  :mnemonic ?h
  :charset-list '(windows-1255)
  :mime-charset 'windows-1255)
(define-coding-system-alias 'cp1255 'windows-1255)

(define-coding-system 'cp862
  "DOS codepage 862 (Hebrew)"
  :coding-type 'charset
  :mnemonic ?D
  :charset-list '(cp862)
  :mime-charset 'cp862)
(define-coding-system-alias 'ibm862 'cp862)

;; For automatic composition.
(defconst hebrew-composable-pattern
  (concat
   "\\("
   "[\u05D6-\u05D9\u05DC-\u05E2\u05E5-\u05E8]" ;; base
   "\u05BC?"                        ;; 0-1 marks of 1st class (dagesh)
   "[\u05B0-\u05B9\u05BB\u05C7]?"   ;; 0-1 marks of 3rd class (niqqud)
   "[\u0591-\u05AF\u05BD]*"         ;; 0-2 (possibly 3) marks of 4th class
   "\\|"
   "[\u05D0-\u05D4\u05DA\u05DB\u05E3\u05E4\u05EA]"
                                    ;; base (allows rafe); shin is left
                                    ;; to its own alternative below
   "[\u05BC\u05BF]?"                ;; 0-1 marks of 1st class (dagesh/rafe)
   "[\u05B0-\u05B9\u05BB\u05C7]?"   ;; 0-1 marks of 3rd class (niqqud)
   "[\u0591-\u05AF\u05BD]*"         ;; 0-2 (possibly 3) marks of 4th class
   "\\|"
   "\u05D5"                         ;; base (vav)
   "\u05BC?"                        ;; 0-1 marks of 1st class (dagesh)
   "[\u05B0-\u05BB\u05C7]?"         ;; 0-1 marks of extended 3rd class (niqqud)
   "[\u0591-\u05AF\u05BD]*"         ;; 0-2 (possibly 3) marks of 4th class
   "\\|"
   "\u05E9"                         ;; base (shin)
   "\u05BC?"                        ;; 0-1 marks of 1st class (dagesh)
   "[\u05C1\u05C2]?"                ;; 0-1 marks of 2nd class (shin dot)
   "[\u05B0-\u05B9\u05BB\u05C7]?"   ;; 0-1 marks of 3rd class (niqqud)
   "[\u0591-\u05AF\u05BD]*"         ;; 0-2 (possibly 3) marks of 4th class
   "\\|"
   "[\u05F0-\u05F2]"                ;; base (Yiddish ligatures)
   "[\u05B0-\u05B9\u05BB\u05C7]?"   ;; 0-1 marks of 3rd class (niqqud)
   "[\u0591-\u05AF\u05BD]*"         ;; 0-2 (possibly 3) marks of 4th class
   "\\)")
  "Regexp matching a composable sequence of Hebrew characters.")

;;; Handa-san suggested this; it still needs to be understood.
;; (let ((hebrew-diacriticals-list '((FROM1 . TO1) (FROM2 . TO2) ...))
;;       (regexp "[..HEBREW_BASE_CHARS..][..HEBREW_DIACRITICALS..]"))
;;   (dolist (elt hebrew-diacriticals-list)
;;     (set-char-table-range composition-function-table elt
;;       (list (vector regexp 1 'font-shape-gstring)))))

(set-char-table-range 
 composition-function-table '(#x591 . #x5F4)
 (list (vector hebrew-composable-pattern 0 'font-shape-gstring)))

(provide 'hebrew)

;; arch-tag: 3ca04f32-3f1e-498e-af46-8267498ba5d9
;;; hebrew.el ends here
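
The pattern in the attached hebrew.el expects the marks after a base
letter in a fixed order (dagesh, then shin/sin dot, then niqqud, then
the remaining marks). The following is a minimal sketch, not part of
the attachment, for checking how much of a string such a pattern
consumes. The names my-shin-cluster-pattern and my-cluster-prefix are
hypothetical, and the pattern repeats only the shin alternative so that
the snippet can be evaluated on its own.

```elisp
;; Sketch only.  `my-shin-cluster-pattern' and `my-cluster-prefix' are
;; hypothetical names used for illustration; the pattern repeats just
;; the shin alternative of `hebrew-composable-pattern'.
(defconst my-shin-cluster-pattern
  (concat
   "\u05E9"                        ; base (shin)
   "\u05BC?"                       ; optional dagesh
   "[\u05C1\u05C2]?"               ; optional shin dot or sin dot
   "[\u05B0-\u05B9\u05BB\u05C7]?"  ; optional niqqud
   "[\u0591-\u05AF\u05BD]*")       ; any number of the remaining marks
  "Regexp for one shin-based cluster (illustration only).")

(defun my-cluster-prefix (pattern string)
  "Return the prefix of STRING that PATTERN matches, or nil if none."
  (let ((case-fold-search nil))
    ;; Anchor at the start of STRING, much as a composition rule is
    ;; tried at one particular character position.
    (when (string-match (concat "\\`\\(?:" pattern "\\)") string)
      (match-string 0 string))))

;; Shin, shin dot, qamats, lamed: marks in the order the pattern expects.
(my-cluster-prefix my-shin-cluster-pattern "\u05E9\u05C1\u05B8\u05DC")

;; Shin, qamats, shin dot, lamed: the same marks in the other order.
(my-cluster-prefix my-shin-cluster-pattern "\u05E9\u05B8\u05C1\u05DC")
```

The first call returns the three-character cluster (shin, shin dot,
qamats). The second returns only shin plus qamats, because the shin dot
now follows the niqqud and the pattern has no slot for it there. If a
sample file stores the marks in that second order, a pattern of this
shape would not cover the whole cluster, which may be worth ruling out
when composition looks inconsistent.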
