From: Kenichi Handa <handa@m17n.org>
To: ehud@unix.mvs.co.il
Cc: eliz@gnu.org, emacs-devel@gnu.org
Subject: Re: Usage of standard-display-table in MSDOS
Date: Mon, 06 Sep 2010 14:14:01 +0900
Message-ID: <tl739tnzc6e.fsf@m17n.org>
In-Reply-To: <201009042332.o84NWhSA017839@beta.mvs.co.il> (ehud@unix.mvs.co.il)

In article <201009042332.o84NWhSA017839@beta.mvs.co.il>, "Ehud Karni" <ehud@unix.mvs.co.il> writes:

> I attach a tar.bz2 file with 3 files:
> 1. lit1 - the sample file.
> 2. lit1-tty.png - how it should show on text terminal.
> 3. lit1-x.png   - how it should show on X.

> I can do it if I read the file with the iso-latin-1 coding-system
> and change the display table to show the Hebrew glyphs for the Hebrew
> [#xE0-#xFA] bytes. But in this way it is not Hebrew characters (e.g.
> for the new bidi display). I want it the other way around, to read it
> with hebrew-iso-8bit and to tweak the display table to show all
> the bytes not belonging to the Hebrew set.

Does that mean you want bidi reordering for the bytes
#xE0..#xFA (code points of iso-8859-8), but that bidi
reordering is not necessary for the bytes #x80..#x9A (the
Hebrew code points of cp862)?

But your file "lit1" contains #xE0..#xFA (code points of
iso-8859-8) on the second through fourth lines, in visual
order.  If bidi reordering is applied to them, you'll get a
different view from lit1-tty.png and lit1-x.png.  Is that OK?

> I had similar problem a long time ago. In 2001 you suggested to use
> the following code:

>   (make-coding-system
>       'hebrew-iso-8bit 2 ?8
>       "ISO 2022 based 8-bit encoding for Hebrew (MIME:ISO-8859-8)"
>       '(ascii hebrew-iso8859-8 nil nil
>               nil ascii-eol ascii-cntl nil nil nil nil nil t)
>       '((safe-charsets ascii hebrew-iso8859-8 eight-bit-control)
>         (mime-charset . iso-8859-8)))

> Maybe I can define a new coding system that will have bytes #x80-#xFF
> as legal characters and be recognized as a Hebrew variant.

This code will do that.  I don't think it is difficult to
understand what the code is doing.

------------------------------------------------------------
(define-charset 'cp862-sub
  "Subset of CP862"
  :code-space [#x80 #xDF]
  :subset '(cp862 #x80 #xDF #x00))

(define-charset 'iso-8859-8-sub
  "Subset of ISO-8859-8"
  :code-space [#xE0 #xFA]
  :subset '(iso-8859-8 #xE0 #xFA #x00))

(define-coding-system 'mix-hebrew
  "Mixture of ISO-8859-8 and CP862"
  :mnemonic ?H
  :coding-type 'charset
  :charset-list '(ascii iso-8859-8-sub cp862-sub)
  :ascii-compatible-p t)
------------------------------------------------------------

Please try C-x C-m c mix-hebrew RET C-x C-f lit1 RET.
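
The same thing can be done from Lisp by binding
coding-system-for-read around the read.  The following is
only a minimal sketch: demo-file-text is a hypothetical
helper, and the built-in hebrew-iso-8bit stands in for
mix-hebrew so that the snippet runs without the definitions
above.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun demo-file-text (file coding)
  "Return the contents of FILE decoded with coding system CODING."
  (with-temp-buffer
    (let ((coding-system-for-read coding))
      (insert-file-contents file))
    (buffer-string)))

;; Write three ISO-8859-8 bytes (alef, bet, gimel) to a scratch
;; file without any conversion, then read them back decoded.
(let ((file (make-temp-file "lit1-demo")))
  (unwind-protect
      (progn
        (let ((coding-system-for-write 'no-conversion))
          (write-region (unibyte-string #xE0 #xE1 #xE2)
                        nil file nil 'silent))
        (demo-file-text file 'hebrew-iso-8bit))
    (delete-file file)))
```

With mix-hebrew defined as above, (demo-file-text "lit1"
'mix-hebrew) would presumably decode the sample file the same
way the key sequence does.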

But if you do that, you must consider the problem Eli described:

In article <E1Os7oU-0006m6-7X@fencepost.gnu.org>, Eli
Zaretskii <eliz@gnu.org> writes:

> But if you want all the Hebrew characters to be treated by Emacs as
> such (e.g., for bidi display), no matter what's their encoding in the
> file, you will have to define a coding-system that will decode them
> all into Unicode codepoints of Hebrew characters.  There's a problem
> you will need to solve for defining such a coding system: it has 2
> different encodings for the same character, one from hebrew-iso-8bit,
> the other from cp862.  So you will need to decide how will Hebrew
> characters be encoded when the file is saved.

In the above definition of mix-hebrew, as iso-8859-8-sub is
listed before cp862-sub, all Hebrew characters are encoded
into bytes #xE0..#xFA even if they were originally decoded
from bytes #x80..#x9A.
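
This can be checked with a decode/encode round trip.  The
sketch below repeats the definitions under hypothetical
demo-* names (so that it does not clobber anything) and
assumes, as above, that the cp862 and iso-8859-8 charsets
are available.

```elisp
;; Hypothetical demo-* copies of the definitions above.
(define-charset 'demo-cp862-sub
  "Subset of CP862 (demo copy)"
  :code-space [#x80 #xDF]
  :subset '(cp862 #x80 #xDF #x00))

(define-charset 'demo-iso-8859-8-sub
  "Subset of ISO-8859-8 (demo copy)"
  :code-space [#xE0 #xFA]
  :subset '(iso-8859-8 #xE0 #xFA #x00))

(define-coding-system 'demo-mix-hebrew
  "Mixture of ISO-8859-8 and CP862 (demo copy)"
  :mnemonic ?H
  :coding-type 'charset
  :charset-list '(ascii demo-iso-8859-8-sub demo-cp862-sub)
  :ascii-compatible-p t)

(defun demo-mix-hebrew-round-trip (bytes)
  "Decode unibyte string BYTES with demo-mix-hebrew, then re-encode."
  (encode-coding-string
   (decode-coding-string bytes 'demo-mix-hebrew)
   'demo-mix-hebrew))

;; #x80 (CP862 alef) and #xE0 (ISO-8859-8 alef) decode to the same
;; character, so per the explanation above both should be written
;; back as #xE0, because demo-iso-8859-8-sub is listed first.
(demo-mix-hebrew-round-trip (unibyte-string #x80 #xE0))
```

Swapping the two sub-charsets in :charset-list should reverse
the preference, so that Hebrew is saved as #x80..#x9A instead.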

If you don't like that, you must give up decoding bytes
#x80..#x9A into Hebrew chars.  Instead, decode them as raw
bytes and set up a display table to display them as Hebrew
chars.  That can be done with this code:

------------------------------------------------------------
(define-charset 'cp862-sub
  "Subset of CP862"
  :code-space [#x9B #xDF]
  :subset '(cp862 #x9B #xDF #x00))

(define-charset 'iso-8859-8-sub
  "Subset of ISO-8859-8"
  :code-space [#xE0 #xFA]
  :subset '(iso-8859-8 #xE0 #xFA #x00))

(define-coding-system 'mix-hebrew
  "Mixture of ISO-8859-8, CP862, and raw 8-bit bytes"
  :mnemonic ?H
  :coding-type 'charset
  :charset-list '(ascii iso-8859-8-sub cp862-sub eight-bit)
  :ascii-compatible-p t)

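(require 'disp-table)
;; standard-display-table is nil by default, so create it if needed.
(unless standard-display-table
  (setq standard-display-table (make-display-table)))
;; Display bytes #x80..#x9A as Hebrew chars (code-points #xE0..#xFA of
;; ISO-8859-8).
(dotimes (i #x1B)
  (aset standard-display-table
        (unibyte-char-to-multibyte (+ #x80 i))
        (vector (decode-char 'iso-8859-8 (+ #xE0 i)))))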
------------------------------------------------------------

This display-table setting also works on a text terminal, as
long as you set the terminal coding system to mix-hebrew.
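
To see what one display-table entry maps from and to, the two
character computations in the loop can be evaluated on their
own.  This is only an illustrative sketch;
demo-hebrew-display-pair is a hypothetical helper.

```elisp
;; Hypothetical helper mirroring the loop body above.
(defun demo-hebrew-display-pair (i)
  "Return (RAW-BYTE-CHAR . HEBREW-CHAR) for offset I into #x80..#x9A."
  (cons (unibyte-char-to-multibyte (+ #x80 i))
        (decode-char 'iso-8859-8 (+ #xE0 i))))

;; Offset 0: the raw-byte character for #x80 (#x3FFF80) paired with
;; HEBREW LETTER ALEF (#x5D0).
(demo-hebrew-display-pair 0)
```

The key of each entry is the internal raw-byte character, not
the byte value itself, which is why the loop goes through
unibyte-char-to-multibyte.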

---
Kenichi Handa
handa@m17n.org


