From: Kenichi Handa <handa@m17n.org>
To: ehud@unix.mvs.co.il
Cc: eliz@gnu.org, emacs-devel@gnu.org
Subject: Re: Usage of standard-display-table in MSDOS
Date: Mon, 06 Sep 2010 14:14:01 +0900
Message-ID: <tl739tnzc6e.fsf@m17n.org>
In-Reply-To: <201009042332.o84NWhSA017839@beta.mvs.co.il> (ehud@unix.mvs.co.il)
In article <201009042332.o84NWhSA017839@beta.mvs.co.il>, "Ehud Karni" <ehud@unix.mvs.co.il> writes:
> I attach a tar.bz2 file with 3 files:
> 1. lit1 - the sample file.
> 2. lit1-tty.png - how it should show on text terminal.
> 3. lit1-x.png - how it should show on X.
> I can do it if I read the file with the iso-latin-1 coding-system
> and change the display table to show the Hebrew glyphs for the Hebrew
> [#xE0-#xFA] bytes. But in this way it is not Hebrew characters (e.g.
> for the new bidi display). I want it the other way around, to read it
> with hebrew-iso-8bit and to tweak the display table to show all
> the bytes not belonging to the Hebrew set.
Does it mean that you want bidi-reordering for the bytes
#xE0..#xFA (code-points of iso-8859-8) but bidi-reordering
is not necessary for the bytes #x80..#x9A (code-points of
cp862)?
But your file "lit1" contains #xE0..#xFA (code-points of
iso-8859-8) on the second to fourth lines in visual order. If
bidi-reordering is applied to them, you'll get a different
view from lit1-tty.png and lit1-x.png. Is that ok?
> I had similar problem a long time ago. In 2001 you suggested to use
> the following code:
> (make-coding-system
>  'hebrew-iso-8bit 2 ?8
>  "ISO 2022 based 8-bit encoding for Hebrew (MIME:ISO-8859-8)"
>  '(ascii hebrew-iso8859-8 nil nil
>    nil ascii-eol ascii-cntl nil nil nil nil nil t)
>  '((safe-charsets ascii hebrew-iso8859-8 eight-bit-control)
>    (mime-charset . iso-8859-8)))
> May be I can define a new coding system that will have bytes #x80-#xFF
> as legal characters and be recognized as Hebrew variant.
This code will do that. I think it's not difficult to
understand what the code is doing.
------------------------------------------------------------
(define-charset 'cp862-sub
  "Subset of CP862"
  :code-space [#x80 #xDF]
  :subset '(cp862 #x80 #xDF #x00))

(define-charset 'iso-8859-8-sub
  "Subset of ISO-8859-8"
  :code-space [#xE0 #xFA]
  :subset '(iso-8859-8 #xE0 #xFA #x00))

(define-coding-system 'mix-hebrew
  "Mixture of ISO-8859-8 and CP862"
  :mnemonic ?H
  :coding-type 'charset
  :charset-list '(ascii iso-8859-8-sub cp862-sub)
  :ascii-compatible-p t)
------------------------------------------------------------
Please try C-x C-m c mix-hebrew RET C-x C-f lit1 RET.
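The decoding can also be checked without a file. The following is
only a sketch, assuming the first mix-hebrew definition above has
been evaluated; the two bytes are examples chosen here (the first
Hebrew letter in cp862 and in iso-8859-8), and the snippet has not
been run against any particular Emacs version.
------------------------------------------------------------
;; Decode one byte from each range with the coding system
;; defined above and compare the results.
(let ((from-cp862 (decode-coding-string (unibyte-string #x80) 'mix-hebrew))
      (from-iso   (decode-coding-string (unibyte-string #xE0) 'mix-hebrew)))
  ;; If both charsets are picked up as intended, the two strings
  ;; hold the same Hebrew character and the last element is t.
  (list from-cp862 from-iso (string= from-cp862 from-iso)))
------------------------------------------------------------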
But if you do that, you must consider the problem Eli wrote about:
In article <E1Os7oU-0006m6-7X@fencepost.gnu.org>, Eli
Zaretskii <eliz@gnu.org> writes:
> But if you want all the Hebrew characters to be treated by Emacs as
> such (e.g., for bidi display), no matter what's their encoding in the
> file, you will have to define a coding-system that will decode them
> all into Unicode codepoints of Hebrew characters. There's a problem
> you will need to solve for defining such a coding system: it has 2
> different encodings for the same character, one from hebrew-iso-8bit,
> the other from cp862. So you will need to decide how will Hebrew
> characters be encoded when the file is saved.
In the above definition of mix-hebrew, as iso-8859-8-sub is
listed before cp862-sub, all Hebrew characters are encoded
into bytes #xE0..#xFA even if they were originally decoded
from bytes #x80..#x9A.
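A round trip makes this visible. Again this is only a sketch,
assuming the first mix-hebrew definition is in effect; the byte
#x80 is just an example chosen here.
------------------------------------------------------------
;; Decode a cp862 byte, encode the result again, and return the
;; resulting bytes as a list of integers.
(append
 (encode-coding-string
  (decode-coding-string (unibyte-string #x80) 'mix-hebrew)
  'mix-hebrew)
 nil)
;; Per the explanation above, the list should hold #xE0 rather
;; than the original #x80.
------------------------------------------------------------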
If you don't like it, you must give up decoding bytes
#x80..#x9A into Hebrew chars. Instead, decode them as raw bytes
and set up a display table to display them as Hebrew chars.
It can be done by this code:
------------------------------------------------------------
(define-charset 'cp862-sub
  "Subset of CP862"
  :code-space [#x9B #xDF]
  :subset '(cp862 #x9B #xDF #x00))

(define-charset 'iso-8859-8-sub
  "Subset of ISO-8859-8"
  :code-space [#xE0 #xFA]
  :subset '(iso-8859-8 #xE0 #xFA #x00))

(define-coding-system 'mix-hebrew
  "Mixture of ISO-8859-8, CP862, and raw 8-bit bytes"
  :mnemonic ?H
  :coding-type 'charset
  :charset-list '(ascii iso-8859-8-sub cp862-sub eight-bit)
  :ascii-compatible-p t)

(require 'disp-table)

;; Display bytes #x80..#x9A as Hebrew chars (code-points #xE0..#xFA of
;; ISO-8859-8).
(dotimes (i #x1B)
  (aset standard-display-table
        (unibyte-char-to-multibyte (+ #x80 i))
        (vector (decode-char 'iso-8859-8 (+ #xE0 i)))))
------------------------------------------------------------
This display-table setting also works on a terminal as long as
you set the terminal coding system to mix-hebrew.
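For example, the terminal coding system could be set from an init
file like this (a sketch, assuming the second mix-hebrew definition
above has already been evaluated; the display-graphic-p guard is an
addition here, not part of the setup described above):
------------------------------------------------------------
;; Only touch the terminal coding system on a text terminal.
(unless (display-graphic-p)
  (set-terminal-coding-system 'mix-hebrew))
------------------------------------------------------------
Interactively, C-x RET t mix-hebrew RET runs the same command.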
---
Kenichi Handa
handa@m17n.org