From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: Re: Usage of standard-display-table in MSDOS
Date: Mon, 06 Sep 2010 14:14:01 +0900
References: <83aao8mjzx.fsf@gnu.org> <837hjcm9cw.fsf@gnu.org> <83y6brkxqe.fsf@gnu.org> <201009012333.o81NXrRq016732@beta.mvs.co.il> <201009042332.o84NWhSA017839@beta.mvs.co.il>
In-Reply-To: <201009042332.o84NWhSA017839@beta.mvs.co.il> (ehud@unix.mvs.co.il)
To: ehud@unix.mvs.co.il
Cc: eliz@gnu.org, emacs-devel@gnu.org

In article <201009042332.o84NWhSA017839@beta.mvs.co.il>, "Ehud Karni" writes:

> I attach a tar.bz2 file with 3 files:
>   1. lit1 - the sample file.
>   2. lit1-tty.png - how it should show on text terminal.
>   3. lit1-x.png - how it should show on X.

> I can do it if I read the file with the iso-latin-1 coding-system
> and change the display table to show the Hebrew glyphs for the Hebrew
> [#xE0-#xFA] bytes. But in this way it is not Hebrew characters (e.g.
> for the new bidi display). I want it the other way around, to read it
> with hebrew-iso-8bit and to tweak the display table to show all
> the bytes not belonging to the Hebrew set.

Does that mean you want bidi reordering for the bytes #xE0..#xFA
(code points of iso-8859-8), but no bidi reordering for the bytes
#x80..#x9A (code points of cp862)?

But your file "lit1" contains #xE0..#xFA (code points of iso-8859-8)
on the second through fourth lines, in visual order.
If bidi reordering is applied to them, you'll get a different view
from lit1-tty.png and lit1-x.png. Is that ok?

> I had similar problem a long time ago. In 2001 you suggested to use
> the following code:

> (make-coding-system
>  'hebrew-iso-8bit 2 ?8
>  "ISO 2022 based 8-bit encoding for Hebrew (MIME:ISO-8859-8)"
>  '(ascii hebrew-iso8859-8 nil nil
>    nil ascii-eol ascii-cntl nil nil nil nil nil t)
>  '((safe-charsets ascii hebrew-iso8859-8 eight-bit-control)
>    (mime-charset . iso-8859-8)))

> May be I can define a new coding system that will have bytes #x80-#xFF
> as legal characters and be recognized as Hebrew variant.

This code will do that. I think it's not difficult to understand what
the code is doing.

------------------------------------------------------------
(define-charset 'cp862-sub
  "Subset of CP862"
  :code-space [#x80 #xDF]
  :subset '(cp862 #x80 #xDF #x00))

(define-charset 'iso-8859-8-sub
  "Subset of ISO-8859-8"
  :code-space [#xE0 #xFA]
  :subset '(iso-8859-8 #xE0 #xFA #x00))

(define-coding-system 'mix-hebrew
  "Mixture of ISO-8859-8 and CP862"
  :mnemonic ?H
  :coding-type 'charset
  :charset-list '(ascii iso-8859-8-sub cp862-sub)
  :ascii-compatible-p t)
------------------------------------------------------------

Please try C-x C-m c mix-hebrew RET C-x C-f lit1 RET.

But if you do that, you must consider the problem Eli described:

Eli Zaretskii writes:

> But if you want all the Hebrew characters to be treated by Emacs as
> such (e.g., for bidi display), no matter what's their encoding in the
> file, you will have to define a coding-system that will decode them
> all into Unicode codepoints of Hebrew characters. There's a problem
> you will need to solve for defining such a coding system: it has 2
> different encodings for the same character, one from hebrew-iso-8bit,
> the other from cp862. So you will need to decide how will Hebrew
> characters be encoded when the file is saved.
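
The ambiguity Eli describes can be seen with the stock coding systems
alone. The following is a minimal sketch, assuming the built-in `cp862`
and `hebrew-iso-8bit` coding systems are available; the function name
`my-hebrew-alef-bytes` is hypothetical and used only for illustration.
It decodes the CP862 byte #x80 and the ISO-8859-8 byte #xE0, checks that
both give the same character, and then encodes that character back with
each coding system.

```elisp
(defun my-hebrew-alef-bytes ()
  "Return (SAME-CHAR-P CP862-BYTE ISO-BYTE) for the Hebrew letter alef.
Hypothetical helper for illustration only."
  (let* ((from-cp862 (decode-coding-string (unibyte-string #x80) 'cp862))
         (from-iso   (decode-coding-string (unibyte-string #xE0)
                                           'hebrew-iso-8bit)))
    (list
     ;; Both byte values decode to the same character.
     (string= from-cp862 from-iso)
     ;; Encoding that one character gives a different byte per coding system.
     (aref (encode-coding-string from-cp862 'cp862) 0)
     (aref (encode-coding-string from-cp862 'hebrew-iso-8bit) 0))))
```

Once both bytes have been decoded to the same character, the buffer no
longer records which byte it came from, so a mixed coding system has to
pick one byte when saving.
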
In the above definition of mix-hebrew, iso-8859-8-sub is listed before
cp862-sub, so all Hebrew characters are encoded into bytes #xE0..#xFA
even if they were originally decoded from bytes #x80..#x9A.

If you don't like that, you must give up decoding bytes #x80..#x9A
into Hebrew chars. Instead, decode them as raw bytes and set up a
display table to display them as Hebrew chars. It can be done with
this code:

------------------------------------------------------------
(define-charset 'cp862-sub
  "Subset of CP862"
  :code-space [#x9B #xDF]
  :subset '(cp862 #x9B #xDF #x00))

(define-charset 'iso-8859-8-sub
  "Subset of ISO-8859-8"
  :code-space [#xE0 #xFA]
  :subset '(iso-8859-8 #xE0 #xFA #x00))

(define-coding-system 'mix-hebrew
  "Mixture of ISO-8859-8, CP862, and raw 8-bit bytes"
  :mnemonic ?H
  :coding-type 'charset
  :charset-list '(ascii iso-8859-8-sub cp862-sub eight-bit)
  :ascii-compatible-p t)

(require 'disp-table)

;; Display bytes #x80..#x9A as Hebrew chars (code-points #xE0..#xFA of
;; ISO-8859-8).
(dotimes (i #x1B)
  (aset standard-display-table
        (unibyte-char-to-multibyte (+ #x80 i))
        (vector (decode-char 'iso-8859-8 (+ #xE0 i)))))
------------------------------------------------------------

This display-table setting also works on a terminal, as long as you
set the terminal coding system to mix-hebrew.

---
Kenichi Handa
handa@m17n.org
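
The display-table loop above changes `standard-display-table`, which
affects every buffer. As a hedged variation (not something proposed in
the message itself), the same mapping can be built in a fresh display
table and applied to a single buffer. This is only a sketch: the
function name `my-hebrew-raw-byte-display-table` is hypothetical, and it
assumes the `iso-8859-8` charset is defined as in a stock Emacs.

```elisp
(require 'disp-table)

(defun my-hebrew-raw-byte-display-table ()
  "Return a new display table showing raw bytes #x80..#x9A as Hebrew letters.
Hypothetical helper for illustration only."
  (let ((table (make-display-table)))
    ;; #x1B = 27 letters: raw byte #x80+i is shown as the character that
    ;; ISO-8859-8 assigns to code point #xE0+i.
    (dotimes (i #x1B)
      (aset table
            (unibyte-char-to-multibyte (+ #x80 i))
            (vector (decode-char 'iso-8859-8 (+ #xE0 i)))))
    table))
```

To use it in one buffer only, evaluate
`(setq buffer-display-table (my-hebrew-raw-byte-display-table))` in that
buffer. Building a fresh table also sidesteps the case where
`standard-display-table` is nil in a session, in which `aset` on it
would signal an error.
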