unofficial mirror of emacs-devel@gnu.org 
* TUTORIAL.bg and windows-1251
@ 2003-11-14 18:56 Ognyan Kulev
  2003-11-15 12:19 ` Ognyan Kulev
                   ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Ognyan Kulev @ 2003-11-14 18:56 UTC (permalink / raw)


[-- Attachment #1: Type: text/plain, Size: 1124 bytes --]

Hi,

An updated etc/TUTORIAL.bg is attached.  It mostly fixes punctuation
errors and switches the encoding to windows-1251.  (The only reason the
current TUTORIAL.bg uses koi8-r is that, at the time, Emacs couldn't use
windows-1251 without evaluating (codepage-setup 1251).)

Unfortunately, one problem with windows-1251 remains.  With the
windows-1251 coding system, Emacs tries to use an iso10646-1 font even
when that font has no Cyrillic characters[1]!  (The old cp1251 coding
system doesn't have this problem.)  In conversation, Dave Love showed me
the customize option utf-fragment-on-decoding.  Setting this option
fixes the problem (an iso8859-5 font is used instead of an iso10646-1
one).  He said that deciding whether this option should be set by
default is up to other people (e.g. Handa), so my question is: what
about setting this customize option by default?  And if not, how can
the issue be resolved in Emacs 21.4?

[1] http://mail.gnu.org/archive/html/emacs-pretest-bug/2003-10/msg00010.html
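
For readers who want to try the workaround described above, here is a
minimal sketch.  It assumes an Emacs 21/22-era build that defines the
user option utf-fragment-on-decoding (later releases may not use it),
and that the form goes in an init file such as ~/.emacs.
customize-set-variable is used instead of setq on the assumption that
the option has a custom setter that must run for the change to take
effect.

```elisp
;; Assumption: this Emacs defines `utf-fragment-on-decoding'.
;; When non-nil, decoded Cyrillic is mapped into 8859-style charsets,
;; so an iso8859-5 font can be chosen instead of an iso10646-1 one.
(customize-set-variable 'utf-fragment-on-decoding t)
```

The same option can be changed interactively with
M-x customize-option RET utf-fragment-on-decoding RET.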

Regards
-- 
Ognyan Kulev <ogi@{fmi.uni-sofia.bg,fsa-bg.org,jabber.net}>
7D9F 66E6 68B7 A62B 0FCF  EB04 80BF 3A8C A252 9782

[-- Attachment #2: TUTORIAL.bg.gz --]
[-- Type: application/x-tar, Size: 18176 bytes --]


* Re: TUTORIAL.bg and windows-1251
  2003-11-14 18:56 TUTORIAL.bg and windows-1251 Ognyan Kulev
@ 2003-11-15 12:19 ` Ognyan Kulev
  2003-11-26  7:33   ` Ognyan Kulev
  2003-11-15 14:24 ` Jason Rumney
  2003-11-17  7:21 ` Kenichi Handa
  2 siblings, 1 reply; 32+ messages in thread
From: Ognyan Kulev @ 2003-11-15 12:19 UTC (permalink / raw)
  Cc: emacs-devel

Ognyan Kulev wrote:
> Updated etc/TUTORIAL.bg is attached.

Can you mention in etc/NEWS that Emacs 21.4 contains a Bulgarian
translation of the TUTORIAL?  Thank you.

Regards
-- 
Ognyan Kulev <ogi@{fmi.uni-sofia.bg,fsa-bg.org,jabber.net}>
7D9F 66E6 68B7 A62B 0FCF  EB04 80BF 3A8C A252 9782


* Re: TUTORIAL.bg and windows-1251
  2003-11-14 18:56 TUTORIAL.bg and windows-1251 Ognyan Kulev
  2003-11-15 12:19 ` Ognyan Kulev
@ 2003-11-15 14:24 ` Jason Rumney
  2003-11-17  7:21 ` Kenichi Handa
  2 siblings, 0 replies; 32+ messages in thread
From: Jason Rumney @ 2003-11-15 14:24 UTC (permalink / raw)
  Cc: emacs-devel

Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:

> doesn't have such problem.)  After conversation with Dave Love, he
> showed me customize option utf-fragment-on-decoding.  Setting this
> option fixes the problem (iso8859-5 font is used instead of
> iso10646-1 
> one).  He said that deciding whether this option should be set by
> default is up to other people (e.g. Handa), so my question is: what
> about setting this customize option by default?

Given the problems with coverage of iso10646 fonts, having that set
by default seems like a good idea.


* Re: TUTORIAL.bg and windows-1251
  2003-11-14 18:56 TUTORIAL.bg and windows-1251 Ognyan Kulev
  2003-11-15 12:19 ` Ognyan Kulev
  2003-11-15 14:24 ` Jason Rumney
@ 2003-11-17  7:21 ` Kenichi Handa
  2003-11-18 15:49   ` Ognyan Kulev
  2 siblings, 1 reply; 32+ messages in thread
From: Kenichi Handa @ 2003-11-17  7:21 UTC (permalink / raw)
  Cc: emacs-devel

In article <3FB52552.6090302@fmi.uni-sofia.bg>, Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:
> Updated etc/TUTORIAL.bg is attached.  Mostly punctuation errors are 
> fixed and windows-1251 is used.  (The only reason that current 
> TUTORIAL.bg uses koi8-r is because at that time Emacs couldn't use 
> windows-1251 without evalling (codepage-setup 1251).)

> Unfortunately, one problem with windows-1251 still remains.  When using 
> windows-1251 coding system, Emacs tries to use iso10646-1 fonts even 
> when it has no cyrillic characters[1]!  (Old cp1251 coding system 
> doesn't have such problem.)  After conversation with Dave Love, he 
> showed me customize option utf-fragment-on-decoding.  Setting this 
> option fixes the problem (iso8859-5 font is used instead of iso10646-1 
> one).  He said that deciding whether this option should be set by 
> default is up to other people (e.g. Handa), so my question is: what 
> about setting this customize option by default?  And if not, how the 
> issue can be resolved in Emacs 21.4?

I think the default handling of Cyrillic characters must be
the one most convenient for native users.  But there are many
languages that use Cyrillic, and their requirements may conflict.
So I think we must start by adjusting each language
environment.  Once we find that most language environments
require the same setting, we can make it the default.

For instance, what font do Bulgarian people mostly use on X
Window for Cyrillic characters?  Is it an iso8859-5 font?

---
Ken'ichi HANDA
handa@m17n.org


* Re: TUTORIAL.bg and windows-1251
  2003-11-17  7:21 ` Kenichi Handa
@ 2003-11-18 15:49   ` Ognyan Kulev
  2003-11-24 23:55     ` Kenichi Handa
  0 siblings, 1 reply; 32+ messages in thread
From: Ognyan Kulev @ 2003-11-18 15:49 UTC (permalink / raw)
  Cc: emacs-devel

Kenichi Handa wrote:
> I think the default handling of cyrillic characters must be
> most convenient for native users.  But, there are many
> languages that use cyrillic and their requests may conflict.
> So I think we must start from adjusting each language
> environment.  Once we found most language environments
> require the same setting, we can make it the default.

Can the X encoding be adjusted?  Aren't there only two choices for
Cyrillic: iso10646-1 and iso8859-5?

> For instance, what font Bulgarian people mostly use on X
> Window for Cyrillic characters?  Is it iso8859-5 font?

Well, most use the Microsoft Core fonts ;-)  (The Vera fonts, for
example, don't have Cyrillic.)  The next one, suitable for Emacs, is
cronyx-courier[1].  I don't know of other fonts that are often used.
Both of these fonts come in various encodings: iso10646-1,
microsoft-cp1251, koi8-r and iso8859-5.

[1] Sorry for not providing a link, but http://packages.debian.org
isn't working right now.  There, you can type "xfonts-cronyx" to search
for all packages whose names contain this string.  The home page is
http://oldrus-ispell.sourceforge.net/

So it's best to focus on cronyx-courier.  It's part of a package,
bglinux, that many Bulgarians use.  The other half of the Bulgarians
use Debian, because the maintainer of bglinux (Anton Zinoviev) is also
the maintainer of the xfonts-cronyx-* packages in Debian.  All of
bglinux is already part of Debian.

The downside of the Debian packages is that each of the four encodings
mentioned above has its own package.  So people sometimes install only
the microsoft-cp1251 and iso10646-1 fonts, without the koi8-r and
iso8859-5 ones.

Another problem with cronyx-courier is that it doesn't work when it's
set as the Default face in the Basic Faces customize group.  I've just
posted a question to comp.emacs.

What about the following: when mule-unicode-0100-24ff is used and the
selected iso10646-1 font doesn't contain the wanted character (e.g. a
Cyrillic one), Emacs searches for another font that does contain it.  I
think this will often end up at cronyx-courier.  Would this be hard to
implement?

Regards
-- 
Ognyan Kulev <ogi@{fmi.uni-sofia.bg,fsa-bg.org,jabber.net}>
7D9F 66E6 68B7 A62B 0FCF  EB04 80BF 3A8C A252 9782


* Re: TUTORIAL.bg and windows-1251
  2003-11-18 15:49   ` Ognyan Kulev
@ 2003-11-24 23:55     ` Kenichi Handa
  2003-11-26  7:16       ` Ognyan Kulev
  0 siblings, 1 reply; 32+ messages in thread
From: Kenichi Handa @ 2003-11-24 23:55 UTC (permalink / raw)
  Cc: emacs-devel

Sorry for the late responses on this thread.  I'm now
involved in more threads than my capacity allows.

In article <3FBA3F81.4010602@fmi.uni-sofia.bg>, Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:

> Kenichi Handa wrote:
>>  I think the default handling of cyrillic characters must be
>>  most convenient for native users.  But, there are many
>>  languages that use cyrillic and their requests may conflict.
>>  So I think we must start from adjusting each language
>>  environment.  Once we found most language environments
>>  require the same setting, we can make it the default.

> Can X encoding be adjusted?  Isn't there only two choices for cyrillic: 
> iso10646-1 and iso8859-5?

It seems that the bg_BG locale of glibc, GTK, or XFree86 (I
don't know which is responsible) encodes Cyrillic
characters in selections using an extended segment with the
charset name "microsoft-cp1251".

Please try the attached file.  It overrides the ctext
encoder/decoder so that microsoft-cp1251 is used on decoding
in the Bulgarian lang. env.

[...]
> The negative site of Debian packages is that each encoding of the four 
> above mentioned has its own package.  So people sometimes install only 
> microsoft-cp1251 and iso10646-1 fonts, without koi8-r and iso8859-5 ones.

> Another problem with cronyx-courier is that it doesn't work when it's 
> set in Default in Basic Faces customize group.  I've just posted 
> question to comp.emacs.

> What about the following: when mule-unicode-0100-24ff is used and the 
> used iso10646-1 font doesn't contain wanted character (e.g. cyrillic 
> one), then another font is searched that contains such character.  I 
> think this will often end up in cronyx-courier.  Is this hard to be 
> implemented?

I've implemented it in the emacs-unicode version.  But that
change requires various infrastructure of emacs-unicode, so
it's very difficult to backport it to HEAD.

Anyway, the attached ctext.el also contains a short piece of
code to enable Emacs to display windows-1251 characters with a
microsoft-cp1251 font.  Please try calling
(use-microsoft-cp1251-font).
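
A minimal sketch of trying this out follows.  The file location
~/ctext.el is hypothetical (wherever the attachment below was saved),
and the sketch assumes an Emacs 21-era build in which the
encode-windows-1251 translation table used by the attachment exists.

```elisp
;; Hypothetical location of the saved attachment; adjust as needed.
(load-file "~/ctext.el")
;; Defined in the attachment: for every character covered by the
;; windows-1251 translation table, make "fontset-default" use a font
;; whose registry is microsoft-cp1251.
(use-microsoft-cp1251-font)
```

Both forms can be evaluated in *scratch* with C-j, or placed in an init
file so that they run at startup.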

---
Ken'ichi HANDA
handa@m17n.org

--- ctext.el ---
(defvar ctext-non-standard-encodings-database
  '(("big5-0" big5 2 (chinese-big5-1 chinese-big5-2)))
  "Alist of non-standard character set encodings for CTEXT's extended segments.
Each element has the form (ENCODING-NAME CODING-SYSTEM N-OCTET CHARSET)
and provides information about how to use \"extended segments\"
with the encoding name ENCODING-NAME.

CODING-SYSTEM is the coding-system to encode the characters into
an extended segment.

N-OCTET is the number of octets (bytes) that encodes a character
in the segment.  It can be 0 (meaning the number of octets per
character is variable), 1, 2, 3, or 4.

CHARSET is a character set containing characters that are encoded
as ENCODING-NAME.  It may be a list of character sets.  It may
also be a char-table, in which case characters that have non-nil
value in the char-table are the target.

On decoding CTEXT, all encoding names listed here are recognized.

On encoding CTEXT, encoding names in the variable
`ctext-non-standard-encodings-list' and in
`ctext-non-standard-encodings' property of the current language
environment are used.")

(defun ctext-post-read-conversion (len)
  "Decode LEN characters encoded as Compound Text with Extended Segments."
  (save-match-data
    (save-restriction
      (let ((case-fold-search nil)
	    (in-workbuf (string= (buffer-name) " *code-converting-work*"))
	    last-coding-system-used
	    pos bytes)
	(or in-workbuf
	    (narrow-to-region (point) (+ (point) len)))
	(decode-coding-region (point-min) (point-max) 'ctext)
	(if in-workbuf
	    (set-buffer-multibyte t))
	(while (re-search-forward ctext-non-standard-encodings-regexp
				  nil 'move)
	  (setq pos (match-beginning 0))
	  (if (match-beginning 1)
	      ;; ESC % / [0-4] M L --ENCODING-NAME-- \002 --BYTES--
	      (let* ((M (char-after (+ pos 4)))
		     (L (char-after (+ pos 5)))
		     (encoding (match-string 2))
		     (encoding-info (assoc-ignore-case 
				     encoding
				     ctext-non-standard-encodings-database))
		     (coding (if encoding-info
				 (nth 1 encoding-info)
			       (setq encoding (intern (downcase encoding)))
			       (and (coding-system-p encoding)
				    encoding))))
		(setq bytes (- (+ (* (- M 128) 128) (- L 128))
			       (- (point) (+ pos 6))))
		(when coding
		  (delete-region pos (point))
		  (forward-char bytes)
		  (decode-coding-region (- (point) bytes) (point) coding)))
	    ;; ESC % G --UTF-8-BYTES-- ESC % @
	    (setq bytes (- (point) pos))
	    (decode-coding-region (- (point) bytes) (point) 'utf-8))))
      (goto-char (point-min))
      (- (point-max) (point)))))

(defvar ctext-non-standard-encodings-list
  '("big5-0")
  "List of non-standard character set encoding names used in CTEXT.")

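(defun ctext-non-standard-encodings-table ()
  "Return a char-table mapping characters to extended-segment info.
Each non-nil value is the `ctext-non-standard-encodings-database'
entry of an encoding that is currently enabled for encoding."
  (let ((table (make-char-table 'translation-table)))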
    (dolist (encoding (reverse
		       (append
			(get-language-info current-language-environment
					   'ctext-non-standard-encodings)
			ctext-non-standard-encodings-list)))
      (let* ((slot (assoc encoding ctext-non-standard-encodings-database))
	     (charset (nth 3 slot)))
	(if charset
	    (cond ((charsetp charset)
		   (aset table (make-char charset) slot))
		  ((listp charset)
		   (dolist (elt charset)
		     (aset table (make-char elt) slot)))
		  ((char-table-p charset)
		   (map-char-table #'(lambda (k v) 
				   (if (and v (> k 128)) (aset table k slot)))
				   charset))))))
    table))

(defun ctext-pre-write-conversion (from to)
  "Encode characters between FROM and TO as Compound Text w/Extended Segments.

If FROM is a string, or if the current buffer is not the one set up for us
by encode-coding-string, generate a new temp buffer, insert the
text, and convert it in the temporary buffer.  Otherwise, convert in-place."
  (save-match-data
    ;; Set up a working buffer if necessary.
    (cond ((stringp from)
	   (let ((buf (current-buffer)))
	     (set-buffer (generate-new-buffer " *temp"))
	     (set-buffer-multibyte (multibyte-string-p from))
	     (insert from)))
	  ((not (string= (buffer-name) " *code-converting-work*"))
	   (let ((buf (current-buffer))
		 (multibyte enable-multibyte-characters))
	     (set-buffer (generate-new-buffer " *temp"))
	     (set-buffer-multibyte multibyte)
	     (insert-buffer-substring buf from to))))

    ;; Now we can encode the whole buffer.
    (let ((encoding-table (ctext-non-standard-encodings-table))
	  last-coding-system-used
	  last-pos last-encoding-info
	  pos encoding-info end-pos)
      (goto-char (setq last-pos (point-min)))
      (setq end-pos (point-marker))
      (while (re-search-forward "[^\000-\177]+" nil t)
	(setq last-pos (match-beginning 0)
	      last-encoding-info (aref encoding-table (char-after last-pos)))
	(set-marker end-pos (match-end 0))
	(goto-char (1+ last-pos))
	(catch 'tag
	  (while t
	    (setq encoding-info
		  (if (< (point) end-pos)
		      (aref encoding-table (following-char))))
	    (unless (eq last-encoding-info encoding-info)
	      (if last-encoding-info
		  (let ((encoding-name (car last-encoding-info))
			(coding-system (nth 1 last-encoding-info))
			(noctets (nth 2 last-encoding-info))
			len)
		    (encode-coding-region last-pos (point) coding-system)
		    (setq len (+ (length encoding-name) 1
				 (- (point) last-pos)))
		    (save-excursion
		      (goto-char last-pos)
		      (insert (string-to-multibyte 
			       (format "\e%%/%d%c%c%s\x02"
				       noctets
				       (+ (/ len 128) 128)
				       (+ (% len 128) 128)
				       encoding-name)))))
		(encode-coding-region last-pos (point) 'ctext-no-compositions))
	      (setq last-pos (point)
		    last-encoding-info encoding-info))
	    (if (< (point) end-pos)
		(forward-char 1)
	      (throw 'tag nil))))
	(if (< last-pos (point))
	    (encode-coding-region last-pos (point) 'ctext-no-compositions)))
      (set-marker end-pos nil)
      (goto-char (point-min))))
  ;; Must return nil, as build_annotations_2 expects that.
  nil)

;; The following forms override the current settings.

(set-language-info "Bulgarian" 'ctext-non-standard-encodings
		   '("microsoft-cp1251"))

(let ((elt `("microsoft-cp1251" windows-1251 1
	     ,(get 'encode-windows-1251 'translation-table)))
      (slot (assoc "microsoft-cp1251" ctext-non-standard-encodings-database)))
  (if slot
      (setcdr slot (cdr elt))
    (push elt ctext-non-standard-encodings-database)))

(define-ccl-program ccl-encode-windows-1251-font
  '(0
    ((r1 <<= 7)
     (r1 += r2)
     (translate-character encode-windows-1251 r0 r1)
     )))

(let ((slot (assoc "microsoft-cp1251" font-ccl-encoder-alist)))
  (if slot
      (setcdr slot ccl-encode-windows-1251-font)
    (push '("microsoft-cp1251" . ccl-encode-windows-1251-font)
	  font-ccl-encoder-alist)))

(defun use-microsoft-cp1251-font ()
  "Display windows-1251 characters with a microsoft-cp1251 font.
This modifies \"fontset-default\"."
  (let ((fontspec '(nil . "microsoft-cp1251")))
    (map-char-table
     #'(lambda (k v) 
	 (if (and v (> k 128))
	     (set-fontset-font "fontset-default" k fontspec)))
     (get 'encode-windows-1251 'translation-table))))
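
The length octets M and L of the extended-segment header handled in the
file above can be illustrated in isolation.  The helper below is
hypothetical (it is not part of ctext.el) and assumes an Emacs recent
enough to provide unibyte-string.  It follows the same arithmetic as
ctext-pre-write-conversion: the length counts the encoding name, the
STX separator and the data bytes, and is split into two 7-bit halves
that each have the high bit set.

```elisp
(defun my-ctext-extended-segment-header (encoding-name noctets data-length)
  "Return the header of a Compound Text extended segment as a unibyte string.
ENCODING-NAME is an ASCII string such as \"microsoft-cp1251\", NOCTETS
is the number of octets per character (0 to 4), and DATA-LENGTH is the
number of encoded bytes that will follow the header."
  (let ((len (+ (length encoding-name) 1 data-length)))
    (concat
     ;; ESC % / F, where F is the digit for NOCTETS.
     (unibyte-string ?\e ?% ?/ (+ ?0 noctets))
     ;; M and L: high and low 7 bits of LEN, each with the 8th bit set.
     (unibyte-string (+ (/ len 128) 128) (+ (% len 128) 128))
     encoding-name
     ;; STX (octet 2) terminates the encoding name.
     (unibyte-string 2))))
```

For example, (my-ctext-extended-segment-header "microsoft-cp1251" 1 5)
works with a length of 16 + 1 + 5 = 22, so M is 128 and L is 150.
ctext-post-read-conversion reverses this with
(+ (* (- M 128) 128) (- L 128)).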


* Re: TUTORIAL.bg and windows-1251
  2003-11-24 23:55     ` Kenichi Handa
@ 2003-11-26  7:16       ` Ognyan Kulev
  2003-11-26  7:47         ` Kenichi Handa
  0 siblings, 1 reply; 32+ messages in thread
From: Ognyan Kulev @ 2003-11-26  7:16 UTC (permalink / raw)
  Cc: emacs-devel

Kenichi Handa wrote:
> In article <3FBA3F81.4010602@fmi.uni-sofia.bg>, Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:
>>Can X encoding be adjusted?  Isn't there only two choices for cyrillic: 
>>iso10646-1 and iso8859-5?
> 
> It seems that bg_BG locale of glibc, gtk, or XFree86 (I
> don't know which is responsible for) encodes cyrillic
> characters using extended segment with charset name
> "microsoft-cp1251" in selection.

Yes, CP1251 is to Bulgaria what KOI8-R is to Russia.

> Please try the attached file.  It overrides the ctext
> encoder/decoder so that microsoft-cp1251 is used on decoding
> in Bulgarian lang. env.

After evaluating (use-microsoft-cp1251-font) in *scratch*, windows-1251
files are displayed correctly :-)

Would it be a problem if this function were always evaluated when Emacs
starts?  One may want to view a windows-1251-encoded file without being
in the Bulgarian language environment.  Will this case be handled?

Regards
-- 
Ognyan Kulev <ogi@{fmi.uni-sofia.bg,fsa-bg.org,jabber.net}>
7D9F 66E6 68B7 A62B 0FCF  EB04 80BF 3A8C A252 9782


* Re: TUTORIAL.bg and windows-1251
  2003-11-15 12:19 ` Ognyan Kulev
@ 2003-11-26  7:33   ` Ognyan Kulev
  0 siblings, 0 replies; 32+ messages in thread
From: Ognyan Kulev @ 2003-11-26  7:33 UTC (permalink / raw)


This mail is just a reminder.  TUTORIAL.bg can be downloaded from
http://mail.gnu.org/archive/html/emacs-devel/2003-11/msg00180.html

Ognyan Kulev wrote:
| Ognyan Kulev wrote:
|> Updated etc/TUTORIAL.bg is attached.
| Can you mention that Emacs 21.4 contains bulgarian translation of
| TUTORIAL in etc/NEWS?  Thank you.

Regards
-- 
Ognyan Kulev <ogi@{fmi.uni-sofia.bg,fsa-bg.org,jabber.net}>
7D9F 66E6 68B7 A62B 0FCF  EB04 80BF 3A8C A252 9782


* Re: TUTORIAL.bg and windows-1251
  2003-11-26  7:16       ` Ognyan Kulev
@ 2003-11-26  7:47         ` Kenichi Handa
  2003-11-26  8:30           ` Ognyan Kulev
  0 siblings, 1 reply; 32+ messages in thread
From: Kenichi Handa @ 2003-11-26  7:47 UTC (permalink / raw)
  Cc: emacs-devel

In article <3FC45367.6070504@fmi.uni-sofia.bg>, Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:
> Kenichi Handa wrote:
>>  In article <3FBA3F81.4010602@fmi.uni-sofia.bg>, Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:
>>> Can X encoding be adjusted?  Isn't there only two choices for cyrillic: 
>>> iso10646-1 and iso8859-5?
>>  
>>  It seems that bg_BG locale of glibc, gtk, or XFree86 (I
>>  don't know which is responsible for) encodes cyrillic
>>  characters using extended segment with charset name
>>  "microsoft-cp1251" in selection.

> Yes, CP1251 in Bulgaria is what is KOI8-R in Russia.

Have you tested copy&paste of Cyrillic text between Emacs
and other applications (e.g. Mozilla)?  Did it work well?

>>  Please try the attached file.  It overrides the ctext
>>  encoder/decoder so that microsoft-cp1251 is used on decoding
>>  in Bulgarian lang. env.

> After (use-microsoft-cp1251-font) in *scratch*, windows-1251 files are 
> displayed correctly :-)

> Is it problem if this function is always evaled when Emacs is started? 
> One can want to view windows-1251 encoded file without being in 
> Bulgarian language environment.  Will this case be resolved?

But if the user is not in a lang. env. that mainly uses
windows-1251, I think we can't assume that he wants to see
such files in a microsoft-cp1251 font.  I think it is more
likely that he has an iso10646 font.

By the way, I'm now designing a facility to use
microsoft-cp1251 for Cyrillic characters automatically in
the Bulgarian lang. env. by implementing something like this.

------------------------------------------------------------
set-overriding-fontspec-internal is a built-in function.

Internal use only.

FONTLIST is an alist of TARGET vs FONTNAME, where TARGET is a charset
or a char-table, and FONTNAME has the same meaning as in
`set-fontset-font'.

It overrides the font specifications for each TARGET in the default
fontset by the corresponding FONTNAME.

If TARGET is a charset, targets are all characters in the charset.  If
TARGET is a char-table, targets are characters whose value is non-nil
in the table.

It is intended that this function is called only from
`set-language-environment'.
------------------------------------------------------------

A lang. env. such as Bulgarian will have an `overriding-fontset'
property whose value is used as the argument to the above
function.

It's a lightweight method that overrides the default fontset
but can be cancelled easily when we switch to another
lang. env.

---
Ken'ichi HANDA
handa@m17n.org


* Re: TUTORIAL.bg and windows-1251
  2003-11-26  7:47         ` Kenichi Handa
@ 2003-11-26  8:30           ` Ognyan Kulev
  2003-11-26 13:17             ` Kenichi Handa
  0 siblings, 1 reply; 32+ messages in thread
From: Ognyan Kulev @ 2003-11-26  8:30 UTC (permalink / raw)
  Cc: emacs-devel

Kenichi Handa wrote:
> In article <3FC45367.6070504@fmi.uni-sofia.bg>, Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:
> Have you tested copy&paste of Cyrillic text between Emacs
> and the other applications (e.g. Mozilla)?  Did it work well?

I've tested it now and it worked well in both directions.

>>Is it problem if this function is always evaled when Emacs is started? 
>>One can want to view windows-1251 encoded file without being in 
>>Bulgarian language environment.  Will this case be resolved?
> 
> But, if he is not in a lang. env. that mainly uses
> windows-1251, I think we can't assume that he want to see
> them in microsoft-cp1251 font.  I think the possibility that
> he has iso10646 font is higher.

You're right.  But this brings us back to the initial problem: I have
-monotype-courier new-*-iso10646-1 and -cronyx-courier-*-iso10646-1,
both with Cyrillic characters.  But instead of these,
-adobe-courier-*-iso10646-1 is used, which has no Cyrillic characters.
And I think that this -adobe-courier-*-iso10646-1 is part of XFree86, so
it's almost everywhere and f^Hruins everything.  IMHO Cyrillic
characters in windows-1251 should be mapped to a koi8-r or iso8859-5
font (when not in the Bulgarian lang. env.) instead of an iso10646-1
font.  But which of these two fonts is more popular world-wide -- I
don't know.

> By the way, I'm now designing a facility to use
> microsoft-cp1251 for Cyrillic characters automatically in
> Bulgarian lang. env. by implementing something like this.

Given that the Bulgarian language environment is automatically selected
when LANG=bg_BG, I'm completely satisfied with this.

Regards
-- 
Ognyan Kulev <ogi@{fmi.uni-sofia.bg,fsa-bg.org,jabber.net}>
7D9F 66E6 68B7 A62B 0FCF  EB04 80BF 3A8C A252 9782


* Re: TUTORIAL.bg and windows-1251
  2003-11-26  8:30           ` Ognyan Kulev
@ 2003-11-26 13:17             ` Kenichi Handa
  2003-11-26 14:08               ` Ognyan Kulev
  2003-12-03  8:34               ` Kenichi Handa
  0 siblings, 2 replies; 32+ messages in thread
From: Kenichi Handa @ 2003-11-26 13:17 UTC (permalink / raw)
  Cc: emacs-devel

In article <3FC464C2.7010504@fmi.uni-sofia.bg>, Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:
>>  Have you tested copy&paste of Cyrillic text between Emacs
>>  and the other applications (e.g. Mozilla)?  Did it work well?

> I've tested it now and it worked well in both directions.

Thank you for confirming it.   I'll install that change next
week (I must go to Kyoto soon).

>>  But, if he is not in a lang. env. that mainly uses
>>  windows-1251, I think we can't assume that he want to see
>>  them in microsoft-cp1251 font.  I think the possibility that
>>  he has iso10646 font is higher.

> You're right.  But we return back to the initial problem: I have 
> -monotype-courier new-*-iso10646-1 and -cronyx-courier-*-iso10646-1, 
> both with cyrillic characters.  But instead of these, 
> -adobe-courier-*-iso10646-1 is used, which hasn't cyrillic characters. 
> And I think that this -adobe-courier-*-iso10646-1 is part of XFree86, so 
> it's almost everywhere and f^Hruins everything.  

Don't you have -misc-fixed-*-iso10646-1?  These fonts contain
Cyrillic characters.  I thought they were included in the
latest XFree86.  At least my Debian and Red Hat systems have them.

By the way, you can prevent Emacs from using
-adobe-courier-*-iso10646-1 by listing it in
face-ignored-fonts.  Please try this.

(setq face-ignored-fonts '(("-adobe-courier-.*-iso10646-1")))
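
A note for readers trying this: the doc string of face-ignored-fonts
describes its value as a list of regular expressions, which suggests
that each element should be a string rather than a list containing a
string.  A minimal sketch under that assumption follows.  The call to
clear-face-cache is also an assumption, added so that fonts already
chosen for realized faces are reconsidered.

```elisp
;; Assumed format: a flat list of regexp strings matched against
;; font names.
(setq face-ignored-fonts '("-adobe-courier-.*-iso10646-1"))
;; Assumption: realized faces may cache the earlier font choice, so
;; ask Emacs to drop its face caches on all frames.
(clear-face-cache t)
```

Reopening the windows-1251 file afterwards and checking the font with
C-u C-x = on a Cyrillic character shows whether the setting took
effect.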

> IMHO cyrillic 
> characters in windows-1251 has to be mapped to koi8-r or iso8859-5 font 
>   (when not in Bulgarian lang.env.), instead of iso10646-1 font.  But 
> which of these two fonts is more popular world-wide -- I don't know.

It seems that neither koi8-r nor iso8859-5 covers all
characters in windows-1251.  So I think iso10646-1 is
better (as long as the font contains Cyrillic chars).

>>  By the way, I'm now designing a facility to use
>>  microsoft-cp1251 for Cyrillic characters automatically in
>>  Bulgarian lang. env. by implementing something like this.

> Given that Bulgarian language environment is automatically selected when 
> LANG=bg_BG,

Yes.

>  I'm completely satisfied with this.

Ok, I'll keep on working on it.

---
Ken'ichi HANDA
handa@m17n.org


* Re: TUTORIAL.bg and windows-1251
  2003-11-26 13:17             ` Kenichi Handa
@ 2003-11-26 14:08               ` Ognyan Kulev
  2003-12-03  8:34               ` Kenichi Handa
  1 sibling, 0 replies; 32+ messages in thread
From: Ognyan Kulev @ 2003-11-26 14:08 UTC (permalink / raw)
  Cc: emacs-devel

Kenichi Handa wrote:
> Don't you have -misc-fixed-*-iso10646-1?  They contains
> cyrillic characters.  I thought they are included in the
> latest XFree86.  At least my debian and redhat have them.

I see this font now.  It's available only in the iso10646-1 and
iso8859-1 encodings, which is why I didn't notice it before.

> By the way, you can inhibit using
> -adobe-courier-*-iso10646-1 by specifying it in
> face-ignored-fonts.  Please try this.
> 
> (setq face-ignored-fonts '(("-adobe-courier-.*-iso10646-1")))

I tried with

(setq face-ignored-fonts '(("-Adobe-Courier-.*-ISO10646-1")
			   ("-adobe-courier-.*-iso10646-1")))

but this didn't help.  After evaluating this and opening my
windows-1251-encoded file, Emacs again tries to display it with
-Adobe-Courier-Medium-R-Normal--17-120-100-100-M-100-ISO10646-1.
Should I report this to emacs-pretest-bug?

 > It seems that both koi8-r and iso8859-5 doesn't covers all
 > characters in windows-1251.  Then, I think iso10646-1 is
 > better (as far as it contains cyrillic chars).

OK.

Regards
-- 
Ognyan Kulev <ogi@{fmi.uni-sofia.bg,fsa-bg.org,jabber.net}>
7D9F 66E6 68B7 A62B 0FCF  EB04 80BF 3A8C A252 9782


* Re: TUTORIAL.bg and windows-1251
  2003-11-26 13:17             ` Kenichi Handa
  2003-11-26 14:08               ` Ognyan Kulev
@ 2003-12-03  8:34               ` Kenichi Handa
  2003-12-04 16:28                 ` Ognyan Kulev
  1 sibling, 1 reply; 32+ messages in thread
From: Kenichi Handa @ 2003-12-03  8:34 UTC (permalink / raw)
  Cc: ogi, emacs-devel

In article <200311261317.WAA27673@etlken.m17n.org>, Kenichi Handa <handa@m17n.org> writes:
> In article <3FC464C2.7010504@fmi.uni-sofia.bg>, Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:
>>>   Have you tested copy&paste of Cyrillic text between Emacs
>>>   and the other applications (e.g. Mozilla)?  Did it work well?

>>  I've tested it now and it worked well in both directions.

> Thank you for confirming it.   I'll install that change next
> week (I must go to Kyoto soon).

I've just committed that change to CVS HEAD.

As the change contains a full rewrite of the ctext handling
functions, I'd like to ask other people to test
cut&paste of non-ASCII text from/to various X clients.

---
Ken'ichi HANDA
handa@m17n.org


* Re: TUTORIAL.bg and windows-1251
  2003-12-03  8:34               ` Kenichi Handa
@ 2003-12-04 16:28                 ` Ognyan Kulev
  2003-12-04 23:28                   ` Kenichi Handa
  0 siblings, 1 reply; 32+ messages in thread
From: Ognyan Kulev @ 2003-12-04 16:28 UTC (permalink / raw)
  Cc: emacs-devel

(About ctext stuff)

Kenichi Handa wrote:
> I've just commited that change in CVS HEAD.

Unfortunately, it doesn't work in Emacs[1].  (The ctext.el that you sent
me in the previous mail works.)  Emacs was started with LANG=bg_BG.  I
tried setting the lang. env. to English and back to Bulgarian, but that
didn't help.

[1] Checked out on 1st December, shortly after your commit -- I don't 
know if you committed other stuff since then.

Regards
-- 
Ognyan Kulev <ogi@{fmi.uni-sofia.bg,fsa-bg.org,jabber.net}>
7D9F 66E6 68B7 A62B 0FCF  EB04 80BF 3A8C A252 9782


* Re: TUTORIAL.bg and windows-1251
  2003-12-04 16:28                 ` Ognyan Kulev
@ 2003-12-04 23:28                   ` Kenichi Handa
  2003-12-31 15:06                     ` Ognyan Kulev
  0 siblings, 1 reply; 32+ messages in thread
From: Kenichi Handa @ 2003-12-04 23:28 UTC (permalink / raw)
  Cc: emacs-devel

In article <3FCF60C9.8060009@fmi.uni-sofia.bg>, Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:
> (About ctext stuff)
> Kenichi Handa wrote:
>>  I've just commited that change in CVS HEAD.

> Unfortunately, it doesn't work in Emacs[1].  (The ctext.el that you sent 
> me in previous mail works.)  emacs is started with LANG=bg_BG.  I'm 
> setting lang.env. to English and back to Bulgarian, but nothing helped.

Please check the value of
`ctext-non-standard-encodings-alist'.   Does it contain an
element like this:
  ("microsoft-cp1251" windows-1251 1 CHAR-TABLE)
If not, please byte-compile lisp/language/cyrillic.el and
rebuild Emacs.
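
The check described above can be sketched as follows.  It assumes a
build that already includes the committed change, so that both the
variable and the language-environment property exist.

```elisp
;; Evaluate each form with C-x C-e (or M-:) and inspect the result.
;; A non-nil result means the encoding is registered.
(assoc "microsoft-cp1251" ctext-non-standard-encodings-alist)
;; The extended-segment encodings enabled for the Bulgarian
;; language environment; expected to mention "microsoft-cp1251".
(get-language-info "Bulgarian" 'ctext-non-standard-encodings)
```

If the first form returns nil, the registration in cyrillic.el has
probably not been loaded, which is the case the recompilation advice
addresses.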

> [1] Checked out on 1st December, shortly after your commit -- I don't 
> know if you committed other stuff since then.

These are all the changes.

2003-12-03  Kenichi Handa  <handa@m17n.org>

	* language/cyrillic.el: Register "microsoft-cp1251" in
	ctext-non-standard-encodings-alist.
	("Bulgarian"): Add ctext-non-standard-encodings.
	("Belarusian"): Likewise.

	* international/mule-conf.el (compound-text-with-extensions):
	Change the type to 2 (iso-2022 base).

	* international/mule.el (ctext-non-standard-encodings-alist):
	Change the format.
	(ctext-non-standard-encodings): New variable.
	(ctext-post-read-conversion): Fully re-written.
	(ctext-non-standard-designations-alist): Delete it.
	(ctext-non-standard-encodings-table): New function.
	(ctext-pre-write-conversion): Fully re-written.

---
Ken'ichi HANDA
handa@m17n.org

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: TUTORIAL.bg and windows-1251
  2003-12-04 23:28                   ` Kenichi Handa
@ 2003-12-31 15:06                     ` Ognyan Kulev
  2003-12-31 15:54                       ` Eli Zaretskii
  2004-01-05  4:14                       ` Kenichi Handa
  0 siblings, 2 replies; 32+ messages in thread
From: Ognyan Kulev @ 2003-12-31 15:06 UTC (permalink / raw)
  Cc: emacs-devel

Kenichi Handa wrote:
> In article <3FCF60C9.8060009@fmi.uni-sofia.bg>, Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:
>>Unfortunately, it doesn't work in Emacs[1].  (The ctext.el that you sent 
>>me in previous mail works.)  emacs is started with LANG=bg_BG.  I'm 
>>setting lang.env. to English and back to Bulgarian, but nothing helped.
> 
> Please check the value of
> `ctext-non-standard-encodings-alist'.   Does it contain an
> element something like:
>   ("microsoft-cp1251" windows-1251 1 CHAR-TABLE)
> If not, please byte compile lisp/language/cyrillic.el and
> make Emacs again.

Today (2003-12-31) I checked out Emacs and tested it again.
Unfortunately, it doesn't work as expected: the microsoft-cp1251 font
encoding is not used.

ctext-non-standard-encodings-alist contains microsoft-cp1251.

I tried to understand the code, but without much success.
ctext-pre-write-conversion seems to be the only place that uses the
ctext-non-standard-encodings property of the current language
environment (via ctext-non-standard-encodings-table).  Do I understand
correctly that all rendering of text to X is somehow done via the
"special" compound text (ctext) coding system, while the buffer can be
in another coding system?  If not, how are this ctext coding system
and, consequently, the ctext-non-standard-encodings property used?
(I'm just trying to help get this working.)

Regards
-- 
Ognyan Kulev <ogi@{fmi.uni-sofia.bg,fsa-bg.org,jabber.org}>
7D9F 66E6 68B7 A62B 0FCF  EB04 80BF 3A8C A252 9782

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: TUTORIAL.bg and windows-1251
  2003-12-31 15:06                     ` Ognyan Kulev
@ 2003-12-31 15:54                       ` Eli Zaretskii
  2004-01-05  4:20                         ` Kenichi Handa
  2004-01-05  4:14                       ` Kenichi Handa
  1 sibling, 1 reply; 32+ messages in thread
From: Eli Zaretskii @ 2003-12-31 15:54 UTC (permalink / raw)
  Cc: emacs-devel

> Date: Wed, 31 Dec 2003 17:06:07 +0200
> From: Ognyan Kulev <ogi@fmi.uni-sofia.bg>
> 
> Do I understand it right that somehow all rendering of text to X is
> done via the "special" compound text (ctext) coding system [...]?

I don't think so.  See xfns.c:x_encode_text for the gory details.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: TUTORIAL.bg and windows-1251
  2003-12-31 15:06                     ` Ognyan Kulev
  2003-12-31 15:54                       ` Eli Zaretskii
@ 2004-01-05  4:14                       ` Kenichi Handa
  2004-01-06 12:03                         ` YAMAMOTO Mitsuharu
  2004-01-07 16:22                         ` Ognyan Kulev
  1 sibling, 2 replies; 32+ messages in thread
From: Kenichi Handa @ 2004-01-05  4:14 UTC (permalink / raw)
  Cc: emacs-devel

In article <3FF2E5DF.4090906@fmi.uni-sofia.bg>, Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:
> Today (2003-12-31) I've checked out emacs and tested it again. 
> Unfortunately, it doesn't work as expected -- microsoft-cp1251 font 
> encoding is not used.

> ctext-non-standard-encodings-alist contains microsoft-cp1251.

Could you try this code?

(let ((lang-env current-language-environment)
      (mirror-R (string (decode-char 'ucs #x42f)))
      (hex-print #'(lambda (head str) 
		     (insert head)
		     (dotimes (i (length str))
		       (let ((ch (aref str i)))
			 (if (< ch 128)
			     (insert ch)
			   (insert (format "\\x%X" (aref str i))))))
		     (insert "\n")))
      encoded decoded)
  (funcall hex-print "original:" mirror-R)
  (set-language-environment "Bulgarian")
  (setq encoded (encode-coding-string mirror-R 'ctext-with-extensions))
  (funcall hex-print "encoded: " encoded)
  (setq decoded (decode-coding-string encoded 'ctext-with-extensions))
  (funcall hex-print "decoded: " decoded)
  (set-language-environment "English")
  (setq encoded (encode-coding-string mirror-R 'ctext-with-extensions))
  (funcall hex-print "encoded: " encoded)
  (setq decoded (decode-coding-string encoded 'ctext-with-extensions))
  (funcall hex-print "decoded: " decoded)
  (set-language-environment lang-env))

The result I got is this.

original:\x5144F
encoded: ^[%/1\x80\x92microsoft-cp1251\x02\xDF
decoded: \x5144F
encoded: ^[%G\xD0\xAF^[%@
decoded: \x5144F

It seems that the coding system ctext-with-extensions is
working as expected here.
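
The first `encoded:' line above has the shape of a Compound Text
extended segment: ESC % / 1, two length octets, the encoding name, STX,
then the data.  The C sketch below rebuilds those octets by hand to show
where \x80\x92 and \xDF come from.  It is only an illustration, not the
code in mule.el; the function names are hypothetical, and the
windows-1251 mapping covers only the contiguous block U+0410..U+044F.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map U+0410..U+044F to windows-1251 0xC0..0xFF; return -1 for
   anything else.  Only this contiguous block is handled here. */
int cp1251_from_ucs(unsigned ucs)
{
    if (ucs >= 0x410 && ucs <= 0x44F)
        return (int)(ucs - 0x410 + 0xC0);
    return -1;
}

/* Write a Compound Text extended segment for a one-octet-per-character
   encoding into OUT and return its length, or 0 if OUT is too small.
   Layout: ESC % / 1 M L <name> STX <data>.  M and L carry the number
   of octets that follow them (name + STX + data) as two 7-bit digits,
   each with the high bit set. */
size_t ctext_extended_segment(unsigned char *out, size_t out_size,
                              const char *name,
                              const unsigned char *data, size_t data_len)
{
    size_t name_len = strlen(name);
    size_t body_len = name_len + 1 + data_len;
    size_t total = 6 + body_len;
    size_t i = 0;

    assert(body_len < 128 * 128);   /* must fit in two 7-bit digits */
    if (out_size < total)
        return 0;

    out[i++] = 0x1B;                /* ESC */
    out[i++] = '%';
    out[i++] = '/';
    out[i++] = '1';                 /* one octet per character */
    out[i++] = (unsigned char)(0x80 | (body_len / 128));
    out[i++] = (unsigned char)(0x80 | (body_len % 128));
    memcpy(out + i, name, name_len);
    i += name_len;
    out[i++] = 0x02;                /* STX ends the encoding name */
    memcpy(out + i, data, data_len);
    i += data_len;
    return i;
}
```

For U+042F the data octet is 0xDF, and the name "microsoft-cp1251" (16
octets) plus STX plus one data octet gives a length of 18, hence the
length octets 0x80 0x92.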

> I tried to understand the code, but without great success.  So 
> ctext-pre-write-conversion seems the only place that uses 
> ctext-non-standard-encodings property of current language environment 
> (via ctext-non-standard-encodings-table).

Yes.

> Do I understand it right that 
> somehow all rendering of text to X is done via the "special" compound 
> text (ctext) coding system, while buffer can be in other coding system? 
>      If not, how this ctext coding system is used, and, consequently, 
> ctext-non-standard-encodings property?   (I just try to help getting 
> this thing working.)

Rendering is not relevant to the current problem.  When
Emacs accepts a selection request, it encodes the currently
selected text with the coding system bound to
selection-coding-system.  By default, that is
ctext-with-extensions (an alias of
compound-text-with-extensions).  The `pre-write-conversion'
property of ctext-with-extensions is
ctext-pre-write-conversion, so this function is called
before the actual encoding is done.

By the way, for rendering: on 2003-12-29 I installed the
code I proposed a while ago, which forces *-microsoft-cp1251
fonts to be used for Cyrillic letters of the charset
mule-unicode-0100-24ff in the Bulgarian environment.  Have
you noticed it?

---
Ken'ichi HANDA
handa@m17n.org

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: TUTORIAL.bg and windows-1251
  2003-12-31 15:54                       ` Eli Zaretskii
@ 2004-01-05  4:20                         ` Kenichi Handa
  0 siblings, 0 replies; 32+ messages in thread
From: Kenichi Handa @ 2004-01-05  4:20 UTC (permalink / raw)
  Cc: ogi, emacs-devel

In article <216-Wed31Dec2003175412+0200-eliz@elta.co.il>, "Eli Zaretskii" <eliz@elta.co.il> writes:
>>  Do I understand it right that somehow all rendering of text to X is
>>  done via the "special" compound text (ctext) coding system [...]?

> I don't think so.  See xfns.c:x_encode_text for the gory details.

x_encode_text is no longer used for selections.  A while ago
I installed code that does the conversion for selections in
Lisp.  x-select-convert-to-string in select.el and
x-cut-buffer-or-selection-value in term/x-win.el are the
relevant functions.

x_encode_text is currently used only for setting the frame
title and icon title.

---
Ken'ichi HANDA
handa@m17n.org

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: TUTORIAL.bg and windows-1251
  2004-01-05  4:14                       ` Kenichi Handa
@ 2004-01-06 12:03                         ` YAMAMOTO Mitsuharu
  2004-01-07  0:25                           ` Kenichi Handa
  2004-01-07 16:22                         ` Ognyan Kulev
  1 sibling, 1 reply; 32+ messages in thread
From: YAMAMOTO Mitsuharu @ 2004-01-06 12:03 UTC (permalink / raw)


>>>>> On Mon, 5 Jan 2004 13:14:39 +0900 (JST), Kenichi Handa <handa@m17n.org> said:

> By the way, for rendering, I installed the code I proposed a while
> ago which forces *-microsoft-cp1251 fonts to be used for Cyrillic
> letters of the charset mule-unicode-0100-24ff in Bulgarian
> environment on 2003-12-29.  Have you noticed it?

After this change, I'm experiencing slow redisplay of multibyte
characters in Mac OS X (with both Carbon and X).  Strangely, it does
not cause any slowdown for me in Solaris 8.

I tried to partially revert to the previous version, and the original
redisplay speed came back with the following change.

--- fontset.c~	Mon Dec 29 18:10:36 2003
+++ fontset.c	Tue Jan  6 20:48:22 2004
@@ -305,7 +305,7 @@
     elt = FONTSET_REF (FONTSET_BASE (fontset), *c);
   if (NILP (elt))
     elt = lookup_overriding_fontspec (FONTSET_FRAME (fontset), *c);
-  if (NILP (elt) && ! EQ (FONTSET_BASE (fontset), Vdefault_fontset))
+  if (NILP (elt) && ! EQ (fontset, Vdefault_fontset))
     elt = FONTSET_REF (Vdefault_fontset, *c);
   if (NILP (elt))
     return Qnil;

But I'm not sure I'm doing the right thing.

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: TUTORIAL.bg and windows-1251
  2004-01-06 12:03                         ` YAMAMOTO Mitsuharu
@ 2004-01-07  0:25                           ` Kenichi Handa
  2004-01-07  1:32                             ` YAMAMOTO Mitsuharu
  0 siblings, 1 reply; 32+ messages in thread
From: Kenichi Handa @ 2004-01-07  0:25 UTC (permalink / raw)
  Cc: emacs-devel

In article <wl65fp5lnj.wl@church.math.s.chiba-u.ac.jp>, YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp> writes:
>>  By the way, for rendering, I installed the code I proposed a while
>>  ago which forces *-microsoft-cp1251 fonts to be used for Cyrillic
>>  letters of the charset mule-unicode-0100-24ff in Bulgarian
>>  environment on 2003-12-29.  Have you noticed it?

> After this change, I'm experiencing slow redisplay of multibyte
> characters in Mac OS X (with both Carbon and X).  Strangely, it does
> not cause any slowdown for me in Solaris 8.

> I tried to partially revert to the previous version, and the original
> redisplay speed came back with the following change.

> --- fontset.c~	Mon Dec 29 18:10:36 2003
> +++ fontset.c	Tue Jan  6 20:48:22 2004
> @@ -305,7 +305,7 @@
>      elt = FONTSET_REF (FONTSET_BASE (fontset), *c);
>    if (NILP (elt))
>      elt = lookup_overriding_fontspec (FONTSET_FRAME (fontset), *c);
> -  if (NILP (elt) && ! EQ (FONTSET_BASE (fontset), Vdefault_fontset))
> +  if (NILP (elt) && ! EQ (fontset, Vdefault_fontset))
>      elt = FONTSET_REF (Vdefault_fontset, *c);
>    if (NILP (elt))
>      return Qnil;

Thank you for the report.  I found what is wrong.  The above
change is not good.  I've just committed the patch below.
Please try it.

*** fontset.c.~1.82.~	Mon Dec 29 14:12:04 2003
--- fontset.c	Wed Jan  7 09:11:52 2004
***************
*** 305,311 ****
      elt = FONTSET_REF (FONTSET_BASE (fontset), *c);
    if (NILP (elt))
      elt = lookup_overriding_fontspec (FONTSET_FRAME (fontset), *c);
!   if (NILP (elt) && ! EQ (FONTSET_BASE (fontset), Vdefault_fontset))
      elt = FONTSET_REF (Vdefault_fontset, *c);
    if (NILP (elt))
      return Qnil;
--- 305,311 ----
      elt = FONTSET_REF (FONTSET_BASE (fontset), *c);
    if (NILP (elt))
      elt = lookup_overriding_fontspec (FONTSET_FRAME (fontset), *c);
!   if (NILP (elt))
      elt = FONTSET_REF (Vdefault_fontset, *c);
    if (NILP (elt))
      return Qnil;

---
Ken'ichi HANDA
handa@m17n.org

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: TUTORIAL.bg and windows-1251
  2004-01-07  0:25                           ` Kenichi Handa
@ 2004-01-07  1:32                             ` YAMAMOTO Mitsuharu
  0 siblings, 0 replies; 32+ messages in thread
From: YAMAMOTO Mitsuharu @ 2004-01-07  1:32 UTC (permalink / raw)


>>>>> On Wed, 7 Jan 2004 09:25:43 +0900 (JST), Kenichi Handa <handa@m17n.org> said:

>> After this change, I'm experiencing slow redisplay of multibyte
>> characters in Mac OS X (with both Carbon and X).  Strangely, it
>> does not cause any slowdown for me in Solaris 8.

>> I tried to partially revert to the previous version, and the
>> original redisplay speed came back with the following change.

> Thank you for the report.  I found what is wrong.  The above change
> is not good.  I've just committed the patch below.  Please try it.

It works fine.  Thanks.

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: TUTORIAL.bg and windows-1251
  2004-01-05  4:14                       ` Kenichi Handa
  2004-01-06 12:03                         ` YAMAMOTO Mitsuharu
@ 2004-01-07 16:22                         ` Ognyan Kulev
  2004-01-07 23:58                           ` Kenichi Handa
  1 sibling, 1 reply; 32+ messages in thread
From: Ognyan Kulev @ 2004-01-07 16:22 UTC (permalink / raw)
  Cc: emacs-devel

Kenichi Handa wrote:
> (let ((lang-env current-language-environment)
>       (mirror-R (string (decode-char 'ucs #x42f)))
>       (hex-print #'(lambda (head str) 
> 		     (insert head)
> 		     (dotimes (i (length str))
> 		       (let ((ch (aref str i)))
> 			 (if (< ch 128)
> 			     (insert ch)
> 			   (insert (format "\\x%X" (aref str i))))))
> 		     (insert "\n")))
>       encoded decoded)
>   (funcall hex-print "original:" mirror-R)
>   (set-language-environment "Bulgarian")
>   (setq encoded (encode-coding-string mirror-R 'ctext-with-extensions))
>   (funcall hex-print "encoded: " encoded)
>   (setq decoded (decode-coding-string encoded 'ctext-with-extensions))
>   (funcall hex-print "decoded: " decoded)
>   (set-language-environment "English")
>   (setq encoded (encode-coding-string mirror-R 'ctext-with-extensions))
>   (funcall hex-print "encoded: " encoded)
>   (setq decoded (decode-coding-string encoded 'ctext-with-extensions))
>   (funcall hex-print "decoded: " decoded)
>   (set-language-environment lang-env))
> 
> The result I got is this.
> 
> original:\x5144F
> encoded: ^[%/1\x80\x92microsoft-cp1251\x02\xDF
> decoded: \x5144F
> encoded: ^[%G\xD0\xAF^[%@
> decoded: \x5144F
> 
> It seems that the coding system ctext-with-extensions is
> working as expected here.

I get the same here.
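
The second `encoded:' line in the quoted output (the English
environment) is the UTF-8 form: ESC % G switches to UTF-8, \xD0\xAF is
U+042F in UTF-8, and ESC % @ switches back.  A minimal C sketch of those
octets follows; the function names are hypothetical and only two-octet
UTF-8 sequences (U+0080..U+07FF) are handled.

```c
#include <stddef.h>

/* Encode UCS (U+0080..U+07FF) as two UTF-8 octets.
   Returns 2 on success, 0 if UCS is outside that range. */
size_t utf8_encode_2byte(unsigned ucs, unsigned char *out)
{
    if (ucs < 0x80 || ucs > 0x7FF)
        return 0;
    out[0] = (unsigned char)(0xC0 | (ucs >> 6));    /* 110xxxxx */
    out[1] = (unsigned char)(0x80 | (ucs & 0x3F));  /* 10xxxxxx */
    return 2;
}

/* Write ESC % G <UTF-8 octets> ESC % @ for one character into OUT.
   Returns the number of octets written, or 0 on failure. */
size_t ctext_utf8_segment(unsigned ucs, unsigned char *out, size_t out_size)
{
    size_t i = 0, n;

    if (out_size < 8)
        return 0;
    out[i++] = 0x1B; out[i++] = '%'; out[i++] = 'G';  /* switch to UTF-8 */
    n = utf8_encode_2byte(ucs, out + i);
    if (n == 0)
        return 0;
    i += n;
    out[i++] = 0x1B; out[i++] = '%'; out[i++] = '@';  /* back to ISO 2022 */
    return i;
}
```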

> By the way, for rendering, I installed the code I proposed a
> while ago which forces *-microsoft-cp1251 fonts to be used
> for Cyrillic letters of the charset mule-unicode-0100-24ff
> in Bulgarian environment on 2003-12-29.  Have you noticed
> it?

Wait!  My report is exactly about that change not working.  What you 
sent[1] in pure elisp works as expected though.

[1] http://mail.gnu.org/archive/html/emacs-devel/2003-11/msg00452.html

When I eval the following elisp (written by you) in the current
(2004-01-07) Emacs, the microsoft-cp1251 font is used for Cyrillic
characters.  But just setting the language environment to Bulgarian
doesn't work, and iso10646-1 is used.

(defun use-microsoft-cp1251-font ()
   (let ((fontspec '(nil . "microsoft-cp1251")))
     (map-char-table
      #'(lambda (k v)
	 (if (and v (> k 128))
	     (set-fontset-font "fontset-default" k fontspec)))
      (get 'encode-windows-1251 'translation-table))))

Regards
-- 
Ognyan Kulev <ogi@{fmi.uni-sofia.bg,fsa-bg.org,jabber.org}>
7D9F 66E6 68B7 A62B 0FCF  EB04 80BF 3A8C A252 9782

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: TUTORIAL.bg and windows-1251
  2004-01-07 16:22                         ` Ognyan Kulev
@ 2004-01-07 23:58                           ` Kenichi Handa
  2004-01-09 16:10                             ` Ognyan Kulev
  0 siblings, 1 reply; 32+ messages in thread
From: Kenichi Handa @ 2004-01-07 23:58 UTC (permalink / raw)
  Cc: emacs-devel

In article <3FFC3249.3010501@fmi.uni-sofia.bg>, Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:
>>  By the way, for rendering, I installed the code I proposed a
>>  while ago which forces *-microsoft-cp1251 fonts to be used
>>  for Cyrillic letters of the charset mule-unicode-0100-24ff
>>  in Bulgarian environment on 2003-12-29.  Have you noticed
>>  it?

> Wait!  My report is exactly about that change not working.  What you 
> sent[1] in pure elisp works as expected though.

> [1] http://mail.gnu.org/archive/html/emacs-devel/2003-11/msg00452.html

???  I thought that your bug report was about CTEXT
encoding in copy&paste, not about fonts, because you wrote
that something didn't work before I installed the change for
font selection.

So, do you mean that copy&paste has no problem now?

> When in current (2004-01-07) emacs I eval the following elisp (written 
> by you), microsoft-cp1251 font is used for cyrillic characters.  But 
> just setting language environment to bulgarian doesn't work and 
> iso10646-1 is used.

> (defun use-microsoft-cp1251-font ()
>    (let ((fontspec '(nil . "microsoft-cp1251")))
>      (map-char-table
>       #'(lambda (k v)
> 	 (if (and v (> k 128))
> 	     (set-fontset-font "fontset-default" k fontspec)))
>       (get 'encode-windows-1251 'translation-table))))

Please tell me what this returns:

(get-language-info "Bulgarian" 'overriding-fontspec)

and the effect of this.

(set-overriding-fontspec-internal
 (get-language-info "Bulgarian" 'overriding-fontspec))

---
Ken'ichi HANDA
handa@m17n.org

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: TUTORIAL.bg and windows-1251
  2004-01-07 23:58                           ` Kenichi Handa
@ 2004-01-09 16:10                             ` Ognyan Kulev
  2004-01-13  4:07                               ` Kenichi Handa
  0 siblings, 1 reply; 32+ messages in thread
From: Ognyan Kulev @ 2004-01-09 16:10 UTC (permalink / raw)
  Cc: emacs-devel

Kenichi Handa wrote:
> In article <3FFC3249.3010501@fmi.uni-sofia.bg>, Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:
>>[1] http://mail.gnu.org/archive/html/emacs-devel/2003-11/msg00452.html
> 
> ???  I've thought that your bug report was about CTEXT
> encoding in copy&paste, not about font because you wrote
> something doesn't work before I install the change for font
> selection.

I'm sorry for the confusion.  My report is about rendering.

 > So, do you mean that copy&paste has no problem now?

Just copy&paste in both directions works fine.  But how can I be sure 
that the text is encoded with microsoft-cp1251, not with iso10646-1? 
What program would show the difference?

> Please tell me what this returns:
> 
> (get-language-info "Bulgarian" 'overriding-fontspec)

((#^[t nil nil nil nil nil nil nil nil nil nil nil ...] nil . 
"microsoft-cp1251") (#^[t nil nil nil nil nil nil nil nil nil nil nil 
...] nil . "koi8-r"))

Is it normal that there are (almost) only nil:s?

> and the effect of this.
> 
> (set-overriding-fontspec-internal
>  (get-language-info "Bulgarian" 'overriding-fontspec))

I see no effect after evaling it :-(  That is, cyrillic characters are 
still shown in iso10646-1 instead of microsoft-cp1251.  (CVS 2004-01-09 
and "emacs -q".)

Regards
-- 
Ognyan Kulev <ogi@{fmi.uni-sofia.bg,fsa-bg.org,jabber.org}>
7D9F 66E6 68B7 A62B 0FCF  EB04 80BF 3A8C A252 9782

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: TUTORIAL.bg and windows-1251
  2004-01-09 16:10                             ` Ognyan Kulev
@ 2004-01-13  4:07                               ` Kenichi Handa
  2004-01-14 11:42                                 ` Ognyan Kulev
  0 siblings, 1 reply; 32+ messages in thread
From: Kenichi Handa @ 2004-01-13  4:07 UTC (permalink / raw)
  Cc: emacs-devel

In article <3FFED268.3020908@fmi.uni-sofia.bg>, Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:
> Just copy&paste in both directions works fine.  But how can I be sure 
> that the text is encoded with microsoft-cp1251, not with iso10646-1? 
> What program would show the difference?

If (encode-coding-string CYRILLIC_STRING
'ctext-with-extensions) produces a string that contains
"microsoft-cp1251", it means that Emacs is using the
microsoft-cp1251 extended segment in X selections.  And if
copy&paste works fine, the encoding is in the correct
format.

>>  Please tell me what this returns:
>>  
>>  (get-language-info "Bulgarian" 'overriding-fontspec)

> ((#^[t nil nil nil nil nil nil nil nil nil nil nil ...] nil . 
> "microsoft-cp1251") (#^[t nil nil nil nil nil nil nil nil nil nil nil 
> ...] nil . "koi8-r"))

> Is it normal that there are (almost) only nil:s?

Yes.

>>  and the effect of this.
>>  
>>  (set-overriding-fontspec-internal
>>   (get-language-info "Bulgarian" 'overriding-fontspec))

> I see no effect after evaling it :-(  That is, cyrillic characters are 
> still shown in iso10646-1 instead of microsoft-cp1251.  (CVS 2004-01-09 
> and "emacs -q".)

Hmmm strange.  What is the result of this?

(x-resolve-font-name "*-microsoft-cp1251")

---
Ken'ichi HANDA
handa@m17n.org

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: TUTORIAL.bg and windows-1251
  2004-01-13  4:07                               ` Kenichi Handa
@ 2004-01-14 11:42                                 ` Ognyan Kulev
  2004-01-14 12:10                                   ` Kenichi Handa
  0 siblings, 1 reply; 32+ messages in thread
From: Ognyan Kulev @ 2004-01-14 11:42 UTC (permalink / raw)
  Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1028 bytes --]

Kenichi Handa wrote:
> In article <3FFED268.3020908@fmi.uni-sofia.bg>, Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:
> 
>>Just copy&paste in both directions works fine.  But how can I be sure 
>>that the text is encoded with microsoft-cp1251, not with iso10646-1? 
>>What program would show the difference?
> 
> 

> If (encode-coding-string CYRILLIC_STRING
> 'ctext-with-extensions) produces a string that contains
> "microsoft-cp1251", it means that Emacs is using
> microsoft-cp1251 extended segment in X selection.  And if
> copy&paste works fine, that means that the encoding is in a
> correct format.

I tested with encode-coding-string and all is OK.

> Hmmm strange.  What is the result of this?
> 
> (x-resolve-font-name "*-microsoft-cp1251")

We could exchange mails this way for another month, so I decided to dive
into the Source ;-)  It seems override-fontspec doesn't have enough
"priority".  To see what I mean, take a look at the attached patch.
After applying it, microsoft-cp1251 is used in rendering.

Regards,
ogi

[-- Attachment #2: fontset.diff --]
[-- Type: text/x-patch, Size: 1920 bytes --]

Index: fontset.c
===================================================================
RCS file: /cvsroot/emacs/emacs/src/fontset.c,v
retrieving revision 1.83
diff -u -p -r1.83 fontset.c
--- fontset.c	7 Jan 2004 00:21:53 -0000	1.83
+++ fontset.c	14 Jan 2004 11:36:24 -0000
@@ -254,6 +254,9 @@ lookup_overriding_fontspec (frame, c)
 {
   Lisp_Object tail;
 
+  printf ("lookup_overriding_fontspec %#x\n", c);
+  fflush (stdout);
+
   for (tail = Voverriding_fontspec_alist; CONSP (tail); tail = XCDR (tail))
     {
       Lisp_Object val, target, elt;
@@ -300,11 +303,17 @@ fontset_ref_via_base (fontset, c)
   if (SINGLE_BYTE_CHAR_P (*c))
     return FONTSET_ASCII (fontset);
 
+#if 0
   elt = Qnil;
   if (! EQ (FONTSET_BASE (fontset), Vdefault_fontset))
     elt = FONTSET_REF (FONTSET_BASE (fontset), *c);
   if (NILP (elt))
     elt = lookup_overriding_fontspec (FONTSET_FRAME (fontset), *c);
+#else
+  elt = lookup_overriding_fontspec (FONTSET_FRAME (fontset), *c);
+  if (NILP (elt) && ! EQ (FONTSET_BASE (fontset), Vdefault_fontset))
+    elt = FONTSET_REF (FONTSET_BASE (fontset), *c);
+#endif
   if (NILP (elt))
     elt = FONTSET_REF (Vdefault_fontset, *c);
   if (NILP (elt))
@@ -592,6 +601,7 @@ fontset_font_pattern (f, id, c)
   Lisp_Object fontset, elt;
   struct font_info *fontp;
 
+#if 0
   elt = Qnil;
   if (fontset_id_valid_p (id))
     {
@@ -607,6 +617,21 @@ fontset_font_pattern (f, id, c)
       XSETFRAME (frame, f);
       elt = lookup_overriding_fontspec (frame, c);
     }
+#else
+  {
+    Lisp_Object frame;
+    
+    XSETFRAME (frame, f);
+    elt = lookup_overriding_fontspec (frame, c);
+  }
+  if (NILP (elt) && fontset_id_valid_p (id))
+    {
+      fontset = FONTSET_FROM_ID (id);
+      xassert (!BASE_FONTSET_P (fontset));
+      fontset = FONTSET_BASE (fontset);
+      elt = FONTSET_REF (fontset, c);
+    }
+#endif
   if (NILP (elt))
     elt = FONTSET_REF (Vdefault_fontset, c);
 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: TUTORIAL.bg and windows-1251
  2004-01-14 11:42                                 ` Ognyan Kulev
@ 2004-01-14 12:10                                   ` Kenichi Handa
  2004-01-17 19:31                                     ` Ognyan Kulev
  0 siblings, 1 reply; 32+ messages in thread
From: Kenichi Handa @ 2004-01-14 12:10 UTC (permalink / raw)
  Cc: emacs-devel

In article <40052B24.2030803@fmi.uni-sofia.bg>, Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:
> We can exchange mails this way for another month, so I decided to dive 
> into the Source ;-)

Thank you for that!  But...

> It seems override-fontspec hasn't enough 
> "priority".  To see what I mean, take a look at the attached patch. 
> After applying it,  microsoft-cp1251 is used in rendering.

That's still strange.  You wrote that microsoft-cp1251 font
is used when you modify the default fontset by this code:

(defun use-microsoft-cp1251-font ()
   (let ((fontspec '(nil . "microsoft-cp1251")))
     (map-char-table
      #'(lambda (k v)
	 (if (and v (> k 128))
	     (set-fontset-font "fontset-default" k fontspec)))
      (get 'encode-windows-1251 'translation-table))))

That means that you are using the default fontset, so
giving override-fontspec higher priority than the default
fontset should be enough (as the current Emacs code does).

And if a user is using a fontset (other than the default
fontset) that specifies some font for Cyrillic characters, I
think override-fontspec should not have higher priority
than that fontset.

---
Ken'ichi HANDA
handa@m17n.org

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: TUTORIAL.bg and windows-1251
  2004-01-14 12:10                                   ` Kenichi Handa
@ 2004-01-17 19:31                                     ` Ognyan Kulev
  2004-01-19  0:34                                       ` Kenichi Handa
  0 siblings, 1 reply; 32+ messages in thread
From: Ognyan Kulev @ 2004-01-17 19:31 UTC (permalink / raw)
  Cc: emacs-devel

Kenichi Handa wrote:
> In article <40052B24.2030803@fmi.uni-sofia.bg>, Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:
> That's still strange.  You wrote that microsoft-cp1251 font
> is used when you modify the default fontset by this code:
> 
> (defun use-microsoft-cp1251-font ()
>    (let ((fontspec '(nil . "microsoft-cp1251")))
>      (map-char-table
>       #'(lambda (k v)
> 	 (if (and v (> k 128))
> 	     (set-fontset-font "fontset-default" k fontspec)))
>       (get 'encode-windows-1251 'translation-table))))
> 
> That means that you are using the default fontset, thus
> making override-fontspec have the higher priority than the
> default fontset should be enough (as the current Emacs
> code).
> 
> And, if a user is using a fontset (other than the default
> fontset) that specifies some font for Cyrillic characters, I
> think override-fontspec should not have the higher priority
> than that fontset.

I agree with that.

The patch I sent you raises the priority in two functions.  Actually,
only raising it in fontset_font_pattern has an effect.  So when Emacs is
run with "emacs -q" and a Cyrillic character is being displayed, is it
expected that "FONTSET_REF (fontset, c)" fails in fontset_font_pattern,
or that "fontset_id_valid_p (id)" is false?  One of these must fail
somehow for lookup_overriding_fontspec to be used.

Regards,
ogi

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: TUTORIAL.bg and windows-1251
  2004-01-17 19:31                                     ` Ognyan Kulev
@ 2004-01-19  0:34                                       ` Kenichi Handa
  2004-01-21  6:45                                         ` Ognyan Kulev
  0 siblings, 1 reply; 32+ messages in thread
From: Kenichi Handa @ 2004-01-19  0:34 UTC (permalink / raw)
  Cc: emacs-devel

In article <40098D96.90205@fmi.uni-sofia.bg>, Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:
> The patch I've send you raises priority in two functions.  Actually, 
> only raising it in fontset_font_pattern gives effect.  So when run with 
> "emacs -q" and cyrillic character is being displayed, is it expected 
> that "FONTSET (fontset, c)" fails in function fontset_font_pattern, or 
> "fontset_id_valid_p (id)" is false?  One of these must fail somewhat in 
> order lookup_overriding_fontspec to be used.

Thank you!!  That question made me realize what was wrong
in my previous code.  I have installed a fix, so please try
again with the latest CVS code.

In my environment, the X resource "Font" is set.  Thus, Emacs
creates a fontset "fontset-startup" from that font, and it
doesn't specify any font for Cyrillic, so FONTSET_REF
(fontset, c) returns nil in my case.  But if the X resource
"Font" is not set (I think that's your case), Emacs uses the
default fontset, so FONTSET_REF (fontset, c) returns the
ISO10646-1 font for Cyrillic.  The fix is shown below.
It makes the font-finding logic the same as in
fontset_ref_via_base.
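
The lookup order being described can be modelled in a few lines of C.
This is a toy sketch, not the fontset.c code: `toy_fontset', `toy_ref'
and `toy_font_pattern' are hypothetical names, and a fontset is reduced
to a single registry string for the Cyrillic block.  The order is: a
base fontset other than the default one, then the overriding fontspec,
then the default fontset.

```c
#include <stddef.h>

/* Hypothetical stand-in for a fontset: one registry name for the
   Cyrillic block, or NULL when the fontset specifies no font there. */
struct toy_fontset {
    const char *cyrillic_registry;
};

/* Toy counterpart of FONTSET_REF: look up character C in FS. */
static const char *toy_ref(const struct toy_fontset *fs, int c)
{
    if (fs == NULL)
        return NULL;
    return (c >= 0x400 && c <= 0x4FF) ? fs->cyrillic_registry : NULL;
}

/* Resolve the font registry for C.  The base fontset is consulted
   first only when it is not the default fontset; otherwise the
   overriding fontspec gets its chance before the default fontset. */
const char *toy_font_pattern(const struct toy_fontset *base,
                             const struct toy_fontset *overriding,
                             const struct toy_fontset *default_fontset,
                             int c)
{
    const char *elt = NULL;

    if (base != default_fontset)
        elt = toy_ref(base, c);
    if (elt == NULL)
        elt = toy_ref(overriding, c);
    if (elt == NULL)
        elt = toy_ref(default_fontset, c);
    return elt;
}
```

With the base equal to the default fontset (the "emacs -q" case with no
"Font" resource), the overriding fontspec wins over the default
fontset's ISO10646-1 entry.  A user fontset that names its own Cyrillic
font still takes precedence over the overriding fontspec.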

---
Ken'ichi HANDA
handa@m17n.org

Index: fontset.c
===================================================================
RCS file: /cvsroot/emacs/emacs/src/fontset.c,v
retrieving revision 1.83
retrieving revision 1.84
diff -u -c -r1.83 -r1.84
cvs server: conflicting specifications of output style
*** fontset.c	7 Jan 2004 00:21:53 -0000	1.83
--- fontset.c	19 Jan 2004 00:22:03 -0000	1.84
***************
*** 598,604 ****
        fontset = FONTSET_FROM_ID (id);
        xassert (!BASE_FONTSET_P (fontset));
        fontset = FONTSET_BASE (fontset);
!       elt = FONTSET_REF (fontset, c);
      }
    if (NILP (elt))
      {
--- 598,605 ----
        fontset = FONTSET_FROM_ID (id);
        xassert (!BASE_FONTSET_P (fontset));
        fontset = FONTSET_BASE (fontset);
!       if (! EQ (fontset, Vdefault_fontset))
! 	elt = FONTSET_REF (fontset, c);
      }
    if (NILP (elt))
      {

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: TUTORIAL.bg and windows-1251
  2004-01-19  0:34                                       ` Kenichi Handa
@ 2004-01-21  6:45                                         ` Ognyan Kulev
  2004-01-21 10:52                                           ` Kenichi Handa
  0 siblings, 1 reply; 32+ messages in thread
From: Ognyan Kulev @ 2004-01-21  6:45 UTC (permalink / raw)
  Cc: emacs-devel

Kenichi Handa wrote:
> Thank you!!  That question makes me realize what was wrong
> in my previous code.  As I installed a fix, please try again
> with the latest CVS code.

Now it's OK :-)

Regards,
ogi

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: TUTORIAL.bg and windows-1251
  2004-01-21  6:45                                         ` Ognyan Kulev
@ 2004-01-21 10:52                                           ` Kenichi Handa
  0 siblings, 0 replies; 32+ messages in thread
From: Kenichi Handa @ 2004-01-21 10:52 UTC (permalink / raw)
  Cc: emacs-devel

In article <400E2021.8070800@fmi.uni-sofia.bg>, Ognyan Kulev <ogi@fmi.uni-sofia.bg> writes:

> Kenichi Handa wrote:
>>  Thank you!!  That question makes me realize what was wrong
>>  in my previous code.  As I installed a fix, please try again
>>  with the latest CVS code.

> Now it's OK :-)

That's good!  Thank you for confirming that.

---
Ken'ichi HANDA
handa@m17n.org

end of thread

Thread overview: 32+ messages
2003-11-14 18:56 TUTORIAL.bg and windows-1251 Ognyan Kulev
2003-11-15 12:19 ` Ognyan Kulev
2003-11-26  7:33   ` Ognyan Kulev
2003-11-15 14:24 ` Jason Rumney
2003-11-17  7:21 ` Kenichi Handa
2003-11-18 15:49   ` Ognyan Kulev
2003-11-24 23:55     ` Kenichi Handa
2003-11-26  7:16       ` Ognyan Kulev
2003-11-26  7:47         ` Kenichi Handa
2003-11-26  8:30           ` Ognyan Kulev
2003-11-26 13:17             ` Kenichi Handa
2003-11-26 14:08               ` Ognyan Kulev
2003-12-03  8:34               ` Kenichi Handa
2003-12-04 16:28                 ` Ognyan Kulev
2003-12-04 23:28                   ` Kenichi Handa
2003-12-31 15:06                     ` Ognyan Kulev
2003-12-31 15:54                       ` Eli Zaretskii
2004-01-05  4:20                         ` Kenichi Handa
2004-01-05  4:14                       ` Kenichi Handa
2004-01-06 12:03                         ` YAMAMOTO Mitsuharu
2004-01-07  0:25                           ` Kenichi Handa
2004-01-07  1:32                             ` YAMAMOTO Mitsuharu
2004-01-07 16:22                         ` Ognyan Kulev
2004-01-07 23:58                           ` Kenichi Handa
2004-01-09 16:10                             ` Ognyan Kulev
2004-01-13  4:07                               ` Kenichi Handa
2004-01-14 11:42                                 ` Ognyan Kulev
2004-01-14 12:10                                   ` Kenichi Handa
2004-01-17 19:31                                     ` Ognyan Kulev
2004-01-19  0:34                                       ` Kenichi Handa
2004-01-21  6:45                                         ` Ognyan Kulev
2004-01-21 10:52                                           ` Kenichi Handa
