From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Kenichi Handa <handa@m17n.org>
Newsgroups: gmane.emacs.devel
Subject: Re: Probably dumb question: glyph rendering on unicode-2 branch
Date: Tue, 25 Oct 2005 10:33:01 +0900
Message-ID: <E1EUDgT-0003A2-00@etlken>
References: <09B15CC4-37F2-4B0F-8487-2037B482D1CC@cogsci.ucsd.edu>
	<BD5A24D1-F3BD-4AC9-8762-8E4917C83D2E@cogsci.ucsd.edu>
NNTP-Posting-Host: main.gmane.org
Mime-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya")
Content-Type: text/plain; charset=US-ASCII
X-Trace: sea.gmane.org 1130204091 913 80.91.229.2 (25 Oct 2005 01:34:51 GMT)
X-Complaints-To: usenet@sea.gmane.org
NNTP-Posting-Date: Tue, 25 Oct 2005 01:34:51 +0000 (UTC)
Cc: emacs-devel@gnu.org
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Tue Oct 25 03:34:50 2005
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Original-Received: from lists.gnu.org ([199.232.76.165])
	by ciao.gmane.org with esmtp (Exim 4.43)
	id 1EUDi1-0003FU-FM
	for ged-emacs-devel@m.gmane.org; Tue, 25 Oct 2005 03:34:38 +0200
Original-Received: from localhost ([127.0.0.1] helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1EUDi0-0006Vt-T0
	for ged-emacs-devel@m.gmane.org; Mon, 24 Oct 2005 21:34:36 -0400
Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1EUDgp-00067Z-Pl
	for emacs-devel@gnu.org; Mon, 24 Oct 2005 21:33:24 -0400
Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1EUDgm-00064l-Tl
	for emacs-devel@gnu.org; Mon, 24 Oct 2005 21:33:22 -0400
Original-Received: from [199.232.76.173] (helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1EUDgm-00064g-M7
	for emacs-devel@gnu.org; Mon, 24 Oct 2005 21:33:20 -0400
Original-Received: from [192.47.44.130] (helo=tsukuba.m17n.org)
	by monty-python.gnu.org with esmtp
	(TLS-1.0:DHE_RSA_3DES_EDE_CBC_SHA:24) (Exim 4.34) id 1EUDgm-0006gz-Dv
	for emacs-devel@gnu.org; Mon, 24 Oct 2005 21:33:20 -0400
Original-Received: from nfs.m17n.org (nfs.m17n.org [192.47.44.7])
	by tsukuba.m17n.org (8.13.4/8.13.4/Debian-3) with ESMTP id
	j9P1X2cU014712; Tue, 25 Oct 2005 10:33:02 +0900
Original-Received: from etlken (etlken.m17n.org [192.47.44.125])
	by nfs.m17n.org (8.13.4/8.13.4/Debian-3) with ESMTP id j9P1X2Eh011070; 
	Tue, 25 Oct 2005 10:33:02 +0900
Original-Received: from handa by etlken with local (Exim 3.36 #1 (Debian))
	id 1EUDgT-0003A2-00; Tue, 25 Oct 2005 10:33:01 +0900
Original-To: Adrian Robert <arobert@cogsci.ucsd.edu>
In-reply-to: <BD5A24D1-F3BD-4AC9-8762-8E4917C83D2E@cogsci.ucsd.edu> (message
	from Adrian Robert on Mon, 24 Oct 2005 10:43:04 -0400)
User-Agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2
	Emacs/22.0.50 (i686-pc-linux-gnu) MULE/5.0 (SAKAKI)
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:44785
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/44785>

In article <BD5A24D1-F3BD-4AC9-8762-8E4917C83D2E@cogsci.ucsd.edu>, Adrian Robert <arobert@cogsci.ucsd.edu> writes:
> I didn't get any response to the below, let me try asking it in a  
> different way:

Sorry for not responding on this matter.  It seems that I
missed your original mail.

> unicode-2 branch:
>    dispextern.h:

>      struct glyph {
>      ...
>          /* Character code for character glyphs (type == CHAR_GLYPH).  */
>          unsigned ch;
>      ...
>      }
>      ...
>      struct glyph_string {
>      ...
>      /* Characters to be drawn, and number of characters.  */
>      XChar2b *char2b;
>      int nchars;
>      ...
>      }

>    {x,mac,w32}term.c:

>      x_encode_char(int c, XChar2b *char2b, ...)
>      {
>      ...
>      }

>      x_draw_glyph_string(struct glyph_string *s)
>      {
>      ...
>      }

> Questions:

> 1) Is 'int c' passed to x_encode_char() the same as 'unsigned ch' in
> struct glyph?

Mostly yes.  The exception is when x_encode_char is called
on an element of a composition glyph.  In that case,
x_encode_char is called from get_char_face_and_encoding,
which the BUILD_COMPOSITE_GLYPH_STRING macro invokes on
each element of the composition glyph.

> 2) In either case, what are they -- UCS-2?  UTF-16?  MULE?  UCS-4?   
> UTF-32?  What is the byte ordering?

It is a character code used in Emacs.  The value range is
0x0..0x3FFFFF.  Among them, 0x0..0x10FFFF are exactly the
same as Unicode characters.  I don't think it is meaningful
to ask about the "byte ordering" of an (int).  That depends
on your hardware architecture.
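
To make the two ranges concrete, here is a minimal sketch.
The macro and function names are hypothetical, chosen for
this illustration only; they are not the definitions in
src/character.h.  Only the numeric limits come from the
ranges given above.

```c
#include <stdbool.h>

/* Limits taken from the ranges quoted above.  The names are
   hypothetical and exist only for this sketch.  */
#define SKETCH_MAX_UNICODE_CHAR 0x10FFFF
#define SKETCH_MAX_CHAR         0x3FFFFF

/* Return true if C lies in the character code range 0x0..0x3FFFFF.  */
static bool
sketch_char_valid_p (int c)
{
  return c >= 0 && c <= SKETCH_MAX_CHAR;
}

/* Return true if C lies in the part of the range that coincides
   with Unicode, 0x0..0x10FFFF.  */
static bool
sketch_char_unicode_p (int c)
{
  return c >= 0 && c <= SKETCH_MAX_UNICODE_CHAR;
}
```

Both functions compare plain integer values, so byte order
never enters the picture.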

> I'll be happy to RTFM if this is documented anywhere..

The file src/character.h contains some documentation about
character codes.

>>  I apologize if this is a dumb question, but I've been looking  
>>  through the code and can't figure this one out: on the unicode-2  
>>  branch, if a font specifies "iso-10646-1" for XLFD registry/ 
>>  encoding (and then fontset.c sets 'charset' accordingly), what  
>>  exactly is getting passed in struct glyph_string.char2b to  
>>  x_draw_glyph_string()?

If a font has CHARSET_REGISTRY "iso10646" and
CHARSET_ENCODING "1", the font contains only BMP characters.
Emacs-unicode uses such a font only for BMP characters.
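
As a rough illustration of why two bytes are enough for
such a font, the sketch below splits a BMP character code
into a high and a low byte.  It is not the code of
x_encode_char.  The type sketch_char2b is a stand-in that
mirrors the byte1/byte2 layout of Xlib's XChar2b so the
example compiles without X11 headers, and the function name
is hypothetical.

```c
#include <stdbool.h>

/* Stand-in for Xlib's XChar2b (two unsigned bytes), defined here
   so that the sketch does not need X11 headers.  */
typedef struct
{
  unsigned char byte1;   /* high-order byte */
  unsigned char byte2;   /* low-order byte */
} sketch_char2b;

/* If C is in the BMP range 0x0..0xFFFF, store its two bytes in *OUT
   and return true.  Otherwise leave *OUT untouched and return false.
   Hypothetical helper for illustration only.  */
static bool
sketch_encode_bmp (int c, sketch_char2b *out)
{
  if (c < 0 || c > 0xFFFF)
    return false;
  out->byte1 = (unsigned char) (c >> 8);
  out->byte2 = (unsigned char) (c & 0xFF);
  return true;
}
```

Because the high and low bytes are stored in named fields,
the result does not depend on the byte order of the host.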


>>  Not UTF-8, since it's just 2 bytes.   
>>  UCS-2?  UTF-16?  Don't these exclude a lot of unicode characters?   

Yes.  But, as far as I know, there's no consensus about what
to specify in the CHARSET_REGISTRY and CHARSET_ENCODING
fields of a font supporting SMP or SIP.

>>  Does emacs provide any internal facility to get UTF-8?

Do you mean a way to convert a character code to a UTF-8
byte sequence at the C level?  If so, you can use the macro
CHAR_STRING (defined in character.h), because
Emacs-unicode's internal string/buffer representation is a
UTF-8 byte sequence.
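
For reference, this is roughly what such a conversion
involves for the Unicode part of the range.  The function
below is a standalone sketch with a hypothetical name, not
the CHAR_STRING macro itself.  It assumes standard UTF-8 and
handles only 0x0..0x10FFFF; character codes above that
range are deliberately not covered.

```c
#include <stddef.h>

/* Write the standard UTF-8 form of C to BUF, which must have room
   for at least 4 bytes, and return the number of bytes written.
   Return 0 if C is outside 0x0..0x10FFFF.  Hypothetical helper for
   illustration; it does not handle codes above the Unicode range.  */
static size_t
sketch_utf8_encode (int c, unsigned char *buf)
{
  if (c < 0 || c > 0x10FFFF)
    return 0;
  if (c < 0x80)
    {
      buf[0] = (unsigned char) c;
      return 1;
    }
  if (c < 0x800)
    {
      buf[0] = (unsigned char) (0xC0 | (c >> 6));
      buf[1] = (unsigned char) (0x80 | (c & 0x3F));
      return 2;
    }
  if (c < 0x10000)
    {
      buf[0] = (unsigned char) (0xE0 | (c >> 12));
      buf[1] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
      buf[2] = (unsigned char) (0x80 | (c & 0x3F));
      return 3;
    }
  buf[0] = (unsigned char) (0xF0 | (c >> 18));
  buf[1] = (unsigned char) (0x80 | ((c >> 12) & 0x3F));
  buf[2] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
  buf[3] = (unsigned char) (0x80 | (c & 0x3F));
  return 4;
}
```

For example, the code 0x3042 yields the three bytes
0xE3 0x81 0x82.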

---
Kenichi Handa
handa@m17n.org