From: Peter Dyballa
Newsgroups: gmane.emacs.devel,gmane.emacs.pretest.bugs
Subject: 23.0.60; describe-char gives wrong information
Date: Mon, 31 Dec 2007 14:16:04 +0100
To: emacs-pretest-bug@gnu.org

Hello!

When I ask for information about Ὀ (i.e.
a capital Omicron and a psili), which may not be correctly "composed" since it comes from a XeTeX document, GNU Emacs 23.0.60 tells me:

          character: Ο (927, #o1637, #x39f)
  preferred charset: gb18030 (GB18030)
         code point: 0xA6AF
             syntax: w    which means: word
           category: G:Greek characters of 2-byte character sets c:Chinese g:Greek h:Korean j:Japanese
        buffer code: #xCE #x9F
          file code: #xCE #x9F (encoded by coding system utf-8-unix)
            display: composed to form "Ὀ" (see below)

  Composed with the following character(s) "̓" by the rule:
      (?Ο (tc . bc) ?̓)
  The component character(s) are displayed by these fonts (glyph codes):
   Ο: -Misc-Fixed-Medium-R-Normal--13-120-75-75-C-80-ISO8859-7 (#xCF)
   ̓: -monotype-arial unicode ms-medium-r-normal--13-127-74-74-p-129-gb18030.2000-0 (#xBE35)
  See the variable `reference-point-alist' for the meaning of the rule.

  Character code properties are not shown: customize what to show

  There are text properties here:
    auto-composed        t
    composition          [Show]
    fontified            t

Character U+039F can hardly belong to a Chinese encoding. It's a Greek character, taken from an ISO 8859-7 font. Its psili modifier, COMBINING COMMA ABOVE, is at U+0313, which is outside any Chinese encoding, too (although GB18030-2000 defines the two as 0xA6AF and 0x8130BE35, respectively). Isn't Unicode, as in the name "Unicode Emacs," more appropriate? The "code point" data shown above is obviously the GB18030 representation of GREEK CAPITAL LETTER OMICRON. The buffer and file code of #xCE #x9F stands for GREEK CAPITAL LETTER OMICRON at U+039F in UTF-8.

And then there is no sense in using a non-existent character from an inappropriate font when the default font, Lucida Sans Typewriter, has this character, COMBINING COMMA ABOVE. This font also has GREEK CAPITAL LETTER OMICRON at U+039F.

GNU Emacs 23.0.60 handles Ὀ similarly (i.e.
one letter, Omicron with psili):

          character: Ὀ (8008, #o17510, #x1f48)
  preferred charset: gb18030 (GB18030)
         code point: 0x81369132
             syntax: w    which means: word
           category: g:Greek
        buffer code: #xE1 #xBD #x88
          file code: #xE1 #xBD #x88 (encoded by coding system utf-8-unix)
            display: by this font (glyph code)
       -monotype-arial unicode ms-medium-r-normal--10-98-74-74-p-99-gb18030.2000-0 (#x9132)

  Character code properties: customize what to show
    name: GREEK CAPITAL LETTER OMICRON WITH PSILI
    general-category: Lu (Letter, Uppercase)
    decomposition: (927 787) ('Ο' '̓')

  There are text properties here:
    auto-composed        t
    fontified            t

And although it claims to take GREEK CAPITAL LETTER OMICRON WITH PSILI at U+1F48 from Arial Unicode MS, which has this glyph, it uses an open box to display it. Is that because U+1F48 is not defined in GB18030? The byte sequence (code point) 0x81369132 is not defined in GB18030-2000.

In GNU Emacs 23.0.60.1 (powerpc-apple-darwin8.11.0, X toolkit, Xaw3d scroll bars) of 2007-12-30 on Latsche.local
Windowing system distributor `The XFree86 Project, Inc', version 11.0.40400000
configured using `configure '--with-x-toolkit=lucid' '--without-gtk' '--with-dbus' '--without-sound' '--without-pop' '--with-xpm' '--with-jpeg' '--with-tiff' '--with-gif' '--with-png' '--enable-locallisppath=/Library/Application Support/Emacs/calendar22:/Library/Application Support/Emacs/caml:/Library/Application Support/Emacs:/sw/share/emacs21/site-lisp/elib' 'PKG_CONFIG_PATH=/sw/lib/freetype219/lib/pkgconfig:/sw/lib/fontconfig2/lib/pkgconfig:/sw/lib/pkgconfig:/sw/lib/system-openssl/lib/pkgconfig:/sw/share/pkgconfig:/usr/lib/pkgconfig:/usr/local/lib/pkgconfig:/usr/local/clamXav/lib/pkgconfig:/usr/local/lib/pkgconfig' 'CPPFLAGS=-no-cpp-precomp -D__BIND_NOSTATIC -I/usr/include/openssl -I/sw/include/pango-1.0 -I/sw/lib/fontconfig2/include -I/sw/lib/freetype219/include
-I/sw/lib/freetype219/include/freetype2 -I/sw/include -I/usr/local/include -idirafter /usr/X11R6/include' 'CXXFLAGS=-no-cpp-precomp -I/usr/include/openssl -I/sw/include/pango-1.0 -I/sw/lib/fontconfig2/include -I/sw/lib/freetype219/include -I/sw/lib/freetype219/include/freetype2 -I/sw/include -I/usr/local/include' 'CFLAGS=-bind_at_load -pipe -fPIC -mcpu=7450 -mtune=7450 -fast -mpim-altivec -ftree-vectorize -foptimize-register-move -freorder-blocks -freorder-blocks-and-partition -fthread-jumps -fpeephole -fno-crossjumping' 'LDFLAGS=-dead_strip -multiply_defined suppress -L/sw/lib/ncurses -L/sw/lib/fontconfig2/lib -L/sw/lib/freetype219/lib -L/sw/lib -L/usr/local/lib -L/usr/X11R6/lib''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: de_DE.UTF-8
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: de_DE.UTF-8
  value of $XMODIFIERS: nil
  locale-coding-system: utf-8-unix
  default-enable-multibyte-characters: t

Major mode: Lisp Interaction

Minor modes in effect:
  TeX-PDF-mode: t
  shell-dirtrack-mode: t
  show-paren-mode: t
  display-time-mode: t
  desktop-save-mode: t
  tooltip-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  global-auto-composition-mode: t
  auto-composition-mode: t
  auto-compression-mode: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

-- 
Greetings

Pete

A common mistake that people make when trying to design something completely foolproof is to underestimate the ingenuity of complete fools.
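
P.S. The code values quoted in this report can be cross-checked outside Emacs. What follows is only a minimal sketch in Python, assuming its standard `utf-8` and `gb18030` codecs and the `unicodedata` module; the helper name `code_values` is made up for illustration and has nothing to do with how describe-char works internally.

```python
import unicodedata


def code_values(ch: str) -> dict:
    """Return the Unicode, UTF-8 and GB18030 views of one character.

    Hypothetical helper for illustration; not part of Emacs.
    """
    return {
        "codepoint": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "utf8": ch.encode("utf-8").hex().upper(),
        "gb18030": ch.encode("gb18030").hex().upper(),
    }


# GREEK CAPITAL LETTER OMICRON: the "buffer code" is its UTF-8 form,
# the "code point" shown by describe-char is its GB18030 form.
omicron = code_values("\u039f")

# COMBINING COMMA ABOVE (psili) only has a four-byte form in GB18030.
psili = code_values("\u0313")

# The precomposed letter decomposes canonically to the pair above,
# which is what the "decomposition: (927 787)" line reports.
precomposed = "\u1f48"
decomposed = unicodedata.normalize("NFD", precomposed)

for entry in (omicron, psili, code_values(precomposed)):
    print(entry)
print([f"U+{ord(c):04X}" for c in decomposed])
```

The two-byte and four-byte GB18030 values are what Emacs reports as "code point"; the Unicode values are the ones I would expect to see there instead.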