From: Geoffrey Alan Washburn
Newsgroups: gmane.emacs.devel
Subject: Re: modify-syntax-entry and UTF8?
Date: Wed, 23 May 2007 11:09:22 -0400
Message-ID: <46545922.1050002@cis.upenn.edu>
References: <4652AE2C.5030305@cis.upenn.edu>
To: emacs-devel@gnu.org

James Cloos wrote:
>>>>>> "Geoffrey" == Geoffrey Alan Washburn writes:
>
> Geoffrey> No, what I wrote is exactly what I meant, unless the author of
> Geoffrey> the TeX-input method incorrectly defined \langle and \rangle.
>
> Ah. That does put a different spin on things.
>
> And in fact, the UCS has expanded since that was written, and characters
> were added for exactly TeX's \langle and \rlangle (and a few others in
> latin-ltx.el which currently point to CJK characters instead of math chars).
>
> latin-ltx.el should be updated to use ⟨ U+27E8 MATHEMATICAL LEFT ANGLE
> BRACKET for \langle and ⟩ U+27E9 MATHEMATICAL RIGHT ANGLE BRACKET for \rangle.

Ah, that is good to know. Is there any straightforward way to override
this in my .emacs file?

> What does C-uC-x= output when point is on the characters in your
> (modify-syntax-entry) calls and when point is on one of the characters
> you are trying to match in the buffer you are editing? What are the
> mode and coding-system of the buffer you are editing? What is the
> coding-system of the .el file?

So when using the correct glyphs I get

  character: ⟨ (10216, #o23750, #x27e8)
  preferred charset: unicode (Unicode (ISO10646))
  code point: 0x27E8
  syntax: (⟩ which means: open, matches ⟩
  buffer code: #xE2 #x9F #xA8
  file code: #xE2 #x9F #xA8 (encoded by coding system utf-8-unix)
  display: no font available
  ...

and

  character: ⟩ (10217, #o23751, #x27e9)
  preferred charset: unicode (Unicode (ISO10646))
  code point: 0x27E9
  syntax: )⟨ which means: close, matches ⟨
  buffer code: #xE2 #x9F #xA9
  file code: #xE2 #x9F #xA9 (encoded by coding system utf-8-unix)
  display: no font available
  ...

which, as I understand it, means that they should already be treated as
matching delimiters.

However, if I create an empty scratch buffer and move the cursor on top
of either of the glyphs, they become highlighted, but with the face that
is used for matched delimiters rather than the face for
mismatched/unmatched delimiters. Adding both glyphs to an empty buffer in
correctly and incorrectly matching permutations gives the same behavior.

So I am inclined to believe Stefan's hypothesis that modify-syntax-entry
is working correctly here, and that instead whatever code actually
interprets the syntax table, or performs the actual adjustment to the
faces for highlighting, has a bug of some sort.

I'm also somewhat curious that Emacs tells me that no font is available
for these glyphs, while Thunderbird seems to be able to locate a font
that can be used to display them.
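
The numeric values in the two character descriptions above can be cross-checked outside Emacs. The following is only a sketch in Python, not anything taken from Emacs itself: the helper name `describe` and the `BRACKETS` table are made up for illustration. It recomputes the decimal, octal, and hex forms of each code point along with its UTF-8 bytes, formatted the way the "buffer code" lines show them.

```python
import unicodedata

# The two characters discussed in this thread, keyed by their TeX names.
# (This table is a hypothetical helper, not part of latin-ltx.el.)
BRACKETS = {
    "\\langle": "\u27e8",
    "\\rangle": "\u27e9",
}


def describe(ch):
    """Return the numeric forms and UTF-8 bytes of a single character."""
    cp = ord(ch)
    return {
        "name": unicodedata.name(ch),
        "decimal": cp,
        "octal": format(cp, "o"),
        "hex": format(cp, "x"),
        # Format each byte like the "buffer code" line, e.g. "#xE2".
        "utf8": " ".join(f"#x{b:02X}" for b in ch.encode("utf-8")),
    }


for macro, ch in BRACKETS.items():
    print(macro, describe(ch))
```

If the values printed here agree with what C-u C-x = reports, the characters in the buffer are the intended U+27E8 and U+27E9 and the file is being read as UTF-8, which would point away from an encoding problem and toward the highlighting code.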