modify-syntax-entry and UTF8?

unofficial mirror of emacs-devel@gnu.org 

* modify-syntax-entry and UTF8?
@ 2007-05-21 17:24 Geoffrey Alan Washburn
  2007-05-21 23:06 ` James Cloos
  2007-05-22  9:48 ` modify-syntax-entry and UTF8? Stefan Monnier
  0 siblings, 2 replies; 19+ messages in thread
From: Geoffrey Alan Washburn @ 2007-05-21 17:24 UTC (permalink / raw)
  To: emacs-devel

As far as I can tell, modify-syntax-entry does not interact properly 
with UTF8 in the CVS snapshot of emacs that I am currently using.  Is 
this the intended behavior, and is there some other mechanism I should 
be using?  Specifically, I would like to do something like the following:

(modify-syntax-entry ?〈 "(〉")
(modify-syntax-entry ?〉 ")〈")

Which is happily accepted but does not correctly interpret matching 
pairs of angle brackets in a buffer.  I searched the documentation but 
could not seem to find information on how "char" parameters to 
modify-syntax-entry relate to Unicode characters.

* Re: modify-syntax-entry and UTF8?
  2007-05-21 17:24 modify-syntax-entry and UTF8? Geoffrey Alan Washburn
@ 2007-05-21 23:06 ` James Cloos
  2007-05-22  8:47   ` Geoffrey Alan Washburn
  2007-05-22  9:48 ` modify-syntax-entry and UTF8? Stefan Monnier
  1 sibling, 1 reply; 19+ messages in thread
From: James Cloos @ 2007-05-21 23:06 UTC (permalink / raw)
  To: Geoffrey Alan Washburn; +Cc: emacs-devel

>>>>> "Geoffrey" == Geoffrey Alan Washburn <geoffw@cis.upenn.edu> writes:

Geoffrey> Specifically I would like to do something like the following

Geoffrey> (modify-syntax-entry ?〈 "(〉")
Geoffrey> (modify-syntax-entry ?〉 ")〈")

Geoffrey> Which is happily accepted but does not correctly interpret
Geoffrey> matching pairs of angle brackets in a buffer.

Those angle brackets you have there are U+2329 LEFT-POINTING ANGLE
BRACKET and U+232A RIGHT-POINTING ANGLE BRACKET, which are CJK or wide
characters.  I suspect you may have wanted the similar, narrow
characters ‹ and ›, which are U+2039 SINGLE LEFT-POINTING ANGLE QUOTATION
MARK and U+203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK.

Try these two lines instead:

 (modify-syntax-entry ?‹ "(›")
 (modify-syntax-entry ?› ")‹")

They may do what you want.

-JimC
-- 
James Cloos <cloos@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6

* Re: modify-syntax-entry and UTF8?
  2007-05-21 23:06 ` James Cloos
@ 2007-05-22  8:47   ` Geoffrey Alan Washburn
  2007-05-22 13:25     ` James Cloos
  0 siblings, 1 reply; 19+ messages in thread
From: Geoffrey Alan Washburn @ 2007-05-22  8:47 UTC (permalink / raw)
  To: emacs-devel; +Cc: emacs-devel

James Cloos wrote:
>>>>>> "Geoffrey" == Geoffrey Alan Washburn <geoffw@cis.upenn.edu> writes:
> 
> Geoffrey> Specifically I would like to do something like the following
> 
> Geoffrey> (modify-syntax-entry ?〈 "(〉")
> Geoffrey> (modify-syntax-entry ?〉 ")〈")
> 
> Geoffrey> Which is happily accepted but does not correctly interpret
> Geoffrey> matching pairs of angle brackets in a buffer.
> 
> Those angle brackets you have there are U+2329 LEFT-POINTING ANGLE
> BRACKET and U+232A RIGHT-POINTING ANGLE BRACKET, which are CJK or wide
> characters.  I suspect you may have wanted the similar, narrow
> characters ‹ and ›, which are U+2039 SINGLE LEFT-POINTING ANGLE QUOTATION
> MARK and U+203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK.
> 
> Try these two lines instead:
> 
>  (modify-syntax-entry ?‹ "(›")
>  (modify-syntax-entry ?› ")‹")
> 
> They may do what you want.

	No, what I wrote is exactly what I meant, unless the author of the 
TeX-input method incorrectly defined \langle and \rangle.

* Re: modify-syntax-entry and UTF8?
  2007-05-21 17:24 modify-syntax-entry and UTF8? Geoffrey Alan Washburn
  2007-05-21 23:06 ` James Cloos
@ 2007-05-22  9:48 ` Stefan Monnier
  1 sibling, 0 replies; 19+ messages in thread
From: Stefan Monnier @ 2007-05-22  9:48 UTC (permalink / raw)
  To: Geoffrey Alan Washburn; +Cc: emacs-devel

Hi Geoffrey,

> As far as I can tell, modify-syntax-entry does not interact properly with
> UTF8 in the CVS snapshot of emacs that I am currently using.  Is this the
> intended behavior, and is there some other mechanism I should be using?
> Specifically, I would like to do something like the following:

> (modify-syntax-entry ?〈 "(〉")
> (modify-syntax-entry ?〉 ")〈")

> Which is happily accepted but does not correctly interpret matching pairs of
> angle brackets in a buffer.  I searched the documentation but could not seem
> to find information on how "char" parameters to modify-syntax-entry relate
> to Unicode characters.

It should work, but there are several potential problems you may be
bumping into.  One is that there are several syntax-tables, so you want to
be careful to modify the one you actually use.  The two calls above modify
the "current syntax table", i.e. the one currently active in the current
buffer, so if you then try it in some other buffer it probably won't do what
you wanted.
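
A minimal sketch of making the target table explicit, so the entries do
not depend on which buffer happens to be current.  The function name
`my-angle-brackets-setup' is hypothetical, and text-mode is only an
example of a mode you might care about:

```elisp
;; Hypothetical helper (not from this thread): set the angle-bracket
;; syntax in whatever syntax table is current when the mode hook runs.
(defun my-angle-brackets-setup ()
  "Mark 〈 and 〉 as a matching paren pair in the current syntax table."
  ;; With no third argument, `modify-syntax-entry' changes the syntax
  ;; table of the current buffer.
  (modify-syntax-entry ?〈 "(〉")
  (modify-syntax-entry ?〉 ")〈"))

(add-hook 'text-mode-hook #'my-angle-brackets-setup)

;; Alternatively, name the table via the optional third argument.
(modify-syntax-entry ?〈 "(〉" text-mode-syntax-table)
(modify-syntax-entry ?〉 ")〈" text-mode-syntax-table)
```

Either form should be enough on its own; the hook variant follows
whatever table the mode installs, while the second form edits one named
table regardless of the current buffer.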

The other one is that Emacs's internal representation of characters is not
unified so there may be several different chars equivalent to 〈, in which
case setting the syntax of one will not have any influence on the other.

Yet another is that maybe the syntax-table is set right, but the way you use
to test that Emacs "correctly interpret matching pairs of angle brackets in
a buffer" hits a bug (or misfeature).

Note also that those two chars should already by default have the syntax
you're trying to set, so most likely the problem is not the first one.

So, in the buffer of interest, go to the 〈 and 〉 chars and check them with
C-u C-x =.  If the *Help* buffer says these are part of the
mule-unicode-... charset and that their syntax is (〉 (resp )〈), then tell us
what makes you think that they are not properly matched (which operation
fails and in precisely which circumstance).
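
The same check can be sketched in Lisp, assuming point is on the
character of interest; evaluate it with M-: in that buffer:

```elisp
;; Inspect the syntax of the character at point in the current buffer.
(let ((ch (char-after)))                ; nil at end of buffer
  (when ch
    (list ch
          (string (char-syntax ch))     ; "(" = open paren, ")" = close paren
          (matching-paren ch))))        ; the paired character, or nil
```

If the second element is not "(" or ")", or the third is nil, the
current syntax table does not treat that character as a paired
delimiter.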

        Stefan

* Re: modify-syntax-entry and UTF8?
  2007-05-22  8:47   ` Geoffrey Alan Washburn
@ 2007-05-22 13:25     ` James Cloos
  2007-05-23 15:09       ` Geoffrey Alan Washburn
  0 siblings, 1 reply; 19+ messages in thread
From: James Cloos @ 2007-05-22 13:25 UTC (permalink / raw)
  To: Geoffrey Alan Washburn; +Cc: emacs-devel

>>>>> "Geoffrey" == Geoffrey Alan Washburn <geoffw@cis.upenn.edu> writes:

Geoffrey> No, what I wrote is exactly what I meant, unless the author of
Geoffrey> the TeX-input method incorrectly defined \langle and \rangle.

Ah.  That does put a different spin on things.

And in fact, the UCS has expanded since that was written, and characters
were added for exactly TeX's \langle and \rangle (and a few others in
latin-ltx.el which currently point to CJK characters instead of math chars).

latin-ltx.el should be updated to use ⟨ U+27E8 MATHEMATICAL LEFT ANGLE
BRACKET for \langle and ⟩ U+27E9 MATHEMATICAL RIGHT ANGLE BRACKET for \rangle.

Other examples are \llbracket and \rrbracket which should be U+27E6 and
U+27E7 instead of U+301A and U+301B, \ldata and \rdata (U+27EA and
U+27EB instead of U+300A and U+300B), \sbs (U+29F5 instead of U+FE68).

The reason is that the CJK characters in Emacs get different codepoints
depending on which language, and that can prevent matching.

I'm sure you are having problems matching those characters because the
versions in your .el file have different buffer and/or file codes than
what you are trying to match them to.

What does C-u C-x = output when point is on the characters in your
(modify-syntax-entry) calls and when point is on one of the characters
you are trying to match in the buffer you are editing?  What are the
mode and coding-system of the buffer you are editing?  What is the
coding-system of the .el file?

-JimC
-- 
James Cloos <cloos@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6

* Re: modify-syntax-entry and UTF8?
  2007-05-22 13:25     ` James Cloos
@ 2007-05-23 15:09       ` Geoffrey Alan Washburn
  2007-05-23 16:56         ` Stefan Monnier
  0 siblings, 1 reply; 19+ messages in thread
From: Geoffrey Alan Washburn @ 2007-05-23 15:09 UTC (permalink / raw)
  To: emacs-devel

James Cloos wrote:
>>>>>> "Geoffrey" == Geoffrey Alan Washburn <geoffw@cis.upenn.edu> writes:
> 
> Geoffrey> No, what I wrote is exactly what I meant, unless the author of
> Geoffrey> the TeX-input method incorrectly defined \langle and \rangle.
> 
> Ah.  That does put a different spin on things.
> 
> And in fact, the UCS has expanded since that was written, and characters
> were added for exactly TeX's \langle and \rangle (and a few others in
> latin-ltx.el which currently point to CJK characters instead of math chars).
> 
> latin-ltx.el should be updated to use ⟨ U+27E8 MATHEMATICAL LEFT ANGLE
> BRACKET for \langle and ⟩ U+27E9 MATHEMATICAL RIGHT ANGLE BRACKET for \rangle.

Ah, that is good to know.  Is there any straightforward way to override 
this in my .emacs file?

> What does C-u C-x = output when point is on the characters in your
> (modify-syntax-entry) calls and when point is on one of the characters
> you are trying to match in the buffer you are editing?  What are the
> mode and coding-system of the buffer you are editing?  What is the
> coding-system of the .el file?

So when using the correct glyphs I get

         character: ⟨ (10216, #o23750, #x27e8)
preferred charset: unicode (Unicode (ISO10646))
        code point: 0x27E8
            syntax: (⟩	which means: open, matches ⟩
       buffer code: #xE2 #x9F #xA8
         file code: #xE2 #x9F #xA8 (encoded by coding system utf-8-unix)
           display: no font available
...

and

         character: ⟩ (10217, #o23751, #x27e9)
preferred charset: unicode (Unicode (ISO10646))
        code point: 0x27E9
            syntax: )⟨	which means: close, matches ⟨
       buffer code: #xE2 #x9F #xA9
         file code: #xE2 #x9F #xA9 (encoded by coding system utf-8-unix)
           display: no font available

...

which as I understand it means that they should already be treated as 
matching delimiters.

However, if I create an empty scratch buffer and move the cursor on top 
of either of the glyphs, they become highlighted, but with the face that 
is used for matched delimiters rather than the face for 
mismatched/unmatched delimiters.  Adding both glyphs to an empty buffer 
in correctly and incorrectly matching permutations gives the same behavior.

So I am inclined to believe Stefan's hypothesis that modify-syntax-entry 
is working correctly here and instead whatever code actually interprets 
the syntax table or performs the actual adjustment to the faces for 
highlighting has a bug of some sort.

I'm also somewhat curious that emacs tells me that no font is available 
for these glyphs, but Thunderbird seems to be able to locate a font that 
can be used to display them.

* Re: modify-syntax-entry and UTF8?
  2007-05-23 15:09       ` Geoffrey Alan Washburn
@ 2007-05-23 16:56         ` Stefan Monnier
  2007-05-25 13:48           ` Geoffrey Alan Washburn
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Monnier @ 2007-05-23 16:56 UTC (permalink / raw)
  To: Geoffrey Alan Washburn; +Cc: emacs-devel

> However, if I create an empty scratch buffer and move the cursor on top of
> either of the glyphs, they become highlighted, but with the face that is used
> for matched delimiters rather than the face for mismatched/unmatched
> delimiters.  Adding both glyphs to an empty buffer in correctly and
> incorrectly matching permutations gives the same behavior.

> So I am inclined to believe Stefan's hypothesis that modify-syntax-entry is
> working correctly here and instead whatever code actually interprets the
> syntax table or performs the actual adjustment to the faces for highlighting
> has a bug of some sort.

Try to use C-M-f and C-M-b to see if the code correctly counts
opening/closing elements (this code doesn't pay attention to matching or
non-matching elements).  If this works, then 99% of things are right.
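
A sketch of the same test done non-interactively, assuming a scratch
table built just for the experiment so that the result does not depend
on any mode's defaults:

```elisp
(with-temp-buffer
  ;; Build a fresh syntax table and declare the pair explicitly.
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?⟨ "(⟩" table)
    (modify-syntax-entry ?⟩ ")⟨" table)
    (set-syntax-table table))
  (insert "⟨ a ⟩ tail")
  ;; `scan-sexps' is the primitive behind C-M-f: it returns the buffer
  ;; position just past one balanced expression starting at FROM.
  ;; If the brackets are counted as parens, this should be the
  ;; position right after ⟩ rather than right after ⟨.
  (scan-sexps (point-min) 1))
```

A result pointing just past the opening bracket would mean the scanner
is not treating the characters as delimiters at all, which is a
different failure from a highlighting problem.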

Then, please tell us what you use to cause face-highlighting of those
opening&closing elements.  Ideally, give us a recipe starting from
"emacs -Q" which can reproduce your problem.

> I'm also somewhat curious that Emacs tells me that no font is available for
> these glyphs, but Thunderbird seems to be able to locate a font that can be
> used to display them.

Emacs's font handling is quite different.  You may need to help it a bit by
specifying which font to use.  I'm not sure how to do that since you seem to
be using the emacs-unicode branch and I'm not yet familiar enough with it.

Maybe start a different thread about it, and don't forget to mention
emacs-unicode in there so the right people will look at it.

        Stefan

* Re: modify-syntax-entry and UTF8?
  2007-05-23 16:56         ` Stefan Monnier
@ 2007-05-25 13:48           ` Geoffrey Alan Washburn
  2007-05-25 14:23             ` Miles Bader
  0 siblings, 1 reply; 19+ messages in thread
From: Geoffrey Alan Washburn @ 2007-05-25 13:48 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier wrote:

> Try to use C-M-f and C-M-b to see if the code correctly counts
> opening/closing elements (this code doesn't pay attention to matching or
> non-matching elements).  If this works, then 99% of things are right.

Based upon my understanding of backward-sexp and forward-sexp, this 
doesn't work correctly either.  Or at least not entirely analogously to 
parens, square brackets, etc.  If I use forward-sexp with the point on 
just

	⟨

it jumps to just after the \langle.  Similarly for

	⟨   ⟩

If I do backward-sexp just after the \rangle the point will move on top 
of the \rangle.  If I do a backward-sexp while on the \rangle instead of 
giving me an error (like it appears to do for parens, etc.) it moves the 
point on top of the \langle.

> Then, please tell us what you use to cause face-highlighting of those
> opening&closing elements.  Ideally, give us a recipe starting from
> "emacs -Q" which can reproduce your problem.

Okay, starting from emacs -Q, enter "⟨   ⟩" and "(   )".  Then, using 
C-u C-x =, verify that these two pairs of glyphs are indeed considered 
to be matching pairs in the current syntax table.  Then enable 
show-paren-mode.  Moving the point on top of "(" or just after ")" will 
cause both glyphs to become "highlighted".  Moving the point on top of 
"⟨" or just after "⟩" will instead just highlight the first or the 
second respectively.  forward-sexp and backward-sexp behave as described 
above.

> Emacs's font handling is quite different.  You may need to help it a bit by
> specifying which font to use.  I'm not sure how to do that since you seem to
> be using the emacs-unicode branch and I'm not yet familiar enough with it.
> 
> Maybe start a different thread about it, and don't forget to mention
> emacs-unicode in there so the right people will look at it.

	Sure.  Thanks.

* Re: modify-syntax-entry and UTF8?
  2007-05-25 13:48           ` Geoffrey Alan Washburn
@ 2007-05-25 14:23             ` Miles Bader
  2007-05-25 14:26               ` Geoffrey Alan Washburn
  0 siblings, 1 reply; 19+ messages in thread
From: Miles Bader @ 2007-05-25 14:23 UTC (permalink / raw)
  To: Geoffrey Alan Washburn; +Cc: emacs-devel

Geoffrey Alan Washburn <geoffw@cis.upenn.edu> writes:
> Then enable show-paren-mode.  Moving the point on top of "(" or
> just after ")" will cause both glyphs to become "highlighted".  Moving
> the point on top of "⟨" or just after "⟩" will instead just highlight
> the first or the second respectively.  forward-sexp and backward-sexp
> behave as described above.

FWIW, all these examples work correctly in Emacs 23 (the unicode
branch).

-Miles

-- 
I'd rather be consing.

* Re: modify-syntax-entry and UTF8?
  2007-05-25 14:23             ` Miles Bader
@ 2007-05-25 14:26               ` Geoffrey Alan Washburn
  2007-05-25 14:54                 ` Geoffrey Alan Washburn
  2007-05-25 22:06                 ` James Cloos
  0 siblings, 2 replies; 19+ messages in thread
From: Geoffrey Alan Washburn @ 2007-05-25 14:26 UTC (permalink / raw)
  To: emacs-devel; +Cc: emacs-devel

Miles Bader wrote:
> Geoffrey Alan Washburn <geoffw@cis.upenn.edu> writes:
>> Then enable show-paren-mode.  Moving the point on top of "(" or
>> just after ")" will cause both glyphs to become "highlighted".  Moving
>> the point on top of "⟨" or just after "⟩" will instead just highlight
>> the first or the second respectively.  forward-sexp and backward-sexp
>> behave as described above.
> 
> FWIW, all these examples work correctly in Emacs 23 (the unicode
> branch).

I am pretty sure I have been building from the emacs-unicode-2 branch. 
M-x version tells me that I am running:

GNU Emacs 23.0.0.3 (i686-pc-linux-gnu, GTK+ Version 2.10.6) of 2007-03-02

I will do a CVS update and try again to make sure this isn't something 
that has been fixed since March.

* Re: modify-syntax-entry and UTF8?
  2007-05-25 14:26               ` Geoffrey Alan Washburn
@ 2007-05-25 14:54                 ` Geoffrey Alan Washburn
  2007-05-25 17:53                   ` Miles Bader
  2007-05-25 22:06                 ` James Cloos
  1 sibling, 1 reply; 19+ messages in thread
From: Geoffrey Alan Washburn @ 2007-05-25 14:54 UTC (permalink / raw)
  To: emacs-devel; +Cc: emacs-devel

Geoffrey Alan Washburn wrote:
> Miles Bader wrote:
>> Geoffrey Alan Washburn <geoffw@cis.upenn.edu> writes:
>>> Then enable show-paren-mode.  Moving the point on top of "(" or
>>> just after ")" will cause both glyphs to become "highlighted".  Moving
>>> the point on top of "⟨" or just after "⟩" will instead just highlight
>>> the first or the second respectively.  forward-sexp and backward-sexp
>>> behave as described above.
>>
>> FWIW, all these examples work correctly in Emacs 23 (the unicode
>> branch).
> 
> I am pretty sure I have been building from the emacs-unicode-2 branch. 
> M-x version tells me that I am running:
> 
> GNU Emacs 23.0.0.3 (i686-pc-linux-gnu, GTK+ Version 2.10.6) of 2007-03-02
> 
> I will do a CVS update and try again to make sure this isn't something 
> that has been fixed since March.

Still broken, at least in the latest CVS version of the emacs-unicode-2 
branch.  Is there another branch that I should be using (or alternately 
have its changes merged into emacs-unicode-2)?

Also, if I want to submit a patch to correct latin-ltx.el, where should 
I send it?

* Re: modify-syntax-entry and UTF8?
  2007-05-25 14:54                 ` Geoffrey Alan Washburn
@ 2007-05-25 17:53                   ` Miles Bader
  2007-05-25 18:20                     ` Geoffrey Alan Washburn
  0 siblings, 1 reply; 19+ messages in thread
From: Miles Bader @ 2007-05-25 17:53 UTC (permalink / raw)
  To: Geoffrey Alan Washburn; +Cc: emacs-devel

Geoffrey Alan Washburn <geoffw@cis.upenn.edu> writes:
> Still broken, at least in the latest CVS version of the emacs-unicode-2 
> branch.  Is there another branch that I should be using (or alternately
> have its changes merged into emacs-unicode-2)?

I'm using my own branch, derived from the CVS emacs-unicode-2 branch
(I don't think my personal changes are relevant to this area).

So how come it works for me but not for you?

I use the following command to invoke Emacs (where /tmp/m is your
message):

   env - TERM=xterm emacs -nw -Q /tmp/m

[The non-ascii chars are displayed as question-marks because Emacs
doesn't know what character set the terminal supports, but they still
match correctly.]

-Miles

-- 
The car has become... an article of dress without which we feel uncertain,
unclad, and incomplete.  [Marshall McLuhan, Understanding Media, 1964]

* Re: modify-syntax-entry and UTF8?
  2007-05-25 17:53                   ` Miles Bader
@ 2007-05-25 18:20                     ` Geoffrey Alan Washburn
  0 siblings, 0 replies; 19+ messages in thread
From: Geoffrey Alan Washburn @ 2007-05-25 18:20 UTC (permalink / raw)
  To: emacs-devel

Miles Bader wrote:
> Geoffrey Alan Washburn <geoffw@cis.upenn.edu> writes:
>> Still broken, at least in the latest CVS version of the emacs-unicode-2 
>> branch.  Is there another branch that I should be using (or alternately
>> have its changes merged into emacs-unicode-2)?
> 
> I'm using my own branch, derived from the CVS emacs-unicode-2 branch
> (I don't think my personal changes are relevant to this area).
> 
> So how come it works for me but not for you?
> 
> I use the following command to invoke Emacs (where /tmp/m is your
> message):
> 
>    env - TERM=xterm emacs -nw -Q /tmp/m
> 
> [The non-ascii chars are displayed as question-marks because Emacs
> doesn't know what character set the terminal supports, but they still
> match correctly.]

Okay, one difference was that I wasn't running emacs in terminal mode. 
However, I tried your example which worked and then reverted to my 
normal configuration and now that seems to be working too.  So this 
really got me worried that I was just doing something really silly. 
However, if I start a fresh buffer and then use X11 "cut and paste" to 
copy ⟨   ⟩ into the buffer (instead of reading it from a file), it 
doesn't work.  If I am editing a file I've opened that already 
contains ⟨   ⟩ and "cut and paste" a second pair, those match correctly 
as well.  Alternately, if I first "cut and paste" and then use 
insert-file, matching behaves incorrectly.  So I'm not sure whether this 
is a configuration error on my part or some bugginess regarding 
interaction with the X server.  I was pretty sure that I had Emacs 
configured to use the correct encoding to start with:

(set-language-environment "utf-8")
(set-keyboard-coding-system 'utf-8)
(set-terminal-coding-system 'utf-8)
(prefer-coding-system 'utf-8)

I just checked and when opening a file with the pair of delimiters, 
describe-coding-system says

Coding system for saving this buffer:
   U -- utf-8-unix (alias: mule-utf-8-unix)

Default coding system (for new files):
   U -- utf-8 (alias: mule-utf-8)

Coding system for keyboard input:
   U -- utf-8 (alias: mule-utf-8)

Coding system for terminal output:
   U -- utf-8 (alias: mule-utf-8)

Coding system for inter-client cut and paste:
   x -- compound-text-with-extensions (alias: x-ctext-with-extensions ctext-with-extensions)

Defaults for subprocess I/O:
   decoding: U -- utf-8-unix (alias: mule-utf-8-unix)

   encoding: U -- utf-8-unix (alias: mule-utf-8-unix)

whereas when I open a fresh buffer I get

Coding system for saving this buffer:
   Not set locally, use the default.
Default coding system (for new files):
   U -- utf-8 (alias: mule-utf-8)

Coding system for keyboard input:
   U -- utf-8 (alias: mule-utf-8)

Coding system for terminal output:
   U -- utf-8 (alias: mule-utf-8)

Coding system for inter-client cut and paste:
   x -- compound-text-with-extensions (alias: x-ctext-with-extensions ctext-with-extensions)

Defaults for subprocess I/O:
   decoding: U -- utf-8-unix (alias: mule-utf-8-unix)

   encoding: U -- utf-8-unix (alias: mule-utf-8-unix)

If I change my .emacs so that it has

(prefer-coding-system 'utf-8-unix)

describe-coding-system reports both cases as identical to the first I 
quoted above.  However, the behavior for working with a fresh buffer 
versus an opened file remains the same.

* Re: modify-syntax-entry and UTF8?
  2007-05-25 14:26               ` Geoffrey Alan Washburn
  2007-05-25 14:54                 ` Geoffrey Alan Washburn
@ 2007-05-25 22:06                 ` James Cloos
  2007-05-28 16:04                   ` Geoffrey Alan Washburn
  1 sibling, 1 reply; 19+ messages in thread
From: James Cloos @ 2007-05-25 22:06 UTC (permalink / raw)
  To: Geoffrey Alan Washburn; +Cc: Miles Bader, emacs-devel

>>>>> "Geoffrey" == Geoffrey Alan Washburn <geoffw@cis.upenn.edu> writes:

Geoffrey> I am pretty sure I have been building from the emacs-unicode-2
Geoffrey> branch.

That pretty much invalidates much/most of what I earlier wrote in this
thread.  Sorry for any mis-direction.

I was just trying this out, and was seeing cases where only one of the
angle brackets was highlighted (for both the U+27E8/U+27E9 and
U+2329/U+232A pairs).

After some trial and error, I narrowed it down to instances where there
was only whitespace between the open and close angle brackets when the
buffer was in Lisp Interaction mode (i.e. the *scratch* buffer).

Either Lisp Interaction has some syntax rules which intentionally limit
highlighting when only whitespace separates the delimiters or something
is triggering a display bug in that situation.  

The partial highlighting in Lisp Interaction mode when only whitespace
separates the delimiters does not happen with parentheses or brackets;
and no highlighting occurs with braces.

I have:

  (set-terminal-coding-system 'utf-8)
  (set-default-coding-systems 'utf-8)

in my ~/.emacs and run with LANG=en_US.UTF-8 LC_COLLATE=C LC_TIME=C.

-JimC
-- 
James Cloos <cloos@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6

* Re: modify-syntax-entry and UTF8?
  2007-05-25 22:06                 ` James Cloos
@ 2007-05-28 16:04                   ` Geoffrey Alan Washburn
  2007-05-29 12:43                     ` highlights of parentheses in lisp-interaction-mode (was: Re: modify-syntax-entry and UTF8?) James Cloos
  0 siblings, 1 reply; 19+ messages in thread
From: Geoffrey Alan Washburn @ 2007-05-28 16:04 UTC (permalink / raw)
  To: emacs-devel; +Cc: Miles Bader

James Cloos wrote:
>>>>>> "Geoffrey" == Geoffrey Alan Washburn <geoffw@cis.upenn.edu> writes:
> I was just trying this out, and was seeing cases where only one of the
> angle brackets was highlighted (for both the U+27E8/U+27E9 and
> U+2329/U+232A pairs).
> 
> After some trial and error, I narrowed it down to instances where there
> was only whitespace between the open and close angle brackets when the
> buffer was in Lisp Interaction mode (i.e. the *scratch* buffer).
> 
> Either Lisp Interaction has some syntax rules which intentionally limit
> highlighting when only whitespace separates the delimiters or something
> is triggering a display bug in that situation.  

	So should I file a bug report on this then?

> The partial highlighting in Lisp Interaction mode when only whitespace
> separates the delimiters does not happen with parentheses or brackets;
> and no highlighting occurs with braces.
> 
> I have:
> 
>   (set-terminal-coding-system 'utf-8)
>   (set-default-coding-systems 'utf-8)
> 
> in my ~/.emacs and run with LANG=en_US.UTF-8 LC_COLLATE=C LC_TIME=C.
> 

	I have LANG and LC_CTYPE set to en_US.UTF-8, but I do not have 
LC_COLLATE or LC_TIME set.

* highlights of parentheses in lisp-interaction-mode (was: Re: modify-syntax-entry and UTF8?)
  2007-05-28 16:04                   ` Geoffrey Alan Washburn
@ 2007-05-29 12:43                     ` James Cloos
  2007-05-29 13:27                       ` highlights of parentheses in lisp-interaction-mode martin rudalics
  2007-06-04  0:17                       ` highlights of parentheses in lisp-interaction-mode (was: Re: modify-syntax-entry and UTF8?) Richard Stallman
  0 siblings, 2 replies; 19+ messages in thread
From: James Cloos @ 2007-05-29 12:43 UTC (permalink / raw)
  To: Geoffrey Alan Washburn; +Cc: emacs-devel, Miles Bader

>>>>> "James" == James Cloos <cloos@jhcloos.com> writes:
>>>>> "Geoffrey" == Geoffrey Alan Washburn <geoffw@cis.upenn.edu> writes:

James> Either Lisp Interaction has some syntax rules which intentionally
James> limit highlighting when only whitespace separates the delimiters
James> or something is triggering a display bug in that situation.

Geoffrey> So should I file a bug report on this then?

Probably, but I've cc'ed the list for further opinions.

To the list:  paren match highlighting in lisp interaction mode fails to
highlight the distant paren when any whitespace separates the matching
parens.  This only happens for non-ASCII parens.

I've verified it in X11 frames on the unicode-2 branch and on terminal
frames with both the unicode-2 and the 22 release branches.

-JimC
-- 
James Cloos <cloos@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6

* Re: highlights of parentheses in lisp-interaction-mode
  2007-05-29 12:43                     ` highlights of parentheses in lisp-interaction-mode (was: Re: modify-syntax-entry and UTF8?) James Cloos
@ 2007-05-29 13:27                       ` martin rudalics
  2007-05-29 16:34                         ` James Cloos
  2007-06-04  0:17                       ` highlights of parentheses in lisp-interaction-mode (was: Re: modify-syntax-entry and UTF8?) Richard Stallman
  1 sibling, 1 reply; 19+ messages in thread
From: martin rudalics @ 2007-05-29 13:27 UTC (permalink / raw)
  To: James Cloos; +Cc: Miles Bader, Geoffrey Alan Washburn, emacs-devel

> To the list:  paren match highlighting in lisp interaction mode fails to
> highlight the distant paren when any whitespace separates the matching
> parens.  This only happens for non-ASCII parens.
> 
> I've verified it in X11 frames on the unicode-2 branch and on terminal
> frames with both the unicode-2 and the 22 release branches.

Could you try with `multibyte-syntax-as-symbol' nil?

* Re: highlights of parentheses in lisp-interaction-mode
  2007-05-29 13:27                       ` highlights of parentheses in lisp-interaction-mode martin rudalics
@ 2007-05-29 16:34                         ` James Cloos
  0 siblings, 0 replies; 19+ messages in thread
From: James Cloos @ 2007-05-29 16:34 UTC (permalink / raw)
  To: martin rudalics; +Cc: emacs-devel, Geoffrey Alan Washburn, Miles Bader

>>>>> "martin" == martin rudalics <rudalics@gmx.at> writes:

>> To the list:  paren match highlighting in lisp interaction mode fails to
>> highlight the distant paren when any whitespace separates the matching
>> parens.  This only happens for non-ASCII parens.
>> 
>> I've verified it in X11 frames on the unicode-2 branch and on terminal
>> frames with both the unicode-2 and the 22 release branches.

martin> Could you try with `multibyte-syntax-as-symbol' nil?

That makes the difference.

With that nil the highlighting works as normal.
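
For anyone wanting to try the same thing, a minimal sketch, assuming
the Lisp modes are what set the variable non-nil in the buffer (the
result above suggests so, but I have not traced it).  The function name
`my-multibyte-parens-setup' is hypothetical:

```elisp
;; Hypothetical hook function (not part of Emacs).
(defun my-multibyte-parens-setup ()
  "Let sexp scanning honor the paren syntax of multibyte characters."
  ;; When `multibyte-syntax-as-symbol' is non-nil, `scan-sexps' treats
  ;; every multibyte character as a symbol constituent, so non-ASCII
  ;; brackets are never seen as delimiters.
  (set (make-local-variable 'multibyte-syntax-as-symbol) nil))

(add-hook 'lisp-interaction-mode-hook #'my-multibyte-parens-setup)
```

Setting it buffer-locally from the mode hook keeps the change confined
to Lisp Interaction buffers rather than altering the global value.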

-JimC
-- 
James Cloos <cloos@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6

* Re: highlights of parentheses in lisp-interaction-mode (was: Re: modify-syntax-entry and UTF8?)
  2007-05-29 12:43                     ` highlights of parentheses in lisp-interaction-mode (was: Re: modify-syntax-entry and UTF8?) James Cloos
  2007-05-29 13:27                       ` highlights of parentheses in lisp-interaction-mode martin rudalics
@ 2007-06-04  0:17                       ` Richard Stallman
  1 sibling, 0 replies; 19+ messages in thread
From: Richard Stallman @ 2007-06-04  0:17 UTC (permalink / raw)
  To: James Cloos; +Cc: miles, geoffw, emacs-devel

    To the list:  paren match highlighting in lisp interaction mode fails to
    highlight the distant paren when any whitespace separates the matching
    parens.  This only happens for non-ASCII parens.

Could you please send me a *precise* test case for this bug?  The test
case should start with `emacs -q', so that your .emacs file does not
affect it, and it should show exactly what text to put in the buffer,
what commands to execute, and how and where to click.  Also please say
exactly what incorrect results you get.

Thread overview: 19+ messages
2007-05-21 17:24 modify-syntax-entry and UTF8? Geoffrey Alan Washburn
2007-05-21 23:06 ` James Cloos
2007-05-22  8:47   ` Geoffrey Alan Washburn
2007-05-22 13:25     ` James Cloos
2007-05-23 15:09       ` Geoffrey Alan Washburn
2007-05-23 16:56         ` Stefan Monnier
2007-05-25 13:48           ` Geoffrey Alan Washburn
2007-05-25 14:23             ` Miles Bader
2007-05-25 14:26               ` Geoffrey Alan Washburn
2007-05-25 14:54                 ` Geoffrey Alan Washburn
2007-05-25 17:53                   ` Miles Bader
2007-05-25 18:20                     ` Geoffrey Alan Washburn
2007-05-25 22:06                 ` James Cloos
2007-05-28 16:04                   ` Geoffrey Alan Washburn
2007-05-29 12:43                     ` highlights of parentheses in lisp-interaction-mode (was: Re: modify-syntax-entry and UTF8?) James Cloos
2007-05-29 13:27                       ` highlights of parentheses in lisp-interaction-mode martin rudalics
2007-05-29 16:34                         ` James Cloos
2007-06-04  0:17                       ` highlights of parentheses in lisp-interaction-mode (was: Re: modify-syntax-entry and UTF8?) Richard Stallman
2007-05-22  9:48 ` modify-syntax-entry and UTF8? Stefan Monnier
