unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#1913: Identifier after reserved word "raise" is not always
@ 2010-01-13  8:03 Stephen Leake
  2011-07-09 23:24 ` Juanma Barranquero
From: Stephen Leake @ 2010-01-13  8:03 UTC (permalink / raw)
  To: 1913

It is clear that [a-zA-Z] does not match the characters permitted by
the Ada standard.

However, neither does [[:alpha:]] - consider this fragment:

procedure doµ 

The 'µ' (entered with C-x 8 u) is not matched by [[:alpha:]]*
(Emacs 23.1, Windows XP, LANG=C.UTF-8).

This could be fixed by the user; they can define µ to have word
syntax.

Ideally, we would have regular expression character classes that match
the categories defined by ISO/IEC 10646:2003 (see LRM 2.1):

Letter, Uppercase
Letter, Lowercase
Letter, Titlecase
Letter, Modifier
Letter, Other
Mark, Non-Spacing
Mark, Spacing Combining
Number, Decimal
Number, Letter
Punctuation, Connector
Other, Format
Separator, Space
Separator, Line
Separator, Paragraph

These categories are used to define Ada lexical elements (LRM 2.2).
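
To make the category list concrete, here is a small sketch. It is in
Python rather than Emacs Lisp, purely for illustration, and it relies on
the Unicode database bundled with Python, which is newer than ISO/IEC
10646:2003. The names LISTED_CATEGORIES and in_listed_category are
made up for this example; the two-letter codes are the standard Unicode
general-category abbreviations for the classes listed above.

```python
import re
import unicodedata

# General-category codes for the classes listed above (assumed mapping:
# Lu = Letter, Uppercase ... Zp = Separator, Paragraph).
LISTED_CATEGORIES = {
    "Lu", "Ll", "Lt", "Lm", "Lo",  # letters
    "Mn", "Mc",                    # marks
    "Nd", "Nl",                    # numbers
    "Pc",                          # connector punctuation, e.g. "_"
    "Cf",                          # format characters
    "Zs", "Zl", "Zp",              # separators
}


def in_listed_category(ch):
    """Return True if the single character ch is in one of the listed classes."""
    return unicodedata.category(ch) in LISTED_CATEGORIES


micro = "\u00b5"                      # MICRO SIGN, the character in "doµ"
ascii_only = re.compile(r"[a-zA-Z]+")  # the kind of range used today

print(unicodedata.category(micro))           # general category of µ
print(ascii_only.fullmatch("do" + micro))    # the ASCII range cannot match it
print(in_listed_category(micro))
```

The point is only that µ is classified as a letter by the Unicode
database, so an identifier matcher built on these categories would
accept it while an a-z range does not.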

But I don't think that's going to happen.

It seems the best compromise is to replace a-z etc. with [:alpha:] or
[:alnum:] as appropriate, and hope the user knows how to define
characters to have word syntax. That's a lot of work, since each
modified regexp needs to be tested.

As for matching leading underscores, I agree it would be nice to get
it right. Using shy groups (the elisp name for non-capturing groups)
would help, since they don't disturb the group numbering and are
faster as well. If it doesn't complicate the testing, I'll try to do
that.
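
As an illustration of the group-numbering point, here is a minimal
sketch in Python, whose non-capturing syntax (?:...) plays the role of
the Emacs shy group \(?:...\). The pattern is hypothetical and is not
taken from ada-mode; it just matches "raise" followed by a dotted name.

```python
import re

# Hypothetical pattern: "raise" followed by a dotted identifier.
# With an ordinary group around the "Prefix." part, that inner group
# takes number 2 and every later group would be shifted by one.
capturing = re.compile(r"raise\s+((\w+\.)*\w+)")

# With a non-capturing (shy) group, the only numbered group is the name.
shy = re.compile(r"raise\s+((?:\w+\.)*\w+)")

text = "raise Pkg.Sub.My_Error"

print(capturing.groups)            # number of numbered groups
print(shy.groups)
print(shy.match(text).group(1))    # the full dotted name
```

Code that refers to groups by number keeps working when the extra
grouping is shy, which is why it makes the regexps easier to change
without retesting every caller.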

Do you have suggestions about which regular expressions are most
important to fix? If you can provide typical code, and point out
the most annoying font-lock failures, that would be a good start.

-- 
-- Stephe






Thread overview: 6+ messages
2010-01-13  8:03 bug#1913: Identifier after reserved word "raise" is not always Stephen Leake
2011-07-09 23:24 ` Juanma Barranquero
2011-07-10 17:28   ` Stephen Leake
2011-07-10 23:12     ` Juanma Barranquero
2011-07-11 13:07       ` Stephen Leake
2011-07-12 12:19         ` Juanma Barranquero
