all messages for Emacs-related lists mirrored at yhetil.org
* syntax identification (Request for Help)
@ 2015-08-04 16:49 Phillip Lord
  2015-08-07 14:57 ` Stefan Monnier
From: Phillip Lord @ 2015-08-04 16:49 UTC
  To: emacs-devel



I am trying to improve the syntax identification of omn-mode.el (in
elpa). The omn syntax uses URLs everywhere which are identified like so:
<http://www.gnu.org>

Although these look like URLs, within this syntax they have little
meaning as locations; they are really IRIs: identifiers, rather than
locations. IRIs are difficult to identify by regular expression, so I
treat them syntactically as strings with this (st is the syntax table).

    (modify-syntax-entry ?\< "|" st)
    (modify-syntax-entry ?\> "|" st)


Strings are convenient because I also do this:

    (modify-syntax-entry ?\# "<" st)
    (modify-syntax-entry ?\n ">" st)

That is, # is the comment-start character, but NOT inside an IRI, where
it is actually quite common. Identifying IRIs as strings also solves this
problem, since comment characters inside strings are not comment
characters; Emacs gives me this for free.
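
For illustration, here is a minimal sketch of the setup described so far,
with a helper that reports how Emacs classifies a given position. The
names `my-omn-naive-table' and `my-omn-naive-context' and the example IRI
are hypothetical and are not taken from omn-mode.el.

```elisp
;; Sketch only: these names are hypothetical, not part of omn-mode.el.
(defvar my-omn-naive-table
  (let ((st (make-syntax-table)))
    (modify-syntax-entry ?\< "|" st)   ; generic string fence
    (modify-syntax-entry ?\> "|" st)   ; generic string fence
    (modify-syntax-entry ?\# "<" st)   ; comment start
    (modify-syntax-entry ?\n ">" st)   ; comment end
    st)
  "Syntax table with the four entries discussed above.")

(defun my-omn-naive-context (text pos)
  "Return `string', `comment' or `code' for buffer position POS in TEXT."
  (with-temp-buffer
    (set-syntax-table my-omn-naive-table)
    (insert text)
    ;; `syntax-ppss' returns the parser state just before POS.
    (let ((state (syntax-ppss pos)))
      (cond ((nth 3 state) 'string)
            ((nth 4 state) 'comment)
            (t 'code)))))

;; Position 17 is just after the "#" inside the IRI, so that "#" is
;; part of a string; position 22 is inside the trailing comment.
(my-omn-naive-context "<http://a.org/x#y> # note" 17) ; => string
(my-omn-naive-context "<http://a.org/x#y> # note" 22) ; => comment
```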

This fails, however, in two ways. Firstly, while <url> is correctly
identified, so are <url<, >url< and >url>. Secondly, "<" and ">" can
also be used alone to mean (guess what!) less than or greater than in an
expression like so:

     xsd:integer[>= 0 , <= 18]


Unfortunately, everything between ">" and "<" gets identified as a
string.
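
The failure can be reproduced with a small sketch that lists every span
the fence-based table treats as a string. The function name
`my-omn-naive-string-spans' is hypothetical; it relies only on the two
string-fence entries shown earlier.

```elisp
;; Sketch only: `my-omn-naive-string-spans' is a hypothetical helper.
(defun my-omn-naive-string-spans (text)
  "Return the substrings of TEXT that the fence-based table sees as strings."
  (with-temp-buffer
    (let ((st (make-syntax-table)))
      (modify-syntax-entry ?\< "|" st)
      (modify-syntax-entry ?\> "|" st)
      (set-syntax-table st))
    (insert text)
    (goto-char (point-min))
    (let (spans)
      (while (re-search-forward "[<>]" nil t)
        (let ((start (match-beginning 0)))
          (goto-char start)
          (condition-case nil
              (progn
                ;; Jump from the opening fence to just past the closing one.
                (forward-sexp 1)
                (push (buffer-substring-no-properties start (point)) spans))
            ;; An unterminated "string" runs to the end of the text.
            (scan-error (goto-char (point-max))))))
      (nreverse spans))))

;; The comparison operators are paired up as if they delimited a string.
(my-omn-naive-string-spans "xsd:integer[>= 0 , <= 18]") ; => (">= 0 , <")
```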

Stefan added comments to omn-mode saying "We could use a
syntax-propertize-function to do more carefully.". Would anyone be
willing to help explain to me how this works and help me? I found the
manual a bit confusing.

I am willing to use space characters to differentiate. IRIs are complex
(they have very few rules) but cannot contain spaces. The "facet" (i.e.
[>= 0]) bit above can contain spaces, and while it does not need to
contain spaces, I am willing to rely on them to distinguish a facet
from an IRI as an acceptable compromise.
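
As a rough sketch of that compromise, the following treats "<", any run
of non-space characters, and ">" as an IRI candidate. The function name
`my-omn-iri-candidates' and the regular expression are assumptions made
for illustration, not existing omn-mode code.

```elisp
;; Sketch only: `my-omn-iri-candidates' is a hypothetical helper.
(defun my-omn-iri-candidates (line)
  "Return the substrings of LINE of the form <...> that contain no spaces."
  (let ((start 0)
        found)
    (while (string-match "<[^ ]*>" line start)
      (push (match-string 0 line) found)
      ;; Every match is at least two characters, so this always advances.
      (setq start (match-end 0)))
    (nreverse found)))

(my-omn-iri-candidates "Class: <http://a.org/x#y>") ; => ("<http://a.org/x#y>")
(my-omn-iri-candidates "xsd:integer[>= 0 , <= 18]") ; => nil
```

A facet written without spaces and with the operators in the other
order, such as [<=18,>=0], would still be picked up as "<=18,>", which is
the cost of the compromise.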

Any help gratefully received.

Phil




* Re: syntax identification (Request for Help)
  2015-08-04 16:49 syntax identification (Request for Help) Phillip Lord
@ 2015-08-07 14:57 ` Stefan Monnier
  2015-08-11 21:33   ` Phillip Lord
From: Stefan Monnier @ 2015-08-07 14:57 UTC
  To: Phillip Lord; +Cc: emacs-devel

> Stefan added comments to omn-mode saying "We could use a
> syntax-propertize-function to do more carefully.". Would anyone be
> willing to help explain to me how this works and help me? I found the
> manual a bit confusing.

You could start by removing the

    (modify-syntax-entry ?\< "|" st)
    (modify-syntax-entry ?\> "|" st)

and using

    (setq-local syntax-propertize-function
                (syntax-propertize-rules
                 ("\\(<\\)[^ ]*\\(>\\)" (1 "|") (2 "|"))))

Guaranteed 100% untested.
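
To make the suggestion concrete, here is a self-contained sketch that
applies the rule in a temporary buffer and reports how a position is
classified. The names `my-omn-sketch-table' and `my-omn-sketch-context'
and the example IRI are hypothetical; the sketch assumes the syntax table
keeps only the comment entries, with "<" and ">" left at their default
syntax.

```elisp
;; Sketch only: these names are hypothetical, not part of omn-mode.el.
(defvar my-omn-sketch-table
  (let ((st (make-syntax-table)))
    ;; Only the comment entries remain; "<" and ">" are handled by
    ;; text properties instead of the syntax table.
    (modify-syntax-entry ?\# "<" st)
    (modify-syntax-entry ?\n ">" st)
    st))

(defun my-omn-sketch-context (text pos)
  "Return `string', `comment' or `code' for buffer position POS in TEXT."
  (with-temp-buffer
    (set-syntax-table my-omn-sketch-table)
    (setq-local syntax-propertize-function
                (syntax-propertize-rules
                 ;; "<", any run of non-space characters, ">": mark
                 ;; both delimiters as generic string fences.
                 ("\\(<\\)[^ ]*\\(>\\)" (1 "|") (2 "|"))))
    (insert text)
    ;; Apply the rules up to the end of the buffer before parsing.
    (syntax-propertize (point-max))
    (let ((state (syntax-ppss pos)))
      (cond ((nth 3 state) 'string)
            ((nth 4 state) 'comment)
            (t 'code)))))

;; Inside the IRI, just after its "#": still a string.
(my-omn-sketch-context "<http://a.org/x#y> # note" 17)  ; => string
;; Between ">=" and "<=" in a spaced facet: no longer a string.
(my-omn-sketch-context "xsd:integer[>= 0 , <= 18]" 16)  ; => code
```

Because the regexp only excludes the space character, a match may
include tabs, newlines, or further ">" characters; tightening the
character class is one possible refinement.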


        Stefan




* Re: syntax identification (Request for Help)
  2015-08-07 14:57 ` Stefan Monnier
@ 2015-08-11 21:33   ` Phillip Lord
From: Phillip Lord @ 2015-08-11 21:33 UTC
  To: Stefan Monnier; +Cc: Phillip Lord, emacs-devel


Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Stefan added comments to omn-mode saying "We could use a
>> syntax-propertize-function to do more carefully.". Would anyone be
>> willing to help explain to me how this works and help me? I found the
>> manual a bit confusing.
>
> You could start by removing the
>
>     (modify-syntax-entry ?\< "|" st)
>     (modify-syntax-entry ?\> "|" st)
>
> and using
>
>     (setq-local syntax-propertize-function
>                 (syntax-propertize-rules
>                  ("\\(<\\)[^ ]*\\(>\\)" (1 "|") (2 "|"))))
>
> Guaranteed 100% untested.

I've just tested it and it seems to work pretty well, actually!

Phil


