unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#20859: 25.0.50; css-mode: Comments within URIs
@ 2015-06-20 15:04 Simen Heggestøyl
  2015-06-21 17:38 ` Simen Heggestøyl
  2015-06-22  1:13 ` Stefan Monnier
  0 siblings, 2 replies; 9+ messages in thread
From: Simen Heggestøyl @ 2015-06-20 15:04 UTC (permalink / raw)
  To: 20859

[-- Attachment #1: Type: text/plain, Size: 1372 bytes --]

In CSS, the syntax of URIs is described as follows:

  The format of a URI value is 'url(' followed by optional white space
  followed by an optional single quote (') or double quote (")
  character followed by the URI itself, followed by an optional single
  quote (') or double quote (") character followed by optional white
  space followed by ')'. The two quote characters must be the
  same. [1]

This means that all of the following are legal URI entries:

  url("http://www.example.com/")
  url('http://www.example.com/')
  url(http://www.example.com/)

However, css-mode doesn't currently interpret the stuff between the
parenthesis in the quote-less form as strings. This means that URIs
containing /* (or // in scss-mode, which is more common for URIs) will
be interpreted as comment starters by Emacs.

I attempted to fix this by setting syntax-propertize-function to the
following:

  (syntax-propertize-rules
     ("url\\(\(\\)[[:space:]]*[[:graph:]]*[[:space:]]*\\(\)\\)"
      (1 "|") (2 "|")))

This almost solves the problem, but with the quirk that "(" and ")"
are interpreted as part of the string.

Would it be possible to assign string syntax to the stuff between the
two parenthesis without having any visible string delimiters? I'm
thankful for any hints for getting closer to a solution!

-- Simen


[1] http://www.w3.org/TR/CSS21/syndata.html#uri

[-- Attachment #2: Type: text/html, Size: 2048 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* bug#20859: 25.0.50; css-mode: Comments within URIs
  2015-06-20 15:04 bug#20859: 25.0.50; css-mode: Comments within URIs Simen Heggestøyl
@ 2015-06-21 17:38 ` Simen Heggestøyl
  2015-06-22 15:54   ` Stefan Monnier
  2015-06-22  1:13 ` Stefan Monnier
  1 sibling, 1 reply; 9+ messages in thread
From: Simen Heggestøyl @ 2015-06-21 17:38 UTC (permalink / raw)
  To: 20859

[-- Attachment #1: Type: text/plain, Size: 1913 bytes --]

I think I found a solution, by applying the string syntax class to the
whole URI.

Please consider the following patch:


 From 69e4a7635f17dd2b5cb773b9088d154687a75b44 Mon Sep 17 00:00:00 2001
 From: =?UTF-8?q?Simen=20Heggest=C3=B8yl?= <simenheg@gmail.com>
Date: Sat, 20 Jun 2015 17:17:31 +0200
Subject: [PATCH] Handle comments inside unquoted URIs in css-mode

* css-mode.el (css-syntax-propertize-function): New function.
(css-mode): Set `syntax-propertize-function'.  (Bug#20859)
---
 lisp/textmodes/css-mode.el | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/lisp/textmodes/css-mode.el b/lisp/textmodes/css-mode.el
index 424cdb7..9ffaa0a 100644
--- a/lisp/textmodes/css-mode.el
+++ b/lisp/textmodes/css-mode.el
@@ -198,6 +198,18 @@
     (modify-syntax-entry ?- "_" st)
     st))

+(defun css-syntax-propertize-function (start end)
+  (save-excursion
+    (goto-char start)
+    ;; Allow comment starters to appear inside unquoted URIs, such as
+    ;; url(http://www.example.com/*/).
+    (while (re-search-forward
+            "url\([[:space:]]*\\([^\"'\n[:space:]]+\\)[[:space:]]*\)"
+            end t)
+      (add-text-properties
+       (match-beginning 1) (match-end 1)
+       '(syntax-table (2) font-lock-face font-lock-string-face)))))
+
 (defconst css-escapes-re
   "\\\\\\(?:[^\000-\037\177]\\|[0-9a-fA-F]+[ \n\t\r\f]?\\)")
 (defconst css-nmchar-re (concat "\\(?:[-[:alnum:]]\\|" css-escapes-re 
"\\)"))
@@ -381,6 +393,8 @@ pseudo-classes, and at-rules."
   (setq-local comment-start-skip "/\\*+[ \t]*")
   (setq-local comment-end "*/")
   (setq-local comment-end-skip "[ \t]*\\*+/")
+  (setq-local syntax-propertize-function
+              #'css-syntax-propertize-function)
   (setq-local fill-paragraph-function #'css-fill-paragraph)
   (setq-local adaptive-fill-function #'css-adaptive-fill)
   (setq-local add-log-current-defun-function #'css-current-defun-name)
-- 
2.1.4

[-- Attachment #2: Type: text/html, Size: 2823 bytes --]

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* bug#20859: 25.0.50; css-mode: Comments within URIs
  2015-06-20 15:04 bug#20859: 25.0.50; css-mode: Comments within URIs Simen Heggestøyl
  2015-06-21 17:38 ` Simen Heggestøyl
@ 2015-06-22  1:13 ` Stefan Monnier
  2015-08-21 21:08   ` Simen Heggestøyl
  1 sibling, 1 reply; 9+ messages in thread
From: Stefan Monnier @ 2015-06-22  1:13 UTC (permalink / raw)
  To: Simen Heggestøyl; +Cc: 20859

> However, css-mode doesn't currently interpret the stuff between the
> parenthesis in the quote-less form as strings. This means that URIs
> containing /* (or // in scss-mode, which is more common for URIs) will
> be interpreted as comment starters by Emacs.

Indeed.

> I attempted to fix this by setting syntax-propertize-function to the
> following:
>  (syntax-propertize-rules
>     ("url\\(\(\\)[[:space:]]*[[:graph:]]*[[:space:]]*\\(\)\\)"
              ^^                                          ^^
[ Side note: these are the same as the unquoted parens (IOW the
  backslash there are ineffective, which is actually fine because we don't
  need them).  ]

There's also a real problem: how should we treat url(foo)bar) ?

>      (1 "|") (2 "|")))
> This almost solves the problem, but with the quirk that "(" and ")"
> are interpreted as part of the string.

Just like single quotes and double quotes are usually highlighted as
part of the string.  IOW, I think it's OK to highlight the parens as
part of the string, tho only when the argument inside the parens is not
itself wrapped in '...' or "...".

So we should use a regexp that only matches when the contents of the
parens is not quoted.

> Would it be possible to assign string syntax to the stuff between the
> two parenthesis without having any visible string delimiters? I'm
> thankful for any hints for getting closer to a solution!

If you really care about it you could add a font-lock-keyword which
matches "url(...)" and overrides the face on the open and close parens.


        Stefan





^ permalink raw reply	[flat|nested] 9+ messages in thread

* bug#20859: 25.0.50; css-mode: Comments within URIs
  2015-06-21 17:38 ` Simen Heggestøyl
@ 2015-06-22 15:54   ` Stefan Monnier
  0 siblings, 0 replies; 9+ messages in thread
From: Stefan Monnier @ 2015-06-22 15:54 UTC (permalink / raw)
  To: Simen Heggestøyl; +Cc: 20859

> I think I found a solution, by applying the string syntax class to the
> whole URI.

You're not applying the "string syntax class", but the "word syntax-class".

> +(defun css-syntax-propertize-function (start end)
> +  (save-excursion
> +    (goto-char start)
> +    ;; Allow comment starters to appear inside unquoted URIs, such as
> +    ;; url(http://www.example.com/*/).
> +    (while (re-search-forward
> +            "url\([[:space:]]*\\([^\"'\n[:space:]]+\\)[[:space:]]*\)"
                   ^^                                                ^^
Again, these backslashes will be ignored.

The issue of what to do with "foo(bar)baz)" is still open.

> +      (add-text-properties
> +       (match-beginning 1) (match-end 1)
> +       '(syntax-table (2) font-lock-face font-lock-string-face)))))

The downside of this approach is that M-f, M-b will skip over the whole
URL rather than go one word at a time.


        Stefan





^ permalink raw reply	[flat|nested] 9+ messages in thread

* bug#20859: 25.0.50; css-mode: Comments within URIs
  2015-06-22  1:13 ` Stefan Monnier
@ 2015-08-21 21:08   ` Simen Heggestøyl
  2015-08-22  4:04     ` Stefan Monnier
  0 siblings, 1 reply; 9+ messages in thread
From: Simen Heggestøyl @ 2015-08-21 21:08 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 20859

[-- Attachment #1: Type: text/plain, Size: 3212 bytes --]

On Mon, Jun 22, 2015 at 3:13 AM, Stefan Monnier 
<monnier@iro.umontreal.ca> wrote:
> There's also a real problem: how should we treat url(foo)bar) ?

I find the regexp 
"url\\((\\)[[:space:]]*[^\"'\n[:space:]]+[[:space:]]*\\()\\)"
hungry enough to work for all of the following cases:

url(foobar);
url(foo)bar);
url(http://www.example.com/foo.html);
url('http://www.example.com/foo.html');
url("http://www.example.com/foo.html");

> Just like single quotes and double quotes are usually highlighted as
> part of the string.  IOW, I think it's OK to highlight the parens as
> part of the string, tho only when the argument inside the parens is 
> not
> itself wrapped in '...' or "...".
> 
> So we should use a regexp that only matches when the contents of the
> parens is not quoted.
> 
> If you really care about it you could add a font-lock-keyword which
> matches "url(...)" and overrides the face on the open and close 
> parens.

Thanks for the tip, I'll do that.

In sum, the patch now looks like:


 From 67e48674bf42faa20ee8571ecf84c4bdc48148a3 Mon Sep 17 00:00:00 2001
 From: =?UTF-8?q?Simen=20Heggest=C3=B8yl?= <simenheg@gmail.com>
Date: Sat, 20 Jun 2015 17:17:31 +0200
Subject: [PATCH] Handle comments inside unquoted URIs in css-mode

* lisp/textmodes/css-mode.el (css--uri-re): New defconst.
(css-syntax-propertize-function): New function.
(css-mode): Set `syntax-propertize-function'.
---
 lisp/textmodes/css-mode.el | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/lisp/textmodes/css-mode.el b/lisp/textmodes/css-mode.el
index 424cdb7..1383420 100644
--- a/lisp/textmodes/css-mode.el
+++ b/lisp/textmodes/css-mode.el
@@ -198,6 +198,14 @@
     (modify-syntax-entry ?- "_" st)
     st))

+(eval-and-compile
+ (defconst css--uri-re
+ "url\\((\\)[[:space:]]*[^\"'\n[:space:]]+[[:space:]]*\\()\\)"))
+
+(defconst css-syntax-propertize-function
+ (syntax-propertize-rules
+ (css--uri-re (1 "|") (2 "|"))))
+
 (defconst css-escapes-re
   "\\\\\\(?:[^\000-\037\177]\\|[0-9a-fA-F]+[ \n\t\r\f]?\\)")
 (defconst css-nmchar-re (concat "\\(?:[-[:alnum:]]\\|" css-escapes-re 
"\\)"))
@@ -278,7 +286,13 @@
               "\\(?:\\(" css-proprietary-nmstart-re "\\)\\|"
               css-nmstart-re "\\)" css-nmchar-re "*"
               "\\)\\s-*:")
- (1 (if (match-end 2) 'css-proprietary-property 'css-property)))))
+ (1 (if (match-end 2) 'css-proprietary-property 'css-property)))
+ ;; Make sure the parens in a url(...) expression receive the
+ ;; default face. This is done because the parens may sometimes
+ ;; receive generic string delimiter syntax (see
+ ;; `css-syntax-propertize-function').
+ (,css--uri-re
+ (1 'default t) (2 'default t))))

 (defvar css-font-lock-keywords (css--font-lock-keywords))

@@ -381,6 +395,8 @@ pseudo-classes, and at-rules."
   (setq-local comment-start-skip "/\\*+[ \t]*")
   (setq-local comment-end "*/")
   (setq-local comment-end-skip "[ \t]*\\*+/")
+ (setq-local syntax-propertize-function
+ css-syntax-propertize-function)
   (setq-local fill-paragraph-function #'css-fill-paragraph)
   (setq-local adaptive-fill-function #'css-adaptive-fill)
   (setq-local add-log-current-defun-function #'css-current-defun-name)
-- 
2.5.0



[-- Attachment #2: Type: text/html, Size: 4617 bytes --]

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* bug#20859: 25.0.50; css-mode: Comments within URIs
  2015-08-21 21:08   ` Simen Heggestøyl
@ 2015-08-22  4:04     ` Stefan Monnier
  2015-08-22  8:19       ` Simen Heggestøyl
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Monnier @ 2015-08-22  4:04 UTC (permalink / raw)
  To: Simen Heggestøyl; +Cc: 20859

>> There's also a real problem: how should we treat url(foo)bar) ?

> I find the regexp
> "url\\((\\)[[:space:]]*[^\"'\n[:space:]]+[[:space:]]*\\()\\)"
> hungry enough to work for all of the following cases:

> url(foobar);
> url(foo)bar);

The question was not whether the regexp works, but what does the CSS
language want us to do.  I.e. is "url(foo)bar)" equivalent to
"url('foo)bar')" or to "url('foo')bar)"?


        Stefan





^ permalink raw reply	[flat|nested] 9+ messages in thread

* bug#20859: 25.0.50; css-mode: Comments within URIs
  2015-08-22  4:04     ` Stefan Monnier
@ 2015-08-22  8:19       ` Simen Heggestøyl
  2015-08-22 14:47         ` Stefan Monnier
  0 siblings, 1 reply; 9+ messages in thread
From: Simen Heggestøyl @ 2015-08-22  8:19 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 20859

[-- Attachment #1: Type: text/plain, Size: 804 bytes --]

On Sat, Aug 22, 2015 at 6:04 AM, Stefan Monnier 
<monnier@iro.umontreal.ca> wrote:
> The question was not whether the regexp works, but what does the CSS
> language want us to do.  I.e. is "url(foo)bar)" equivalent to
> "url('foo)bar')" or to "url('foo')bar)"?

Ah, good that you asked, because the CSS spec states the following:

  Some characters appearing in an unquoted URI, such as parentheses,
  white space characters, single quotes (') and double quotes ("),
  must be escaped with a backslash so that the resulting URI value is
  a URI token: '\(', '\)'.

So in "url(foo)bar)", the URI ends at the first ")", while in
"url(foo\)bar)", it ends at the last ")".

The following regexp seems to handle this:

"url\\((\\)[[:space:]]*\\(?:\\\\.\\|[^()[:space:]\n'\"]\\)+[[:space:]]*\\()\\)"

-- Simen

[-- Attachment #2: Type: text/html, Size: 1112 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* bug#20859: 25.0.50; css-mode: Comments within URIs
  2015-08-22  8:19       ` Simen Heggestøyl
@ 2015-08-22 14:47         ` Stefan Monnier
  2015-08-22 17:21           ` Simen Heggestøyl
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Monnier @ 2015-08-22 14:47 UTC (permalink / raw)
  To: Simen Heggestøyl; +Cc: 20859

> The following regexp seems to handle this:

> "url\\((\\)[[:space:]]*\\(?:\\\\.\\|[^()[:space:]\n'\"]\\)+[[:space:]]*\\()\\)"

Good.  With this cleared, I think the patch is ready to go.  Feel free
to install it, thanks,


        Stefan





^ permalink raw reply	[flat|nested] 9+ messages in thread

* bug#20859: 25.0.50; css-mode: Comments within URIs
  2015-08-22 14:47         ` Stefan Monnier
@ 2015-08-22 17:21           ` Simen Heggestøyl
  0 siblings, 0 replies; 9+ messages in thread
From: Simen Heggestøyl @ 2015-08-22 17:21 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 20859-done

[-- Attachment #1: Type: text/plain, Size: 218 bytes --]

On Sat, Aug 22, 2015 at 4:47 PM, Stefan Monnier 
<monnier@iro.umontreal.ca> wrote:
> Good.  With this cleared, I think the patch is ready to go.  Feel free
> to install it, thanks,

Thanks Stefan. Installed!

-- Simen

[-- Attachment #2: Type: text/html, Size: 355 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2015-08-22 17:21 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-20 15:04 bug#20859: 25.0.50; css-mode: Comments within URIs Simen Heggestøyl
2015-06-21 17:38 ` Simen Heggestøyl
2015-06-22 15:54   ` Stefan Monnier
2015-06-22  1:13 ` Stefan Monnier
2015-08-21 21:08   ` Simen Heggestøyl
2015-08-22  4:04     ` Stefan Monnier
2015-08-22  8:19       ` Simen Heggestøyl
2015-08-22 14:47         ` Stefan Monnier
2015-08-22 17:21           ` Simen Heggestøyl

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).