unofficial mirror of emacs-devel@gnu.org 
* Scheme Mode and Regular Expression Literals
@ 2024-02-27 14:46 Jakub T. Jankiewicz
  0 siblings, 0 replies; 12+ messages in thread
From: Jakub T. Jankiewicz @ 2024-02-27 14:46 UTC (permalink / raw)
  To: emacs-devel

Hi,

I'm working on a Scheme interpreter written in JavaScript called LIPS
Scheme. It has the same syntax for [regular expressions as Gauche][1],
but with more flags, because they are just JavaScript RegExp objects.

Is there a way to add support for regular expressions in the native scheme
mode without modification of the code? Something I can do from my .emacs file?

Regexes in LIPS and Gauche look like this: `#/[a-b]+/`. This causes a lot of
problems inside Emacs. Sometimes you need to add a comment with extra
characters after the regex to make syntax highlighting and indentation work.

Example:


    (let ((x #/foo|bar/)) ;; |))
      (write x))

Emacs treats the vertical bar as the start of Scheme's extended symbol syntax
(`|...|`), so you need to add a comment that closes the symbol and the lists;
otherwise the code will not be displayed or evaluated properly.
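
A minimal sketch of why this happens, assuming a syntax table in which `|` is a plain string-quote character (scheme mode's real entry may carry extra flags, and `my/parse-summary` is a hypothetical helper name). It reports whether the parser still believes it is inside a `|...|` construct at the end of the text, together with the paren depth there:

```elisp
(defun my/parse-summary (text)
  "Return (IN-STRING DEPTH) as seen by the parser at the end of TEXT.
Assumption: `|' is parsed as a string-quote character, roughly
mimicking how scheme mode treats the |...| symbol syntax."
  (with-temp-buffer
    (let ((st (make-syntax-table)))
      (modify-syntax-entry ?| "\"" st)
      (set-syntax-table st)
      (insert text)
      (let ((state (syntax-ppss (point-max))))
        ;; (nth 3 state) is non-nil inside a string,
        ;; (nth 0 state) is the current paren depth.
        (list (and (nth 3 state) t) (nth 0 state))))))

;; The bar in the regex opens a "string" that never closes.
(my/parse-summary "(let ((x #/foo|bar/))\n  (write x))")
;; → (t 3)

;; With the comment workaround, the second bar closes it again
;; and the extra parens in the comment rebalance the lists.
(my/parse-summary "(let ((x #/foo|bar/)) ;; |))\n  (write x))")
;; → (nil 0)
```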

And if you want to send this to the Scheme REPL and have proper highlighting,
you need something like this:

    (let ((x #/foo|bar/)) ;; |))
      (write x)
      (display " ;; |")
      (newline))

It would be cool if scheme mode could be configured to allow such syntax.

[1]:
https://practical-scheme.net/gauche/man/gauche-refe/Regular-expressions.html

Jakub

--
Jakub T. Jankiewicz, Senior Front-End Developer
https://jcubic.pl/me
https://lips.js.org
https://koduj.org




* Re: Scheme Mode and Regular Expression Literals
@ 2024-03-09  2:59 Toshi Umehara
  2024-03-09 13:37 ` Jakub T. Jankiewicz
  2024-03-14  8:40 ` Eli Zaretskii
  0 siblings, 2 replies; 12+ messages in thread
From: Toshi Umehara @ 2024-03-09  2:59 UTC (permalink / raw)
  To: jcubic; +Cc: emacs-devel


Regular expression literals, as found in Gauche Scheme, do not seem to
work in scheme mode: #/regexp/ is not treated as a single unit. I looked for a
solution on the Web and found this page,
https://ardggy.hatenablog.jp/entry/2015/11/24/143713 (in Japanese).  The
solution on that page overrides syntax-propertize-function and adds a
new rule for regular expression literals. Taking a hint from it, I have
tried to make sure that an escaped slash ( \/ ) does not end the literal and
that an even number of backslashes (like \\\\ ) does not act as an escape.

Currently, the following code in init.el works for me.

#+BEGIN_SRC
(add-hook 'scheme-mode-hook
          (lambda ()
            (setq-local
             syntax-propertize-function
             (lambda (start end)
               (goto-char start)
               (funcall
                (syntax-propertize-rules
                 ;; For #/regexp/ syntax
                 ("\\(#\\)/\\(\\\\/\\|\\\\\\\\\\|.\\)*?\\(/\\)"
                  (1 "|")
                  (3 "|"))
                 ;; For #; comment syntax
                 ("\\(#\\);"
                  (1 (prog1 "< cn"
                       (scheme-syntax-propertize-sexp-comment
                        (point) end)))))
                (point) end))
             ))
          )
#+END_SRC

Hope this helps.

--
Toshi (Toshihiro Umehara)




* Re: Scheme Mode and Regular Expression Literals
  2024-03-09  2:59 Scheme Mode and Regular Expression Literals Toshi Umehara
@ 2024-03-09 13:37 ` Jakub T. Jankiewicz
  2024-03-14  8:40 ` Eli Zaretskii
  1 sibling, 0 replies; 12+ messages in thread
From: Jakub T. Jankiewicz @ 2024-03-09 13:37 UTC (permalink / raw)
  To: emacs-devel

This is perfect, thank you.

I also just noticed (while reading the docs) that GNU Kawa also has the same
syntax for regular expressions.


--
Jakub T. Jankiewicz, Senior Front-End Developer
https://jcubic.pl/me
https://lips.js.org
https://koduj.org




* Re: Scheme Mode and Regular Expression Literals
  2024-03-09  2:59 Scheme Mode and Regular Expression Literals Toshi Umehara
  2024-03-09 13:37 ` Jakub T. Jankiewicz
@ 2024-03-14  8:40 ` Eli Zaretskii
  2024-03-14 11:38   ` Mattias Engdegård
  1 sibling, 1 reply; 12+ messages in thread
From: Eli Zaretskii @ 2024-03-14  8:40 UTC (permalink / raw)
  To: Toshi Umehara, Stefan Monnier; +Cc: jcubic, emacs-devel

Thanks.

Stefan, is the below the right fix?  I'm confused by the call to
scheme-syntax-propertize-sexp-comment that scheme-syntax-propertize
does at its very beginning -- what is it for?

diff --git a/lisp/progmodes/scheme.el b/lisp/progmodes/scheme.el
index 67abab6..a83e5dc 100644
--- a/lisp/progmodes/scheme.el
+++ b/lisp/progmodes/scheme.el
@@ -412,6 +412,11 @@ scheme-syntax-propertize
   (scheme-syntax-propertize-sexp-comment (point) end)
   (funcall
    (syntax-propertize-rules
+    ;; For #/regexp/ syntax
+    ("\\(#\\)/\\(\\\\/\\|\\\\\\\\\\|.\\)*?\\(/\\)"
+     (1 "|")
+     (3 "|"))
+    ;; For #; comment syntax
     ("\\(#\\);" (1 (prog1 "< cn"
                      (scheme-syntax-propertize-sexp-comment (point) end)))))
    (point) end))




* Re: Scheme Mode and Regular Expression Literals
  2024-03-14  8:40 ` Eli Zaretskii
@ 2024-03-14 11:38   ` Mattias Engdegård
  2024-03-14 13:34     ` Stefan Monnier
  0 siblings, 1 reply; 12+ messages in thread
From: Mattias Engdegård @ 2024-03-14 11:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Toshi Umehara, Stefan Monnier, jcubic, emacs-devel

>    (syntax-propertize-rules
> +    ;; For #/regexp/ syntax
> +    ("\\(#\\)/\\(\\\\/\\|\\\\\\\\\\|.\\)*?\\(/\\)"
> +     (1 "|")
> +     (3 "|"))

That amount of leaning toothpicks confuses even someone used to reading Elisp regexps so I'm switching notation here (`syntax-propertize-rules` permits regexps in rx). The above means:

  ((rx (group "#") "/"
       (*? (group (| "\\/" "\\\\" nonl)))
       (group "/"))
   (1 "|")
   (3 "|"))

This is a tad too ambiguous; it will match "#/ab\/", for instance, which probably wasn't intended.
What about the more robust

  ((rx (group "#") "/"
       (* (| (: "\\" nonl)
             (not (in "\n/\\"))))
       (group "/"))
   (1 "|")
   (2 "|"))

instead?

The Gauche documentation isn't entirely clear on how literal newlines are lexed inside this construct.
If newlines can be part of regexp literals, remove the \n from the third line.
If backslashed newlines are allowed, change `nonl` to `anychar`.
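
The difference between the two patterns can be checked directly on strings. This is only a sketch: the variable names are hypothetical, the groups are dropped, and `string-match` stands in for the buffer search that `syntax-propertize-rules` performs.

```elisp
(require 'rx)

(defconst my/lazy-regexp
  (rx "#/" (*? (| "\\/" "\\\\" nonl)) "/")
  "The original pattern, without its groups.")

(defconst my/strict-regexp
  (rx "#/" (* (| (: "\\" nonl) (not (in "\n/\\")))) "/")
  "The stricter pattern, without its groups.")

;; Unterminated literal whose only slash is escaped: #/ab\/
(string-match my/lazy-regexp "#/ab\\/")    ; matches, taking \/ as the end
(string-match my/strict-regexp "#/ab\\/")  ; no match

;; Well-formed literal with an escaped slash inside: #/a\/b/
(string-match my/strict-regexp "#/a\\/b/ rest")
(match-string 0 "#/a\\/b/ rest")           ; the whole literal #/a\/b/
```

The lazy pattern backtracks into its `nonl` branch, which lets a lone backslash be consumed on its own and the escaped slash be taken as the terminator. The strict pattern has no branch that can consume a backslash without the character after it.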





* Re: Scheme Mode and Regular Expression Literals
  2024-03-14 11:38   ` Mattias Engdegård
@ 2024-03-14 13:34     ` Stefan Monnier
  2024-03-14 15:09       ` Jakub T. Jankiewicz
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Monnier @ 2024-03-14 13:34 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: Eli Zaretskii, Toshi Umehara, jcubic, emacs-devel

>   ((rx (group "#") "/"
>        (* (| (: "\\" nonl)
>              (not (in "\n/\\"))))
>        (group "/"))
>    (1 "|")
>    (2 "|"))
>
> instead?
>
> The Gauche documentation isn't entirely clear on how literal newlines are lexed inside this construct.
> If newlines can be part of regexp literals, remove the \n from the third line.
> If backslashed newlines are allowed, change `nonl` to `anychar`.

If they can span several lines then matching the whole #/.../ with
a regexp is definitely the wrong approach.

If not, that rule can be made to work but it has to check that the
leading #/ is not inside a string or comment.

The more robust approach is to match only #/ here and then use
a function which checks the resulting `syntax-ppss` to handle the rest
of the regexp.

See `octave-syntax-propertize-function` for an example of the structure
where we use the regexp "\\(?:^\\|[[({,; ]\\)\\('\\)" to match the
beginning of a quoted string, and then use
`octave-syntax-propertize-sqs` to process its contents (and use that as
well at the beginning of the function in case `syntax-propertize` is
called with start in the middle of such a string (or in our case,
regexp)).
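
A minimal sketch of that structure, under stated assumptions: the `my/` names are hypothetical, the rule does not yet check that `#/` is outside strings and comments, and it runs in a temporary buffer with the default syntax table rather than in scheme mode. The rule matches only the opening `#/` and marks `#` as a generic string fence; a helper then consults `syntax-ppss` to find the closing slash, and is also called first in case propertization starts in the middle of a literal.

```elisp
(defun my/regexp-literal-end (end)
  "If point is inside a #/.../ literal, mark its closing slash before END."
  (let ((state (syntax-ppss)))
    (when (and (nth 3 state)                        ; inside a string-like span
               (eq ?# (char-after (nth 8 state))))  ; that was opened by `#'
      (let (found)
        (while (and (not found) (search-forward "/" end 'move))
          ;; A slash preceded by an odd number of backslashes is escaped.
          (unless (eq -1 (% (save-excursion
                              (backward-char)
                              (skip-chars-backward "\\\\"))
                            2))
            (put-text-property (1- (point)) (point)
                               'syntax-table (string-to-syntax "|"))
            (setq found t)))))))

(defun my/syntax-propertize (beg end)
  "Hypothetical `syntax-propertize-function' handling #/.../ literals."
  (goto-char beg)
  ;; BEG may fall in the middle of a literal that started earlier.
  (my/regexp-literal-end end)
  (funcall
   (syntax-propertize-rules
    ("\\(#\\)/" (1 (prog1 "|" (my/regexp-literal-end end)))))
   (point) end))

(defun my/demo ()
  "Return the string state inside and after the literal in a sample text."
  (with-temp-buffer
    (insert "(x #/a|b\\/c/ y)")
    (setq-local syntax-propertize-function #'my/syntax-propertize)
    (syntax-propertize (point-max))
    (list (nth 3 (syntax-ppss 8))      ; position inside the literal
          (nth 3 (syntax-ppss 14)))))  ; position after the literal
```

Because the helper looks for the closing slash itself instead of relying on one regexp to match the whole literal, the same code keeps working when the literal spans several lines or when propertization resumes partway through it.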


        Stefan





* Re: Scheme Mode and Regular Expression Literals
  2024-03-14 13:34     ` Stefan Monnier
@ 2024-03-14 15:09       ` Jakub T. Jankiewicz
  0 siblings, 0 replies; 12+ messages in thread
From: Jakub T. Jankiewicz @ 2024-03-14 15:09 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Mattias Engdegård, Eli Zaretskii, Toshi Umehara, emacs-devel



On Thu, 14 Mar 2024 09:34:10 -0400
Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> If they can span several lines then matching the whole #/.../ with
> a regexp is definitely the wrong approach.

Just checked, Kawa and Gauche both accept newline as part of the expression,
the same as strings.


--
Jakub T. Jankiewicz, Senior Front-End Developer
https://jcubic.pl/me
https://lips.js.org
https://koduj.org




* Re: Scheme Mode and Regular Expression Literals
@ 2024-03-17  0:28 Toshi Umehara
  2024-03-17  2:02 ` Stefan Monnier
  0 siblings, 1 reply; 12+ messages in thread
From: Toshi Umehara @ 2024-03-17  0:28 UTC (permalink / raw)
  To: monnier; +Cc: Eli Zaretskii, jcubic, emacs-devel



After reading your suggestions, I've created new functions to deal with
regular expression syntax. The approach consists of two procedures.

1. scheme-syntax-propertize-regexp-1 detects the start of a regular
expression (#/). If it finds one, it goes on to look for the
corresponding end.

2. scheme-syntax-propertize-regexp-2 detects the end of a regular expression
(/) that is outside comments but inside a string that starts with #. This
second procedure was introduced to handle regular expressions
written across multiple lines.

The following code can be put in init.el, and a patch for
lisp/progmodes/scheme.el is attached. I hope this is useful, thanks.


#+BEGIN_SRC
(add-hook
 'scheme-mode-hook
 (lambda ()
   (setq-local
    syntax-propertize-function
    (lambda (beg end)
      (goto-char beg)
      (scheme-syntax-propertize-sexp-comment (point) end)
      (funcall
       (syntax-propertize-rules
        ("\\(#\\);" (1 (prog1 "< cn"
                         (scheme-syntax-propertize-sexp-comment
                          (point) end))))
        )
       (point) end)
      ;; For regular expression literals
      (scheme-syntax-propertize-regexp-1 end)
      (scheme-syntax-propertize-regexp-2 end)
      ))))

(defun scheme-match-regexp-start (limit)
  (re-search-forward
   (rx
    (or
     bol
     space
     (in "[('")
     )
    (group "#")
    "/"
    )
   limit
   t
   )
  )

(defun scheme-match-regexp-end (limit)
  (re-search-forward
   (rx
     (group "/")
     )
   limit
   t
   )
  )

(defun scheme-syntax-propertize-regexp-1 (end)
  (while (scheme-match-regexp-start end)
    (let* ((state (save-excursion
                    (syntax-ppss (match-beginning 1))))
           (within-str (nth 3 state))
           (within-comm (nth 4 state)))
      (if (and (not within-comm) (not within-str))
          (progn
            (put-text-property
             (match-beginning 1)
             (1+ (match-beginning 1))
             'syntax-table (string-to-syntax "|"))
            (let ((end-found nil))
              (while
                  (and
                   (not end-found)
                   (scheme-match-regexp-end end))
                (if
                    (not (char-equal
                          (char-before (match-beginning 1))
                          ?\\ ))
                    (progn
                      (put-text-property
                       (match-beginning 1)
                       (1+ (match-beginning 1))
                       'syntax-table (string-to-syntax "|"))
                      (setq end-found t)
                      )))))))))

(defun scheme-syntax-propertize-regexp-2 (end)
  (let ((end-found nil))
    (while (scheme-match-regexp-end end)
      (let* ((state (save-excursion
                      (syntax-ppss (match-beginning 1))))
             (within-str (nth 3 state))
             (within-comm (nth 4 state))
             (start-delim-pos (nth 8 state)))
        (if (and (not within-comm)
                 within-str
                 (string=
                  (buffer-substring-no-properties
                   start-delim-pos
                   (1+ start-delim-pos))
                  "#")
                 (not (char-equal
                       (char-before (match-beginning 1))
                       ?\\ )))
            (progn
                    (put-text-property
                     (match-beginning 1)
                     (1+ (match-beginning 1))
                     'syntax-table (string-to-syntax "|"))
                    (setq end-found t)
                    ))))))
#+END_SRC


[-- Attachment #2: Enable dealing with regular expression literal --]
[-- Type: text/x-patch, Size: 3271 bytes --]

diff --git a/lisp/progmodes/scheme.el b/lisp/progmodes/scheme.el
index 67abab6913d..d1980463859 100644
--- a/lisp/progmodes/scheme.el
+++ b/lisp/progmodes/scheme.el
@@ -414,7 +414,10 @@ scheme-syntax-propertize
    (syntax-propertize-rules
     ("\\(#\\);" (1 (prog1 "< cn"
                      (scheme-syntax-propertize-sexp-comment (point) end)))))
-   (point) end))
+   (point) end)
+  (scheme-syntax-propertize-regexp-1 end)
+  (scheme-syntax-propertize-regexp-2 end)
+  )
 
 (defun scheme-syntax-propertize-sexp-comment (_ end)
   (let ((state (syntax-ppss)))
@@ -430,6 +433,87 @@ scheme-syntax-propertize-sexp-comment
                                'syntax-table (string-to-syntax "> cn")))
         (scan-error (goto-char end))))))
 
+(defun scheme-match-regexp-start (limit)
+  (re-search-forward
+   (rx
+    (or
+     bol
+     space
+     (in "[('")
+     )
+    (group "#")
+    "/"
+    )
+   limit
+   t
+   )
+  )
+
+(defun scheme-match-regexp-end (limit)
+  (re-search-forward
+   (rx
+     (group "/")
+     )
+   limit
+   t
+   )
+  )
+
+(defun scheme-syntax-propertize-regexp-1 (end)
+  (while (scheme-match-regexp-start end)
+    (let* ((state (save-excursion
+                    (syntax-ppss (match-beginning 1))))
+           (within-str (nth 3 state))
+           (within-comm (nth 4 state)))
+      (if (and (not within-comm) (not within-str))
+          (progn
+            (put-text-property
+             (match-beginning 1)
+             (1+ (match-beginning 1))
+             'syntax-table (string-to-syntax "|"))
+            (let ((end-found nil))
+              (while
+                  (and
+                   (not end-found)
+                   (scheme-match-regexp-end end))
+                (if
+                    (not (char-equal
+                          (char-before (match-beginning 1))
+                          ?\\ ))
+                    (progn
+                      (put-text-property
+                       (match-beginning 1)
+                       (1+ (match-beginning 1))
+                       'syntax-table (string-to-syntax "|"))
+                      (setq end-found t)
+                      )))))))))
+
+(defun scheme-syntax-propertize-regexp-2 (end)
+  (let ((end-found nil))
+    (while (scheme-match-regexp-end end)
+      (let* ((state (save-excursion
+                      (syntax-ppss (match-beginning 1))))
+             (within-str (nth 3 state))
+             (within-comm (nth 4 state))
+             (start-delim-pos (nth 8 state)))
+        (if (and (not within-comm)
+                 within-str
+                 (string=
+                  (buffer-substring-no-properties
+                   start-delim-pos
+                   (1+ start-delim-pos))
+                  "#")
+                 (not (char-equal
+                       (char-before (match-beginning 1))
+                       ?\\ )))
+            (progn
+                    (put-text-property
+                     (match-beginning 1)
+                     (1+ (match-beginning 1))
+                     'syntax-table (string-to-syntax "|"))
+                    (setq end-found t)
+                    ))))))
+
 ;;;###autoload
 (define-derived-mode dsssl-mode scheme-mode "DSSSL"
   "Major mode for editing DSSSL code.



-- 
Toshi (Toshihiro Umehara)


* Re: Scheme Mode and Regular Expression Literals
  2024-03-17  0:28 Toshi Umehara
@ 2024-03-17  2:02 ` Stefan Monnier
  0 siblings, 0 replies; 12+ messages in thread
From: Stefan Monnier @ 2024-03-17  2:02 UTC (permalink / raw)
  To: Toshi Umehara; +Cc: Eli Zaretskii, jcubic, emacs-devel

>    (setq-local
>     syntax-propertize-function
>     (lambda (beg end)
>       (goto-char beg)
>       (scheme-syntax-propertize-sexp-comment (point) end)
>       (funcall
>        (syntax-propertize-rules
>         ("\\(#\\);" (1 (prog1 "< cn"
>                          (scheme-syntax-propertize-sexp-comment
>                           (point) end))))
>         )
>        (point) end)
>       ;; For regular expression literals
>       (scheme-syntax-propertize-regexp-1 end)
>       (scheme-syntax-propertize-regexp-2 end)
>       ))))

Does this work for you?
The "funcall" in there is expected to scan through (point)...end and
move point accordingly, so once you call
`scheme-syntax-propertize-regexp-1` you're already "too far".

Instead you need to turn

      (syntax-propertize-rules
       ("\\(#\\);" (1 (prog1 "< cn"
                        (scheme-syntax-propertize-sexp-comment
                         (point) end)))))

into something like:

      (syntax-propertize-rules
       ("\\(#\\);" (1 (prog1 "< cn"
                        (scheme-syntax-propertize-sexp-comment
                         (point) end))))
       ("\\(#\\)/" (1 (prog1 "|"
                        (scheme-syntax-propertize-sexp-comment
                         (point) end))))

and then extend `scheme-syntax-propertize-sexp-comment` so it handles
the case when point is inside a regexp (based on `syntax-ppss`, just
like it currently does for the case where point is inside a sexp-comment).
[ And probably rename it while you're at it since it will not be only
  for "sexp comment" any more.  ]


        Stefan





* Re: Scheme Mode and Regular Expression Literals
@ 2024-03-19  3:06 Toshi Umehara
  2024-03-19 13:36 ` Stefan Monnier
  0 siblings, 1 reply; 12+ messages in thread
From: Toshi Umehara @ 2024-03-19  3:06 UTC (permalink / raw)
  To: monnier; +Cc: Eli Zaretskii, jcubic, emacs-devel



Thank you very much for your comments, Stefan. As you pointed out, the
previous code fails when regular expressions and sexp comments appear on
the same line.

I have now put the logic that scans for the beginning of sexp comments and
regular expressions into the syntax-propertize-rules macro. The two rules
invoke different functions, because the beginning of a regular expression
needs to be canceled when it is already inside a normal string or
comment. (This kind of logic might also be needed for sexp comments,
but the current implementation does not do that and it does not seem to
cause any harm.)

If I used the same function for both, it would need extra conditional
code just for regular expressions, repeating what is already done
at the syntax-propertize-rules level. (At that level, we already know
whether we are at the beginning of a sexp comment or at a possible
beginning of a regular expression.)

The following explains the functions implemented.

- scheme-syntax-propertize-sexp-comment

  No change from the current built-in implementation.

- scheme-syntax-propertize-regexp-end

  If the position is already inside a regular expression and not in a
  comment, it searches for the end of the regular expression (/),
  ignoring backslash-escaped slashes (\/).


- scheme-syntax-propertize-regexp

  This is invoked from syntax-propertize-rules. It cancels the syntax
  class assignment to the # of #/ if the # is already inside a string or
  comment. (Precisely, it assigns the @ syntax class to # .) Otherwise it
  goes on to scan for the end of the regular expression by calling
  scheme-syntax-propertize-regexp-end.

Thanks.


#+BEGIN_SRC
(add-hook
 'scheme-mode-hook
 (lambda ()
   (setq-local
    syntax-propertize-function
    (lambda (beg end)
      (goto-char beg)
      (scheme-syntax-propertize-sexp-comment (point) end)
      (scheme-syntax-propertize-regexp-end (point) end)
      (funcall
       (syntax-propertize-rules
        ("\\(#\\);" (1 (prog1 "< cn"
                         (scheme-syntax-propertize-sexp-comment
                          (point) end))))
        ("\\(#\\)/" (1 (prog1 "|"
                         (scheme-syntax-propertize-regexp
                          (point) end))))
         )
       (point) end)
      ))))

(defun scheme-syntax-propertize-sexp-comment (_ end)
  (let ((state (syntax-ppss)))
    (when (eq 2 (nth 7 state))
      ;; It's a sexp-comment.  Tell parse-partial-sexp where it ends.
      (condition-case nil
          (progn
            (goto-char (+ 2 (nth 8 state)))
            ;; FIXME: this doesn't handle the case where the sexp
            ;; itself contains a #; comment.
            (forward-sexp 1)
            (put-text-property (1- (point)) (point)
                               'syntax-table (string-to-syntax "> cn")))
        (scan-error (goto-char end))))))

(defun scheme-syntax-propertize-regexp-end (_ end)
  (let* ((state (syntax-ppss))
         (within-str (nth 3 state))
         (within-comm (nth 4 state))
         (start-delim-pos (nth 8 state)))
    (if (and (not within-comm)
             (and within-str
                  (string=
                   (buffer-substring-no-properties
                    start-delim-pos
                    (1+ start-delim-pos))
                   "#")))
        (let ((end-found nil))
          (while (and
                  (not end-found)
                  (re-search-forward "\\(/\\)" end t))
            (progn
              (if
                  (not (char-equal
                        (char-before (match-beginning 1))
                        ?\\ ))
                  (progn
                    (put-text-property
                     (match-beginning 1)
                     (1+ (match-beginning 1))
                     'syntax-table (string-to-syntax "|"))
                    (setq end-found t)
                    )))))
      )))

(defun scheme-syntax-propertize-regexp (_ end)
  (let* ((match-start-state (save-excursion
                              (syntax-ppss (match-beginning 1))))
         (within-str (nth 3 match-start-state))
         (within-comm (nth 4 match-start-state)))
    (if (or within-str within-comm)
          (put-text-property ;; Cancel regular expression start
           (match-beginning 1)
           (1+ (match-beginning 1))
           'syntax-table (string-to-syntax "@"))
      (scheme-syntax-propertize-regexp-end _ end)
      )))

#+END_SRC


[-- Attachment #2: deals with scheme regular expression syntax --]
[-- Type: text/x-patch, Size: 2791 bytes --]

diff --git a/lisp/progmodes/scheme.el b/lisp/progmodes/scheme.el
index 67abab6913d..98405513099 100644
--- a/lisp/progmodes/scheme.el
+++ b/lisp/progmodes/scheme.el
@@ -410,11 +410,18 @@ scheme-sexp-comment-syntax-table
 (defun scheme-syntax-propertize (beg end)
   (goto-char beg)
   (scheme-syntax-propertize-sexp-comment (point) end)
+  (scheme-syntax-propertize-regexp-end (point) end)
   (funcall
    (syntax-propertize-rules
     ("\\(#\\);" (1 (prog1 "< cn"
-                     (scheme-syntax-propertize-sexp-comment (point) end)))))
-   (point) end))
+                     (scheme-syntax-propertize-sexp-comment
+                      (point) end))))
+    ("\\(#\\)/" (1 (prog1 "|"
+                     (scheme-syntax-propertize-regexp
+                      (point) end))))
+    )
+   (point) end)
+  )
 
 (defun scheme-syntax-propertize-sexp-comment (_ end)
   (let ((state (syntax-ppss)))
@@ -430,6 +437,49 @@ scheme-syntax-propertize-sexp-comment
                                'syntax-table (string-to-syntax "> cn")))
         (scan-error (goto-char end))))))
 
+(defun scheme-syntax-propertize-regexp-end (_ end)
+  (let* ((state (syntax-ppss))
+         (within-str (nth 3 state))
+         (within-comm (nth 4 state))
+         (start-delim-pos (nth 8 state)))
+    (if (and (not within-comm)
+             (and within-str
+                  (string=
+                   (buffer-substring-no-properties
+                    start-delim-pos
+                    (1+ start-delim-pos))
+                   "#")))
+        (let ((end-found nil))
+          (while (and
+                  (not end-found)
+                  (re-search-forward "\\(/\\)" end t))
+            (progn
+              (if
+                  (not (char-equal
+                        (char-before (match-beginning 1))
+                        ?\\ ))
+                  (progn
+                    (put-text-property
+                     (match-beginning 1)
+                     (1+ (match-beginning 1))
+                     'syntax-table (string-to-syntax "|"))
+                    (setq end-found t)
+                    )))))
+      )))
+
+(defun scheme-syntax-propertize-regexp (_ end)
+  (let* ((match-start-state (save-excursion
+                              (syntax-ppss (match-beginning 1))))
+         (within-str (nth 3 match-start-state))
+         (within-comm (nth 4 match-start-state)))
+    (if (or within-str within-comm)
+          (put-text-property ;; Cancel regular expression start
+           (match-beginning 1)
+           (1+ (match-beginning 1))
+           'syntax-table (string-to-syntax "@"))
+      (scheme-syntax-propertize-regexp-end _ end)
+      )))
+
 ;;;###autoload
 (define-derived-mode dsssl-mode scheme-mode "DSSSL"
   "Major mode for editing DSSSL code.




-- 
Toshi (Toshihiro Umehara)


* Re: Scheme Mode and Regular Expression Literals
  2024-03-19  3:06 Toshi Umehara
@ 2024-03-19 13:36 ` Stefan Monnier
  2024-03-23  2:45   ` Toshi Umehara
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Monnier @ 2024-03-19 13:36 UTC (permalink / raw)
  To: Toshi Umehara; +Cc: Eli Zaretskii, jcubic, emacs-devel

> Now I put the logic to scan the beginning of sexp comments and regular
> expressions in syntax-propertize-rules macro. I use different functions
> to be invoked by them, because the beginning of regular expressions
> need to be canceled when they are already in normal strings or
> comments.

We usually don't bother doing that because it affects only navigation
*within* those strings/comments :-)

The code looks pretty good now.  Can you turn it into a patch against
`scheme.el`?

In the mean time, see my comments below.

> (defun scheme-syntax-propertize-regexp-end (_ end)
>   (let* ((state (syntax-ppss))
>          (within-str (nth 3 state))
>          (within-comm (nth 4 state))
>          (start-delim-pos (nth 8 state)))
>     (if (and (not within-comm)
>              (and within-str
>                   (string=
>                    (buffer-substring-no-properties
>                     start-delim-pos
>                     (1+ start-delim-pos))
>                    "#")))

(eq ?# (char-after start-delim-pos)) would be simpler and more efficient.
Also, `within-str` and `within-comm` are mutually exclusive, so the
`(and (not within-comm)` is redundant.

>         (let ((end-found nil))
>           (while (and
>                   (not end-found)
>                   (re-search-forward "\\(/\\)" end t))

You don't need the \\( \\), you can just use (match-beginning 0)
instead, which will let Emacs use the simpler non-regexp
search algorithm.

>             (progn
>               (if
>                   (not (char-equal
>                         (char-before (match-beginning 1))
>                         ?\\ ))

This fails for #/foo\\/
In sh-script.el I used

    (eq -1 (% (save-excursion (skip-chars-backward "\\\\")) 2))

At other places we let the regexp matcher skip those by using a hideous
regexp such as

    "[^\\]\\(?:\\\\\\\\\\)*/"

(but this regexp fails to match the closing / if it's right at the
starting position, so you have to use a workaround such as doing
a (forward-char -1) before searching).
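
A small sketch of the backslash-parity test from sh-script.el quoted above. `skip-chars-backward` returns the (zero or negative) distance it moved, and Elisp's `%` keeps the sign of the dividend, so the result is -1 exactly when an odd number of backslashes precedes point. The helper name is hypothetical, and it assumes the candidate `/` is the last character of the text, so it first places point just before that slash:

```elisp
(defun my/final-slash-escaped-p (text)
  "Return non-nil if the last character of TEXT is backslash-escaped.
That is, if it is preceded by an odd number of backslashes."
  (with-temp-buffer
    (insert text)
    ;; Put point just before the last character.
    (goto-char (1- (point-max)))
    (eq -1 (% (skip-chars-backward "\\\\") 2))))

(my/final-slash-escaped-p "#/foo/")      ; no backslash: nil
(my/final-slash-escaped-p "#/foo\\/")    ; #/foo\/  : t, the slash is escaped
(my/final-slash-escaped-p "#/foo\\\\/")  ; #/foo\\/ : nil, the literal ends here
```

A plain `char-before` check cannot tell the last two cases apart, since both have a backslash immediately before the slash.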

>                   (progn
>                     (put-text-property
>                      (match-beginning 1)
>                      (1+ (match-beginning 1))

Aka (match-end 1).

>                      'syntax-table (string-to-syntax "|"))
>                     (setq end-found t)
>                     )))))
>       )))

You can avoid the `end-found` thingy with

    (while (and (re-search-forward "/" end 'move)
                (eq -1 (% (save-excursion (skip-chars-backward "\\\\")) 2))))
    (when (< (point) end) ;; Double check that `re-search-forward` succeeded.
      (put-text-property ...))

> (defun scheme-syntax-propertize-regexp (_ end)
>   (let* ((match-start-state (save-excursion
>                               (syntax-ppss (match-beginning 1))))
>          (within-str (nth 3 match-start-state))
>          (within-comm (nth 4 match-start-state)))
>     (if (or within-str within-comm)

(nth 8 match-start-state) gives the same boolean answer as this `or`.
And instead of having `syntax-propertize-rules` add a | syntax and then
you replacing it with "@" you could also tell `syntax-propertize-rules`
when to do it with something like:

    ("\\(#\\)/" (1 (when (null (nth 8 (save-excursion
                                        (syntax-ppss (match-beginning 0)))))
                     (prog1 "|"
                       (scheme-syntax-propertize-regexp-end
                        (point) end)))))

tho `syntax-propertize-rules` sadly doesn't currently handle this
combination of `when` and `prog1` :-(
So the property has to be set by hand instead, along the lines of:

    ("\\(#\\)/" (1 (when (null (nth 8 (save-excursion
                                        (syntax-ppss (match-beginning 0)))))
                     (put-text-property ... "|" ..)
                     (scheme-syntax-propertize-regexp-end
                      (point) end)
                     nil)))


- Stefan





* Re: Scheme Mode and Regular Expression Literals
  2024-03-19 13:36 ` Stefan Monnier
@ 2024-03-23  2:45   ` Toshi Umehara
  0 siblings, 0 replies; 12+ messages in thread
From: Toshi Umehara @ 2024-03-23  2:45 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Thank you very much, Stefan.
I have read your comments carefully, understood them, and made a patch :)
The new patch has been posted to this mailing list.

--
Toshi (Toshihiro Umehara)



