unofficial mirror of emacs-devel@gnu.org 
* Scheme Mode and Regular Expression Literals
From: Jakub T. Jankiewicz @ 2024-02-27 14:46 UTC
  To: emacs-devel

Hi,

I'm working on a Scheme interpreter written in JavaScript called LIPS
Scheme. It has the same syntax for [regular expressions as Gauche][1],
but with more flags, because they are just JavaScript RegExp objects.

Is there a way to add support for regular expression literals to the built-in
Scheme mode without modifying its code? Something I can do from my .emacs file?

Regexes in LIPS and Gauche look like this: `#/[a-b]+/`. This syntax causes a
lot of problems inside Emacs. Sometimes you need to add a trailing comment with
extra characters to make the syntax highlighting and indentation work.

Example:


    (let ((x #/foo|bar/)) ;; |))
      (write x))

Emacs treats the vertical bar as the start of Scheme's extended symbol syntax
(`|...|`), so you need to add a comment that closes the symbol and the lists.
Otherwise the code will not display and evaluate properly.
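
One way to see the misparse directly is to ask Emacs for its parser state at
the end of the example. This is only a sketch: it assumes a Scheme mode without
regexp-literal support, and the exact values may differ between Emacs versions.

    ;; Parse the example in a temporary scheme-mode buffer and report
    ;; the paren depth and string state at the end of the text.
    (with-temp-buffer
      (scheme-mode)
      (insert "(let ((x #/foo|bar/))\n  (write x))")
      (let ((state (syntax-ppss (point-max))))
        (list :depth (nth 0 state)
              :in-string (nth 3 state))))

A non-nil `:in-string` value means Emacs still considers everything after the
vertical bar to be inside an unterminated `|...|` token, which is why the
closing parentheses are never matched.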

And if you want to send this to a Scheme REPL and have proper highlighting, you
need something like this:

    (let ((x #/foo|bar/)) ;; |))
      (write x)
      (display " ;; |")
      (newline))

It would be cool if you could configure the Scheme mode to allow such syntax.

[1]:
https://practical-scheme.net/gauche/man/gauche-refe/Regular-expressions.html

Jakub

--
Jakub T. Jankiewicz, Senior Front-End Developer
https://jcubic.pl/me
https://lips.js.org
https://koduj.org



* Re: Scheme Mode and Regular Expression Literals
From: Toshi Umehara @ 2024-03-09  2:59 UTC
  To: jcubic; +Cc: emacs-devel


Regular expression literals, as they exist in Gauche Scheme, do not seem to
work in scheme mode: #/regexp/ is not treated as a single unit. I looked for a
solution on the Web and found this page,
https://ardggy.hatenablog.jp/entry/2015/11/24/143713 (in Japanese).  The
solution on that page overrides syntax-propertize-function and adds a
new rule for regular expression literals. Taking a hint from it, I have
tried to make sure that an escaped slash ( \/ ) does not end the literal and
that an even number of backslashes (like \\\\ ) does not act as an escape.

Currently, the following code in init.el works for me.

#+BEGIN_SRC
(add-hook 'scheme-mode-hook
          (lambda ()
            (setq-local
             syntax-propertize-function
             (lambda (start end)
               (goto-char start)
               (funcall
                (syntax-propertize-rules
                 ;; For #/regexp/ syntax
                 ("\\(#\\)/\\(\\\\/\\|\\\\\\\\\\|.\\)*?\\(/\\)"
                  (1 "|")
                  (3 "|"))
                 ;; For #; comment syntax
                 ("\\(#\\);"
                  (1 (prog1 "< cn"
                       (scheme-syntax-propertize-sexp-comment
                        (point) end)))))
                (point) end))
             ))
          )
#+END_SRC
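
As a quick sanity check of the first rule (a sketch only; the sample text is
made up for illustration), the regexp can be tried on a literal that contains
an escaped slash:

#+BEGIN_SRC emacs-lisp
;; The same regexp as in the #/regexp/ rule above.
(let ((re "\\(#\\)/\\(\\\\/\\|\\\\\\\\\\|.\\)*?\\(/\\)"))
  (with-temp-buffer
    ;; Buffer text is: (f #/a\/b/ x)
    (insert "(f #/a\\/b/ x)")
    (goto-char (point-min))
    (when (re-search-forward re nil t)
      ;; Whole match, plus the positions that would get "|" syntax.
      (list (match-string 0)
            (match-beginning 1)
            (match-beginning 3)))))
#+END_SRC

The whole match should cover #/a\/b/ rather than stopping at the escaped
slash, with groups 1 and 3 on the opening # and the closing / . One caveat:
because "." can also match a backslash, backtracking may still end the match on
an escaped slash when no later slash exists on the same line.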

Hope this helps.

--
Toshi (Toshihiro Umehara)



* Re: Scheme Mode and Regular Expression Literals
From: Toshi Umehara @ 2024-03-17  0:28 UTC
  To: monnier; +Cc: Eli Zaretskii, jcubic, emacs-devel



After reading your suggestions, I've created new functions to deal with
regular expression syntax. This approach consists of two procedures.

1. scheme-syntax-propertize-regexp-1 detects the start of a regular
expression (#/). If it finds a start, it goes on to look for the
corresponding end.

2. scheme-syntax-propertize-regexp-2 detects the end of a regular expression
(/) outside comments but inside strings that start with #. This second
procedure is needed to handle regular expressions that span multiple
lines.

The following code can be put in init.el, and a patch for
/lisp/progmodes/scheme.el is attached. I hope this is useful, thanks.


#+BEGIN_SRC
(add-hook
 'scheme-mode-hook
 (lambda ()
   (setq-local
    syntax-propertize-function
    (lambda (beg end)
      (goto-char beg)
      (scheme-syntax-propertize-sexp-comment (point) end)
      (funcall
       (syntax-propertize-rules
        ("\\(#\\);" (1 (prog1 "< cn"
                         (scheme-syntax-propertize-sexp-comment
                          (point) end))))
        )
       (point) end)
      ;; For regular expression literals
      (scheme-syntax-propertize-regexp-1 end)
      (scheme-syntax-propertize-regexp-2 end)
      ))))

(defun scheme-match-regexp-start (limit)
  (re-search-forward
   (rx
    (or
     bol
     space
     (in "[('")
     )
    (group "#")
    "/"
    )
   limit
   t
   )
  )

(defun scheme-match-regexp-end (limit)
  (re-search-forward
   (rx
     (group "/")
     )
   limit
   t
   )
  )

(defun scheme-syntax-propertize-regexp-1 (end)
  (while (scheme-match-regexp-start end)
    (let* ((state (save-excursion
                    (syntax-ppss (match-beginning 1))))
           (within-str (nth 3 state))
           (within-comm (nth 4 state)))
      (if (and (not within-comm) (not within-str))
          (progn
            (put-text-property
             (match-beginning 1)
             (1+ (match-beginning 1))
             'syntax-table (string-to-syntax "|"))
            (let ((end-found nil))
              (while
                  (and
                   (not end-found)
                   (scheme-match-regexp-end end))
                (if
                    (not (char-equal
                          (char-before (match-beginning 1))
                          ?\\ ))
                    (progn
                      (put-text-property
                       (match-beginning 1)
                       (1+ (match-beginning 1))
                       'syntax-table (string-to-syntax "|"))
                      (setq end-found t)
                      )))))))))

(defun scheme-syntax-propertize-regexp-2 (end)
  (let ((end-found nil))
    (while (scheme-match-regexp-end end)
      (let* ((state (save-excursion
                      (syntax-ppss (match-beginning 1))))
             (within-str (nth 3 state))
             (within-comm (nth 4 state))
             (start-delim-pos (nth 8 state)))
        (if (and (not within-comm)
                 within-str
                 (string=
                  (buffer-substring-no-properties
                   start-delim-pos
                   (1+ start-delim-pos))
                  "#")
                 (not (char-equal
                       (char-before (match-beginning 1))
                       ?\\ )))
            (progn
                    (put-text-property
                     (match-beginning 1)
                     (1+ (match-beginning 1))
                     'syntax-table (string-to-syntax "|"))
                    (setq end-found t)
                    ))))))
#+END_SRC
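
The code above marks both delimiters with (string-to-syntax "|"). For
reference, here is a small sketch of what that descriptor means: "|" is the
generic string fence class, so the text between two such fences is parsed like
a string no matter which characters it contains. The sample buffer text and
positions below are made up for illustration.

#+BEGIN_SRC emacs-lisp
;; Raw syntax descriptors are cons cells (CLASS-CODE . MATCHING-CHAR).
(string-to-syntax "|")   ; generic string fence, class code 15
(string-to-syntax "\"")  ; ordinary string quote, class code 7

;; Inside a fenced region, `syntax-ppss' reports a string whose
;; terminator is t (meaning: it ends at the next generic fence).
(with-temp-buffer
  (insert "x #/a|b/ y")
  (put-text-property 3 4 'syntax-table (string-to-syntax "|")) ; the #
  (put-text-property 8 9 'syntax-table (string-to-syntax "|")) ; closing /
  ;; Text properties are only honored when this variable is non-nil.
  (setq-local parse-sexp-lookup-properties t)
  (list (nth 3 (syntax-ppss 6))     ; inside the literal
        (nth 3 (syntax-ppss 10))))  ; after the literal
#+END_SRC

The last form is expected to return (t nil): a string state inside the
literal, and ordinary code after it.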


[-- Attachment #2: Enable dealing with regular expression literal --]
[-- Type: text/x-patch, Size: 3271 bytes --]

diff --git a/lisp/progmodes/scheme.el b/lisp/progmodes/scheme.el
index 67abab6913d..d1980463859 100644
--- a/lisp/progmodes/scheme.el
+++ b/lisp/progmodes/scheme.el
@@ -414,7 +414,10 @@ scheme-syntax-propertize
    (syntax-propertize-rules
     ("\\(#\\);" (1 (prog1 "< cn"
                      (scheme-syntax-propertize-sexp-comment (point) end)))))
-   (point) end))
+   (point) end)
+  (scheme-syntax-propertize-regexp-1 end)
+  (scheme-syntax-propertize-regexp-2 end)
+  )
 
 (defun scheme-syntax-propertize-sexp-comment (_ end)
   (let ((state (syntax-ppss)))
@@ -430,6 +433,87 @@ scheme-syntax-propertize-sexp-comment
                                'syntax-table (string-to-syntax "> cn")))
         (scan-error (goto-char end))))))
 
+(defun scheme-match-regexp-start (limit)
+  (re-search-forward
+   (rx
+    (or
+     bol
+     space
+     (in "[('")
+     )
+    (group "#")
+    "/"
+    )
+   limit
+   t
+   )
+  )
+
+(defun scheme-match-regexp-end (limit)
+  (re-search-forward
+   (rx
+     (group "/")
+     )
+   limit
+   t
+   )
+  )
+
+(defun scheme-syntax-propertize-regexp-1 (end)
+  (while (scheme-match-regexp-start end)
+    (let* ((state (save-excursion
+                    (syntax-ppss (match-beginning 1))))
+           (within-str (nth 3 state))
+           (within-comm (nth 4 state)))
+      (if (and (not within-comm) (not within-str))
+          (progn
+            (put-text-property
+             (match-beginning 1)
+             (1+ (match-beginning 1))
+             'syntax-table (string-to-syntax "|"))
+            (let ((end-found nil))
+              (while
+                  (and
+                   (not end-found)
+                   (scheme-match-regexp-end end))
+                (if
+                    (not (char-equal
+                          (char-before (match-beginning 1))
+                          ?\\ ))
+                    (progn
+                      (put-text-property
+                       (match-beginning 1)
+                       (1+ (match-beginning 1))
+                       'syntax-table (string-to-syntax "|"))
+                      (setq end-found t)
+                      )))))))))
+
+(defun scheme-syntax-propertize-regexp-2 (end)
+  (let ((end-found nil))
+    (while (scheme-match-regexp-end end)
+      (let* ((state (save-excursion
+                      (syntax-ppss (match-beginning 1))))
+             (within-str (nth 3 state))
+             (within-comm (nth 4 state))
+             (start-delim-pos (nth 8 state)))
+        (if (and (not within-comm)
+                 within-str
+                 (string=
+                  (buffer-substring-no-properties
+                   start-delim-pos
+                   (1+ start-delim-pos))
+                  "#")
+                 (not (char-equal
+                       (char-before (match-beginning 1))
+                       ?\\ )))
+            (progn
+                    (put-text-property
+                     (match-beginning 1)
+                     (1+ (match-beginning 1))
+                     'syntax-table (string-to-syntax "|"))
+                    (setq end-found t)
+                    ))))))
+
 ;;;###autoload
 (define-derived-mode dsssl-mode scheme-mode "DSSSL"
   "Major mode for editing DSSSL code.



-- 
Toshi (Toshihiro Umehara)

* Re: Scheme Mode and Regular Expression Literals
From: Toshi Umehara @ 2024-03-19  3:06 UTC
  To: monnier; +Cc: Eli Zaretskii, jcubic, emacs-devel



Thank you very much for your comments, Stefan. As you point out, the
previous code fails when regular expressions and sexp comments exist on
the same line.

I have now put the logic that scans for the beginning of sexp comments and
regular expressions into the syntax-propertize-rules macro. The two rules
invoke different functions, because the beginning of a regular expression
needs to be canceled when it is already inside a normal string or
comment. (This kind of logic might also be required for sexp comments,
but the current implementation does not do that, and this does not seem to
cause any harm.)

If I used the same function for both, it would need extra conditional
code just for regular expressions, repeating what is already done at the
syntax-propertize-rules level. (At the syntax-propertize-rules level, we
already know whether it is the beginning of a sexp comment or a possible
beginning of a regular expression.)

The following explains the functions implemented.

- scheme-syntax-propertize-sexp-comment

  No change from the current built-in implementation.

- scheme-syntax-propertize-regexp-end

  If the position is already inside a regular expression and not in a
  comment, it searches for the end of the regular expression (/), ignoring
  escaped slashes (\/).


- scheme-syntax-propertize-regexp

  This is invoked from syntax-propertize-rules. It cancels the syntax class
  assignment to the # of #/ if the # is already inside a string or
  comment. (Precisely, it assigns the @ syntax class to # .) Otherwise it
  goes on to scan for the end of the regular expression by calling
  scheme-syntax-propertize-regexp-end.
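
When skipping escaped slashes, looking only at the single character before a
slash cannot distinguish \/ (an escaped slash) from \\/ (an escaped backslash
followed by the closing slash). A minimal sketch of a parity check, assuming a
hypothetical helper named my-scheme-slash-escaped-p that is not part of
scheme.el:

#+BEGIN_SRC emacs-lisp
(defun my-scheme-slash-escaped-p (pos)
  "Return non-nil if the char at POS follows an odd number of backslashes."
  (save-excursion
    (goto-char pos)
    ;; `skip-chars-backward' returns the distance moved (zero or
    ;; negative), so negate it to get the number of backslashes.
    (= 1 (% (- (skip-chars-backward "\\\\")) 2))))

;; Usage: test the last slash of two sample literals.
(with-temp-buffer
  (insert "#/a\\/")                     ; buffer text: #/a\/
  (my-scheme-slash-escaped-p (1- (point-max))))   ; odd count: escaped

(with-temp-buffer
  (insert "#/a\\\\/")                   ; buffer text: #/a\\/
  (my-scheme-slash-escaped-p (1- (point-max))))   ; even count: not escaped
#+END_SRC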


Thanks.


#+BEGIN_SRC
(add-hook
 'scheme-mode-hook
 (lambda ()
   (setq-local
    syntax-propertize-function
    (lambda (beg end)
      (goto-char beg)
      (scheme-syntax-propertize-sexp-comment (point) end)
      (scheme-syntax-propertize-regexp-end (point) end)
      (funcall
       (syntax-propertize-rules
        ("\\(#\\);" (1 (prog1 "< cn"
                         (scheme-syntax-propertize-sexp-comment
                          (point) end))))
        ("\\(#\\)/" (1 (prog1 "|"
                         (scheme-syntax-propertize-regexp
                          (point) end))))
         )
       (point) end)
      ))))

(defun scheme-syntax-propertize-sexp-comment (_ end)
  (let ((state (syntax-ppss)))
    (when (eq 2 (nth 7 state))
      ;; It's a sexp-comment.  Tell parse-partial-sexp where it ends.
      (condition-case nil
          (progn
            (goto-char (+ 2 (nth 8 state)))
            ;; FIXME: this doesn't handle the case where the sexp
            ;; itself contains a #; comment.
            (forward-sexp 1)
            (put-text-property (1- (point)) (point)
                               'syntax-table (string-to-syntax "> cn")))
        (scan-error (goto-char end))))))

(defun scheme-syntax-propertize-regexp-end (_ end)
  (let* ((state (syntax-ppss))
         (within-str (nth 3 state))
         (within-comm (nth 4 state))
         (start-delim-pos (nth 8 state)))
    (if (and (not within-comm)
             (and within-str
                  (string=
                   (buffer-substring-no-properties
                    start-delim-pos
                    (1+ start-delim-pos))
                   "#")))
        (let ((end-found nil))
          (while (and
                  (not end-found)
                  (re-search-forward "\\(/\\)" end t))
            (progn
              (if
                  (not (char-equal
                        (char-before (match-beginning 1))
                        ?\\ ))
                  (progn
                    (put-text-property
                     (match-beginning 1)
                     (1+ (match-beginning 1))
                     'syntax-table (string-to-syntax "|"))
                    (setq end-found t)
                    )))))
      )))

(defun scheme-syntax-propertize-regexp (_ end)
  (let* ((match-start-state (save-excursion
                              (syntax-ppss (match-beginning 1))))
         (within-str (nth 3 match-start-state))
         (within-comm (nth 4 match-start-state)))
    (if (or within-str within-comm)
          (put-text-property ;; Cancel regular expression start
           (match-beginning 1)
           (1+ (match-beginning 1))
           'syntax-table (string-to-syntax "@"))
      (scheme-syntax-propertize-regexp-end _ end)
      )))

#+END_SRC
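
To try the hook code above, something like the following can be evaluated
after loading it. This is a sketch: my-scheme-string-state-at is a
hypothetical helper, not part of scheme.el, and the results depend on the code
above being installed in scheme-mode-hook.

#+BEGIN_SRC emacs-lisp
(defun my-scheme-string-state-at (text marker)
  "Return the string state where MARKER starts in TEXT, parsed as Scheme."
  (with-temp-buffer
    (scheme-mode)                       ; runs scheme-mode-hook
    (insert text)
    (goto-char (point-min))
    (search-forward marker)
    (nth 3 (syntax-ppss (match-beginning 0)))))

;; Inside a regexp literal: expected to be t (generic string).
(my-scheme-string-state-at "(let ((x #/foo|bar/)) (write x))" "bar")

;; After the literal: expected to be nil (ordinary code again).
(my-scheme-string-state-at "(let ((x #/foo|bar/)) (write x))" "write")

;; #/ inside an ordinary string: expected to be ?\" (the string's own quote).
(my-scheme-string-state-at "(display \"#/not-a-regexp/\")" "not")
#+END_SRC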


[-- Attachment #2: deals with scheme regular expression syntax --]
[-- Type: text/x-patch, Size: 2791 bytes --]

diff --git a/lisp/progmodes/scheme.el b/lisp/progmodes/scheme.el
index 67abab6913d..98405513099 100644
--- a/lisp/progmodes/scheme.el
+++ b/lisp/progmodes/scheme.el
@@ -410,11 +410,18 @@ scheme-sexp-comment-syntax-table
 (defun scheme-syntax-propertize (beg end)
   (goto-char beg)
   (scheme-syntax-propertize-sexp-comment (point) end)
+  (scheme-syntax-propertize-regexp-end (point) end)
   (funcall
    (syntax-propertize-rules
     ("\\(#\\);" (1 (prog1 "< cn"
-                     (scheme-syntax-propertize-sexp-comment (point) end)))))
-   (point) end))
+                     (scheme-syntax-propertize-sexp-comment
+                      (point) end))))
+    ("\\(#\\)/" (1 (prog1 "|"
+                     (scheme-syntax-propertize-regexp
+                      (point) end))))
+    )
+   (point) end)
+  )
 
 (defun scheme-syntax-propertize-sexp-comment (_ end)
   (let ((state (syntax-ppss)))
@@ -430,6 +437,49 @@ scheme-syntax-propertize-sexp-comment
                                'syntax-table (string-to-syntax "> cn")))
         (scan-error (goto-char end))))))
 
+(defun scheme-syntax-propertize-regexp-end (_ end)
+  (let* ((state (syntax-ppss))
+         (within-str (nth 3 state))
+         (within-comm (nth 4 state))
+         (start-delim-pos (nth 8 state)))
+    (if (and (not within-comm)
+             (and within-str
+                  (string=
+                   (buffer-substring-no-properties
+                    start-delim-pos
+                    (1+ start-delim-pos))
+                   "#")))
+        (let ((end-found nil))
+          (while (and
+                  (not end-found)
+                  (re-search-forward "\\(/\\)" end t))
+            (progn
+              (if
+                  (not (char-equal
+                        (char-before (match-beginning 1))
+                        ?\\ ))
+                  (progn
+                    (put-text-property
+                     (match-beginning 1)
+                     (1+ (match-beginning 1))
+                     'syntax-table (string-to-syntax "|"))
+                    (setq end-found t)
+                    )))))
+      )))
+
+(defun scheme-syntax-propertize-regexp (_ end)
+  (let* ((match-start-state (save-excursion
+                              (syntax-ppss (match-beginning 1))))
+         (within-str (nth 3 match-start-state))
+         (within-comm (nth 4 match-start-state)))
+    (if (or within-str within-comm)
+          (put-text-property ;; Cancel regular expression start
+           (match-beginning 1)
+           (1+ (match-beginning 1))
+           'syntax-table (string-to-syntax "@"))
+      (scheme-syntax-propertize-regexp-end _ end)
+      )))
+
 ;;;###autoload
 (define-derived-mode dsssl-mode scheme-mode "DSSSL"
   "Major mode for editing DSSSL code.




-- 
Toshi (Toshihiro Umehara)

Thread overview: 12+ messages
2024-02-27 14:46 Scheme Mode and Regular Expression Literals Jakub T. Jankiewicz
2024-03-09  2:59 Toshi Umehara
2024-03-09 13:37 ` Jakub T. Jankiewicz
2024-03-14  8:40 ` Eli Zaretskii
2024-03-14 11:38   ` Mattias Engdegård
2024-03-14 13:34     ` Stefan Monnier
2024-03-14 15:09       ` Jakub T. Jankiewicz
2024-03-17  0:28 Toshi Umehara
2024-03-17  2:02 ` Stefan Monnier
2024-03-19  3:06 Toshi Umehara
2024-03-19 13:36 ` Stefan Monnier
2024-03-23  2:45   ` Toshi Umehara