From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Scheme Mode and Regular Expression Literals
Date: Tue, 19 Mar 2024 09:36:40 -0400
To: Toshi Umehara
Cc: Eli Zaretskii, jcubic@onet.pl, emacs-devel@gnu.org
In-Reply-To: <87edc6kjin.fsf@niceume.com> (Toshi Umehara's message of "Tue, 19 Mar 2024 12:06:24 +0900")

> Now I put the logic to scan the beginning of sexp comments and regular
> expressions in syntax-propertize-rules macro.  I use different functions
> to be invoked by them, because the beginning of regular expressions
> need to be canceled when they are already in normal strings or
> comments.

We usually don't bother doing that because it affects only navigation
*within* those strings/comments :-)

The code looks pretty good now.  Can you turn it into a patch against
`scheme.el`?  In the meantime, see my comments below.

> (defun scheme-syntax-propertize-regexp-end (_ end)
>   (let* ((state (syntax-ppss))
>          (within-str (nth 3 state))
>          (within-comm (nth 4 state))
>          (start-delim-pos (nth 8 state)))
>     (if (and (not within-comm)
>              (and within-str
>                   (string=
>                    (buffer-substring-no-properties
>                     start-delim-pos
>                     (1+ start-delim-pos))
>                    "#")))

    (eq ?# (char-after start-delim-pos))

would be simpler and more efficient.
Also, `within-str` and `within-comm` are mutually exclusive, so the
`(and (not within-comm)` is redundant.

>         (let ((end-found nil))
>           (while (and
>                   (not end-found)
>                   (re-search-forward "\\(/\\)" end t))

You don't need the \\( \\), you can just use (match-beginning 0)
instead, which will let Emacs use the simpler non-regexp search
algorithm.

>             (progn
>               (if
>                   (not (char-equal
>                         (char-before (match-beginning 1))
>                         ?\\ ))

This fails for

    #/foo\\/

In sh-script.el I used

    (eq -1 (% (save-excursion (skip-chars-backward "\\\\")) 2))

At other places we let the regexp matcher skip those by using a hideous
regexp such as

    "[^\\]\\(?:\\\\\\\\\\)*/"

(but this regexp fails to match the closing / if it's right at the
starting position, so you have to use a workaround such as doing
a (forward-char -1) before searching).

>                   (progn
>                     (put-text-property
>                      (match-beginning 1)
>                      (1+ (match-beginning 1))

Aka (match-end 1).

>                      'syntax-table (string-to-syntax "|"))
>                     (setq end-found t)
>                     )))))
>       )))

You can avoid the `end-found` thingy with

    (while (and (re-search-forward "/" end 'move)
                (eq -1 (% (save-excursion (skip-chars-backward "\\\\")) 2))))
    (when (< (point) end) ;; Double check that `re-search-forward` succeeded.
      (put-text-property ...))

> (defun scheme-syntax-propertize-regexp (_ end)
>   (let* ((match-start-state (save-excursion
>                               (syntax-ppss (match-beginning 1))))
>          (within-str (nth 3 match-start-state))
>          (within-comm (nth 4 match-start-state)))
>     (if (or within-str within-comm)

(nth 8 match-start-state) gives the same boolean answer as this `or`.

And instead of having `syntax-propertize-rules` add a | syntax and then
you replacing it with "@" you could also tell `syntax-propertize-rules`
when to do it with something like:

    ("\\(#\\)/" (1 (when (null (nth 8 (save-excursion
                                        (syntax-ppss (match-beginning 0)))))
                     (prog1 "|"
                       (scheme-syntax-propertize-regexp-end
                        (point) end)))))

tho `syntax-propertize-rules` sadly doesn't currently handle this
combination of `when` and `prog1` :-(

    ("\\(#\\)/" (1 (when (null (nth 8 (save-excursion
                                        (syntax-ppss (match-beginning 0)))))
                     (put-text-property ... "|" ..)
                     (scheme-syntax-propertize-regexp-end
                      (point) end)
                     nil)))

- Stefan
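
A minimal sketch of the end-delimiter scan discussed in the message, for readers who want something they can evaluate. The names `sketch-escaped-p` and `sketch-mark-regexp-end` are hypothetical and not part of `scheme.el`, and the sketch assumes point starts just after the opening `#/`. It differs from the loop quoted in the message in two small ways: it moves to the start of the match before counting backslashes (the search leaves point after the `/`), and it tests the return value of the search rather than comparing point with `end`.

```elisp
;; Hypothetical helpers for illustration only; not part of scheme.el.

(defun sketch-escaped-p (pos)
  "Return non-nil if an odd number of backslashes precedes POS."
  (save-excursion
    (goto-char pos)
    ;; `skip-chars-backward' returns the distance moved, which is
    ;; zero or negative, so an odd count leaves a remainder of -1.
    (eq -1 (% (skip-chars-backward "\\\\") 2))))

(defun sketch-mark-regexp-end (end)
  "Mark the first unescaped \"/\" between point and END as a string fence.
Return the position of that \"/\", or nil if there is none before END."
  (let (found)
    ;; Keep searching while a slash is found but turns out to be escaped.
    ;; With NOERROR set to `move', a failed search returns nil.
    (while (and (setq found (re-search-forward "/" end 'move))
                (sketch-escaped-p (match-beginning 0))))
    (when found
      (put-text-property (match-beginning 0) (match-end 0)
                         'syntax-table (string-to-syntax "|"))
      (match-beginning 0))))

;; Usage: the buffer text is #/foo\\/ rest, where the two backslashes
;; escape each other and the slash after them closes the regexp.
(with-temp-buffer
  (insert "#/foo\\\\/ rest")
  (goto-char 3)                         ; just after the opening #/
  (sketch-mark-regexp-end (point-max)))
```

Counting the whole run of backslashes, rather than checking only the character before the `/`, is what lets the `#/foo\\/` case terminate at the right place.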