On 25/11/2023 16:42, JD Smith wrote:
> Bridging emacs syntax to treesitter in a robust way seems like it could be a subtle enterprise, so I’d prefer to leave that to one of the experts. Right now the syntax-propertize-function in python-mode does one simple thing: ensure triple quotes are properly marked as strings. Since the treesitter grammar doesn’t distinguish between different flavors of strings, something similar would still be needed, if we want to continue to treat various string flavors distinctly using syntax.
>
> Is moving all syntactification (beyond just font-lock) over to TS an explicit goal for all the *-ts-mode’s?

It would make sense, since this way we would have only one source of syntax-recognition bugs rather than two (the grammar and the definition in Elisp).

Attached is a patch you can try (it uses treesit for s-p-f). Unfortunately, it's not quite perfect (nor is python-syntax-stringify, according to the FIXME inside it): after certain modifications, the syntax-table property is not applied.

I've done some print-debugging in python--treesit-parser-after-change, and the problem looks like this: in certain cases (e.g. when electric-pair-post-self-insert-function fires) the parser notifier fires only after syntax-propertize has been called, and it fires inside of it. That means it's too late to flush the syntax-propertize cache at that point.

The underlying reason is that we trigger the parser's after-change notifiers lazily: only when some other feature has to initialize the parser, calling treesit_ensure_parsed from treesit-parser-root-node.

I think bug#66732 might also be a variation of this problem.

As for what to do about this one: probably something involving syntax-propertize-extend-region-functions, adding an entry which would initialize the parser but not call syntax-ppss-flush-cache directly (or at least not just that). It would signal the earlier position to extend to through some dynamic variable.
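To make the extend-region idea a bit more concrete, here is a rough, untested sketch of how it might look. It is not part of the attached patch. All `my/...` names are hypothetical, and a buffer-local variable stands in for the "dynamic variable" mentioned above. The sketch assumes that asking for a parser's root node is enough to run any pending notifiers, and it does nothing about the syntax-ppss cache, which would probably need separate handling.

```elisp
;;; -*- lexical-binding: t -*-

(defvar-local my/treesit-earliest-change nil
  "Hypothetical: earliest position changed since the last region extension.")

(defun my/earliest-start (ranges)
  "Return the smallest start among RANGES, a list of (BEG . END) conses."
  (and ranges (apply #'min (mapcar #'car ranges))))

(defun my/treesit-record-change (ranges parser)
  "Parser notifier: record the earliest start in RANGES; do not flush here."
  (when ranges
    (with-current-buffer (treesit-parser-buffer parser)
      (setq my/treesit-earliest-change
            (min (my/earliest-start ranges)
                 (or my/treesit-earliest-change most-positive-fixnum))))))

(defun my/treesit-extend-region (start end)
  "Reparse if needed, then extend START back to the earliest recorded change.
Return (NEW-START . END), or nil when no extension is needed."
  ;; The guard keeps the function harmless in builds without tree-sitter.
  (when (fboundp 'treesit-parser-list)
    (dolist (parser (treesit-parser-list))
      ;; Asking for the root node forces a reparse, which runs the notifiers.
      (treesit-parser-root-node parser)))
  (let ((earliest my/treesit-earliest-change))
    (setq my/treesit-earliest-change nil)
    (when (and earliest (< earliest start))
      (cons earliest end))))
```

In a mode's setup this would presumably be wired up with `(treesit-parser-add-notifier parser #'my/treesit-record-change)` and `(add-hook 'syntax-propertize-extend-region-functions #'my/treesit-extend-region nil t)`. The point of the split is that the notifier only records a position, so it no longer matters that it fires inside syntax-propertize.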
This is getting tricky enough to move from the individual major modes into treesit.el proper, I think. Yuan and others, thoughts welcome.

JD, I do believe the attached patch is TRT (or close to it), but depending on how it works for you, and how quickly we deal with the above problem, it might make sense to enact your original suggestion first.

And finally, here's the backtrace that led me to the above conclusions:

  backtrace()
  (message "in progress, backtrace %s" (backtrace))
  (progn (message "in progress, backtrace %s" (backtrace)))
  (if (syntax-propertize--in-process-p) (progn (message "in progress, backtrace %s" (backtrace))))
  (save-current-buffer (set-buffer (treesit-parser-buffer parser)) (message "flushing %s up to %s" ranges (let* ((--cl-var-- ranges) (r nil) (--cl-var-- nil)) (while (consp --cl-var--) (setq r (car --cl-var--)) (let* ((temp (car r))) (setq --cl-var-- (if --cl-var-- (min --cl-var-- temp) temp))) (setq --cl-var-- (cdr --cl-var--))) --cl-var--)) (syntax-ppss-flush-cache (let* ((--cl-var-- ranges) (r nil) (--cl-var-- nil)) (while (consp --cl-var--) (setq r (car --cl-var--)) (let* ((temp (car r))) (setq --cl-var-- (if --cl-var-- (min --cl-var-- temp) temp))) (setq --cl-var-- (cdr --cl-var--))) --cl-var--)) (if (syntax-propertize--in-process-p) (progn (message "in progress, backtrace %s" (backtrace)))) (message "flushed up to %d, %s" syntax-propertize--done syntax-ppss-wide))
  (progn (save-current-buffer (set-buffer (treesit-parser-buffer parser)) (message "flushing %s up to %s" ranges (let* ((--cl-var-- ranges) (r nil) (--cl-var-- nil)) (while (consp --cl-var--) (setq r (car --cl-var--)) (let* ((temp ...)) (setq --cl-var-- (if --cl-var-- ... temp))) (setq --cl-var-- (cdr --cl-var--))) --cl-var--)) (syntax-ppss-flush-cache (let* ((--cl-var-- ranges) (r nil) (--cl-var-- nil)) (while (consp --cl-var--) (setq r (car --cl-var--)) (let* ((temp ...)) (setq --cl-var-- (if --cl-var-- ... temp))) (setq --cl-var-- (cdr --cl-var--))) --cl-var--)) (if (syntax-propertize--in-process-p) (progn (message "in progress, backtrace %s" (backtrace)))) (message "flushed up to %d, %s" syntax-propertize--done syntax-ppss-wide)))
  (if ranges (progn (save-current-buffer (set-buffer (treesit-parser-buffer parser)) (message "flushing %s up to %s" ranges (let* ((--cl-var-- ranges) (r nil) (--cl-var-- nil)) (while (consp --cl-var--) (setq r (car --cl-var--)) (let* (...) (setq --cl-var-- ...)) (setq --cl-var-- (cdr --cl-var--))) --cl-var--)) (syntax-ppss-flush-cache (let* ((--cl-var-- ranges) (r nil) (--cl-var-- nil)) (while (consp --cl-var--) (setq r (car --cl-var--)) (let* (...) (setq --cl-var-- ...)) (setq --cl-var-- (cdr --cl-var--))) --cl-var--)) (if (syntax-propertize--in-process-p) (progn (message "in progress, backtrace %s" (backtrace)))) (message "flushed up to %d, %s" syntax-propertize--done syntax-ppss-wide))))
  python--treesit-parser-after-change(((27 . 50)) #)
  treesit-buffer-root-node(python)
  treesit-node-at(42)
  (let ((node (treesit-node-at (point)))) (cond ((equal (treesit-node-type node) "string_content") (put-text-property (- (point) 3) (- (point) 2) 'syntax-table (string-to-syntax "|"))) ((and (equal (treesit-node-type node) "\"") (= (treesit-node-start node) (- (point) 3))) (put-text-property (1- (point)) (point) 'syntax-table (string-to-syntax "|")))))
  (cond (t (message "pt %s" (point)) (let ((node (treesit-node-at (point)))) (cond ((equal (treesit-node-type node) "string_content") (put-text-property (- (point) 3) (- (point) 2) 'syntax-table (string-to-syntax "|"))) ((and (equal (treesit-node-type node) "\"") (= (treesit-node-start node) (- ... 3))) (put-text-property (1- (point)) (point) 'syntax-table (string-to-syntax "|")))))))
  (while (and (< (point) end) (re-search-forward "\\(?:\"\"\"\\|'''\\)" end t)) (cond (t (message "pt %s" (point)) (let ((node (treesit-node-at (point)))) (cond ((equal (treesit-node-type node) "string_content") (put-text-property (- ... 3) (- ... 2) 'syntax-table (string-to-syntax "|"))) ((and (equal ... "\"") (= ... ...)) (put-text-property (1- ...) (point) 'syntax-table (string-to-syntax "|"))))))))
  (closure (t) (start end) (goto-char start) (while (and (< (point) end) (re-search-forward "\\(?:\"\"\"\\|'''\\)" end t)) (cond (t (message "pt %s" (point)) (let ((node ...)) (cond (... ...) (... ...)))))))(39 50)
  funcall((closure (t) (start end) (goto-char start) (while (and (< (point) end) (re-search-forward "\\(?:\"\"\"\\|'''\\)" end t)) (cond (t (message "pt %s" (point)) (let ((node ...)) (cond (... ...) (... ...))))))) 39 50)
  python--treesit-syntax-propertize-function-1(39 50)
  syntax-propertize(42)
  syntax-ppss(42)
  electric-pair-syntax-info(39)
  electric-pair-post-self-insert-function()
  self-insert-command(1 39)
  funcall-interactively(self-insert-command 1 39)
  #(self-insert-command nil nil)
  call-interactively@ido-cr+-record-current-command(# self-insert-command nil nil)
  apply(call-interactively@ido-cr+-record-current-command # (self-insert-command nil nil))
  call-interactively(self-insert-command nil nil)
  command-execute(self-insert-command)