From: Theodor Thornhill via "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Newsgroups: gmane.emacs.bugs
Subject: bug#60623: 30.0.50; Add forward-sentence with tree sitter support
Date: Tue, 10 Jan 2023 20:33:52 +0100
Message-ID: <87zgaqmbfj.fsf@thornhill.no>
References: <87o7ratva2.fsf@thornhill.no> <87bkn9tasb.fsf@thornhill.no>
 <83sfgloz5w.fsf@gnu.org> <875ydgu8dd.fsf@thornhill.no>
 <83fsckpznk.fsf@gnu.org> <87358ku6x2.fsf@thornhill.no>
 <837cxvq3x9.fsf@gnu.org> <87h6wyss3d.fsf@thornhill.no>
 <83358io2cd.fsf@gnu.org>
Reply-To: Theodor Thornhill
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Cc: 60623@debbugs.gnu.org, juri@linkov.net, casouri@gmail.com,
 monnier@iro.umontreal.ca, mardani29@yahoo.es
To: Eli Zaretskii
In-Reply-To: <83358io2cd.fsf@gnu.org>
Xref: news.gmane.io gmane.emacs.bugs:253107

--=-=-=
Content-Type: text/plain

Eli Zaretskii writes:

>> From: Theodor Thornhill
>> Cc: mardani29@yahoo.es, 60623@debbugs.gnu.org, casouri@gmail.com,
>>  monnier@iro.umontreal.ca, juri@linkov.net
>> Date: Tue, 10 Jan 2023 09:37:26 +0100
>>
>> >> > No, because these are not really sentences in some human-readable
>> >> > language, these are program parts. As such they should be somewhere
>> >> > under "27 Programs", possibly in "Defuns".
>> >> >
>> >> > However, "Sentences" might mention that programming modes have their
>> >> > own interpretation of "sentence" and corresponding movement commands.
>> >>
>> >> Yeah, that makes sense. Should I make an attempt at such formulations,
>> >> or will you do it at a later time?
>> >
>> > It is better that you try, if only to gain experience ;-)
>>
>> How about this for starter?
>
> Very good, thank you very much. A few comments below.
>
>> --- a/doc/emacs/programs.texi
>> +++ b/doc/emacs/programs.texi
>> @@ -163,6 +163,7 @@ Defuns
>>  * Left Margin Paren::   An open-paren or similar opening delimiter
>>                            starts a defun if it is at the left margin.
>>  * Moving by Defuns::    Commands to move over or mark a major definition.
>> +* Moving by Sentences:: Commands to move over certain definitions in code.
>                                                         ^^^^^^^^^^^
> I'd use "code units" or "units of code" here.

Done.

>
> Also, should we perhaps name the section "Moving by Statements"? or
> would it be too inaccurate?
>

I'm not sure. Given the commands involved, and the ones that will
implicitly be affected, such as kill-sentence and friends, maybe it is
best to stay with Sentences? But statement is the better term wrt
programming languages, of course. I hold no strong opinions here.

>> +  These commands move point or set up the region based on definitions,
>> +also called @dfn{sentences}. Even though sentences is usually
>
> Each @dfn in a manual should have an index entry, so that readers
> could easily find it. in this case, the index entry should qualify
> the "sentences" term by the fact that we are talking about units of
> code. So:
>
>   @cindex sentences, in programming languages
>

Done.

>> +considered when writing human languages, Emacs can use the same
>> +commands to move over certain constructs in programming languages
>> +(@pxref{Sentences}, @pxref{Moving by Defuns}). In a programming
>> +language a sentence is usually a complete language construct smaller
>> +than defuns, but larger than sexps (@pxref{List Motion,,, elisp, The
>> +Emacs Lisp Reference Manual}).
>
> A couple of examples from two different languages could be a great
> help here. Otherwise this text sounds a bit too abstract.
>

Something like this?

>> +@kindex M-a
>> +@kindex M-e
>
> Since we already have M-e elsewhere in the manual, I suggest to
> qualify the key bindings here:
>
>   @kindex M-a @r{(programming modes)}
>
> and similarly for M-e. The @r{..} thingy is necessary to reset to the
> default typeface, since key index is implicitly typeset in @code.
>
>> +@findex backward-sentence
>> +@findex forward-sentence
>
> Likewise with these two @findex entries: qualify them, since we have
> the same commands documented elsewhere under "Sentences".
>

Done.

>> --- a/doc/emacs/text.texi
>> +++ b/doc/emacs/text.texi
>> @@ -253,6 +253,14 @@ Sentences
>>  of a sentence. Set the variable @code{sentence-end-without-period} to
>>  @code{t} in such cases.
>>
>> +  Even though the above mentioned sentence movement commands are based
>> +on human languages, other Emacs modes can set these command to get
>> +similar functionality. What exactly a sentence is in a non-human
>> +language is dependent on the target language, but usually it is
>> +complete statements, such as a variable definition and initialization,
>> +or a conditional statement (@pxref{Moving by Sentences,,, emacs, The
>> +extensible self-documenting text editor}).
>
> The last sentence should be in "Moving by Sentences", since it
> describes the commands documented there. Also, please add a
> cross-reference here to "Moving by Sentences", since you mention that
> in the text (and rightfully so).
>

Is something like this what you meant?

>> +@defvar treesit-sentence-type-regexp
>> +The value of this variable is a regexp matching the node type of sentence
>> +nodes. (For ``node'' and ``node type'', @pxref{Parsing Program Source}.)
>> +
>> +@findex treesit-forward-sentence
>> +@findex forward-sentence
>> +@findex backward-sentence
>> +If Emacs is compiled with tree-sitter, it can use the tree-sitter
>> +parser information to move across syntax constructs. Since what
>> +exactly is considered a sentence varies between languages, a major mode
>> +should set @code{treesit-sentence-type-regexp} to determine that. Then
>> +the mode can get navigation-by-sentence functionality for free, by using
>> +@code{forward-sentence} and @code{backward-sentence}.
>
> Here please also add a cross-reference to the "Moving by Sentences"
> node in the Emacs manual, so that people could understand what kind of
> "sentences" this is talking about.
>
>> +** New defvar forward-sentence-function.
>       ^^^^^^^^^^
> "New variable"
>
>> +Emacs now can set this variable to customize the behavior of the
>> +'forward-sentence' function.
>
> Not "Emacs", but "major modes".
>
>> +** New defun forward-sentence-default-function.
>       ^^^^^^^^^
> "New function"
>
>> +The previous implementation of 'forward-sentence' is moved into its
>> +own function, to be bound by 'forward-sentence-function'.
>> +
>> +** New defvar-local 'treesit-sentence-type-regexp.
>> +Similarly to 'treesit-defun-type-regexp', this variable is used to
>> +navigate sentences in Tree-sitter enabled modes.
>> +
>> +** New function 'treesit-forward-sentence'.
>> +treesit.el now conditionally sets 'forward-sentence-function' for all
>> +Tree-sitter modes that sets 'treesit-sentence-type-regexp'.
>
> Please make these related items sub-headings of a common heading,
> something like "Commands and variables to move by program statements".
>

Done.

>> +
>>
>>  * Changes in Specialized Modes and Packages in Emacs 30.1
>>  ---
>> diff --git a/lisp/textmodes/paragraphs.el b/lisp/textmodes/paragraphs.el
>> index 73abb155aaa..fd2d83eeebf 100644
>> --- a/lisp/textmodes/paragraphs.el
>> +++ b/lisp/textmodes/paragraphs.el
>> @@ -441,13 +441,12 @@ end-of-paragraph-text
>>        (if (< (point) (point-max))
>>            (end-of-paragraph-text))))))
>>
>> -(defun forward-sentence (&optional arg)
>> +(defun forward-sentence-default-function (&optional arg)
>>    "Move forward to next end of sentence. With argument, repeat.
>>  When ARG is negative, move backward repeatedly to start of sentence.
>>
>>  The variable `sentence-end' is a regular expression that matches ends of
>>  sentences. Also, every paragraph boundary terminates sentences as well."
>> -  (interactive "^p")
>>    (or arg (setq arg 1))
>>    (let ((opoint (point))
>>          (sentence-end (sentence-end)))
>> @@ -480,6 +479,18 @@ forward-sentence
>>      (let ((npoint (constrain-to-field nil opoint t)))
>>        (not (= npoint opoint)))))
>>
>> +(defvar forward-sentence-function #'forward-sentence-default-function
>> +  "Function to be used to calculate sentence movements.
>> +See `forward-sentence' for a description of its behavior.")
>> +
>> +(defun forward-sentence (&optional arg)
>> +  "Move forward to next end of sentence. With argument, repeat.
>                                             ^^^^^^^^^^^^^^^^^^^^^
> "With argument ARG, repeat." The doc string should reference the
> arguments where possible.
>

Thanks, done.

>> +When ARG is negative, move backward repeatedly to start of sentence.
>    ^^^^
> "If", not "When".
>

Done

>> +(defvar-local treesit-sentence-type-regexp ""
>> +  "A regexp that matches the node type of sentence nodes.
>
> Why is the default an empty regexp? wouldn't nil be better?

Indeed it will, done.

How about this?

Theo

--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment;
 filename=0001-Add-forward-sentence-with-tree-sitter-support-bug-60.patch

From 05b880680f70825d56b624b55f39b9114ba22c8d Mon Sep 17 00:00:00 2001
From: Theodor Thornhill
Date: Sun, 8 Jan 2023 20:28:02 +0100
Subject: [PATCH] Add forward-sentence with tree sitter support (bug#60623)

* etc/NEWS: Mention the new changes.
* lisp/textmodes/paragraphs.el (forward-sentence-default-function):
Move old implementation to its own function.
(forward-sentence-function): New defvar defaulting to old behavior.
(forward-sentence): Use the variable in this function unconditionally.
* lisp/treesit.el (treesit-sentence-type-regexp): New defvar.
(treesit-forward-sentence): New defun.
(treesit-major-mode-setup): Conditionally set
forward-sentence-function.
* doc/emacs/programs.texi (Defuns): Add new subsection.
(Moving by Sentences): Add some documentation with xrefs to the elisp
manual and related nodes.
* doc/lispref/positions.texi (List Motion): Mention
treesit-sentence-type-regexp and describe how to enable this
functionality.
---
 doc/emacs/programs.texi      | 56 ++++++++++++++++++++++++++++++++++++
 doc/emacs/text.texi          |  5 ++++
 doc/lispref/positions.texi   | 17 +++++++++++
 etc/NEWS                     | 18 ++++++++++++
 lisp/textmodes/paragraphs.el | 15 ++++++++--
 lisp/treesit.el              | 27 +++++++++++++++++
 6 files changed, 136 insertions(+), 2 deletions(-)

diff --git a/doc/emacs/programs.texi b/doc/emacs/programs.texi
index 44cad5a148e..a2cdf6c6eb9 100644
--- a/doc/emacs/programs.texi
+++ b/doc/emacs/programs.texi
@@ -163,6 +163,7 @@ Defuns
 * Left Margin Paren::   An open-paren or similar opening delimiter
                           starts a defun if it is at the left margin.
 * Moving by Defuns::    Commands to move over or mark a major definition.
+* Moving by Sentences:: Commands to move over certain code units.
 * Imenu::               Making buffer indexes as menus.
 * Which Function::      Which Function mode shows which function you are in.
 @end menu
@@ -254,6 +255,61 @@ Moving by Defuns
 language. Other major modes may replace any or all of these key
 bindings for that purpose.
 
+@node Moving by Sentences
+@subsection Moving by Sentences
+@cindex sentences, in programming languages
+
+  These commands move point or set up the region based on units of
+code, also called @dfn{sentences}. Even though sentences are usually
+considered when writing human languages, Emacs can use the same
+commands to move over certain constructs in programming languages
+(@pxref{Sentences}, @pxref{Moving by Defuns}). In a programming
+language a sentence is usually a complete language construct smaller
+than defuns, but larger than sexps (@pxref{List Motion,,, elisp, The
+Emacs Lisp Reference Manual}). What exactly a sentence is in a
+non-human language is dependent on the target language, but usually it
+is complete statements, such as a variable definition and
+initialization, or a conditional statement. An example of a sentence
+in the C language could be
+
+@example
+int x = 5;
+@end example
+
+or in the JavaScript language it could look like
+
+@example
+const thing = () => console.log("Hi");
+
+const foo = [1] == '1'
+  ? "No way"
+  : "...";
+@end example
+
+@table @kbd
+@item M-a
+Move to beginning of current or preceding sentence
+(@code{backward-sentence}).
+@item M-e
+Move to end of current or following sentence (@code{forward-sentence}).
+@end table
+
+@cindex move to beginning or end of sentence
+@cindex sentence, move to beginning or end
+@kindex M-a @r{(programming modes)}
+@kindex M-e @r{(programming modes)}
+@findex backward-sentence @r{(programming modes)}
+@findex forward-sentence @r{(programming modes)}
+  The commands to move to the beginning and end of the current
+sentence are @kbd{M-a} (@code{backward-sentence}) and @kbd{M-e}
+(@code{forward-sentence}). If you repeat one of these commands, or
+use a positive numeric argument, each repetition moves to the next
+sentence in the direction of motion.
+
+  @kbd{M-a} with a negative argument @minus{}@var{n} moves forward
+@var{n} times to the next end of a sentence. Likewise, @kbd{M-e} with
+a negative argument moves back to a start of a sentence.
+
 @node Imenu
 @subsection Imenu
 @cindex index of buffer definitions
diff --git a/doc/emacs/text.texi b/doc/emacs/text.texi
index 8fbf731a4f7..acd3bb21c29 100644
--- a/doc/emacs/text.texi
+++ b/doc/emacs/text.texi
@@ -253,6 +253,11 @@ Sentences
 of a sentence. Set the variable @code{sentence-end-without-period} to
 @code{t} in such cases.
 
+  Even though the above mentioned sentence movement commands are based
+on human languages, other Emacs modes can set these commands to get
+similar functionality (@pxref{Moving by Sentences,,, emacs, The
+extensible self-documenting text editor}).
+
 @node Paragraphs
 @section Paragraphs
 @cindex paragraphs
diff --git a/doc/lispref/positions.texi b/doc/lispref/positions.texi
index f3824436246..8d95ecee7ab 100644
--- a/doc/lispref/positions.texi
+++ b/doc/lispref/positions.texi
@@ -858,6 +858,23 @@ List Motion
 recognize nested defuns.
 @end defvar
 
+@defvar treesit-sentence-type-regexp
+The value of this variable is a regexp matching the node type of sentence
+nodes. (For ``node'' and ``node type'', @pxref{Parsing Program Source}.)
+@end defvar
+
+@findex treesit-forward-sentence
+@findex forward-sentence
+@findex backward-sentence
+If Emacs is compiled with tree-sitter, it can use the tree-sitter
+parser information to move across syntax constructs. Since what
+exactly is considered a sentence varies between languages, a major
+mode should set @code{treesit-sentence-type-regexp} to determine that.
+Then the mode can get navigation-by-sentence functionality for free,
+by using @code{forward-sentence} and
+@code{backward-sentence} (@pxref{Moving by Sentences,,, emacs, The
+extensible self-documenting text editor}).
+
 @node Skipping Characters
 @subsection Skipping Characters
 @cindex skipping characters
diff --git a/etc/NEWS b/etc/NEWS
index 3aa8f2abb77..0c782eeaee8 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -66,6 +66,24 @@
 treesit.el now unconditionally sets 'transpose-sexps-function' for all
 Tree-sitter modes. This functionality utilizes the new
 'transpose-sexps-function'.
+** Commands and variables to move by program statements
+
+*** New variable 'forward-sentence-function'.
+Major modes now can set this variable to customize the behavior of the
+'forward-sentence' function.
+
+*** New function 'forward-sentence-default-function'.
+The previous implementation of 'forward-sentence' is moved into its
+own function, to be bound by 'forward-sentence-function'.
+
+*** New defvar-local 'treesit-sentence-type-regexp'.
+Similarly to 'treesit-defun-type-regexp', this variable is used to
+navigate sentences in Tree-sitter enabled modes.
+
+*** New function 'treesit-forward-sentence'.
+treesit.el now conditionally sets 'forward-sentence-function' for all
+Tree-sitter modes that set 'treesit-sentence-type-regexp'.
+
 
 * Changes in Specialized Modes and Packages in Emacs 30.1
 ---
diff --git a/lisp/textmodes/paragraphs.el b/lisp/textmodes/paragraphs.el
index 73abb155aaa..bf249fdcdfb 100644
--- a/lisp/textmodes/paragraphs.el
+++ b/lisp/textmodes/paragraphs.el
@@ -441,13 +441,12 @@ end-of-paragraph-text
       (if (< (point) (point-max))
           (end-of-paragraph-text))))))
 
-(defun forward-sentence (&optional arg)
+(defun forward-sentence-default-function (&optional arg)
   "Move forward to next end of sentence. With argument, repeat.
 When ARG is negative, move backward repeatedly to start of sentence.
 
 The variable `sentence-end' is a regular expression that matches ends of
 sentences. Also, every paragraph boundary terminates sentences as well."
-  (interactive "^p")
   (or arg (setq arg 1))
   (let ((opoint (point))
         (sentence-end (sentence-end)))
@@ -480,6 +479,18 @@ forward-sentence
     (let ((npoint (constrain-to-field nil opoint t)))
       (not (= npoint opoint)))))
 
+(defvar forward-sentence-function #'forward-sentence-default-function
+  "Function to be used to calculate sentence movements.
+See `forward-sentence' for a description of its behavior.")
+
+(defun forward-sentence (&optional arg)
+  "Move forward to next end of sentence. With argument ARG, repeat.
+If ARG is negative, move backward repeatedly to start of
+sentence. Delegates its work to `forward-sentence-function'."
+  (interactive "^p")
+  (or arg (setq arg 1))
+  (funcall forward-sentence-function arg))
+
 (defun count-sentences (start end)
   "Count sentences in current buffer from START to END."
   (let ((sentences 0)
diff --git a/lisp/treesit.el b/lisp/treesit.el
index a7f453a8899..4c01a8db281 100644
--- a/lisp/treesit.el
+++ b/lisp/treesit.el
@@ -1792,6 +1792,31 @@ treesit-text-type-regexp
 \"text_block\" in the case of a string. This is used by
 `prog-fill-reindent-defun' and friends.")
 
+(defvar-local treesit-sentence-type-regexp nil
+  "A regexp that matches the node type of sentence nodes.
+
+A sentence node is a node that is bigger than a sexp, and
+delimits larger statements in the source code. It is, however,
+smaller in scope than defuns. This is used by
+`treesit-forward-sentence' and friends.")
+
+(defun treesit-forward-sentence (&optional arg)
+  "Tree-sitter `forward-sentence-function' function.
+
+ARG is the same as in `forward-sentence'.
+
+If inside comment or other nodes described in
+`treesit-text-type-regexp', use
+`forward-sentence-default-function', else move across nodes as
+described by `treesit-sentence-type-regexp'."
+  (if (string-match-p
+       treesit-text-type-regexp
+       (treesit-node-type (treesit-node-at (point))))
+      (funcall #'forward-sentence-default-function arg)
+    (funcall
+     (if (> arg 0) #'treesit-end-of-thing #'treesit-beginning-of-thing)
+     treesit-sentence-type-regexp (abs arg))))
+
 (defun treesit-default-defun-skipper ()
   "Skips spaces after navigating a defun.
 This function tries to move to the beginning of a line, either by
@@ -2256,6 +2281,8 @@ treesit-major-mode-setup
                 #'treesit-add-log-current-defun))
 
   (setq-local transpose-sexps-function #'treesit-transpose-sexps)
+  (when treesit-sentence-type-regexp
+    (setq-local forward-sentence-function #'treesit-forward-sentence))
 
   ;; Imenu.
   (when treesit-simple-imenu-settings
-- 
2.34.1

--=-=-=--
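
For illustration, here is a minimal sketch of how a tree-sitter major mode might opt in to the sentence movement described in the patch. It assumes the patch is applied as posted, so that `treesit-sentence-type-regexp' and `treesit-forward-sentence' exist. The mode name `my-lang-ts-mode', the grammar symbol `my-lang', and the node type names are hypothetical placeholders, not taken from any real mode.

```elisp
;;; Sketch only: assumes the patch above is applied, so that
;;; `treesit-sentence-type-regexp' and `treesit-forward-sentence' exist.
(require 'treesit)

;; `my-lang-ts-mode', the `my-lang' grammar, and the node type names
;; below are hypothetical placeholders.
(define-derived-mode my-lang-ts-mode prog-mode "MyLang"
  "Hypothetical tree-sitter mode that opts in to sentence movement."
  (when (treesit-ready-p 'my-lang)
    (treesit-parser-create 'my-lang)
    ;; Node types that should count as sentences in this language.
    (setq-local treesit-sentence-type-regexp
                (regexp-opt '("expression_statement"
                              "if_statement"
                              "lexical_declaration")))
    ;; With the patch, this sets `forward-sentence-function' to
    ;; `treesit-forward-sentence' because the regexp is non-nil.
    (treesit-major-mode-setup)))
```

With a setup along these lines, M-e and M-a in such a mode would be expected to move across the matched nodes, and to fall back to prose sentences when point is inside a node matched by `treesit-text-type-regexp', such as a comment.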