From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#60623: 30.0.50; Add forward-sentence with tree sitter support
Date: Tue, 10 Jan 2023 17:07:14 +0200
Message-ID: <83358io2cd.fsf@gnu.org>
To: Theodor Thornhill
Cc: 60623@debbugs.gnu.org, juri@linkov.net, casouri@gmail.com, monnier@iro.umontreal.ca, mardani29@yahoo.es
In-Reply-To:
<87h6wyss3d.fsf@thornhill.no> (message from Theodor Thornhill on Tue, 10 Jan 2023 09:37:26 +0100)

> From: Theodor Thornhill
> Cc: mardani29@yahoo.es, 60623@debbugs.gnu.org, casouri@gmail.com,
>  monnier@iro.umontreal.ca, juri@linkov.net
> Date: Tue, 10 Jan 2023 09:37:26 +0100
>
> >> > No, because these are not really sentences in some human-readable
> >> > language, these are program parts. As such they should be somewhere
> >> > under "27 Programs", possibly in "Defuns".
> >> >
> >> > However, "Sentences" might mention that programming modes have their
> >> > own interpretation of "sentence" and corresponding movement commands.
> >>
> >> Yeah, that makes sense. Should I make an attempt at such formulations,
> >> or will you do it at a later time?
> >
> > It is better that you try, if only to gain experience ;-)
>
> How about this for starter?

Very good, thank you very much. A few comments below.

> --- a/doc/emacs/programs.texi
> +++ b/doc/emacs/programs.texi
> @@ -163,6 +163,7 @@ Defuns
>  * Left Margin Paren:: An open-paren or similar opening delimiter
>  starts a defun if it is at the left margin.
>  * Moving by Defuns:: Commands to move over or mark a major definition.
> +* Moving by Sentences:: Commands to move over certain definitions in code.
                                                         ^^^^^^^^^^^
I'd use "code units" or "units of code" here.

Also, should we perhaps name the section "Moving by Statements"? or
would it be too inaccurate?
> + These commands move point or set up the region based on definitions,
> +also called @dfn{sentences}. Even though sentences is usually

Each @dfn in a manual should have an index entry, so that readers
could easily find it. In this case, the index entry should qualify
the "sentences" term by the fact that we are talking about units of
code. So:

  @cindex sentences, in programming languages

> +considered when writing human languages, Emacs can use the same
> +commands to move over certain constructs in programming languages
> +(@pxref{Sentences}, @pxref{Moving by Defuns}). In a programming
> +language a sentence is usually a complete language construct smaller
> +than defuns, but larger than sexps (@pxref{List Motion,,, elisp, The
> +Emacs Lisp Reference Manual}).

A couple of examples from two different languages could be a great
help here. Otherwise this text sounds a bit too abstract.

> +@kindex M-a
> +@kindex M-e

Since we already have M-e elsewhere in the manual, I suggest to
qualify the key bindings here:

  @kindex M-a @r{(programming modes)}

and similarly for M-e. The @r{..} thingy is necessary to reset to the
default typeface, since key index is implicitly typeset in @code.

> +@findex backward-sentence
> +@findex forward-sentence

Likewise with these two @findex entries: qualify them, since we have
the same commands documented elsewhere under "Sentences".

> --- a/doc/emacs/text.texi
> +++ b/doc/emacs/text.texi
> @@ -253,6 +253,14 @@ Sentences
>  of a sentence. Set the variable @code{sentence-end-without-period} to
>  @code{t} in such cases.
>
> + Even though the above mentioned sentence movement commands are based
> +on human languages, other Emacs modes can set these command to get
> +similar functionality. What exactly a sentence is in a non-human
> +language is dependent on the target language, but usually it is
> +complete statements, such as a variable definition and initialization,
> +or a conditional statement (@pxref{Moving by Sentences,,, emacs, The
> +extensible self-documenting text editor}).

The last sentence should be in "Moving by Sentences", since it
describes the commands documented there. Also, please add a
cross-reference here to "Moving by Sentences", since you mention that
in the text (and rightfully so).

> +@defvar treesit-sentence-type-regexp
> +The value of this variable is a regexp matching the node type of sentence
> +nodes. (For ``node'' and ``node type'', @pxref{Parsing Program Source}.)
> +
> +@findex treesit-forward-sentence
> +@findex forward-sentence
> +@findex backward-sentence
> +If Emacs is compiled with tree-sitter, it can use the tree-sitter
> +parser information to move across syntax constructs. Since what
> +exactly is considered a sentence varies between languages, a major mode
> +should set @code{treesit-sentence-type-regexp} to determine that. Then
> +the mode can get navigation-by-sentence functionality for free, by using
> +@code{forward-sentence} and @code{backward-sentence}.

Here please also add a cross-reference to the "Moving by Sentences"
node in the Emacs manual, so that people could understand what kind of
"sentences" this is talking about.

> +** New defvar forward-sentence-function.
      ^^^^^^^^^^
"New variable"

> +Emacs now can set this variable to customize the behavior of the
> +'forward-sentence' function.

Not "Emacs", but "major modes".

> +** New defun forward-sentence-default-function.
      ^^^^^^^^^
"New function"

> +The previous implementation of 'forward-sentence' is moved into its
> +own function, to be bound by 'forward-sentence-function'.
> +
> +** New defvar-local 'treesit-sentence-type-regexp.
> +Similarly to 'treesit-defun-type-regexp', this variable is used to
> +navigate sentences in Tree-sitter enabled modes.
> +
> +** New function 'treesit-forward-sentence'.
> +treesit.el now conditionally sets 'forward-sentence-function' for all
> +Tree-sitter modes that sets 'treesit-sentence-type-regexp'.

Please make these related items sub-headings of a common heading,
something like "Commands and variables to move by program statements".

> +
>
>  * Changes in Specialized Modes and Packages in Emacs 30.1
> ---
> diff --git a/lisp/textmodes/paragraphs.el b/lisp/textmodes/paragraphs.el
> index 73abb155aaa..fd2d83eeebf 100644
> --- a/lisp/textmodes/paragraphs.el
> +++ b/lisp/textmodes/paragraphs.el
> @@ -441,13 +441,12 @@ end-of-paragraph-text
>            (if (< (point) (point-max))
>                (end-of-paragraph-text))))))
>
> -(defun forward-sentence (&optional arg)
> +(defun forward-sentence-default-function (&optional arg)
>    "Move forward to next end of sentence. With argument, repeat.
>  When ARG is negative, move backward repeatedly to start of sentence.
>
>  The variable `sentence-end' is a regular expression that matches ends of
>  sentences. Also, every paragraph boundary terminates sentences as well."
> -  (interactive "^p")
>    (or arg (setq arg 1))
>    (let ((opoint (point))
>          (sentence-end (sentence-end)))
> @@ -480,6 +479,18 @@ forward-sentence
>      (let ((npoint (constrain-to-field nil opoint t)))
>        (not (= npoint opoint)))))
>
> +(defvar forward-sentence-function #'forward-sentence-default-function
> +  "Function to be used to calculate sentence movements.
> +See `forward-sentence' for a description of its behavior.")
> +
> +(defun forward-sentence (&optional arg)
> +  "Move forward to next end of sentence. With argument, repeat.
                                            ^^^^^^^^^^^^^^^^^^^^^
"With argument ARG, repeat." The doc string should reference the
arguments where possible.

> +When ARG is negative, move backward repeatedly to start of sentence.
   ^^^^
"If", not "When".

> +(defvar-local treesit-sentence-type-regexp ""
> +  "A regexp that matches the node type of sentence nodes.

Why is the default an empty regexp? wouldn't nil be better?
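
To make the last question concrete, here is a minimal sketch (not part
of the patch) of why an empty regexp is an awkward "unset" value: it
matches every node type, whereas nil can be tested directly. The names
my-sentence-type-regexp and my-sentence-node-type-p are hypothetical,
chosen so as not to clash with the patch's
treesit-sentence-type-regexp.

```elisp
;;; -*- lexical-binding: t -*-

;; The empty regexp matches at position 0 of any string, so it cannot
;; be used to signal "this mode defines no sentence nodes".
(string-match-p "" "function_definition") ; 0, which is non-nil in Lisp

;; Hypothetical variable for illustration only; nil means "unset".
(defvar-local my-sentence-type-regexp nil
  "Regexp matching the node type of sentence nodes, or nil if unset.")

;; Hypothetical helper for illustration only.
(defun my-sentence-node-type-p (type)
  "Return non-nil if the string TYPE matches `my-sentence-type-regexp'.
Return nil when `my-sentence-type-regexp' is nil."
  (and my-sentence-type-regexp
       (string-match-p my-sentence-type-regexp type)))
```

With nil as the default, code that decides whether to install a
tree-sitter based sentence function can test the variable itself,
without comparing it against "".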