From: Erik Charlebois
Newsgroups: gmane.emacs.devel
Subject: Re: multi-character syntactic entities in syntax tables
Date: Fri, 26 Apr 2013 15:22:22 -0400
To: Dmitry Gutov
Cc: emacs-devel@gnu.org
In-Reply-To: <87sj2d86o9.fsf@yandex.ru>
References: <87sj2d86o9.fsf@yandex.ru>

Off the top of my head, point motion (e.g. forward-word should skip the entire word, not stop where the syntax class changes from "(" to "w") and font locking (show-paren-mode should highlight the entire matching words).

Since the matching keyword lengths can be different (begin vs end), my understanding is I can't just turn them into ((((( and ))), because five openers against three closers don't balance.
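
To make the balancing point concrete, here is a minimal Python sketch (not Emacs code). It assumes a string of syntax-class codes in which "(" opens a level, ")" closes one, and "*" is the inherit class proposed in this thread, which does not exist in Emacs today. The function name `final_depth` is made up for illustration.

```python
def final_depth(classes: str) -> int:
    """Return the net paren depth of a string of syntax-class codes.

    '(' opens a level and ')' closes one. The hypothetical '*' class
    inherits from the previous character without counting as a new
    instance, so it never changes the depth.
    """
    depth = 0
    for code in classes:
        if code == "(":
            depth += 1
        elif code == ")":
            depth -= 1
        # '*', 'w', spaces and anything else leave the depth unchanged.
    return depth


# "begin ... end" with every keyword character marked as a paren:
# five openers, three closers.
naive = "(((((" + " www " + ")))"

# The same text with the proposed inherit class: one opener, one closer.
inherited = "(****" + " www " + ")**"

print(final_depth(naive))      # 2, unbalanced
print(final_depth(inherited))  # 0, balanced
```

With the inherit class, each keyword contributes exactly one level of nesting regardless of its length.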

Currently I have some hacks for Ruby mode that make the first character of each block keyword have "(" or ")" syntax class. It works fine, aside from point motion and font locking.
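
As a rough illustration of why the first-character hack trips up point motion, the Python sketch below splits a keyword into runs of identical syntax class, which is loosely how a scan that stops at a class change would see it. This is a simplified model, not how Emacs implements motion commands; `syntax_runs` is a made-up name, and "*" is again the hypothetical inherit class.

```python
def syntax_runs(text: str, classes: str) -> list[str]:
    """Split text into maximal runs of characters sharing a syntax class.

    A '*' code (hypothetical inherit class) takes the class of the
    previous character, so it extends the current run.
    """
    runs: list[str] = []
    prev = None
    for ch, code in zip(text, classes):
        if code == "*" and prev is not None:
            code = prev  # inherit from the previous character
        if runs and code == prev:
            runs[-1] += ch  # same class: extend the current run
        else:
            runs.append(ch)  # class changed: start a new run
        prev = code
    return runs


# Current hack: only the first character is a paren, the rest are word.
print(syntax_runs("begin", "(wwww"))  # ['b', 'egin']

# Proposed inherit class: the whole keyword is one syntactic unit.
print(syntax_runs("begin", "(****"))  # ['begin']
```

In the first case a class-based scan sees a boundary after "b"; in the second the keyword stays in one piece.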



On Fri, Apr 26, 2013 at 2:53 PM, Dmitry Gutov <dgutov@yandex.ru> wrote:
> Erik Charlebois <erikcharlebois@gmail.com> writes:
>
> > One of the items in etc/TODO is:
> >
> > ** Beefed-up syntax-tables.
> > *** recognize multi-character syntactic entities like `begin' and
> > `end'.
> >
> > Lately I'm using languages where this would be quite useful and would
> > be interested in adding support. Before I dive in, are there any
> > strong opinions about how this should be implemented?
> >
> > The approach I was thinking of taking is defining a new syntax
> > character class (let's say, *) which inherits from the previous
> > character (recursively if the previous character is *). The important
> > distinction is that they would not be treated as a new instance of
> > that syntax class, so point movement by syntax class or paren matching
> > would work (e.g. begin would be (****, and would only add 1 level of
> > paren nesting).
> >
> > A mode would use a syntax-propertize-function to tag keywords with
> > appropriate text properties. So something like Ruby:
> >
> > class Foo
> >   def Bar
> >     if condition
> >       ...
> >     end
> >   end
> > end
>
> ruby-mode code could definitely benefit from something like this.
>
> > would have syntax classes like:
> >
> > (**** www
> >   (** www
> >     (* wwwwwwwww
> >       ...
> >     )**
> >   )**
> > )**
>
> I don't think using syntax-propertize-function is something the person
> who wrote that TODO entry had in mind, but if we'll use it for that
> purpose, at least in ruby-mode implementing something like a "generic
> parenthesis" class should suffice (which would work similarly to generic
> string and generic comment delimiters), since all non-curly blocks in
> Ruby end the same way.
>
> So, what's the rationale for your, more complex proposal? In what
> context would treating e, g, i and n in "begin" as parenthesis openers
> will be useful?
