From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Stefan Monnier <monnier@IRO.UMontreal.CA>
Newsgroups: gmane.emacs.devel
Subject: Re: multi-character syntactic entities in syntax tables
Date: Fri, 26 Apr 2013 15:26:01 -0400
Message-ID: <jwvmwsl6ra1.fsf-monnier+emacs@gnu.org>
References: <CAC+abJbzNH7fNy=M3Spm6XxNTTqpU1VsCdmbZHV+yGHD+tMcbg@mail.gmail.com>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain
X-Trace: ger.gmane.org 1367004368 25884 80.91.229.3 (26 Apr 2013 19:26:08 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Fri, 26 Apr 2013 19:26:08 +0000 (UTC)
Cc: emacs-devel@gnu.org
To: Erik Charlebois <erikcharlebois@gmail.com>
In-Reply-To: <CAC+abJbzNH7fNy=M3Spm6XxNTTqpU1VsCdmbZHV+yGHD+tMcbg@mail.gmail.com>
	(Erik Charlebois's message of "Fri, 26 Apr 2013 13:28:42 -0400")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)
Xref: news.gmane.org gmane.emacs.devel:159172
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/159172>

> One of the items in etc/TODO is:
> ** Beefed-up syntax-tables.
> *** recognize multi-character syntactic entities like `begin' and `end'.

> Lately I'm using languages where this would be quite useful and would be
> interested in adding support. Before I dive in, are there any strong
> opinions about how this should be implemented?

> The approach I was thinking of taking is defining a new syntax character
> class (let's say, *) which inherits from the previous character
> (recursively if the previous character is *). The important distinction is
> that they would not be treated as a new instance of that syntax class, so
> point movement by syntax class or paren matching would work (e.g. begin
> would be (****, and would only add 1 level of paren nesting).

I see.  So you'd rely on syntax-propertize-function to recognize those
multi-char entities and label them with one of the current syntaxes for
the first char and "*" for the other ones, thus labelling the symbol as
forming a single entity.
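
To make that concrete, here is a rough sketch of the labelling step using
what exists today.  The mode name `my-block-mode' is made up for
illustration, and the "*" class is only the proposal under discussion (it
is not an existing syntax class), so it appears here as a comment rather
than as a rule.

```elisp
;; -*- lexical-binding: t -*-
;; Hypothetical mode, for illustration only.
(define-derived-mode my-block-mode prog-mode "MyBlock"
  "Toy mode that labels `begin' and `end' via `syntax-propertize-function'."
  ;; Make motion commands honour `syntax-table' text properties.
  (setq-local parse-sexp-lookup-properties t)
  (setq-local syntax-propertize-function
              (syntax-propertize-rules
               ;; First char of `begin' becomes an open paren matching "e".
               ("\\_<\\(b\\)egin\\_>" (1 "(e"))
               ;; First char of `end' becomes a close paren matching "b".
               ("\\_<\\(e\\)nd\\_>" (1 ")b")))))
;; Under the proposal, the remaining chars ("egin", "nd") would also be
;; labelled, with the new "*" class.  Here they keep their word syntax.
```

With only the first character labelled, the rest of each keyword is still
plain word syntax, which is the gap the "*" class is meant to fill.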

That's interesting.  The main drawback I see with it is the heavy
reliance on syntax-propertize, which can imply a significant performance
cost when jumping to the end of a largish buffer (forcing the whole
buffer to be lexed).

But it sounds like an attractive "easy" way to extend syntax tables to
support multi-char entities.

BTW: have you tried setting forward-sexp-function to something like
ruby-forward-sexp?
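
As a rough idea of what such a function could look like (this is not
ruby-forward-sexp itself; the `my-block-' names are made up, and the
sketch ignores keywords inside strings and comments):

```elisp
;; -*- lexical-binding: t -*-
(defun my-block--skip-block ()
  "Move point past the `end' that closes the next `begin'.
Hypothetical helper; it does not skip strings or comments."
  (let ((case-fold-search nil)
        (depth 0)
        (done nil))
    (while (and (not done)
                (re-search-forward "\\_<\\(begin\\|end\\)\\_>" nil t))
      ;; Track nesting: `begin' opens a level, `end' closes one.
      (setq depth (if (string= (match-string 1) "begin")
                      (1+ depth)
                    (1- depth)))
      (when (<= depth 0)
        (setq done t)))
    (unless done
      (user-error "Unbalanced begin/end"))))

(defun my-block-forward-sexp (&optional arg)
  "Move over a begin...end block when point is just before `begin'.
Otherwise, or for backward motion, use the default sexp motion."
  (let ((n (or arg 1)))
    (if (and (> n 0)
             (let ((case-fold-search nil))
               (looking-at "\\s-*\\_<begin\\_>")))
        (progn
          (my-block--skip-block)
          ;; Any remaining count is handled by the default motion.
          (when (> n 1)
            (let ((forward-sexp-function nil))
              (forward-sexp (1- n)))))
      (let ((forward-sexp-function nil))
        (forward-sexp n)))))

;; In the mode's setup:
;; (setq-local forward-sexp-function #'my-block-forward-sexp)
```

This only affects commands that go through forward-sexp, so it needs no
change to the syntax tables.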


        Stefan