From: Erik Charlebois
Newsgroups: gmane.emacs.devel
Subject: multi-character syntactic entities in syntax tables
Date: Fri, 26 Apr 2013 13:28:42 -0400
To: emacs-devel@gnu.org

One of the items in etc/TODO is:

** Beefed-up syntax-tables.
*** recognize multi-character syntactic entities like `begin' and `end'.

Lately I've been using languages where this would be quite useful, and I would be interested in adding support. Before I dive in, are there any strong opinions about how this should be implemented?

The approach I was thinking of taking is to define a new syntax class (let's say, *) that inherits from the previous character (recursively, if the previous character is also *). The important distinction is that a * character would not be treated as a new instance of the inherited syntax class, so point movement by syntax class and paren matching would still work (e.g. begin would be (****, and would add only 1 level of paren nesting).

A mode would use a syntax-propertize-function to tag keywords with the appropriate text properties. So something like this Ruby:

class Foo
  def Bar
    if condition
      ...
    end
  end
end

would have syntax classes like:

(**** www
  (** www
    (* wwwwwwwww
      ...
    )**
  )**
)**
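
To make the intended semantics concrete, here is a minimal Python simulation of the idea. It is not Emacs code and not a proposed implementation. The names tag, entities, nesting, OPENERS and CLOSERS are hypothetical, and the keyword sets are an assumption chosen only to cover the Ruby example above. The sketch tags keywords the way a syntax-propertize-function might, then folds each run of * into the entity that precedes it, so that a keyword counts as a single paren.

```python
import re

# Assumed keyword sets, just enough for the Ruby example in the post.
OPENERS = {"class", "def", "if", "begin"}
CLOSERS = {"end"}


def tag(source):
    """Return a syntax-class string for `source`, one class per character.

    Block keywords become '(' or ')' followed by '*' continuation marks,
    other identifiers become 'w', and everything else is left unchanged.
    """
    def classify(match):
        word = match.group(0)
        if word in OPENERS:
            return "(" + "*" * (len(word) - 1)
        if word in CLOSERS:
            return ")" + "*" * (len(word) - 1)
        return "w" * len(word)

    return re.sub(r"[A-Za-z_]+", classify, source)


def entities(classes):
    """Group a syntax-class string into (class, length) entities.

    A '*' extends the entity begun by the previous character instead of
    starting a new one, so '(****' is one open-paren entity of length 5.
    """
    result = []
    for ch in classes:
        if ch == "*":
            if not result:
                raise ValueError("'*' has no previous character to inherit from")
            cls, length = result[-1]
            result[-1] = (cls, length + 1)
        else:
            result.append((ch, 1))
    return result


def nesting(classes):
    """Return (maximum depth, final depth), counting each entity once."""
    depth = 0
    deepest = 0
    for cls, _length in entities(classes):
        if cls == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif cls == ")":
            depth -= 1
    return deepest, depth


RUBY_SOURCE = (
    "class Foo\n"
    "  def Bar\n"
    "    if condition\n"
    "      ...\n"
    "    end\n"
    "  end\n"
    "end"
)

if __name__ == "__main__":
    print(tag(RUBY_SOURCE))
    print(nesting(tag(RUBY_SOURCE)))
```

Here begin tags as (**** and contributes exactly one level of nesting, which is the property the proposal relies on. A real mode would need more context than a keyword list (for example, Ruby's trailing "x if y" modifier does not open a block), which this sketch deliberately ignores.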