From: Dmitry Gutov
Newsgroups: gmane.emacs.devel
Subject: Re: Writing syntax-propertize-function for strings in code in strings, etc
Date: Sun, 09 Sep 2012 04:13:24 +0400
Message-ID: <504BDF24.2080008@yandex.ru>
References: <87a9x1jiyu.fsf@yandex.ru>
To: Stefan Monnier
Cc: emacs-devel@gnu.org

On 08.09.2012 23:31, Stefan Monnier wrote:
>> Sublime Text handles these aspects rather excellently, and even
>> highlights the code inside as code, not string contents:
>> http://i.imgur.com/NH1Ye.png
>> Is there a proper way to do so in Emacs?
>
> Currently, it's pretty difficult for Emacs to handle it like in the
> picture above.
>
>> My first idea was, when propertizing interpolation, to see what kind of
>> string we're inside, and apply the appropriate syntax to the enclosing
>> braces, thus splitting the literal in two. But (a) string quotes class
>> doesn't work that way (text characters on both ends of a literal must
>> be the same), (b) if we're inside a percent literal (syntax class:
>> generic string), and the literal spans several lines, we need to be able
>> to jump to its real beginning position from its end, but with this
>> approach (nth 8 (syntax-ppss)) will just return the beginning of the
>> last piece. Saving buffer positions to text properties looks not very
>> reliable, since the respective text may be deleted and re-inserted.
>
>> Suggestions?
>
> I think the better approach is to extend syntax.c with such a notion of
> "syntax within strings". This could hopefully be used for:
> - Strings within strings (e.g. Postscript nested strings).
> - Comments within strings (I think some regexps allow comments).
> - Code within strings (as here and in shell scripts).
> I'm not sure what that would look like concretely. Maybe a new string
> quote syntax which specifies a syntax-table to use within the string?

In the current case, the syntactic meanings of characters are the same
as outside the string, except a certain character should end the "inner"
region and return the state after it to "inside string" (*).

Maybe just two new classes, similar to open and close parenthesis (to
support nesting)?

* Preferably, only when it's not inside an "inner" string or comment. At
least, that's how it works in Ruby 1.9:

   irb(main):011:0> %(#{"})"})
   => "})"
   irb(main):013:0> %(#{#})
   irb(main):014:0> })
   => ""

The above examples also won't work with current percent literals
handling, but that's less important, I think.

parse-partial-sexp will probably need to keep some sort of stack for
string-related data, so that when we're after the end of an "inner"
region, we could find out what is the outer string's type and where it
started. And when inside the inner region, the position of its start.
Use the 9th state element and bump the total number to 10?
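
To make the stack idea a bit more concrete, here is a toy scanner in Ruby. It is only a sketch of the bookkeeping, not Emacs code and not a real Ruby lexer: `Frame` and `context_stack` are made-up names, only `"..."`, `%(...)`, `#{...}` and `#` comments are recognized, and nested parentheses inside `%(...)` are ignored. Each frame records what kind of region we're in and where it started, which is roughly what the extra state element would have to carry.

```ruby
# Toy model of the proposed stack; all names here are hypothetical.
Frame = Struct.new(:kind, :start, :closer, :depth)

# Returns the stack of open contexts after scanning src[0...stop].
def context_stack(src, stop = src.length)
  stack = []
  i = 0
  while i < stop
    ch = src[i]
    top = stack.last
    case top&.kind
    when :string
      if ch == "\\"
        i += 1                          # skip the escaped character
      elsif ch == "#" && src[i + 1] == "{"
        stack.push(Frame.new(:interp, i, "}", 0))
        i += 1                          # consume the "{" as well
      elsif ch == top.closer
        stack.pop                       # back to the enclosing context
      end
    when :comment
      stack.pop if ch == "\n"           # a comment ends at end of line
    else                                # top level, or inside #{...}
      if ch == '"'
        stack.push(Frame.new(:string, i, '"', 0))
      elsif ch == "%" && src[i + 1] == "("
        stack.push(Frame.new(:string, i, ")", 0))
        i += 1                          # consume the "(" as well
      elsif ch == "#"
        stack.push(Frame.new(:comment, i, "\n", 0))
      elsif top && ch == "{"
        top.depth += 1                  # nested braces in the inner code
      elsif top && ch == "}"
        if top.depth.zero?
          stack.pop                     # end of the "inner" region
        else
          top.depth -= 1
        end
      end
    end
    i += 1
  end
  stack
end

src = '%(#{"})"})'
p context_stack(src, 6).map { |f| [f.kind, f.start] }
# => [[:string, 0], [:interp, 2], [:string, 4]]
```

Stopping just before the final `)` leaves a single `:string` frame starting at 0, so the real beginning of the outer literal is still reachable after the inner region has ended. The `}` and `)` inside the inner string, or after the `#` comment, don't close anything, matching the irb output above.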