From: Dmitry Gutov
Newsgroups: gmane.emacs.devel
Subject: Re: Writing syntax-propertize-function for strings in code in strings, etc
Date: Fri, 26 Oct 2012 23:18:10 +0400
Message-ID: <508AE1F2.4000408@yandex.ru>
References: <87a9x1jiyu.fsf@yandex.ru> <504FE870.7070002@yandex.ru>
To: Stefan Monnier, emacs-devel

On 26.10.2012 20:19, Stefan Monnier wrote:
>>> I think the better approach is to extend syntax.c with such a notion of
>>> "syntax within strings". This could hopefully be used for:
>>> - Strings within strings (e.g. Postscript nested strings).
>>> - Comments within strings (I think some regexps allow comments).
>>> - Code within strings (as here and in shell scripts).
>>> I'm not sure what that would look like concretely. Maybe a new string
>>> quote syntax which specifies a syntax-table to use within the string?
>> In the current case, the syntactic meanings of characters are the same as
>> outside the string, except a certain character should end the "inner" region
>> and return the state after it to "inside string" (*).
>
> Right, that's the "code within string" case, where you just need one
> char to mean "pop last state".

Or that last character's text would just be assigned a class from the
syntax-propertize-function, no different syntax table required. Not sure
how useful the first option would be.

>> Maybe just two new classes, similar to open and close parenthesis (to
>> support nesting)?
>
> Yes, one "push " and one "pop".

So, I don't see the usefulness of the value in the simple case of
embedding code in the same language.

Unless we're doing something like the "multiple-modes" use case, which we
discussed in another thread. This looks like a more general solution.

> Of course, this is fine for parse-partial-sexp, but it's a different
> matter for backward-sexp, where the "pop" would also need to know the
> .

Maybe in the latter case the scanning function, when encountering the
"pop" syntax property, would just skip ahead until it finds the
corresponding "push"? Unless we want to support intersecting subregions,
like ([{])}.

>> parse-partial-sexp will probably need to keep some sort of stack for
>> string-related data, so that when we're after the end of an "inner" region,
>> we could find out what is the outer string's type and where it started.
>> And when inside the inner region, the position of its start.
>> Use the 9th state element and bump the total number to 10?
>
> The total number is already 10. And yes, I think we can use the 9th
> element. Currently, the 9th element is a stack of open-paren positions.
> So, I think we can reuse it (presumably we'd want parens and nested
> strings to be "mutually properly nested").

--Dmitry
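
To make the push/pop idea discussed above a bit more concrete, here is a
rough sketch in Python (not C or Elisp, purely for brevity). Everything in
it is hypothetical: the names scan and matching_push, the props dict that
stands in for text properties a syntax-propertize-function might apply,
and the "push"/"pop" class names. None of this is existing Emacs API. The
forward scan keeps a stack so that after leaving an inner region it can
restore the outer string's quote character and start position; the
backward helper jumps from a "pop" to its matching "push", assuming the
regions are properly nested.

```python
def scan(text, props, end=None):
    """Forward scan of text[:end].

    props maps position -> "push" or "pop" (hypothetical syntax classes).
    Returns (quote, start, stack): the quote char and start position of the
    string we are currently inside (None, None if in code), plus a stack of
    (outer_quote, outer_start, push_pos) for each inner region still open.
    """
    stack = []
    quote, start = None, None
    for pos, ch in enumerate(text[:end]):
        cls = props.get(pos)
        if cls == "push" and quote is not None:
            # Entering code inside a string: remember the outer string.
            stack.append((quote, start, pos))
            quote, start = None, None
        elif cls == "pop" and quote is None and stack:
            # Leaving the inner region: back to "inside string".
            quote, start, _ = stack.pop()
        elif ch == '"':
            if quote is None:
                quote, start = ch, pos
            else:
                quote, start = None, None
    return quote, start, stack


def matching_push(props, pop_pos):
    """Scan backward from a "pop" to its "push" (proper nesting assumed)."""
    depth = 0
    for pos in range(pop_pos, -1, -1):
        cls = props.get(pos)
        if cls == "pop":
            depth += 1
        elif cls == "push":
            depth -= 1
            if depth == 0:
                return pos
    return None


# Example: a string containing code that itself contains a string.
text = '"a {b "c"} d"'
props = {3: "push", 9: "pop"}
print(scan(text, props, 8))      # state while inside the inner string
print(matching_push(props, 9))   # position of the matching "push"
```

Intersecting subregions such as ([{])} are not handled here; matching_push
only counts depth, which is exactly the "mutually properly nested"
assumption mentioned in the quoted text.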