From: Thorsten Jolitz
Newsgroups: gmane.emacs.help
Subject: Re: Low level trickery for changing character syntax?
Date: Wed, 09 Apr 2014 09:44:39 +0200
Message-ID: <87d2grhrq0.fsf@gmail.com>
References: <87lhvfzrgt.fsf@gmail.com> <5344E1A7.50900@easy-emacs.de>
To: help-gnu-emacs@gnu.org

Andreas Röhler writes:

> On 08.04.2014 19:00, Thorsten Jolitz wrote:
>>
>> Hi List,
>>
>> assume an imaginary elisp library gro.el I cannot (or don't want to)
>> change that is used on files of type A, with functions matching these
>> kinds of strings:
>>
>> #+begin_src emacs-lisp
>>   (defconst rgxp-1 "^[*] [*]Fat[*]$")
>>
>>   (defun foo (strg)
>>     (and (string-match "^\\*+[ \t]* \\*.+\\*" strg)
>>          (string-match rgxp-1 strg)))
>> #+end_src
>>
>> #+results:
>> : foo
>>
>> #+begin_src emacs-lisp
>>   (foo "* *Fat*")
>> #+end_src
>>
>> #+results:
>> : 0
>>
>> #+begin_src emacs-lisp
>>   (foo "+ *Fat*")
>> #+end_src
>>
>> #+results:
>>
>> Now assume I want to use gro.el functionality on files of type B
>> such that it matches strings like this:
>>
>> #+begin_src emacs-lisp
>>   (foo "// # *Fat*//")
>> #+end_src
>>
>> In short, when called from file.type-B, I want foo to match
>> "// # *Fat*//", while it should only match "* *Fat*" when called
>> from file.type-A (without changing foo or rgxp-1).
>>
>> Thus in rgxp-1 and in foo, "^" would need to be replaced with "^// ",
>> the first "*" would need to be replaced with "#" (the other occurrences
>> not), and "$" would need to be replaced with "//$".
>>
>> Now I wonder what would be the best way (or at least a possible way) to
>> achieve this with Emacs low-level trickery (almost) without touching
>> gro.el. I don't know enough about syntax table low-level stuff besides
>> reading the manual, so these are only vague speculations:
>>
>> 1. Change the syntax-table of gro.el whenever it is applied to files of
>> type B such that "^" is seen as "^// ", "*" as "#" etc.?
>>
>> 2. Define new categories and put "^", "*" and "$" in them, and somehow
>> load/activate these categories conditional on the type of file gro.el
>> functionality is called upon. These categories should then achieve that
>> "^" is seen as "^// " etc. when the categories are loaded?
>>
>> 3. Define "^" and "$", when found at beg/end of a string, as 'generic
>> comment delimiter, and define "/" as generic comment delimiter too, such
>> that "^//" and "//$" are matched by "^" and "$"?
>>
>> I know that these ideas do not and cannot work as described, but I'm
>> looking for a hint which idea could possibly work? What would be the way
>> to go?
>>
>> Or is this completely unrealistic and the only way to achieve it is to
>> change the hardcoded regexps in (imaginary) library gro.el?
>>
>
> You could define different syntax-tables and then call functions
>
> if type-A
> (with-syntax-table type-A ...

That looks like a promising approach, but I have never worked with
syntax tables, so I ask myself:

Is it possible to redefine the characters "^", "$" and "*" in a syntax
table in such a way that the same hardcoded regexp, e.g.

,------------------
| "^[*] [*]Fat[*]$"
`------------------

matches "* *Fat*" when called (with-syntax-table type-A ...), but
matches e.g. "// # *Fat*//" when called (with-syntax-table type-B ...)?

* First approach (from the elisp manual)

,---------------------------------------------------------------------
| A syntax descriptor is a Lisp string that describes the syntax class
| and other syntactic properties of a character. When you want to
| modify the syntax of a character, that is done by calling the
| function modify-syntax-entry and passing a syntax descriptor as one
| of its arguments (see Syntax Table Functions).
|
| The first character in a syntax descriptor must be a syntax class
| designator character. The second character, if present, specifies a
| matching character (e.g., in Lisp, the matching character for '(' is
| ')'); a space specifies that there is no matching character. Then
| come characters specifying additional syntax properties (see Syntax
| Flags).
|
| If no matching character or flags are needed, only one character
| (specifying the syntax class) is sufficient.
|
| For example, the syntax descriptor for the character '*' in C mode
| is '. 23' (i.e., punctuation, matching character slot unused, second
| character of a comment-starter, first character of a comment-ender),
| and the entry for '/' is '. 14' (i.e., punctuation, matching
| character slot unused, first character of a comment-starter, second
| character of a comment-ender).
`---------------------------------------------------------------------

From this quote I can see how to give e.g. "^" a different syntax class,
maybe make it a comment-starter, but I cannot see how to make it match
the combination of itself, two comment-starters and a space if and only
if it directly follows the opening \" of the regexp string, i.e. how to
make

,------
| (looking-at "^")
`------

match e.g.

,-------
| "// "
`-------

at the beginning of a line when called (with-syntax-table type-B ...)?
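
One rough way to probe the first approach is to run the hardcoded regexp
under a modified syntax table and see whether anything changes. The
sketch below is only an illustration: the function name
my/probe-syntax-table and the choice of giving "*" word syntax are
assumptions, and nothing here comes from gro.el. As far as I understand
the regexp engine, "^", "$" and "[*]" are interpreted by the regexp
syntax itself, and the syntax table is consulted only by syntax-aware
constructs such as \w or \s-.

```emacs-lisp
;; Hypothetical probe, not part of gro.el: does a syntax table change
;; what a hardcoded regexp matches?
(defun my/probe-syntax-table ()
  "Return match results for three regexps under a modified syntax table."
  (let ((table (make-syntax-table))      ; inherits from the standard table
        (rgxp "^[*] [*]Fat[*]$"))
    ;; Assumption for illustration: treat "*" as a word constituent.
    (modify-syntax-entry ?* "w" table)
    (with-syntax-table table
      (list
       ;; "^", "[*]" and "$" keep their regexp meaning regardless of TABLE.
       (string-match rgxp "* *Fat*")
       (string-match rgxp "// # *Fat*//")
       ;; Only syntax-aware constructs such as \w consult TABLE.
       (string-match "\\`\\w+\\'" "***")))))

(my/probe-syntax-table)   ; => (0 nil 0)
```

If this reading is right, with-syntax-table alone cannot make the
hardcoded regexp match the type-B string; it would only help where the
library's regexps were written in terms of syntax classes.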

* Second approach (from the elisp manual)

,---------------------------------------------------------------------
| When the syntax table is not flexible enough to specify the syntax
| of a language, you can override the syntax table for specific
| character occurrences in the buffer, by applying a syntax-table text
| property. See Text Properties, for how to apply text properties.
`---------------------------------------------------------------------

where I find:

,-------------------------------------------------------------------
| Properties with Special Meanings
|
| Here is a table of text property names that have special built-in
| meanings.
|
| syntax-table
|     The syntax-table property overrides what the syntax table says
|     about this particular character. See Syntax Properties.
`-------------------------------------------------------------------

So I could assign "^" some special value for its special text property
'syntax-table, but without an example of how to achieve my goal this way
I'm a bit lost here.

* Third approach (from the elisp manual)

,--------------------------------------------------------------------
| Categories
|
| Categories provide an alternate way of classifying characters
| syntactically. You can define several categories as needed, then
| independently assign each character to one or more categories.
| Unlike syntax classes, categories are not mutually exclusive; it is
| normal for one character to belong to several categories.
`--------------------------------------------------------------------

Category tables are buffer-local like syntax tables, which is useful in
my case. Say I define category "B" in a buffer-local category table in
buffers of type-B files. But what then? Would I have to put "^", "/" (or
more generally 'comment-start') and " " in that category, such that a
single

,------
| (looking-at "^")
`------

matches

,-------
| "// "
`-------

when called from a buffer with buffer-local category "B"? I cannot see
how this should work.
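
To make the third approach concrete, here is a minimal sketch of what a
buffer-local category can and cannot do. The function name
my/probe-category-table and the docstring of the category are made up
for illustration, and the category character is picked with
get-unused-category rather than hardcoding "B", since I have not checked
which characters the standard category table already uses. The point it
tries to show: a regexp sees a category only if it asks for it
explicitly with \cX, so a literal regexp is not affected.

```emacs-lisp
;; Hypothetical probe, not part of gro.el: a buffer-local category that
;; contains "/" is visible to \c, but not to a literal regexp.
(defun my/probe-category-table ()
  "Return results of matching at the start of a type-B style line."
  (with-temp-buffer
    (let* ((table (copy-category-table))        ; copy of the standard table
           (cat (get-unused-category table)))   ; a free category character
      (define-category cat "Comment marker in type-B files (made up)." table)
      (modify-category-entry ?/ cat table)
      (set-category-table table)                ; local to this temp buffer
      (insert "// # *Fat*//")
      (goto-char (point-min))
      (list
       ;; The regexp has to request the category explicitly via \cX.
       (and (looking-at (format "\\c%c+" cat)) (match-string 0))
       ;; The hardcoded regexp is unaffected by the category table.
       (looking-at "^[*] [*]Fat[*]$")))))

(my/probe-category-table)   ; => ("//" nil)
```

So categories, like syntax classes, seem to help only when the regexps
themselves are written against them, which is not the case for the
hardcoded regexps assumed here.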

--
cheers,
Thorsten