From: Thorsten Jolitz <tjolitz@gmail.com>
To: help-gnu-emacs@gnu.org
Subject: Re: Low level trickery for changing character syntax?
Date: Wed, 09 Apr 2014 09:44:39 +0200
Message-ID: <87d2grhrq0.fsf@gmail.com>
In-Reply-To: 5344E1A7.50900@easy-emacs.de

Andreas Röhler <andreas.roehler@easy-emacs.de> writes:
> On 08.04.2014 19:00, Thorsten Jolitz wrote:
>>
>> Hi List,
>>
>> assume an imaginary elisp library gro.el I cannot (or don't want to)
>> change that is used on files of type A, with functions matching these
>> kinds of strings:
>>
>> #+begin_src emacs-lisp
>> (defconst rgxp-1 "^[*] [*]Fat[*]$")
>>
>> (defun foo (strg)
>>   (and (string-match "^\\*+[ \t]* \\*.+\\*" strg)
>>        (string-match rgxp-1 strg)))
>> #+end_src
>>
>> #+results:
>> : foo
>>
>> #+begin_src emacs-lisp
>> (foo "* *Fat*")
>> #+end_src
>>
>> #+results:
>> : 0
>>
>> #+begin_src emacs-lisp
>> (foo "+ *Fat*")
>> #+end_src
>>
>> #+results:
>>
>> Now assume I want to use gro.el functionality on files of type B
>> such that it matches strings like this:
>>
>> #+begin_src emacs-lisp
>> (foo "// # *Fat*//" )
>> #+end_src
>>
>> In short, when called from file.type-B, I want foo to match "// #
>> *Fat*//", while it should only match "* *Fat*" when called from
>> file.type-A (without changing foo or rgxp-1).
>>
>> Thus in rgxp-1 and in foo, "^" would need to be replaced with "^// ",
>> the first "*" would need to be replaced with "#" (but not the other
>> occurrences), and "$" would need to be replaced with "//$".
>>
>> Now I wonder what would be the best way (or at least a possible way) to
>> achieve this with Emacs low-level trickery (almost) without touching
>> gro.el. I don't know enough about low-level syntax-table stuff beyond
>> what I read in the manual, so these are only vague speculations:
>>
>> 1. Change the syntax-table of gro.el whenever it is applied to files of
>> type B such that "^" is seen as "^// ", "*" as "#" etc.?
>>
>> 2. Define new categories and put "^" "*" and "$" in them, and somehow
>> load/activate these categories conditional on the type of file gro.el
>> functionality is called upon. These categories should then achieve that
>> "^" is seen as "^// " etc when the categories are loaded?
>>
>> 3. Define "^" and "$", when found at beg/end of a string, as 'generic
>> comment delimiter, and define "/" as generic comment delimiter too, such
>> that "^//" and "//$" are matched by "^" and "$"?
>>
>> I know that these ideas do not and cannot work as described, but I'm
>> looking for a hint which idea could possibly work? What would be the way
>> to go?
>>
>> Or is this completely unrealistic and the only way to achieve it is to
>> change the hardcoded regexps in (imaginary) library gro.el?
>>
>
> You could define different syntax-tables and then call functions
>
> if type-A
> (with-syntax-table type-A ...

That looks like a promising approach, but I have never worked with
syntax tables, so I ask myself:

Is it possible to redefine the characters "^", "$" and "*" in a syntax
table in such a way that the same hardcoded regexp, e.g.
,------------------
| "^[*] [*]Fat[*]$"
`------------------
matches "* *Fat*" when called (with-syntax-table type-A ...), but
matches e.g. "// # *Fat*//" when called (with-syntax-table type-B ...)?
* First approach
(from the elisp manual)
,---------------------------------------------------------------------
| A syntax descriptor is a Lisp string that describes the syntax class
| and other syntactic properties of a character. When you want to
| modify the syntax of a character, that is done by calling the
| function modify-syntax-entry and passing a syntax descriptor as one
| of its arguments (see Syntax Table Functions).
|
| The first character in a syntax descriptor must be a syntax class
| designator character. The second character, if present, specifies a
| matching character (e.g., in Lisp, the matching character for '(' is
| ')'); a space specifies that there is no matching character. Then
| come characters specifying additional syntax properties (see Syntax
| Flags).
|
| If no matching character or flags are needed, only one character
| (specifying the syntax class) is sufficient.
|
| For example, the syntax descriptor for the character '*' in C mode
| is ". 23" (i.e., punctuation, matching character slot unused, second
| character of a comment-starter, first character of a comment-ender),
| and the entry for '/' is '. 14' (i.e., punctuation, matching
| character slot unused, first character of a comment-starter, second
| character of a comment-ender).
`---------------------------------------------------------------------
I can see from this quote how to give e.g. "^" a different syntax class,
maybe make it a comment-starter, but I cannot see how to make it match
the combination of itself, two comment-starters and a space if and only
if it stands at the beginning of a string, i.e. how to make
,------
| (looking-at "^")
`------
match e.g.
,-------
| "// "
`-------
at the beginning of a line when called (with-syntax-table type-B ...)?
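
A minimal sketch for checking this, assuming that a syntax table only
changes how syntax-class constructs such as "\w" classify characters,
while the anchors "^" and "$" and a literal "[*]" are handled by the
regexp engine itself. The function name tj/syntax-table-probe and the
table type-b are made up for illustration:

```elisp
;; Hypothetical probe: does a custom syntax table change what the
;; hardcoded regexp matches?  All names here are illustrative only.
(defun tj/syntax-table-probe ()
  "Return match positions of three regexps under a modified syntax table."
  (let ((type-b (make-syntax-table))   ; inherits from the standard table
        (rgxp "^[*] [*]Fat[*]$"))
    ;; Make "*" a word constituent in the hypothetical type-B table.
    (modify-syntax-entry ?* "w" type-b)
    (with-syntax-table type-b
      (list
       ;; The anchors and the literal "[*]" ignore the syntax table.
       (string-match rgxp "* *Fat*")
       (string-match rgxp "// # *Fat*//")
       ;; A syntax-class construct does follow the table: "*" is now
       ;; matched by "\\w".
       (string-match "\\`\\w+\\'" "*Fat*")))))
```

Evaluating (tj/syntax-table-probe) should give (0 nil 0): the hardcoded
regexp behaves exactly as before, and only the "\w" test is affected by
the new entry for "*".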
* Second approach
(from the elisp manual)
,---------------------------------------------------------------------
| When the syntax table is not flexible enough to specify the syntax
| of a language, you can override the syntax table for specific
| character occurrences in the buffer, by applying a syntax-table text
| property. See Text Properties, for how to apply text properties.
`---------------------------------------------------------------------
where I find:
,-------------------------------------------------------------------
| Properties with Special Meanings
|
| Here is a table of text property names that have special built-in
| meanings.
|
| syntax-table
| The syntax-table property overrides what the syntax table says
| about this particular character. See Syntax Properties.
`-------------------------------------------------------------------
So I could assign "^" some special value for its text property
'syntax-table, but without an example of how to achieve my goal this
way I'm a bit lost here.
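
A minimal sketch of how such a property is applied, assuming a temporary
buffer that holds one type-B line. The function name
tj/syntax-property-probe is hypothetical. Note that the property is
attached to characters in the buffer, not to characters in a regexp, so
this only shows the mechanism and does not make "^" match "// ":

```elisp
;; Hypothetical probe: override the syntax class of a single character
;; occurrence through the `syntax-table' text property.
(defun tj/syntax-property-probe ()
  "Return the syntax classes of the first two characters of a sample line."
  (with-temp-buffer
    (insert "// # *Fat*//")
    ;; The property is only consulted when this variable is non-nil.
    (let ((parse-sexp-lookup-properties t))
      ;; Mark only the first "/" as a comment starter (class code 11).
      (put-text-property 1 2 'syntax-table (string-to-syntax "<"))
      (list (syntax-class (syntax-after 1))     ; overridden by the property
            (syntax-class (syntax-after 2)))))) ; still from the syntax table
```

The first element of the result is 11 (comment start), while the second
"/" keeps whatever class the buffer's syntax table gives it.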
* Third approach
(from the elisp manual)
,--------------------------------------------------------------------
| Categories
|
| Categories provide an alternate way of classifying characters
| syntactically. You can define several categories as needed, then
| independently assign each character to one or more categories.
| Unlike syntax classes, categories are not mutually exclusive; it is
| normal for one character to belong to several categories.
`--------------------------------------------------------------------
Category tables are buffer-local like syntax tables, which is useful in
my case. Say I define category-table "B" buffer-local in buffers of
type-B files. But what then? Would I have to put "^", "/" (or more
generally 'comment-start') and " " in that category, such that a single
,------
| (looking-at "^")
`------
matches
,-------
| "// "
`-------
when called from a buffer with buffer-local category "B"? I cannot see
how this should work.
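
A minimal sketch of what categories can do, assuming a buffer-local
copy of the standard category table. The function name
tj/category-probe is hypothetical, and the category character is taken
from get-unused-category rather than hardcoded as "B", since the
standard table already defines many letters. The regexp has to ask for
the category explicitly with "\cX", so a plain "^" is not affected:

```elisp
;; Hypothetical probe: put "/" and space into a fresh category and
;; match them with the "\\c" regexp construct.
(defun tj/category-probe ()
  "Return match data for a category-based prefix regexp."
  (with-temp-buffer
    (let* ((table (copy-category-table))       ; copy of the standard table
           (cat (get-unused-category table))   ; some free category char
           (prefix-rx nil))
      (set-category-table table)               ; local to this buffer
      (define-category cat "Hypothetical type-B prefix characters." table)
      (modify-category-entry ?/ cat table)
      (modify-category-entry ?\s cat table)
      (setq prefix-rx (format "\\`\\c%c+" cat))
      (list (string-match prefix-rx "// # *Fat*//") ; prefix "// " matches
            (match-end 0)                           ; end of that prefix
            (string-match prefix-rx "* *Fat*")))))  ; "*" is not in CAT
```

I would expect (tj/category-probe) to return (0 3 nil): the leading
"// " is matched as a run of category members, but only because the
regexp itself was written in terms of the category.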
--
cheers,
Thorsten