all messages for Emacs-related lists mirrored at yhetil.org
* multiline regex mode?
@ 2006-10-09 12:57 Giles Chamberlin
  2006-10-13 20:02 ` Dieter Wilhelm
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Giles Chamberlin @ 2006-10-09 12:57 UTC



Is it possible to have .* match over a new line?

I'd like to be able to match 

{foo}

and

{
   foo
}

with a regex of the form {.*} without having to specify the line
break.

Thanks
-- 
Giles

* Re: multiline regex mode?
  2006-10-09 12:57 multiline regex mode? Giles Chamberlin
@ 2006-10-13 20:02 ` Dieter Wilhelm
       [not found] ` <mailman.8131.1160774366.9609.help-gnu-emacs@gnu.org>
  2006-11-23 19:25 ` Stefan Monnier
  2 siblings, 0 replies; 19+ messages in thread
From: Dieter Wilhelm @ 2006-10-13 20:02 UTC


Giles Chamberlin <giles.chamberlin@tandberg.net> writes:

> Is it possible to have .* match over a new line?
>
> I'd like to be able to match 
>
> {foo}
>
> and
>
> {
>    foo
> }
>
> with a regex of the form {.*} without having to specify the line
> break.

Does this (between the "")

"{[^}]*}"

work for you?  

>
> Thanks

-- 
    Best wishes

    H. Dieter Wilhelm
    Darmstadt, Germany

* Re: multiline regex mode?
       [not found] ` <mailman.8131.1160774366.9609.help-gnu-emacs@gnu.org>
@ 2006-10-16 10:20   ` Giles Chamberlin
  2006-10-16 11:24     ` Michaël Cadilhac
       [not found]     ` <mailman.8206.1160997864.9609.help-gnu-emacs@gnu.org>
  0 siblings, 2 replies; 19+ messages in thread
From: Giles Chamberlin @ 2006-10-16 10:20 UTC


[snip] regex to match {foo} and {\nfoo\n}

Dieter Wilhelm <dieter@duenenhof-wilhelm.de> writes:

> Does this (between the "")
>
> "{[^}]*}"
>
> work for you?  

That works, but I don't understand why, given that {.*} fails.  Any
enlightenment gratefully received!

-- 
Giles

* Re: multiline regex mode?
  2006-10-16 10:20   ` Giles Chamberlin
@ 2006-10-16 11:24     ` Michaël Cadilhac
       [not found]     ` <mailman.8206.1160997864.9609.help-gnu-emacs@gnu.org>
  1 sibling, 0 replies; 19+ messages in thread
From: Michaël Cadilhac @ 2006-10-16 11:24 UTC
  Cc: help-gnu-emacs


Giles Chamberlin <giles.chamberlin@tandberg.net> writes:

> [snip] regex to match {foo} and {\nfoo\n}
>
> Dieter Wilhelm <dieter@duenenhof-wilhelm.de> writes:
>
>> Does this (between the "")
>>
>> "{[^}]*}"
>>
>> work for you?  
>
> That works, but I don't understand why, given that {.*} fails.  Any
> enlightenment gratefully received!

This is because of the following (from M-x info m elisp):

`.' (Period)
     is a special character that matches any single character except a
     newline.

`[^ ... ]'
     A complemented character alternative can match a newline, unless
     newline is mentioned as one of the characters not to match.  This
     is in contrast to the handling of regexps in programs such as
     `grep'.
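
The two manual entries above can be checked outside Emacs as well.  The
following is only a sketch in Python, whose `re` module is used here as a
stand-in (an assumption of this example, not something from the manual):
by default its "." also excludes newline, and a negated character class
also includes it.  Note that Python wants the braces escaped, whereas a
bare { is literal in an Emacs regexp.

```python
import re

single_line = "{foo}"
multi_line = "{\n   foo\n}"

# "." stops at a newline, so "{.*}" only matches when both braces
# sit on the same line.
dot_pattern = re.compile(r"\{.*\}")
# "[^}]" matches any character except "}", and that includes newline.
class_pattern = re.compile(r"\{[^}]*\}")

dot_single = dot_pattern.search(single_line) is not None
dot_multi = dot_pattern.search(multi_line) is not None
class_single = class_pattern.search(single_line) is not None
class_multi = class_pattern.search(multi_line) is not None

print(dot_single, dot_multi)      # True False
print(class_single, class_multi)  # True True
```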

-- 
/!\ My mail address changed, please update your files accordingly.
 |      Michaël `Micha' Cadilhac   |  Ajoutez du whisky                     |
 |         Epita/LRDE Promo 2007   |           à n'importe quel texte,      |
 |  http://michael.cadilhac.name   |    ça vous fera un beau pangramme.     |
 `--  -   JID: micha@amessage.be --'          -- Michel Clavel         -  --'

* Re: multiline regex mode?
       [not found]     ` <mailman.8206.1160997864.9609.help-gnu-emacs@gnu.org>
@ 2006-10-16 15:42       ` Giles Chamberlin
  0 siblings, 0 replies; 19+ messages in thread
From: Giles Chamberlin @ 2006-10-16 15:42 UTC


michael@cadilhac.name (Michaël Cadilhac) writes:

> Giles Chamberlin <giles.chamberlin@tandberg.net> writes:
>>
>> That works but I don't understand why given that {.*} fails.  Any
>> enlightenment gratefully received!
>
> This is because of the following (from M-x info m elisp):
>
> `.' (Period)
>      is a special character that matches any single character except a
>      newline.
>
> `[^ ... ]'
>      A complemented character alternative can match a newline, unless
>      newline is mentioned as one of the characters not to match. 

Ah - all is now clear - thanks for your help.

-- 
Giles

* Re: multiline regex mode?
  2006-10-09 12:57 multiline regex mode? Giles Chamberlin
  2006-10-13 20:02 ` Dieter Wilhelm
       [not found] ` <mailman.8131.1160774366.9609.help-gnu-emacs@gnu.org>
@ 2006-11-23 19:25 ` Stefan Monnier
  2006-11-24 21:14   ` Dieter Wilhelm
  2 siblings, 1 reply; 19+ messages in thread
From: Stefan Monnier @ 2006-11-23 19:25 UTC


> Is it possible to have .* match over a new line?

> I'd like to be able to match 

> {foo}

> and

> {
>    foo
> }

> with a regex of the form {.*} without having to specify the line
> break.

For this particular request, yes you can: "{[^}]*}".
The "." char in a regexp is just a shorthand for "[^\n]".
If you want the {..} to be balanced, then regexps are not the answer.
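
The shorthand can be illustrated with a small sketch.  It uses Python's
`re` module rather than Emacs itself (an assumption made only for this
example; in its default mode Python's "." behaves the same way), and the
sample text is made up.

```python
import re

text = "{\n   foo\n}\n{bar} tail"

# "." and "[^\n]" accept exactly the same characters, so the two
# patterns find the same matches: only the one-line group.
dot_matches = re.findall(r"\{.*\}", text)
explicit_matches = re.findall(r"\{[^\n]*\}", text)

# "[^}]" also accepts newline, so the multi-line group is found too.
class_matches = re.findall(r"\{[^}]*\}", text)

print(dot_matches)       # ['{bar}']
print(explicit_matches)  # ['{bar}']
print(class_matches)     # ['{\n   foo\n}', '{bar}']
```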


        Stefan

* Re: multiline regex mode?
  2006-11-23 19:25 ` Stefan Monnier
@ 2006-11-24 21:14   ` Dieter Wilhelm
  2006-11-24 22:51     ` Peter Dyballa
  0 siblings, 1 reply; 19+ messages in thread
From: Dieter Wilhelm @ 2006-11-24 21:14 UTC
  Cc: help-gnu-emacs

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Is it possible to have .* match over a new line?
>
>> I'd like to be able to match 
>> {foo}

>> and

>> {
>>    foo
>> }
>
>> with a regex of the form {.*} without having to specify the line
>> break.
>
> For this particular request, yes you can: "{[^}]*}".
> The "." char in a regexp is just a shorthand for "[^\n]".
> If you want the {..} to be balanced, then regexps are not the answer.

Just for completeness: where should one look for the answer?  In AWK,
Perl, or an Elisp function on the Emacs wiki?

Thanks
-- 
    Best wishes

    H. Dieter Wilhelm
    Darmstadt, Germany

* Re: multiline regex mode?
  2006-11-24 21:14   ` Dieter Wilhelm
@ 2006-11-24 22:51     ` Peter Dyballa
  2006-11-25  3:01       ` Dieter Wilhelm
  2006-11-25  3:32       ` Perry Smith
  0 siblings, 2 replies; 19+ messages in thread
From: Peter Dyballa @ 2006-11-24 22:51 UTC
  Cc: GNU Emacs List


On 24.11.2006 at 22:14, Dieter Wilhelm wrote:

> Just for completeness: where to look for the answer? In AWK, Perl, an
> Elisp function on the Emacs wiki?

In this list, or in its archive(s)!  This kind of question has been
answered more than once: this year, last year, ...  It might be time
to put it into a FAQ or a wiki.  Hints are given in the Elisp info
doc on regexps (where the full description can be found):

	`.' (Period)
	     is a special character that matches any single character
	     except a newline.
	
	`[^ ... ]'
	     A complemented character alternative can match a newline,
	     unless newline is mentioned as one of the characters not
	     to match.  This is in contrast to the handling of regexps
	     in programs such as `grep'.

So "{[^}]*}" stands for 'a region that starts with `{´ and has no `}´  
until the final `}´ is hit; between both braces any number (starting  
with 0) of any character except `}´ can appear.'
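
A few sample strings make that reading concrete.  This is only a sketch
in Python's `re` module, used as a stand-in for the Emacs regexp (an
assumption of this example); the inputs are invented for illustration.

```python
import re

pattern = re.compile(r"\{[^}]*\}")

# Zero characters between the braces are allowed.
empty = pattern.findall("{}")
# The match always ends at the first "}", so neighbouring groups
# are found separately.
adjacent = pattern.findall("{a} and {b}")
# For the same reason a nested group is cut short: the match stops
# at the inner "}" and the outer "}" is left over.
nested = pattern.findall("{ { } }")

print(empty)     # ['{}']
print(adjacent)  # ['{a}', '{b}']
print(nested)    # ['{ { }']
```

The last case shows why this pattern is fine for flat groups but says
nothing about whether braces are balanced.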

--
With peaceful greetings

   Pete

To be is to do.
                        -- I. Kant
To do is to be.
                        -- A. Sartre
Yabba-Dabba-Doo!
                        -- F. Flintstone

* Re: multiline regex mode?
  2006-11-24 22:51     ` Peter Dyballa
@ 2006-11-25  3:01       ` Dieter Wilhelm
  2006-11-25 13:14         ` Peter Dyballa
       [not found]         ` <mailman.1092.1164460454.2155.help-gnu-emacs@gnu.org>
  2006-11-25  3:32       ` Perry Smith
  1 sibling, 2 replies; 19+ messages in thread
From: Dieter Wilhelm @ 2006-11-25  3:01 UTC
  Cc: GNU Emacs List

Peter Dyballa <Peter_Dyballa@Web.DE> writes:

> On 24.11.2006 at 22:14, Dieter Wilhelm wrote:
>
>> Just for completeness: where to look for the answer? In AWK, Perl, an
>> Elisp function on the Emacs wiki?
>
> Into this list – or its archive(s)! This kind of question was
> answered more than once. This year, last year, ... It might be time
> to put it into an FAQ or a Wiki. Hints are given in the Elisp info
> doc on Regexp (where the full description can be found):
>
> 	`.' (Period)
> 	     is a special character that matches any single character
> 	     except a newline.
> 	
> 	`[^ ... ]'
> 	     A complemented character alternative can match a newline,
> 	     unless newline is mentioned as one of the characters not
> 	     to match.  This is in contrast to the handling of regexps
> 	     in programs such as `grep'.
>
> So "{[^}]*}" stands for 'a region that starts with `{´ and has no `}´
> until the final `}´ is hit; between both braces any number (starting
> with 0) of any character except `}´ can appear.'

Thanks, Peter, for taking the trouble.  Maybe I expressed myself in a
confusing manner: I understand the period and the character
alternatives.  What I really wanted to know is where to look for a way
of searching for *balanced* brackets like { { } }, because I'll need
this in my own Elisp code.  It's clear that there is some code in Emacs
for it (e.g. C-M-f), but I have a hunch that there might be
something else out there.  Maybe a regexp extension in Perl or ...

-- 
    Best wishes

    H. Dieter Wilhelm
    Darmstadt, Germany

* Re: multiline regex mode?
  2006-11-24 22:51     ` Peter Dyballa
  2006-11-25  3:01       ` Dieter Wilhelm
@ 2006-11-25  3:32       ` Perry Smith
  2006-11-25 10:11         ` Quoting style of arguments etc. [was: multiline regex mode?] Dieter Wilhelm
  2006-11-25 10:23         ` multiline regex mode? Peter Dyballa
  1 sibling, 2 replies; 19+ messages in thread
From: Perry Smith @ 2006-11-25  3:32 UTC
  Cc: Dieter Wilhelm, GNU Emacs List


On Nov 24, 2006, at 4:51 PM, Peter Dyballa wrote:

>
> On 24.11.2006 at 22:14, Dieter Wilhelm wrote:
>
>> Just for completeness: where to look for the answer? In AWK, Perl, an
>> Elisp function on the Emacs wiki?
>
> Into this list – or its archive(s)! This kind of question was  
> answered more than once. This year, last year, ... It might be time  
> to put it into an FAQ or a Wiki. Hints are given in the Elisp info  
> doc on Regexp (where the full description can be found):
>
> 	`.' (Period)
> 	     is a special character that matches any single character
> 	     except a newline.
> 	
> 	`[^ ... ]'
> 	     A complemented character alternative can match a newline,
> 	     unless newline is mentioned as one of the characters not
> 	     to match.  This is in contrast to the handling of regexps
> 	     in programs such as `grep'.
>
> So "{[^}]*}" stands for 'a region that starts with `{´ and has no `} 
> ´ until the final `}´ is hit; between both braces any number  
> (starting with 0) of any character except `}´ can appear.'

I don't quite understand why this should be in an "emacs" FAQ or
wiki.  It's basic regular expressions.  One such document is here:

http://en.wikipedia.org/wiki/Regular_expressions#Syntax

Perry Smith ( pedz@easesoftware.com )
Ease Software, Inc. ( http://www.easesoftware.com )

Low cost SATA Disk Systems for IBMs p5, pSeries, and RS/6000 AIX systems




* Quoting style of arguments etc. [was: multiline regex mode?]
  2006-11-25  3:32       ` Perry Smith
@ 2006-11-25 10:11         ` Dieter Wilhelm
  2006-11-25 10:23         ` multiline regex mode? Peter Dyballa
  1 sibling, 0 replies; 19+ messages in thread
From: Dieter Wilhelm @ 2006-11-25 10:11 UTC
  Cc: Peter Dyballa, GNU Emacs List

Perry Smith <pedz@easesoftware.com> writes:

>           So "{[^}]*}" stands for 'a region that starts with `{´ and has no `}´ until the final `}´ is hit; between both braces any number (starting
>      with 0) of any character except `}´ can appear.'
>
> I don't quite understand why this should be in an "emacs" FAQ or wiki.  It's basic regular expressions.  One such document is here:
>
> http://en.wikipedia.org/wiki/Regular_expressions#Syntax
>

It's even nearer at hand:
C-h i g "(Emacs)Regexps" or `(Elisp)Regular Expressions'

While we are at it: Is there an accepted quoting style for arguments
etc.?  For example:

1. In the case above, to distinguish keys that have to be pressed
   from textual arguments typed into the minibuffer: "" or `' or
   something else?

2. In the comment string of functions, when mentioning key bindings
   followed by command names: "Please type `M-x function-name'" or
   "Please type \"M-x function-name\""?

I can't recall anything about this in the coding conventions of the
Elisp manual.

-- 
    Best wishes

    H. Dieter Wilhelm
    Darmstadt, Germany

* Re: multiline regex mode?
  2006-11-25  3:32       ` Perry Smith
  2006-11-25 10:11         ` Quoting style of arguments etc. [was: multiline regex mode?] Dieter Wilhelm
@ 2006-11-25 10:23         ` Peter Dyballa
  1 sibling, 0 replies; 19+ messages in thread
From: Peter Dyballa @ 2006-11-25 10:23 UTC
  Cc: Dieter Wilhelm, GNU Emacs List


On 25.11.2006 at 04:32, Perry Smith wrote:

> I don't quite understand why this should be in an "emacs" FAQ or wiki.

To keep list traffic low, to save Internet bandwidth for valuable  
spam ...

--
Greetings

   Pete

"I wouldn't recommend sex, drugs or insanity for everyone, but  
they've always worked for me."
                                           -- Hunter S. Thompson

* Re: multiline regex mode?
  2006-11-25  3:01       ` Dieter Wilhelm
@ 2006-11-25 13:14         ` Peter Dyballa
  2006-11-25 16:32           ` Perry Smith
       [not found]           ` <mailman.1100.1164472360.2155.help-gnu-emacs@gnu.org>
       [not found]         ` <mailman.1092.1164460454.2155.help-gnu-emacs@gnu.org>
  1 sibling, 2 replies; 19+ messages in thread
From: Peter Dyballa @ 2006-11-25 13:14 UTC
  Cc: GNU Emacs List


On 25.11.2006 at 04:01, Dieter Wilhelm wrote:

>> So "{[^}]*}" stands for 'a region that starts with `{´ and has no `}´
>> until the final `}´ is hit; between both braces any number (starting
>> with 0) of any character except `}´ can appear.'
>
> Thanks Peter for the hassle.  Maybe I expressed myself in a confusing
> manner: The period and the character alternatives I understand.  What
> I really wanted to know is where to look for a possibility of
> searching for *balanced* brackets like { { } } because I'll need this
> in my own Elisp stuff.  It's clear that there is some code in Emacs
> for it (e. g. C-M-f etc.) but I have a hunch that there might be
> something else out there.  Maybe a regexp extension in Perl or ...

I think you can't use one regular expression for a variety of nested  
"*balanced* brackets like { { } }".

A simple regexp would be: find a region that starts with `{´ and has
neither `}´ nor `{´ until it reaches the first `}´:

	{[^{}]*}

It's obvious that it can't find your case.  For that, the previous
pattern has to be repeated before and after itself.  Let's try this:

	{[^{}]*·{[^{}]*}·[^{}]*}

(The · are only there to help understanding; they are not meant to be
part of the regexp, so remove them before trying to use it!)

That reads: { and no { or }, then { and no { or }, then }, then no { or },
and a final }.

In the same way you can find the { { { } } } case, or the { { } { } }
case, or ...  But you would need a separate regexp for each, and an
algorithm to decide in which sequence to apply them!
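
To make the limitation concrete, here is a sketch of the two-level
pattern tried against a few inputs.  It uses Python's `re` module as a
stand-in for Emacs regexps (an assumption of this example), with the
braces escaped as Python prefers; the variable names are invented.

```python
import re

# One outer pair of braces containing exactly one inner pair.
two_level = re.compile(r"\{[^{}]*\{[^{}]*\}[^{}]*\}")

depth_two = two_level.fullmatch("{ { } }") is not None
depth_three = two_level.fullmatch("{ { { } } }") is not None
two_siblings = two_level.fullmatch("{ { } { } }") is not None

print(depth_two)     # True
print(depth_three)   # False
print(two_siblings)  # False
```

Each further shape needs yet another hand-written pattern, which is why
this approach does not scale to arbitrary nesting.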


Perl won't help.  It has the same restrictions (or it wouldn't handle
basic or extended regular expressions).  It's only possible to use a
more complicated syntax that fewer people would try to understand.


--
With peaceful greetings

   Pete

Windows is a bit like Beaujolais nouveau: with every new vintage you
know it will be disgusting, but you have some anyway, out of
masochism.

* Re: multiline regex mode?
       [not found]         ` <mailman.1092.1164460454.2155.help-gnu-emacs@gnu.org>
@ 2006-11-25 14:11           ` Harald Hanche-Olsen
  2006-11-25 18:27             ` Dieter Wilhelm
       [not found]             ` <mailman.1107.1164479289.2155.help-gnu-emacs@gnu.org>
  0 siblings, 2 replies; 19+ messages in thread
From: Harald Hanche-Olsen @ 2006-11-25 14:11 UTC


+ Peter Dyballa <Peter_Dyballa@Web.DE>:

| I think you can't use one regular expression for a variety of nested
| "*balanced* brackets like { { } }".

You think rightly.  In formal language theory, the regular languages
are precisely the ones that can be defined by regular grammars.  It is
a theorem that they are precisely the languages that can be recognized
by a finite state automaton (FSA).  A regular expression is a way of
specifying a regular grammar, i.e., a regular language, and so they
are not powerful enough to recognize balanced parentheses.  The reason
is that you need to be able to count arbitrarily high, in order to be
able to tell the difference between expressions like
{{{{{{{{{{{{{{{{}}}}}}}}}}}}}}}} and {{{{{{{{{{{{{{{{}}}}}}}}}}}}}}}}}.
An FSA with fifteen states cannot tell the difference between the two
expressions above, since it would need to be able to count to sixteen
in order to do so.
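
The counting argument can be sketched in a few lines of ordinary code.
The following Python function is only an illustration (the name
`braces_balanced` is made up for this example); the point is that it
keeps an integer that can grow without bound, which is exactly what a
finite state automaton lacks.

```python
def braces_balanced(text: str) -> bool:
    """Return True if every "{" has a matching "}" and vice versa."""
    depth = 0  # number of currently open braces; unbounded in principle
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                # A closing brace appeared with nothing open.
                return False
    return depth == 0


sixteen_each = "{" * 16 + "}" * 16
one_extra_close = "{" * 16 + "}" * 17

print(braces_balanced(sixteen_each))     # True
print(braces_balanced(one_extra_close))  # False
```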

So the paren-matching algorithm in Emacs needs to be written in code.
And indeed it is:  At the bottom you'll find the function scan-sexps,
which is written in C (for efficiency, no doubt).

See also:

  http://en.wikipedia.org/wiki/Regular_grammar

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- It is undesirable to believe a proposition
  when there is no ground whatsoever for supposing it is true.
  -- Bertrand Russell

* Re: multiline regex mode?
  2006-11-25 13:14         ` Peter Dyballa
@ 2006-11-25 16:32           ` Perry Smith
  2006-11-25 18:33             ` Dieter Wilhelm
       [not found]           ` <mailman.1100.1164472360.2155.help-gnu-emacs@gnu.org>
  1 sibling, 1 reply; 19+ messages in thread
From: Perry Smith @ 2006-11-25 16:32 UTC
  Cc: Dieter Wilhelm, GNU Emacs List


On Nov 25, 2006, at 7:14 AM, Peter Dyballa wrote:
> On 25.11.2006 at 04:01, Dieter Wilhelm wrote:
>>> So "{[^}]*}" stands for 'a region that starts with `{´ and has no  
>>> `}´
>>> until the final `}´ is hit; between both braces any number (starting
>>> with 0) of any character except `}´ can appear.'
>>
>> Thanks Peter for the hassle.  Maybe I expressed myself in a confusing
>> manner: The period and the character alternatives I understand.  What
>> I really wanted to know is where to look for a possibility of
>> searching for *balanced* brackets like { { } } because I'll need this
>> in my own Elisp stuff.  It's clear that there is some code in Emacs
>> for it (e. g. C-M-f etc.) but I have a hunch that there might be
>> something else out there.  Maybe a regexp extension in Perl or ...
>
> I think you can't use one regular expression for a variety of  
> nested "*balanced* brackets like { { } }".

Correct.

For the curious: a regular expression describes what is called a
deterministic finite automaton (DFA), also known as a state machine.
It cannot "count", which is what you are asking for.  A simple
"reason" is that the count can be arbitrarily large (although that
explanation leaves out much).

The next level up in language parsing theory is a pushdown automaton
(PDA).  This is, roughly speaking, a DFA coupled with a stack.  The
stack is unbounded, so it now has the power to count.

The easiest way to do a PDA in Lisp is with recursive descent: a
rather simple Lisp function can call itself when it hits a second
{ and return when it hits a }.  When the outermost call returns, you
have hit the matching } of the first {.
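
A minimal sketch of that recursive idea, written in Python instead of
Lisp purely for illustration.  The function name `match_brace` and its
error handling are assumptions of this example, not anything that
exists in Emacs.

```python
def match_brace(text: str, start: int) -> int:
    """Return the index of the "}" that matches the "{" at text[start]."""
    if text[start] != "{":
        raise ValueError("start must point at an opening brace")
    i = start + 1
    while i < len(text):
        if text[i] == "{":
            # Nested group: recurse, then continue after its closing brace.
            i = match_brace(text, i) + 1
        elif text[i] == "}":
            # This closes the brace the current call started from.
            return i
        else:
            i += 1
    raise ValueError("no matching closing brace")


print(match_brace("{ { } }", 0))  # 6
print(match_brace("{ { } }", 2))  # 4
```

The call stack plays the role of the PDA's stack here, so very deep
nesting would eventually run into the language's recursion limit.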

All that aside, Emacs has code written to balance parens, braces,
brackets, etc.  You can look at forward-sexp as a starting point.
In Emacs this is pretty flexible: by specifying syntax tables, you
can tell it which characters match each other.
Look at modify-syntax-entry for that piece of the puzzle.
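
As a rough, language-neutral analogy (this is not how Emacs implements
syntax tables, and none of the names below are Emacs functions), a
matcher can take the table of matching characters as a parameter and
keep an explicit stack of the closers it still expects:

```python
def find_matching(text: str, start: int, pairs: dict) -> int:
    """Return the index of the delimiter closing the one at text[start].

    `pairs` maps each opening character to its closing character.
    """
    if text[start] not in pairs:
        raise ValueError("start must point at an opening delimiter")
    closers = set(pairs.values())
    expected = []  # stack of closing characters still owed
    for i in range(start, len(text)):
        ch = text[i]
        if ch in pairs:
            expected.append(pairs[ch])
        elif ch in closers:
            if not expected or expected.pop() != ch:
                raise ValueError("mismatched delimiter at index %d" % i)
            if not expected:
                return i
    raise ValueError("unterminated group")


brackets = {"{": "}", "[": "]", "(": ")"}
print(find_matching("{ [ ( ) ] } x", 0, brackets))  # 10
print(find_matching("<a <b> c>", 0, {"<": ">"}))    # 8
```

Swapping the `pairs` argument changes which characters are treated as
delimiters without touching the matching logic.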

Hope this helps...

Perry Smith ( pedz@easesoftware.com )
Ease Software, Inc. ( http://www.easesoftware.com )

Low cost SATA Disk Systems for IBMs p5, pSeries, and RS/6000 AIX systems



* [OT] Re: multiline regex mode?
       [not found]           ` <mailman.1100.1164472360.2155.help-gnu-emacs@gnu.org>
@ 2006-11-25 17:23             ` Harald Hanche-Olsen
  0 siblings, 0 replies; 19+ messages in thread
From: Harald Hanche-Olsen @ 2006-11-25 17:23 UTC


+ Perry Smith <pedz@easesoftware.com>:

| On Nov 25, 2006, at 7:14 AM, Peter Dyballa wrote:Am 25.11.2006 um 04:01 schrieb Dieter Wilhelm:So "{[^}]*}" stands for[...]</SPAN></SPAN></SPAN></SPAN></SPAN>

Yep, one very long line, with HTML markup.

Please post /plain/ text with lines under 80 characters.  Nobody will
read your posts otherwise.  Well, I certainly won't, and I am sure I
am not alone in this.

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- It is undesirable to believe a proposition
  when there is no ground whatsoever for supposing it is true.
  -- Bertrand Russell

* Re: multiline regex mode?
  2006-11-25 14:11           ` Harald Hanche-Olsen
@ 2006-11-25 18:27             ` Dieter Wilhelm
       [not found]             ` <mailman.1107.1164479289.2155.help-gnu-emacs@gnu.org>
  1 sibling, 0 replies; 19+ messages in thread
From: Dieter Wilhelm @ 2006-11-25 18:27 UTC
  Cc: help-gnu-emacs

Harald Hanche-Olsen <hanche@math.ntnu.no> writes:

> + Peter Dyballa <Peter_Dyballa@Web.DE>:
>
> | I think you can't use one regular expression for a variety of nested
> | "*balanced* brackets like { { } }".
>

Got it, Pete: in this case regexps are a no-no!  Good to know.

> You think rightly.  In formal language theory, the regular languages
> are precisely the ones that can be defined by regular grammars.  It is
> a theorem that they are precisely the languages that can be recognized
> by a finite state automaton (FSA).  A regular expression is a way of
> specifying a regular grammar, i.e., a regular language, and so they
> are not powerful enough to recognize balanced parentheses.  The reason
> is that you need to be able to count arbitrarily high, in order to be
> able to tell the difference between expressions like
> {{{{{{{{{{{{{{{{}}}}}}}}}}}}}}}} and {{{{{{{{{{{{{{{{}}}}}}}}}}}}}}}}}.
> A FSA with fifteen states cannot tell the difference between the two
> expressions above, since it would need to be able to count to sixteen
> in order to do so.

Thank you for the background.

> So the paren-matching algorithm in Emacs needs to be written in code.
> And indeed it is:  At the bottom you'll find the function scan-sexps,
> which is written in C (for efficiency, no doubt).

OK, then it has to be scan-sexps in the code.

>
> See also:
>
>   http://en.wikipedia.org/wiki/Regular_grammar

I don't have to understand it, do I? ;-)  Thank you for the pointers.

-- 
    Best wishes

    H. Dieter Wilhelm
    Darmstadt, Germany

* Re: multiline regex mode?
  2006-11-25 16:32           ` Perry Smith
@ 2006-11-25 18:33             ` Dieter Wilhelm
  0 siblings, 0 replies; 19+ messages in thread
From: Dieter Wilhelm @ 2006-11-25 18:33 UTC
  Cc: Peter Dyballa, GNU Emacs List

Perry Smith <pedz@easesoftware.com> writes:

> On Nov 25, 2006, at 7:14 AM, Peter Dyballa wrote:
>
>           I think you can't use one regular expression for a variety of nested "*balanced* brackets like { { } }".
>
> Correct.
>

....

>
> The easiest way to do a PDA in lisp is with recursive descent and a
> rather simple lisp function can call itself when it hits a second {
> and return when it hits a }.  When the last function returns, you
> have hit the matching } of the first {.
>
>
> All that aside, emacs has code written to balance parens, braces,
> brackets, etc.  You can look at forward-sexp as a starting point. 
> And, in the case of emacs, it is pretty flexible.  By specifying
> syntax tables, you can tell it what characters match each other. 
> Look at modify-syntax-entry for that piece of the puzzle.
>
>
>
> Hope this helps...
>

It does, thanks

-- 
    Best wishes

    H. Dieter Wilhelm
    Darmstadt, Germany

* Re: multiline regex mode?
       [not found]             ` <mailman.1107.1164479289.2155.help-gnu-emacs@gnu.org>
@ 2006-11-25 19:29               ` Harald Hanche-Olsen
  0 siblings, 0 replies; 19+ messages in thread
From: Harald Hanche-Olsen @ 2006-11-25 19:29 UTC


+ Dieter Wilhelm <dieter@duenenhof-wilhelm.de>:

| Harald Hanche-Olsen <hanche@math.ntnu.no> writes:
|
|>   http://en.wikipedia.org/wiki/Regular_grammar
|
| I don't have to understand it, must I? ;-)

There will be a quiz.  Were you not informed?

| Thank you for the pointers.

You're welcome.

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- It is undesirable to believe a proposition
  when there is no ground whatsoever for supposing it is true.
  -- Bertrand Russell
