unofficial mirror of guile-user@gnu.org 
* regexp-exec fails for long strings
@ 2002-04-25 20:05 Thien-Thi Nguyen
  0 siblings, 0 replies; 8+ messages in thread
From: Thien-Thi Nguyen @ 2002-04-25 20:05 UTC (permalink / raw)
  Cc: guile-user

well, good and bad news.

the good news is that another guile project was recently added to the
projects list.  the bad news is that for www.gnu.org, template.scm now
fails due to unfulfilled regexp matching.  (this has resulted in an
empty project page there.)  below is a test case that demonstrates the
problem.  running "pre-inst-guile -s" on it shows three "ok" and three
"FAIL" for both HEAD and branch_release-1-6.

if someone can confirm similar behavior on another system (perhaps by
varying the appended string length), i will add this to the bugs db.
(i ask for this confirmation because my system's old sdrams are prone to
mysterious failures, and i want to rule that out as a reason.)

thi

____________________________________________
(use-modules (ice-9 regex))

(define ok "<section_name>Projects List</section_name>")

(define rx (make-regexp "<section_name>(.*)</section_name>"))

(define (+space s n)
  (string-append s (make-string n #\space)))

(define (test s)
  (format #t "string-length ~A\t=> ~A\n"
          (string-length s)
          (if (regexp-exec rx s)
              "ok"
              "FAIL")))

;; do it (trailing comments show the result observed on my system)
(test ok)                               ; ok
(test (+space ok 11672))                ; ok
(test (+space ok 11673))                ; ok
(test (+space ok 11674))                ; FAIL
(test (+space ok 11675))                ; FAIL
(test (+space ok 11700))                ; FAIL
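
To make "varying the appended string length" less manual, the same check can be wrapped in a bisection.  This is only a sketch: `matches-with-padding?', `first-failing' and the upper bound `limit' are names and values made up here, not part of template.scm, and the search assumes that once some padding length fails, every longer one fails too (consistent with the six data points above, but not otherwise verified).

```scheme
(use-modules (ice-9 regex))

(define ok "<section_name>Projects List</section_name>")
(define rx (make-regexp "<section_name>(.*)</section_name>"))

;; #t when RX still matches OK followed by N spaces, #f otherwise.
(define (matches-with-padding? n)
  (and (regexp-exec rx (string-append ok (make-string n #\space))) #t))

;; Smallest N in (LO, HI] for which (OK? N) is false.
;; Assumes (OK? LO) is true, (OK? HI) is false, and that the
;; answer flips only once between LO and HI.
(define (first-failing lo hi ok?)
  (if (= (+ lo 1) hi)
      hi
      (let ((mid (quotient (+ lo hi) 2)))
        (if (ok? mid)
            (first-failing mid hi ok?)
            (first-failing lo mid ok?)))))

;; Hypothetical upper bound for the search; raise it if needed.
(define limit 100000)

(cond ((not (matches-with-padding? 0))
       (display "fails even without padding\n"))
      ((matches-with-padding? limit)
       (format #t "no failure up to padding ~A\n" limit))
      (else
       (format #t "first failing padding: ~A\n"
               (first-failing 0 limit matches-with-padding?))))
```

On a system that does not show the problem, this should simply report that no failure was found up to `limit'.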

_______________________________________________
Guile-user mailing list
Guile-user@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-user



* Re: regexp-exec fails for long strings
       [not found] <E170pV0-00018q-00@giblet>
@ 2002-04-25 21:47 ` rm
       [not found] ` <20020425214713.GA19857@www>
  1 sibling, 0 replies; 8+ messages in thread
From: rm @ 2002-04-25 21:47 UTC (permalink / raw)
  Cc: bug-guile, guile-user

On Thu, Apr 25, 2002 at 01:05:50PM -0700, Thien-Thi Nguyen wrote:
>[...] 
> if someone can confirm similar behavior on another system (perhaps by
> varying the appended string length), i will add this to the bugs db.
> (i ask for this confirmation because my system's old sdrams are prone to
> mysterious failures, and i want to rule that out as a reason.)

Ok, this is on my rather 'old' guile-1.5 box (the guile-HEAD is at home):

|  guile> (version)
|  "1.5.0"
|  guile> (use-modules (ice-9 regex))
|  guile> 
|  guile> (define ok "<section_name>Projects List</section_name>")
|  guile> 
|  guile> (define rx (make-regexp "<section_name>(.*)</section_name>"))
|  guile> 
|  guile> (define (+space s n)
|  ...   (string-append s (make-string n #\space)))
|  guile> 
|  guile> (define (test s)
|  ...   (format #t "string-length ~A\t=> ~A\n"
|  ...           (string-length s)
|  ...           (if (regexp-exec rx s)
|  ...               "ok"
|  ...               "FAIL")))
|  guile> (test ok)
|  string-length 42	=> ok
|  guile> (test (+space ok 11672))
|  string-length 11714	=> ok
|  guile> (test (+space ok 11673))
|  string-length 11715	=> ok
|  guile> (test (+space ok 11674))
|  string-length 11716	=> FAIL
|  guile> (test (+space ok 11675))
|  string-length 11717	=> FAIL
|  guile> (test (+space ok 11700))
|  string-length 11742	=> FAIL
|  guile> 
|  

Hmmm, never really used guile's regex heavily - that's where i
still use <blush>perl</blush>.

  Ralf



* Re: regexp-exec fails for long strings
       [not found] ` <20020425214713.GA19857@www>
@ 2002-04-25 23:14   ` Wolfgang Jährling
  2002-04-25 23:29   ` Thien-Thi Nguyen
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Wolfgang Jährling @ 2002-04-25 23:14 UTC (permalink / raw)
  Cc: ttn, guile-user

Hi!

rm@fabula.de <rm@fabula.de> wrote:
> Hmmm, never really used guile's regex heavily - that's where i
> still use <blush>perl</blush>.

And while we are at it: Anyone ever thought about using S-Expressions
for regular expressions in Guile? I really like the idea, as that would
make them much cleaner and more scheme-ish.

(See <http://www.schemers.org/Documents/FAQ/#id2802833>.)
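
To give a rough idea of what that could look like, here is a minimal sketch. The notation (`seq', `or', `*', `submatch', `any') and the name `sre->string' are invented for this example; this is not the syntax described in the FAQ or used by SCSH, just a small translator to the POSIX strings that make-regexp already accepts.

```scheme
(use-modules (ice-9 regex))

;; Translate a tiny, hypothetical S-expression notation into a POSIX
;; extended regexp string:
;;   "text"            literal text (escaped with regexp-quote)
;;   any               any single character
;;   (seq e ...)       e ... in sequence
;;   (or e ...)        alternation
;;   (* e ...)         zero or more repetitions of e ...
;;   (submatch e ...)  a group whose text can be retrieved later
;; Note: `or' and `*' wrap their result in parentheses, so each of
;; them also uses up a submatch index.
(define (sre->string e)
  (cond ((string? e) (regexp-quote e))
        ((eq? e 'any) ".")
        ((pair? e)
         (let ((args (map sre->string (cdr e))))
           (case (car e)
             ((seq) (apply string-append args))
             ((or) (string-append "(" (string-join args "|") ")"))
             ((*) (string-append "(" (apply string-append args) ")*"))
             ((submatch) (string-append "(" (apply string-append args) ")"))
             (else (error "unknown form" e)))))
        (else (error "unknown expression" e))))

;; The pattern from the test case, written in that notation.
(define section-rx
  (make-regexp
   (sre->string '(seq "<section_name>"
                      (submatch (* any))
                      "</section_name>"))))

(define m (regexp-exec section-rx
                       "<section_name>Projects List</section_name>"))
(if m
    (format #t "section name: ~A\n" (match:substring m 1))
    (display "no match\n"))
```

The result is still handed to make-regexp, so it is subject to whatever the underlying regexec does with long strings.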

Cheers,
GNU/Wolfgang

-- 
Wolfgang Jährling  <wolfgang@pro-linux.de>  \\  http://stdio.cjb.net/
Debian GNU/Hurd user && Debian GNU/Linux user \\  http://www.gnu.org/
The Hurd Hacking Guide: http://www.gnu.org/software/hurd/hacking-guide/
["We're way ahead of you here. The Hurd has always been on the    ]
[ cutting edge of not being good for anything." -- Roland McGrath ]



* Re: regexp-exec fails for long strings
       [not found] ` <20020425214713.GA19857@www>
  2002-04-25 23:14   ` Wolfgang Jährling
@ 2002-04-25 23:29   ` Thien-Thi Nguyen
  2002-04-26  4:32   ` Tom Lord
       [not found]   ` <E170sgG-0001Na-00@giblet>
  3 siblings, 0 replies; 8+ messages in thread
From: Thien-Thi Nguyen @ 2002-04-25 23:29 UTC (permalink / raw)
  Cc: bug-guile, guile-user

   From: rm@fabula.de
   Date: Thu, 25 Apr 2002 23:47:13 +0200

   |  guile> (test ok)
   |  string-length 42	=> ok
   |  guile> (test (+space ok 11672))
   |  string-length 11714	=> ok
   |  guile> (test (+space ok 11673))
   |  string-length 11715	=> ok
   |  guile> (test (+space ok 11674))
   |  string-length 11716	=> FAIL
   |  guile> (test (+space ok 11675))
   |  string-length 11717	=> FAIL
   |  guile> (test (+space ok 11700))
   |  string-length 11742	=> FAIL
   |  guile> 

i traced this to `regexec' (glibc 2.2.4) and stopped -- it's not a guile
bug after all, as far as i could tell.  (also, could not reproduce the
behavior under FreeBSD 4.4-RELEASE.)

probably time to apt-get update...

thi


* Re: regexp-exec fails for long strings
       [not found] ` <20020425214713.GA19857@www>
  2002-04-25 23:14   ` Wolfgang Jährling
  2002-04-25 23:29   ` Thien-Thi Nguyen
@ 2002-04-26  4:32   ` Tom Lord
       [not found]   ` <E170sgG-0001Na-00@giblet>
  3 siblings, 0 replies; 8+ messages in thread
From: Tom Lord @ 2002-04-26  4:32 UTC (permalink / raw)
  Cc: ttn, bug-guile, guile-user


       Hmmm, never really used guile's regex heavily - that's where i
       still use <blush>perl</blush>.


If you want high-power regexps, you should resurrect the Rx interface,
grabbing the latest libhackerlab.  In fact, rather than resurrecting
the old Guile Rx interface, you should grab and adapt the one from
systas (not currently in release, but I'll put up a new version soon).

I regularly use regexps that are several KB long, constructed by a
structured regexp compiler similar to the one in SCSH.  It's fast,
convenient, and very accurate.

-t



* Re: regexp-exec fails for long strings
       [not found]   ` <E170sgG-0001Na-00@giblet>
@ 2002-05-15  3:53     ` Thien-Thi Nguyen
       [not found]     ` <E177prI-0001VF-00@giblet>
  1 sibling, 0 replies; 8+ messages in thread
From: Thien-Thi Nguyen @ 2002-05-15  3:53 UTC (permalink / raw)


   From: Thien-Thi Nguyen <ttn@giblet.glug.org>
   Date: Thu, 25 Apr 2002 16:29:40 -0700

   i traced this to `regexec' (glibc 2.2.4) and stopped -- it's not a guile
   bug after all, as far as i could tell.  (also, could not reproduce the
   behavior under FreeBSD 4.4-RELEASE.)

   probably time to apt-get update...

this problem persists w/ glibc 2.2.5.  hmmm....

thi


* Re: regexp-exec fails for long strings
       [not found]     ` <E177prI-0001VF-00@giblet>
@ 2002-05-15  6:04       ` Wolfgang Jährling
       [not found]       ` <20020515080420.B201@dose.pro-linux.de>
  1 sibling, 0 replies; 8+ messages in thread
From: Wolfgang Jährling @ 2002-05-15  6:04 UTC (permalink / raw)
  Cc: bug-guile, guile-user

Thien-Thi Nguyen <ttn@giblet.glug.org> wrote:
>    From: Thien-Thi Nguyen <ttn@giblet.glug.org>
>    Date: Thu, 25 Apr 2002 16:29:40 -0700
> 
>    i traced this to `regexec' (glibc 2.2.4) and stopped -- it's not a guile
>    bug after all, as far as i could tell.  (also, could not reproduce the
>    behavior under FreeBSD 4.4-RELEASE.)
> 
>    probably time to apt-get update...
> 
> this problem persists w/ glibc 2.2.5.  hmmm....

AFAIK, the glibc-people are rewriting the regex code for 2.3, because
the current implementation got unmaintainable and has various strange
limitations. This could be one of those.

Cheers,
GNU/Wolfgang

-- 
Wolfgang Jährling  <wolfgang@pro-linux.de>  \\  http://stdio.cjb.net/
Debian GNU/Hurd user && Debian GNU/Linux user \\  http://www.gnu.org/
The Hurd Hacking Guide: http://www.gnu.org/software/hurd/hacking-guide/
["We're way ahead of you here. The Hurd has always been on the    ]
[ cutting edge of not being good for anything." -- Roland McGrath ]


* Re: regexp-exec fails for long strings
       [not found]       ` <20020515080420.B201@dose.pro-linux.de>
@ 2002-05-15  8:19         ` Thien-Thi Nguyen
  0 siblings, 0 replies; 8+ messages in thread
From: Thien-Thi Nguyen @ 2002-05-15  8:19 UTC (permalink / raw)
  Cc: bug-guile, guile-user

   From: Wolfgang Jährling <wolfgang@pro-linux.de>
   Date: Wed, 15 May 2002 08:04:20 +0200

   AFAIK, the glibc-people are rewriting the regex code for 2.3, because
   the current implementation got unmaintainable and has various strange
   limitations. This could be one of those.

that's good news (that i wouldn't have known w/o this tip -- thanks!).
i suppose i could contribute this as a test case.  i wonder what the
glibc testing framework is like.

thi

