From: Ludovic Courtès <ludo@gnu.org>
Newsgroups: gmane.lisp.guile.user
Subject: Re: Guile regular expressions are too greedy
Date: Wed, 22 Jul 2009 14:21:22 +0200
Message-ID: <87zlawj4cd.fsf@gnu.org>
References: <4A66E467.2060709@btinternet.com>
To: guile-user@gnu.org

Hello,

Chris Dennis writes:

> Hello Guile People
>
> Is there a way to make Guile regular expressions less greedy?  I
> understand that POSIX doesn't define non-greedy modifiers.
>
> Specifically, I'm trying to parse font names such as
>
>   Arial 12
>   Arial Bold Italic 14
>   Nimbus Sans L Bold Italic Condensed 11
>
> so that I can construct CSS styles from them.
>
> I've tried the following, but the first (.*) gobbles up everything
> before the size because the other elements are optional:
>
> (define s (string-match "(.*)( +(bold|semi-bold|regular|light))?(
> +(italic|oblique))?( +(condensed))? +([0-9]+)" "nimbus sans l bold
> italic condensed 11"))

Here's a possible solution:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (define rx (make-regexp "^([^ ]+)( +(bold|semi-bold|regular|light))?( +(italic|oblique))?( +(condensed))? +([0-9]+)" regexp/extended))
scheme@(guile-user)> (define m (regexp-exec rx "nimbus bold italic condensed 11"))
scheme@(guile-user)> (match:substring m 1)
"nimbus"
scheme@(guile-user)> (match:substring m 2)
" bold"
scheme@(guile-user)> (match:substring m 3)
"bold"
scheme@(guile-user)> (match:substring m 4)
" italic"
scheme@(guile-user)> (match:substring m 5)
"italic"
--8<---------------cut here---------------end--------------->8---

Note that I slightly modified the string that's matched: the first
group, "[^ ]+", only matches a single word, so "sans" and "l" had to be
left out of the test string.

Hope this helps,
Ludo'.
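
For family names that span several words (such as "nimbus sans l"),
one possible workaround, which is not part of the solution above, is to
peel the optional keywords off the end of the string one at a time.
Each step then uses a greedy ".*" followed by a mandatory suffix
anchored at the end, so greediness no longer gets in the way.  The
following is only a minimal sketch under that assumption:
`strip-suffix' and `parse-font-name' are hypothetical helper names, and
`regexp/icase' is added here so that capitalised input also matches.

```scheme
(use-modules (ice-9 regex))

;; Hypothetical helper: try to remove one trailing word matching
;; ALTERNATIVES (a POSIX extended regexp fragment) from STR.
;; Returns a pair (REST . WORD); WORD is #f when nothing matched.
(define (strip-suffix str alternatives)
  (let* ((pattern (string-append "^(.*[^ ]) +(" alternatives ")$"))
         (rx (make-regexp pattern regexp/extended regexp/icase))
         (m (regexp-exec rx str)))
    (if m
        (cons (match:substring m 1) (match:substring m 2))
        (cons str #f))))

;; Hypothetical helper: split a font name into
;; (FAMILY WEIGHT SLANT WIDTH SIZE), working from right to left.
;; Missing components are returned as #f.
(define (parse-font-name str)
  (let* ((s1 (strip-suffix str "[0-9]+"))
         (s2 (strip-suffix (car s1) "condensed"))
         (s3 (strip-suffix (car s2) "italic|oblique"))
         (s4 (strip-suffix (car s3) "bold|semi-bold|regular|light")))
    (list (car s4) (cdr s4) (cdr s3) (cdr s2) (cdr s1))))

;; Example use:
;; (parse-font-name "nimbus sans l bold italic condensed 11")
;; => ("nimbus sans l" "bold" "italic" "condensed" "11")
;; (parse-font-name "Arial 12")
;; => ("Arial" #f #f #f "12")
```

The trade-off is four small regexps instead of one, but each of them
has a single possible match, which makes the result easy to reason
about with POSIX regexps.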