From: Kevin Ryde <user42@zip.com.au>
Subject: regexp-quote bugs
Date: Sun, 22 Aug 2004 11:43:23 +1000
Message-ID: <87brh42bic.fsf@zip.com.au>

        * regex-posix.c (scm_regexp_quote): Rewrite of ice-9 regex
        regexp-quote in C.  Fix [ and |, they must be quoted.  Fix quoting of
        ( ) { + ? for regexp/basic, must use char class [(] etc since \( etc
        in fact makes them special.

This is for 1.6 too.

I'm assuming regexp-quote is meant to quote for both regexp/basic and
regexp/extended usages.  At the moment it's got problems in both.
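
To make the two styles concrete, here's a rough sketch that goes straight to the POSIX regcomp/regexec interface rather than through Guile.  The helper name pattern_matches is made up for illustration, it isn't anything in Guile or libc.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, not part of Guile: return 1 if PATTERN, compiled
   with CFLAGS (0 for basic, REG_EXTENDED for extended), matches somewhere
   in TEXT, 0 if it doesn't, and -1 if PATTERN fails to compile.  */
static int
pattern_matches (const char *pattern, int cflags, const char *text)
{
  regex_t re;
  int rc;

  assert (pattern != NULL && text != NULL);

  /* REG_NOSUB: only a yes/no answer is wanted, no submatch positions.  */
  if (regcomp (&re, pattern, cflags | REG_NOSUB) != 0)
    return -1;

  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0 ? 1 : 0;
}
```

With this, pattern_matches ("\\(x\\)", 0, "x") should give 1, since in basic style \( \) form a group, whereas with REG_EXTENDED the same pattern only matches a literal "(x)".  The char class form "[(]x[)]" should match "(x)" and not "x" under either flag, which is the property regexp-quote needs.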

New code below (I still have to find the POSIX spec to double-check what
needs to be done, but this works with glibc for a start).



SCM_DEFINE (scm_regexp_quote, "regexp-quote", 1, 0, 0,
            (SCM str),
            "Return a regexp string which matches @var{str} literally, i.e.@:\n"
            "any characters like @samp{*} in @var{str} which are special in\n"
            "a regexp are quoted.  If there are no special characters then\n"
            "@var{str} itself is returned.\n"
            "\n"
            "The regexp returned can be used with both @code{regexp/basic}\n"
            "and @code{regexp/extended}; the quoting applied is safe for\n"
            "both styles.")
#define FUNC_NAME s_scm_regexp_quote
{
  size_t     i, j, len, newlen;
  const char *ptr;
  char       *newptr;
  SCM        newstr;

  SCM_VALIDATE_STRING (SCM_ARG1, str);
  ptr = scm_i_string_chars (str);
  len = scm_i_string_length (str);

  /* [ * . \ ^ and $ are special in both regexp/basic and regexp/extended
     and can be backslash escaped.

     ( ) { } + ? and | are special in regexp/extended so must be escaped.
     But that can't be done with a backslash, since in regexp/basic the
     sequences \( \) \{ \} \+ \? and \| are themselves special.  Character
     class forms [(] etc are used instead.

     ] is not special outside a [ ] character class, so doesn't need to be
     escaped.  */

#define REGEXP_QUOTE_BACKSLASH                  \
  case '[':                                     \
  case '*':                                     \
  case '.':                                     \
  case '\\':                                    \
  case '^':                                     \
  case '$'

#define REGEXP_QUOTE_CHARCLASS                  \
  case '(':                                     \
  case ')':                                     \
  case '{':                                     \
  case '}':                                     \
  case '+':                                     \
  case '?':                                     \
  case '|'

  for (i = 0, newlen = 0; i < len; i++)
    {
      switch (ptr[i]) {
      REGEXP_QUOTE_BACKSLASH:
        newlen += 2;
        break;
      REGEXP_QUOTE_CHARCLASS:
        newlen += 3;
        break;
      default:
        newlen += 1;
        break;
      }
    }

  if (newlen == len)
    return str;

  newstr = scm_i_make_string (newlen, &newptr);
  for (i = 0, j = 0; i < len; i++, j++)
    {
      char c = ptr[i];
      switch (c) {
      REGEXP_QUOTE_BACKSLASH:
        newptr[j++] = '\\';
        goto store_c;
      REGEXP_QUOTE_CHARCLASS:
        newptr[j++] = '[';
        newptr[j++] = c;
        newptr[j] = ']';
        break;
      default:
      store_c:
        newptr[j] = c;
        break;
      }
    }
  scm_remember_upto_here_1 (str);
  return newstr;
}
#undef FUNC_NAME
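
For trying the quoting rules outside Guile, the same two-pass scheme can be sketched on plain NUL-terminated C strings.  This is only an illustration under that assumption: quote_regexp is a made-up name, it uses malloc instead of scm_i_make_string, and unlike the code above it always returns a fresh copy even when nothing needs quoting.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical standalone version of the quoting above, not part of the
   patch.  Returns a malloc'ed string for the caller to free, or NULL if
   the allocation fails.  */
static char *
quote_regexp (const char *src)
{
  size_t i, j, len, newlen;
  char *dst;

  assert (src != NULL);
  len = strlen (src);

  /* First pass: work out the quoted length.  */
  for (i = 0, newlen = 0; i < len; i++)
    switch (src[i])
      {
      case '[': case '*': case '.': case '\\': case '^': case '$':
        newlen += 2;            /* backslash plus the char */
        break;
      case '(': case ')': case '{': case '}': case '+': case '?': case '|':
        newlen += 3;            /* [ plus the char plus ] */
        break;
      default:
        newlen += 1;
        break;
      }

  dst = malloc (newlen + 1);
  if (dst == NULL)
    return NULL;

  /* Second pass: copy across, inserting the quoting.  */
  for (i = 0, j = 0; i < len; i++)
    {
      char c = src[i];
      switch (c)
        {
        case '[': case '*': case '.': case '\\': case '^': case '$':
          dst[j++] = '\\';
          dst[j++] = c;
          break;
        case '(': case ')': case '{': case '}': case '+': case '?': case '|':
          dst[j++] = '[';
          dst[j++] = c;
          dst[j++] = ']';
          break;
        default:
          dst[j++] = c;
          break;
        }
    }
  dst[j] = '\0';
  return dst;
}
```

For instance quote_regexp ("a.b(c)|d") should give the string a\.b[(]c[)][|]d, and a string like "x]y" should come back unchanged since ] needs no quoting outside a character class.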

