unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: "Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu>
Cc: Andreas Schwab <schwab@suse.de>, Kenichi Handa <handa@etl.go.jp>,
	eliz@is.elta.co.il, emacs-devel@gnu.org
Subject: Re: regex and case-fold-search problem
Date: Mon, 26 Aug 2002 12:11:42 -0400	[thread overview]
Message-ID: <200208261611.g7QGBgt24993@rum.cs.yale.edu> (raw)
In-Reply-To: buo65xyl6z0.fsf@mcspd15.ucom.lsi.nec.co.jp

> Andreas Schwab <schwab@suse.de> writes:
> > |> Yeah, but character ranges make perfect sense in many local contexts.
> > |> E.g., [0-9], or [<0>-<9>] where <0> and <9> are `wide' digits from some
> > |> character set.
> > 
> > What does [A-Z] mean in EBCDIC?  [0-9] is a special case, because ISO C
> > requires that 0,1,2,3,4,5,6,7,8,9 are consecutive in the execution
> > character set.  But in many locales the collating sequence <A> - <Z>
> > contains more that just the upper case letters from the English alphabet.
> 
> The question is not `does [A-Z] make sense?', but rather: `_if_ [A-Z]
> makes sense, does [a-z] make sense too?'
> 
> That is, we aren't the ones writing [A-Z], it's lisp authors or users
> entering regexps or something.  If they want to enter a less-than-useful
> character range, that's their prerogative; however, emacs should avoid
> making what they enter _less_ meaningful because of the case-fold-search
> setting.
> 
> My point was that perhaps in practice, the ranges that would get screwed
> up by case-fold-search are even less sensible that normal, meaning it's
> likely most people wouldn't (or shouldn't) use them, and we really don't
> need to worry about the issue.  [ASCII is probably a special case, since
> it's so well known that people actually do tend to specify wierd ranges]
> 
> [but it looks like maybe it will get fixed properly anyway...]

I agree that we shouldn't spend too much time on it.
The patch I installed does the following:
- Fix a few problems such as ``if the case-table mapped ?* to ?o then
  "\\(fo\\)*" used to only match "foo"''.  Luckily such case-tables
  are not very common, so nobody noticed the problem.
- case-fold-search now works correctly for ranges in ASCII
- case-fold-search still doesn't work correctly for ranges in non-ASCII
  but it matches at least as much as when case-fold-search is nil: i.e.
  the range might include some chars which the user didn't expect, but it
  at least include the chars which the user expected.  The previous behavior
  was that the range could include some unexpected chars as well and could
  also not include some expected chars.  The current code matches at least
  as many strings as the previous one.

I think that's good enough for now,


	Stefan

  reply	other threads:[~2002-08-26 16:11 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2002-08-23  6:25 regex and case-fold-search problem Kenichi Handa
2002-08-23 15:56 ` Eli Zaretskii
2002-08-24  0:51   ` Kenichi Handa
2002-08-24  1:03     ` Miles Bader
2002-08-24  9:42       ` Eli Zaretskii
2002-08-24 16:16       ` Andreas Schwab
2002-08-26  1:54         ` Miles Bader
2002-08-26 16:11           ` Stefan Monnier [this message]
2002-08-26 21:51         ` Richard Stallman
2002-08-24  9:39     ` Eli Zaretskii
2002-08-26  1:29       ` Kenichi Handa
2002-08-26  2:31         ` Miles Bader
2002-08-25 22:21     ` Kim F. Storm
2002-08-23 17:36 ` Stefan Monnier
2002-08-23 21:52   ` Stefan Monnier
2002-08-24  1:16   ` Kenichi Handa
2002-08-25 18:52     ` Stefan Monnier
2002-08-26  1:56       ` Kenichi Handa
2002-08-24 10:40   ` Kai Großjohann
2002-08-26 21:51 ` Richard Stallman
2002-08-29  8:53   ` Kenichi Handa
2002-08-29 12:33     ` Kim F. Storm
2002-08-29 13:38       ` Kenichi Handa
2002-08-29 15:00         ` Kim F. Storm
2002-08-29 16:00         ` Stefan Monnier
2002-08-30  1:11           ` Kenichi Handa
2002-08-30 19:19             ` Richard Stallman
2002-08-30 19:19     ` Richard Stallman
2002-08-30 20:08       ` Stefan Monnier
2002-09-01 13:15         ` Richard Stallman
2002-09-01 16:26           ` Stefan Monnier
2002-09-02 14:54             ` Richard Stallman
2002-09-02 16:58               ` Stefan Monnier
2002-09-04 14:13                 ` Richard Stallman
2002-09-04 16:04                   ` Stefan Monnier
2002-09-05 18:02                     ` Richard Stallman
2002-09-06  1:00                       ` re-search-forward seems to be broken Miles Bader
2002-09-06 20:03                         ` Richard Stallman
2002-08-31  6:14       ` regex and case-fold-search problem Eli Zaretskii
2002-09-01 13:14         ` Richard Stallman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200208261611.g7QGBgt24993@rum.cs.yale.edu \
    --to=monnier+gnu/emacs@rum.cs.yale.edu \
    --cc=eliz@is.elta.co.il \
    --cc=emacs-devel@gnu.org \
    --cc=handa@etl.go.jp \
    --cc=schwab@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).