unofficial mirror of emacs-devel@gnu.org 
* regex.c: emacs & glibc (and xemacs, and grep and ...)
@ 2002-04-04 18:55 Sam Steingold
  2002-04-04 19:24 ` Stefan Monnier
  2002-04-05 19:05 ` Richard Stallman
  0 siblings, 2 replies; 12+ messages in thread
From: Sam Steingold @ 2002-04-04 18:55 UTC (permalink / raw)


Emacs comes with its own version of regex.c, even though most UNIX
systems have a regex implementation.  IIUC, this is for portability.

Suppose glibc's regex and Emacs' regex are unified - will Emacs still
build its own regex version even on GNU systems (Linux, Hurd)?

This is probably not a big deal (regex is relatively small) -
but is there a way to detect that the system-supplied regex
(or another, bigger library, e.g., gettext or iconv) is good enough?

Finally, GNU CLISP (http://clisp.cons.org) comes with a GNU regex.c of
1994(!) - which version should we upgrade to?  Emacs?  GLIBC? GNU grep?

thanks.

-- 
Sam Steingold (http://www.podval.org/~sds) running RedHat7.2 GNU/Linux
Keep Jerusalem united! <http://www.onejerusalem.org/Petition.asp>
Read, think and remember! <http://www.iris.org.il> <http://www.memri.org/>
UNIX is a way of thinking.  Windows is a way of not thinking.


* Re: regex.c: emacs & glibc (and xemacs, and grep and ...)
  2002-04-04 18:55 regex.c: emacs & glibc (and xemacs, and grep and ...) Sam Steingold
@ 2002-04-04 19:24 ` Stefan Monnier
  2002-04-05  1:25   ` Paul Eggert
  2002-04-05 19:05 ` Richard Stallman
  1 sibling, 1 reply; 12+ messages in thread
From: Stefan Monnier @ 2002-04-04 19:24 UTC (permalink / raw)
  Cc: emacs-devel

> Emacs comes with its own version of regex.c, even though most UNIX
> systems have a regex implementation.  IIUC, this is for portability.

It's for Emacs' portability as well as for the portability of elisp programs
(so the set of supported regexps is always the same) and, more importantly,
because the regexp routines need to know about Emacs' internal buffer
representation (as a pair of char arrays, hence the need for re_match_2)
and multibyte char representation as well as Emacs' syntax tables.
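
To illustrate the "pair of char arrays" point, here is a minimal sketch,
assuming a system whose <regex.h> exposes the GNU interface (as glibc does
under _GNU_SOURCE).  The helper name match_across_gap is hypothetical and is
not Emacs code; it only shows re_match_2 matching a pattern against text
that is stored as two separate pieces, as in a buffer split by a gap.

```c
#define _GNU_SOURCE
#include <regex.h>
#include <string.h>

/* Hypothetical helper (not taken from Emacs): try to match PATTERN at
   offset 0 of the virtual text formed by BEFORE followed by AFTER.
   The caller never concatenates the two pieces itself.
   Returns the length of the match, -1 if there is no match,
   or -2 if the pattern fails to compile or on an internal error. */
int match_across_gap(const char *pattern, const char *before, const char *after)
{
    struct re_pattern_buffer buf;
    memset(&buf, 0, sizeof buf);            /* no preallocated buffer, fastmap or translate table */
    re_set_syntax(RE_SYNTAX_POSIX_EXTENDED); /* syntax used by re_compile_pattern */

    const char *err = re_compile_pattern(pattern, strlen(pattern), &buf);
    if (err != NULL)
        return -2;

    int n1 = (int) strlen(before);
    int n2 = (int) strlen(after);

    /* string1/size1 and string2/size2 are the two halves of the text;
       the last argument is the position at which matching must stop. */
    int len = (int) re_match_2(&buf, before, n1, after, n2, 0, NULL, n1 + n2);

    regfree(&buf);
    return len;
}
```

For example, match_across_gap("hello world", "hello wor", "ld") matches
across the boundary between the two pieces.  The sketch says nothing about
syntax tables or multibyte characters, which are the parts a system regex
would not know about.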

> Suppose glibc's regex and Emacs' regex are unified - will Emacs still
> build its own regex version even on gnu systems (linux, hurd)?

Most likely yes, because the glibc version will not know about Emacs'
syntax tables and multibyte chars.

> Finally, GNU CLISP (http://clisp.cons.org) comes with a GNU regex.c of
> 1994(!) - which version should we upgrade to?  Emacs?  GLIBC? GNU grep?

The regex library in glibc-2.1 is derived from the same code base as the one
in Emacs.  I fixed several problems in the code for Emacs-21 and tried
to re-merge the glibc code with the new Emacs code (so as to fix the glibc
code as well) but never finished it (i.e. I did integrate some of glibc's
changes into Emacs' regex.c but not enough to be able to replace glibc's
version).
In the meantime, glibc has supposedly switched to a new code base for its
regexp engine (I know the glibc maintainers hated the regexp code, and I can
partly understand why, although the Emacs version has been somewhat
improved).  Last I heard, the new engine seemed inadequate (for now at
least) for Emacs and even somewhat unstable.

I don't know what GNU grep uses, but CVS uses the same code as Emacs
and so do several other GNU packages, so maybe GNU grep as well
(although they might not have been upgraded to the Emacs-21 code,
I don't know).
BTW, note that sharing the same code doesn't mean that they could be put
into a shared library, because they're compiled with different compilation
options.


	Stefan


* Re: regex.c: emacs & glibc (and xemacs, and grep and ...)
  2002-04-04 19:24 ` Stefan Monnier
@ 2002-04-05  1:25   ` Paul Eggert
  2002-04-05  2:47     ` Miles Bader
  2002-04-05 14:48     ` Stefan Monnier
  0 siblings, 2 replies; 12+ messages in thread
From: Paul Eggert @ 2002-04-05  1:25 UTC (permalink / raw)
  Cc: sds, emacs-devel

> From: "Stefan Monnier" <monnier+gnu/emacs@RUM.cs.yale.edu>
> Date: Thu, 04 Apr 2002 14:24:55 -0500
> 
> In the meantime, supposedly glibc has switched to a new code base for its
> regexp engine (I know glibc maintainers hated the regexp code, and I can
> partly understand why, although the Emacs version has been somewhat
> improved) which seems inadequate (for now at least) for Emacs and even
> somewhat unstable, last I heard of it.

My impression is that the new glibc engine still needs some work.
I've been communicating with Isamu Hasegawa, its author, and have
mostly focused on RE compatibility concerns, notably POSIX
conformance.  His first goal, understandably so, is to get it up to
speed for glibc's own purposes, and later to worry about other uses
like grep and Emacs.

In other news, the 2001 edition of the original Unix regular
expression code has been distributed under the LGPL by Caldera, the
current copyright holders.  This was done after the IBM contribution.
However, I haven't had a chance to look at the Unix code.  If you're
interested, it's in <http://unixtools.sourceforge.net/>.


> I don't know what GNU grep uses,

Grep uses a version of regex.c that forked from the glibc version
after Emacs did.  The GNU core utilities have a version that forked
from glibc after Emacs did, but before grep did.  I don't know about
the gnulib version (perhaps it's supposed to be identical to the Emacs
version?).

It is a bit of a mess.  With the exception of the new glibc code and
the Unix code, it should be relatively easy to merge all these
versions, if someone could find the time.


* Re: regex.c: emacs & glibc (and xemacs, and grep and ...)
  2002-04-05  1:25   ` Paul Eggert
@ 2002-04-05  2:47     ` Miles Bader
  2002-04-05 14:48     ` Stefan Monnier
  1 sibling, 0 replies; 12+ messages in thread
From: Miles Bader @ 2002-04-05  2:47 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, sds, emacs-devel

Paul Eggert <eggert@twinsun.com> writes:
> It is a bit of a mess.  With the exception of the new glibc code and
> the Unix code, it should be relatively easy to merge all these
> versions, if someone could find the time.

I guess the big question is, is the new glibc code promising enough to
just wait until that reaches a better state, and work from it instead?

-Miles
-- 
The secret to creativity is knowing how to hide your sources.
  --Albert Einstein


* Re: regex.c: emacs & glibc (and xemacs, and grep and ...)
  2002-04-05  1:25   ` Paul Eggert
  2002-04-05  2:47     ` Miles Bader
@ 2002-04-05 14:48     ` Stefan Monnier
  2002-04-05 23:41       ` Richard Stallman
  1 sibling, 1 reply; 12+ messages in thread
From: Stefan Monnier @ 2002-04-05 14:48 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, sds, emacs-devel

> > I don't know what GNU grep uses,
> 
> Grep uses a version of regex.c that forked from the glibc version
> after Emacs did.  The GNU core utilities have a version that forked
> from glibc after Emacs did, but before grep did.  I don't know about
> the gnulib version (perhaps it's supposed to be identical to the Emacs
> version?).

Yes, the gnulib version is the Emacs version (it is the very same RCS
file shared between the various CVS repositories).

> It is a bit of a mess.  With the exception of the new glibc code and
> the Unix code, it should be relatively easy to merge all these
> versions, if someone could find the time.

Given the kind of changes I've made to Emacs' code, merging into
the Emacs code will be easier than merging the Emacs changes into
some other fork of the code.


	Stefan


* Re: regex.c: emacs & glibc (and xemacs, and grep and ...)
  2002-04-04 18:55 regex.c: emacs & glibc (and xemacs, and grep and ...) Sam Steingold
  2002-04-04 19:24 ` Stefan Monnier
@ 2002-04-05 19:05 ` Richard Stallman
  2002-04-05 20:27   ` Sam Steingold
  1 sibling, 1 reply; 12+ messages in thread
From: Richard Stallman @ 2002-04-05 19:05 UTC (permalink / raw)
  Cc: emacs-devel

    Emacs comes with its own version of regex.c, even though most UNIX
    systems have a regex implementation.  IIUC, this is for portability.

Also because we had to have one for GNU.

    Suppose glibc's regex and Emacs' regex are unified - will Emacs still
    build its own regex version even on gnu systems (linux, hurd)?

Yes, because it has special conditionalized features specifically for
Emacs which were not compiled in when using it for Glibc.
Unless we were to provide interfaces for those things in Glibc,
the system-supplied regex will never be usable.
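
A minimal sketch of what "conditionalized features" can look like in C,
under loudly stated assumptions: the macro DEMO_EMACS_BUILD and the hook
demo_syntax_word_p are hypothetical names invented for this illustration,
not identifiers from regex.c.  The point is only that the same source file
yields different behaviour depending on a compile-time option, so a library
built without the option cannot offer the extra feature.

```c
#include <ctype.h>

#ifdef DEMO_EMACS_BUILD
/* Hypothetical hook: in an editor build, "word character" would be
   decided by the editor's own syntax table. */
extern int demo_syntax_word_p(int c);
# define IS_WORD_CHAR(c) demo_syntax_word_p(c)
#else
/* Stand-alone build: fall back to a fixed, locale-based definition. */
# define IS_WORD_CHAR(c) (isalnum((unsigned char) (c)) || (c) == '_')
#endif

/* Count the characters of S that the current build considers
   word constituents. */
int count_word_chars(const char *s)
{
    int n = 0;
    for (; *s != '\0'; s++)
        if (IS_WORD_CHAR(*s))
            n++;
    return n;
}
```

Compiled without -DDEMO_EMACS_BUILD, the function uses the fixed
definition; compiled with it, the object file needs the editor-side hook at
link time, which is why the two builds cannot simply share one binary.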

    Finally, GNU CLISP (http://clisp.cons.org) comes with a GNU regex.c of
    1994(!) - which version should we upgrade to?  Emacs?  GLIBC? GNU grep?

I wish I knew.  The versions diverged due to inattention,
and recently a completely new regex was put into Glibc.
I am told it has major problems.  Supposing they are fixed,
you will probably want to use that one in CLISP.


* Re: regex.c: emacs & glibc (and xemacs, and grep and ...)
  2002-04-05 19:05 ` Richard Stallman
@ 2002-04-05 20:27   ` Sam Steingold
  2002-04-06 17:32     ` Richard Stallman
  0 siblings, 1 reply; 12+ messages in thread
From: Sam Steingold @ 2002-04-05 20:27 UTC (permalink / raw)
  Cc: emacs-devel

> * In message <200204051905.g35J5a618932@aztec.santafe.edu>
> * On the subject of "Re: regex.c: emacs & glibc (and xemacs, and grep and ...)"
> * Sent on Fri, 5 Apr 2002 12:05:36 -0700 (MST)
> * Honorable Richard Stallman <rms@gnu.org> writes:
>
>     Finally, GNU CLISP (http://clisp.cons.org) comes with a GNU regex.c of
>     1994(!) - which version should we upgrade to?  Emacs?  GLIBC? GNU grep?
> 
> I wish I knew.  The versions diverged due to inattention,
> and recently a completely new regex was put into Glibc.
> I am told it has major problems.  Supposing they are fixed,
> you will probably want to use that one in CLISP.

Could you please elaborate?
What are the problems?
When are they expected to be fixed?
Why do you recommend it for CLISP (instead of the Emacs version)?

Thanks.

-- 
Sam Steingold (http://www.podval.org/~sds) running RedHat7.2 GNU/Linux
Keep Jerusalem united! <http://www.onejerusalem.org/Petition.asp>
Read, think and remember! <http://www.iris.org.il> <http://www.memri.org/>
"A pint of sweat will save a gallon of blood."          -- George S. Patton


* Re: regex.c: emacs & glibc (and xemacs, and grep and ...)
  2002-04-05 14:48     ` Stefan Monnier
@ 2002-04-05 23:41       ` Richard Stallman
  2002-04-08 16:26         ` Stefan Monnier
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Stallman @ 2002-04-05 23:41 UTC (permalink / raw)
  Cc: eggert, monnier+gnu/emacs, sds, emacs-devel

Would you like to merge some of these versions of regex?


* Re: regex.c: emacs & glibc (and xemacs, and grep and ...)
  2002-04-05 20:27   ` Sam Steingold
@ 2002-04-06 17:32     ` Richard Stallman
  0 siblings, 0 replies; 12+ messages in thread
From: Richard Stallman @ 2002-04-06 17:32 UTC (permalink / raw)
  Cc: emacs-devel

    > I wish I knew.  The versions diverged due to inattention,
    > and recently a completely new regex was put into Glibc.
    > I am told it has major problems.  Supposing they are fixed,
    > you will probably want to use that one in CLISP.

    Could you please elaborate?
    What are the problems?
    When are they expected to be fixed?

I have no such information, sorry.

    Why do you recommend it for CLISP (instead of the Emacs version)?

A full rewrite is probably cleaner.
I know nothing more about it.


* Re: regex.c: emacs & glibc (and xemacs, and grep and ...)
  2002-04-05 23:41       ` Richard Stallman
@ 2002-04-08 16:26         ` Stefan Monnier
  2002-04-08 19:22           ` Paul Eggert
  2002-04-10 14:23           ` Richard Stallman
  0 siblings, 2 replies; 12+ messages in thread
From: Stefan Monnier @ 2002-04-08 16:26 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, eggert, sds, emacs-devel

> Would you like to merge some of these versions of regex?

Which versions would that be?

The only one I know of is glibc but they have switched, so that doesn't
apply any more.

I'll try and check out GNU grep.

Any others?


	Stefan


* Re: regex.c: emacs & glibc (and xemacs, and grep and ...)
  2002-04-08 16:26         ` Stefan Monnier
@ 2002-04-08 19:22           ` Paul Eggert
  2002-04-10 14:23           ` Richard Stallman
  1 sibling, 0 replies; 12+ messages in thread
From: Paul Eggert @ 2002-04-08 19:22 UTC (permalink / raw)
  Cc: rms, sds, emacs-devel

> From: "Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu>
> Date: Mon, 08 Apr 2002 12:26:41 -0400
> 
> > Would you like to merge some of these versions of regex?
> 
> I'll try and check out GNU grep.

Thanks.  Could you please look at the latest CVS versions:

http://savannah.gnu.org/cgi-bin/viewcvs/grep/grep/lib/regex.c
http://savannah.gnu.org/cgi-bin/viewcvs/grep/grep/lib/regex.h
http://savannah.gnu.org/cgi-bin/viewcvs/grep/grep/lib/posix/regex.h


Please also look at the latest sources in fileutils.  This integrates
all the changes that I know of that occur in textutils, shellutils,
and diffutils.  It is quite close to what was in glibc before the
recent major change to glibc.

ftp://alpha.gnu.org/gnu/fetish/fileutils-4.1.8.tar.gz


libiberty has a copy of regex.c that renames all external
routines with an "x" prefix so they do not collide with the native
regex routines or with other components' regex routines.  This sounds
like useful functionality, though I think it should be enabled only if
the user defines a configuration macro.  libiberty may have other
changes; I don't know.
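
A minimal sketch of the renaming technique, with the suggested opt-in: the
macro PREFIX_REGEX_SYMBOLS and the function demo_match are hypothetical and
stand in for the real regex entry points; this is not libiberty's header.
When the configuration macro is defined, every use of the public name is
rewritten by the preprocessor, so the object file exports only the prefixed
symbol and cannot clash with a native routine of the same name.

```c
#include <string.h>

/* Hypothetical opt-in switch, e.g. passed as -DPREFIX_REGEX_SYMBOLS. */
#ifdef PREFIX_REGEX_SYMBOLS
# define demo_match xdemo_match   /* exported symbol becomes xdemo_match */
#endif

/* Stand-in for a library entry point: returns 1 if NEEDLE occurs in
   HAYSTACK, 0 otherwise.  Callers keep writing demo_match either way. */
int demo_match(const char *needle, const char *haystack)
{
    return strstr(haystack, needle) != NULL;
}
```

Callers that include the same renaming header compile unchanged in both
configurations; only the linker-visible name differs.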

I don't know why the GCC and GDB versions of libiberty regex differ so
greatly.  It appears to me that the GDB versions are newer, but I
suppose there could be useful changes in the GCC version too.

http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/libiberty/regex.c?cvsroot=src
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/include/xregex.h?cvsroot=src
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/include/xregex2.h?cvsroot=src

http://subversions.gnu.org/cgi-bin/viewcvs/gcc/gcc/include/xregex.h
http://subversions.gnu.org/cgi-bin/viewcvs/gcc/gcc/include/xregex2.h
http://subversions.gnu.org/cgi-bin/viewcvs/gcc/gcc/libiberty/regex.c


There are probably other versions of regex.c floating around, but I
think these are the major ones outside Emacs and glibc 2.2.


* Re: regex.c: emacs & glibc (and xemacs, and grep and ...)
  2002-04-08 16:26         ` Stefan Monnier
  2002-04-08 19:22           ` Paul Eggert
@ 2002-04-10 14:23           ` Richard Stallman
  1 sibling, 0 replies; 12+ messages in thread
From: Richard Stallman @ 2002-04-10 14:23 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, eggert, sds, emacs-devel

    > Would you like to merge some of these versions of regex?

    Which versions would that be?

    The only one I know of is glibc but they have switched, so that doesn't
    apply any more.

    I'll try and check out GNU grep.

I recall there was one in Gnulib.

It is probably not the time to do this merge now,
because we should see whether the new Glibc regex
is a suitable base for all the other programs to use.

