From: "Stefan Monnier"
Original-To: Sam Steingold
Cc: emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: regex.c: emacs & glibc (and xemacs, and grep and ...)
Date: Thu, 04 Apr 2002 14:24:55 -0500
Message-ID: <200204041924.g34JOt718874@rum.cs.yale.edu>

> Emacs comes with it's own version of regex.c, even though most UNIX
> systems have a regex implementation. IIUC, this is for portability.

It's for Emacs' portability as well as for the portability of elisp
programs (so the set of supported regexps is always the same) and, more
importantly, because the regexp routines need to know about Emacs'
internal buffer representation (a pair of char arrays, hence the need
for re_match_2), its multibyte char representation, and its syntax
tables.

> Suppose glibc's regex and Emacs' regex are unified - will Emacs still
> build it's own regex version even on gnu systems (linux, hurd)?

Most likely yes, because the glibc version will not know about Emacs'
syntax tables and multibyte chars.

> Finally, GNU CLISP (http://clisp.cons.org) comes with a GNU regex.c of
> 1994(!) - which version should we upgrade to? Emacs? GLIBC? GNU grep?

The regex library in glibc-2.1 is derived from the same code base as
Emacs' version. I fixed several problems in the code for Emacs-21 and
tried to re-merge the glibc code with the new Emacs code (so as to fix
the glibc code as well) but never finished it (i.e. I did integrate some
of glibc's changes into Emacs' regex.c, but not enough to be able to
replace glibc's version).

In the meantime, glibc has supposedly switched to a new code base for
its regexp engine (I know the glibc maintainers hated the regexp code,
and I can partly understand why, although the Emacs version has been
somewhat improved). Last I heard, the new engine seemed inadequate for
Emacs (for now at least) and even somewhat unstable.

I don't know what GNU grep uses, but CVS uses the same code as Emacs and
so do several other GNU packages, so maybe GNU grep does as well
(although they might not have been upgraded to the Emacs-21 code, I
don't know).

BTW, note that using the same code doesn't mean these copies could be
put into a shared library, because they're compiled with different
compilation options.


	Stefan
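
To make the "pair of char arrays" point above a bit more concrete, here
is a minimal sketch of why a matcher has to accept two strings when the
text is stored in two pieces. This is not code from Emacs' regex.c: the
struct and function names are made up for the illustration, and it only
does literal matching, not regexps.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for text stored as two separate char arrays,
   i.e. the part before a split and the part after it. */
struct two_part_text {
    const char *s1; size_t n1;   /* first array and its length  */
    const char *s2; size_t n2;   /* second array and its length */
};

/* Fetch the character at logical position `pos`, picking the right
   array; returns -1 when `pos` is past the end of the text. */
int char_at(const struct two_part_text *t, size_t pos)
{
    if (pos < t->n1)
        return (unsigned char) t->s1[pos];
    pos -= t->n1;
    if (pos < t->n2)
        return (unsigned char) t->s2[pos];
    return -1;
}

/* Return 1 if the literal `pat` matches at logical position `start`,
   even when the match straddles the boundary between the arrays. */
int literal_match_at(const struct two_part_text *t, size_t start,
                     const char *pat)
{
    size_t i;
    for (i = 0; pat[i] != '\0'; i++)
        if (char_at(t, start + i) != (unsigned char) pat[i])
            return 0;
    return 1;
}

/* Return the first logical position where `pat` occurs, or -1. */
long literal_search(const struct two_part_text *t, const char *pat)
{
    size_t total = t->n1 + t->n2;
    size_t len = strlen(pat);
    size_t pos;

    if (len > total)
        return -1;
    for (pos = 0; pos + len <= total; pos++)
        if (literal_match_at(t, pos, pat))
            return (long) pos;
    return -1;
}
```

With the text split as "hello wo" and "rld", literal_search finds
"world" at position 6 although no single array contains it. A matcher
that only takes one contiguous string would first need the two arrays
copied into one, which is the cost a two-string entry point avoids.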