* regex.c simplification @ 2018-06-16 15:35 Daniel Colascione 2018-06-16 15:53 ` Eli Zaretskii ` (2 more replies) 0 siblings, 3 replies; 30+ messages in thread From: Daniel Colascione @ 2018-06-16 15:35 UTC (permalink / raw) To: emacs-devel I was doing some work on regex.c just now, and I was frustrated that the code is unnecessarily complicated by the ifdefs necessary to support some theoretical non-Emacs use case. Is all of this complexity really necessary? Are we sure the !emacs case even compiles? Are there non-Emacs users of the Emacs regex code? Can we just fork the implementation? How about baking in switches like MATCH_MAY_ALLOCATE? ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-06-16 15:35 regex.c simplification Daniel Colascione @ 2018-06-16 15:53 ` Eli Zaretskii 2018-06-16 16:11 ` Paul Eggert 2018-06-16 16:12 ` Daniel Colascione 2018-06-16 16:09 ` Noam Postavsky 2018-06-16 16:35 ` Perry E. Metzger 2 siblings, 2 replies; 30+ messages in thread From: Eli Zaretskii @ 2018-06-16 15:53 UTC (permalink / raw) To: Daniel Colascione; +Cc: emacs-devel > Date: Sat, 16 Jun 2018 08:35:34 -0700 > From: "Daniel Colascione" <dancol@dancol.org> > > I was doing some work on regex.c just now, and I was frustrated that the > code is unnecessarily complicated by the ifdefs necessary to support some > theoretical non-Emacs use case. Is all of this complexity really > necessary? Are we sure the !emacs case even compiles? Are there non-Emacs > users of the Emacs regex code? Can we just fork the implementation? How > about baking in switches like MATCH_MAY_ALLOCATE? I think we still haven't abandoned the hope of updating to the latest glibc/gnulib versions of regex.c, although I'm not sure how practical these hopes are at this point. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification
From: Paul Eggert @ 2018-06-16 16:11 UTC
To: Eli Zaretskii, Daniel Colascione; +Cc: emacs-devel

Eli Zaretskii wrote:
> I think we still haven't abandoned the hope of updating to the latest
> glibc/gnulib versions of regex.c, although I'm not sure how practical
> these hopes are at this point.

That's been on my list of things to do for ages. I don't know if it'll ever get done, or even whether it's worth doing.

As far as I know, Emacs is the only package that still uses the "old" regex.c code derived from pre-2002 glibc. Everybody else has migrated to the "new" regex.c code that was contributed to glibc in 2002 and is in Gnulib. So, in some sense regex.c has already forked; we just haven't made it official.

A complication: src/regex.c is compiled twice, once within lib-src (for etags) and once within src (for Emacs proper), and the "#if defined emacs" stuff in src/regex.c matters for this.

If we wanted to make the fork more official, we could simplify src/regex.c to not worry about lib-src, by having etags use Glibc/Gnulib regex rather than Emacs regex. That would be easy for me to arrange, if you like. Once we did that, you could simplify src/regex.c by assuming that 'emacs' is defined.

None of this would preclude us from eventually merging Emacs src/regex.c with Gnulib/glibc, a task that is so hard that the changes Daniel is thinking about wouldn't make it much harder.

While we're on the topic, a couple more comments about regex code.

The "old" and the "new" regex implementations both have problems. The old one has serious performance problems in some cases, and fails to conform to POSIX. The new one is typically better in both departments, but is so complicated that no maintainer understands it (I have attempted to contact the original contributor Isamu Hasegawa of Square Enix Co., Ltd., but have never heard back), so its (hopefully few) bugs remain unfixed.

The Perl regular expression library is popular in other free software and appears to be better maintained than either "old" or "new" regexp code. GNU Grep, for example, uses either the "new" regexp code or the Perl library, depending on command-line options. The Perl library tends to be more like the "old" regex implementation, in that it prefers functionality and flexibility to performance; however, it has many more features than the "old" regex code does. Among other things, it supports a more-readable regular expression syntax (a topic that came up recently on this mailing list in another context).
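A minimal sketch of the kind of "#if defined emacs" split mentioned in this message, where one source file is built once with the macro defined and once without. This is not taken from src/regex.c: the names `syntax_is_word`, `IS_WORD_CHAR` and `word_prefix_len` are hypothetical, and the "editor" branch only stands in for a real syntax-table lookup.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical illustration: one file, two builds.  Compiling with
   -Demacs selects the editor-specific branch; otherwise the generic
   branch is used.  */
#ifdef emacs
/* Stand-in for an editor-side syntax-table lookup (an assumption).  */
static bool
syntax_is_word (unsigned char c)
{
  return isalnum (c) || c == '_' || c == '-';
}
# define IS_WORD_CHAR(c) syntax_is_word ((unsigned char) (c))
#else
# define IS_WORD_CHAR(c) (isalnum ((unsigned char) (c)) || (c) == '_')
#endif

/* Count the leading word-constituent characters of S.  */
size_t
word_prefix_len (const char *s)
{
  assert (s != NULL);
  size_t n = 0;
  while (s[n] != '\0' && IS_WORD_CHAR (s[n]))
    n++;
  return n;
}
```

Every such branch has to be kept compiling in both configurations, which is the maintenance cost being discussed; once only one configuration remains, the `#else` arm can simply be deleted.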
* Re: regex.c simplification
From: Daniel Colascione @ 2018-06-16 16:17 UTC
To: Paul Eggert; +Cc: Eli Zaretskii, Daniel Colascione, emacs-devel

> Eli Zaretskii wrote:
>> I think we still haven't abandoned the hope of updating to the latest
>> glibc/gnulib versions of regex.c, although I'm not sure how practical
>> these hopes are at this point.
>
> That's been on my list of things to do for ages. I don't know if it'll ever get done, or even whether it's worth doing.
>
> As far as I know, Emacs is the only package that still uses the "old" regex.c code derived from pre-2002 glibc. Everybody else has migrated to the "new" regex.c code that was contributed to glibc in 2002 and is in Gnulib. So, in some sense regex.c has already forked; we just haven't made it official.
>
> A complication: src/regex.c is compiled twice, once within lib-src (for etags) and once within src (for Emacs proper), and the "#if defined emacs" stuff in src/regex.c matters for this.
>
> If we wanted to make the fork more official, we could simplify src/regex.c to not worry about lib-src, by having etags use Glibc/Gnulib regex rather than Emacs regex.

That's probably a good idea. The other approach would be to run etags inside a real Emacs context somehow, and that seems too complicated.

> That would be easy for me to arrange, if you like.

Thanks.

> While we're on the topic, a couple of more comments about regex code.

The regex API could be a lot better too. It'd be nice to expose the pattern compilation machinery to lisp as some kind of new pattern pvec object, then let lisp manage the cache. The nice thing about doing it this way is that you could transparently support having multiple different kinds of pattern --- e.g., PEGs, PCRE-syntax REs --- and use them transparently, since you'd be able to supply a pattern object anywhere you pass a regex string today.
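To make the pattern-object idea a little more concrete, here is a rough sketch in C of an opaque pattern value that dispatches to whichever engine created it. Everything in it is hypothetical: `struct pattern`, `pattern_matches` and the two toy engines ("literal" and "prefix") are stand-ins chosen for brevity, not PEG or PCRE support and not an existing Emacs interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical pattern object: callers hold a struct pattern and never
   need to know which engine sits behind it.  */
struct pattern
{
  const char *kind;             /* engine name, e.g. "literal" */
  bool (*matches) (const struct pattern *, const char *text);
  const char *source;           /* pattern text as supplied */
};

/* Toy engine 1: match if SOURCE occurs anywhere in TEXT.  */
static bool
literal_matches (const struct pattern *p, const char *text)
{
  return strstr (text, p->source) != NULL;
}

/* Toy engine 2: match only if TEXT starts with SOURCE.  */
static bool
prefix_matches (const struct pattern *p, const char *text)
{
  return strncmp (text, p->source, strlen (p->source)) == 0;
}

struct pattern
make_literal_pattern (const char *source)
{
  struct pattern p = { "literal", literal_matches, source };
  return p;
}

struct pattern
make_prefix_pattern (const char *source)
{
  struct pattern p = { "prefix", prefix_matches, source };
  return p;
}

/* Single entry point: the same call works for every pattern kind.  */
bool
pattern_matches (const struct pattern *p, const char *text)
{
  assert (p != NULL && p->matches != NULL && text != NULL);
  return p->matches (p, text);
}
```

A caller that accepts a `struct pattern` instead of a pattern string would not change when a new engine is added, which is the transparency argued for above.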
* Re: regex.c simplification 2018-06-16 16:11 ` Paul Eggert 2018-06-16 16:17 ` Daniel Colascione @ 2018-06-16 18:06 ` Andreas Schwab 2018-06-16 19:27 ` Perry E. Metzger 2018-06-18 14:08 ` Stefan Monnier 2 siblings, 1 reply; 30+ messages in thread From: Andreas Schwab @ 2018-06-16 18:06 UTC (permalink / raw) To: Paul Eggert; +Cc: Eli Zaretskii, Daniel Colascione, emacs-devel The problem is that none of the other regex implementations support a gap. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different." ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-06-16 18:06 ` Andreas Schwab @ 2018-06-16 19:27 ` Perry E. Metzger 2018-06-17 16:50 ` Clément Pit-Claudel 0 siblings, 1 reply; 30+ messages in thread From: Perry E. Metzger @ 2018-06-16 19:27 UTC (permalink / raw) To: Andreas Schwab; +Cc: Eli Zaretskii, Paul Eggert, Daniel Colascione, emacs-devel On Sat, 16 Jun 2018 20:06:43 +0200 Andreas Schwab <schwab@linux-m68k.org> wrote: > The problem is that none of the other regex implementations support > a gap. Not quite. A couple of them (say TRE) support having a mechanism to fetch the next character rather than assuming they're present in a flat array or what have you, which would allow for dealing with a gap buffer. Perry -- Perry E. Metzger perry@piermont.com ^ permalink raw reply [flat|nested] 30+ messages in thread
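A small sketch of why a "fetch the next character" hook is enough to search a gap buffer. This does not show TRE's actual interface; `struct gap_buffer`, `gap_char_at` and `gap_find` are hypothetical names, and the matcher is a naive substring search rather than a regex engine. The point is only that the matcher reads text exclusively through the accessor, so it never sees the gap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical gap buffer: the text is data[0, gap_start) followed by
   data[gap_end, size); the bytes in between are unused.  */
struct gap_buffer
{
  const char *data;
  size_t gap_start;
  size_t gap_end;
  size_t size;
};

/* Logical length of the text, excluding the gap.  */
size_t
gap_text_len (const struct gap_buffer *b)
{
  return b->size - (b->gap_end - b->gap_start);
}

/* Fetch the character at logical position POS, skipping the gap.
   Returns -1 past the end of the text.  */
int
gap_char_at (const struct gap_buffer *b, size_t pos)
{
  if (pos >= gap_text_len (b))
    return -1;
  size_t phys = pos < b->gap_start ? pos : pos + (b->gap_end - b->gap_start);
  return (unsigned char) b->data[phys];
}

/* Naive substring search that reads only through gap_char_at.
   Returns the logical index of the first match, or -1.  */
long
gap_find (const struct gap_buffer *b, const char *needle)
{
  assert (b != NULL && needle != NULL);
  size_t n = gap_text_len (b);
  size_t m = strlen (needle);
  if (m > n)
    return -1;
  for (size_t start = 0; start + m <= n; start++)
    {
      size_t k = 0;
      while (k < m && gap_char_at (b, start + k) == (unsigned char) needle[k])
        k++;
      if (k == m)
        return (long) start;
    }
  return -1;
}
```

With the 14-byte array "hel___lo world" and a gap covering bytes 3 to 5, the logical text is "hello world", and a search for "llo" succeeds even though the match straddles the gap.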
* Re: regex.c simplification 2018-06-16 19:27 ` Perry E. Metzger @ 2018-06-17 16:50 ` Clément Pit-Claudel 0 siblings, 0 replies; 30+ messages in thread From: Clément Pit-Claudel @ 2018-06-17 16:50 UTC (permalink / raw) To: emacs-devel On 2018-06-16 15:27, Perry E. Metzger wrote: > On Sat, 16 Jun 2018 20:06:43 +0200 Andreas Schwab > <schwab@linux-m68k.org> wrote: >> The problem is that none of the other regex implementations support >> a gap. > > Not quite. A couple of them (say TRE) support having a mechanism to > fetch the next character rather than assuming they're present in a > flat array or what have you, which would allow for dealing with a gap > buffer. Yeah, but TRE is unmaintained, and has open security issues on its tracker :/ PCRE *should* support a gap, but in practice it doesn't (suspending a search and resuming it in another buffer isn't guaranteed to give the same results as it would have on a single contiguous buffer). There's some relevant context at https://lists.gnu.org/archive/html/emacs-devel/2016-12/msg00622.html Clément. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-06-16 16:11 ` Paul Eggert 2018-06-16 16:17 ` Daniel Colascione 2018-06-16 18:06 ` Andreas Schwab @ 2018-06-18 14:08 ` Stefan Monnier 2018-07-17 23:58 ` Paul Eggert 2 siblings, 1 reply; 30+ messages in thread From: Stefan Monnier @ 2018-06-18 14:08 UTC (permalink / raw) To: emacs-devel > If we wanted to make the fork more official, we could simplify src/regex.c > to not worry about lib-src, by having etags use Glibc/Gnulib regex rather > than Emacs regex. I would welcome such a change. Stefan ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-06-18 14:08 ` Stefan Monnier @ 2018-07-17 23:58 ` Paul Eggert 2018-07-20 0:33 ` Stefan Monnier 0 siblings, 1 reply; 30+ messages in thread From: Paul Eggert @ 2018-07-17 23:58 UTC (permalink / raw) To: Stefan Monnier, emacs-devel Stefan Monnier wrote: >> If we wanted to make the fork more official, we could simplify src/regex.c >> to not worry about lib-src, by having etags use Glibc/Gnulib regex rather >> than Emacs regex. > I would welcome such a change. I started the ball rolling by writing a patch that changes etags to use Glibc regex, falling back on a Gnulib copy if Glibc is not available; see Bug#32194. We can follow up later by simplifying the Emacs-only regex code to assume Emacs. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-07-17 23:58 ` Paul Eggert @ 2018-07-20 0:33 ` Stefan Monnier 2018-07-20 0:59 ` Paul Eggert 0 siblings, 1 reply; 30+ messages in thread From: Stefan Monnier @ 2018-07-20 0:33 UTC (permalink / raw) To: emacs-devel >>> If we wanted to make the fork more official, we could simplify src/regex.c >>> to not worry about lib-src, by having etags use Glibc/Gnulib regex rather >>> than Emacs regex. >> I would welcome such a change. > I started the ball rolling by writing a patch that changes etags to use > Glibc regex, falling back on a Gnulib copy if Glibc is not available; see > Bug#32194. We can follow up later by simplifying the Emacs-only regex code > to assume Emacs. I wonder: does etags use regexps internally, or only to handle user-provided "--regex" arguments? More specifically, does it come with its own set of hardcoded regexps? Stefan ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-07-20 0:33 ` Stefan Monnier @ 2018-07-20 0:59 ` Paul Eggert 2018-07-20 1:42 ` Stefan Monnier 2018-07-20 6:58 ` Eli Zaretskii 0 siblings, 2 replies; 30+ messages in thread From: Paul Eggert @ 2018-07-20 0:59 UTC (permalink / raw) To: Stefan Monnier, emacs-devel Stefan Monnier wrote: > does etags use regexps internally, or only to handle > user-provided "--regex" arguments? > More specifically, does it come with its own set of hardcoded regexps? No, etags uses the regexp code only for --regex arguments. It would of course be simpler to disable --regex on platforms lacking the glibc regex API. However, my impression is that etags --regex gets some use. For example: https://stackoverflow.com/questions/21283687/what-do-you-put-in-your-standard-etags-regex-calls http://xahlee.info/comp/ctags_etags_gtags.html ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-07-20 0:59 ` Paul Eggert @ 2018-07-20 1:42 ` Stefan Monnier 2018-07-20 6:59 ` Eli Zaretskii 2018-07-20 6:58 ` Eli Zaretskii 1 sibling, 1 reply; 30+ messages in thread From: Stefan Monnier @ 2018-07-20 1:42 UTC (permalink / raw) To: emacs-devel >> does etags use regexps internally, or only to handle >> user-provided "--regex" arguments? >> More specifically, does it come with its own set of hardcoded regexps? > No, etags uses the regexp code only for --regex arguments. It would of > course be simpler to disable --regex on platforms lacking the glibc regex > API. However, my impression is that etags --regex gets some use. For > example: > https://stackoverflow.com/questions/21283687/what-do-you-put-in-your-standard-etags-regex-calls > http://xahlee.info/comp/ctags_etags_gtags.html I was thinking of just always using the libc regexp code (whether it's GNU libc or something else). Stefan ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-07-20 1:42 ` Stefan Monnier @ 2018-07-20 6:59 ` Eli Zaretskii 2018-07-20 21:49 ` Paul Eggert 0 siblings, 1 reply; 30+ messages in thread From: Eli Zaretskii @ 2018-07-20 6:59 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Date: Thu, 19 Jul 2018 21:42:48 -0400 > > I was thinking of just always using the libc regexp code (whether it's > GNU libc or something else). Yes, that'd be a possibility. Do we have any supported platform that does NOT have its own regexp code, whether in libc or as a separate library? ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification
From: Paul Eggert @ 2018-07-20 21:49 UTC
To: Eli Zaretskii, Stefan Monnier; +Cc: emacs-devel

On 07/19/2018 11:59 PM, Eli Zaretskii wrote:
>> I was thinking of just always using the libc regexp code (whether it's
>> GNU libc or something else).
> Yes, that'd be a possibility. Do we have any supported platform that
> does NOT have its own regexp code, whether in libc or as a separate
> library?

Every POSIX-conforming platform has regexp code somewhere, using the POSIX API. However, I can see some trouble using that code:

* Some of the libc regex implementations have been reasonably buggy. Most GNU apps don't use these implementations any more so I'm not sure what their status is.

* We may need to use an option like -lregex to get the system library implementation, and that would have to be configured.

* Perhaps 'etags' users are using GNU extensions in their regular expressions, and if we switch to the libc API their usage will break.

* You're the expert, but as far as I know MS-Windows does not support the POSIX API so presumably we'd have to provide a substitute anyway, for MS-Windows.

* etags uses the GNU API so it would have to be changed to use the POSIX API.
* Re: regex.c simplification 2018-07-20 21:49 ` Paul Eggert @ 2018-07-21 6:43 ` Eli Zaretskii 2018-07-21 7:17 ` Paul Eggert 0 siblings, 1 reply; 30+ messages in thread From: Eli Zaretskii @ 2018-07-21 6:43 UTC (permalink / raw) To: Paul Eggert; +Cc: monnier, emacs-devel > Cc: emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Fri, 20 Jul 2018 14:49:15 -0700 > > On 07/19/2018 11:59 PM, Eli Zaretskii wrote: > >> I was thinking of just always using the libc regexp code (whether it's > >> GNU libc or something else). > > Yes, that'd be a possibility. Do we have any supported platform that > > does NOT have its own regexp code, whether in libc or as a separate > > library? > > > Every POSIX-conforming platform has regexp code somewhere, using the > POSIX API. However, I can see some trouble using that code: > > * Some of libc regex implementations have been reasonably buggy. Most > GNU apps don't use these implementations any more so I'm not sure what > their status is. > > * We may need to use an option like -lregex to get the system library > implementation, and that would have to be configured. > > * Perhaps 'etags' users are using GNU extensions in their regular > expressions, and if we switch to the libc API their usage will break. We could recommend such users to install GNU regexp, which AFAIK exposes the Posix API as well. > * You're the expert, but as far as I know MS-Windows does not support > the POSIX API so presumably we'd have to provide a substitute anyway, > for MS-Windows. GNU regexp is available as a separate library on Windows, I used it in several ports of GNU and Unix packages. > * etags uses the GNU API so it would have to be changed to use the POSIX > API. Right. There's still the alternative which I asked about a couple of days ago: use the Gnulib regexp without the additional code pulled in by mbrtowc, I hope that's a viable option. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-07-21 6:43 ` Eli Zaretskii @ 2018-07-21 7:17 ` Paul Eggert 2018-08-01 0:17 ` Paul Eggert 0 siblings, 1 reply; 30+ messages in thread From: Paul Eggert @ 2018-07-21 7:17 UTC (permalink / raw) To: Eli Zaretskii; +Cc: monnier, emacs-devel Eli Zaretskii wrote: >> * Perhaps 'etags' users are using GNU extensions in their regular >> expressions, and if we switch to the libc API their usage will break. > > We could recommend such users to install GNU regexp, which AFAIK > exposes the Posix API as well. I assume you mean GNU regex. That project is long dead, and has been superseded by Gnulib. I would not recommend it for Emacs usage. See: https://www.gnu.org/software/regex/ > There's still the alternative which I asked about a couple of days > ago: use the Gnulib regexp without the additional code pulled in by > mbrtowc, I hope that's a viable option. Yes, I've built that and am testing it. I plan to report back soon. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-07-21 7:17 ` Paul Eggert @ 2018-08-01 0:17 ` Paul Eggert 2018-08-01 2:38 ` Brett Gilio 0 siblings, 1 reply; 30+ messages in thread From: Paul Eggert @ 2018-08-01 0:17 UTC (permalink / raw) To: Eli Zaretskii; +Cc: monnier, emacs-devel On 07/21/2018 12:17 AM, Paul Eggert wrote: >> There's still the alternative which I asked about a couple of days >> ago: use the Gnulib regexp without the additional code pulled in by >> mbrtowc, I hope that's a viable option. > > Yes, I've built that and am testing it. I plan to report back soon. I tested it a bit, simplified the regex code on the Emacs side, and sent a new set of patches here: https://bugs.gnu.org/32194#11 This eliminates about 2500 lines of Emacs C source code, yeay! More improvement could be done, but it is getting time to merge in what I've got. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-08-01 0:17 ` Paul Eggert @ 2018-08-01 2:38 ` Brett Gilio 0 siblings, 0 replies; 30+ messages in thread From: Brett Gilio @ 2018-08-01 2:38 UTC (permalink / raw) To: Paul Eggert; +Cc: Eli Zaretskii, monnier, emacs-devel Paul Eggert writes: > On 07/21/2018 12:17 AM, Paul Eggert wrote: >>> There's still the alternative which I asked about a couple of days >>> ago: use the Gnulib regexp without the additional code pulled in by >>> mbrtowc, I hope that's a viable option. >> >> Yes, I've built that and am testing it. I plan to report back soon. > > I tested it a bit, simplified the regex code on the Emacs side, and sent a new > set of patches here: > > https://bugs.gnu.org/32194#11 > > This eliminates about 2500 lines of Emacs C source code, yeay! More improvement > could be done, but it is getting time to merge in what I've got. Thank you for your work, Paul. It is nice to see the source code getting a little bit lighter, rather than to the contrary. -- Brett M. Gilio Free Software Foundation, Member https://parabola.nu | https://emacs.org ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-07-20 0:59 ` Paul Eggert 2018-07-20 1:42 ` Stefan Monnier @ 2018-07-20 6:58 ` Eli Zaretskii 1 sibling, 0 replies; 30+ messages in thread From: Eli Zaretskii @ 2018-07-20 6:58 UTC (permalink / raw) To: Paul Eggert; +Cc: monnier, emacs-devel > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Thu, 19 Jul 2018 17:59:12 -0700 > > No, etags uses the regexp code only for --regex arguments. It would of course be > simpler to disable --regex on platforms lacking the glibc regex API. However, my > impression is that etags --regex gets some use. For example: > > https://stackoverflow.com/questions/21283687/what-do-you-put-in-your-standard-etags-regex-calls > http://xahlee.info/comp/ctags_etags_gtags.html We actually use the --regex switch in our own Makefile, for the TAGS target, see src/Makefile.in. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-06-16 15:53 ` Eli Zaretskii 2018-06-16 16:11 ` Paul Eggert @ 2018-06-16 16:12 ` Daniel Colascione 2018-06-16 16:43 ` Perry E. Metzger 1 sibling, 1 reply; 30+ messages in thread From: Daniel Colascione @ 2018-06-16 16:12 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Daniel Colascione, emacs-devel >> Date: Sat, 16 Jun 2018 08:35:34 -0700 >> From: "Daniel Colascione" <dancol@dancol.org> >> >> I was doing some work on regex.c just now, and I was frustrated that the >> code is unnecessarily complicated by the ifdefs necessary to support >> some >> theoretical non-Emacs use case. Is all of this complexity really >> necessary? Are we sure the !emacs case even compiles? Are there >> non-Emacs >> users of the Emacs regex code? Can we just fork the implementation? How >> about baking in switches like MATCH_MAY_ALLOCATE? > > I think we still haven't abandoned the hope of updating to the latest > glibc/gnulib versions of regex.c, although I'm not sure how practical > these hopes are at this point. I checked out the latest glibc and gnulib sources. Both are so far diverged that I think updating Emacs to that code is hopeless. (They have a DFA mode, for example.) ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-06-16 16:12 ` Daniel Colascione @ 2018-06-16 16:43 ` Perry E. Metzger 0 siblings, 0 replies; 30+ messages in thread From: Perry E. Metzger @ 2018-06-16 16:43 UTC (permalink / raw) To: Daniel Colascione; +Cc: Eli Zaretskii, emacs-devel On Sat, 16 Jun 2018 09:12:34 -0700 "Daniel Colascione" <dancol@dancol.org> wrote: > I checked out the latest glibc and gnulib sources. Both are so far > diverged that I think updating Emacs to that code is hopeless. > (They have a DFA mode, for example.) That probably performs a whole lot better on large searches. :( -- Perry E. Metzger perry@piermont.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-06-16 15:35 regex.c simplification Daniel Colascione 2018-06-16 15:53 ` Eli Zaretskii @ 2018-06-16 16:09 ` Noam Postavsky 2018-06-16 16:35 ` Perry E. Metzger 2 siblings, 0 replies; 30+ messages in thread From: Noam Postavsky @ 2018-06-16 16:09 UTC (permalink / raw) To: Daniel Colascione; +Cc: Emacs developers On 16 June 2018 at 11:35, Daniel Colascione <dancol@dancol.org> wrote: > I was doing some work on regex.c just now, and I was frustrated that the > code is unnecessarily complicated by the ifdefs necessary to support some > theoretical non-Emacs use case. Is all of this complexity really > necessary? Are we sure the !emacs case even compiles? Are there non-Emacs > users of the Emacs regex code? In terms of #ifndef emacs, I believe lib-src/etags.c uses that. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-06-16 15:35 regex.c simplification Daniel Colascione 2018-06-16 15:53 ` Eli Zaretskii 2018-06-16 16:09 ` Noam Postavsky @ 2018-06-16 16:35 ` Perry E. Metzger 2018-06-16 16:42 ` Daniel Colascione 2 siblings, 1 reply; 30+ messages in thread From: Perry E. Metzger @ 2018-06-16 16:35 UTC (permalink / raw) To: Daniel Colascione; +Cc: emacs-devel On Sat, 16 Jun 2018 08:35:34 -0700 "Daniel Colascione" <dancol@dancol.org> wrote: > I was doing some work on regex.c just now, and I was frustrated > that the code is unnecessarily complicated by the ifdefs necessary > to support some theoretical non-Emacs use case. Is all of this > complexity really necessary? Are we sure the !emacs case even > compiles? Are there non-Emacs users of the Emacs regex code? Can we > just fork the implementation? How about baking in switches like > MATCH_MAY_ALLOCATE? The emacs regex code is hardly state of the art. I would suggest that there are many other, better, free software implementations of regexes. Indeed, arguably at some point the Emacs regex code could use an overhaul. Perry -- Perry E. Metzger perry@piermont.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-06-16 16:35 ` Perry E. Metzger @ 2018-06-16 16:42 ` Daniel Colascione 2018-06-16 16:55 ` Eli Zaretskii 0 siblings, 1 reply; 30+ messages in thread From: Daniel Colascione @ 2018-06-16 16:42 UTC (permalink / raw) To: Perry E. Metzger; +Cc: Daniel Colascione, emacs-devel > On Sat, 16 Jun 2018 08:35:34 -0700 "Daniel Colascione" > <dancol@dancol.org> wrote: >> I was doing some work on regex.c just now, and I was frustrated >> that the code is unnecessarily complicated by the ifdefs necessary >> to support some theoretical non-Emacs use case. Is all of this >> complexity really necessary? Are we sure the !emacs case even >> compiles? Are there non-Emacs users of the Emacs regex code? Can we >> just fork the implementation? How about baking in switches like >> MATCH_MAY_ALLOCATE? > > The emacs regex code is hardly state of the art. I would suggest that > there are many other, better, free software implementations of > regexes. There are. Unfortunately, none of them understand predicates like \= and \s|. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-06-16 16:42 ` Daniel Colascione @ 2018-06-16 16:55 ` Eli Zaretskii 2018-06-16 18:24 ` Perry E. Metzger 0 siblings, 1 reply; 30+ messages in thread From: Eli Zaretskii @ 2018-06-16 16:55 UTC (permalink / raw) To: Daniel Colascione; +Cc: dancol, emacs-devel, perry > Date: Sat, 16 Jun 2018 09:42:37 -0700 > From: "Daniel Colascione" <dancol@dancol.org> > Cc: Daniel Colascione <dancol@dancol.org>, emacs-devel@gnu.org > > > The emacs regex code is hardly state of the art. I would suggest that > > there are many other, better, free software implementations of > > regexes. > > There are. Unfortunately, none of them understand predicates like \= and \s|. Right. And there are a few more features important to Emacs that other implementations don't support. So, while I think modernizing our regex code would be a welcome development, we shouldn't mislead ourselves into thinking that any other implementation could be a drop-in replacement. Some work will be needed to add the features we expect. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-06-16 16:55 ` Eli Zaretskii @ 2018-06-16 18:24 ` Perry E. Metzger 2018-06-16 18:29 ` Eli Zaretskii 0 siblings, 1 reply; 30+ messages in thread From: Perry E. Metzger @ 2018-06-16 18:24 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Daniel Colascione, emacs-devel On Sat, 16 Jun 2018 19:55:42 +0300 Eli Zaretskii <eliz@gnu.org> wrote: > > Date: Sat, 16 Jun 2018 09:42:37 -0700 > > From: "Daniel Colascione" <dancol@dancol.org> > > Cc: Daniel Colascione <dancol@dancol.org>, emacs-devel@gnu.org > > > > > The emacs regex code is hardly state of the art. I would > > > suggest that there are many other, better, free software > > > implementations of regexes. > > > > There are. Unfortunately, none of them understand predicates like > > \= and \s|. > > Right. And there are a few more features important to Emacs that > other implementations don't support. > > So, while I think modernizing our regex code would be a welcome > development, we shouldn't mislead ourselves into thinking that any > other implementation could be a drop-in replacement. Some work will > be needed to add the features we expect. > I was arguing in the opposite direction, that there isn't much point in thinking others will be interested in using the Emacs regex code in the future. -- Perry E. Metzger perry@piermont.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-06-16 18:24 ` Perry E. Metzger @ 2018-06-16 18:29 ` Eli Zaretskii 2018-06-16 18:58 ` Perry E. Metzger 0 siblings, 1 reply; 30+ messages in thread From: Eli Zaretskii @ 2018-06-16 18:29 UTC (permalink / raw) To: Perry E. Metzger; +Cc: dancol, emacs-devel > Date: Sat, 16 Jun 2018 14:24:02 -0400 > From: "Perry E. Metzger" <perry@piermont.com> > Cc: "Daniel Colascione" <dancol@dancol.org>, emacs-devel@gnu.org > > > So, while I think modernizing our regex code would be a welcome > > development, we shouldn't mislead ourselves into thinking that any > > other implementation could be a drop-in replacement. Some work will > > be needed to add the features we expect. > > > > I was arguing in the opposite direction, that there isn't much point > in thinking others will be interested in using the Emacs regex code > in the future. How's that relevant to the issue at hand? ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-06-16 18:29 ` Eli Zaretskii @ 2018-06-16 18:58 ` Perry E. Metzger 2018-06-16 19:27 ` Eli Zaretskii 0 siblings, 1 reply; 30+ messages in thread From: Perry E. Metzger @ 2018-06-16 18:58 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dancol, emacs-devel On Sat, 16 Jun 2018 21:29:55 +0300 Eli Zaretskii <eliz@gnu.org> wrote: > > Date: Sat, 16 Jun 2018 14:24:02 -0400 > > From: "Perry E. Metzger" <perry@piermont.com> > > Cc: "Daniel Colascione" <dancol@dancol.org>, emacs-devel@gnu.org > > > > > So, while I think modernizing our regex code would be a welcome > > > development, we shouldn't mislead ourselves into thinking that > > > any other implementation could be a drop-in replacement. Some > > > work will be needed to add the features we expect. > > > > > > > I was arguing in the opposite direction, that there isn't much > > point in thinking others will be interested in using the Emacs > > regex code in the future. > > How's that relevant to the issue at hand? > The original question was "should we keep the code that isn't needed by emacs on the premise something else might need it someday." I was implying that, no, the odds that something else would want it someday seem low. Perry -- Perry E. Metzger perry@piermont.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-06-16 18:58 ` Perry E. Metzger @ 2018-06-16 19:27 ` Eli Zaretskii 2018-06-18 9:36 ` Robert Pluim 0 siblings, 1 reply; 30+ messages in thread From: Eli Zaretskii @ 2018-06-16 19:27 UTC (permalink / raw) To: Perry E. Metzger; +Cc: dancol, emacs-devel > Date: Sat, 16 Jun 2018 14:58:56 -0400 > From: "Perry E. Metzger" <perry@piermont.com> > Cc: dancol@dancol.org, emacs-devel@gnu.org > > The original question was "should we keep the code that isn't needed > by emacs on the premise something else might need it someday." I was > implying that, no, the odds that something else would want it someday > seem low. Yes, but the reason to keep the code not needed by Emacs is not because someone outside of Emacs will want it. It's because we ourselves use it in etags. If we ever import regex from gnulib, then yes, we will have to keep non-Emacs code also for future merging with gnulib. But not now. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: regex.c simplification 2018-06-16 19:27 ` Eli Zaretskii @ 2018-06-18 9:36 ` Robert Pluim 0 siblings, 0 replies; 30+ messages in thread From: Robert Pluim @ 2018-06-18 9:36 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dancol, emacs-devel, Perry E. Metzger Eli Zaretskii <eliz@gnu.org> writes: >> Date: Sat, 16 Jun 2018 14:58:56 -0400 >> From: "Perry E. Metzger" <perry@piermont.com> >> Cc: dancol@dancol.org, emacs-devel@gnu.org >> >> The original question was "should we keep the code that isn't needed >> by emacs on the premise something else might need it someday." I was >> implying that, no, the odds that something else would want it someday >> seem low. > > Yes, but the reason to keep the code not needed by Emacs is not > because someone outside of Emacs will want it. It's because we > ourselves use it in etags. > We could switch to external etags, and remove our copy. Iʼm assuming there are differences between the two implementations, but I donʼt know exactly what they are. Robert ^ permalink raw reply [flat|nested] 30+ messages in thread