* bug#24100: [PATCH 0/4] Some regex dead-code elimination
@ 2016-07-28 17:00 Michal Nazarewicz
2016-07-28 18:07 ` bug#24100: [PATCH 1/4] Remove dead opcodes in regex bytecode Michal Nazarewicz
2016-08-02 16:06 ` bug#24100: [PATCH 0/4] Some regex dead-code elimination Michal Nazarewicz
0 siblings, 2 replies; 6+ messages in thread
From: Michal Nazarewicz @ 2016-07-28 17:00 UTC
To: 24100
The first patch is just a minor cleanup; the rest bank on the
observation that Emacs only ever uses its own regex syntax.
Overall, this shrinks Emacs by over 60 kB and is expected to speed
it up as well.
This patchset is put on top of my previous one from bug#24009. The
whole branch can be seen at https://github.com/mina86/emacs/.
Unless there are objections, I’ll submit it in a week or so.
Michal Nazarewicz (4):
Remove dead opcodes in regex bytecode
Get rid of re_set_syntax
Get rid of re_set_whitespace_regexp
Hardcode regex syntax to remove dead code handling different syntax
src/regex.c | 101 ++++++++++++++++++++++++++++++++++-------------------------
src/regex.h | 20 ++++++++++--
src/search.c | 17 +++-------
3 files changed, 80 insertions(+), 58 deletions(-)
--
2.8.0.rc3.226.g39d4020
* bug#24100: [PATCH 1/4] Remove dead opcodes in regex bytecode
2016-07-28 17:00 bug#24100: [PATCH 0/4] Some regex dead-code elimination Michal Nazarewicz
@ 2016-07-28 18:07 ` Michal Nazarewicz
2016-07-28 18:07 ` bug#24100: [PATCH 2/4] Get rid of re_set_syntax Michal Nazarewicz
` (2 more replies)
2016-08-02 16:06 ` bug#24100: [PATCH 0/4] Some regex dead-code elimination Michal Nazarewicz
1 sibling, 3 replies; 6+ messages in thread
From: Michal Nazarewicz @ 2016-07-28 18:07 UTC
To: 24100
There is no way to specify the before_dot and after_dot opcodes in a
regex, so the code handling them is dead. Remove it.
* src/regex.c (print_partial_compiled_pattern, regex_compile,
analyze_first, re_match_2_internal): Remove handling of and references
to before_dot and after_dot opcodes.
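To illustrate why such opcodes are dead, here is a minimal sketch of a toy bytecode compiler and matcher. It is not code from regex.c: the names toy_op, toy_compile and toy_match are hypothetical, and the bytecode layout is invented for the example. The only escape the toy compiler understands is `\=`, so TOY_AT_DOT is the only position opcode it can ever emit, and the matcher needs no case for anything else.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature opcode set, loosely modelled on the idea of
   re_opcode_t.  There is deliberately no "before" or "after" opcode.  */
enum toy_op { TOY_EXACT, TOY_AT_DOT, TOY_END };

/* Compile PATTERN into OUT (capacity CAP bytes).  "\=" becomes
   TOY_AT_DOT; every other character becomes TOY_EXACT followed by the
   character.  Returns the number of bytes written, or -1 on overflow.  */
int
toy_compile (const char *pattern, unsigned char *out, size_t cap)
{
  size_t n = 0;

  for (const char *p = pattern; *p; p++)
    {
      if (p[0] == '\\' && p[1] == '=')
        {
          if (n + 1 > cap)
            return -1;
          out[n++] = TOY_AT_DOT;
          p++;                  /* Skip the '='.  */
        }
      else
        {
          if (n + 2 > cap)
            return -1;
          out[n++] = TOY_EXACT;
          out[n++] = (unsigned char) *p;
        }
    }
  if (n + 1 > cap)
    return -1;
  out[n++] = TOY_END;
  return (int) n;
}

/* Match CODE against the start of TEXT.  POINT is the offset that
   TOY_AT_DOT must coincide with.  */
bool
toy_match (const unsigned char *code, const char *text, size_t point)
{
  size_t pos = 0;

  for (;;)
    switch (*code++)
      {
      case TOY_EXACT:
        {
          unsigned char want = *code++;
          if ((unsigned char) text[pos] != want)
            return false;
          pos++;
          break;
        }
      case TOY_AT_DOT:
        if (pos != point)
          return false;
        break;
      case TOY_END:
        return true;
      default:
        return false;
      }
}
```

Any additional case in toy_match for an opcode that toy_compile cannot emit would be unreachable, which is the situation this patch cleans up for before_dot and after_dot.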
---
src/regex.c | 28 +---------------------------
1 file changed, 1 insertion(+), 27 deletions(-)
diff --git a/src/regex.c b/src/regex.c
index 3a25835..261d299 100644
--- a/src/regex.c
+++ b/src/regex.c
@@ -669,9 +669,7 @@ typedef enum
notsyntaxspec
#ifdef emacs
- ,before_dot, /* Succeeds if before point. */
- at_dot, /* Succeeds if at point. */
- after_dot, /* Succeeds if after point. */
+ , at_dot, /* Succeeds if at point. */
/* Matches any character whose category-set contains the specified
category. The operator is followed by a byte which contains a
@@ -1053,18 +1051,10 @@ print_partial_compiled_pattern (re_char *start, re_char *end)
break;
# ifdef emacs
- case before_dot:
- fprintf (stderr, "/before_dot");
- break;
-
case at_dot:
fprintf (stderr, "/at_dot");
break;
- case after_dot:
- fprintf (stderr, "/after_dot");
- break;
-
case categoryspec:
fprintf (stderr, "/categoryspec");
mcnt = *p++;
@@ -3440,8 +3430,6 @@ regex_compile (const_re_char *pattern, size_t size, reg_syntax_t syntax,
goto normal_char;
#ifdef emacs
- /* There is no way to specify the before_dot and after_dot
- operators. rms says this is ok. --karl */
case '=':
laststart = b;
BUF_PUSH (at_dot);
@@ -4018,9 +4006,7 @@ analyze_first (const_re_char *p, const_re_char *pend, char *fastmap,
/* All cases after this match the empty string. These end with
`continue'. */
- case before_dot:
case at_dot:
- case after_dot:
#endif /* !emacs */
case no_op:
case begline:
@@ -6148,24 +6134,12 @@ re_match_2_internal (struct re_pattern_buffer *bufp, const_re_char *string1,
break;
#ifdef emacs
- case before_dot:
- DEBUG_PRINT ("EXECUTING before_dot.\n");
- if (PTR_BYTE_POS (d) >= PT_BYTE)
- goto fail;
- break;
-
case at_dot:
DEBUG_PRINT ("EXECUTING at_dot.\n");
if (PTR_BYTE_POS (d) != PT_BYTE)
goto fail;
break;
- case after_dot:
- DEBUG_PRINT ("EXECUTING after_dot.\n");
- if (PTR_BYTE_POS (d) <= PT_BYTE)
- goto fail;
- break;
-
case categoryspec:
case notcategoryspec:
{
--
2.8.0.rc3.226.g39d4020
* bug#24100: [PATCH 2/4] Get rid of re_set_syntax
2016-07-28 18:07 ` bug#24100: [PATCH 1/4] Remove dead opcodes in regex bytecode Michal Nazarewicz
@ 2016-07-28 18:07 ` Michal Nazarewicz
2016-07-28 18:07 ` bug#24100: [PATCH 3/4] Get rid of re_set_whitespace_regexp Michal Nazarewicz
2016-07-28 18:07 ` bug#24100: [PATCH 4/4] Hardcode regex syntax to remove dead code handling different syntax Michal Nazarewicz
2 siblings, 0 replies; 6+ messages in thread
From: Michal Nazarewicz @ 2016-07-28 18:07 UTC
To: 24100
Instead of using a global variable for storing regex syntax, pass it
to re_compile_pattern. This is only enabled when compiling Emacs (i.e.
‘#ifdef emacs’).
* src/regex.h (re_set_syntax): Declare only #ifndef emacs.
(re_compile_pattern): Now takes syntax argument #ifdef emacs.
* src/regex.c (re_syntax_options): Define only #ifndef emacs.
(re_compile_pattern): Use the new syntax argument #ifdef emacs.
* src/search.c (compile_pattern_1): Don’t use re_set_syntax and
instead pass syntax to re_compile_pattern directly.
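The shape of this change can be sketched in isolation. The following is a simplified model, not the Emacs code: every toy_ name and the TOY_ flag values are hypothetical stand-ins. It contrasts a caller that has to save, set and restore a global around the compile call with one that simply passes the value along.

```c
#include <stdbool.h>

typedef unsigned long toy_syntax_t;

/* Hypothetical flag values, chosen only for this example.  */
#define TOY_SYNTAX_EMACS          0x10ul
#define TOY_NO_POSIX_BACKTRACKING 0x01ul

struct toy_buffer { toy_syntax_t syntax; };

/* Old style: the syntax lives in a global that callers must manage.  */
toy_syntax_t toy_syntax_options;

toy_syntax_t
toy_set_syntax (toy_syntax_t syntax)
{
  toy_syntax_t old = toy_syntax_options;
  toy_syntax_options = syntax;
  return old;
}

void
toy_compile_global (struct toy_buffer *bufp)
{
  bufp->syntax = toy_syntax_options;
}

/* New style: the syntax is an explicit argument.  */
void
toy_compile_arg (toy_syntax_t syntax, struct toy_buffer *bufp)
{
  bufp->syntax = syntax;
}

/* Caller written against the old interface: save, compile, restore.  */
void
toy_caller_old (bool posix, struct toy_buffer *bufp)
{
  toy_syntax_t old
    = toy_set_syntax (TOY_SYNTAX_EMACS
                      | (posix ? 0 : TOY_NO_POSIX_BACKTRACKING));
  toy_compile_global (bufp);
  toy_set_syntax (old);
}

/* Caller written against the new interface: no global state touched.  */
void
toy_caller_new (bool posix, struct toy_buffer *bufp)
{
  toy_syntax_t syntax
    = TOY_SYNTAX_EMACS | (posix ? 0 : TOY_NO_POSIX_BACKTRACKING);
  toy_compile_arg (syntax, bufp);
}
```

Both callers leave the same value in the buffer; the second one has no restore step to forget.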
---
src/regex.c | 12 +++++++++++-
src/regex.h | 14 ++++++++++++++
src/search.c | 10 ++++------
3 files changed, 29 insertions(+), 7 deletions(-)
diff --git a/src/regex.c b/src/regex.c
index 261d299..4edc064 100644
--- a/src/regex.c
+++ b/src/regex.c
@@ -1149,6 +1149,8 @@ print_double_string (re_char *where, re_char *string1, ssize_t size1,
#endif /* not DEBUG */
\f
+#ifndef emacs
+
/* Set by `re_set_syntax' to the current regexp syntax to recognize. Can
also be assigned to arbitrarily: each pattern buffer stores its own
syntax, so it can be changed between regex compilations. */
@@ -1174,6 +1176,8 @@ re_set_syntax (reg_syntax_t syntax)
}
WEAK_ALIAS (__re_set_syntax, re_set_syntax)
+#endif
+
/* Regexp to use to replace spaces, or NULL meaning don't. */
static const_re_char *whitespace_regexp;
@@ -6271,8 +6275,14 @@ bcmp_translate (const_re_char *s1, const_re_char *s2, register ssize_t len,
const char *
re_compile_pattern (const char *pattern, size_t length,
+#ifdef emacs
+ reg_syntax_t syntax,
+#endif
struct re_pattern_buffer *bufp)
{
+#ifndef emacs
+ const reg_syntax_t syntax = re_syntax_options;
+#endif
reg_errcode_t ret;
/* GNU code is written to assume at least RE_NREGS registers will be set
@@ -6284,7 +6294,7 @@ re_compile_pattern (const char *pattern, size_t length,
setting no_sub. */
bufp->no_sub = 0;
- ret = regex_compile ((re_char*) pattern, length, re_syntax_options, bufp);
+ ret = regex_compile ((re_char*) pattern, length, syntax, bufp);
if (!ret)
return NULL;
diff --git a/src/regex.h b/src/regex.h
index 01b659a..4497333 100644
--- a/src/regex.h
+++ b/src/regex.h
@@ -20,6 +20,13 @@
#ifndef _REGEX_H
#define _REGEX_H 1
+#if defined emacs && (defined _REGEX_RE_COMP || defined _LIBC)
+/* We’re not defining re_set_syntax and using a different prototype of
+ re_compile_pattern when building Emacs so fail compilation early with
+ a (somewhat helpful) error message when conflict is detected. */
+# error "_REGEX_RE_COMP nor _LIBC can be defined if emacs is defined."
+#endif
+
/* Allow the use in C++ code. */
#ifdef __cplusplus
extern "C" {
@@ -453,14 +460,21 @@ typedef struct
\f
/* Declarations for routines. */
+#ifndef emacs
+
/* Sets the current default syntax to SYNTAX, and return the old syntax.
You can also simply assign to the `re_syntax_options' variable. */
extern reg_syntax_t re_set_syntax (reg_syntax_t __syntax);
+#endif
+
/* Compile the regular expression PATTERN, with length LENGTH
and syntax given by the global `re_syntax_options', into the buffer
BUFFER. Return NULL if successful, and an error string if not. */
extern const char *re_compile_pattern (const char *__pattern, size_t __length,
+#ifdef emacs
+ reg_syntax_t syntax,
+#endif
struct re_pattern_buffer *__buffer);
diff --git a/src/search.c b/src/search.c
index 7cb18a2..f041952 100644
--- a/src/search.c
+++ b/src/search.c
@@ -113,8 +113,8 @@ static void
compile_pattern_1 (struct regexp_cache *cp, Lisp_Object pattern,
Lisp_Object translate, bool posix)
{
+ reg_syntax_t syntax;
char *val;
- reg_syntax_t old;
cp->regexp = Qnil;
cp->buf.translate = (! NILP (translate) ? translate : make_number (0));
@@ -131,16 +131,15 @@ compile_pattern_1 (struct regexp_cache *cp, Lisp_Object pattern,
Using BLOCK_INPUT here means the debugger won't run if an error occurs.
So let's turn it off. */
/* BLOCK_INPUT; */
- old = re_set_syntax (RE_SYNTAX_EMACS
- | (posix ? 0 : RE_NO_POSIX_BACKTRACKING));
if (STRINGP (Vsearch_spaces_regexp))
re_set_whitespace_regexp (SSDATA (Vsearch_spaces_regexp));
else
re_set_whitespace_regexp (NULL);
- val = (char *) re_compile_pattern (SSDATA (pattern),
- SBYTES (pattern), &cp->buf);
+ syntax = RE_SYNTAX_EMACS | (posix ? 0 : RE_NO_POSIX_BACKTRACKING);
+ val = (char *) re_compile_pattern (SSDATA (pattern), SBYTES (pattern),
+ syntax, &cp->buf);
/* If the compiled pattern hard codes some of the contents of the
syntax-table, it can only be reused with *this* syntax table. */
@@ -148,7 +147,6 @@ compile_pattern_1 (struct regexp_cache *cp, Lisp_Object pattern,
re_set_whitespace_regexp (NULL);
- re_set_syntax (old);
/* unblock_input (); */
if (val)
xsignal1 (Qinvalid_regexp, build_string (val));
--
2.8.0.rc3.226.g39d4020
* bug#24100: [PATCH 3/4] Get rid of re_set_whitespace_regexp
2016-07-28 18:07 ` bug#24100: [PATCH 1/4] Remove dead opcodes in regex bytecode Michal Nazarewicz
2016-07-28 18:07 ` bug#24100: [PATCH 2/4] Get rid of re_set_syntax Michal Nazarewicz
@ 2016-07-28 18:07 ` Michal Nazarewicz
2016-07-28 18:07 ` bug#24100: [PATCH 4/4] Hardcode regex syntax to remove dead code handling different syntax Michal Nazarewicz
2 siblings, 0 replies; 6+ messages in thread
From: Michal Nazarewicz @ 2016-07-28 18:07 UTC
To: 24100
* src/regex.h (re_set_whitespace_regexp): Delete.
(re_compile_pattern): Add whitespace_regexp argument #ifdef emacs.
* src/regex.c (re_set_whitespace_regexp, whitespace_regexp): Delete.
(regex_compile): Add whitespace_regexp argument #ifdef emacs and wrap
whitespace_regexp-related code in an #ifdef emacs so it’s compiled out
unless building Emacs.
(re_compile_pattern): Pass whitespace_regexp argument to regex_compile.
* src/search.c (compile_pattern_1): Don’t use re_set_whitespace_regexp,
pass the regex as argument to re_compile_pattern instead.
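As a rough picture of what the whitespace_regexp argument is for, the sketch below expands spaces at the string level. This is not how regex.c does it: the real compiler switches its parse position into the included regexp and pops back to the main one at its end. The helper toy_expand_spaces is hypothetical; it collapses each run of spaces into one copy of the replacement, which is an assumption of this toy and should be checked against regex.c, and it ignores any special cases the real parser applies.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper.  Copy PATTERN into OUT (capacity CAP bytes,
   including the terminating NUL).  When WHITESPACE_REGEXP is non-NULL,
   each run of spaces is replaced by one copy of it; when it is NULL,
   spaces are copied unchanged.  Returns 0 on success, -1 if OUT is too
   small.  */
int
toy_expand_spaces (const char *pattern, const char *whitespace_regexp,
                   char *out, size_t cap)
{
  size_t n = 0;
  size_t wlen = whitespace_regexp ? strlen (whitespace_regexp) : 0;

  if (cap == 0)
    return -1;

  for (const char *p = pattern; *p; )
    {
      if (*p == ' ' && whitespace_regexp)
        {
          /* Skip the whole run of spaces.  */
          while (*p == ' ')
            p++;
          /* Keep one byte free for the terminating NUL.  */
          if (wlen >= cap - n)
            return -1;
          memcpy (out + n, whitespace_regexp, wlen);
          n += wlen;
        }
      else
        {
          if (1 >= cap - n)
            return -1;
          out[n++] = *p++;
        }
    }
  out[n] = '\0';
  return 0;
}
```

Because the replacement is an ordinary parameter, a caller can choose it per call (or pass NULL) without touching shared state, which mirrors the motivation for dropping re_set_whitespace_regexp.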
---
src/regex.c | 39 ++++++++++++++++++++++++---------------
src/regex.h | 3 +--
src/search.c | 13 +++++--------
3 files changed, 30 insertions(+), 25 deletions(-)
diff --git a/src/regex.c b/src/regex.c
index 4edc064..c32a62f 100644
--- a/src/regex.c
+++ b/src/regex.c
@@ -1177,16 +1177,6 @@ re_set_syntax (reg_syntax_t syntax)
WEAK_ALIAS (__re_set_syntax, re_set_syntax)
#endif
-
-/* Regexp to use to replace spaces, or NULL meaning don't. */
-static const_re_char *whitespace_regexp;
-
-void
-re_set_whitespace_regexp (const char *regexp)
-{
- whitespace_regexp = (const_re_char *) regexp;
-}
-WEAK_ALIAS (__re_set_syntax, re_set_syntax)
\f
/* This table gives an error message for each of the error codes listed
in regex.h. Obviously the order here has to be same as there.
@@ -1569,6 +1559,9 @@ do { \
static reg_errcode_t regex_compile (re_char *pattern, size_t size,
reg_syntax_t syntax,
+#ifdef emacs
+ const char *whitespace_regexp,
+#endif
struct re_pattern_buffer *bufp);
static void store_op1 (re_opcode_t op, unsigned char *loc, int arg);
static void store_op2 (re_opcode_t op, unsigned char *loc, int arg1, int arg2);
@@ -2398,6 +2391,9 @@ static boolean group_in_compile_stack (compile_stack_type compile_stack,
/* `regex_compile' compiles PATTERN (of length SIZE) according to SYNTAX.
Returns one of error codes defined in `regex.h', or zero for success.
+ If WHITESPACE_REGEXP is given (only #ifdef emacs), it is used instead of
+ a space character in PATTERN.
+
Assumes the `allocated' (and perhaps `buffer') and `translate'
fields are set in BUFP on entry.
@@ -2431,6 +2427,9 @@ do { \
static reg_errcode_t
regex_compile (const_re_char *pattern, size_t size, reg_syntax_t syntax,
+#ifdef emacs
+ const char *whitespace_regexp,
+#endif
struct re_pattern_buffer *bufp)
{
/* We fetch characters from PATTERN here. */
@@ -2483,6 +2482,7 @@ regex_compile (const_re_char *pattern, size_t size, reg_syntax_t syntax,
/* If the object matched can contain multibyte characters. */
const boolean multibyte = RE_MULTIBYTE_P (bufp);
+#ifdef emacs
/* Nonzero if we have pushed down into a subpattern. */
int in_subpattern = 0;
@@ -2491,6 +2491,7 @@ regex_compile (const_re_char *pattern, size_t size, reg_syntax_t syntax,
re_char *main_p;
re_char *main_pattern;
re_char *main_pend;
+#endif
#ifdef DEBUG
debug++;
@@ -2559,6 +2560,7 @@ regex_compile (const_re_char *pattern, size_t size, reg_syntax_t syntax,
{
if (p == pend)
{
+#ifdef emacs
/* If this is the end of an included regexp,
pop back to the main regexp and try again. */
if (in_subpattern)
@@ -2569,6 +2571,7 @@ regex_compile (const_re_char *pattern, size_t size, reg_syntax_t syntax,
pend = main_pend;
continue;
}
+#endif
/* If this is the end of the main regexp, we are done. */
break;
}
@@ -2577,6 +2580,7 @@ regex_compile (const_re_char *pattern, size_t size, reg_syntax_t syntax,
switch (c)
{
+#ifdef emacs
case ' ':
{
re_char *p1 = p;
@@ -2609,6 +2613,7 @@ regex_compile (const_re_char *pattern, size_t size, reg_syntax_t syntax,
pend = p + strlen ((const char *) p);
break;
}
+#endif
case '^':
{
@@ -6276,13 +6281,10 @@ bcmp_translate (const_re_char *s1, const_re_char *s2, register ssize_t len,
const char *
re_compile_pattern (const char *pattern, size_t length,
#ifdef emacs
- reg_syntax_t syntax,
+ reg_syntax_t syntax, const char *whitespace_regexp,
#endif
struct re_pattern_buffer *bufp)
{
-#ifndef emacs
- const reg_syntax_t syntax = re_syntax_options;
-#endif
reg_errcode_t ret;
/* GNU code is written to assume at least RE_NREGS registers will be set
@@ -6294,7 +6296,14 @@ re_compile_pattern (const char *pattern, size_t length,
setting no_sub. */
bufp->no_sub = 0;
- ret = regex_compile ((re_char*) pattern, length, syntax, bufp);
+ ret = regex_compile ((re_char*) pattern, length,
+#ifdef emacs
+ syntax,
+ whitespace_regexp,
+#else
+ re_syntax_options,
+#endif
+ bufp);
if (!ret)
return NULL;
diff --git a/src/regex.h b/src/regex.h
index 4497333..af9480d 100644
--- a/src/regex.h
+++ b/src/regex.h
@@ -474,6 +474,7 @@ extern reg_syntax_t re_set_syntax (reg_syntax_t __syntax);
extern const char *re_compile_pattern (const char *__pattern, size_t __length,
#ifdef emacs
reg_syntax_t syntax,
+ const char *whitespace_regexp,
#endif
struct re_pattern_buffer *__buffer);
@@ -627,8 +628,6 @@ extern re_wctype_t re_wctype_parse (const unsigned char **strp, unsigned limit);
typedef int re_wchar_t;
-extern void re_set_whitespace_regexp (const char *regexp);
-
#endif /* not WIDE_CHAR_SUPPORT */
#endif /* regex.h */
diff --git a/src/search.c b/src/search.c
index f041952..c7556a9 100644
--- a/src/search.c
+++ b/src/search.c
@@ -113,6 +113,7 @@ static void
compile_pattern_1 (struct regexp_cache *cp, Lisp_Object pattern,
Lisp_Object translate, bool posix)
{
+ const char *whitespace_regexp;
reg_syntax_t syntax;
char *val;
@@ -132,21 +133,17 @@ compile_pattern_1 (struct regexp_cache *cp, Lisp_Object pattern,
So let's turn it off. */
/* BLOCK_INPUT; */
- if (STRINGP (Vsearch_spaces_regexp))
- re_set_whitespace_regexp (SSDATA (Vsearch_spaces_regexp));
- else
- re_set_whitespace_regexp (NULL);
-
syntax = RE_SYNTAX_EMACS | (posix ? 0 : RE_NO_POSIX_BACKTRACKING);
+ whitespace_regexp = STRINGP (Vsearch_spaces_regexp) ?
+ SSDATA (Vsearch_spaces_regexp) : NULL;
+
val = (char *) re_compile_pattern (SSDATA (pattern), SBYTES (pattern),
- syntax, &cp->buf);
+ syntax, whitespace_regexp, &cp->buf);
/* If the compiled pattern hard codes some of the contents of the
syntax-table, it can only be reused with *this* syntax table. */
cp->syntax_table = cp->buf.used_syntax ? BVAR (current_buffer, syntax_table) : Qt;
- re_set_whitespace_regexp (NULL);
-
/* unblock_input (); */
if (val)
xsignal1 (Qinvalid_regexp, build_string (val));
--
2.8.0.rc3.226.g39d4020
* bug#24100: [PATCH 4/4] Hardcode regex syntax to remove dead code handling different syntax
2016-07-28 18:07 ` bug#24100: [PATCH 1/4] Remove dead opcodes in regex bytecode Michal Nazarewicz
2016-07-28 18:07 ` bug#24100: [PATCH 2/4] Get rid of re_set_syntax Michal Nazarewicz
2016-07-28 18:07 ` bug#24100: [PATCH 3/4] Get rid of re_set_whitespace_regexp Michal Nazarewicz
@ 2016-07-28 18:07 ` Michal Nazarewicz
2 siblings, 0 replies; 6+ messages in thread
From: Michal Nazarewicz @ 2016-07-28 18:07 UTC
To: 24100
Emacs only ever uses its own regex syntax, so support for other syntaxes
is never used. Hardcode the syntax so that the compiler can detect such
dead code and remove it from the compiled code.
The only exception is RE_NO_POSIX_BACKTRACKING, which can be specified
separately. Handle it with a function argument (replacing the now
unnecessary syntax argument).
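The effect of hardcoding can be sketched with a toy version of the "does `.` match this character" test. The TOY_ flags and the value of TOY_SYNTAX_FIXED are hypothetical and are not the real RE_* constants. In the generic function the syntax is only known at run time, so both tests must be kept; in the fixed variant the syntax is a compile-time constant, so an optimising compiler can typically fold the conditions and discard the branch that can never be taken.

```c
#include <stdbool.h>

/* Hypothetical syntax bits, invented for this example.  */
#define TOY_DOT_NEWLINE  (1u << 0)   /* '.' may match a newline.  */
#define TOY_DOT_NOT_NULL (1u << 1)   /* '.' may not match NUL.  */

/* Hypothetical fixed syntax with neither bit set.  */
#define TOY_SYNTAX_FIXED 0u

/* Generic variant: SYNTAX is a run-time value.  */
bool
toy_dot_matches (unsigned syntax, int ch)
{
  if (!(syntax & TOY_DOT_NEWLINE) && ch == '\n')
    return false;
  if ((syntax & TOY_DOT_NOT_NULL) && ch == '\0')
    return false;
  return true;
}

/* Fixed variant: the same source-level tests, but SYNTAX is a
   constant, so the second condition is always false here.  */
bool
toy_dot_matches_fixed (int ch)
{
  const unsigned syntax = TOY_SYNTAX_FIXED;

  if (!(syntax & TOY_DOT_NEWLINE) && ch == '\n')
    return false;
  if ((syntax & TOY_DOT_NOT_NULL) && ch == '\0')
    return false;
  return true;
}
```

The patch gets the same effect without duplicating code by defining `syntax` as a macro for RE_SYNTAX_EMACS inside regex_compile when building Emacs.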
With this patchset, the size of the Emacs binary on an x86_64 machine is
reduced by around 60 kB (30314828 - 30254720 = 60108 bytes):
new-sizes:-rwx------ 3 mpn eng 30254720 Jul 27 23:31 src/emacs
old-sizes:-rwx------ 3 mpn eng 30314828 Jul 27 23:29 src/emacs
* src/regex.h (re_pattern_buffer): Don’t define syntax field #ifdef emacs.
(re_compile_pattern): Replace syntax with posix_backtracking argument.
* src/regex.c (print_compiled_pattern): Don’t print syntax #ifdef emacs.
(regex_compile): #ifdef emacs, replace syntax argument with
posix_backtracking which is now used instead of testing for
RE_NO_POSIX_BACKTRACKING syntax.
(re_match_2_internal): Don’t access bufp->syntax #ifdef emacs.
(re_compile_pattern): Replace syntax with posix_backtracking argument.
* src/search.c (compile_pattern_1): Pass boolean posix_backtracking
instead of syntax to re_compile_pattern.
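The replacement of the syntax argument by a boolean can be checked with a small model. The TOY_ constants and both function names are hypothetical. The first function reproduces the old path, where the caller folds `posix` into a syntax word and the compiler later tests the RE_NO_POSIX_BACKTRACKING-style bit; the second is the new path, where the boolean is tested directly. Each returns whether a `succeed` opcode would be pushed at the end of the pattern.

```c
#include <stdbool.h>

/* Hypothetical flag values, invented for this example.  */
#define TOY_SYNTAX_EMACS          0x10ul
#define TOY_NO_POSIX_BACKTRACKING 0x01ul

/* Old path: encode POSIX into a syntax word, then decode it again.  */
bool
toy_push_succeed_old (bool posix)
{
  unsigned long syntax
    = TOY_SYNTAX_EMACS | (posix ? 0 : TOY_NO_POSIX_BACKTRACKING);
  return (syntax & TOY_NO_POSIX_BACKTRACKING) != 0;
}

/* New path: the boolean is passed through and tested directly.  */
bool
toy_push_succeed_new (bool posix_backtracking)
{
  return !posix_backtracking;
}
```

The two functions agree for both inputs, so in this model passing `posix` straight through as `posix_backtracking` preserves behaviour.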
---
src/regex.c | 40 +++++++++++++++++++++++++++++++---------
src/regex.h | 5 +++--
src/search.c | 4 +---
3 files changed, 35 insertions(+), 14 deletions(-)
diff --git a/src/regex.c b/src/regex.c
index c32a62f..8dafb11 100644
--- a/src/regex.c
+++ b/src/regex.c
@@ -1108,7 +1108,9 @@ print_compiled_pattern (struct re_pattern_buffer *bufp)
printf ("no_sub: %d\t", bufp->no_sub);
printf ("not_bol: %d\t", bufp->not_bol);
printf ("not_eol: %d\t", bufp->not_eol);
+#ifndef emacs
printf ("syntax: %lx\n", bufp->syntax);
+#endif
fflush (stdout);
/* Perhaps we should print the translate table? */
}
@@ -1558,9 +1560,11 @@ do { \
/* Subroutine declarations and macros for regex_compile. */
static reg_errcode_t regex_compile (re_char *pattern, size_t size,
- reg_syntax_t syntax,
#ifdef emacs
+ bool posix_backtracking,
const char *whitespace_regexp,
+#else
+ reg_syntax_t syntax,
#endif
struct re_pattern_buffer *bufp);
static void store_op1 (re_opcode_t op, unsigned char *loc, int arg);
@@ -2426,9 +2430,14 @@ do { \
} while (0)
static reg_errcode_t
-regex_compile (const_re_char *pattern, size_t size, reg_syntax_t syntax,
+regex_compile (const_re_char *pattern, size_t size,
#ifdef emacs
+# define syntax RE_SYNTAX_EMACS
+ bool posix_backtracking,
const char *whitespace_regexp,
+#else
+ reg_syntax_t syntax,
+# define posix_backtracking (!(syntax & RE_NO_POSIX_BACKTRACKING))
#endif
struct re_pattern_buffer *bufp)
{
@@ -2518,7 +2527,9 @@ regex_compile (const_re_char *pattern, size_t size, reg_syntax_t syntax,
range_table_work.allocated = 0;
/* Initialize the pattern buffer. */
+#ifndef emacs
bufp->syntax = syntax;
+#endif
bufp->fastmap_accurate = 0;
bufp->not_bol = bufp->not_eol = 0;
bufp->used_syntax = 0;
@@ -3645,7 +3656,7 @@ regex_compile (const_re_char *pattern, size_t size, reg_syntax_t syntax,
/* If we don't want backtracking, force success
the first time we reach the end of the compiled pattern. */
- if (syntax & RE_NO_POSIX_BACKTRACKING)
+ if (!posix_backtracking)
BUF_PUSH (succeed);
/* We have succeeded; set the length of the buffer. */
@@ -3680,6 +3691,12 @@ regex_compile (const_re_char *pattern, size_t size, reg_syntax_t syntax,
#endif /* not MATCH_MAY_ALLOCATE */
FREE_STACK_RETURN (REG_NOERROR);
+
+#ifdef emacs
+# undef syntax
+#else
+# undef posix_backtracking
+#endif
} /* regex_compile */
\f
/* Subroutines for `regex_compile'. */
@@ -5442,6 +5459,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp, const_re_char *string1,
{
int buf_charlen;
re_wchar_t buf_ch;
+ reg_syntax_t syntax;
DEBUG_PRINT ("EXECUTING anychar.\n");
@@ -5450,10 +5468,14 @@ re_match_2_internal (struct re_pattern_buffer *bufp, const_re_char *string1,
target_multibyte);
buf_ch = TRANSLATE (buf_ch);
- if ((!(bufp->syntax & RE_DOT_NEWLINE)
- && buf_ch == '\n')
- || ((bufp->syntax & RE_DOT_NOT_NULL)
- && buf_ch == '\000'))
+#ifdef emacs
+ syntax = RE_SYNTAX_EMACS;
+#else
+ syntax = bufp->syntax;
+#endif
+
+ if ((!(syntax & RE_DOT_NEWLINE) && buf_ch == '\n')
+ || ((syntax & RE_DOT_NOT_NULL) && buf_ch == '\000'))
goto fail;
DEBUG_PRINT (" Matched \"%d\".\n", *d);
@@ -6281,7 +6303,7 @@ bcmp_translate (const_re_char *s1, const_re_char *s2, register ssize_t len,
const char *
re_compile_pattern (const char *pattern, size_t length,
#ifdef emacs
- reg_syntax_t syntax, const char *whitespace_regexp,
+ bool posix_backtracking, const char *whitespace_regexp,
#endif
struct re_pattern_buffer *bufp)
{
@@ -6298,7 +6320,7 @@ re_compile_pattern (const char *pattern, size_t length,
ret = regex_compile ((re_char*) pattern, length,
#ifdef emacs
- syntax,
+ posix_backtracking,
whitespace_regexp,
#else
re_syntax_options,
diff --git a/src/regex.h b/src/regex.h
index af9480d..b672d3f 100644
--- a/src/regex.h
+++ b/src/regex.h
@@ -354,9 +354,10 @@ struct re_pattern_buffer
/* Number of bytes actually used in `buffer'. */
size_t used;
+#ifndef emacs
/* Syntax setting with which the pattern was compiled. */
reg_syntax_t syntax;
-
+#endif
/* Pointer to a fastmap, if any, otherwise zero. re_search uses
the fastmap, if there is one, to skip over impossible
starting points for matches. */
@@ -473,7 +474,7 @@ extern reg_syntax_t re_set_syntax (reg_syntax_t __syntax);
BUFFER. Return NULL if successful, and an error string if not. */
extern const char *re_compile_pattern (const char *__pattern, size_t __length,
#ifdef emacs
- reg_syntax_t syntax,
+ bool posix_backtracking,
const char *whitespace_regexp,
#endif
struct re_pattern_buffer *__buffer);
diff --git a/src/search.c b/src/search.c
index c7556a9..7f2b4f9 100644
--- a/src/search.c
+++ b/src/search.c
@@ -114,7 +114,6 @@ compile_pattern_1 (struct regexp_cache *cp, Lisp_Object pattern,
Lisp_Object translate, bool posix)
{
const char *whitespace_regexp;
- reg_syntax_t syntax;
char *val;
cp->regexp = Qnil;
@@ -133,12 +132,11 @@ compile_pattern_1 (struct regexp_cache *cp, Lisp_Object pattern,
So let's turn it off. */
/* BLOCK_INPUT; */
- syntax = RE_SYNTAX_EMACS | (posix ? 0 : RE_NO_POSIX_BACKTRACKING);
whitespace_regexp = STRINGP (Vsearch_spaces_regexp) ?
SSDATA (Vsearch_spaces_regexp) : NULL;
val = (char *) re_compile_pattern (SSDATA (pattern), SBYTES (pattern),
- syntax, whitespace_regexp, &cp->buf);
+ posix, whitespace_regexp, &cp->buf);
/* If the compiled pattern hard codes some of the contents of the
syntax-table, it can only be reused with *this* syntax table. */
--
2.8.0.rc3.226.g39d4020
* bug#24100: [PATCH 0/4] Some regex dead-code elimination
2016-07-28 17:00 bug#24100: [PATCH 0/4] Some regex dead-code elimination Michal Nazarewicz
2016-07-28 18:07 ` bug#24100: [PATCH 1/4] Remove dead opcodes in regex bytecode Michal Nazarewicz
@ 2016-08-02 16:06 ` Michal Nazarewicz
1 sibling, 0 replies; 6+ messages in thread
From: Michal Nazarewicz @ 2016-08-02 16:06 UTC
To: 24100-done
Pushed.
--
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»