Eli Zaretskii writes:

>> I thought it would be easier to document the limit if it's fixed across
>> all machines. Otherwise we would have to say something like "For both
>> forms, m and n, if specified, may be no larger than INT_MAX, which is
>> usually 2**31 - 1, but could be 2**63 - 1 depending on the compiler used
>> for building Emacs".
>
> Isn't int 32 bit wide everywhere?

I might have been mixing up int with long when I was thinking about
this; it seems only a few very obscure platforms have 64 bit ints.
According to [1], everywhere but "HAL Computer Systems port of Solaris
to the SPARC64" and "Classic UNICOS" has 32 bit ints.

[1]: https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models

> And anyway, since the bitmap is stored in an int, isn't INT_MAX TRT?

Unfortunately, all this discussion of int size seems to be academic. I
took another look at the code, there is another limit due to regexp
opcode format. We can raise the limit to 2^16-1 though.

Here is the use of RE_DUP_MAX, which makes it seem like int-size is the
main limit:

    /* Get the next unsigned number in the uncompiled pattern.  */
    #define GET_INTERVAL_COUNT(num)                                 \
      ...
          if (RE_DUP_MAX / 10 - (RE_DUP_MAX % 10 < c - '0') < num)  \
            FREE_STACK_RETURN (REG_ESIZEBR);                        \

    static reg_errcode_t
    regex_compile (const_re_char *pattern, size_t size,
    {
      ...
      int lower_bound = 0, upper_bound = -1;
      [...]
      GET_INTERVAL_COUNT (lower_bound);

But then

    INSERT_JUMP2 (succeed_n, laststart, b + 5 + nbytes, lower_bound);

    /* Like `STORE_JUMP2', but for inserting.  Assume `b' is the buffer end.  */
    #define INSERT_JUMP2(op, loc, to, arg) \
      insert_op2 (op, loc, (to) - (loc) - 3, arg, b)

    /* Like `insert_op1', but for two two-byte parameters ARG1 and ARG2.  */
                                      ^^^^^^^^
    static void
    insert_op2 (re_opcode_t op, unsigned char *loc, int arg1, int arg2, unsigned char *end)
    {
      ...
      store_op2 (op, loc, arg1, arg2);
    }

    /* Like `store_op1', but for two two-byte parameters ARG1 and ARG2.  */
                                     ^^^^^^^^
    static void
    store_op2 (re_opcode_t op, unsigned char *loc, int arg1, int arg2)
    {
      *loc = (unsigned char) op;
      STORE_NUMBER (loc + 1, arg1);
      STORE_NUMBER (loc + 3, arg2);
    }

    /* Store NUMBER in two contiguous bytes starting at DESTINATION.  */
                       ^^^^^^^^^^^^^^^^^^^^
    #define STORE_NUMBER(destination, number)   \
      do {                                      \
        (destination)[0] = (number) & 0377;     \
        (destination)[1] = (number) >> 8;       \
      } while (0)

Here is the updated patch: