From: Eli Zaretskii <eliz@gnu.org>
To: bruce.connor.am@gmail.com
Cc: clement.pit@gmail.com, emacs-devel@gnu.org
Subject: Re: Char-folding: how can we implement matching multiple characters as a single "thing"?
Date: Tue, 01 Dec 2015 17:50:12 +0200
Message-ID: <837fkykw23.fsf@gnu.org>
In-Reply-To: <CAAdUY-JH9=+mFMbqHotjPNKd3RmBfJF1MVqStvYu=18gVhjqvA@mail.gmail.com>

> Date: Tue, 1 Dec 2015 14:18:30 +0000
> From: Artur Malabarba <bruce.connor.am@gmail.com>
>
> There's also a 3rd option. I posted some code here a while ago that
> implemented char-folding by temporarily replacing the
> (current-case-table) with a char-fold-table. This was fast, and much
> nicer than the current regexps, but it had the limitation of only
> being a character-to-character relation. So it couldn't do something
> as basic as 'a' matching "ä" (because that's 1 char matching 2).
>
> However, it's possible that we could combine the two solutions, using
> this case-table for as much as possible and then using regexps for
> anything else. This way the regexp pattern that replaces each input
> character would likely be considerably smaller than 45 chars (I'd
> guess between 3 and 15 depending on the character).
> The number of branches would still scale badly with the input string
> size, but the smaller multiplicative factor should give us more leeway
> before scaling up to 10k chars.

My gut feeling is that if we go to the C level, we should implement
this properly. Coding another partial solution will almost certainly
bump into some subtle limitations. In particular, any solution that
requires a literal search to use regexps under the hood will present
restrictions, because it will not play well with other regexp-based
features, like word search and C-M-s itself.
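
For readers following the thread, here is a rough sketch of the hybrid idea
quoted above: a one-to-one fold table handles precomposed characters, and a
small per-character regexp handles a base letter followed by combining marks.
It is written in Python rather than Emacs Lisp or C, and it is not the code
that was posted to the list. FOLD_TABLE, COMBINING, fold_pattern and
fold_search are hypothetical names invented for this illustration, and the
table covers only a handful of characters.

```python
import re

# Hypothetical one-to-one fold table (stands in for the case-table trick):
# each precomposed character maps to exactly one base letter.
FOLD_TABLE = str.maketrans({
    "\u00e4": "a",  # a with diaeresis
    "\u00e1": "a",  # a with acute
    "\u00e9": "e",  # e with acute
    "\u00f6": "o",  # o with diaeresis
})

# Assumption: "anything else" is limited here to combining marks in the
# range U+0300..U+036F that may follow a base letter.
COMBINING = "[\u0300-\u036f]*"


def fold_pattern(query: str) -> re.Pattern:
    """Build one small regexp fragment per (already folded) query character."""
    return re.compile("".join(re.escape(ch) + COMBINING for ch in query))


def fold_search(query: str, text: str):
    """Return the (start, end) span of the first folded match, or None.

    The table is strictly one-to-one, so offsets in the folded text are
    also valid offsets in the original text.
    """
    folded = text.translate(FOLD_TABLE)
    match = fold_pattern(query.translate(FOLD_TABLE)).search(folded)
    return match.span() if match else None


# Precomposed form: handled by the table alone (1 char matches 1 char).
print(fold_search("a", "x\u00e4"))
# Decomposed form: 'a' plus U+0308, handled by the regexp part.
print(fold_search("a", "xa\u0308y"))
```

Note that the sketch still turns a literal search into a regexp search, so it
is subject to the same reservation raised above about interaction with other
regexp-based features.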