From: Eli Zaretskii <eliz@gnu.org>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: bruce.connor.am@gmail.com, emacs-devel@gnu.org
Subject: Re: Character group folding in searches
Date: Mon, 09 Feb 2015 17:40:44 +0200
Message-ID: <83k2zr9jpf.fsf@gnu.org>
In-Reply-To: <jwvzj8nyel3.fsf-monnier+emacs@gnu.org>

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: bruce.connor.am@gmail.com,  emacs-devel@gnu.org
> Date: Sun, 08 Feb 2015 22:03:08 -0500
> 
> > Char-tables are efficient, and at least for decomposition they seem to
> > be the perfect vehicle.  DFAs that come out of arbitrary regexps,
> > OTOH, can sometimes be very inefficient.  That's why I tend to think
> > about this in terms of char-tables.
> 
> That's a false dichotomy.

Actually, it's not a dichotomy at all.  I just explained why
char-tables seem to be a good basis on which to build this feature.

> DFA is about *recognizing* multi-char entities.  If the input
> entities you care about are only single-char (as is the case for
> decomposition), then your DFA will degenerate to a single char-table
> (as is the case now).

I think we have a miscommunication here.  I was talking about the
tables that are part of a DFA that drive its state machine.  Those
tables might become large and sparse, certainly if the input symbol
can be any Unicode character, most of which only match themselves.
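
To make the size concern concrete, here is a minimal sketch in Python
(not Emacs code).  The three-state automaton for "=>", the names
`explicit`, `step` and `matches`, and the rule that unlisted characters
return to the start state are all assumptions made for illustration;
nothing here reflects an actual Emacs design.

```python
UNICODE_SIZE = 0x110000  # number of Unicode code points

# Hypothetical automaton that recognizes "=>":
# state 0 = start, 1 = just saw "=", 2 = accepted "=>".
# Only the transitions that differ from the default are stored.
explicit = {
    0: {"=": 1},
    1: {"=": 1, ">": 2},
    2: {"=": 1},
}

def step(state, ch):
    """Follow one transition; unlisted characters go back to state 0."""
    return explicit[state].get(ch, 0)

def matches(text):
    """Return True if "=>" occurs anywhere in text."""
    state = 0
    for ch in text:
        state = step(state, ch)
        if state == 2:
            return True
    return False

# A dense table needs one slot per state per code point, while the
# sparse form stores only the handful of interesting transitions.
dense_entries = len(explicit) * UNICODE_SIZE
explicit_entries = sum(len(row) for row in explicit.values())
```

With a dense layout, almost every slot would hold the same default
value, which is the sparseness described above.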

I guess I'm still struggling to understand your idea of using DFAs.
E.g., you talk about each node of a DFA being a char-table, but AFAIK
a DFA node is just a state of the automaton, so how can that be
expressed as a char-table?  And above you are saying that a "DFA will
degenerate to a single char-table", which again is a stumbling block
for me, since a DFA is more than a table.  What am I missing?

> But how do you use current char-tables to handle multi-char input
> entities (i.e. to recognize things like "=>")?

I don't understand the question, sorry.  The simple answer is that a
char-table entry can be any Lisp object, including a string, but you
already know that.

If you mean how to compare "=>" with "⇒", then the latter will be
"folded" to the former using a char-table, and then the results will
be compared, either as strings or character by character.  Is this
what you were asking?
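
The fold-then-compare idea can be sketched as follows, in Python rather
than Emacs Lisp.  The dict `FOLD` merely stands in for a char-table, and
its two entries, like the helper names `fold` and `folded_equal`, are
assumptions for illustration, not real Emacs data or functions.

```python
# Stand-in for a char-table: maps a character to its folded form,
# which may be a multi-character string.
FOLD = {
    "\u21d2": "=>",  # RIGHTWARDS DOUBLE ARROW folds to "=>"
    "\u00e9": "e",   # LATIN SMALL LETTER E WITH ACUTE folds to "e"
}

def fold(text):
    """Replace each character by its folded form; others stay as-is."""
    return "".join(FOLD.get(ch, ch) for ch in text)

def folded_equal(a, b):
    """Compare two strings after folding both of them."""
    return fold(a) == fold(b)
```

For instance, folded_equal("\u21d2", "=>") holds because both sides fold
to the same two-character string before the comparison takes place.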

> > Who and how will create such a DFA?
> 
> They'd be mechanically constructed (by hand-written code), for example
> driven by the existing Unicode tables.

What would be the input language for specifying such a DFA?  I mean,
how would we specify which sequences of states are acceptable (yielding
a match for the search) and which aren't?



