From: Artur Malabarba
Newsgroups: gmane.emacs.devel
Subject: Re: Single quotes in Info
Date: Tue, 27 Jan 2015 18:24:09 -0200
To: Eli Zaretskii
Cc: emacs-devel, Marcin Borkowski
Reply-To: bruce.connor.am@gmail.com

> If this is implemented in isearch, then IMO doing it for quotes alone
> makes very little sense.

The quotes are just proof of concept. Adding other equivalency classes is easy from here, and I do agree it makes sense to add others.

> It would make a lot of sense if it were
> implemented in info.el, for searching Info manuals

There are ways to do that too if people prefer, but info manuals are not the only ones that contain such characters.
For instance, lots of people use round quotes in org-mode files.

> (in which case it
> should also support the other Unicode characters produced by makeinfo
> that have ASCII equivalents, like ⇒ vs =>. (Note that this is not
> character-for-character equivalence anymore.)

I agree with the idea, but it will be trickier. Translating a single character to any regexp is easy right now. Translating multiple typed characters into a single character is more complicated, but I can do that. I'm worried about the performance of that, though.
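
To make the two cases concrete, here is a rough sketch in Python rather than Emacs Lisp. It is not the isearch code: the table `EQUIV`, its entries, and the helper `to_regex` are all hypothetical names made up for illustration. A typed character with a table entry expands to a group of alternatives, and a multi-character key such as `=>` is handled by trying longer keys first.

```python
import re

# Hypothetical equivalence table (illustrative entries only): each key is
# what the user types, each value lists everything it should match.
EQUIV = {
    "'": ["'", "\u2018", "\u2019"],   # apostrophe, left/right single quote
    '"': ['"', "\u201c", "\u201d"],   # double quote, left/right double quote
    "=>": ["=>", "\u21d2"],           # two typed characters vs. one arrow
}

# Longest keys first, so "=>" is tried before any one-character key.
KEYS = sorted(EQUIV, key=len, reverse=True)


def to_regex(query):
    """Translate a typed search string into a regexp source string."""
    parts = []
    i = 0
    while i < len(query):
        for key in KEYS:
            if query.startswith(key, i):
                alternatives = "|".join(re.escape(a) for a in EQUIV[key])
                parts.append("(?:" + alternatives + ")")
                i += len(key)
                break
        else:
            # No table entry starts here: match the character literally.
            parts.append(re.escape(query[i]))
            i += 1
    return "".join(parts)


pattern = re.compile(to_regex("it's => done"))
print(bool(pattern.search("it\u2019s \u21d2 done")))
```

The single-character case could be a plain per-character lookup. The multi-character case needs the scan over `KEYS` at every position, which is presumably where any extra cost would show up as the table grows.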

> If we do this via our private database, that database is going to be
> huge.

Is it? I would expect something on the order of 50 lines. That would be large, but not huge. Each entry relates a key from a simple keyboard to a set of possible characters that are not represented in simple keyboards. But maybe I'm just being naive.

> I suggest to explore an alternative implementation, which uses
> canonical equivalence.

I'd love that.

> We already have infrastructure for that, see
> the description of the 'decomposition' character property in the ELisp
> manual.

Building this on preexisting infrastructure would be great, but does that go the right way? Does it relate a simple character to all its complex equivalents? Or does it relate each complex character to a simple alternative?
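
One way to look at the direction question is to check the Unicode decomposition data itself. The sketch below uses Python's `unicodedata` module as a stand-in for the Emacs property, on the assumption that both are derived from the same Unicode character database. There, each complex character carries its own decomposition into simpler pieces, so a table from a simple character to all its complex equivalents has to be built by inverting that data once. The helpers `base_char` and `build_reverse_table` and the `limit` parameter are hypothetical, chosen only for this illustration.

```python
import unicodedata
from collections import defaultdict


def base_char(ch):
    """Return the first character of ch's full compatibility decomposition."""
    return unicodedata.normalize("NFKD", ch)[0]


def build_reverse_table(limit=0x3000):
    """Invert the per-character data: base character -> set of characters
    whose decomposition starts with it (code points below `limit` only)."""
    table = defaultdict(set)
    for codepoint in range(limit):
        ch = chr(codepoint)
        # decomposition() returns "" when the character has no mapping.
        if unicodedata.decomposition(ch):
            table[base_char(ch)].add(ch)
    return table


table = build_reverse_table()
print(sorted(table["e"])[:5])
print(repr(unicodedata.decomposition("\u2019")))
```

In Python's copy of the data, the curly quotes and the ⇒ arrow have no decomposition at all, so a lookup built only on decompositions would cover accented letters and similar cases but would probably still need a small hand-written table for quotes and arrows.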
