From: Juri Linkov
To: Eli Zaretskii
Cc: chen bin, emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] add 'string-distance' to calculate Levenshtein distance
Date: Sat, 21 Apr 2018 23:47:30 +0300
Organization: LINKOV.NET
Message-ID: <87muxwl21p.fsf@mail.linkov.net>
In-Reply-To: <83fu3pxbu9.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 21 Apr 2018 10:22:54 +0300")

>> Attached is latest patch. It will use byte compare algorithm if both
>> strings are not multi-byte strings.
>
> Thanks, let's give people a few days to comment.

I have no comments on this implementation, but I wonder if it's possible
to adapt this algorithm to implement fuzzy search with the distance?
This supposes there is a customizable option that defines a preferred
default distance, and then the search functions search text in the buffer
to find strings similar to the search string. I don't know whether this
is how “Similarity search” is implemented in LibreOffice Writer.
But it seems this could find the same text as char-fold search, matching
equivalent characters and ignoring diacritics or glyphless characters
like “word joiner”.
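
To make the idea concrete, here is a rough sketch in Python, not in the
C or Lisp that an actual patch would need. It is not the algorithm from
the patch: it uses the well-known variant of the Levenshtein table in
which a match may start at any position in the text (Sellers' approach).
The name `fuzzy_search` and the `max_distance` parameter are hypothetical;
`max_distance` stands in for the customizable default distance.

```python
def fuzzy_search(pattern, text, max_distance):
    """Return (end, distance) pairs for approximate matches.

    `end` is the exclusive index in `text` where some substring ends whose
    edit distance to `pattern` is `distance` <= `max_distance`.
    Hypothetical helper, for illustration only.
    """
    m = len(pattern)
    # prev[i]: best distance between pattern[:i] and any substring of the
    # text ending at the previous position.  Before any text is read,
    # matching pattern[:i] costs i deletions.
    prev = list(range(m + 1))
    matches = []
    for end, ch in enumerate(text, start=1):
        # The empty pattern prefix matches for free at every position,
        # which is what lets a match start anywhere in the text.
        cur = [0]
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            cur.append(min(prev[i] + 1,          # extra character in text
                           cur[i - 1] + 1,       # missing pattern character
                           prev[i - 1] + cost))  # match or substitution
        if cur[m] <= max_distance:
            matches.append((end, cur[m]))
        prev = cur
    return matches


if __name__ == "__main__":
    # With a distance of 2, the unaccented pattern finds the accented word.
    print(fuzzy_search("resume", "my résumé", 2))
```

With a small `max_distance` this also catches accented variants, which is
the overlap with char-fold search mentioned above, but it would match
unrelated near-misses as well, so the two are not equivalent.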