From: Andreas Politz
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] add 'string-distance' to calculate Levenshtein distance
Date: Sun, 15 Apr 2018 20:17:13 +0200
Message-ID: <87h8oce3me.fsf@hochschule-trier.de>
References: <87vacuecrn.fsf@gmail.com>
To: Paul Eggert
Cc: Nathan Moreau, emacs-devel, Chen Bin
In-Reply-To: (Paul Eggert's message of "Sat, 14 Apr 2018 10:36:49 -0700")

Paul Eggert writes:

> lib/diffseq.h uses the Myers-Ukkonen algorithm that scales better for
> the common case where strings are closely related. If the two strings
> are length N and their Levenshtein distance is D (where D is much less
> than N), then lib/diffseq.h is O(N*D) whereas the proposed algorithm
> is O(N**2).

The Ukkonen algorithm also allows D to be an input parameter, in the
form of a maximum distance it should search for. This makes the
observation even more important, since most callers, I presume, are
only interested in string pairs with distances below some threshold.

Incidentally, the (only) application of the mentioned Org function
(org-babel-edit-distance) uses D=2 (in Emacs 25.3.1).

Andreas
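
To illustrate the point about passing D as a cutoff, here is a minimal
sketch in Python. It is not the code in lib/diffseq.h, not the proposed
patch, and not org-babel-edit-distance; it is a plain banded
dynamic-programming variant of the threshold idea, and the name
bounded_levenshtein and the parameter max_dist are made up for this
example. Only cells within max_dist of the diagonal are evaluated, so
the work is roughly O(N * max_dist), and the search stops as soon as a
whole row exceeds the threshold.

```python
from typing import Optional


def bounded_levenshtein(a: str, b: str, max_dist: int) -> Optional[int]:
    """Return the Levenshtein distance of a and b if it is <= max_dist,
    otherwise None. Hypothetical helper, for illustration only."""
    if max_dist < 0:
        return None
    n, m = len(a), len(b)
    # The length difference is a lower bound on the distance.
    if abs(n - m) > max_dist:
        return None

    big = max_dist + 1  # sentinel meaning "more than max_dist"
    # prev holds row i-1 of the DP table, cur holds row i (values capped at big).
    prev = [min(j, big) for j in range(m + 1)]
    cur = [big] * (m + 1)

    for i in range(1, n + 1):
        # Only columns within max_dist of the diagonal can hold a value <= max_dist.
        lo = max(1, i - max_dist)
        hi = min(m, i + max_dist)
        # Cell just left of the band: column 0 costs i deletions,
        # any other column is outside the band and counts as "too far".
        cur[lo - 1] = min(i, big) if lo == 1 else big
        row_min = cur[lo - 1]
        for j in range(lo, hi + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j] + 1,         # delete a[i-1]
                         cur[j - 1] + 1,      # insert b[j-1]
                         prev[j - 1] + cost,  # match or substitute
                         big)
            row_min = min(row_min, cur[j])
        # Every later row is at least as large, so we can give up early.
        if row_min >= big:
            return None
        prev, cur = cur, prev

    return prev[m] if prev[m] <= max_dist else None
```

A caller that only cares about near matches, as with the D=2 case
mentioned above, would test the result against None instead of
comparing a full distance to a threshold afterwards.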