From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail From: =?iso-8859-1?Q?Jo=E3o_T=E1vora?= Newsgroups: gmane.emacs.devel Subject: Re: new-flex-completion-style Date: Thu, 14 Feb 2019 16:12:25 +0000 Message-ID: References: <20190202232827.27331.87300@vcs0.savannah.gnu.org> <20190202232828.4AE452159A@vcs0.savannah.gnu.org> <87lg2mynrg.fsf@gmail.com> <871s4czm5n.fsf@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable Injection-Info: blaine.gmane.org; posting-host="blaine.gmane.org:195.159.176.226"; logging-data="66693"; mail-complaints-to="usenet@blaine.gmane.org" User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (windows-nt) Cc: Stefan Monnier , emacs-devel To: Daniel Pittman Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Thu Feb 14 17:12:47 2019 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([209.51.188.17]) by blaine.gmane.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256) (Exim 4.89) (envelope-from ) id 1guJcv-000H4j-Ao for ged-emacs-devel@m.gmane.org; Thu, 14 Feb 2019 17:12:41 +0100 Original-Received: from localhost ([127.0.0.1]:50904 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1guJcu-0001ys-8W for ged-emacs-devel@m.gmane.org; Thu, 14 Feb 2019 11:12:40 -0500 Original-Received: from eggs.gnu.org ([209.51.188.92]:39930) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1guJcl-0001xL-HN for emacs-devel@gnu.org; Thu, 14 Feb 2019 11:12:32 -0500 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1guJck-00078H-JI for emacs-devel@gnu.org; Thu, 14 Feb 2019 11:12:31 -0500 Original-Received: from mail-wr1-x42e.google.com ([2a00:1450:4864:20::42e]:35816) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1guJck-00077e-D6 for emacs-devel@gnu.org; Thu, 14 Feb 2019 11:12:30 -0500 Original-Received: by mail-wr1-x42e.google.com with SMTP id t18so7134022wrx.2 for ; Thu, 14 Feb 2019 08:12:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:references:date:in-reply-to:message-id :user-agent:mime-version:content-transfer-encoding; bh=tcXpQoHOD2SKwUFK/iVDljFdtAkojgG3tz7oSbPFBNU=; b=dAL5G9+KBQlqG9JSkOegbTQ3ElnXRoPAh69Urmi8MvXTSyN/K8Lf5AOWHzGgQjnB2P hJmCCeW4Qw1Cvlfxk+hzw3v6ffPpm8X/sbyJwFn1ROOV4K9ZsCky0iKy3FWRwox/YsjD Y9ScqlbvCoLDkO5R+Nlao/LIoZP1G1uyob6wDM0yWNWqYFW/UL5AZriLvX/SorO/zzo+ 1ITsOEACRDIuDjqvFliGIdtyOxrTRHhzZ4oRgtTfOcfpQgwERSah1a3M4K2uQSIrjMJV S1NddWbVR8FwPs7hydI9/1y1OMsvjk5w/esPnEvaa+nHwMDGBPV1w3N8DAKUbXbLBYaP KaTw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:references:date:in-reply-to :message-id:user-agent:mime-version:content-transfer-encoding; bh=tcXpQoHOD2SKwUFK/iVDljFdtAkojgG3tz7oSbPFBNU=; b=B1w9ZueDGbHJOwPkGuBcTX/zyypN8mfz4/P4lZfPkKwRYcljgEfrNGeLPpIZQUmyhV GeiJtKXpKbqYy7NvzY0z0kqZJb70aixh/wqdc0kbwYnh/4RnQvzYWPBQ2e4ywO2Lcc+k xHN4FFeIgjSfWdnGwoI+jmpAIySI7oz0xBBb5OyfyPw7QvbYbmk9OS87S8/gN9Qtg1zD kz3RVxZanP2qx3AYn1GfJkQAITOR+E1CxxBJuK682R66wZcmdqRJVr/sx4PFWtFMNf2C aKR+2+k6sk7+tRSMic14L1CVP9KhJFlr1lGClKHlHTQkyOVCXmEHcmi8WcyjBO0ftXH0 t0Xg== X-Gm-Message-State: AHQUAubSbVGmEcZaeDUGXdNZUKoaxTFDxHa1ISbfY3S/o0RPYHS7e8Bj THVa0YPANt2QEUpsChJ29+os9P+C X-Google-Smtp-Source: AHgI3IavVIC9JK5X4AhreCPswYLCZ07uqCrNYNz5l36Xhb4rOEqsvw16vI3qr5UkCsTmjY+ayYi7nw== X-Received: by 2002:adf:e290:: with SMTP id v16mr3650255wri.100.1550160748534; Thu, 14 Feb 2019 08:12:28 -0800 (PST) Original-Received: from GONDOMAR.yourcompany.com (mail1.siscog.pt. [89.115.233.242]) by smtp.gmail.com with ESMTPSA id z1sm2427397wrw.28.2019.02.14.08.12.26 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Thu, 14 Feb 2019 08:12:27 -0800 (PST) In-Reply-To: (Daniel Pittman's message of "Thu, 14 Feb 2019 10:35:49 -0500") X-Antivirus: AVG (VPS 190214-2, 14-02-2019), Outbound message X-Antivirus-Status: Clean X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2a00:1450:4864:20::42e X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.org gmane.emacs.devel:233330 Archived-At: Daniel Pittman writes: > > Feel free to add some kind of normalization to the function > > if you can come up with it. Or come up with another function > > entirely. As I said, a function that measures the distance > > in "boring editing operations" between the pattern and the target > > would be a nicer measure. > > The Levenshtein distance and close variants generally do a reasonable > job of approximating human expectations, yes. Close variants (eg: > only delete/insert, no mutate) also work well. `string-distance' in Emacs is the Levenshtein distance and it does a terrible job. it says "foo" is as close to "foobar" as it is to "fboror", which is not acceptable. And neither is a variant like Damerau=96Levenshtein. I could possibly use a variant that computes things in "elementary boring editing operations", which includes moves. There is a fundamental misunderstanding on your side I think. The distance formulae exist to solve another class of problems, which include comparing strings of characters where one is _not_ a subsequence of the other (a subsequence being a not-necessarily-contiguous in-order subset of elements). Often these problems want to compute not only the distance _and_ but also the set of insertion, deletions, transposition that bring a string to another. This is why they are slow. But they do enable fancy things like auto-correct. Those occur in the super well-studied high-tech fields that you suggest I'm ignorant of. And while I'm certainly no expert on the matter, I do think none of that matters here. Emacs uses things like "string-match" to tell if a string matches another. If it doesn't it doesn't, and that's OK for "flex". If, for some reason, the regexp starts being slow for flex-matching, since it writes a regexp like ".*f.*o.*o.*", then it's very easy to optimize these flex patterns in C directly. A greedy implementation (which is fine) will be just O(n). So I don't know what speed problems you are referring to. Neither is it clear to me what fields your "fzf" suggestion would be "blazingly fast" (as it self-describes itself) and/or how plugging it into Emacs would solve anything. In short for "flex", at least for the "flex" I'm interested in, Emacs can separate the matching and scoring problem, so your worries do not make sense. This "flex" style even already exists in Emacs, but only in ido.el. That's what I'm trying to being to other areas of Emacs. On the other hand, if you're writing an "autocorrection" completion style then yes, these things could start making sense. But I'm not writing that. > PPS: the best modern tools seem to be derived from the problem of > genome sequence matching, but heuristics are definitely helpful too. > Again, fzf is the best version of that I have found. Does it autocorrect? If yes, it might be a good choice for that other completion style you may be describing. It's hardly a good choice for "flex". Jo=E3o