From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: chen bin Newsgroups: gmane.emacs.devel Subject: Re: [PATCH] add 'string-distance' to calculate Levenshtein distance Date: Tue, 17 Apr 2018 22:31:20 +1000 Message-ID: References: <87vacuecrn.fsf@gmail.com> <83po3246ah.fsf@gnu.org> <87lgdq831h.fsf@gmail.com> <83muy553ae.fsf@gnu.org> <87o9ilhhcd.fsf@gmail.com> <83d0z14sws.fsf@gnu.org> <87o9il0wka.fsf@gmail.com> <83bmek4jdn.fsf@gnu.org> <83k1t72b2o.fsf@gnu.org> <83bmei36dw.fsf@gnu.org> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="00000000000016209d056a0a8470" X-Trace: blaine.gmane.org 1523968183 8676 195.159.176.226 (17 Apr 2018 12:29:43 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Tue, 17 Apr 2018 12:29:43 +0000 (UTC) Cc: emacs-devel@gnu.org To: Eli Zaretskii Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Tue Apr 17 14:29:39 2018 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1f8Pjr-00026H-KF for ged-emacs-devel@m.gmane.org; Tue, 17 Apr 2018 14:29:36 +0200 Original-Received: from localhost ([::1]:35868 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1f8Plw-0007mJ-HE for ged-emacs-devel@m.gmane.org; Tue, 17 Apr 2018 08:31:44 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:44872) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1f8Plk-0007kW-F5 for emacs-devel@gnu.org; Tue, 17 Apr 2018 08:31:38 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1f8Plj-0002qy-9w for emacs-devel@gnu.org; Tue, 17 Apr 2018 08:31:32 -0400 Original-Received: from mail-lf0-x231.google.com ([2a00:1450:4010:c07::231]:38253) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1f8Plb-0002oC-As; Tue, 17 Apr 2018 08:31:23 -0400 Original-Received: by mail-lf0-x231.google.com with SMTP id z130-v6so1511224lff.5; Tue, 17 Apr 2018 05:31:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=aaIXqJIoUmPv9thHRY+DTtdWib+PhKdWGDGpTUQG0aY=; b=Lzn11jCjYF8BMclg7cA/8KZclD+bpKFWD9GgG8iNJf6UESwWZVilWvs8IvGI39HH8G upBTo0u7eyUol8GaXXOk/4QDstmjI9fg9FltIXR2sKMSFh8SUpwOwtgdWj+QXR+Vda/C h6OwOAv1cBMzFivv9PwrRtgGAYgUX0XzZoVWlWZASfrH7KFtm/S1ynL0NNqxD56qadPS +94XLAWsoFivm7XFs7i1SOuuzcVcB6SJGlnvMBpQ0R65EGKRHIgOEWZytRgsOhnAgim+ d5h0PmdCsQsCeIQx6SD2xikhX2OgwBWkt/qz+BoyY/cBLmxCUhd1J7V/vQyZ1yJLtXFf XPkA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=aaIXqJIoUmPv9thHRY+DTtdWib+PhKdWGDGpTUQG0aY=; b=Q1KnTYhEuIkY90bT6Mc0mSZfTlhA+T4unwmqXpLke1hA9N4zaQfxIMuvr0NYVE6/gA m5vMe95NQP5E52ptcLlw8hyWEcVZfuq/GW+hbxX7xk+8G0xhGF7GEc+GzudZwiN50Nr+ edqrUGawV5TNcFQiNYZKwW4M63ZN3DBq+VUP9GZtt6bXQ7F/mElX0EXxI7oP4e7q0bo/ 8jUjyZilx3CxOgAeuTCtNGqpQv8eL+Rm1lUnlx7Js66slXHVyGd7Bbqny9byehlgRVcC vIym1X3R4zCU5SzbOEPRLemKtVmxntTdM03wwtHvGreWsq/JXxkDtL4dbCuPTyi7KGJ7 j3tw== X-Gm-Message-State: ALQs6tBHUhPekxOaaoXWBvmQ2FAjg81ecTe2fA/LWW8S2ytaak4Kx9lv cW7ZR4cMRGwWLdjd7FYo7DkIGEV2318hcwMPWNFg8Q== X-Google-Smtp-Source: AIpwx49vd6irDA4SA8HCS3lqx/tglia4yt/hRzKaqb8BaJRuwkV5vX/jxiFGnkiiDiODidUaoet//rdqwqlwFlO2es4= X-Received: by 2002:a19:2585:: with SMTP id l127-v6mr1465150lfl.17.1523968281804; Tue, 17 Apr 2018 05:31:21 -0700 (PDT) Original-Received: by 10.46.20.93 with HTTP; Tue, 17 Apr 2018 05:31:20 -0700 (PDT) In-Reply-To: <83bmei36dw.fsf@gnu.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2a00:1450:4010:c07::231 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.org gmane.emacs.devel:224700 Archived-At: --00000000000016209d056a0a8470 Content-Type: text/plain; charset="UTF-8" Hi, Eli, As you suggested, I re-write the code using 'FETCH_STRING_CHAR_ADVANCE'. I did some peformance test. My original wide character array solution is two time fast as current solution. But I understand now we avoid repeated memory allocation for wide character array. So the generic performance looks OK to me. I also implemented the byte comparing version. It's 4 times as fast. And I do need use it to compare file path in my package 'counsel-etags'. That's the major reason why I started this task The fille path couldn't contain any funny characters (emoji). so it'sperfectly fine to use byte comparing version. Regards, Chen On Tue, Apr 17, 2018 at 12:37 PM, Eli Zaretskii wrote: >> From: chen bin >> Date: Tue, 17 Apr 2018 11:32:23 +1000 >> >> I intentionally avoid cc `emacs-devel` in my previous mail because I >> feel `culture/race` is sensitive thing. > > Please don't. There's no sensitivity about this stuff on emacs-devel, > in my long experience with Emacs development. > >> I will study your mail and figure out a better solution. > > Thanks. Feel free to ask more questions, and sorry again for my > stupid typo mistake. > -- help me, help you. --00000000000016209d056a0a8470 Content-Type: text/x-patch; charset="UTF-8"; name="0001-add-api-string-distance.patch" Content-Disposition: attachment; filename="0001-add-api-string-distance.patch" Content-Transfer-Encoding: base64 X-Attachment-Id: f_jg3mrgou0 RnJvbSBlNzg1Yjc2NmNmNDEyYzVkNmIwYzJkOGEzZjZlNTc1Njc1MzkwYzgzIE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBDaGVuIEJpbiA8Y2hlbmJpbi5zaEBnbWFpbC5jb20+CkRhdGU6 IFN1biwgMTUgQXByIDIwMTggMDI6MjA6MjkgKzEwMDAKU3ViamVjdDogW1BBVENIXSBhZGQgYXBp IHN0cmluZy1kaXN0YW5jZQoKLS0tCiBldGMvTkVXUyAgICAgICAgICAgICAgICB8ICAyICsrCiBz cmMvZm5zLmMgICAgICAgICAgICAgICB8IDYxICsrKysrKysrKysrKysrKysrKysrKysrKysrKysr KysrKysrKysrKysrKysrKysrKysKIHRlc3QvbGlzcC9zdWJyLXRlc3RzLmVsIHwgMTggKysrKysr KysrKysrKysrCiAzIGZpbGVzIGNoYW5nZWQsIDgxIGluc2VydGlvbnMoKykKCmRpZmYgLS1naXQg YS9ldGMvTkVXUyBiL2V0Yy9ORVdTCmluZGV4IDVhYTkyZTI5OTEuLjNjY2UyYzQ4YzcgMTAwNjQ0 Ci0tLSBhL2V0Yy9ORVdTCisrKyBiL2V0Yy9ORVdTCkBAIC00OTAsNiArNDkwLDggQEAgeC1sb3N0 LXNlbGVjdGlvbi1ob29rcywgeC1zZW50LXNlbGVjdGlvbi1ob29rcwogKysrCiAqKiBOZXcgZnVu Y3Rpb24gYXNzb2MtZGVsZXRlLWFsbC4KIAorKiogTmV3IGZ1bmN0aW9uIHN0cmluZy1kaXN0YW5j ZSB0byBjYWxjdWxhdGUgTGV2ZW5zaHRlaW4gZGlzdGFuY2UgYmV0d2VlbiB0d28gc3RyaW5ncy4K KwogKiogJ3ByaW50LXF1b3RlZCcgbm93IGRlZmF1bHRzIHRvIHQsIHNvIGlmIHlvdSB3YW50IHRv IHNlZQogKHF1b3RlIHgpIGluc3RlYWQgb2YgJ3ggeW91IHdpbGwgaGF2ZSB0byBiaW5kIGl0IHRv IG5pbCB3aGVyZSBhcHBsaWNhYmxlLgogCmRpZmYgLS1naXQgYS9zcmMvZm5zLmMgYi9zcmMvZm5z LmMKaW5kZXggOTRiOWQ5ODRmMC4uY2U3NzliMmE3OSAxMDA2NDQKLS0tIGEvc3JjL2Zucy5jCisr KyBiL3NyYy9mbnMuYwpAQCAtMTUzLDYgKzE1Myw2NiBAQCBJZiBTVFJJTkcgaXMgbXVsdGlieXRl LCB0aGlzIG1heSBiZSBncmVhdGVyIHRoYW4gdGhlIGxlbmd0aCBvZiBTVFJJTkcuICAqLykKICAg cmV0dXJuIG1ha2VfbnVtYmVyIChTQllURVMgKHN0cmluZykpOwogfQogCitERUZVTiAoInN0cmlu Zy1kaXN0YW5jZSIsIEZzdHJpbmdfZGlzdGFuY2UsIFNzdHJpbmdfZGlzdGFuY2UsIDIsIDMsIDAs CisgICAgICAgZG9jOiAvKiBSZXR1cm4gTGV2ZW5zaHRlaW4gZGlzdGFuY2UgYmV0d2VlbiBTVFJJ TkcxIGFuZCBTVFJJTkcyLgorSWYgQllURUNPTVBBUkUgaXMgbmlsLCB3ZSBjb21wYXJlIGNoYXJh Y3RlciBvZiBzdHJpbmdzLgorSWYgQllURUNPTVBBUkUgaXMgdCwgd2UgY29tcGFyZSBieXRlIG9m IHN0cmluZ3MuCitDb21wYXJpbmcgYnkgYnl0ZSBpcyBmYXN0ZXIgYW5kIG5vbi1hc2NpaSBjaGFy YWN0ZXJzIGhhcyB3ZWlnaHRlZCBkaXN0YW5jZS4KK0Nhc2UgaXMgc2lnbmlmaWNhbnQsIGJ1dCB0 ZXh0IHByb3BlcnRpZXMgYXJlIGlnbm9yZWQuICovKQorICAoTGlzcF9PYmplY3Qgc3RyaW5nMSwg TGlzcF9PYmplY3Qgc3RyaW5nMiwgTGlzcF9PYmplY3QgYnl0ZWNvbXBhcmUpCit7CisgIENIRUNL X1NUUklORyAoc3RyaW5nMSk7CisgIENIRUNLX1NUUklORyAoc3RyaW5nMik7CisKKyAgYm9vbCB1 c2VfYnl0ZWNvbXBhcmUgPSAhTklMUChieXRlY29tcGFyZSk7CisgIHB0cmRpZmZfdCBsZW4xID0g U0NIQVJTIChzdHJpbmcxKTsKKyAgcHRyZGlmZl90IGxlbjIgPSBTQ0hBUlMgKHN0cmluZzIpOwor ICBwdHJkaWZmX3QgeCwgeSwgbGFzdGRpYWcsIG9sZGRpYWc7CisKKyAgVVNFX1NBRkVfQUxMT0NB OworICBwdHJkaWZmX3QgKmNvbHVtbiA9IFNBRkVfQUxMT0NBICgobGVuMSArIDEpICogc2l6ZW9m IChwdHJkaWZmX3QpKTsKKyAgZm9yICh5ID0gMTsgeSA8PSBsZW4xOyB5KyspCisgICAgY29sdW1u W3ldID0geTsKKworICBpZiAodXNlX2J5dGVjb21wYXJlKQorICAgIHsKKyAgICAgIGNoYXIgKnMx ID0gU1NEQVRBIChzdHJpbmcxKTsKKyAgICAgIGNoYXIgKnMyID0gU1NEQVRBIChzdHJpbmcyKTsK KworICAgICAgZm9yICh4ID0gMTsgeCA8PSBsZW4yOyB4KyspCisgICAgICAgIHsKKyAgICAgICAg ICBjb2x1bW5bMF0gPSB4OworICAgICAgICAgIGZvciAoeSA9IDEsIGxhc3RkaWFnID0geCAtIDE7 IHkgPD0gbGVuMTsgeSsrKQorICAgICAgICAgICAgeworICAgICAgICAgICAgICBvbGRkaWFnID0g Y29sdW1uW3ldOworICAgICAgICAgICAgICBjb2x1bW5beV0gPSBtaW4gKG1pbiAoY29sdW1uW3ld ICsgMSwgY29sdW1uW3ktMV0gKyAxKSwgbGFzdGRpYWcgKyAoczFbeS0xXSA9PSBzMlt4LTFdPyAw IDogMSkpOworICAgICAgICAgICAgICBsYXN0ZGlhZyA9IG9sZGRpYWc7CisgICAgICAgICAgICB9 CisgICAgICAgIH0KKyAgICB9CisgIGVsc2UKKyAgICB7CisgICAgICBpbnQgYzEsIGMyOworICAg ICAgcHRyZGlmZl90IGkxLCBpMV9ieXRlLCBpMiwgaTJfYnl0ZTsKKyAgICAgIGkyID0gaTJfYnl0 ZSA9IDA7CisgICAgICBmb3IgKHggPSAxOyB4IDw9IGxlbjI7IHgrKykKKyAgICAgICAgeworICAg ICAgICAgIGNvbHVtblswXSA9IHg7CisgICAgICAgICAgRkVUQ0hfU1RSSU5HX0NIQVJfQURWQU5D RSAoYzIsIHN0cmluZzIsIGkyLCBpMl9ieXRlKTsKKyAgICAgICAgICBpMSA9IGkxX2J5dGUgPSAw OworICAgICAgICAgIGZvciAoeSA9IDEsIGxhc3RkaWFnID0geCAtIDE7IHkgPD0gbGVuMTsgeSsr KQorICAgICAgICAgICAgeworICAgICAgICAgICAgICBvbGRkaWFnID0gY29sdW1uW3ldOworICAg ICAgICAgICAgICBGRVRDSF9TVFJJTkdfQ0hBUl9BRFZBTkNFIChjMSwgc3RyaW5nMSwgaTEsIGkx X2J5dGUpOworICAgICAgICAgICAgICBjb2x1bW5beV0gPSBtaW4gKG1pbiAoY29sdW1uW3ldICsg MSwgY29sdW1uW3ktMV0gKyAxKSwgbGFzdGRpYWcgKyAoYzEgPT0gYzI/IDAgOiAxKSk7CisgICAg ICAgICAgICAgIGxhc3RkaWFnID0gb2xkZGlhZzsKKyAgICAgICAgICAgIH0KKyAgICAgICAgfQor ICAgIH0KKyAgU0FGRV9GUkVFICgpOworICByZXR1cm4gbWFrZV9udW1iZXIgKGNvbHVtbltsZW4x XSk7Cit9CisKIERFRlVOICgic3RyaW5nLWVxdWFsIiwgRnN0cmluZ19lcXVhbCwgU3N0cmluZ19l cXVhbCwgMiwgMiwgMCwKICAgICAgICBkb2M6IC8qIFJldHVybiB0IGlmIHR3byBzdHJpbmdzIGhh dmUgaWRlbnRpY2FsIGNvbnRlbnRzLgogQ2FzZSBpcyBzaWduaWZpY2FudCwgYnV0IHRleHQgcHJv cGVydGllcyBhcmUgaWdub3JlZC4KQEAgLTUyMjYsNiArNTI4Niw3IEBAIHRoaXMgdmFyaWFibGUu ICAqLyk7CiAgIGRlZnN1YnIgKCZTbGVuZ3RoKTsKICAgZGVmc3ViciAoJlNzYWZlX2xlbmd0aCk7 CiAgIGRlZnN1YnIgKCZTc3RyaW5nX2J5dGVzKTsKKyAgZGVmc3ViciAoJlNzdHJpbmdfZGlzdGFu Y2UpOwogICBkZWZzdWJyICgmU3N0cmluZ19lcXVhbCk7CiAgIGRlZnN1YnIgKCZTY29tcGFyZV9z dHJpbmdzKTsKICAgZGVmc3ViciAoJlNzdHJpbmdfbGVzc3ApOwpkaWZmIC0tZ2l0IGEvdGVzdC9s aXNwL3N1YnItdGVzdHMuZWwgYi90ZXN0L2xpc3Avc3Vici10ZXN0cy5lbAppbmRleCA1MmI2MWQ5 ZmI5Li4xMzEwZjViMmE1IDEwMDY0NAotLS0gYS90ZXN0L2xpc3Avc3Vici10ZXN0cy5lbAorKysg Yi90ZXN0L2xpc3Avc3Vici10ZXN0cy5lbApAQCAtMjgxLDYgKzI4MSwyNCBAQCBzdWJyLXRlc3Qt LWZyYW1lcy0xCiAgIChzaG91bGQgKGVxdWFsIChzdHJpbmctbWF0Y2gtcCAiXFxgW1s6Ymxhbms6 XV1cXCciICJcdTMwMDAiKSAwKSkKICAgKHNob3VsZC1ub3QgKHN0cmluZy1tYXRjaC1wICJcXGBb WzpibGFuazpdXVxcJyIgIlxOe0xJTkUgU0VQQVJBVE9SfSIpKSkKIAorKGVydC1kZWZ0ZXN0IHN1 YnItdGVzdHMtLXN0cmluZy1kaXN0YW5jZSAoKQorICAiVGVzdCBgc3RyaW5nLWRpc3RhbmNlJyBi ZWhhdmlvci4iCisgIDs7IEFTQ0lJIGNoYXJhY3RlcnMgYXJlIGFsd2F5cyBmaW5lCisgIChzaG91 bGQgKGVxdWFsIDEgKHN0cmluZy1kaXN0YW5jZSAiaGVlbG8iICJoZWxsbyIpKSkKKyAgKHNob3Vs ZCAoZXF1YWwgMiAoc3RyaW5nLWRpc3RhbmNlICJhZWVsbyIgImhlbGxvIikpKQorICAoc2hvdWxk IChlcXVhbCAwIChzdHJpbmctZGlzdGFuY2UgImFiIiAiYWIiIHQpKSkKKyAgKHNob3VsZCAoZXF1 YWwgMSAoc3RyaW5nLWRpc3RhbmNlICJhYiIgImFiYyIgdCkpKQorCisgIDs7IHN0cmluZyBjb250 YWluaW5nIGhhbnppIGNoYXJhY3RlciwgY29tcGFyZSBieSBieXRlCisgIChzaG91bGQgKGVxdWFs IDYgKHN0cmluZy1kaXN0YW5jZSAiYWIiICJhYuaIkeWluSIgdCkpKQorICAoc2hvdWxkIChlcXVh bCAzIChzdHJpbmctZGlzdGFuY2UgImFiIiAiYeaIkWIiIHQpKSkKKyAgKHNob3VsZCAoZXF1YWwg MyAoc3RyaW5nLWRpc3RhbmNlICLmiJEiICLlpbkiIHQpKQorCisgIDs7IHN0cmluZyBjb250YWlu aW5nIGhhbnppIGNoYXJhY3RlciwgY29tcGFyZSBieSBjaGFyYWN0ZXIKKyAgKHNob3VsZCAoZXF1 YWwgMiAoc3RyaW5nLWRpc3RhbmNlICJhYiIgImFi5oiR5aW5IikpKQorICAoc2hvdWxkIChlcXVh bCAxIChzdHJpbmctZGlzdGFuY2UgImFiIiAiYeaIkWIiKSkpCisgIChzaG91bGQgKGVxdWFsIDEg KHN0cmluZy1kaXN0YW5jZSAi5oiRIiAi5aW5IikpKSkKKwogKGVydC1kZWZ0ZXN0IHN1YnItdGVz dHMtLWRvbGlzdC0td3JvbmctbnVtYmVyLW9mLWFyZ3MgKCkKICAgIlRlc3QgdGhhdCBgZG9saXN0 JyBkb2Vzbid0IGFjY2VwdCB3cm9uZyB0eXBlcyBvciBsZW5ndGggb2YgU1BFQywKIGNmLiBCdWcj MjU0NzcuIgotLSAKMi4xNi4zCgo= --00000000000016209d056a0a8470--