From: Dmitry Gutov
Newsgroups: gmane.emacs.devel
Subject: Re: master 544db1e: Faster grep pattern for identifiers
Date: Wed, 15 Sep 2021 21:39:36 +0300
To: Eli Zaretskii
Cc: mattiase@acm.org, emacs-devel@gnu.org
In-Reply-To: <8335q5bt9b.fsf@gnu.org>

On 15.09.2021 21:14, Eli Zaretskii wrote:
>> Cc: mattiase@acm.org, emacs-devel@gnu.org
>> From: Dmitry Gutov
>> Date: Wed, 15 Sep 2021 21:06:09 +0300
>>
>> On 15.09.2021 19:33, Eli Zaretskii wrote:
>>> And what about the "alternative Grep's"?
>> The author of the commit uses one such Grep.
>>
>> I also tested with ripgrep, to similar success.
> So they all have, miraculously, the same notion of what is a word,
> regardless of the programming language and the script/character set?
> Amazing.

Not exactly (e.g. Grep includes international chars in the "word" set,
and Ripgrep does not), but the notions of "not word" are compatible
enough for our purposes.

Speaking of Ripgrep, the compatible behavior of -w is only available in
recent versions, starting with 0.10.0 (reported and fixed in
https://github.com/BurntSushi/ripgrep/issues/389). Debian 10 and
Fedora 31 include that version or newer
(https://repology.org/project/ripgrep/versions).

Not that it's really important: we don't support Ripgrep officially.
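
To illustrate how two tools can disagree on the "word" set and still behave alike in most searches for an identifier, here is a rough Python sketch. It is not how Grep or Ripgrep implement -w; it only approximates the idea that a match must not touch a word character on either side. The names `word_match` and `ascii_only` are made up for this example, and the Unicode/ASCII switch stands in for the two notions of "word" mentioned above.

```python
import re


def word_match(identifier, line, ascii_only=False):
    """Approximate a -w style search for a literal identifier.

    The identifier matches only if it is neither preceded nor followed
    by a word character. With ascii_only=True, only [A-Za-z0-9_] count
    as word characters; otherwise Unicode letters and digits count too.
    """
    flags = re.ASCII if ascii_only else 0
    pattern = r"(?<!\w)" + re.escape(identifier) + r"(?!\w)"
    return re.search(pattern, line, flags) is not None


# Both notions of "word" agree on ordinary ASCII code.
print(word_match("foo", "call foo(bar)"))                     # True
print(word_match("foo", "call foo(bar)", ascii_only=True))    # True
print(word_match("foo", "call foobar()"))                     # False
print(word_match("foo", "call foobar()", ascii_only=True))    # False

# They differ only when a non-ASCII letter is adjacent to the identifier.
print(word_match("foo", "fooé = 1"))                          # False
print(word_match("foo", "fooé = 1", ascii_only=True))         # True
```

The two modes diverge only when a non-ASCII letter sits directly next to the identifier, which is the sense in which the notions of "not word" are compatible enough for this purpose.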