From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: chad Newsgroups: gmane.emacs.devel Subject: Re: [feature/internal-msys] thoughts of a more function windows package Date: Mon, 25 Jan 2021 14:13:15 -0800 Message-ID: References: <87pn2dq3xv.fsf@russet.org.uk> <83ft39hnk1.fsf@gnu.org> <87h7nppzjy.fsf@russet.org.uk> <838s90hhb6.fsf@gnu.org> <87zh1gircl.fsf@russet.org.uk> <83turofw8r.fsf@gnu.org> <87sg6v76fd.fsf_-_@russet.org.uk> <83czxy7530.fsf@gnu.org> <87zh12grzh.fsf@russet.org.uk> <83wnw659kw.fsf@gnu.org> <87o8higedm.fsf@russet.org.uk> <83sg6t5t86.fsf@gnu.org> <87bldh7xue.fsf@russet.org.uk> <83bldg6h0o.fsf@gnu.org> <87a6sygf0s.fsf@russet.org.uk> <87a6sydjvw.fsf@telefonica.net> <871re9nc3e.fsf@russet.org.uk> <83k0s12g6b.fsf@gnu.org> Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="000000000000d6115505b9c0d82b" Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="526"; mail-complaints-to="usenet@ciao.gmane.io" Cc: ofv@wanadoo.es, Eli Zaretskii , Phillip Lord , EMACS development team To: Stefan Monnier Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Mon Jan 25 23:14:15 2021 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1l4A7i-000Aa1-Hx for ged-emacs-devel@m.gmane-mx.org; Mon, 25 Jan 2021 23:14:14 +0100 Original-Received: from localhost ([::1]:52434 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1l4A7h-0001sy-Hc for ged-emacs-devel@m.gmane-mx.org; Mon, 25 Jan 2021 17:14:13 -0500 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:55868) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1l4A70-0001Tb-HQ for emacs-devel@gnu.org; Mon, 25 Jan 2021 17:13:31 -0500 
Original-Received: from mail-yb1-xb35.google.com ([2607:f8b0:4864:20::b35]:45763) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1l4A6x-0008OQ-Tq; Mon, 25 Jan 2021 17:13:30 -0500 Original-Received: by mail-yb1-xb35.google.com with SMTP id e67so14827862ybc.12; Mon, 25 Jan 2021 14:13:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=F7N+kCJukPAK4yT2UsZaBTcdF8FpWkTAZMkHgQAghTk=; b=mcHB/53jjSUXX/tjS4DIlpHZkP6NZ0jjQon7rsqiijsEsJgY4/V4BkiZhvkOCOL3fr Nrun16CUqWNBxuBV3LvoJMqwX8PFNfantihira/9Ij4FdLs2pGgIsC1z2shcKY5CXnZL B6rz/z4ccGtIzKf6QNJKeKVh469sljOgUHgb+blnhEWGH2GJceMdRwsRiJVpsQ4Wys5R nau22QeKAylMwfN5JV650CeDt0Sl8X005v6cHn8yB6E0vKynOSdySROavs+WiQSWdrQX fzKfdrLy1TIsJq+ZYGsl4vSi3ZlW500dmY51ZUuDw5vsF9BHM3EnOU6v7n5/YUDWgxR4 +3Zw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=F7N+kCJukPAK4yT2UsZaBTcdF8FpWkTAZMkHgQAghTk=; b=JvthJtyLR0l0SchIcGgMFp87bd1lwLanAME6JIoD0MkQ6+nP8Vy1hw6S+IXCf/fAYG SQ3y8rGcdgFJA74jEuCEB7kxjD1z0lP6YQR0+ac7LVlYyGL0+h64Lrr66aNeKhPiBYO8 UUz60qv/Y4Hxalt73YmESSVKyIv92o61VpDEmhr8YJv9Y60GDn/sudlHTOK4nIQ2Dxx5 xyF9qPoli4wnSkeXoZqyhnSUKA28zMmwVs+KBR5IqUxzZQyZuRz5mXCiIEFclSbFdXYh ysK6jkBRzLISOFcPB+ZNnV0uWdpuxO8QJJNBKHQTFACIFaLvOwX6ArsvdYbt7cbxFEsp Ilkw== X-Gm-Message-State: AOAM532nhZv6GtWHRE+oi/7DaGVbAjwc55k5FlZsQyLKc4bFBFP8BTwF +7JlRCfO9UobbEUpZatnWFrUj9B8sI0TqSUPUmY= X-Google-Smtp-Source: ABdhPJyJnUjTNFMt8PV4/6LpJy2IwJvfJwa/1pO5lv7HDUtVbsoWxBv83/R7nDVnI2aRmv3mPTqCZBChcAKYNNW31wo= X-Received: by 2002:a25:6951:: with SMTP id e78mr4237141ybc.51.1611612806319; Mon, 25 Jan 2021 14:13:26 -0800 (PST) In-Reply-To: Received-SPF: pass client-ip=2607:f8b0:4864:20::b35; envelope-from=yandros@gmail.com; helo=mail-yb1-xb35.google.com X-Spam_score_int: -20 
X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_KAM_HTML_FONT_INVALID=0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.io gmane.emacs.devel:263426 Archived-At:

--000000000000d6115505b9c0d82b
Content-Type: text/plain; charset="UTF-8"

On Mon, Jan 25, 2021 at 12:42 PM Stefan Monnier wrote:

> > Tangent, but: you might be interested in fd-find, which is an
> > implementation of (paraphrasing here) about 80% of find that's usually
> > several times faster.
>
> Any hint *why* it's faster?

At this point I'm just passing along what I've read on the project pages,
but fd-find claims that most of its speed comes from reusing the
performance work in ripgrep's component pieces (its Rust-based regex and
ignore packages). There are two relevant write-ups from the ripgrep
project: a blog post with details and benchmarks, and a summary on
ripgrep's project page.

Blog post: https://blog.burntsushi.net/ripgrep/

Summarizing, ripgrep is fast because:

> - It is built on top of Rust's regex engine. Rust's regex engine uses
>   finite automata, SIMD and aggressive literal optimizations to make
>   searching very fast. (PCRE2 support can be opted into with the
>   -P/--pcre2 flag.)
>
> - Rust's regex library maintains performance with full Unicode support
>   by building UTF-8 decoding directly into its deterministic finite
>   automaton engine.
>
> - It supports searching with either memory maps or by searching
>   incrementally with an intermediate buffer. The former is better for
>   single files and the latter is better for large directories. ripgrep
>   chooses the best searching strategy for you automatically.
>
> - Applies your ignore patterns in .gitignore files using a RegexSet.
>   That means a single file path can be matched against multiple glob
>   patterns simultaneously.
>
> - It uses a lock-free parallel recursive directory iterator, courtesy
>   of crossbeam and ignore.

Hope that helps,
~Chad
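To make the "memory maps versus intermediate buffer" item in the list above a bit more concrete, here is a rough Python sketch of the two strategies. It is not ripgrep's code (ripgrep is written in Rust), and the function names and the chunk size are made up for illustration. It only shows the general shape: map the whole file and search it in one go, or read fixed-size chunks and carry a small overlap so a match that straddles two chunks is still found.

```python
import mmap
import os


def find_with_mmap(path, needle):
    """Return the byte offset of the first occurrence of needle, or -1.

    Hypothetical helper: maps the whole file and searches it in place.
    """
    if os.path.getsize(path) == 0:
        return -1  # an empty file cannot be memory-mapped
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm.find(needle)


def find_with_buffer(path, needle, chunk_size=64 * 1024):
    """Same result as find_with_mmap, but reads the file chunk by chunk.

    Hypothetical helper: keeps the last len(needle) - 1 bytes of each
    window so a match split across two reads is not missed.
    """
    overlap = len(needle) - 1
    tail = b""
    consumed = 0  # bytes read from the file before the current chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return -1
            window = tail + chunk
            hit = window.find(needle)
            if hit != -1:
                # window starts len(tail) bytes before the current chunk
                return consumed - len(tail) + hit
            consumed += len(chunk)
            tail = window[-overlap:] if overlap > 0 else b""
```

Both functions should return the same offset for the same file. The buffered variant never holds more than one chunk plus the overlap in memory, which is roughly the trade-off the list item describes.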
--000000000000d6115505b9c0d82b--
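As a loose illustration of the RegexSet item in the message above (one path tested against many glob patterns in a single call), here is a small Python sketch. Several caveats apply. It is an analogy, not ripgrep's implementation and not the Rust `RegexSet` API. The helpers `glob_to_regex` and `build_glob_set` are invented for this example. Only `*` and `?` are handled, so real .gitignore rules (negation, anchoring, `**`) are out of scope. Python's `re` is a backtracking engine, so this mimics the "one call, all patterns" interface rather than the automaton-based speed. It also reports only the first glob that matches, whereas a real RegexSet can report every matching pattern.

```python
import re


def glob_to_regex(glob):
    """Translate a simple glob (only * and ?) into regex source text."""
    out = []
    for ch in glob:
        if ch == "*":
            out.append("[^/]*")  # any run of characters within one path component
        elif ch == "?":
            out.append("[^/]")   # exactly one character, not a separator
        else:
            out.append(re.escape(ch))
    return "".join(out)


def build_glob_set(globs):
    """Compile all globs into one alternation (hypothetical helper).

    Each glob becomes a named group, so a single fullmatch call tests the
    name against every pattern and lastgroup tells us which one matched.
    """
    globs = list(globs)
    alternatives = [f"(?P<p{i}>{glob_to_regex(g)})" for i, g in enumerate(globs)]
    combined = re.compile("|".join(alternatives))

    def first_matching_glob(name):
        m = combined.fullmatch(name)
        if m is None or m.lastgroup is None:
            return None
        return globs[int(m.lastgroup[1:])]

    return first_matching_glob
```

With `build_glob_set(["*.o", "target", "*.tmp"])`, a name such as `main.o` is checked against all three patterns by one compiled regex rather than by three separate match calls.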