From: Dmitry Gutov
Newsgroups: gmane.emacs.devel
Subject: Re: dired-do-find-regexp failure with latin-1 encoding
Date: Sun, 29 Nov 2020 19:32:17 +0200
To: Eli Zaretskii
Cc: stephen.berman@gmx.net, emacs-devel@gnu.org

On 29.11.2020 19:18, Eli Zaretskii wrote:

>> That wouldn't be easy, but some script that performs conversion based on
>> file contents could work.
>
> It could work in principle, but I think in practice it will not be
> faster than doing everything in Emacs Lisp, because each file will
> need to be read twice.

It will certainly be faster if the host is remote. On a local machine,
you might be right, but we'd have to benchmark to be sure. If the calls
to the conversion program are done in parallel to the subsequent
searches, reading the file twice might not be a problem (with the
benefit of a disk cache). And if rg itself performs the search faster
than Emacs' regexp engine, that can also be a factor. Depends on process
spawning overhead, I suppose.

>>> It would be brittle, unless that program actually reads the entire
>>> file (which will be slow).
>>
>> How does Emacs do it? Does it read until the end of the file?
>
> No, just a small initial part of it. That's one reason why the
> results are not guaranteed to be correct.

But if we consider that approach good enough for Emacs, it should
probably be good enough for doing a search from inside Emacs.
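
To make the trade-off in the last exchange concrete, here is a rough Python sketch of prefix-only detection. It is not Emacs's actual detection code; the function name `guess_encoding`, the 4096-byte default, and the "UTF-8, else latin-1" fallback are all assumptions chosen for illustration. It only shows why inspecting a small initial part is cheap but not guaranteed to be correct.

```python
import codecs


def guess_encoding(data: bytes, prefix_size: int = 4096) -> str:
    """Guess an encoding by inspecting only the first prefix_size bytes.

    Hypothetical helper: returns "utf-8" if the prefix is valid UTF-8,
    otherwise falls back to "latin-1".
    """
    head = data[:prefix_size]
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multibyte sequence cut off at the
        # prefix boundary, so truncation alone does not cause an error.
        decoder.decode(head, final=False)
    except UnicodeDecodeError:
        return "latin-1"
    return "utf-8"


# A latin-1 byte that appears only after the inspected prefix is missed.
sample = b"a" * 10 + "\u00e9t\u00e9".encode("latin-1")
print(guess_encoding(sample, prefix_size=10))   # looks only at the ASCII part
print(guess_encoding(sample, prefix_size=100))  # sees the latin-1 bytes
```

With a short prefix the first call misclassifies the sample, which is the brittleness mentioned above; reading the whole file avoids that at the cost of speed.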