From: Dmitry Gutov
Newsgroups: gmane.emacs.devel
Subject: Re: dired-do-find-regexp failure with latin-1 encoding
Date: Sun, 29 Nov 2020 19:44:57 +0200
Message-ID: <9dcc71f4-1d76-1436-67c9-89d7711af42c@yandex.ru>
In-Reply-To: <83im9okrcc.fsf@gnu.org>
References: <87blfhjr4q.fsf@gmx.net> <83k0u5mjvf.fsf@gnu.org>
 <877dq5jp51.fsf@gmx.net> <83im9pmh0v.fsf@gnu.org>
 <106736d6-1732-3f24-15c5-af7bcfd688c6@yandex.ru> <83blfhmdho.fsf@gnu.org>
 <247a8edb-7b70-ad32-1ba1-43b5458a82b0@yandex.ru> <838sakmccw.fsf@gnu.org>
 <83o8jgkrxo.fsf@gnu.org> <83im9okrcc.fsf@gnu.org>
To: Eli Zaretskii
Cc: stephen.berman@gmx.net, emacs-devel@gnu.org

On 29.11.2020 19:25, Eli Zaretskii wrote:
>> Cc: stephen.berman@gmx.net, emacs-devel@gnu.org
>> From: Dmitry Gutov
>> Date: Sun, 29 Nov 2020 19:19:43 +0200
>>
>>> Is that � what Grep actually produced?
>>
>> That's copied from a terminal emulator.
>>
>> If I run it with shell-command, I get this:
>>
>> premi\350re is first
>> premie?re is slightly different
>>
>> (\350 being a raw char)
>
> Then I think injecting LC_ALL=C into the environment when running Grep
> in this case makes the results more useful? And we can then avoid
> using -a?

I'm not so sure. LC_ALL=C seems more problematic than -a:

$ grep ф test.txt
фыва
$ grep -a ф test.txt
фыва
$ LC_ALL=C grep ф test.txt
(nothing)

Curiously,

LC_ALL=C grep première latin1.txt

works just fine with my terminal emulator, but that's probably because it
decodes the multibyte search string under the covers before using it as
the argument. It doesn't work in Emacs without 'C-x RET c'.
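
To make the encoding point concrete, here is a minimal sketch of why the bytes of the search string matter when Grep runs under LC_ALL=C. It is not taken from the thread's actual test files: the temporary file, its contents and the variable names are made up for illustration, and it assumes a grep that matches byte-for-byte in the C locale, as recent GNU grep versions should.

```shell
# Sketch only: file contents and variable names are hypothetical.
tmp=$(mktemp)

# One line of Latin-1 text: \350 (octal) is the single byte for "è" in Latin-1.
printf 'premi\350re is first\n' > "$tmp"

# The same word as a UTF-8 pattern: "è" becomes the two bytes \303\250.
utf8_pat=$(printf 'premi\303\250re')
# The same word as a Latin-1 pattern: "è" is the single byte \350.
latin1_pat=$(printf 'premi\350re')

# In the C locale grep compares bytes; -c prints the number of matching lines.
# "|| true" keeps a zero-match exit status from aborting under "set -e".
utf8_hits=$(LC_ALL=C grep -c -- "$utf8_pat" "$tmp" || true)
latin1_hits=$(LC_ALL=C grep -c -- "$latin1_pat" "$tmp" || true)

echo "UTF-8 pattern:   $utf8_hits matching line(s)"
echo "Latin-1 pattern: $latin1_hits matching line(s)"

rm -f "$tmp"
```

Under that assumption, only the Latin-1 pattern should find the line. That would be consistent with the search working when something re-encodes the pattern to match the file, and failing when the pattern is sent as UTF-8.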