From: Eli Zaretskii
To: Stephen Berman
Cc: emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: dired-do-find-regexp failure with latin-1 encoding
Date: Sat, 28 Nov 2020 21:13:20 +0200
Message-ID: <83im9pmh0v.fsf@gnu.org>
In-Reply-To: <877dq5jp51.fsf@gmx.net> (message from Stephen Berman on Sat, 28 Nov 2020 19:46:18 +0100)

> From: Stephen Berman
> Cc: emacs-devel@gnu.org
> Date: Sat, 28 Nov 2020 19:46:18 +0100
>
> > Does it work for ä if you say
> >
> >   C-x RET c latin-1 RET A ä RET
> >
> > ?
>
> Yes (with -a added to the grep invocation, but not without it).  And
> then with either 'a' or 'ä' as the search term, *xref* displays 'aä'.
> So this seems to be the best workaround, though inconvenient for
> frequent uses

I really don't see any other way, especially if different files in the
directory have different encodings.  Grep looks for bytes, not
characters, and is agnostic to encoding.  And even if we'd do this in
Emacs Lisp, we'd still need to trust Emacs to guess/detect the correct
encoding of each file.

> Do you then agree to adding -a to the grep invocation in
> xref-matches-in-files?  Or could that have undesirable consequences?

Adding -a probably cannot do any harm, but its support should be
detected, since I don't think it's portable enough (it isn't in the
latest Posix spec, at least).
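
To illustrate the point about bytes versus characters, here is a minimal Python sketch. It is a simplified model only: it does not call grep or Emacs, and the helper `byte_search` and the sample data are hypothetical names invented for this illustration. It shows that the same character 'ä' becomes different byte sequences in Latin-1 and UTF-8, so a pattern encoded one way cannot match a file encoded the other way.

```python
# Simplified model of a byte-oriented search (an assumption for illustration;
# this is not how grep or xref is implemented).

# Hypothetical sample: the same text "aä" saved in two different encodings.
text = "aä"
latin1_file = text.encode("latin-1")  # 'ä' is the single byte 0xE4
utf8_file = text.encode("utf-8")      # 'ä' is the two bytes 0xC3 0xA4


def byte_search(pattern: str, pattern_encoding: str, file_bytes: bytes) -> bool:
    """Encode the pattern, then look for exactly those bytes in the file."""
    return pattern.encode(pattern_encoding) in file_bytes


# Pattern sent as UTF-8 bytes: found in the UTF-8 file, missed in the Latin-1 file.
print(byte_search("ä", "utf-8", utf8_file))
print(byte_search("ä", "utf-8", latin1_file))

# Pattern sent as Latin-1 bytes (roughly what forcing latin-1 with C-x RET c
# does for the search term): now the Latin-1 file matches.
print(byte_search("ä", "latin-1", latin1_file))
```

In this model, no single encoding of the search term can match both files, which is why a directory with mixed encodings has no clean solution at the byte level.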