From mboxrd@z Thu Jan 1 00:00:00 1970
From: Lennart Borgman
Newsgroups: gmane.emacs.devel
Subject: Re: What does Emacs on w32 know that grep can't figure out?
Date: Tue, 5 Oct 2010 02:51:41 +0200
References: <87bp7d1o6k.fsf@catnip.gol.com> <83bp7dqcgu.fsf@gnu.org> <87mxqw28cv.fsf@tux.homenetwork> <87mxqw4oup.fsf@tux.homenetwork> <83ocbcpqif.fsf@gnu.org> <83d3rrozyw.fsf@gnu.org>
To: Eli Zaretskii
Cc: thierry.volpiatto@gmail.com, emacs-devel@gnu.org
Xref: news.gmane.org gmane.emacs.devel:131338

On Tue, Oct 5, 2010 at 2:46 AM, Lennart Borgman wrote:
> On Mon, Oct 4, 2010 at 12:50 AM, Lennart Borgman wrote:
>> On Sun, Oct 3, 2010 at 9:09 PM, Eli Zaretskii wrote:
>>>> From: Lennart Borgman
>>>> Date: Sun, 3 Oct 2010 06:10:14 +0200
>>>> Cc: Thierry Volpiatto, emacs-devel@gnu.org
>>>>
>>>>   http://technet.microsoft.com/en-us/library/dd315403.aspx
>>>>
>>>> it does not look as it autodetects the coding system in the file.
>>>
>>> Why should it?
> ..
>>> What tool does, besides Emacs?
> ..
>>> It does support UTF-16.  That's the "Unicode" part of the values you
>>> can submit to the -Encoding option (you need to "think MS" to get it).
>
> This discussion seems to have stalled. So here is a summary of what we
> have found:
>
> - utf-16 is a problem since the grep program does not handle it and
>   utf-16 is common on w32.
>
> - There seems to be no program we can use that autodetects file coding
>   the way Emacs does.
>
> - I would expect Emacs users to believe that a search from within Emacs
>   would autodetect coding system since Emacs does it. (Eli disagrees on
>   this.)

Ah, shit. Doing two things simultaneously does not improve your
writing... It should have continued like this:

What can we do? The options I can see are:

A) Use PowerShell + Select-String (plus another cmdlet for directory
   tree searching) on w32 by default. That has two benefits:

   1. Users on w32 do not have to install grep + find. On older
      systems they have to install PowerShell instead, but it is easy
      to test for PowerShell and use it if it is installed. (One
      problem is the hardcoding of parameters to grep, like -i, but
      that can be resolved.)

   2. utf-16 files can also be searched on w32.

B) Make an internal grep command in Emacs. A naive version in elisp is
   quick to write but will be inefficient, especially for large files.
   A C version will require some restructuring of insert_file_contents,
   but otherwise does not seem hard to code. (Except perhaps buffering
   for efficiency, but I have no idea whether that is needed.)

I think B is the best choice, but I do not expect anyone to write it
now. Maybe add it to our to-do list?

For the moment I suggest that we implement A.
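
To make the two options a bit more concrete, here is a rough sketch, not a
proposal for the actual implementation. The names my-powershell-available-p
and my-naive-grep are hypothetical and exist only for this illustration. The
first function is the "test for PowerShell" step from option A. The second is
the naive elisp version from option B: it lets insert-file-contents decode
each file, so whatever coding-system detection Emacs normally applies when
visiting a file (for example, recognizing UTF-16 from a byte order mark) is
also applied to the search.

```elisp
;; Hypothetical helper for option A: is PowerShell usable here?
(defun my-powershell-available-p ()
  "Return non-nil on w32 if a powershell executable is in `exec-path'."
  (and (eq system-type 'windows-nt)
       (executable-find "powershell")))

;; Hypothetical naive version of option B.
(defun my-naive-grep (regexp files)
  "Return a list of (FILE LINE TEXT) for lines in FILES matching REGEXP.
Each file is read and decoded with `insert-file-contents', so the
search runs on decoded text rather than on raw bytes."
  (let (matches)
    (dolist (file files)
      (with-temp-buffer
        ;; Reads and decodes the whole file at once: simple, but
        ;; slow and memory-hungry for large files.
        (insert-file-contents file)
        (goto-char (point-min))
        ;; The `eobp' check keeps an empty match at the end of the
        ;; buffer from looping forever.
        (while (and (not (eobp))
                    (re-search-forward regexp nil t))
          (push (list file
                      (line-number-at-pos)
                      (buffer-substring-no-properties
                       (line-beginning-position)
                       (line-end-position)))
                matches)
          ;; Report each matching line once, then continue below it.
          (forward-line 1))))
    (nreverse matches)))
```

It would be called as (my-naive-grep "foo" '("a.txt" "b.txt")). Reading every
file completely into a temporary buffer is exactly the inefficiency mentioned
under B; a C version could avoid it by decoding and scanning in chunks.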