From: Lennart Borgman
Newsgroups: gmane.emacs.devel
Subject: Re: What does Emacs on w32 know that grep can't figure out?
Date: Tue, 5 Oct 2010 02:46:56 +0200
References: <87bp7d1o6k.fsf@catnip.gol.com> <83bp7dqcgu.fsf@gnu.org> <87mxqw28cv.fsf@tux.homenetwork> <87mxqw4oup.fsf@tux.homenetwork> <83ocbcpqif.fsf@gnu.org> <83d3rrozyw.fsf@gnu.org>
To: Eli Zaretskii
Cc: thierry.volpiatto@gmail.com, emacs-devel@gnu.org

On Mon, Oct 4, 2010 at 12:50 AM, Lennart Borgman wrote:
> On Sun, Oct 3, 2010 at 9:09 PM, Eli Zaretskii wrote:
>>> From: Lennart Borgman
>>> Date: Sun, 3 Oct 2010 06:10:14 +0200
>>> Cc: Thierry Volpiatto , emacs-devel@gnu.org
>>>
>>>   http://technet.microsoft.com/en-us/library/dd315403.aspx
>>>
>>> it does not look as it autodetects the coding system in the file.
>>
>> Why should it?
..
>> What tool does, besides Emacs?
..
>> It does support UTF-16.
>> That's the "Unicode" part of the values you
>> can submit to the -Encoding option (you need to "think MS" to get it).

This discussion seems to have stalled. So here is a summary of what we
have found:

- utf-16 is a problem since the grep program does not handle it and
  utf-16 is common on w32.

- There seems to be no program we can use that autodetects file coding
  the way Emacs does.

- I would expect Emacs users to believe that a search from within Emacs
  would autodetect the coding system, since Emacs itself does. (Eli
  disagrees on this.)

What can we do? The options I can see are:

1. Use powershell + select-string (+ another cmdlet for dir tree
   searching) on w32 by default. One benefit is that users on w32 do
   not have to install grep+find. On older systems they have to install
   powershell instead, but it is easy to test for powershell and use it
   only if it is installed. (A problem is the hardcoding of parameters
   to grep, like -i, but that can be resolved.)

2. Make an internal grep command in Emacs. A naive version in elisp is
   quick to write but will be inefficient, especially for large files.
   A C version will require some restructuring of insert_file_contents,
   but seems otherwise not hard to code. (Except perhaps for buffering
   for efficiency, but I have no idea if that is needed.)

I think 2 is the best choice, but I do not expect anyone to write it
now. Maybe add it to our to-do list?

For the moment I suggest that we implement 1.
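
To make the utf-16 point concrete, here is a minimal sketch of a naive "internal grep" that looks at the byte-order mark before decoding. It is written in Python purely as an illustration, not in elisp or C. The names `sniff_encoding` and `naive_grep` are made up for this example, and the BOM check is an assumption that is far cruder than what Emacs does when it autodetects a coding system (a utf-16 file without a BOM would not be recognized).

```python
import codecs
import re


def sniff_encoding(data, default="utf-8"):
    """Guess an encoding from a byte-order mark only (hypothetical helper).

    This is an assumption-laden stand-in: real coding-system detection,
    as done by Emacs, looks at much more than the first few bytes.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec consumes the BOM and picks the byte order.
        return "utf-16"
    return default


def naive_grep(pattern, path, ignore_case=False):
    """Return (line_number, line) pairs for the lines of `path` that match."""
    with open(path, "rb") as f:
        # Reads the whole file into memory, so it is slow for large files.
        data = f.read()
    text = data.decode(sniff_encoding(data), errors="replace")
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if regex.search(line)
    ]
```

The `ignore_case` argument stands in for grep's -i. Calling `naive_grep("emacs", "notes.txt", ignore_case=True)` on a hypothetical file gives the same kind of result whether that file is utf-8 or BOM-marked utf-16, which is the behaviour an external grep does not give us on w32.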