From mboxrd@z Thu Jan 1 00:00:00 1970
From: Drew Adams
Newsgroups: gmane.emacs.help
Subject: RE: `grep' command on MS Windows with Cygwin, looking for text with Unicode chars
Date: Wed, 13 Jun 2018 12:26:27 -0700 (PDT)
Message-ID: <6568494b-e721-410f-8658-c4df45ef92f1@default>
References: <356e7bf9-3f93-448c-a067-f6b567d5aa5a@default> <877en2edi0.fsf@telefonica.net>
To: Noam Postavsky, Óscar Fuentes
Cc: Help Gnu Emacs mailing list
> >> Is there a simple way to use `M-x grep' (e.g., giving it
> >> some switches or escape chars or replacing them with hex
> >> escapes or...) to search for some text that includes
> >> non-ASCII Unicode chars?
>
> > If there is a method, I'll like to know as well. This is the main reason
> > why I don't use Unicode in my source files.
>
> This seems to do the right with thing with the grep I have installed:
>
>   grep "[^[:cntrl:][:print:]]" *.el
>
> According to the GNU grep manual [:cntrl:][:print:] looks equivalent
> to Emacs' [:ascii:], in the C locale.
>
> The grep I have installed doesn't seem to support anything but the C
> locale anyway (at least, setting LANG isn't needed). It identifies
> itself in the --help output as:
>
>   GNU grep version 2.0d
>   Win32 port with subdirectory search created by Tim Charron
>   (full source available at
>   http://www.interlog.com/~tcharron/grep.html)
>
> That web page indicates it's from 2001, but works well enough that
> I've never bothered to change it. Not sure how Cygwin grep would act.

Interesting; thanks. With my (old) Cygwin grep, in the `lisp'
directory, that shows 4 hits, 3 in char-fold.el and one in mpc.el.
The first char-fold.el hit shows matches for curly quotes, for example.
But I guess that won't help me find just curly quotes. ;-)

In each case, the grep hits show octal escapes instead of Unicode-char
glyphs.
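For finding just curly quotes, one possible approach is to search for
their UTF-8 byte sequences as fixed strings. The following is only a
sketch: it assumes a POSIX shell (such as Cygwin's bash) and UTF-8
encoded files, it has not been tried with either of the grep builds
mentioned in this thread, and the directory and file names are made up
for the demonstration.

```shell
#!/bin/sh
# Sketch: find only curly single quotes (U+2018, U+2019) by their UTF-8 bytes.
# The sample files below are hypothetical and exist only for this demo.
dir=$(mktemp -d)
printf 'plain ascii line\n'                          > "$dir/a.el"
printf 'uses \342\200\230curly\342\200\231 quotes\n' > "$dir/b.el"
printf 'caf\303\251 has no curly quotes\n'           > "$dir/c.el"

# Octal escapes in printf keep the command line itself pure ASCII.
lq=$(printf '\342\200\230')   # U+2018 LEFT SINGLE QUOTATION MARK
rq=$(printf '\342\200\231')   # U+2019 RIGHT SINGLE QUOTATION MARK

# -F: fixed strings; -l: list matching files; LC_ALL=C: compare raw bytes.
hits=$(cd "$dir" && LC_ALL=C grep -l -F -e "$lq" -e "$rq" -- *.el)
echo "$hits"

# Remove the demo directory.
rm -rf "$dir"
```

Only the file containing the curly quotes is listed; the file with a
different non-ASCII character is not. Dropping `-l' would print the
matching lines instead, though some grep versions may then report a
binary-file match rather than the line itself.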