* `grep' command on MS Windows with Cygwin, looking for text with Unicode chars
@ 2018-06-13 18:23 Drew Adams
2018-06-13 18:40 ` Óscar Fuentes
2018-06-13 19:08 ` Eli Zaretskii
0 siblings, 2 replies; 15+ messages in thread
From: Drew Adams @ 2018-06-13 18:23 UTC
To: help-gnu-emacs@gnu.org List
Is there a simple way to use `M-x grep' (e.g., giving it
some switches or escape chars or replacing them with hex
escapes or...) to search for some text that includes
non-ASCII Unicode chars?
[I'm using (an old) Cygwin `grep'. Dunno whether that
matters.]
I tried to look for "‘%s’" (curly-quote) in the Emacs
source code.
E.g., in `info.el' we now have this:
(format "Index for ‘%s’" string) instead of this:
(format "Index for `%s'" string)
I wanted to see if this kind of change was spread to
other files.
I tried things like "\\x2018%s\\x2019", with no luck.
Is there a simple approach that uses only `M-x grep'
and not, say, piping the result of iconv to grep?
I ended up doing the search using Icicles, but I'd
like to be able to do such a search also using
just `grep' (or `rgrep' etc.).
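A possible ASCII-only spelling of such a pattern, sketched under the assumption that the files are UTF-8 and that a POSIX shell with `printf` runs the command (neither is confirmed in this thread). grep's basic and extended regexps have no `\x2018`-style code-point escape, but `printf` can emit the UTF-8 bytes of U+2018 (E2 80 98) and U+2019 (E2 80 99) from octal escapes. The temporary file and the variable names are invented for illustration.

```shell
# UTF-8 bytes, written in ASCII as octal escapes:
#   U+2018 = E2 80 98 = \342\200\230,  U+2019 = E2 80 99 = \342\200\231
pattern=$(printf '\342\200\230%%s\342\200\231')

# Hypothetical sample input: one line with curly quotes, one with ASCII quotes
# (\140 is a backquote, \047 an apostrophe).
sample=$(mktemp)
printf '(format "Index for \342\200\230%%s\342\200\231" string)\n' >  "$sample"
printf '(format "Index for \140%%s\047" string)\n'                 >> "$sample"

# -F treats the pattern as a fixed string; LC_ALL=C makes grep compare bytes.
# -c prints only the number of matching lines.
curly_hits=$(LC_ALL=C grep -c -F -e "$pattern" "$sample")
echo "$curly_hits"
rm -f "$sample"
```

Only the curly-quote line should be counted. Whether the same command line survives the trip from `M-x grep' through a Windows shell is a separate question that this sketch does not settle.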
* Re: `grep' command on MS Windows with Cygwin, looking for text with Unicode chars
2018-06-13 18:23 `grep' command on MS Windows with Cygwin, looking for text with Unicode chars Drew Adams
@ 2018-06-13 18:40 ` Óscar Fuentes
2018-06-13 19:09 ` Drew Adams
2018-06-13 19:16 ` Noam Postavsky
2018-06-13 19:08 ` Eli Zaretskii
1 sibling, 2 replies; 15+ messages in thread
From: Óscar Fuentes @ 2018-06-13 18:40 UTC
To: help-gnu-emacs
Drew Adams <drew.adams@oracle.com> writes:
> Is there a simple way to use `M-x grep' (e.g., giving it
> some switches or escape chars or replacing them with hex
> escapes or...) to search for some text that includes
> non-ASCII Unicode chars?
>
> [I'm using (an old) Cygwin `grep'. Dunno whether that
> matters.]
>
> I tried to look for "‘%s’" (curly-quote) in the Emacs
> source code.
>
> E.g., in `info.el' we now have this:
> (format "Index for ‘%s’" string) instead of this:
> (format "Index for `%s'" string)
>
> I wanted to see if this kind of change was spread to
> other files.
>
> I tried things like "\\x2018%s\\x2019", with no luck.
> Is there a simple approach that uses only `M-x grep'
> and not, say, piping the result of iconv to grep?
>
> I ended up doing the search using Icicles, but I'd
> like to be able to do such a search also using
> just `grep' (or `rgrep' etc.).
If there is a method, I'd like to know as well. This is the main reason
why I don't use Unicode in my source files.
(I investigated the matter last year, on this mailing list and on the
Internet. My conclusion was negative, at least for UTF-8.)
* RE: `grep' command on MS Windows with Cygwin, looking for text with Unicode chars
2018-06-13 18:40 ` Óscar Fuentes
@ 2018-06-13 19:09 ` Drew Adams
2018-06-13 19:16 ` Noam Postavsky
1 sibling, 0 replies; 15+ messages in thread
From: Drew Adams @ 2018-06-13 19:09 UTC
To: Óscar Fuentes, help-gnu-emacs
> If there is a method, I'd like to know as well. This is the main reason
> why I don't use Unicode in my source files.
>
> (I investigated the matter last year, on this mailing list and on the
> Internet. My conclusion was negative, at least for UTF-8.)
Thanks, Oscar. I searched a bit too.
* Re: `grep' command on MS Windows with Cygwin, looking for text with Unicode chars
2018-06-13 18:40 ` Óscar Fuentes
2018-06-13 19:09 ` Drew Adams
@ 2018-06-13 19:16 ` Noam Postavsky
2018-06-13 19:22 ` Noam Postavsky
2018-06-13 19:26 ` Drew Adams
1 sibling, 2 replies; 15+ messages in thread
From: Noam Postavsky @ 2018-06-13 19:16 UTC
To: Óscar Fuentes; +Cc: Help Gnu Emacs mailing list
On 13 June 2018 at 14:40, Óscar Fuentes <ofv@wanadoo.es> wrote:
> Drew Adams <drew.adams@oracle.com> writes:
>
>> Is there a simple way to use `M-x grep' (e.g., giving it
>> some switches or escape chars or replacing them with hex
>> escapes or...) to search for some text that includes
>> non-ASCII Unicode chars?
> If there is a method, I'd like to know as well. This is the main reason
> why I don't use Unicode in my source files.
This seems to do the right thing with the grep I have installed:
grep "[^[:cntrl:][:print:]]" *.el
According to the GNU grep manual [:cntrl:][:print:] looks equivalent
to Emacs' [:ascii:], in the C locale.
The grep I have installed doesn't seem to support anything but the C
locale anyway (at least, setting LANG isn't needed). It identifies
itself in the --help output as:
GNU grep version 2.0d
Win32 port with subdirectory search created by Tim Charron
(full source available at http://www.interlog.com/~tcharron/grep.html)
That web page indicates it's from 2001, but works well enough that
I've never bothered to change it. Not sure how Cygwin grep would act.
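A minimal sketch of how that bracket expression behaves, assuming a grep that honours `LC_ALL=C` (the 2001 Win32 port described above is not what is being exercised here). In the C locale, `[:cntrl:]` and `[:print:]` together cover the ASCII range 0x00-0x7F, so the negated set matches any byte of 0x80 or above. The sample file is invented.

```shell
# Hypothetical sample: two pure-ASCII lines (one containing a tab) and one
# line holding U+2018/U+2019 as UTF-8 bytes (\342\200\230 and \342\200\231).
sample=$(mktemp)
printf 'plain ascii line\n'                          >  "$sample"
printf 'tab\tseparated line\n'                       >> "$sample"
printf 'curly \342\200\230quoted\342\200\231 line\n' >> "$sample"

# In the C locale every byte is one character, so the negated bracket
# matches only bytes >= 0x80. -c prints the number of matching lines.
non_ascii_lines=$(LC_ALL=C grep -c '[^[:cntrl:][:print:]]' "$sample")
echo "$non_ascii_lines"
rm -f "$sample"
```

Only the third line should be counted; the tab in the second line is covered by `[:cntrl:]`.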
* Re: `grep' command on MS Windows with Cygwin, looking for text with Unicode chars
2018-06-13 19:16 ` Noam Postavsky
@ 2018-06-13 19:22 ` Noam Postavsky
2018-06-13 19:28 ` Drew Adams
2018-06-13 19:26 ` Drew Adams
1 sibling, 1 reply; 15+ messages in thread
From: Noam Postavsky @ 2018-06-13 19:22 UTC
To: Óscar Fuentes; +Cc: Help Gnu Emacs mailing list
On 13 June 2018 at 15:16, Noam Postavsky <npostavs@gmail.com> wrote:
> On 13 June 2018 at 14:40, Óscar Fuentes <ofv@wanadoo.es> wrote:
>> Drew Adams <drew.adams@oracle.com> writes:
>>
>>> Is there a simple way to use `M-x grep' (e.g., giving it
>>> some switches or escape chars or replacing them with hex
>>> escapes or...) to search for some text that includes
>>> non-ASCII Unicode chars?
>
>> If there is a method, I'd like to know as well. This is the main reason
>> why I don't use Unicode in my source files.
>
> This seems to do the right thing with the grep I have installed:
>
> grep "[^[:cntrl:][:print:]]" *.el
Oh, just realized you probably meant "text which includes particular
non-ASCII characters", not "text which includes any non-ASCII
characters".
Never mind me then.
* RE: `grep' command on MS Windows with Cygwin, looking for text with Unicode chars
2018-06-13 19:22 ` Noam Postavsky
@ 2018-06-13 19:28 ` Drew Adams
0 siblings, 0 replies; 15+ messages in thread
From: Drew Adams @ 2018-06-13 19:28 UTC
To: Noam Postavsky, Óscar Fuentes; +Cc: Help Gnu Emacs mailing list
> >>> Is there a simple way to use `M-x grep' (e.g., giving it
> >>> some switches or escape chars or replacing them with hex
> >>> escapes or...) to search for some text that includes
> >>> non-ASCII Unicode chars?
> >
> >> If there is a method, I'd like to know as well. This is the main
> >> reason why I don't use Unicode in my source files.
> >
> > This seems to do the right thing with the grep I have installed:
> > grep "[^[:cntrl:][:print:]]" *.el
>
> Oh, just realized you probably meant "text which includes particular
> non-ASCII characters", not "text which includes any non-ASCII
> characters".
Yes, I did. But that's OK. I learned something useful.
> Never mind me then.
Nope, sorry; can't do that. ;-)
* RE: `grep' command on MS Windows with Cygwin, looking for text with Unicode chars
2018-06-13 19:16 ` Noam Postavsky
2018-06-13 19:22 ` Noam Postavsky
@ 2018-06-13 19:26 ` Drew Adams
1 sibling, 0 replies; 15+ messages in thread
From: Drew Adams @ 2018-06-13 19:26 UTC
To: Noam Postavsky, Óscar Fuentes; +Cc: Help Gnu Emacs mailing list
> >> Is there a simple way to use `M-x grep' (e.g., giving it
> >> some switches or escape chars or replacing them with hex
> >> escapes or...) to search for some text that includes
> >> non-ASCII Unicode chars?
>
> > If there is a method, I'd like to know as well. This is the main reason
> > why I don't use Unicode in my source files.
>
> This seems to do the right thing with the grep I have installed:
>
> grep "[^[:cntrl:][:print:]]" *.el
>
> According to the GNU grep manual [:cntrl:][:print:] looks equivalent
> to Emacs' [:ascii:], in the C locale.
>
> The grep I have installed doesn't seem to support anything but the C
> locale anyway (at least, setting LANG isn't needed). It identifies
> itself in the --help output as:
>
> GNU grep version 2.0d
> Win32 port with subdirectory search created by Tim Charron
> (full source available at http://www.interlog.com/~tcharron/grep.html)
>
> That web page indicates it's from 2001, but works well enough that
> I've never bothered to change it. Not sure how Cygwin grep would act.
Interesting; thanks.
With my (old) Cygwin grep, in the `lisp' directory, that shows 4 hits,
3 in char-fold.el and one in mpc.el. The first char-fold.el hit shows
matches for curly quotes, for example. But I guess that won't help me
find just curly quotes. ;-)
In each case, the grep hits show octal escapes instead of Unicode-char glyphs.
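To read such escapes back, one can turn them into bytes again and dump the bytes in hex, then compare with the UTF-8 encoding of the character one expects. A small sketch, assuming a hit displayed `\342\200\230` (the actual output is not quoted in this thread, so that sequence is only an example):

```shell
# Turn the octal escapes back into bytes and dump them in hex.
# \342\200\230 is e2 80 98, the UTF-8 encoding of U+2018 (left single quote).
hex_bytes=$(printf '\342\200\230' | od -An -tx1 | tr -d '[:space:]')
echo "$hex_bytes"
```

The same check with `\342\200\231` gives the bytes for U+2019, the right single quote.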
* Re: `grep' command on MS Windows with Cygwin, looking for text with Unicode chars
2018-06-13 18:23 `grep' command on MS Windows with Cygwin, looking for text with Unicode chars Drew Adams
2018-06-13 18:40 ` Óscar Fuentes
@ 2018-06-13 19:08 ` Eli Zaretskii
2018-06-13 19:43 ` Tomas Nordin
1 sibling, 1 reply; 15+ messages in thread
From: Eli Zaretskii @ 2018-06-13 19:08 UTC
To: help-gnu-emacs
> Date: Wed, 13 Jun 2018 11:23:47 -0700 (PDT)
> From: Drew Adams <drew.adams@oracle.com>
>
> Is there a simple way to use `M-x grep' (e.g., giving it
> some switches or escape chars or replacing them with hex
> escapes or...) to search for some text that includes
> non-ASCII Unicode chars?
Not on MS-Windows with the native Windows build of Emacs, AFAIK. I
think you will need a Cygwin build of Emacs for that, and perhaps also
a newer Cygwin Grep.
Emacs on Windows cannot invoke subprograms with command-line arguments
encoded in anything but the system codepage. And Windows doesn't
support UTF-8 as the system codepage. Sorry.
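One possible way around that limit, sketched under the assumption that the grep in use supports the POSIX `-f FILE` option and reads the pattern file as raw bytes: keep the non-ASCII pattern in a file, so that the command line itself stays pure ASCII. This has not been tried with a native Windows Emacs, and the file names below are hypothetical.

```shell
# Hypothetical pattern file: one fixed string per line, here the UTF-8 bytes
# of U+2018, "%s", U+2019. In practice it would be written with an editor.
patfile=$(mktemp)
printf '\342\200\230%%s\342\200\231\n' > "$patfile"

# Hypothetical file to search.
target=$(mktemp)
printf '(format "Index for \342\200\230%%s\342\200\231" string)\n' >  "$target"
printf '(message "no curly quotes here")\n'                        >> "$target"

# The command line contains only ASCII: the pattern comes from the file.
# -F means fixed strings; -c prints the number of matching lines.
hits=$(LC_ALL=C grep -c -F -f "$patfile" "$target")
echo "$hits"
rm -f "$patfile" "$target"
```

If the pattern file is written from Emacs, it would presumably need to be saved as UTF-8 with no byte-order mark and with Unix line endings, since a stray BOM or CR may otherwise become part of the pattern.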
* Re: `grep' command on MS Windows with Cygwin, looking for text with Unicode chars
2018-06-13 19:08 ` Eli Zaretskii
@ 2018-06-13 19:43 ` Tomas Nordin
2018-06-14 2:33 ` Eli Zaretskii
0 siblings, 1 reply; 15+ messages in thread
From: Tomas Nordin @ 2018-06-13 19:43 UTC
To: Eli Zaretskii, help-gnu-emacs
Eli Zaretskii <eliz@gnu.org> writes:
>> Date: Wed, 13 Jun 2018 11:23:47 -0700 (PDT)
>> From: Drew Adams <drew.adams@oracle.com>
>>
>> Is there a simple way to use `M-x grep' (e.g., giving it
>> some switches or escape chars or replacing them with hex
>> escapes or...) to search for some text that includes
>> non-ASCII Unicode chars?
>
> Not on MS-Windows with the native Windows build of Emacs, AFAIK. I
> think you will need a Cygwin build of Emacs for that, and perhaps also
> a newer Cygwin Grep.
>
> Emacs on Windows cannot invoke subprograms with command-line arguments
> encoded in anything but the system codepage. And Windows doesn't
> support UTF-8 as the system codepage. Sorry.
Sorry for a naive question, but the "system codepage" or "current system
codepage" wording is used now and then in relation to non-ascii problems
on Windows. If on Windows, what is a good way to figure out the
current system codepage?
--
Tomas
* Re: `grep' command on MS Windows with Cygwin, looking for text with Unicode chars
2018-06-13 19:43 ` Tomas Nordin
@ 2018-06-14 2:33 ` Eli Zaretskii
2018-06-14 2:40 ` Eli Zaretskii
0 siblings, 1 reply; 15+ messages in thread
From: Eli Zaretskii @ 2018-06-14 2:33 UTC
To: help-gnu-emacs
> From: Tomas Nordin <tomasn@posteo.net>
> Date: Wed, 13 Jun 2018 21:43:52 +0200
>
> Sorry for a naive question, but the "system codepage" or "current system
> codepage" wording is used now and then in relation to non-ascii problems
> on Windows. If on Windows, what is a good way to figure out the
> current system codepage?
w32-system-coding-system is one variable that will tell you that.
* Re: `grep' command on MS Windows with Cygwin, looking for text with Unicode chars
2018-06-14 2:33 ` Eli Zaretskii
@ 2018-06-14 2:40 ` Eli Zaretskii
0 siblings, 0 replies; 15+ messages in thread
From: Eli Zaretskii @ 2018-06-14 2:40 UTC
To: help-gnu-emacs
> Date: Thu, 14 Jun 2018 05:33:48 +0300
> From: Eli Zaretskii <eliz@gnu.org>
>
> > Sorry for a naive question, but the "system codepage" or "current system
> > codepage" wording is used now and then in relation to non-ascii problems
> > on Windows. If on Windows, what is a good way to figure out the
> > current system codepage?
>
> w32-system-coding-system is one variable that will tell you that.
And w32-ansi-code-page is another.