From: "Stephen J. Turnbull" <stephen@xemacs.org>
To: Lennart Borgman <lennart.borgman@gmail.com>
Cc: Emacs-Devel devel <emacs-devel@gnu.org>
Subject: What does Emacs on w32 know that grep can't figure out?
Date: Fri, 01 Oct 2010 13:00:02 +0900
Message-ID: <874od6bm0t.fsf@uwakimon.sk.tsukuba.ac.jp>
In-Reply-To: <AANLkTimmh0TZE2oVQV4A_7ijE5JfBD5obibDr992TYG=@mail.gmail.com>

Lennart Borgman writes:

 > However trying to search this file from a cmd prompt with (gnuwin32)
 > grep does not work.

No, it almost certainly won't.  grep is byte-oriented and doesn't know
anything about coding systems.  On Unix with a UTF-8-capable terminal
you would do something like

   iconv -f UTF-16 -t UTF-8 "$FILE" | grep some-string

I would think that either Cygwin or Windows provides a version of
iconv.  If not, changing the file to UTF-8 (instead of UTF-16) using
Emacs should make it grep'able.  In some cases grep may think this is
a binary file anyway; if so, use the --text switch to force grep to
treat the file as text.
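
For readers without iconv at hand, the same decode-then-search idea can
be sketched in Python.  This is only an illustration: the function name
grep_utf16 and the sample data are made up for the example, and the
input is assumed to be UTF-16 with a byte-order mark.

```python
import re

def grep_utf16(data: bytes, pattern: str) -> list[str]:
    """Decode UTF-16 bytes, then return the lines matching pattern.

    Hypothetical helper for illustration; assumes the data starts
    with a byte-order mark so the "utf-16" codec can pick the order.
    """
    text = data.decode("utf-16")
    return [line for line in text.splitlines() if re.search(pattern, line)]

# Sample data standing in for the contents of a UTF-16 file.
sample = "first line\nsome-string here\nlast line\n".encode("utf-16")

# A byte-oriented search for the ASCII bytes fails, because every
# character is stored as two bytes in UTF-16.
print(b"some-string" in sample)

# Searching the decoded text finds the line.
print(grep_utf16(sample, "some-string"))
```

The byte-level test is the situation a byte-oriented grep is in; the
second call corresponds to piping the file through iconv first.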

 > And it does not work with cygwin grep either. They think it is a
 > binary file (even though I changed the line delimiter to unix
 > style).

The EOL delimiter is not a problem.  grep should ignore the presence
or absence of CR when checking for binary files.  The only time it is
likely to matter is if you are searching for a word at the end of a
line, in which case instead of "word$" you can use "word\015?$" (015
being the octal code for CR) or something like that.  Even that may be
unnecessary; grep may be EOL-agnostic these days.
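
The effect of a trailing CR on an end-of-line match can be shown with
Python's re module, used here only as a stand-in for grep (the two
regex dialects differ, and the sample lines are invented):

```python
import re

# Lines as they would look after splitting on LF only: lines from a
# DOS-style file keep their trailing CR.
lines = ["a word\r", "a word", "wordy\r"]

# "$" does not skip over a CR, so the DOS-style line is missed.
strict = [l for l in lines if re.search(r"word$", l)]

# Allowing an optional CR before the end of the line catches both.
tolerant = [l for l in lines if re.search(r"word\r?$", l)]

print(strict)
print(tolerant)
```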

Now, of course they think a UTF-16-encoded file is a binary file.  It
almost certainly contains NUL bytes (because an ASCII or Latin-1
character will always have a trailing NUL in UTF-16LE).
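
A short Python sketch makes the NUL bytes visible.  The looks_binary
function is a hypothetical, much simplified stand-in for the kind of
check grep performs, not grep's actual code:

```python
def looks_binary(data: bytes) -> bool:
    """Simplified heuristic (an assumption): any NUL byte means binary."""
    return b"\x00" in data

text = "grep"
utf16le = text.encode("utf-16-le")
utf8 = text.encode("utf-8")

# In UTF-16LE each ASCII character is followed by a NUL byte.
print(utf16le)                 # b'g\x00r\x00e\x00p\x00'
print(looks_binary(utf16le))   # True
print(looks_binary(utf8))      # False
```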

 > What is going on? Is grep sometimes useless on w32 now, or? (How do we
 > handle that in Emacs?)

Emacs tries to guess what the encoding is if you don't specify it.  It
may guess wrong in certain cases, but it should be extremely accurate
in the case of any Unicode format.
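
One reason Unicode formats are easy to recognize is the byte-order mark
that such files often begin with.  The sketch below checks for it in
Python.  It is not Emacs's detection logic, which is not described in
this message; sniff_bom is a hypothetical name, and files without a BOM
are simply reported as unknown.

```python
import codecs

def sniff_bom(data: bytes):
    """Return a codec name if data starts with a known BOM, else None.

    Illustrative only.  UTF-32 is ignored, so a UTF-32LE file would be
    misreported as UTF-16LE because its BOM starts with the same bytes.
    """
    candidates = (
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16-le"),
        (codecs.BOM_UTF16_BE, "utf-16-be"),
    )
    for bom, name in candidates:
        if data.startswith(bom):
            return name
    return None

print(sniff_bom(codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")))
print(sniff_bom(b"plain ASCII"))
```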


