From: Nordlöw
Newsgroups: gmane.emacs.help
Subject: Checking if a file is binary (non-textual)
Date: Mon, 28 Sep 2009 06:29:28 -0700 (PDT)
Message-ID: <3dcc7253-dbfc-431c-ac56-1c34bc7798ad@a6g2000vbp.googlegroups.com>
To: help-gnu-emacs@gnu.org

Which characters (bytes) should *not* be present in a text file that may
contain variable-length Unicode characters? What does the Unicode standard
say about this?

The reason for asking: I am working on a tool that unifies grep,
tags-query-replace, occur, etc., and I would really like it to have some
clever default behaviour for deciding how to present the search (grep)
hit context for different file types:

- textual files: show the whole line (as grep and occur do)
- binary files: either show no context and just report the match (like
  grep), or maybe show all [a-zA-Z0-9_]* directly before or after the hit
- ...

/Nordlöw
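
For illustration only, here is a minimal sketch of one possible heuristic for the question in this post. It is not taken from the post, and it is written in Python rather than Emacs Lisp. It assumes the file is meant to be UTF-8 and treats the data as binary if it contains a NUL byte or does not decode as well-formed UTF-8 (well-formed UTF-8 never contains the bytes 0xC0, 0xC1 or 0xF5 through 0xFF). NUL is itself a valid Unicode code point, so the NUL rule is a practical convention rather than something the standard requires. The names looks_binary and hit_context are hypothetical.

```python
import re

# Assumption: word characters are exactly the class given in the post.
WORD_BEFORE = re.compile(rb"[a-zA-Z0-9_]*\Z")
WORD_AFTER = re.compile(rb"[a-zA-Z0-9_]*")


def looks_binary(data: bytes) -> bool:
    """Heuristic: binary if there is a NUL byte or the data is not valid UTF-8.

    Pass the whole file (or a chunk that ends on a character boundary);
    a chunk cut in the middle of a multi-byte character would be misjudged.
    """
    if b"\x00" in data:
        return True
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


def hit_context(data: bytes, start: int, end: int) -> bytes:
    """Return the context to display for a match at data[start:end]."""
    if looks_binary(data):
        # Binary file: keep only the word characters touching the hit.
        before = WORD_BEFORE.search(data[:start]).group()
        after = WORD_AFTER.match(data[end:]).group()
        return before + data[start:end] + after
    # Textual file: return the whole line containing the hit.
    line_start = data.rfind(b"\n", 0, start) + 1
    line_end = data.find(b"\n", end)
    if line_end == -1:
        line_end = len(data)
    return data[line_start:line_end]
```

One known limitation of this sketch: text in a legacy 8-bit encoding such as ISO-8859-1 usually fails the UTF-8 check and is therefore reported as binary, so a real tool would probably want to try other encodings before giving up.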