From: Tyler Spivey
Newsgroups: gmane.emacs.help
Subject: Re: Making re-search-forward search for \377
Date: Sun, 02 Nov 2008 14:35:21 -0800
Message-ID: <87ljw1pxhy.fsf@pcdesk.net>
References: <87tzaqporw.fsf@pcdesk.net> <87prlepk45.fsf@pcdesk.net>
To: help-gnu-emacs@gnu.org
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.60 (gnu/linux)

Xah writes:

> Xah Lee wrote:
>> Xah writes:
>> > what's the C-q 377 char?
>>
>> > if i press Ctrl+q 377 Enter, i get this char: ÿ, which is LATIN SMALL
>> > LETTER Y WITH DIAERESIS (unicode U+00FF).
>>
>> > Then if i do:
>>
>> > (re-search-forward "ÿ")
>
> Tyler Spivey wrote:
>> I'm probably going to end up working with binary data in a temp
>> buffer. Doing more research, I want enable-multibyte-characters to be
>> off. Given that, if we go to *scratch*
>> and run M-X toggle-enable-multibyte-characters until that variable
>> becomes nil, doing C-Q 377 RET gives 0xff, which is what I want
>> (according to C-x =, C-u C-x = and M-x describe-char). Now to
>> match it, I try:
>>
>> (re-search-forward "\xff") - no luck
>

I've done yet more digging, and it seems that I need to use
raw-text-unix encoding. I've sort of got this to work, and this next
example is more like what I'm doing; the smallest part that seems to
fail:

(progn
  (setq re1 "\377\371")
  (setq re2 "\\(\377\371\\)")
  (insert (decode-coding-string "line 1\nline 2\377\371" 'raw-text-unix)))

Evaluate that in an empty buffer, and then run
M-: (re-search-forward re1) RET
at the beginning of the text after the sexp. Then try
M-: (re-search-forward re2) RET
from just after the sexp. re1 matches fine, but re2 won't match.

What am I missing here? I thought that putting parens around re1 to
get re2 should give me the same expression but with capturing.

Here are details on my emacs version:

GNU Emacs 23.0.60.1 (x86_64-unknown-linux-gnu, GTK+ Version 2.14.4)
 of 2008-11-01 on arch1

I tested this in 22.3, and it seems to work.
In reading the NEWS file for 23, I see changes in character set
handling. What do I need to do to make re2 match what re1 does but
with capturing? I realize that in this case I can probably use
(match-string 0), but the full RE that I'm going to eventually be
matching on is this:

"\\(\377[\371\357]\\)\\|\\(\n\\)"

Any help would be appreciated.

- Tyler