unofficial mirror of help-gnu-emacs@gnu.org
From: Tyler Spivey <tspivey@pcdesk.net>
To: help-gnu-emacs@gnu.org
Subject: Re: Making re-search-forward search for \377
Date: Sun, 02 Nov 2008 14:35:21 -0800
Message-ID: <87ljw1pxhy.fsf@pcdesk.net>
In-Reply-To: <dc53e4fd-316c-44a9-9f7f-d7455191b623@e1g2000pra.googlegroups.com>

Xah <xahlee@gmail.com> writes:

> Xah Lee wrote:
>> Xah<xah...@gmail.com> writes:
>> > what's the C-q 377 char?
>>
>> > if i press Ctrl+q 377 Enter, i get this char: ÿ, which is LATIN SMALL
>> > LETTER Y WITH DIAERESIS (unicode U+00FF).
>>
>> > Then if i do:
>>
>> > (re-search-forward "ÿ")
>
> Tyler Spivey wrote:
>> I'm probably going to end up working with binary data in a temp
>> buffer. Doing more research, I want enable-multibyte-characters to be
>> off. Given that, if we go to *scratch*
>> and run M-X toggle-enable-multibyte-characters until that variable
>> becomes nil, doing C-Q 377 RET gives 0xff, which is what I want
>> (according to C-x =, C-u C-x = and M-x describe-char). Now to
>> match it, I try:
>>
>> (re-search-forward "\xff") - no luck
>
I've done yet more digging, and it seems that I need to use
raw-text-unix encoding. I've sort of got this to work, and this next
example is more like what I'm doing; the smallest part that seems to
fail:
(progn
  (setq re1 "\377\371")
  (setq re2 "\\(\377\371\\)")
  (insert (decode-coding-string "line 1\nline 2\377\371" 'raw-text-unix)))

Evaluate that in an empty buffer, then move point to the start of the
inserted text (just after the sexp) and run
M-: (re-search-forward re1) RET.
Then go back to just after the sexp and try
M-: (re-search-forward re2) RET.

re1 matches fine, but re2 won't match. What am I missing here? I
thought that wrapping re1 in \\( and \\) to get re2 would give me the
same expression, just with a capturing group.

Here are the details of my Emacs version:
GNU Emacs 23.0.60.1 (x86_64-unknown-linux-gnu, GTK+ Version 2.14.4) of 2008-11-01 on arch1

I tested this in 22.3, and there it seems to work. The NEWS file for
23 mentions changes in character set handling. What do I need to do to
make re2 match what re1 does, but with capturing? I realize that in
this case I could probably use (match-string 0), but the full RE that
I'm eventually going to match on is this:
"\\(\377[\371\357]\\)\\|\\(\n\\)"
Any help would be appreciated.
- Tyler



Thread overview: 9+ messages
2008-11-02  7:31 Making re-search-forward search for \377 Tyler Spivey
2008-11-02  8:45 ` Xah
2008-11-02  9:12   ` Tyler Spivey
2008-11-02 18:10     ` Kevin Rodgers
2008-11-02 20:32     ` Xah
2008-11-02 22:35       ` Tyler Spivey [this message]
2008-11-03  4:21     ` Eli Zaretskii
     [not found]     ` <mailman.2743.1225686066.25473.help-gnu-emacs@gnu.org>
2008-11-03  4:54       ` Tyler Spivey
2008-11-03 19:42         ` Eli Zaretskii
