From: rasmith@tamu.edu
To: eliz@gnu.org
Cc: help-gnu-emacs@gnu.org
Subject: Re: search-forward in emacs23 lisp
Date: Mon, 29 Mar 2010 10:01:17 -0500 (CDT)
Message-ID: <20100329.100117.57809164175956070.rasmith@aristotle.tamu.edu>
In-Reply-To: <831vf339k4.fsf@gnu.org>

From: Eli Zaretskii <eliz@gnu.org>
Subject: Re: search-forward in emacs23 lisp
Date: Mon, 29 Mar 2010 09:51:07 +0300

>> From: bojohan@gnu.org (Johan Bockgård)
>> Date: Mon, 29 Mar 2010 01:00:45 +0200
>> Cc: 
>> 
>> There does seem to be a bug regarding search in unibyte buffers,
> 
> Please report this ASAP to the Emacs bug-tracker.  Emacs 23.2 is in
> the last stages of pretest, and so we should not waste any time
> discussing bugs here, if we want them to be fixed in the next release.
> 

After further investigation, I'm not certain it's a bug: it may be an
intentional part of the modifications to accommodate utf-8.  Here are
the details:

In a multibyte buffer (set-buffer-multibyte t):

(search-forward (char-to-string ?\xff)) matches utf-8 "ÿ" (i.e. \303\277)
(search-forward (char-to-string ?\377)) matches utf-8 "ÿ"
(search-forward (unibyte-string ?\377)) matches byte \377

In a unibyte buffer (set-buffer-multibyte nil):

(search-forward (char-to-string ?\xff)) matches \231\277
(search-forward (char-to-string ?\377)) matches \231\277
(search-forward (unibyte-string ?\377)) matches \231\277

In other words, search-forward cannot find byte \377 when searching in
a *unibyte* buffer, but it can find that same byte if the buffer is
changed to multibyte.  The reason is that in a unibyte buffer,
search-forward apparently changes byte \377 to a two-byte
representation (but not to utf-8, which would be \303\277).  
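The experiments above can be repeated with a small helper.  This is only
a sketch: `my-try-search' is a hypothetical name, the test buffer holds
just the single raw byte \377 between ASCII text (no utf-8 "ÿ"), and the
results may differ between Emacs versions, so no expected values are
given for the \377 searches.

```elisp
;; Hypothetical helper, not part of Emacs: search a small test buffer
;; that contains the raw byte \377 between "ab" and "cd".
(defun my-try-search (multibyte needle)
  "Search for NEEDLE in a test buffer and return the position after it.
Return nil if NEEDLE is not found.  If MULTIBYTE is non-nil, the buffer
is switched to multibyte before searching; otherwise it stays unibyte."
  (with-temp-buffer
    ;; Insert the bytes while the buffer is unibyte so \377 goes in as-is.
    (set-buffer-multibyte nil)
    (insert "ab" (unibyte-string #o377) "cd")
    ;; Reinterpret the same bytes as multibyte text if requested.
    (set-buffer-multibyte multibyte)
    (goto-char (point-min))
    (search-forward needle nil t)))

;; The three search strings from the message, tried in a unibyte buffer:
;; (my-try-search nil (char-to-string ?\xff))
;; (my-try-search nil (char-to-string ?\377))
;; (my-try-search nil (unibyte-string ?\377))
```

Passing t instead of nil as the first argument runs the same searches
in a multibyte buffer.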

The code I had a problem with can be fixed by using char-after
(or, more elegantly, as I've now learned, skip-chars-forward).
However, there is probably other code out there that is now broken
because of this.  Is it a bug, or was it a mistake to expect
search-forward to find a single high byte in a unibyte buffer in the
first place?
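
A minimal sketch of the char-after approach mentioned above, assuming
the current buffer is unibyte (so char-after returns plain byte values
0..255).  `my-find-byte' is a hypothetical name, not an Emacs function,
and this is not necessarily how the original code was fixed.

```elisp
;; Hypothetical helper: locate a byte by comparing char-after values
;; instead of relying on search-forward.
(defun my-find-byte (byte)
  "Move point just past the next occurrence of BYTE at or after point.
Return the new position, or nil (leaving point unchanged) if BYTE does
not occur.  Intended for unibyte buffers."
  (let ((pos (point))
        found)
    (while (and (not found) (< pos (point-max)))
      (if (eq (char-after pos) byte)
          (setq found (1+ pos))   ; same convention as search-forward
        (setq pos (1+ pos))))
    (when found
      (goto-char found))))

;; Usage in a unibyte buffer:
;; (goto-char (point-min))
;; (my-find-byte #o377)
```

Like search-forward, the helper leaves point after the match, so it can
stand in for a single-byte search in existing code.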

Robin Smith
