From: Eli Zaretskii <eliz@gnu.org>
To: Drew Adams <drew.adams@oracle.com>
Cc: cyd@gnu.org, 12054@debbugs.gnu.org
Subject: bug#12054: 24.1; regression? font-lock no-break-space with nil nobreak-char-display
Date: Sat, 03 Nov 2012 23:13:40 +0200
Message-ID: <83pq3u4cfv.fsf@gnu.org>
In-Reply-To: <0B444DBDD1D14FD7B5EDE10E30ED320D@us.oracle.com>

> From: "Drew Adams" <drew.adams@oracle.com>
> Date: Sat, 3 Nov 2012 12:01:29 -0700
> Cc: 12054@debbugs.gnu.org
> 
> I think I understand this (but I might be misunderstanding).  The \240 in the
> 4-char ASCII regexp string "\240" is interpreted (read?) as a raw byte, not as
> the char I wanted.

Yes.

> That is, the literal string in my code is read as a string that contains only a
> single raw byte of octal 240 in place of the 4 chars \240 (and instead of as a
> string with the multibyte char no-break space).  Is that right?

Yes.

> And putting that together with Eli's statement about insertion ("'insert' treats
> strings such as "\nnn" as unibyte strings"), I understand that the buffer text
> after I type `C-q 240' contains a unibyte raw byte, and not the multibyte char
> no-break space.

No.  It contains the NBSP.  Try it.  C-q inserts a multibyte
character, unlike '(insert "\240")', for example.
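
A minimal sketch of the string-level difference, assuming Emacs 23 or
later, where an octal escape below 256 makes the reader produce a
unibyte string.  The function name `demo-octal-vs-unicode' is
hypothetical and exists only for this illustration:

```elisp
;; Sketch only; assumes Emacs 23 or later.
;; `demo-octal-vs-unicode' is a hypothetical name, not part of Emacs.
(defun demo-octal-vs-unicode ()
  "Compare a string written with an octal escape to one written with \\u."
  (let ((octal   "\240")      ; octal escape: unibyte string, one raw byte
        (unicode "\u00A0"))   ; Unicode escape: multibyte string holding NBSP
    (list (multibyte-string-p octal)      ; nil
          (multibyte-string-p unicode)    ; t
          (string= octal unicode))))      ; nil: the raw byte is not NBSP

(demo-octal-vs-unicode)
```

The two literals look interchangeable in source code, but only the
second one names the no-break space character.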

> But in that case I do not understand why `C-u C-x =' says that it _is_ the
> Unicode no-break space char.

Because it is.

> And I do not understand why Yidong's font-lock correction also shows
> that it is a no-break space char.

Chong didn't use "\240".

> So I'm confused about what is actually in the buffer.  From the doc and from
> Eli's statement, I gather that there is a unibyte raw byte (octal 240) at that
> position.  But `C-u C-x =' and font-lock seem to tell me that there is a
> (multibyte) no-break space char there.

Try '(insert "\240")'; "C-x =" will then report a raw byte, not NBSP.
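
The same check can be sketched non-interactively, assuming Emacs 23 or
later and a multibyte buffer.  `demo-inserted-char' is a hypothetical
helper written for this illustration, not an Emacs function:

```elisp
;; Sketch only; assumes Emacs 23 or later.
;; `demo-inserted-char' is a hypothetical helper, not part of Emacs.
(defun demo-inserted-char (string)
  "Insert STRING into a temporary multibyte buffer.
Return the character just before point afterwards."
  (with-temp-buffer
    (set-buffer-multibyte t)
    (insert string)
    (char-before)))

;; #xA0 is the character code of NO-BREAK SPACE.
(eq (demo-inserted-char "\240") #xA0)    ; nil: a raw byte was inserted
(eq (demo-inserted-char "\u00A0") #xA0)  ; t: the NBSP character
```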

> > (One reason for doing this is to allow unibyte strings to
> > be specified using string constants in Emacs Lisp source code.)
> 
> I can see how that can be useful.  But I can also see how it would be useful to
> have some way of using octal syntax to match multibyte chars.  Isn't there some
> reasonable way to allow for both?

Maybe, but we didn't find one, at least not one that would be
backward-compatible.

> Is there, for example, (or could there be added) a function that one can apply
> to the unibyte string for \240 that would convert it to a string that DTRT wrt
> multibyte?

Such functions do exist, see the "Converting Representations" node in
the ELisp manual.
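
As a rough sketch (assuming Emacs 23 or later), changing the
representation of a string is not the same as decoding it.  The
function name below is hypothetical, and treating the byte as Latin-1
is an assumption made only for this example:

```elisp
;; Sketch only; assumes Emacs 23 or later.
;; `demo-convert-raw-byte' is a hypothetical name, not part of Emacs.
(defun demo-convert-raw-byte ()
  "Show two ways of turning the unibyte string \"\\240\" into a multibyte one."
  (let ((raw "\240"))                     ; unibyte string, one raw byte
    (list
     ;; Changes only the representation: still a raw byte, not NBSP.
     (string= (string-to-multibyte raw) "\u00A0")
     ;; Decoding interprets the byte in a charset; Latin-1 #xA0 is NBSP.
     (string= (decode-coding-string raw 'latin-1) "\u00A0"))))

(demo-convert-raw-byte)   ; (nil t)
```

Which conversion is appropriate depends on what the bytes are supposed
to mean, so the coding system has to be chosen to match the data.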

> (decode-coding-string "\302\240" 'utf-8)
> 
> That allows use of only octal syntax - good.  But it still doesn't solve the
> problem for older Emacs versions - they raise the error (coding-system-error
> utf-8).

You don't want this, because even if you succeed in producing a NBSP
in Emacs 22 and older, the result will not match NBSP in other
charsets.  It's simply impossible with those versions of Emacs.




