* search and replace codepoints
From: Haines Brown @ 2014-10-24 18:50 UTC (permalink / raw)
To: help-gnu-emacs
I frequently paste hyphenated material into a large .bib file, and the
pasted hyphen sometimes turns out to be a codepoint that LaTeX can't compile.
The typed or pasted hyphen that does not cause a problem looks like
this:
character: - (displayed as -) (codepoint 45, #o55, #x2d)
preferred charset: ascii (ASCII (ISO646 IRV))
code point in charset: 0x2D
category: .:Base, a:ASCII, l:Latin, r:Roman
buffer code: #x2D
file code: #x2D (encoded by coding system utf-8-unix)
The pasted hyphen that LaTeX can't compile looks like this:
character: (displayed as ) (codepoint 173, #o255, #xad)
preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0xAD
category: b:Arabic, h:Korean, j:Japanese, l:Latin
buffer code: #xC2 #xAD
file code: #xC2 #xAD (encoded by coding system utf-8-unix)
How do I do a search/replace to replace instances of the latter with the
former? What values should I use? Why is the unicode character not
identified with the usual U+...?
Haines Brown
* Re: search and replace codepoints
From: Eli Zaretskii @ 2014-10-24 19:33 UTC (permalink / raw)
To: help-gnu-emacs
> From: Haines Brown <haines@histomat.net>
> Date: Fri, 24 Oct 2014 14:50:23 -0400
>
> The pasted hyphen that LaTeX can't compile looks like this:
>
> character: (displayed as ) (codepoint 173, #o255, #xad)
> preferred charset: unicode (Unicode (ISO10646))
> code point in charset: 0xAD
> category: b:Arabic, h:Korean, j:Japanese, l:Latin
> buffer code: #xC2 #xAD
> file code: #xC2 #xAD (encoded by coding system utf-8-unix)
>
> How do I do a search/replace to replace instances of the latter with the
> former?
Just replace it with M-%, as you would any other character.
> What values should I use?
The one shown above, of course.
> Why is the unicode character not identified with the usual U+...?
The codepoint #xad _is_ the Unicode codepoint, U+00AD (SOFT HYPHEN).
* Re: search and replace codepoints
From: Haines Brown @ 2014-10-24 20:15 UTC (permalink / raw)
To: help-gnu-emacs
Eli Zaretskii <eliz@gnu.org> writes:
>> From: Haines Brown <haines@histomat.net>
>> Date: Fri, 24 Oct 2014 14:50:23 -0400
>> How do I do a search/replace to replace instances of the latter with the
>> former?
>
> Just replace it with M-%, as you would any other character.
>
>> What values should I use?
>
> The one shown above, of course.
Many codings were shown above (45, #o55, #x2d, 0x2d for just one
character). Because none of them worked, I asked the question.
* Re: search and replace codepoints
From: Álvar Ibeas @ 2014-10-24 21:11 UTC (permalink / raw)
To: help-gnu-emacs
Hello,
>>> How do I do a search/replace to replace instances of the latter with the
>>> former?
You may copy the character into the kill ring to yank it when prompted
for the first argument of replace-string. You can also evaluate the
following:
(replace-string "\u00ad" "-")
The purpose of the soft hyphen is to mark a possible hyphenation break.
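
One caveat: the Emacs documentation describes `replace-string` as meant for interactive use, and the usual idiom in Lisp code is a `search-forward` / `replace-match` loop. A minimal sketch of that idiom, assuming a hypothetical helper name `my-replace-soft-hyphens` (it is not part of Emacs):

```elisp
;; Hypothetical helper, not part of Emacs: a sketch of the
;; search-forward / replace-match idiom for this particular character.
(defun my-replace-soft-hyphens ()
  "Replace every SOFT HYPHEN (U+00AD) in the current buffer with \"-\".
Return the number of replacements made."
  (interactive)
  (let ((count 0))
    (save-excursion
      (goto-char (point-min))
      ;; The escape \u00ad in a string denotes the character with codepoint #xad.
      (while (search-forward "\u00ad" nil t)
        ;; FIXEDCASE and LITERAL are both t: insert "-" exactly as given.
        (replace-match "-" t t)
        (setq count (1+ count))))
    count))
```

With the .bib buffer current, `M-x my-replace-soft-hyphens` (or `M-: (my-replace-soft-hyphens)`) should replace all occurrences and leave point where it was.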
* Re: search and replace codepoints
From: Eli Zaretskii @ 2014-10-25 6:27 UTC (permalink / raw)
To: help-gnu-emacs
> From: Haines Brown <haines@histomat.net>
> Date: Fri, 24 Oct 2014 16:15:15 -0400
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> >> From: Haines Brown <haines@histomat.net>
> >> Date: Fri, 24 Oct 2014 14:50:23 -0400
> >> How do I do a search/replace to replace instances of the latter with the
> >> former?
> >
> > Just replace it with M-%, as you would any other character.
> >
> >> What values should I use?
> >
> > The one shown above, of course.
>
> Many codings were shown above (45, #o55, #x2d, 0x2d for just one
> character).
No, I meant only the values I quoted in my message:
> character: (displayed as ) (codepoint 173, #o255, #xad)
They are all the same value, decimal 173, shown in decimal, in octal,
and in hex.
> Because none of them worked, I asked the question.
How did you try using them in a replace command? What I had in mind
was to use "C-x 8 RET", which allows you to type the codepoint in hex.
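
To check that the three notations name one value, the forms below can be evaluated with `M-:` or in the *scratch* buffer. This is only an illustrative sketch; `my-codepoint-label` is a hypothetical helper name, not an Emacs function.

```elisp
;; Decimal, octal and hexadecimal read syntax denote the same integer.
(= 173 #o255)   ; non-nil
(= 173 #xad)    ; non-nil

;; A character in Emacs Lisp is that same integer.
(= ?\u00ad #xad) ; non-nil

;; Hypothetical helper: format a codepoint in the usual U+XXXX notation.
(defun my-codepoint-label (char)
  "Return CHAR's codepoint as a string of the form \"U+XXXX\"."
  (format "U+%04X" char))

(my-codepoint-label 173) ; => "U+00AD"
(my-codepoint-label ?-)  ; => "U+002D"
```

So at the M-% prompt, typing `C-x 8 RET ad RET` should insert the soft hyphen as the search string, and a plain `-` serves as the replacement.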