From: Robert Pluim <rpluim@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: emacs-devel@gnu.org
Subject: Re: emacs-28 b7d7c2d9e9: Add cross-reference to alternative syntaxes for Unicode
Date: Tue, 18 Oct 2022 17:05:44 +0200
Message-ID: <87pmepuq7r.fsf@gmail.com>
In-Reply-To: <83zgdy714g.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 14 Oct 2022 20:42:39 +0300")
>>>>> On Fri, 14 Oct 2022 20:42:39 +0300, Eli Zaretskii <eliz@gnu.org> said:
>> Hmm, so move or copy "General Escape Syntax" under
>> (emacs)International somewhere, and refer to it from the "You can
>> insert non-ASCII characters or search for them" section of that node
>> (since thatʼs where we talk about C-x 8)?
Eli> Yes, something like that.
Hereʼs a rough attempt:
diff --git c/doc/emacs/custom.texi i/doc/emacs/custom.texi
index 2bc1d3820d..817501b3f8 100644
--- c/doc/emacs/custom.texi
+++ i/doc/emacs/custom.texi
@@ -2794,9 +2794,8 @@ Init Non-ASCII
An alternative to using non-@acronym{ASCII} characters directly is
to use one of the character escape syntaxes described in
-@pxref{General Escape Syntax,,, elisp, The Emacs Lisp Reference
-Manual}, as they allow all Unicode codepoints to be specified using
-only @acronym{ASCII} characters.
+@ref{Character Escape Syntax}, as they allow all Unicode codepoints
+to be specified using only @acronym{ASCII} characters.
To bind non-@acronym{ASCII} keys, you must use a vector (@pxref{Init
Rebinding}). The string syntax cannot be used, since the
diff --git c/doc/emacs/mule.texi i/doc/emacs/mule.texi
index f87c1252d3..c202c21aa4 100644
--- c/doc/emacs/mule.texi
+++ i/doc/emacs/mule.texi
@@ -56,7 +56,9 @@ International
your keyboard can produce non-@acronym{ASCII} characters, you can select an
appropriate keyboard coding system (@pxref{Terminal Coding}), and Emacs
will accept those characters. Latin-1 characters can also be input by
-using the @kbd{C-x 8} prefix, see @ref{Unibyte Mode}.
+using the @kbd{C-x 8} prefix, see @ref{Unibyte Mode}. It is also
+possible to write non-@acronym{ASCII} characters using various
+pure-@acronym{ASCII} escape syntaxes; see @ref{Character Escape Syntax}.
With the X Window System, your locale should be set to an appropriate
value to make sure Emacs interprets keyboard input correctly; see
@@ -67,6 +69,7 @@ International
@menu
* International Chars:: Basic concepts of multibyte characters.
+* Character Escape Syntax:: Alternative ways to write characters.
* Language Environments:: Setting things up for the language you use.
* Input Methods:: Entering text characters not on your keyboard.
* Select Input Method:: Specifying your choice of input methods.
@@ -240,6 +243,63 @@ International Chars
decomposition: (101 770) ('e' '^')
@end smallexample
+@c This is (almost) verbatim from "General Escape Syntax" in the Emacs
+@c Lisp Reference Manual, please keep in sync.
+@node Character Escape Syntax
+@section Character Escape Syntax
+
+ Input methods provide ways to enter non-@acronym{ASCII} characters,
+but sometimes it is more convenient to use an @acronym{ASCII}-only
+representation, e.g., when there are several similar characters that
+are hard to visually distinguish. Emacs provides several types of
+escape syntax that you can use to write such characters:
+
+@enumerate
+@item
+@cindex @samp{\} in character constant
+@cindex backslash in character constants
+@cindex unicode character escape
+You can specify characters by their Unicode names, if any.
+@code{?\N@{@var{NAME}@}} represents the Unicode character named
+@var{NAME}. Thus, @samp{?\N@{LATIN SMALL LETTER A WITH GRAVE@}} is
+equivalent to @code{?à} and denotes the Unicode character U+00E0. To
+simplify entering multi-line strings, you can replace spaces in the
+names by non-empty sequences of whitespace (e.g., newlines).
+
+@item
+You can specify characters by their Unicode values.
+@code{?\N@{U+@var{X}@}} represents a character with Unicode code point
+@var{X}, where @var{X} is a hexadecimal number. Also,
+@code{?\u@var{xxxx}} and @code{?\U@var{xxxxxxxx}} represent code
+points @var{xxxx} and @var{xxxxxxxx}, respectively, where each @var{x}
+is a single hexadecimal digit. For example, @code{?\N@{U+E0@}},
+@code{?\u00e0} and @code{?\U000000E0} are all equivalent to @code{?à}
+and to @samp{?\N@{LATIN SMALL LETTER A WITH GRAVE@}}. The Unicode
+Standard defines code points only up to @samp{U+@var{10ffff}}, so if
+you specify a code point higher than that, Emacs signals an error.
+
+@item
+You can specify characters by their hexadecimal character
+codes. A hexadecimal escape sequence consists of a backslash,
+@samp{x}, and the hexadecimal character code. Thus, @samp{?\x41} is
+the character @kbd{A}, @samp{?\x1} is the character @kbd{C-a}, and
+@code{?\xe0} is the character @kbd{à} (@kbd{a} with grave accent).
+You can use any number of hex digits, so you can represent any
+character code in this way.
+
+@item
+@cindex octal character code
+You can specify characters by their character code in
+octal. An octal escape sequence consists of a backslash followed by
+up to three octal digits; thus, @samp{?\101} for the character
+@kbd{A}, @samp{?\001} for the character @kbd{C-a}, and @code{?\002}
+for the character @kbd{C-b}. Only characters up to octal code 777 can
+be specified this way.
+
+@end enumerate
+
+ These escape sequences may also be used in strings.
+
@node Language Environments
@section Language Environments
@cindex language environments
diff --git c/doc/lispref/objects.texi i/doc/lispref/objects.texi
index a715b45a6c..35f413c5a5 100644
--- c/doc/lispref/objects.texi
+++ i/doc/lispref/objects.texi
@@ -440,6 +440,8 @@ Basic Char Syntax
you should write an extra space after the character constant to
separate it from the following text.)
+@c This is reproduced in "Character Escape Syntax" in the Emacs
+@c manual, please keep in sync.
@node General Escape Syntax
@subsubsection General Escape Syntax
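
As a quick illustration of the four syntaxes documented in the new node, each of the following Emacs Lisp forms should evaluate to t. This is only a minimal sketch, assuming Emacs 28 or a similarly recent version; the example string is made up for illustration and is not part of the patch.

```elisp
;; Evaluate each form separately, e.g. with C-x C-e in *scratch*.

;; By Unicode name, by code point, and via \u / \U: all denote U+00E0.
(= ?\N{LATIN SMALL LETTER A WITH GRAVE}
   ?\N{U+E0}
   ?\u00e0
   ?\U000000E0
   #xe0)

;; Hexadecimal and octal character codes.
(= ?\x41 ?\101 ?A)
(= ?\x1 ?\001 1)        ; C-a
(= ?\002 2)             ; C-b

;; The same escapes inside strings (illustrative text only).
(string= "\N{LATIN SMALL LETTER A WITH GRAVE} la carte"
         "\u00e0 la carte")
```

The string example uses \u rather than \x because \u always takes exactly four hex digits, so there is no doubt about where the escape ends.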