From: "Basil L. Contovounesios" <contovob@tcd.ie>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 41571@debbugs.gnu.org
Subject: bug#41571: 27.0.91; "(elisp) Interpolated Strings" is under "(elisp) Text"
Date: Fri, 29 May 2020 19:35:31 +0100
Message-ID: <877dwu4mj0.fsf@tcd.ie>
In-Reply-To: <837dwws38a.fsf@gnu.org> (Eli Zaretskii's message of "Thu, 28 May 2020 14:33:57 +0300")
[-- Attachment #1: 0001-Improve-format-spec-documentation-bug-41571.patch --]
[-- Type: text/x-diff, Size: 12744 bytes --]
From b6a7bfbbb08d5c09e85f50b9100b39c1e1645bc2 Mon Sep 17 00:00:00 2001
From: "Basil L. Contovounesios" <contovob@tcd.ie>
Date: Thu, 28 May 2020 00:53:42 +0100
Subject: [PATCH] Improve format-spec documentation (bug#41571)
* doc/lispref/text.texi (Interpolated Strings): Move from here...
* doc/lispref/strings.texi (Custom Format Strings): ...to here,
renaming the node and clarifying the documentation.
(Formatting Strings): End node with sentence referring to the next
one.
* lisp/format-spec.el (format-spec): Clarify docstring.
---
doc/lispref/strings.texi | 138 +++++++++++++++++++++++++++++++++++++++
doc/lispref/text.texi | 64 ------------------
lisp/format-spec.el | 49 ++++++++------
3 files changed, 168 insertions(+), 83 deletions(-)
diff --git a/doc/lispref/strings.texi b/doc/lispref/strings.texi
index 70c3b3cf4b..0c750a8143 100644
--- a/doc/lispref/strings.texi
+++ b/doc/lispref/strings.texi
@@ -28,6 +28,7 @@ Strings and Characters
* Text Comparison:: Comparing characters or strings.
* String Conversion:: Converting to and from characters and strings.
* Formatting Strings:: @code{format}: Emacs's analogue of @code{printf}.
+* Custom Format Strings:: Formatting custom @code{format} specifications.
* Case Conversion:: Case conversion functions.
* Case Tables:: Customizing case conversion.
@end menu
@@ -1122,6 +1123,143 @@ Formatting Strings
NaNs and can lose precision and type, and @samp{#x%x} and @samp{#o%o}
can mishandle negative integers. @xref{Input Functions}.
+The functions described in this section accept a fixed set of
+specification characters. The next section describes a function
+@code{format-spec} which accepts custom specification characters.
+
+@node Custom Format Strings
+@section Custom Format Strings
+@cindex custom format string
+@cindex custom @samp{%}-sequence in format
+
+Sometimes it is useful to allow users to control how certain text is
+generated via custom format control strings. For
+example, a format string could control how to display someone's
+forename, surname, and email address. Using the function
+@code{format} described in the previous section, the format string
+could be something like @code{"%s %s <%s>"}. This approach quickly
+becomes impractical, however, as it can be unclear which specification
+character corresponds to which piece of information.
+
+A more convenient format string for such cases would be something like
+@code{"%f %l <%e>"}, where each specification character carries more
+semantic information and can easily be rearranged relative to other
+specification characters. The function @code{format-spec} described
+in this section performs a similar function to @code{format}, except
+it operates on format control strings that comprise arbitrary
+specification characters.
+
+@defun format-spec format specification &optional only-present
+This function returns a string equal to the format control string
+@var{format}, replacing any format specifications it contains with
+values found in the alist @var{specification} (@pxref{Association
+Lists}).
+
+Each key in @var{specification} is a format specification character,
+and its associated value is the string to replace it with. For
+example, an alist entry @code{(?a . "alpha")} means to replace any
+@samp{%a} specifications in @var{format} with @samp{alpha}.
+
+The characters in @var{format}, other than the format specifications,
+are copied directly into the output, including their text properties,
+if any. Any text properties of the format specifications are copied
+to their substitutions.
+
+Some useful properties are gained as a result of @var{specification}
+being an alist. The alist may contain more unique keys than there are
+unique specification characters in @var{format}; unused keys are
+simply ignored. If the same key is contained more than once, the
+first one found is used. If @var{format} contains the same format
+specification character more than once, then the same value found in
+@var{specification} is used as a basis for all of that character's
+substitutions.
+
+The optional argument @var{only-present} indicates how to handle
+format specification characters in @var{format} that are not found in
+@var{specification}. If it is @code{nil} or omitted, the function
+signals an error. Otherwise, those format specifications and any occurrences
+of @samp{%%} in @var{format} are left verbatim in the output,
+including their text properties, if any.
+@end defun
+
+The syntax of format specifications accepted by @code{format-spec} is
+similar, but not identical, to that accepted by @code{format}. In
+both cases, a format specification is a sequence of characters
+beginning with @samp{%} and ending with an alphabetic letter such as
+@samp{s}. The only exception to this is the specification @samp{%%},
+which is replaced with a single @samp{%}.
+
+Unlike @code{format}, which assigns specific meanings to a fixed set
+of specification characters, @code{format-spec} accepts arbitrary
+specification characters and treats them all equally. For example:
+
+@example
+(format-spec "su - %u %l"
+ `((?u . ,(user-login-name))
+ (?l . "ls")))
+ @result{} "su - foo ls"
+@end example
+
+A format specification can include any number of the following flag
+characters immediately after the @samp{%} to modify aspects of the
+substitution.
+
+@table @samp
+@item 0
+This flag causes any padding inserted by the width, if specified, to
+consist of @samp{0} characters instead of spaces.
+
+@item -
+This flag causes any padding inserted by the width, if specified, to
+be inserted on the right rather than the left.
+
+@item <
+This flag causes the substitution to be truncated to the given width,
+if specified, by removing characters from the left.
+
+@item >
+This flag causes the substitution to be truncated to the given width,
+if specified, by removing characters from the right.
+
+@item ^
+This flag converts the substituted text to upper case (@pxref{Case
+Conversion}).
+
+@item _
+This flag converts the substituted text to lower case (@pxref{Case
+Conversion}).
+@end table
+
+The result of using contradictory flags (for instance, both upper and
+lower case) is undefined.
+
+As is the case with @code{format}, a format specification can include
+a width, which is a decimal number that appears after any flags. If a
+substitution contains fewer characters than its specified width, it is
+extended with padding, normally comprising spaces inserted on the
+left:
+
+@example
+(format-spec "%8a is padded on the left with spaces"
+ '((?a . "alpha")))
+ @result{} "   alpha is padded on the left with spaces"
+@end example
+
+Here is a more complicated example that combines several
+aforementioned features:
+
+@example
+(format-spec "%<06e %<06b"
+ '((?b . "beta")
+ (?e . "epsilon")))
+ @result{} "psilon 00beta"
+@end example
+
+This format string means ``substitute into the output the values
+associated with @code{?e} and @code{?b} in the given alist, either
+padding them with leading zeros or truncating leading characters until
+they're each six characters wide.''
+
@node Case Conversion
@section Case Conversion in Lisp
@cindex upper case
diff --git a/doc/lispref/text.texi b/doc/lispref/text.texi
index de436fa9e6..a14867e1d1 100644
--- a/doc/lispref/text.texi
+++ b/doc/lispref/text.texi
@@ -58,7 +58,6 @@ Text
of another buffer.
* Decompression:: Dealing with compressed data.
* Base 64:: Conversion to or from base 64 encoding.
-* Interpolated Strings:: Formatting Customizable Strings.
* Checksum/Hash:: Computing cryptographic hashes.
* GnuTLS Cryptography:: Cryptographic algorithms imported from GnuTLS.
* Parsing HTML/XML:: Parsing HTML and XML.
@@ -4662,69 +4661,6 @@ Base 64
is optional, and the URL variant of base 64 encoding is used.
@end defun
-
-@node Interpolated Strings
-@section Formatting Customizable Strings
-
-It is, in some circumstances, useful to present users with a string to
-be customized that can then be expanded programmatically. For
-instance, @code{erc-header-line-format} is @code{"%n on %t (%m,%l)
-%o"}, and each of those characters after the percent signs are
-expanded when the header line is computed. To do this, the
-@code{format-spec} function is used:
-
-@defun format-spec format specification &optional only-present
-@var{format} is the format specification string as in the example
-above. @var{specification} is an alist that has elements where the
-@code{car} is a character and the @code{cdr} is the substitution.
-
-If @var{only-present} is @code{nil}, errors will be signaled if a
-format character has been used that's not present in
-@var{specification}. If it's non-@code{nil}, that format
-specification is left verbatim in the result.
-@end defun
-
-Here's a trivial example:
-
-@example
-(format-spec "su - %u %l"
- `((?u . ,(user-login-name))
- (?l . "ls")))
- @result{} "su - foo ls"
-@end example
-
-In addition to allowing padding/limiting to a certain length, the
-following modifiers can be used:
-
-@table @asis
-@item @samp{0}
-Pad with zeros instead of the default spaces.
-
-@item @samp{-}
-Pad to the right.
-
-@item @samp{^}
-Use upper case.
-
-@item @samp{_}
-Use lower case.
-
-@item @samp{<}
-If the length needs to be limited, remove characters from the left.
-
-@item @samp{>}
-Same as previous, but remove characters from the right.
-@end table
-
-If contradictory modifiers are used (for instance, both upper and
-lower case), then what happens is undefined.
-
-As an example, @samp{"%<010b"} means ``insert the @samp{b} expansion,
-but pad with leading zeros if it's less than ten characters, and if
-it's more than ten characters, shorten by removing characters from the
-left.''
-
-
@node Checksum/Hash
@section Checksum/Hash
@cindex MD5 checksum
diff --git a/lisp/format-spec.el b/lisp/format-spec.el
index f418cea425..4bf636e685 100644
--- a/lisp/format-spec.el
+++ b/lisp/format-spec.el
@@ -29,35 +29,46 @@
(defun format-spec (format specification &optional only-present)
"Return a string based on FORMAT and SPECIFICATION.
-FORMAT is a string containing `format'-like specs like \"su - %u %k\",
-while SPECIFICATION is an alist mapping from format spec characters
-to values.
+FORMAT is a string containing `format'-like specs like \"su - %u %k\".
+SPECIFICATION is an alist mapping format specification characters
+to their substitutions.
For instance:
(format-spec \"su - %u %l\"
- `((?u . ,(user-login-name))
+ \\=`((?u . ,(user-login-name))
(?l . \"ls\")))
-Each format spec can have modifiers, where \"%<010b\" means \"if
-the expansion is shorter than ten characters, zero-pad it, and if
-it's longer, chop off characters from the left side\".
+Each %-spec may contain optional flag and width modifiers, as
+follows:
-The following modifiers are allowed:
+ %<flags><width>character
-* 0: Use zero-padding.
-* -: Pad to the right.
-* ^: Upper-case the expansion.
-* _: Lower-case the expansion.
-* <: Limit the length by removing chars from the left.
-* >: Limit the length by removing chars from the right.
+The following flags are allowed:
-Any text properties on a %-spec itself are propagated to the text
-that it generates.
+* 0: Pad to the width, if given, with zeros instead of spaces.
+* -: Pad to the width, if given, on the right instead of the left.
+* <: Truncate to the width, if given, by removing leading characters.
+* >: Truncate to the width, if given, by removing trailing characters.
+* ^: Convert to upper case.
+* _: Convert to lower case.
-If ONLY-PRESENT, format spec characters not present in
-SPECIFICATION are ignored, and the \"%\" characters are left
-where they are, including \"%%\" strings."
+The width modifier behaves like the corresponding one in `format'
+when applied to %s.
+
+For example, \"%<010b\" means \"substitute into the output the
+value associated with ?b in SPECIFICATION, either padding it with
+leading zeros or truncating leading characters until it's ten
+characters wide\".
+
+Any text properties of FORMAT are copied to the result, with any
+text properties of a %-spec itself copied to its substitution.
+
+ONLY-PRESENT indicates how to handle %-spec characters not
+present in SPECIFICATION. If it is nil or omitted, signal an
+error; otherwise leave those %-specs and any occurrences of
+\"%%\" in FORMAT verbatim in the result, including their text
+properties, if any."
(with-temp-buffer
(insert format)
(goto-char (point-min))
--
2.26.2
[-- Attachment #2: Type: text/plain, Size: 1156 bytes --]
Eli Zaretskii <eliz@gnu.org> writes:
>> From: "Basil L. Contovounesios" <contovob@tcd.ie>
>> Cc: 41571@debbugs.gnu.org
>> Date: Thu, 28 May 2020 11:41:54 +0100
>>
>> While I'm at it, may I change the node name?
>
> Sure, the names aren't cast in stone, and the current one doesn't
> strike me as especially successful/accurate.
>
>> In fact, couldn't format-spec be documented alongside format and
>> format-message under "(elisp) Formatting Strings"?
>
> I thought about that as well when I read your original message, but
> concluded that "Formatting Strings" is already too long. We could
> end that node with a sentence referring to the next one, so that
> interested readers could continue there right away. WDYT?

SGTM.

>> Just to recap: format-spec is like format, except it allows custom
>> %-sequence characters, such as %z, which are substituted in a similar
>> way to format's %s. A common use case is to allow users to customise
>> different output presented to them via custom format control strings.
>
> This text is sorely missed at the beginning of the node about
> format-spec.

How's the attached for emacs-27?
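
For anyone skimming the thread, here is a rough sketch of the use case
the new node opens with (forename, surname, and email address).  The
variable `my-contact-format', the function `my-format-contact', and the
sample name and address are all made up for illustration; only
`format-spec' itself comes from the patch.

```elisp
;; Illustrative sketch only: `my-contact-format' and
;; `my-format-contact' are hypothetical names, not part of Emacs.
(require 'format-spec)

(defvar my-contact-format "%f %l <%e>"
  "Hypothetical user option controlling how a contact is displayed.
%f is the forename, %l the surname, and %e the email address.")

(defun my-format-contact (forename surname email)
  "Format FORENAME, SURNAME and EMAIL according to `my-contact-format'."
  (format-spec my-contact-format
               `((?f . ,forename)
                 (?l . ,surname)
                 (?e . ,email))))

(my-format-contact "Jane" "Doe" "jane@example.org")
;; => "Jane Doe <jane@example.org>"

;; A user can reorder the specifications without the calling code
;; having to change, which is awkward with plain `format':
(let ((my-contact-format "%l, %f (%e)"))
  (my-format-contact "Jane" "Doe" "jane@example.org"))
;; => "Doe, Jane (jane@example.org)"
```

The point is just that each character names its piece of information,
so the format string stays readable and rearrangeable.
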
Thanks,
--
Basil