From: rm@tuxteam.de
To: Davis Herring <herring@lanl.gov>
Cc: Ralf Mattes <rm@seid-online.de>,
Lennart Borgman <lennart.borgman@gmail.com>,
emacs-devel@gnu.org
Subject: Re: Converting a string to valid XHTML id?
Date: Wed, 1 Dec 2010 16:58:58 +0100
Message-ID: <20101201155858.GB12842@seid-online.de>
In-Reply-To: <40291.130.55.118.19.1291217640.squirrel@webmail.lanl.gov>
On Wed, Dec 01, 2010 at 07:34:00AM -0800, Davis Herring wrote:
> > (let ((old (assoc id org-newhtml-escaped-ids))
>
> Wouldn't it be easier to do something like percent encoding? Map
> everything that isn't [-.a-zA-Z0-9] onto _HH. Multibyte characters could
> be handled by writing their UTF-8 encoding, or else by escaping as _nHH...
> where n is the number of hex digits needed (itself always a single digit):
That sounds tempting but is wrong :-/ Percent-encoding doesn't produce
valid ID values. From the HTML 4 spec:
6.2 SGML basic types
....
ID and NAME tokens must begin with a letter ([A-Za-z]) and may be
followed by any number of letters, digits ([0-9]), hyphens ("-"),
underscores ("_"), colons (":"), and periods (".").
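
To make the rule above concrete, here is a minimal sketch of a predicate
that checks a string against it. The name `my-html4-id-p' is hypothetical
and not part of the thread or of Org. Under this rule, an escaped ID that
begins with "_" or a digit fails the first-character test, and "%" is not
an allowed character at all.

```elisp
;; Hypothetical helper (not from the thread): test STR against the
;; HTML 4 ID/NAME token rule quoted above.
(defun my-html4-id-p (str)
  "Return t if STR is a valid HTML 4 ID token, nil otherwise."
  ;; Bind `case-fold-search' to nil so the character ranges match
  ;; exactly the ASCII characters written in the regexp.
  (let ((case-fold-search nil))
    (and (string-match-p "\\`[A-Za-z][-A-Za-z0-9_:.]*\\'" str) t)))

(my-html4-id-p "foo_5fbar")   ; starts with a letter, only allowed chars
(my-html4-id-p "_20foo")      ; starts with an underscore
(my-html4-id-p "foo%20bar")   ; contains "%"
```

Only the first call should return non-nil; the other two violate the
first-character rule and the allowed-character set respectively.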
Cheers. Ralf Mattes
>
> ;; Uses Emacs' internal encoding instead of UTF-8 proper.
> (defun org-newhtml-escape-id (str)
>   "Return a valid xhtml id attribute string.
> See URL `http://xhtml.com/en/xhtml/reference/attribute-data-types/#id'."
>   (replace-regexp-in-string
>    "[^-.a-zA-Z0-9]" (lambda (c)
>                       (mapconcat (lambda (d) (format "_%02x" d))
>                                  (string-as-unibyte c) "")) str))
>
> Certainly someone could already have an id "foo_5fbar", but the
> table-based implementation already makes the assumption that all IDs will
> be generated by it.
>
> Davis
>
> --
> This product is sold by volume, not by mass. If it appears too dense or
> too sparse, it is because mass-energy conversion has occurred during
> shipping.