unofficial mirror of emacs-devel@gnu.org 
From: rm@tuxteam.de
To: Davis Herring <herring@lanl.gov>
Cc: Ralf Mattes <rm@seid-online.de>,
	Lennart Borgman <lennart.borgman@gmail.com>,
	emacs-devel@gnu.org
Subject: Re: Converting a string to valid XHTML id?
Date: Wed, 1 Dec 2010 16:58:58 +0100
Message-ID: <20101201155858.GB12842@seid-online.de>
In-Reply-To: <40291.130.55.118.19.1291217640.squirrel@webmail.lanl.gov>

On Wed, Dec 01, 2010 at 07:34:00AM -0800, Davis Herring wrote:
> >   (let ((old (assoc id org-newhtml-escaped-ids))
> 
> Wouldn't it be easier to do something like percent encoding?  Map
> everything that isn't [-.a-zA-Z0-9] onto _HH.  Multibyte characters could
> be handled by writing their UTF-8 encoding, or else by escaping as _nHH...
> where n is the number of hex digits needed (itself always a single digit):


That sounds tempting, but it is wrong :-/ Percent-encoding doesn't produce
valid ID values. From the HTML 4 spec:

 6.2 SGML basic types

  ....

 ID and NAME tokens must begin with a letter ([A-Za-z]) and may be
 followed by any number of letters, digits ([0-9]), hyphens ("-"),
 underscores ("_"), colons (":"), and periods (".").
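
The rule quoted above can be sketched as a regexp check. This is only
an illustration: `my-html4-id-p' is a hypothetical name, not something
in Emacs or Org. Note that a string starting with a digit, such as
"1st", passes through the proposed escaping unchanged and still fails
this check.

```elisp
;; Hypothetical helper, not part of Emacs or Org.
(defun my-html4-id-p (str)
  "Return t if STR is a valid HTML 4 ID/NAME token, else nil.
The token must begin with a letter and may continue with letters,
digits, hyphens, underscores, colons and periods."
  (let ((case-fold-search nil))
    ;; \\` and \\' anchor the match to the whole string.
    (and (string-match-p "\\`[A-Za-z][-A-Za-z0-9_:.]*\\'" str) t)))

(my-html4-id-p "foo_5fbar") ; => t
(my-html4-id-p "1st")       ; => nil, begins with a digit
(my-html4-id-p "_20foo")    ; => nil, begins with an underscore
```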


Cheers. Ralf Mattes

> 
> ;; Uses Emacs' internal encoding instead of UTF-8 proper.
> (defun org-newhtml-escape-id (str)
>   "Return a valid xhtml id attribute string.
> See URL `http://xhtml.com/en/xhtml/reference/attribute-data-types/#id'."
>   (replace-regexp-in-string
>    "[^-.a-zA-Z0-9]" (lambda (c)
>                       (mapconcat (lambda (d) (format "_%02x" d))
>                                  (string-as-unibyte c) "")) str))
> 
> Certainly someone could already have an id "foo_5fbar", but the
> table-based implementation already makes the assumption that all IDs will
> be generated by it.
> 
> Davis
> 
> -- 
> This product is sold by volume, not by mass.  If it appears too dense or
> too sparse, it is because mass-energy conversion has occurred during
> shipping.
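
For comparison, here is a minimal sketch of how the quoted function
could be adapted so that the result always begins with a letter. It is
not the code from this thread: the name `my-escape-id' and the "x__"
prefix are assumptions made for illustration, and it writes real UTF-8
bytes via `encode-coding-string' instead of using the internal
encoding.

```elisp
;; Hypothetical variant of the quoted function; the name and the
;; "x__" prefix are illustrative assumptions.
(defun my-escape-id (str)
  "Return STR escaped into a valid HTML 4 ID token."
  (let* ((case-fold-search nil)
         (escaped
          (replace-regexp-in-string
           "[^-.a-zA-Z0-9]"
           (lambda (c)
             ;; Write each UTF-8 byte of the matched character as _hh.
             (mapconcat (lambda (byte) (format "_%02x" byte))
                        (encode-coding-string c 'utf-8)
                        ""))
           ;; FIXEDCASE and LITERAL are t so the replacement text is
           ;; inserted verbatim, with no case conversion.
           str t t)))
    ;; ID tokens must begin with a letter, so prefix anything that
    ;; does not.  Elsewhere every underscore is followed by two hex
    ;; digits, so "__" only ever appears as part of this prefix.
    (if (string-match-p "\\`[A-Za-z]" escaped)
        escaped
      (concat "x__" escaped))))

(my-escape-id "foo bar") ; => "foo_20bar"
(my-escape-id "foo_bar") ; => "foo_5fbar"
(my-escape-id "1st")     ; => "x__1st"
```

As with the quoted version, this only avoids clashes if every ID in
the document is generated by the same function.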


