From: Lennart Borgman
Newsgroups: gmane.emacs.devel
Subject: Re: Converting a string to valid XHTML id?
Date: Thu, 2 Dec 2010 18:47:19 +0100
To: Lawrence Mitchell
Cc: emacs-devel@gnu.org

On Thu, Dec 2, 2010 at 4:50 PM, Lawrence Mitchell wrote:
>
> Or use Davis' solution which works in a similar way, and as a
> bonus you can map back to the original id easily.
>
> Recall his solution:
>
> (defun org-newhtml-escape-id (str)
>   "Return a valid xhtml id attribute string.
> See URL `http://xhtml.com/en/xhtml/reference/attribute-data-types/#id'."
>   (replace-regexp-in-string
>    "[^-.a-zA-Z0-9]" (lambda (c)
>                       (mapconcat (lambda (d) (format "_%02x" d))
>                                  (string-as-unibyte c) "")) str))
>
> Notice that the output uses "_" which is a /valid/ char in an
> xhtml id.  However, it is not considered valid in an input
> string.
>
> So (org-newhtml-escape-id "foo_5fbar") => foo_5f5fbar
> But (org-newhtml-escape-id "foo_bar") => foo_5fbar
>
> So notice that valid ids /without/ an underscore in them are left
> as is, but ids with an underscore are encoded under this scheme,
> so you can't generate a collision.

Ah, thanks, now I understand. I missed that detail.
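
For the record, here is a rough sketch of the "map back to the original
id" part, since the quoted code only shows the forward direction. The
names my-xhtml-id-escape and my-xhtml-id-unescape are made up for
illustration and are not part of org or of Davis' code. The sketch also
assumes UTF-8 as the byte encoding (via encode-coding-string) where the
quoted code uses string-as-unibyte, and the decoder assumes its input
was produced by the encoder.

```elisp
;; Hypothetical names; a sketch of the scheme discussed above, not Davis' code.

(defun my-xhtml-id-escape (str)
  "Escape STR so it only contains [-.a-zA-Z0-9] plus \"_XX\" byte escapes.
Assumption: characters are turned into bytes with UTF-8."
  (replace-regexp-in-string
   "[^-.a-zA-Z0-9]"
   (lambda (c)
     ;; Each byte of the offending character becomes \"_\" plus two hex digits.
     (mapconcat (lambda (byte) (format "_%02x" byte))
                (encode-coding-string c 'utf-8)
                ""))
   ;; FIXEDCASE and LITERAL are t so the replacement is inserted unchanged.
   str t t))

(defun my-xhtml-id-unescape (id)
  "Map ID back to the original string.
Assumes ID was produced by `my-xhtml-id-escape', so every \"_\" is
followed by exactly two hex digits."
  (let ((i 0)
        (n (length id))
        bytes)
    (while (< i n)
      (let ((ch (aref id i)))
        (if (eq ch ?_)
            (progn
              ;; Read the two hex digits after the underscore as one byte.
              (push (string-to-number (substring id (1+ i) (+ i 3)) 16) bytes)
              (setq i (+ i 3)))
          ;; Unescaped characters are plain ASCII and pass through as is.
          (push ch bytes)
          (setq i (1+ i)))))
    (decode-coding-string (apply #'unibyte-string (nreverse bytes)) 'utf-8)))

;; (my-xhtml-id-escape "foo_bar")       => "foo_5fbar"
;; (my-xhtml-id-unescape "foo_5fbar")   => "foo_bar"
;; (my-xhtml-id-unescape "foo_5f5fbar") => "foo_5fbar"
```

Because "_" itself is always escaped, any "_" in an output id can only
be the start of an escape, which is what makes the decoding unambiguous.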