From: Ralf Mattes
Newsgroups: gmane.emacs.devel
Subject: Re: Converting a string to valid XHTML id?
Date: Tue, 30 Nov 2010 14:50:16 +0000 (UTC)
To: emacs-devel@gnu.org
Xref: news.gmane.org gmane.emacs.devel:133267

On Mon, 29 Nov 2010 19:39:17 +0100, Lennart Borgman wrote:

> On Mon, Nov 29, 2010 at 7:33 PM, Deniz Dogan wrote:
>> ...
>> What is this for? Just curious.
>
> I need something like this for exporting org-mode to html (Jambunathan
> and I are rewriting the export routines to cover export to odt too).
>
> BTW, I came up with this for the moment:
>
>
> ;; (org-newhtml-escape-id "fig:5")
> ;; (org-newhtml-escape-id "56")
> (defun org-newhtml-escape-id (id)
>   "Return a valid id string.
> See URL http://www.w3schools.com/tags/att_standard_id.asp"
>   (setq id (replace-regexp-in-string "\\`\\([^A-Za-z]\\)" "ANON-\\1" id nil))
>   (setq id (replace-regexp-in-string "[^A-Za-z0-9_.-]" "-" id t)))

But this is wrong - it can generate invalid HTML. Consider the
following:

  (org-newhtml-escape-id "this is cool!") ⇒ "this-is-cool-"
  (org-newhtml-escape-id "this is cool?") ⇒ "this-is-cool-"

Two different strings collapse to the same ID. Ids must be unique
within a document, so if both strings occur in one export the result
is invalid HTML.

Cheers, Ralf Mattes
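
P.S. One possible way around the collision, as a rough sketch only: make
the mapping reversible, so that distinct inputs cannot produce the same
output. The function name `example-escape-id-injective', the fixed "ID-"
prefix and the "_HEX_" escape format below are all made up for
illustration; none of them come from org-mode. The sketch assumes the
same target character set as the quoted function (a leading letter,
then letters, digits, "_", "." and "-").

```elisp
;; Hypothetical sketch, not part of org-mode.
(defun example-escape-id-injective (id)
  "Return an id string for ID such that distinct inputs give distinct ids.
Every character outside [A-Za-z0-9.-] is replaced by \"_HEX_\", where
HEX is its character code in lowercase hex.  The underscore itself is
escaped too, so an underscore in the result always delimits an escape
and the original string can be recovered.  A fixed \"ID-\" prefix
guarantees that the result starts with a letter."
  ;; Bind `case-fold-search' to nil so the character class matches
  ;; exactly the ASCII ranges written in the regexp.
  (let ((case-fold-search nil))
    (concat "ID-"
            (replace-regexp-in-string
             "[^A-Za-z0-9.-]"
             (lambda (match)
               (format "_%x_" (string-to-char match)))
             ;; FIXEDCASE and LITERAL are both t: insert the
             ;; replacement text exactly as returned above.
             id t t))))

;; (example-escape-id-injective "this is cool!") ⇒ "ID-this_20_is_20_cool_21_"
;; (example-escape-id-injective "this is cool?") ⇒ "ID-this_20_is_20_cool_3f_"
;; (example-escape-id-injective "fig:5")         ⇒ "ID-fig_3a_5"
;; (example-escape-id-injective "56")            ⇒ "ID-56"
```

The price is readability: the ids get longer and uglier than with the
"replace by -" approach. The prefix is added unconditionally on purpose;
adding "ANON-" only when the first character is not a letter would let
"5" and a literal "ANON-5" collide again.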