From: xah lee
Newsgroups: gmane.emacs.bugs
Subject: bug#270: texinfo generates invalid html
Date: Sat, 17 May 2008 08:49:04 -0700
Message-ID: <21886EF4-051C-4307-9601-9D002D90280E__20518.6644140214$1211043687$gmane$org@xahlee.org>
Reply-To: xah lee , 270@emacsbugs.donarmstrong.com
To: emacs-pretest-bug@gnu.org

The elisp document generated by texinfo in html is not valid html. Here are the major problems:

-------------------------------------
Problems with texinfo generated html, with respect to html 4 transitional:

* there's no doctype declaration.

* when there's a footnote, it is generated as


which is invalid.

-------------------------------------
Problems with respect to html4 strict:

* “ ” should just be “ ”.

* sometimes there's “ ” but it is missing an opening “ ”.

* whenever there's a “Common Lisp note:”, it should have a “ ” wrapped around the block, since it's inside “ ” and html4strict requires it.

-------------------------------------
Other minor problems:

* the css is plastered into every page. It should be one css file instead.

* it should declare utf8 as the charset (so that it doesn't need to do a lot of html character encoding).

* the ending   is often not used.

-------------------------------------
Dead Links to external docs

In the elisp manual (one node per html page, roughly 850 html pages), there are 70 (local) links to other GNU documents. The local links are nice in that they provide cross-reference, but if one hosts only the elisp doc, all these local links will be dead.

Therefore, it would be nice to have, perhaps at the texinfo level, a way to embed markers for links that cross-ref to external docs, or perhaps at the html conversion level an option to filter local links, so that local links can be replaced with non-links (such as “See Emacs manual node on Abbrev”) or with full http links to the right uri at gnu.org.

-------------------------------------
Use of ascii...

texinfo still uses the convention of backtick ` and straight single quote ' to emulate curly ones “” and ‘’, and other ascii kludges such as “=>” instead of “⇒”. The ability to display these chars has been widely available on commercial platforms since the mid 1990s, and on linuxes since about 2003 or so (emacs itself has supported unicode to a practical degree since emacs 21, released in 2001). It is perhaps time to update the gnu doc convention to utf8 and use the proper characters.

-------------------------------------
Note:

The HTML generated by texinfo is actually far superior to that of other orgs, such as those of perl, python, java, in the sense that when sending the html to w3c's validator, texinfo's html actually contains just a few errors, all of which are fixable. Other orgs' docs, such as python's (which was generated from TeX), are so messy that they are not fixable.

Kudos to the texinfo developer(s).

PS I had problems with the quality of FSF's documentation uris. Namely, sometimes a doc's uri disappears, so that people cannot reliably link to it. Also, some links in the doc are dead links due to the transformation scheme of links from texinfo to uri. For detail, see: http://xahlee.org/emacs/gnu_doc.html (warning: rant)

  Xah
  xah@xahlee.org
∑ http://xahlee.org/

☄
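
A minimal sketch, in Python, of the link-filtering idea from the “Dead Links to external docs” section above. It assumes that cross-manual links in the generated pages look like href="../emacs/Abbrevs.html#Abbrevs" (a relative path into a sibling manual directory). The MANUAL_URLS table, the gnu.org base URL in it, and the function name rewrite_external_links are hypothetical, chosen only for illustration; none of this is part of texinfo.

```python
import re

# Hypothetical table: manual directory name -> base URL of its online copy.
# The URL below is an assumption; adjust it to wherever the manual is hosted.
MANUAL_URLS = {
    "emacs": "https://www.gnu.org/software/emacs/manual/html_node/emacs/",
}

# Matches an anchor whose href points into a sibling manual directory,
# e.g. <a href="../emacs/Abbrevs.html#Abbrevs">Abbrevs</a>.
LOCAL_XREF = re.compile(
    r'<a\s+href="\.\./(?P<manual>[^/"]+)/(?P<target>[^"]*)"[^>]*>'
    r'(?P<text>.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def rewrite_external_links(html, manual_urls=MANUAL_URLS):
    """Turn local cross-manual links into absolute links or plain text."""

    def fix(match):
        manual = match.group("manual")
        text = match.group("text")
        base = manual_urls.get(manual)
        if base is None:
            # Unknown manual: drop the link, keep a readable reference.
            return f"{text} (see the {manual} manual)"
        return f'<a href="{base}{match.group("target")}">{text}</a>'

    return LOCAL_XREF.sub(fix, html)


if __name__ == "__main__":
    page = 'See <a href="../emacs/Abbrevs.html#Abbrevs">Abbrevs</a>.'
    print(rewrite_external_links(page))
```

Links to manuals listed in the table become full http links; links to any other manual become plain text, so nothing is left dangling when only the elisp pages are hosted.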