From: xah lee
Newsgroups: gmane.emacs.devel,gmane.emacs.pretest.bugs
Subject: texinfo generates invalid html
Date: Sat, 17 May 2008 08:49:04 -0700
Message-ID: <21886EF4-051C-4307-9601-9D002D90280E@xahlee.org>
To: emacs-pretest-bug@gnu.org

The elisp document generated by texinfo in html is not valid html.
Here are the major problems:

-------------------------------------
Problems with texinfo generated html, with respect to html 4
transitional:

* there's no doctype declaration.

* when there's a footnote, it is generated as
  [...] which is invalid.

-------------------------------------
Problems with respect to html4 strict:

* “[...]” should just be “[...]”.

* sometimes there's “[...]” but it is missing an opening “[...]”.

* whenever there's a “Common Lisp note:”, it should have a “[...]”
  wrapped around the block, since it's inside “[...]” and html4strict
  requires it.

-------------------------------------
Other minor problems:

* the css is plastered into every page. It should be one css file
  instead.

* it should declare utf8 as the charset. (so that it doesn't need to
  do a lot of html character encoding)

* the ending
  [...] is often not used.

-------------------------------------
Dead Links to external docs

In the elisp manual (one node per html page, roughly 850 html pages),
there are 70 (local) links to other GNU documents. The local links
are nice in that they provide cross-references, but if one hosts only
the elisp doc, all these local links will be dead.

Therefore, it would be nice to have, perhaps at the texinfo level, a
way to embed markers on links that cross-ref to external docs, or
perhaps at the html conversion level, an option to filter local links,
so that local links can be replaced by non-links (such as “See Emacs
manual node on Abbrev”) or by full http links to the right uri at
gnu.org.

-------------------------------------
Use of ascii...

texinfo still uses the convention of backtick ` and straight single
quote ' to emulate curly ones “” and ‘’, and other ascii kludges such
as “=>” instead of “⇒”. The ability to display these chars has been
widely available on commercial platforms since the mid 1990s, and on
linuxes since about 2003 or so (emacs itself has supported unicode to
a practical degree since emacs 21, released in 2001). It is perhaps
time to update the gnu doc convention to utf8 and use the proper
characters.

-------------------------------------
Note:

The HTML generated by texinfo is actually far superior to that of
other orgs, such as perl, python, java, in the sense that when sending
the html to w3c's validator, texinfo's html actually contains just a
few errors, all of which are fixable. Other orgs' html, such as
python's (which was generated from TeX), is so messy that it is not
fixable.

Kudos to the texinfo developer(s).

PS I had a problem with the quality of FSF's documentation uri.
Namely, sometimes the doc's uri disappears, so that people cannot
reliably link to it. Also, some links in the doc are dead links due to
the transformation scheme of links from texinfo to uri. For detail,
see: http://xahlee.org/emacs/gnu_doc.html (warning: rant)

  Xah
  xah@xahlee.org
∑ http://xahlee.org/

☄
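
A rough sketch of the link filter suggested in the "Dead Links" section
above, in Python. It is only an illustration, not how texinfo does or
should do it. Assumptions that are not in the report: cross-manual
links are taken to look like href="../emacs/Abbrevs.html#Abbrevs"
(i.e. they start with "../"), the base url below is taken to be where
the manuals live at gnu.org, and the function names absolutize and
unlink are made up for this example. A regex is used for brevity; a
real tool would more likely hook into the html conversion step.

```python
import re

# Assumption: the other GNU manuals are hosted under this base url.
BASE = "https://www.gnu.org/software/emacs/manual/html_node/"

# Assumption: a cross-manual link is an <a> whose href starts with "../".
# Links within the same manual (e.g. href="Lists.html") are left alone.
LINK = re.compile(
    r'<a\s+href="\.\./([^"]+)"([^>]*)>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def absolutize(html: str, base: str = BASE) -> str:
    """Rewrite ../manual/Node.html links as full http links under base."""
    return LINK.sub(
        lambda m: f'<a href="{base}{m.group(1)}"{m.group(2)}>{m.group(3)}</a>',
        html,
    )


def unlink(html: str) -> str:
    """Replace ../manual/Node.html links with their plain link text."""
    return LINK.sub(lambda m: m.group(3), html)


if __name__ == "__main__":
    sample = (
        '<p>See <a href="../emacs/Abbrevs.html#Abbrevs">Abbrevs</a> '
        'and <a href="Lists.html">Lists</a>.</p>'
    )
    print(absolutize(sample))
    print(unlink(sample))
```

Either function could be run over each generated page when only the
elisp manual is hosted: absolutize keeps the cross-reference alive by
pointing it at the assumed gnu.org location, unlink turns it into
plain text.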