From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: xah lee
Newsgroups: gmane.emacs.bugs
Subject: bug#270: texinfo generates invalid html
Date: Sat, 17 May 2008 08:49:04 -0700
Message-ID: <21886EF4-051C-4307-9601-9D002D90280E__20518.6644140214$1211043687$gmane$org@xahlee.org>
Reply-To: xah lee , 270@emacsbugs.donarmstrong.com
NNTP-Posting-Host: lo.gmane.org
Mime-Version: 1.0 (Apple Message framework v753)
Content-Type: text/plain; charset=UTF-8; delsp=yes; format=flowed
Content-Transfer-Encoding: quoted-printable
X-Trace: ger.gmane.org 1211043642 31240 80.91.229.12 (17 May 2008 17:00:42 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Sat, 17 May 2008 17:00:42 +0000 (UTC)
To: emacs-pretest-bug@gnu.org
Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Sat May 17 19:01:20 2008
Return-path:
Envelope-to: geb-bug-gnu-emacs@m.gmane.org
Original-Received: from lists.gnu.org ([199.232.76.165])
by lo.gmane.org with esmtp (Exim 4.50)
id 1JxPmZ-0007GC-VU
for geb-bug-gnu-emacs@m.gmane.org; Sat, 17 May 2008 19:01:20 +0200
Original-Received: from localhost ([127.0.0.1]:53462 helo=lists.gnu.org)
by lists.gnu.org with esmtp (Exim 4.43)
id 1JxPlq-0007mV-DB
for geb-bug-gnu-emacs@m.gmane.org; Sat, 17 May 2008 13:00:34 -0400
Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
id 1JxP0b-0005K2-Ct
for bug-gnu-emacs@gnu.org; Sat, 17 May 2008 12:11:45 -0400
Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
id 1JxP0a-0005IP-37
for bug-gnu-emacs@gnu.org; Sat, 17 May 2008 12:11:44 -0400
Original-Received: from [199.232.76.173] (port=60393 helo=monty-python.gnu.org)
by lists.gnu.org with esmtp (Exim 4.43) id 1JxP0Z-0005I9-QB
for bug-gnu-emacs@gnu.org; Sat, 17 May 2008 12:11:43 -0400
Original-Received: from rzlab.ucr.edu ([138.23.92.77]:42897)
by monty-python.gnu.org with esmtps
(TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.60)
(envelope-from ) id 1JxP0Z-0007oM-2h
for bug-gnu-emacs@gnu.org; Sat, 17 May 2008 12:11:43 -0400
Original-Received: from rzlab.ucr.edu (rzlab.ucr.edu [127.0.0.1])
by rzlab.ucr.edu (8.13.8/8.13.8/Debian-3) with ESMTP id m4HGBfv9028331;
Sat, 17 May 2008 09:11:41 -0700
Original-Received: (from debbugs@localhost)
by rzlab.ucr.edu (8.13.8/8.13.8/Submit) id m4HG037U024110;
Sat, 17 May 2008 09:00:03 -0700
X-Loop: don@donarmstrong.com
Resent-From: xah lee
Original-Sender: emacs-devel-bounces+monnier=iro.umontreal.ca@gnu.org
Resent-To: bug-submit-list@donarmstrong.com
Resent-CC: Emacs Bugs
Resent-Date: Sat, 17 May 2008 16:00:03 +0000
Resent-Message-ID:
Resent-Sender: don@donarmstrong.com
X-Emacs-PR-Message: report 270
X-Emacs-PR-Package: emacs
X-Emacs-PR-Keywords:
Original-Received: via spool by submit@emacsbugs.donarmstrong.com id=B.121103943622833
(code B ref -1); Sat, 17 May 2008 16:00:03 +0000
Original-Received: (at submit) by emacsbugs.donarmstrong.com; 17 May 2008 15:50:36 +0000
Original-Received: from mercure.iro.umontreal.ca (mercure.iro.umontreal.ca
[132.204.24.67])
by rzlab.ucr.edu (8.13.8/8.13.8/Debian-3) with ESMTP id m4HFoSLe022823
for ; Sat, 17 May 2008 08:50:29 -0700
Original-Received: by mercure.iro.umontreal.ca (Postfix, from userid 20848)
id AE9D32CFC8E; Sat, 17 May 2008 11:50:27 -0400 (EDT)
Original-Received: from perlin.iro.umontreal.ca (perlin.iro.umontreal.ca
[132.204.24.51])
by mercure.iro.umontreal.ca (Postfix) with ESMTP id 8EB7F2CFA47
for ; Sat, 17 May 2008 11:50:27 -0400 (EDT)
Original-Received: from lists.gnu.org (lists.gnu.org [199.232.76.165])
by perlin.iro.umontreal.ca (Postfix) with ESMTP id 684E114821A
for ; Sat, 17 May 2008 11:50:15 -0400 (EDT)
Original-Received: from localhost ([127.0.0.1]:60394 helo=lists.gnu.org)
by lists.gnu.org with esmtp (Exim 4.43) id 1JxOfn-0007j6-5w
for monnier@iro.umontreal.ca; Sat, 17 May 2008 11:50:15 -0400
Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
id 1JxOfT-0007il-9Z
for emacs-devel@gnu.org; Sat, 17 May 2008 11:49:55 -0400
Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
id 1JxOfS-0007hl-By
for emacs-devel@gnu.org; Sat, 17 May 2008 11:49:54 -0400
Original-Received: from [199.232.76.173] (port=52557 helo=monty-python.gnu.org)
by lists.gnu.org with esmtp (Exim 4.43) id 1JxOfS-0007hR-5x
for emacs-devel@gnu.org; Sat, 17 May 2008 11:49:54 -0400
Original-Received: from fencepost.gnu.org ([140.186.70.10]:56804)
by monty-python.gnu.org with esmtp (Exim 4.60)
(envelope-from ) id 1JxOfR-0002zA-Ok
for emacs-devel@gnu.org; Sat, 17 May 2008 11:49:53 -0400
Original-Received: from mail.gnu.org ([199.232.76.166]:51338 helo=mx10.gnu.org)
by fencepost.gnu.org with esmtp (Exim 4.67)
(envelope-from ) id 1JxOeK-0001jS-PZ
for emacs-pretest-bug@gnu.org; Sat, 17 May 2008 11:48:44 -0400
Original-Received: from Debian-exim by monty-python.gnu.org with spam-scanned (Exim
4.60) (envelope-from ) id 1JxOfN-0002yL-75
for emacs-pretest-bug@gnu.org; Sat, 17 May 2008 11:49:53 -0400
Original-Received: from mout.perfora.net ([74.208.4.197]:63609)
by monty-python.gnu.org with esmtp (Exim 4.60)
(envelope-from ) id 1JxOfM-0002yD-Rf
for emacs-pretest-bug@gnu.org; Sat, 17 May 2008 11:49:49 -0400
Original-Received: from [192.168.1.2] (c-24-6-97-120.hsd1.ca.comcast.net [24.6.97.120])
by mrelay.perfora.net (node=mrus1) with ESMTP (Nemesis)
id 0MKpCa-1JxOfL0EKM-0003WL; Sat, 17 May 2008 11:49:48 -0400
X-Mailer: Apple Mail (2.753)
X-Provags-ID: V01U2FsdGVkX1+jBYS3lMVNHRtWkDkvN9pMheFepXIwmLu3ckP
v+aY71uz1VH5Y8hquOlAkfwuO0aUiSj2mERAFC3/Zd/wdNSnMT
ppf97G2zU2tLPasIsbc7g==
X-detected-kernel: by monty-python.gnu.org: Linux 2.6? (barebone, rare!)
X-detected-kernel: by monty-python.gnu.org: Linux 2.6, seldom 2.4 (older, 4)
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
X-DIRO-MailScanner-Information: Please contact the ISP for more information
X-DIRO-MailScanner: Found to be clean
X-DIRO-MailScanner-SpamCheck: n'est pas un polluriel,
SpamAssassin (score=-2.561, requis 5, autolearn=not spam,
BAYES_00 -2.60, MIME_QP_LONG_LINE 0.04, SPF_HELO_PASS -0.00)
X-DIRO-MailScanner-From: emacs-devel-bounces+monnier=iro.umontreal.ca@gnu.org
X-detected-kernel: by monty-python.gnu.org: Linux 2.6 (newer, 3)
Resent-Date: Sat, 17 May 2008 12:11:44 -0400
X-Mailman-Approved-At: Sat, 17 May 2008 12:59:35 -0400
X-BeenThere: bug-gnu-emacs@gnu.org
List-Id: "Bug reports for GNU Emacs,
the Swiss army knife of text editors"
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Original-Sender: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org
Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.bugs:18036
Archived-At:
The elisp document generated by texinfo in html is not valid html.
Here are the major problems:
-------------------------------------
Problems with texinfo generated html, with respect to html 4
transitional:
* there's no doctype declaration.
* when there's a footnote, it is generated as

which is invalid.
-------------------------------------
Problems with respect to html4 strict:
* “” should just be “”.
* sometimes there's “” but missing an opening “”.
* whenever there's a “Common Lisp note:”, it should have a “” wrapped
around the block, since it's inside “” and html4strict requires it.
-------------------------------------
Other minor problems:
* the css is plastered into every page. It should be one css file
instead.
* it should declare utf8 as the charset. (so that it doesn't need to
do a lot of html character encoding)
* the ending is often not used.
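
A minimal sketch of how the doctype and charset points could be patched
in a post-processing pass over the generated pages. This is not how
texinfo does anything; the function name is made up for illustration,
and it assumes each page has an ordinary <head> element.

```python
import re

# Standard HTML 4.01 Transitional doctype, as published by the W3C.
DOCTYPE = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
           '"http://www.w3.org/TR/html4/loose.dtd">')
META = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'


def add_doctype_and_charset(page: str) -> str:
    """Hypothetical helper: prepend a doctype and declare utf-8 if missing."""
    # Add the doctype only when the page does not already start with one.
    if not page.lstrip().lower().startswith("<!doctype"):
        page = DOCTYPE + "\n" + page
    # Add the charset declaration right after <head> unless one exists.
    if not re.search(r"<meta[^>]+charset=", page, flags=re.IGNORECASE):
        page = re.sub(
            r"(<head(?:\s[^>]*)?>)",
            lambda m: m.group(1) + "\n" + META,
            page,
            count=1,
            flags=re.IGNORECASE,
        )
    return page
```

Running it twice on the same page leaves the page unchanged, so it is
safe to apply to a whole directory of generated html.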
-------------------------------------
Dead Links to external docs
In the elisp manual (one node per html page, roughly 850 html pages),
there are 70 (local) links to other GNU documents. The local links
are nice in that they provide cross-reference, but if one hosts only
the elisp doc, all these local links will be dead.
Therefore, it would be nice to have, perhaps at the texinfo level, a
way to embed markers on links that cross-ref to external docs, or
perhaps at the html conversion level an option to filter local links,
so that local links can be replaced with non-links (such as “See
Emacs manual node on Abbrev”) or full http links to the right uri at
gnu.org.
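
The filter described above could look roughly like the sketch below.
Everything in it is an assumption made for illustration: that local
links to other manuals have the form href="../MANUAL/NODE.html", the
base uri in the table, and the function name. Links to unknown manuals
are turned into plain text; links within the same manual are left
alone.

```python
import re

# Assumed mapping from a manual's directory name to its public base uri.
MANUAL_BASE = {
    "emacs": "https://www.gnu.org/software/emacs/manual/html_node/emacs/",
}

# Assumed shape of a local cross-manual link:
#   <a href="../MANUAL/NODE.html#ANCHOR">text</a>
LINK = re.compile(
    r'<a\s+href="\.\./([^/"]+)/([^"]+)"[^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def filter_external_links(page: str) -> str:
    """Hypothetical filter: make cross-manual links absolute or plain text."""
    def fix(m):
        manual, target, text = m.groups()
        base = MANUAL_BASE.get(manual)
        if base is None:
            # Unknown manual: keep the link text but drop the dead link.
            return text
        # Known manual: point at the full uri (other attributes are dropped).
        return f'<a href="{base}{target}">{text}</a>'
    return LINK.sub(fix, page)
```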
-------------------------------------
Use of ascii...
texinfo still uses the convention of backtick ` and straight single
quote ' to emulate curly ones “” and ‘’, and other ascii kludges such
as “=>” instead of “⇒”. The ability to display these chars has been
widely available on commercial platforms since the mid 1990s, and on
linuxes since about 2003 or so (emacs itself has supported unicode to
a practical degree since emacs 21, released in 2001). It is perhaps
time to update the gnu doc convention to utf8 and use the proper
characters.
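
As a rough illustration of the conversion, here is a sketch that maps
the ascii conventions to the proper characters. The function name is
hypothetical, and it is only meant for prose: Lisp code examples use
` and ' with their own meaning and would have to be skipped.

```python
import re


def asciiquotes_to_unicode(text: str) -> str:
    """Hypothetical converter for prose; do not run it over Lisp code."""
    # ``double'' first, so the single-quote rule below does not consume it.
    text = re.sub(r"``(.+?)''", "\u201c\\1\u201d", text)
    # `single' becomes curly single quotes.
    text = re.sub(r"`([^`'\n]+)'", "\u2018\\1\u2019", text)
    # The ascii arrow becomes the real double arrow.
    return text.replace("=>", "\u21d2")
```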
-------------------------------------
Note:
The HTML generated by texinfo is actually far superior to other
orgs', such as those of perl, python, java, in the sense that when
sending the html to w3c's validator, texinfo's html actually contains
just a few errors, all of which are fixable. Other orgs' docs, such
as python's (generated from TeX), are so messy that they are not
fixable.
Kudos to the texinfo developer(s).
PS I had problems with the quality of FSF's documentation uri. Namely,
sometimes the doc's uri disappears, so that people cannot reliably
link to it. Also, some links in the doc are dead links due to the
transformation scheme of links from texinfo to uri. For detail, see:
http://xahlee.org/emacs/gnu_doc.html (warning: rant)
Xah
xah@xahlee.org
∑ http://xahlee.org/
☄