>>>>> Lars Ingebrigtsen <larsi@gnus.org> writes:
>>>>> Nic Ferrier <nferrier@ferrier.me.uk> writes:

 >> It's certainly the case that definite ending is easier to process.

 > I don't really know what to say.  "HTML parsing is a solved problem"?

	Granted, my Libxml2 installation may be out of date, but for the
	attached HTML5 document (valid per http://validator.w3.org/check),
	libxml-parse-html-region (surprisingly) produces the following:

(html
 ((lang . "en") (dir . "ltr"))
 (head nil (title nil "HTML parsing"))
 (body nil (dl nil
               (dt nil "This\n")
               (dd nil "is\n"
                   (dd nil "a\n"
                       (dd nil "perfectly\n"
                           (dd nil "valid\n"
                               (dd nil "HTML5\n"
                                   (dd nil "document.\n")))))))))

	Each dd ends up nested inside the preceding one rather than
	being its sibling within the dl.  Naturally, an SHR rendition
	of the document would be just as unreasonable as the tree above.

	By contrast, using Lynx to render the very same document
	results in:

$ lynx --dump --stdin --force-html < example.html
   This
          is
          a
          perfectly
          valid
          HTML5
          document.
$ 

	The relevant part of the specification [1] is as follows.

    A dt element’s end tag may be omitted if the dt element is
    immediately followed by another dt element or a dd element.

    A dd element’s end tag may be omitted if the dd element is
    immediately followed by another dd element or a dt element, or if
    there is no more content in the parent element.
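
	For illustration only, the two rules quoted above can be
	sketched on top of Python's standard html.parser.  This is not
	how Libxml2 or Lynx do it internally; the names DlTreeBuilder,
	IMPLIED_END and parse are made up for this example, and it
	deliberately ignores every other optional-tag rule (as well as
	void elements such as br).

```python
from html.parser import HTMLParser

# Elements covered by the two quoted rules: an open dt or dd is ended
# implicitly when another dt or dd starts.  (Hypothetical name.)
IMPLIED_END = {"dt", "dd"}


class DlTreeBuilder(HTMLParser):
    """Build nested [tag, attrs, child...] lists, closing dt/dd implicitly."""

    def __init__(self):
        super().__init__()
        self.root = ["#root", []]
        self.stack = [self.root]  # currently open elements, innermost last

    def handle_starttag(self, tag, attrs):
        # A new dt or dd ends the dt or dd that is currently open,
        # so the new element becomes its sibling, not its child.
        if tag in IMPLIED_END and self.stack[-1][0] in IMPLIED_END:
            self.stack.pop()
        node = [tag, attrs]
        self.stack[-1].append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest matching open element and everything inside
        # it; this also ends a dd left open at the end of its parent.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        # Skip whitespace-only text between tags.
        if data.strip():
            self.stack[-1].append(data)


def parse(html):
    builder = DlTreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


if __name__ == "__main__":
    tree = parse("<dl><dt>This<dd>is<dd>a</dl>")
    dl = tree[2]  # tree is ["#root", [], dl]
    print([child[0] for child in dl[2:]])  # ['dt', 'dd', 'dd']
```

	With these rules applied, the dd elements come out as siblings
	under the dl instead of nesting inside one another.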

[1] http://www.w3.org/TR/html5/syntax.html#optional-tags

-- 
FSF associate member #7257  http://boycottsystemd.org/  … 3013 B6A0 230E 334A