unofficial mirror of emacs-devel@gnu.org 
* Texinfo XML support in Emacs Info browser
@ 2007-06-02 23:03 Juri Linkov
  2007-06-03  9:39 ` Thien-Thi Nguyen
  0 siblings, 1 reply; 9+ messages in thread
From: Juri Linkov @ 2007-06-02 23:03 UTC (permalink / raw)
  To: emacs-devel; +Cc: karl

Here is the initial version of an Emacs package that provides support
for the Texinfo XML format in the Emacs Info browser.  This format is
generated by the command `makeinfo --xml', which maps Texinfo markup
commands into XML syntax.

The current implementation uses a lot of defadvices on basic info.el
functions.  This was done to add the new format seamlessly while keeping
the old Info API for the new format.  Later these functions could be
modified to allow switching between different Info formats natively.

I'd like to hear all comments about whether this is the right direction
and so on.  Thanks.


[-- Attachment #2: info-xml.el --]
[-- Type: application/emacs-lisp, Size: 12135 bytes --]

-- 
Juri Linkov
http://www.jurta.org/emacs/


* Re: Texinfo XML support in Emacs Info browser
  2007-06-02 23:03 Texinfo XML support in Emacs Info browser Juri Linkov
@ 2007-06-03  9:39 ` Thien-Thi Nguyen
  2007-06-03  9:59   ` Juri Linkov
  0 siblings, 1 reply; 9+ messages in thread
From: Thien-Thi Nguyen @ 2007-06-03  9:39 UTC (permalink / raw)
  To: Juri Linkov; +Cc: karl, emacs-devel

() Juri Linkov <juri@jurta.org>
() Sun, 03 Jun 2007 02:03:03 +0300

   [info-xml.el]

interesting.  i had to modify `Info-xml-select-node' like so:

	;; Add a new unique history item to full history list
	(let ((new-history (list Info-current-file Info-current-node)))
	  (setq Info-history-list
		(cons new-history (delete new-history Info-history-list)))
	  (setq Info-history-forward nil))

i think an ad-hoc regexp-based approach is likely to be troublesome in the long
run.  so, question: since we have xml.el, why not build a tree immediately?
one answer is that: well, makeinfo --xml output is not always valid.  :-(

e.g.: makeinfo --xml -o edb.info.xml edb.texi
      (xml-parse-file "edb.info.xml")
      => error 

(in edb.info.xml, element `detailedmenu' is not properly nested.  this is
seen using "makeinfo (GNU texinfo) 4.8".)
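
The failure described above can be caught so that a caller may fall back
to another strategy.  A minimal sketch, assuming a hypothetical helper
named `my-texinfo-xml-parse-or-nil' (it is not part of info-xml.el); it
relies only on `xml-parse-file' from xml.el signaling an error on
mis-nested input:

```elisp
(require 'xml)

;; Hypothetical helper, for illustration only.
(defun my-texinfo-xml-parse-or-nil (file)
  "Return the tree built by `xml-parse-file' for FILE.
Return nil instead of signaling if FILE cannot be parsed."
  (condition-case err
      (xml-parse-file file)
    (error
     ;; Report why parsing failed, then let the caller decide what to do.
     (message "Cannot parse %s: %s" file (error-message-string err))
     nil)))
```

A caller could then try the tree-based code path first and use a
regexp-based path only when this returns nil.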

thi

* Re: Texinfo XML support in Emacs Info browser
  2007-06-03  9:39 ` Thien-Thi Nguyen
@ 2007-06-03  9:59   ` Juri Linkov
  2007-06-03 21:23     ` Karl Berry
  2007-06-03 21:27     ` Richard Stallman
  0 siblings, 2 replies; 9+ messages in thread
From: Juri Linkov @ 2007-06-03  9:59 UTC (permalink / raw)
  To: ttn; +Cc: karl, emacs-devel

> () Juri Linkov <juri@jurta.org>
> () Sun, 03 Jun 2007 02:03:03 +0300
>
>    [info-xml.el]
>
> interesting.  i had to modify `Info-xml-select-node' like so:
>
> 	;; Add a new unique history item to full history list
> 	(let ((new-history (list Info-current-file Info-current-node)))
> 	  (setq Info-history-list
> 		(cons new-history (delete new-history Info-history-list)))
> 	  (setq Info-history-forward nil))

Sorry, this part contained function calls for speed optimization of
displaying large index nodes.  I'll post a separate patch for info.el.

> i think an ad-hoc regexp-based approach is likely to be troublesome in the long
> run.  so, question: since we have xml.el, why not build a tree immediately?
> one answer is that: well, makeinfo --xml output is not always valid.  :-(
>
> e.g.: makeinfo --xml -o edb.info.xml edb.texi
>       (xml-parse-file "edb.info.xml")
>       => error
>
> (in edb.info.xml, element `detailedmenu' is not properly nested.  this is
> seen using "makeinfo (GNU texinfo) 4.8".)

This is the exact reason why I leaned toward using regexps instead of
xml-parse-file.  In my first attempt to use xml-parse-file I discovered
that `makeinfo --xml' sometimes produces non-well-formed XML, so we can't
parse it with xml-parse-file.  Even if this is fixed in future releases
of makeinfo, old versions still produce invalid XML output.

With a regexp-based approach, the Info browser would be more permissive
toward invalid XML, in the same way that most HTML browsers are permissive
toward invalid markup on HTML pages.  It can simply ignore invalid
elements, either removing them or leaving them on display, which does
little harm.

-- 
Juri Linkov
http://www.jurta.org/emacs/

* Re: Texinfo XML support in Emacs Info browser
  2007-06-03  9:59   ` Juri Linkov
@ 2007-06-03 21:23     ` Karl Berry
  2007-06-03 22:47       ` Juri Linkov
  2007-06-03 23:26       ` Thien-Thi Nguyen
  2007-06-03 21:27     ` Richard Stallman
  1 sibling, 2 replies; 9+ messages in thread
From: Karl Berry @ 2007-06-03 21:23 UTC (permalink / raw)
  To: juri; +Cc: ttn, emacs-devel

    ttn> (in edb.info.xml, element `detailedmenu' is not properly nested.  

If bugs get reported, they will be fixed.  Can you send me the Texinfo
file please?  I don't have edb.texi on my system.
    
    juri> Even if this is fixed in future releases
    of makeinfo, old versions still produce invalid XML output.

But meanwhile, nothing now in Emacs is using the "TexinfoML" output
(right?), so the fact that old makeinfo versions do the wrong thing
doesn't matter.

    With a regexp-based approach, the Info browser would be more
    permissive toward invalid XML

How about fixing xml-parse-file so that it can be "permissive"?
It seems rather strange to use regexps when something was developed
specifically to parse XML files.  Not that it's my call.

Best,
karl

* Re: Texinfo XML support in Emacs Info browser
  2007-06-03  9:59   ` Juri Linkov
  2007-06-03 21:23     ` Karl Berry
@ 2007-06-03 21:27     ` Richard Stallman
  1 sibling, 0 replies; 9+ messages in thread
From: Richard Stallman @ 2007-06-03 21:27 UTC (permalink / raw)
  To: Juri Linkov; +Cc: karl, ttn, emacs-devel

    This is the exact reason why I leaned toward using regexps instead of
    xml-parse-file.  In my first attempt to use xml-parse-file I discovered
    that `makeinfo --xml' sometimes produces non-well-formed XML, so we can't
    parse it with xml-parse-file.  Even if this is fixed in future releases
    of makeinfo, old versions still produce invalid XML output.

I don't think we need to cater to the buggy makeinfo.  This is a
long-term issue, and in the long term we can fix makeinfo.

That doesn't mean your regexp-based approach is bad.

Would you please report the makeinfo bugs to bug-texinfo?
That is very important.

* Re: Texinfo XML support in Emacs Info browser
  2007-06-03 21:23     ` Karl Berry
@ 2007-06-03 22:47       ` Juri Linkov
  2007-06-03 23:45         ` Drew Adams
  2007-06-03 23:26       ` Thien-Thi Nguyen
  1 sibling, 1 reply; 9+ messages in thread
From: Juri Linkov @ 2007-06-03 22:47 UTC (permalink / raw)
  To: Karl Berry; +Cc: ttn, emacs-devel

> How about fixing xml-parse-file so that it can be "permissive"?
> It seems rather strange to use regexps when something was developed
> specifically to parse XML files.  Not that it's my call.

I like the xml-parse-file approach, but I doubt that xml-parse-file can be
fixed to parse non-well-formed XML.

Once the bugs in makeinfo are fixed and its Texinfo XML output is
well-formed, we could rely completely on the correctness of that output
and use xml-parse-file.

Using XML structures from xml-parse-file is a very different approach from
using text-based Info files and poses many interesting problems.  For
example, how to search manuals for a regexp.  One solution is to traverse
the XML tree and match its text elements.  But what to do with text split
between different XML elements?  Another solution is to search for a regexp
in the rendered text of all nodes.
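
The split-text problem can be made concrete with a small sketch.  It
assumes the list representation that xml.el produces, (NAME ATTRS
CHILDREN...), where each child is either a string or another node; the
function names are hypothetical.  Flattening a subtree to one string
before matching lets a regexp match text that spans several elements:

```elisp
(require 'xml)

;; Hypothetical helpers, for illustration only.
(defun my-xml-node-text (node)
  "Return the concatenated text of NODE and all of its descendants."
  (if (stringp node)
      node
    (mapconcat #'my-xml-node-text (xml-node-children node) "")))

(defun my-xml-node-matches-p (regexp node)
  "Return non-nil if REGEXP matches the flattened text of NODE."
  (string-match-p regexp (my-xml-node-text node)))
```

For a node such as (para nil "Use " (code nil "xml-parse") "-file here"),
no single string child contains "xml-parse-file", but the flattened text
does.  Mapping a match position back to a place in the tree is left open
here.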

-- 
Juri Linkov
http://www.jurta.org/emacs/

* Re: Texinfo XML support in Emacs Info browser
  2007-06-03 21:23     ` Karl Berry
  2007-06-03 22:47       ` Juri Linkov
@ 2007-06-03 23:26       ` Thien-Thi Nguyen
  1 sibling, 0 replies; 9+ messages in thread
From: Thien-Thi Nguyen @ 2007-06-03 23:26 UTC (permalink / raw)
  To: Karl Berry; +Cc: juri, emacs-devel

() karl@freefriends.org (Karl Berry)
() Sun, 3 Jun 2007 16:23:07 -0500

   If bugs get reported, they will be fixed.

i have sent a test case that reproduces the problem to bug-texinfo.

thi

* RE: Texinfo XML support in Emacs Info browser
  2007-06-03 22:47       ` Juri Linkov
@ 2007-06-03 23:45         ` Drew Adams
  2007-06-06  0:43           ` Stefan Monnier
  0 siblings, 1 reply; 9+ messages in thread
From: Drew Adams @ 2007-06-03 23:45 UTC (permalink / raw)
  To: Juri Linkov, emacs-devel

> Using XML structures from xml-parse-file is a very different
> approach from using text-based Info files and poses many
> interesting problems.  For example, how to search manuals
> for a regexp.  One solution is to traverse the XML tree
> and match its text elements.  But what to do with text split
> between different XML elements?  Another solution is to
> search for a regexp in the rendered text of all nodes.

Search is specific to the medium at hand. So, yes, it would be appropriate
to search the _rendered_ text, however it might be rendered. For example,
for PDF output, a PDF reader's search function could be quite different from
Info's. Likewise, for XHTML, a browser's search could be different from
Info's.

I doubt there would be much call for searching the underlying XML, but, if
there were, then XQuery or XPath expressions would be appropriate. I see no
sense in trying to regexp search across XML nodes, with the possible
exception of searching only the text() nodes (as you mentioned).

* Re: Texinfo XML support in Emacs Info browser
  2007-06-03 23:45         ` Drew Adams
@ 2007-06-06  0:43           ` Stefan Monnier
  0 siblings, 0 replies; 9+ messages in thread
From: Stefan Monnier @ 2007-06-06  0:43 UTC (permalink / raw)
  To: Drew Adams; +Cc: Juri Linkov, emacs-devel

> Search is specific to the medium at hand. So, yes, it would be appropriate
> to search the _rendered_ text, however it might be rendered.  For example,

In theory, I'd agree.  But it might have a very significant performance
impact, so maybe searching the XML text representation, although suboptimal,
might be better off for speed reasons (although it's unclear whether it can
be made to work easily).


        Stefan
