unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#48334: No <title> elements in HTML manual pages
@ 2021-05-10 14:48 Maxim Nikulin
  2021-10-05 14:14 ` Maxim Nikulin
  0 siblings, 1 reply; 10+ messages in thread
From: Maxim Nikulin @ 2021-05-10 14:48 UTC (permalink / raw)
  To: 48334


HTML pages of the Emacs manual, e.g.
https://www.gnu.org/software/emacs/manual/html_node/elisp/Motion.html
do not have a <title> element. Open the page source in a browser, use
the inspector in the browser developer tools, or just fetch the page
with e.g. curl to see that the metadata in the <head> element is
rather scarce.
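
To check a saved copy of a page without a browser, something like the
following Emacs Lisp sketch could be used. The function name
`my-html-title` is made up for illustration, and the regexp is only a
rough check that ignores comments and unusual markup; fetching the page
is left to curl or a similar tool.

```elisp
(defun my-html-title (html)
  "Return the text of the first <title> element in the string HTML, or nil.
Hypothetical helper; a rough regexp check, not a real HTML parser."
  (let ((case-fold-search t))
    (when (string-match "<title[^>]*>\\([^<]*\\)</title>" html)
      (match-string 1 html))))

;; A head without <title>, similar to what the report describes:
(my-html-title "<head><meta charset=\"utf-8\"></head>")
;; => nil

;; A head with <title>:
(my-html-title "<head><title>(elisp) Motion</title></head>")
;; => "(elisp) Motion"
```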

As a result, the browser tab title is not informative. In the case of
Firefox it can be "google.com/url?q=http..." due to an intermediate
redirection and a bug in Firefox: https://bugzilla.mozilla.org/1401091
Even if Firefox did not have this bug, node names instead of URLs
in tab titles would provide a better user experience.

For this particular page, I would expect the <title> element content
to be something like
- "30.2 Motion (Emacs Lisp)"
- "(elisp) Motion"
- "30.2 Motion"

The Texinfo manual is not affected; its pages contain a reasonable
<title>, e.g.
https://www.gnu.org/software/texinfo/manual/texinfo/html_node/Generating-HTML.html
I hope it is enough to change some HTML export settings for the Emacs
manuals to improve the quality of the generated pages. However, I am not
familiar enough with Texinfo to say which options should be tuned.

The reason I use the HTML format of the Emacs manuals is that I do not
have enough experience with Emacs yet, so it is easier to find particular
sections using search engines that take relevance or even synonyms into
account. Docstrings for Emacs functions and variables rarely have
direct links to the Texinfo nodes in the manuals that provide a
higher-level overview or guide for related functionality.






* bug#48334: No <title> elements in HTML manual pages
  2021-05-10 14:48 bug#48334: No <title> elements in HTML manual pages Maxim Nikulin
@ 2021-10-05 14:14 ` Maxim Nikulin
  2022-07-02 16:19   ` Lars Ingebrigtsen
  0 siblings, 1 reply; 10+ messages in thread
From: Maxim Nikulin @ 2021-10-05 14:14 UTC (permalink / raw)
  To: 48334


> HTML pages of Emacs manual, e.g.
> https://www.gnu.org/software/emacs/manual/html_node/elisp/Motion.html
> do not have <title> element.
...
> Texinfo manual is not affected, its pages contains reasonable
> <title>, e.g.
> https://www.gnu.org/software/texinfo/manual/texinfo/html_node/Generating-HTML.html 

The Emacs manual is generated by texi2html, whereas the Texinfo manual
and e.g. the Org mode manual are generated by
     makeinfo --html ...
In the latter case pages have a <title> element; in the former they do
not (at least without some tuning).






* bug#48334: No <title> elements in HTML manual pages
  2021-10-05 14:14 ` Maxim Nikulin
@ 2022-07-02 16:19   ` Lars Ingebrigtsen
  2022-07-02 17:02     ` Eli Zaretskii
  0 siblings, 1 reply; 10+ messages in thread
From: Lars Ingebrigtsen @ 2022-07-02 16:19 UTC (permalink / raw)
  To: Maxim Nikulin; +Cc: 48334

Maxim Nikulin <m.a.nikulin@gmail.com> writes:

>> HTML pages of Emacs manual, e.g.
>> https://www.gnu.org/software/emacs/manual/html_node/elisp/Motion.html
>> do not have <title> element.
> ...
>> Texinfo manual is not affected, its pages contains reasonable
>> <title>, e.g.
>> https://www.gnu.org/software/texinfo/manual/texinfo/html_node/Generating-HTML.html
>
> Emacs manual is generated by texi2html, texinfo and e.g. Org mode by
>     makeinfo --html ...
> In the latter case pages have <title> element, in the former they do
> not (at least without some tuning).

(I'm going through old bug reports that unfortunately weren't resolved
at the time.)

These manuals still seem to be missing <title>s.  And texi2html has been
superseded by texi2any, which should be adding <title> elements
according to:

https://www.gnu.org/software/texinfo/manual/texinfo/html_node/HTML-Customization-Variables.html

Anybody know who's responsible for generating the HTML manuals?  

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#48334: No <title> elements in HTML manual pages
  2022-07-02 16:19   ` Lars Ingebrigtsen
@ 2022-07-02 17:02     ` Eli Zaretskii
  2022-07-03 12:16       ` Lars Ingebrigtsen
  0 siblings, 1 reply; 10+ messages in thread
From: Eli Zaretskii @ 2022-07-02 17:02 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 48334, m.a.nikulin

> Cc: 48334@debbugs.gnu.org
> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Sat, 02 Jul 2022 18:19:44 +0200
> 
> Maxim Nikulin <m.a.nikulin@gmail.com> writes:
> 
> >> HTML pages of Emacs manual, e.g.
> >> https://www.gnu.org/software/emacs/manual/html_node/elisp/Motion.html
> >> do not have <title> element.
> > ...
> >> Texinfo manual is not affected, its pages contains reasonable
> >> <title>, e.g.
> >> https://www.gnu.org/software/texinfo/manual/texinfo/html_node/Generating-HTML.html
> >
> > Emacs manual is generated by texi2html, texinfo and e.g. Org mode by
> >     makeinfo --html ...
> > In the latter case pages have <title> element, in the former they do
> > not (at least without some tuning).
> 
> (I'm going through old bug reports that unfortunately weren't resolved
> at the time.)
> 
> These manuals still seem to be missing <title>s.  And texi2html has been
> superseded by texi2any, which should be adding <title> elements
> according to:
> 
> https://www.gnu.org/software/texinfo/manual/texinfo/html_node/HTML-Customization-Variables.html
> 
> Anybody know who's responsible for generating the HTML manuals?  

We are.  See the instructions in admin/make-tarball.txt and the
scripts admin/make-manuals and admin/upload-manuals.

I don't remember if texi2any produces <title>, but the above scripts
modify the HTML produced by texi2any, so what we eventually have is
the result of those scripts.

We could decide to drop admin/make-manuals, or at least the parts
that modify the produced HTML, but presumably those parts were written
for a reason.  Unfortunately, I see no detailed documentation of the
reasons for those changes, so it's hard to decide whether any of them
are still valid, what with Texinfo's progress since the time those
changes were coded.






* bug#48334: No <title> elements in HTML manual pages
  2022-07-02 17:02     ` Eli Zaretskii
@ 2022-07-03 12:16       ` Lars Ingebrigtsen
  2022-07-03 13:13         ` Eli Zaretskii
  0 siblings, 1 reply; 10+ messages in thread
From: Lars Ingebrigtsen @ 2022-07-03 12:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 48334, m.a.nikulin

Eli Zaretskii <eliz@gnu.org> writes:

>> > Emacs manual is generated by texi2html, texinfo and e.g. Org mode by
>> >     makeinfo --html ...
>> > In the latter case pages have <title> element, in the former they do
>> > not (at least without some tuning).

[...]

> We are.  See the instructions in admin/make-tarball.txt and the
> scripts admin/make-manuals and admin/upload-manuals.
>
> I don't remember if texi2any produces <title>, but the above scripts
> modify the HTML produced by texi2any, so what we eventually have is
> the result of those scripts.

Hm...  it looks like the manuals are produced with "makeinfo --html",
though -- I can't see any usage of texi2html or texi2any there, but I
may be missing something.

> We could decide dropping admin/make-manuals, or at least the parts
> that modify the produced HTML, but presumably those parts were written
> for a reason.  Unfortunately, I see no detailed documentation of the
> reasons for those changes, so it's hard to decide whether any of them
> are still valid, what with Texinfo's progress since the time those
> changes were coded.

Ah, it's this code:

(defun manual-html-fix-headers ()
  "Fix up HTML headers for the Emacs manual in the current buffer."
  (let ((texi5 (search-forward "<!DOCTYPE" nil t))
	opoint)

[...]

    (search-forward "<meta")
    (setq opoint (match-beginning 0))
    (unless texi5
      (search-forward "<!--")
      (goto-char (match-beginning 0))
      (delete-region opoint (point))
      (search-forward "<meta http-equiv=\"Content-Style")
      (setq opoint (match-beginning 0)))
    (search-forward "</title>\n")
    (delete-region opoint (point))

So we delete the <title> that makeinfo --html has created.  Perhaps
that's just a bug?  I see that you adjusted this code in May...

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#48334: No <title> elements in HTML manual pages
  2022-07-03 12:16       ` Lars Ingebrigtsen
@ 2022-07-03 13:13         ` Eli Zaretskii
  2022-07-03 14:48           ` Max Nikulin
  2022-07-04 10:42           ` Lars Ingebrigtsen
  0 siblings, 2 replies; 10+ messages in thread
From: Eli Zaretskii @ 2022-07-03 13:13 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 48334, m.a.nikulin

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: m.a.nikulin@gmail.com,  48334@debbugs.gnu.org
> Date: Sun, 03 Jul 2022 14:16:27 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > I don't remember if texi2any produces <title>, but the above scripts
> > modify the HTML produced by texi2any, so what we eventually have is
> > the result of those scripts.
> 
> Hm...  it looks like the manuals are produced with "makeinfo --html",
> though -- I can't see any usage of texi2html or texi2any there, but I
> may be missing something.

makeinfo is supposed to be a symlink to texi2any.

> Ah, it's this code:
> 
> (defun manual-html-fix-headers ()
>   "Fix up HTML headers for the Emacs manual in the current buffer."
>   (let ((texi5 (search-forward "<!DOCTYPE" nil t))
> 	opoint)
> 
> [...]
> 
>     (search-forward "<meta")
>     (setq opoint (match-beginning 0))
>     (unless texi5
>       (search-forward "<!--")
>       (goto-char (match-beginning 0))
>       (delete-region opoint (point))
>       (search-forward "<meta http-equiv=\"Content-Style")
>       (setq opoint (match-beginning 0)))
>     (search-forward "</title>\n")
>     (delete-region opoint (point))

Yes.  (But that's not the only editing we do, although the rest isn't
relevant to <title>, I think.)

> So we delete the <title> that makeinfo --html has created.  Perhaps
> that's just a bug?

It is definitely done on purpose, but I don't know what the purpose of
deleting <title> (and many other parts of the headers as well) is.

> I see that you adjusted this code in May...

I made changes there because someone reported a problem with reading
the manuals on mobile devices, because we were deleting the line with
'<meta name="viewport"...', which in the latest Texinfo takes care of
adjusting the viewport to the width of the device display.  My changes
were supposed to avoid deletion of this header (and a few others), but
I don't think I kept <title>.

I think the solution to this is for some HTML5 expert to look at our
edits vs what Texinfo 6.8 produces, and tell which parts of the
editing are needed (and why) and which aren't.  I'm far from being
that expert.

Failing that, I think the only alternative is to see how the original
Texinfo output looks in a browser, compare that with the edited
manuals, and then decide which of the edits are really needed.  One
problem with that is that we'll probably have to require Texinfo 6.8
or later if we go that way, because maintaining compatibility with
multiple Texinfo versions is really too much.  Ideally, we should keep
the edits to the absolute minimum.






* bug#48334: No <title> elements in HTML manual pages
  2022-07-03 13:13         ` Eli Zaretskii
@ 2022-07-03 14:48           ` Max Nikulin
  2022-07-04 10:42           ` Lars Ingebrigtsen
  1 sibling, 0 replies; 10+ messages in thread
From: Max Nikulin @ 2022-07-03 14:48 UTC (permalink / raw)
  To: Eli Zaretskii, Lars Ingebrigtsen; +Cc: 48334

On 03/07/2022 20:13, Eli Zaretskii wrote:
>> From: Lars Ingebrigtsen
>> Date: Sun, 03 Jul 2022 14:16:27 +0200
>>        (setq opoint (match-beginning 0)))
>>      (search-forward "</title>\n")
>>      (delete-region opoint (point))
> 
> Yes.  (But that's not the only editing we do, although the rest isn't
> relevant to <title>, I think.)

Deleting the text only up to "<title>" should be a rather local change.
Until May, the region up to "</head>" was removed.

By the way, is there a reason why the DC.title meta is set to gnu.org
rather than to the title of the current node, or at least of the manual?
I am not familiar with Dublin Core, but I expect it is rich enough to
express both, and gnu.org as well.






* bug#48334: No <title> elements in HTML manual pages
  2022-07-03 13:13         ` Eli Zaretskii
  2022-07-03 14:48           ` Max Nikulin
@ 2022-07-04 10:42           ` Lars Ingebrigtsen
  2022-07-04 11:36             ` Eli Zaretskii
  1 sibling, 1 reply; 10+ messages in thread
From: Lars Ingebrigtsen @ 2022-07-04 10:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 48334, m.a.nikulin

Eli Zaretskii <eliz@gnu.org> writes:

> makeinfo is supposed to be a symlink to texi2any.

Yes, indeed.

> I made changes there because someone reported a problem with reading
> the manuals on mobile devices, because we were deleting the line with
> '<meta name="viewport"...', which in latest Texinfo takes care of
> adjusting the viewport to the width of the device display.  My changes
> were supposed to avoid deletion of this header (and a few others), but
> I don't think I kept <title>.

I tried running the code now (and commented out the
manual-html-fix-headers function), and I ended up with:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- Created by GNU Texinfo 6.8, https://www.gnu.org/software/texinfo/ -->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<!-- This file describes the Emacs auth-source library.

Copyright (C) 2008-2022 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover Texts being "A GNU Manual,"
and with the Back-Cover Texts as in (a) below.  A copy of the license
is included in the section entitled "GNU Free Documentation License".

(a) The FSF's Back-Cover Text is: "You have the freedom to copy and
modify this GNU manual." -->
<title>Emacs auth-source Library 0.3</title>

This is with texi2any (GNU texinfo) 6.8.  If I'm reading the code right,
the delete-region here is just deleting that <meta, the comment, and the
<title>.

It's probably different in every texinfo version, but altering the

    (search-forward "</title>\n")

to

    (search-forward "<title>")

should be safe in any case, so I'll go ahead and do that.
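
The effect of that change can be tried on a small sample. The sketch
below is not the actual manual-html-fix-headers code;
`my-strip-before-title` and the sample string are made up for
illustration. One detail it assumes: `search-forward` leaves point
after the match, so the sketch moves back to `(match-beginning 0)`
before deleting. Otherwise the opening <title> tag would be deleted
too. Whether the committed change does the same is not stated in this
thread.

```elisp
(defun my-strip-before-title (html)
  "Return HTML with the text from the first \"<meta\" up to \"<title>\" removed.
The <title> element itself is kept.  Hypothetical helper for illustration."
  (with-temp-buffer
    (insert html)
    (goto-char (point-min))
    (search-forward "<meta")
    (let ((opoint (match-beginning 0)))
      (search-forward "<title>")
      ;; `search-forward' leaves point after the match; step back so the
      ;; opening tag survives the deletion.
      (goto-char (match-beginning 0))
      (delete-region opoint (point)))
    (buffer-string)))

(my-strip-before-title
 (concat "<head>\n"
         "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">\n"
         "<!-- license text -->\n"
         "<title>Emacs auth-source Library 0.3</title>\n"
         "</head>\n"))
;; => "<head>\n<title>Emacs auth-source Library 0.3</title>\n</head>\n"
```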

> Failing that, I think the only alternative is to see how the original
> Texinfo output looks in a browser, compare that with the edited
> manuals, and then decide which of the edits are really needed.  One
> problem with that is that we'll probably have to require Texinfo 6.8
> or later if we go that way, because maintaining compatibility with
> multiple Texinfo versions is really too much.  Ideally, we should keep
> the edits to the absolute minimum.

I think altering the HTML in this way isn't ideal.  It'd be much better
to just parse the HTML, alter the DOM (to remove/insert elements), and
then write the DOM out to HTML again.  That'd be a whole lot less
brittle.
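
As a rough illustration of the DOM approach: the sketch below works on
a hand-written DOM in the list format used by dom.el, so it runs
without libxml. In practice the DOM would presumably come from
`libxml-parse-html-region` and be written back to HTML afterwards;
both steps are left out here. The names `my-clean-head` and
`my-sample-dom`, and the choice of which nodes to remove, are
assumptions for illustration only.

```elisp
(require 'dom)

(defun my-clean-head (dom)
  "Remove comments and <meta http-equiv=...> elements from the <head> of DOM.
The <title> element is kept.  DOM is modified in place and returned.
Hypothetical helper; which nodes to drop is an assumption."
  (let ((head (car (dom-by-tag dom 'head))))
    ;; Iterate over a copy, since `dom-remove-node' alters the child list.
    (dolist (child (copy-sequence (dom-children head)))
      (when (and (consp child)
                 (or (eq (dom-tag child) 'comment)
                     (and (eq (dom-tag child) 'meta)
                          (dom-attr child 'http-equiv))))
        (dom-remove-node head child)))
    dom))

(defvar my-sample-dom
  (list 'html nil
        (list 'head nil
              (list 'meta (list (cons 'http-equiv "Content-Type")))
              (list 'comment nil " license text ")
              (list 'title nil "Emacs auth-source Library 0.3")))
  "Hand-written DOM resembling the Texinfo 6.8 output quoted above.")

(my-clean-head my-sample-dom)
;; => (html nil (head nil (title nil "Emacs auth-source Library 0.3")))
```

Because elements are removed as nodes rather than as text regions, the
result does not depend on the order or line layout of the header that
a particular Texinfo version emits.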

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#48334: No <title> elements in HTML manual pages
  2022-07-04 10:42           ` Lars Ingebrigtsen
@ 2022-07-04 11:36             ` Eli Zaretskii
  2022-07-05 11:09               ` Lars Ingebrigtsen
  0 siblings, 1 reply; 10+ messages in thread
From: Eli Zaretskii @ 2022-07-04 11:36 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 48334, m.a.nikulin

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: m.a.nikulin@gmail.com,  48334@debbugs.gnu.org
> Date: Mon, 04 Jul 2022 12:42:42 +0200
> 
> I tried running the code now (and commented out the
> manual-html-fix-headers function), and I ended up with:
> 
> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
> <html>
> <!-- Created by GNU Texinfo 6.8, https://www.gnu.org/software/texinfo/ -->
> <head>
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
> <!-- This file describes the Emacs auth-source library.
> 
> Copyright (C) 2008-2022 Free Software Foundation, Inc.
> 
> Permission is granted to copy, distribute and/or modify this document
> under the terms of the GNU Free Documentation License, Version 1.3 or
> any later version published by the Free Software Foundation; with no
> Invariant Sections, with the Front-Cover Texts being "A GNU Manual,"
> and with the Back-Cover Texts as in (a) below.  A copy of the license
> is included in the section entitled "GNU Free Documentation License".
> 
> (a) The FSF's Back-Cover Text is: "You have the freedom to copy and
> modify this GNU manual." -->
> <title>Emacs auth-source Library 0.3</title>
> 
> This is with texi2any (GNU texinfo) 6.8.  If I'm reading the code right,
> the delete-region here is just deleting that <meta, the comment, and the
> <title>.

That's strange, because I remember testing the changes, and I also
used Texinfo 6.8.  Did you compare the produced HTML with what's on
the Web site?  That should show the differences clearly.  Also, I
think the title (and the file I mostly worked on) is index.html -- did
you look at that, or did you look at some other file?

> > Failing that, I think the only alternative is to see how the original
> > Texinfo output looks in a browser, compare that with the edited
> > manuals, and then decide which of the edits are really needed.  One
> > problem with that is that we'll probably have to require Texinfo 6.8
> > or later if we go that way, because maintaining compatibility with
> > multiple Texinfo versions is really too much.  Ideally, we should keep
> > the edits to the absolute minimum.
> 
> I think altering the HTML in this way isn't ideal.  It'd be much better
> to just parse the HTML, alter the DOM (to remove/insert elements), and
> then write the DOM out to HTML again.  That'd be a whole lot less
> brittle.

That's fine with me, but that, too, assumes someone who can understand
the resulting DOM, and which of its parts we want to change and why.






* bug#48334: No <title> elements in HTML manual pages
  2022-07-04 11:36             ` Eli Zaretskii
@ 2022-07-05 11:09               ` Lars Ingebrigtsen
  0 siblings, 0 replies; 10+ messages in thread
From: Lars Ingebrigtsen @ 2022-07-05 11:09 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 48334, m.a.nikulin

Eli Zaretskii <eliz@gnu.org> writes:

> That's strange, because I remember testing the changes, and I also
> used Texinfo 6.8.  Did you compare the produced HTML with what's on
> the Web site?  That should show the differences clearly.  Also, I
> think the title (and the file I worked mostly) is index.html -- did
> you look at that, or did you look at some other file?

I looked at the auth-source mono version of the HTML mainly.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





