unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#23647: 25.1.50; In man pages, links on hyphenated words don't work
From: Stephen Berman @ 2016-05-29  9:52 UTC
  To: 23647

0. emacs -Q
1. Open a man page that has a link on a hyphenated word, e.g. on my
   system: M-x man RET signal RET, put point on the word spanning lines
   129-130, which is displayed as `sig-
   nalfd(2)'.
2. Type RET (or click mouse-1 or mouse-2) on that link.
=> The error message "Can’t find the 2 sig-nalfd manpage" is displayed.

The following patch makes the link DTRT:

diff --git a/lisp/man.el b/lisp/man.el
index 5acf90b..5d4cacc 100644
--- a/lisp/man.el
+++ b/lisp/man.el
@@ -1430,8 +1430,14 @@ Man-bgproc-sentinel
 			(quit-restore-window
 			 (get-buffer-window (current-buffer) t) 'kill)
 		      (kill-buffer (current-buffer)))
-		    (message "Can't find the %s manpage"
-			     (Man-page-from-arguments args)))
+                    ;; Entries hyphenated due to the window width
+                    ;; won't be found in the man database, so remove
+                    ;; the hyphenation and look again.
+		    (if (string-match "-" args)
+			(let ((str (replace-match "" nil nil args)))
+			  (Man-getpage-in-background str))
+                      (message "Can't find the %s manpage"
+                               (Man-page-from-arguments args))))
 
 		(if Man-fontify-manpage-flag
 		    (message "%s man page formatted"
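
The retry decision made by this patch can be tried in isolation.  The
following is a minimal sketch, not the code in man.el: the helper
my-man-retry-topic is a hypothetical name, and it only applies the same
string-match/replace-match pair to a plain string, so no man process is
involved.

```elisp
;; Hypothetical helper, not part of man.el.  It mirrors the test the
;; patched sentinel performs before looking the page up again.
(defun my-man-retry-topic (args)
  "Return ARGS with its first ASCII hyphen removed, or nil if it has none."
  (when (string-match "-" args)
    ;; With a STRING argument, `replace-match' returns a new string
    ;; instead of editing a buffer.
    (replace-match "" nil nil args)))

(my-man-retry-topic "2 sig-nalfd")  ; => "2 signalfd"
(my-man-retry-topic "2 signal")     ; => nil
```

Note that only the first hyphen is removed per call, and that the
regexp covers only the ASCII hyphen-minus.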

This is a long-standing bug (presumably since commit
162a12b1d7b1e985a8810bad24d068c825286f56 of Sep 13 2007), but although
the fix seems safe, I suppose it's too late for emacs-25.  So if there
are no objections, should I commit it to master, or is it ok for the
upcoming release?


In GNU Emacs 25.1.50.19 (x86_64-suse-linux-gnu, GTK+ Version 3.14.15)
 of 2016-05-28 built on rosalinde
Repository revision: 4ef0fc192b8a10625053dbb9376c814e68612eb6
Windowing system distributor 'The X.Org Foundation', version 11.0.11601000
System Description:	openSUSE 13.2 (Harlequin) (x86_64)

Configured using:
 'configure --with-xwidgets 'CFLAGS=-Og -g3''

Configured features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND DBUS GCONF GSETTINGS NOTIFY
GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB TOOLKIT_SCROLL_BARS
GTK3 X11 XWIDGETS

Important settings:
  value of $LANG: en_US.UTF-8
  value of $XMODIFIERS: @im=ibus
  locale-coding-system: utf-8-unix






From: Eli Zaretskii @ 2016-05-29 14:42 UTC
  To: Stephen Berman; +Cc: 23647

> From: Stephen Berman <stephen.berman@gmx.net>
> Date: Sun, 29 May 2016 11:52:29 +0200
> 
> 0. emacs -Q
> 1. Open a man page that has a link on a hyphenated word, e.g. on my
>    system: M-x man RET signal RET, put point on the word spanning lines
>    129-130, which is displayed as `sig-
>    nalfd(2)'.
> 2. Type RET (or click mouse-1 or mouse-2) on that link.
> => The error message "Can’t find the 2 sig-nalfd manpage" is displayed.
> 
> The following patch makes the link DTRT:
> 
> diff --git a/lisp/man.el b/lisp/man.el
> index 5acf90b..5d4cacc 100644
> --- a/lisp/man.el
> +++ b/lisp/man.el
> @@ -1430,8 +1430,14 @@ Man-bgproc-sentinel
>  			(quit-restore-window
>  			 (get-buffer-window (current-buffer) t) 'kill)
>  		      (kill-buffer (current-buffer)))
> -		    (message "Can't find the %s manpage"
> -			     (Man-page-from-arguments args)))
> +                    ;; Entries hyphenated due to the window width
> +                    ;; won't be found in the man database, so remove
> +                    ;; the hyphenation and look again.
> +		    (if (string-match "-" args)

Is it only the ASCII hyphen/minus, or could there be other characters
(e.g., if Groff/troff are invoked with some exotic -Tfoo switch)?

> This is a long-standing bug (presumably since commit
> 162a12b1d7b1e985a8810bad24d068c825286f56 of Sep 13 2007), but although
> the fix seems safe, I suppose it's too late for emacs-25.  So if there
> are no objections, should I commit it to master, or is it ok for the
> upcoming release?

Master, please.

Thanks.






From: Stephen Berman @ 2016-05-29 23:09 UTC
  To: Eli Zaretskii; +Cc: 23647

On Sun, 29 May 2016 17:42:13 +0300 Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Stephen Berman <stephen.berman@gmx.net>
>> Date: Sun, 29 May 2016 11:52:29 +0200
>> 
>> 0. emacs -Q
>> 1. Open a man page that has a link on a hyphenated word, e.g. on my
>>    system: M-x man RET signal RET, put point on the word spanning lines
>>    129-130, which is displayed as `sig-
>>    nalfd(2)'.
>> 2. Type RET (or click mouse-1 or mouse-2) on that link.
>> => The error message "Can’t find the 2 sig-nalfd manpage" is displayed.
>> 
>> The following patch makes the link DTRT:
>> 
>> diff --git a/lisp/man.el b/lisp/man.el
>> index 5acf90b..5d4cacc 100644
>> --- a/lisp/man.el
>> +++ b/lisp/man.el
>> @@ -1430,8 +1430,14 @@ Man-bgproc-sentinel
>>  			(quit-restore-window
>>  			 (get-buffer-window (current-buffer) t) 'kill)
>>  		      (kill-buffer (current-buffer)))
>> -		    (message "Can't find the %s manpage"
>> -			     (Man-page-from-arguments args)))
>> +                    ;; Entries hyphenated due to the window width
>> +                    ;; won't be found in the man database, so remove
>> +                    ;; the hyphenation and look again.
>> +		    (if (string-match "-" args)
>
> Is it only the ASCII hyphen/minus, or could there be other characters
> (e.g., if Groff/troff are invoked with some exotic -Tfoo switch)?

That possibility didn't occur to me but according to Wikipedia, groff
also outputs soft hyphens (octal 255) and indeed I see that the function
Man-build-references-alist, which also removes hyphenation (in a more
complicated way that doesn't seem to be needed in the present case),
also takes the soft hyphen into account.  That can be done here too by
changing the above string-match regexp to "[-­]".  If someone knows of
other possibilities allowed by [gt]roff, maybe the regexp could be
further extended, or the condition reformulated as required.  What do
you think?
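
To make the proposed character class easier to read, here is a minimal
sketch with the soft hyphen written as the escape \u00AD, so that it is
visible in the source.  The names my-man-soft-hyphen-regexp and
my-man-has-hyphen-p are hypothetical and exist only for illustration;
they are not taken from man.el.

```elisp
;; Hypothetical names, for illustration only.
(defconst my-man-soft-hyphen-regexp "[-\u00AD]"
  "Regexp matching ASCII hyphen-minus (U+002D) or soft hyphen (U+00AD).
The hyphen-minus comes first in the bracket expression, so it is taken
literally rather than as a range operator.")

(defun my-man-has-hyphen-p (topic)
  "Return t if TOPIC contains a hyphen-minus or a soft hyphen."
  (and (string-match-p my-man-soft-hyphen-regexp topic) t))

(my-man-has-hyphen-p "sig-nalfd")        ; => t
(my-man-has-hyphen-p "sig\u00ADnalfd")   ; => t
(my-man-has-hyphen-p "signalfd")         ; => nil
```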

>> This is a long-standing bug (presumably since commit
>> 162a12b1d7b1e985a8810bad24d068c825286f56 of Sep 13 2007), but although
>> the fix seems safe, I suppose it's too late for emacs-25.  So if there
>> are no objections, should I commit it to master, or is it ok for the
>> upcoming release?
>
> Master, please.

Ok.  I'll wait another day or two in case there's more feedback.
Thanks.

Steve Berman






From: Eli Zaretskii @ 2016-05-30  0:22 UTC
  To: Stephen Berman; +Cc: 23647

> From: Stephen Berman <stephen.berman@gmx.net>
> Cc: 23647@debbugs.gnu.org
> Date: Mon, 30 May 2016 01:09:21 +0200
> 
> > Is it only the ASCII hyphen/minus, or could there be other characters
> > (e.g., if Groff/troff are invoked with some exotic -Tfoo switch)?
> 
> That possibility didn't occur to me but according to Wikipedia, groff
> also outputs soft hyphens (octal 255) and indeed I see that the function
> Man-build-references-alist, which also removes hyphenation (in a more
> complicated way that doesn't seem to be needed in the present case),
> also takes the soft hyphen into account.  That can be done here too by
> changing the above string-match regexp to "[-­]".  If someone knows of
> other possibilities allowed by [gt]roff, maybe the regexp could be
> further extended, or the condition reformulated as required.  What do
> you think?

I'm not enough of a roff expert to tell, but how about asking on the
Groff list?






From: Stephen Berman @ 2016-05-30 13:55 UTC
  To: Eli Zaretskii; +Cc: 23647

On Mon, 30 May 2016 03:22:58 +0300 Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Stephen Berman <stephen.berman@gmx.net>
>> Cc: 23647@debbugs.gnu.org
>> Date: Mon, 30 May 2016 01:09:21 +0200
>> 
>> > Is it only the ASCII hyphen/minus, or could there be other characters
>> > (e.g., if Groff/troff are invoked with some exotic -Tfoo switch)?
>> 
>> That possibility didn't occur to me but according to Wikipedia, groff
>> also outputs soft hyphens (octal 255) and indeed I see that the function
>> Man-build-references-alist, which also removes hyphenation (in a more
>> complicated way that doesn't seem to be needed in the present case),
>> also takes the soft hyphen into account.  That can be done here too by
>> changing the above string-match regexp to "[-­]".  If someone knows of
>> other possibilities allowed by [gt]roff, maybe the regexp could be
>> further extended, or the condition reformulated as required.  What do
>> you think?
>
> I'm not enough of a roff expert to tell, but how about asking on the
> Groff list?

I did that and got this feedback from Steffen Nurpmeso:

> I have been convinced that soft hyphen is a control character and
> not something visual, it should be used as a «break-indicator»
> rather than as a hyphenation character, interpretation of which is
> left as an exercise for the processing software.  I have no idea
> still but would guess groff uses "hyphen minus" U+002D or hyphen
> U+2010 if Unicode is possible.

In a followup to another response he added:

> For display purposes however I think U+00AD can't be used
> directly, but will be replaced by the renderer to either nothing,
> if no wrap is to be applied at the character position, or
> something appropriate, like ASCII hyphen-minus or some extended
> Unicode "Pd" letter, of which there are some (e.g., U+058A
> ARMENIAN HYPHEN, U+1400 CANADIAN SYLLABICS HYPHEN, and more).

And he also made this suggestion:

> Eli Zaretskii is so active on the
> Unicode list, why don't you use the Pd character class for
> detecting «hyphen»?  I guess this should cover all such things
> already as of today, thanks to Werner Lemberg?!

So how should we proceed from here?  We could add U+2010 to the regexp
in my patch, which would then be this: "[-‐­]" (hyphen-minus (ASCII 45),
hyphen (U+2010), soft hyphen (U+00AD) -- it seems harmless to retain the
latter, given that man.el already uses it elsewhere), but if these are
all included in the Unicode Pd character class along with other possible
hyphen characters, maybe a different approach is required.  I know
nothing about the Pd character class and how to detect it with Elisp; I
also don't know if doing that would lead to further changes in man.el,
making this a larger undertaking.  What do you suggest?
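
For readers weighing the two options, here is a minimal sketch of
both, with the non-ASCII characters written as escapes.  All names
beginning with my- are hypothetical.  The first helper strips every
occurrence of the three characters listed above (unlike the patch,
which removes one match at a time).  The second assumes that
get-char-code-property with the general-category property is an
acceptable way to test for the Pd class in Elisp.

```elisp
;; Hypothetical names, for illustration only; not taken from man.el.
(defconst my-man-three-hyphens-regexp "[-\u2010\u00AD]"
  "Regexp matching hyphen-minus (U+002D), hyphen (U+2010) or soft hyphen (U+00AD).")

(defun my-man-strip-hyphens (topic)
  "Return TOPIC with every hyphen-like character removed."
  (replace-regexp-in-string my-man-three-hyphens-regexp "" topic))

(defun my-char-dash-punctuation-p (char)
  "Return non-nil if CHAR has the Unicode general category Pd."
  (eq (get-char-code-property char 'general-category) 'Pd))

(my-man-strip-hyphens "sig\u2010nalfd")   ; => "signalfd"
(my-char-dash-punctuation-p ?-)           ; => t
(my-char-dash-punctuation-p #x2010)       ; => t
(my-char-dash-punctuation-p #x00AD)       ; => nil
```

The last call shows that the soft hyphen is not in Pd (its general
category is Cf), so a test based on Pd alone would not cover the soft
hyphen that man.el already handles elsewhere.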

Steve Berman






From: Eli Zaretskii @ 2016-06-04 15:35 UTC
  To: Stephen Berman; +Cc: 23647

> From: Stephen Berman <stephen.berman@gmx.net>
> Cc: 23647@debbugs.gnu.org
> Date: Mon, 30 May 2016 15:55:47 +0200
> 
> > I'm not enough of a roff expert to tell, but how about asking on the
> > Groff list?
> 
> I did that and got this feedback from Steffen Nurpmeso:
> 
> > I have been convinced that soft hyphen is a control character and
> > not something visual, it should be used as a «break-indicator»
> > rather than as a hyphenation character, interpretation of which is
> > left as an exercise for the processing software.  I have no idea
> > still but would guess groff uses "hyphen minus" U+002D or hyphen
> > U+2010 if Unicode is possible.
> 
> In a followup to another response he added:
> 
> > For display purposes however I think U+00AD can't be used
> > directly, but will be replaced by the renderer to either nothing,
> > if no wrap is to be applied at the character position, or
> > something appropriate, like ASCII hyphen-minus or some extended
> > Unicode "Pd" letter, of which there are some (e.g., U+058A
> > ARMENIAN HYPHEN, U+1400 CANADIAN SYLLABICS HYPHEN, and more).
> 
> And he also made this suggestion:
> 
> > Eli Zaretskii is so active on the
> > Unicode list, why don't you use the Pd character class for
> > detecting «hyphen»?  I guess this should cover all such things
> > already as of today, thanks to Werner Lemberg?!
> 
> So how should we proceed from here?  We could add U+2010 to the regexp
> in my patch, which would then be this: "[-‐­]" (hyphen-minus (ASCII 45),
> hyphen (U+2010), soft hyphen (U+00AD) -- it seems harmless to retain the
> latter, given that man.el already uses it elsewhere), but if these are
> all included in the Unicode Pd character class along with other possible
> hyphen characters, maybe a different approach is required.  I know
> nothing about the Pd character class and how to detect it with Elisp; I
> also don't know if doing that would lead to further changes in man.el,
> making this a larger undertaking.  What do you suggest?

I'd go with just those 3, I think the others will not be produced by
Groff.

Thanks.






From: Stephen Berman @ 2016-06-05 11:17 UTC
  To: Eli Zaretskii; +Cc: 23647-done

On Sat, 04 Jun 2016 18:35:46 +0300 Eli Zaretskii <eliz@gnu.org> wrote:

>> So how should we proceed from here?  We could add U+2010 to the regexp
>> in my patch, which would then be this: "[-‐­]" (hyphen-minus (ASCII 45),
>> hyphen (U+2010), soft hyphen (U+00AD) -- it seems harmless to retain the
>> latter, given that man.el already uses it elsewhere), but if these are
>> all included in the Unicode Pd character class along with other possible
>> hyphen characters, maybe a different approach is required.  I know
>> nothing about the Pd character class and how to detect it with Elisp; I
>> also don't know if doing that would lead to further changes in man.el,
>> making this a larger undertaking.  What do you suggest?
>
> I'd go with just those 3, I think the others will not be produced by
> Groff.

Done in commit 75de364 on master, and closing the bug.

Steve Berman




