From: "G. Branden Robinson" <g.branden.robinson@gmail.com>
To: emacs-orgmode@gnu.org
Cc: groff@gnu.org
Subject: Re: [BUG] "\fC" macro in ox-man.el [9.6.15 (release_9.6.15 @ /usr/share/emacs/29.2/lisp/org/)]
Date: Wed, 18 Dec 2024 11:20:40 -0600
Message-ID: <20241218172040.tyytdhbyl7annyli@illithid>
In-Reply-To: <87v866jsou.fsf@debian-hx90.lan> <87frx7moh8.fsf@localhost> <87h6hcia9y.fsf@debian-hx90.lan> <878r2mxtjw.fsf@localhost> <20240314214651.GB324558@celephais.dreamlands>

[looping in groff list; please reply to it as well, as I am not
subscribed to emacs-orgmode]

Hi Xiyue, Ihor, and Jeremy,

I stumbled across this thread while researching problems people have
encountered with groff, which I maintain for the GNU Project.

At 2024-02-29T23:51:29-0800, Xiyue Deng wrote:
> "mu4e"[1] (a popular Emacs mail client) uses Org to generate its
> manpages.  However, the generated output contains macros that are not
> understood by groff.  After some debugging, Jeremy traced this back to
> the macro "\fC" used in ox-man.el[2].

Strictly speaking, that's an escape sequence, not a macro.

> Git history shows that this may have been there since the beginning.
> We tried to find a documentation for the "\fC" macro but has not been
> able to find one.

Here's some historical background.  Some may want to skip it.

--- begin background ---

`C` is a non-portable font name.  Few font identifiers _are_ portable
across the history of troff.  When Brian Kernighan refactored Joe
Ossanna's original troff for device-independence circa 1980, digital
fonts were in their infancy and not generally portable across hardware
devices as they are now.  The original device target for Unix troff, the
Graphic Systems, Inc. (later acquired by Wang Laboratories, Inc.) C/A/T
machine, did not have digital fonts.  It was a phototypesetter: its
fonts were glyphs etched on glass plates mounted in a rotating
mechanical carousel and magnified with an optical lens to the desired
type size, then flashed to photosensitive paper.

Kernighan perhaps did not foresee that people targeting multiple output
devices from a single master document might be able to avail themselves
of the same typefaces, or at least metrically compatible ones, across
devices, and so made no provision for font aliasing or renaming.  (You
could wangle it, perhaps, by restricting yourself to numerical "mounting
positions", but this seems to have been a seldom-taken recourse in
surviving *roff documents of the 1980s that have come to my attention.)

--- end background ---

> Jeremy suggests that "C" may be an old alias for Courier, and if
> that's the case it should be changed to "\f[CR]".  Would be great if
> Org people can confirm.

That is good advice and it is what I recommend if you're writing in
"raw" roff.  The context of the discussion is not ultra-clear to me; is
ox-man.el a replacement for the old GNU Emacs man pager, "woman"?

If what you're dealing with is man pages, I (and mandoc(1) maintainer
Ingo Schwarze) discourage the use of formatter features for font
selection.  Use the man(7) package's macros instead.  I'll return to
this point below.

At 2024-03-03T13:30:59+0000, Ihor Radchenko wrote:
> This is not an unknown problem. AFAIU, the \fC macro is widely used
> for troff, although it is not supported by groff. Check out the
> ongoing discussion at
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1049968#15

It _is_ supported; however, there are some confounding factors.

Historically, groff has quietly remapped the font name "C" to "CR"
(Courier roman, see below).  However, this was done only for the "ps"
(PostScript) output device (and PDF via file inclusion).

https://git.savannah.gnu.org/cgit/groff.git/tree/tmac/ps.tmac?h=1.23.0#n8

This means that the font name did not exist, and was not supported, on
terminal devices, which is what I suppose ox-man.el is using to stage
the rendered man page before doing whatever it does to make it look nice
in an Emacs window.  So, when you try to switch to font "C" or "CR" on a
terminal device, nothing happens.

Here comes the next complication.  Historically, groff also did not
throw warnings when a nonexistent font was selected.  In groff 1.23.0,
it started emitting these warnings.  People mistook this for withdrawal
of "support" for font names "C" and "CW" (and occasionally others).
What's actually happening is that, when rendering for a device other
than "ps" or "pdf", they are being warned of a no-op that they weren't
warned about before.  Nothing about the rendering of the document
actually changed.

https://lists.gnu.org/archive/html/groff-commit/2022-06/msg00114.html

This experience convinced me that these legacy font names are more
trouble than they are worth, so for groff 1.24 (for which, now that I
have a fencepost account, I hope to make a release candidate available
soon), we're starting a deprecation cycle for them.

https://git.savannah.gnu.org/cgit/groff.git/tree/NEWS?id=db49abe9ee9035f248f0893b4a1beb4ce3db8e38#n245

>>   The best solution known to me is to use an extension to the man(7)
>>   language.  It first appeared in Ninth Edition Unix (1986) and was
>>   adopted by a groff release in 2009.  That is the `EX`/`EE` macro
>>   pair, which sets a monospaced display.  (In other words, filling is
>>   disabled and a monospaced font selected if necessary.)

Yes.
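
To illustrate the `EX`/`EE` pair, here is a minimal sketch of a man page
source that uses it.  The page name `example`, its date, and the command
line shown are hypothetical; nothing here is taken from mu4e or from
ox-man.el's actual output.

```roff
.\" Hypothetical page; all names below are placeholders for illustration.
.TH example 1 2024-12-18 "example 1.0"
.SH Name
example \- illustrate a monospaced display
.SH Description
Run the program like this.
.PP
.EX
$ example \-\-verbose input.txt
.EE
.PP
Ordinary filled text resumes after the display.
```

Between `EX` and `EE`, filling is disabled and a monospaced font is
selected if the output device needs one, so no `\fC` escape sequence is
required.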

At 2024-03-11T17:06:17-0700, Xiyue Deng wrote:
> I'm not very familiar with roff so my understanding may be off.
> According to the `Safe subset' section in man(7), they mentioned the
> following:
> 
> ,----
> | Font changes (ft and the \f escape sequence) should only have the
> | values 1, 2, 3, 4, R, I, B, P, or CW (the ft command may also have
> | no parameters).
> `----
> 
> Does it mean `\fC' should be replaced by `\f[CW]'?

No.  The Linux man-pages version of man(7) is going to be (or already
has been, in a recent release) withdrawn in favor of groff_man(7), which
along with a new page for groff 1.23.0, groff_man_style(7), has been
intensely worked on for the past seven years to make man page
composition (and, to an extent, reading) a less mysterious process.

https://lore.kernel.org/linux-man/7f7f2644-d408-969b-6916-ee9cae0962b9@kernel.org/

At 2024-03-13T11:25:23+0000, Ihor Radchenko wrote:
> man 7 groff has
> 
>       Fonts often have trademarked names, and even Free Software fonts
>       can require renaming upon modification. groff maintains a
>       convention that a device’s serif font family is given the name
>       T (“Times”), its sans-serif family H (“Helvetica”), and its
>       monospaced family C (“Courier”). Historical inertia has driven
>       groff’s font identifiers to short uppercase abbreviations of
>       font names, as with TR, TB, TI, TBI, and a special font S.
> 
> So, \fC refers to "Courier".

Yes, but.  The "C" here, in groff, refers to a font _family_, not a
fully resolved font name.  In groff, following the precedent of Adobe
PostScript, four styles of Courier are available, namely Courier roman,
Courier italic, Courier bold, and Courier bold-italic: CR, CI, CB, and CBI.

https://www.gnu.org/software/groff/manual/groff.html.node/Using-Fonts.html#Using-Fonts
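
As a sketch of how a family and a style combine in raw groff input, the
fragment below selects family C with the `fam` request and then names
only styles.  It assumes an output device, such as "ps", that provides
the Courier family; it is meant for general roff documents, not for man
pages.

```roff
.\" Assumes an output device that offers font family C, such as "ps".
.fam C
\fRroman\fP, \fIitalic\fP, \fBbold\fP, and \f[BI]bold-italic\fP
.fam T
Back in the Times family.
```

With family C in effect, the styles R, I, B, and BI resolve to the fonts
CR, CI, CB, and CBI respectively.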

Because of the aforementioned aliasing, you could "get away with" saying
`\fC`, because, when rendering for "ps" or "pdf", `C` would be remapped
to `CR` for you, but that was no help for terminal output devices.  They
never were able to select such a font, and in groff 1.23.0, the
formatter started warning about this.

> I did not find any text description of CW font, but my groff
> installation has usr/share/groff/1.23.0/font/devdvi/CW font spec:
> 
>     name CW
>     special
>     ...

Yes.  This is a tangentially related issue.  As grodvi(1) says:

   Typefaces
     grodvi supports the standard four styles: R (roman), I (italic), B
     (bold), and BI (bold‐italic).  Fonts are grouped into families T
     and H having members in each style.  “CM” abbreviates “Computer
     Modern”.
...
     The following fonts are not members of a family.

            CW     CM Typewriter Text (cmtt10)
            CWI    CM Italic Typewriter Text (cmitt10)

TeX DVI's Computer Modern Typewriter faces don't offer a full "family"
of four styles as its ordinary and sans serif faces do.

The choice of "CW" and "CWI" as abbreviations for these is unfortunate.
It apes a convention occasionally used in AT&T troff, specifically in
Unix System III (1980) where some special contrivances were made for
typesetting "constant-width" (thus "CW") faces.

Unix hackers are enamored of their terse identifiers, and never repent
of the confusing ambiguity they introduce in pursuit of them.  And if
such practice shrouds their development activity in mystique, so much
the better, it is thought.

If I were an even more aggressive reformer than I am, I'd revise all of
the font names used by grodvi to match those used by Knuth for them, but
in caps and chopping off the "10".  This would acquire the virtue of
_familiarity_ for experienced TeX users.  (I'd retain "T" and "H" family
remappings to maintain groff's attempt at typeface ecumenicism and
document portability, of course.  And people trying to select "CB" and
"CBI" would be warned as they should be, since those faces would be
unavailable.)

> which looks more suitable. But CR is not listed in "safe" subset
> (man 7 man)

The Linux man-pages man(7) document was stale or inaccurate in this
respect.  However, man pages generally should not attempt to access
specific fonts by name anyway.  They should use the macro facilities
afforded by the _man_ package to style their text.

groff_man_style(7):
   Portability
...
     The two major features that control formatting in the roff language
     are requests and escape sequences.  Since the man macros are
     implemented in terms of these, one can, in principle, supplement
     the functionality of man with these lower‐level elements where
     necessary.

     However, use of roff requests (apart from the empty request “.”)
     risks poor rendering when your page is processed by non‐roff
     formatters that attempt to interpret page sources.  (Historically,
     this was commonly attempted for HTML conversion.)  Requests may
     make assumptions that do not hold in an HTML environment.  Many of
     these programs don’t interpret the full roff language (let alone
     extensions): they may be incapable of handling numeric expressions,
     control structures, or register, string, and macro definitions.
     Such limitations can lead to portions of a document being presented
     incomprehensibly or omitted altogether.
...
     Exercise restraint with escape sequences as with requests.
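
As a sketch of what styling with the man(7) font macros, rather than
`\f` escape sequences, can look like (the command `example`, its `-v`
option, and the wording are hypothetical):

```roff
.\" Hypothetical page; the command and option names are placeholders.
.TH example 1 2024-12-18 "example 1.0"
.SH Synopsis
.B example
.RB [ \-v ]
.I file
.SH Description
The
.B \-v
option makes
.B example
name each
.I file
as it is read.
```

Here `B` and `I` set their arguments in bold and italics, and `RB`
alternates roman and bold, so the page never names a font directly.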

> Also, neither CW nor CR work with html output:
> 
> with \fC
> 
>     .TH "" "1" 
>     .PP
>     \fBThis is test\fP 
>     \fCcode a+b\fP here a+b.
> 
> yields (groff -Thtml test.man)
> 
> <p><b>This is test</b> <tt>code a+b</tt> here a+b.</p>
> 
> Note <tt> tag.
>
> but with \f[CW]
> 
>     .TH "" "1" 
>     .PP
>     \fBThis is test\fP 
>     \f[CW]code a+b\fP here a+b.
> 
> <p><b>This is test</b> code a+b here a+b.</p>
> 
> No special markup is applied to the code.
> 
> Same for \f[CR].

The "html" output device has its own significant pile of problems and I
won't make this long mail even longer by going into them here.  I hope
to improve the situation in the coming years.

> What I did for the mu4e man-pages was to patch them to alias font C to
> B:
> 
>     .ftr C B

Can you share a link to the mu4e man pages for me?  I'd like to have a
look at them so I can make suggestions, if you'd be open to them.

As a rule, I think doing the foregoing in a man page document should be
unnecessary.

> My initial assumption when I first looked into this is that the font
> to use would be `CR`, not `CW`.  Doing this with `CR` does seem to
> work:
> 
>     $ cat /space/azazel/tmp/test.man 
>     .ftr C CR
> 
>     .TH "" "1"
>     .PP
>     \fBThis is test\fP
>     \fCcode a+b\fP here a+b.
>     $ groff -Thtml /space/azazel/tmp/test.man | tail -5 | head -2
>     <p style="margin-top: 1em"><b>This is test</b> <tt>code
>     a+b</tt> here a+b.</p>
> 
> However, as you observe, `\f[CR]` doesn't (nor does `\f(CR`).  I note
> that groff's HTML support is stated in the grohtml(1) man-page to be
> in beta.  Haven't checked the source to determine whether that is
> what's going on here.

It's a mess. :(

https://savannah.gnu.org/bugs/index.php?61915

That's the tip of a large iceberg.

> In any case, my understanding from reading the conversation in the
> Debian bug-report is that this issue affects multiple roff generators
> in Debian.  Therefore, it probably makes sense to consult within
> Debian before asking the maintainers of those generators to make
> changes.  I need to go over that conversation again and think about
> this more.

Please consider me, and the groff mailing list, a resource.

Regards,
Branden
