all messages for Emacs-related lists mirrored at yhetil.org
From: Tim Landscheidt <tim@tim-landscheidt.de>
To: Eli Zaretskii <eliz@gnu.org>
Cc: help-gnu-emacs@gnu.org
Subject: Re: Coding system to encode arguments to groff?
Date: Sun, 03 Oct 2021 13:14:04 +0000
Message-ID: <87bl469roj.fsf@vagabond.tim-landscheidt.de>
In-Reply-To: <83o88bio7g.fsf@gnu.org> (Eli Zaretskii's message of "Wed, 29 Sep 2021 15:02:59 +0300")

Eli Zaretskii <eliz@gnu.org> wrote:

>> I pass text arguments from Emacs Lisp to a groff command
>> with the "-d" option.  For ASCII strings, this is trivial;
>> for strings with umlauts, I need to use:

>> | (encode-coding-string variable-to-pass 'iso-latin-1)

> What is your default locale's codeset on that system?  In general, if
> the default locale matches the encoding you need to use, the above
> should happen automagically.

If I understand your question correctly, UTF-8:

| [tim@vagabond ~]$ locale
| LANG=de_DE.UTF-8
| LC_CTYPE="de_DE.UTF-8"
| LC_NUMERIC="de_DE.UTF-8"
| LC_TIME="de_DE.UTF-8"
| LC_COLLATE="de_DE.UTF-8"
| LC_MONETARY="de_DE.UTF-8"
| LC_MESSAGES="de_DE.UTF-8"
| LC_PAPER="de_DE.UTF-8"
| LC_NAME="de_DE.UTF-8"
| LC_ADDRESS="de_DE.UTF-8"
| LC_TELEPHONE="de_DE.UTF-8"
| LC_MEASUREMENT="de_DE.UTF-8"
| LC_IDENTIFICATION="de_DE.UTF-8"
| LC_ALL=
| [tim@vagabond ~]$

>> For strings with other Unicode characters like "–" (#x2013),
>> I need to call groff's preconv like:

>> | (shell-command-to-string (concat "preconv -r <(echo " (shell-quote-argument variable-to-pass) ")"))

>> which for "ä–ö" returns something like:

>> | \[u00E4]\[u2013]\[u00F6]

> This is just the original "ä–ö" string, so I'm not quite sure what did
> the above accomplish.

The output is literal, i.e. preconv emits the escape sequences
themselves as plain ASCII characters:

| 0000000   \   [   u   0   0   E   4   ]   \   [   u   2   0   1   3   ]
| 0000020   \   [   u   0   0   F   6   ]  \n

>> Now in Emacs, this looks very much like what a coding system
>> would do.  The info documentation for elisp just laconically
>> says:

>> |    How to define a coding system is an arcane matter, and is not
>> | documented here.

>> Has someone implemented such a coding system for groff so
>> that something like:

>> | (encode-coding-string variable-to-pass 'x-groff)

> I don't think you should need a new coding-system.  But you didn't
> explain why you need to explicitly encode the command-line arguments,
> so it's hard to give an accurate advice.  What kind of Groff command
> needs this jumping through hoops from you?  E.g., why isn't it enough
> to bind coding-system-for-write to whatever you need, around the call
> to call-process or whatever?

> IOW, please describe in more detail the Groff-related context in which
> this problem happens, so that we could have an intelligent discussion
> of the issues you might have.

On Fedora 34 with GNU groff 1.22.4:

| (let
|     ((temp-ps-buffer (generate-new-buffer "*test ps*"))
|      (test-arg "a-o"))
|   (with-temp-buffer
|     (insert ".fam H\n\\*[test-arg]\n")
|     (call-process-region
|      (point-min)
|      (point-max)
|      "groff"
|      nil
|      temp-ps-buffer
|      nil
|      "-Tps"
|      "-d" (concat "test-arg=" test-arg)))
|   (switch-to-buffer temp-ps-buffer)
|   (ps-mode)
|   (doc-view-mode))

produces a PostScript buffer with the text "a-o".

With test-arg = "ä-ö" (ä minus ö), it produces gibberish minus
gibberish.

With test-arg = (encode-coding-string "ä-ö" 'iso-latin-1) (ä
minus ö), it produces the text "ä-ö".

With test-arg = (encode-coding-string "ä–ö" 'iso-latin-1) (ä
endash ö), it produces the text "ä[white space]ö".
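
The en dash is lost in this case because ISO Latin-1 has no code
point for it.  A minimal sketch of how one could check this up front
before choosing between the two approaches; the function name
my-latin-1-encodable-p is hypothetical and not part of Emacs:

```elisp
;; Hypothetical helper (not part of Emacs): return non-nil if every
;; character of STRING can be encoded in iso-latin-1.
(defun my-latin-1-encodable-p (string)
  "Return non-nil if STRING contains only characters encodable in iso-latin-1."
  ;; `unencodable-char-position' returns the index of the first
  ;; character that the coding system cannot encode, or nil if
  ;; there is none.
  (null (unencodable-char-position 0 (length string)
                                   'iso-latin-1 nil string)))
```

With this, "ä-ö" (minus) passes the check, while "ä–ö" (endash)
does not.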

With test-arg = (shell-command-to-string (concat "preconv -r
<(echo " (shell-quote-argument "ä–ö") ")")) (ä endash ö), it
produces the intended text "ä–ö".
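
For comparison, the same escaping could probably be done in Emacs
Lisp without calling a shell.  This is only a sketch under
assumptions: the function name my-groff-escape-string is hypothetical,
it covers only the \[uXXXX] form shown in the preconv output above,
and it does nothing about backslashes or other characters in the
input that groff treats specially:

```elisp
;; Hypothetical helper (not part of Emacs or groff): imitate the
;; \[uXXXX] escapes that "preconv -r" printed above.
(defun my-groff-escape-string (string)
  "Return STRING with each non-ASCII character replaced by \\[uXXXX]."
  (replace-regexp-in-string
   "[^[:ascii:]]"
   ;; The function receives each matched one-character string and
   ;; returns its code point as at least four uppercase hex digits.
   (lambda (match) (format "\\[u%04X]" (aref match 0)))
   string
   t     ; FIXEDCASE: do not alter the case of the replacement
   t))   ; LITERAL: insert the replacement verbatim, backslashes included
```

The result is pure ASCII, so it could be passed as
"-d" (concat "test-arg=" (my-groff-escape-string "ä–ö")) without any
further encoding.  I have not checked it against preconv for anything
beyond the example string.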

(Passing "-k" as an additional option to groff does not
change the output as "-k" only converts standard input, not
macro definitions set as command line arguments.)

Tim



Thread overview: 4+ messages
2021-09-29  8:01 Coding system to encode arguments to groff? Tim Landscheidt
2021-09-29 12:02 ` Eli Zaretskii
2021-10-03 13:14   ` Tim Landscheidt [this message]
2021-10-03 15:14     ` Eli Zaretskii
