From: Eduardo Ochs <eduardoochs@gmail.com>
To: help-gnu-emacs <help-gnu-emacs@gnu.org>
Subject: call-process -> insert -> iso-latin-1-dos problem on Windows
Date: Sun, 14 Jan 2024 20:09:41 -0300
Message-ID: <CADs++6jc9BXPKqd8Ry5BA53zSvE4GvGeLjjJSomeng=OB-eMiw@mail.gmail.com>
Hi list,
I have a function called `find-wget' that works well on *NIX-like
systems - it calls wget, puts the output in a temporary buffer, and
on unices Emacs always chooses the right encoding... but when I run it
on Windows, and I call wget like this,
  wget -q -O - http://anggtwu.net/LUA/Dang1.lua
where Dang1.lua is a file in UTF-8, then Emacs switches the encoding
of the output buffer to iso-latin-1-dos...
I probably wrote my code relying on undefined behavior... any
suggestions on how to fix it? I'm attaching the file with the test and
the comments below, and it's also here:
http://anggtwu.net/elisp/find-wget-jan-2024.el.html
http://anggtwu.net/elisp/find-wget-jan-2024.el
Thanks in advance =/,
Eduardo Ochs
http://anggtwu.net/eepitch.html
http://anggtwu.net/#eev
--snip--snip--
;; This is a simplified version of the `find-wget' from eev:
;;
;; http://anggtwu.net/eev-current/eev-plinks.el.html#find-wget
;; (find-eev "eev-plinks.el" "find-wget")
;;
;; Most functions were copied from the source code of eev without
;; changes; only the ones that are marked as "dummified" were replaced
;; by trivial versions.
(defvar ee-wget-program "wget")
(defvar ee-find-callprocess00-exit-status nil)
;; Dummified versions
(defun ee-expand (fname) fname)
(defun ee-goto-rest (list) ())
(defun ee-goto-position (&optional pos-spec &rest rest) ())
(defun find-ebuffer (buffer &rest pos-spec-list)
  "Hyperlink to an Emacs buffer (existing or not)."
  (interactive "bBuffer: ")
  (switch-to-buffer buffer)
  (apply 'ee-goto-position pos-spec-list))
(defun ee-split (str)
  "If STR is a string, split it on whitespace and return the resulting list.
If STR is a list, return it unchanged."
  (if (stringp str)
      (split-string str "[ \t\n]+")
    str))
(defun find-callprocess00-ne (program-and-args)
  (let ((argv (ee-split program-and-args)))
    (with-output-to-string
      (with-current-buffer standard-output
        (setq ee-find-callprocess00-exit-status
              (apply 'call-process (car argv) nil t nil (cdr argv)))))))
(defun find-wget (url &rest pos-spec-list)
  "Download URL with \"wget -q -O - URL\" and display the output.
If a buffer named \"*wget: URL*\" already exists then this
function visits it instead of running wget again.
If wget can't download URL then this function runs `error'."
  (let* ((eurl (ee-expand url))
         (wgetprogandargs (list ee-wget-program "-q" "-O" "-" eurl))
         (wgetbufname (format "*wget: %s*" eurl)))
    (if (get-buffer wgetbufname)
        (apply 'find-ebuffer wgetbufname pos-spec-list)
      ;;
      ;; If the buffer wgetbufname doesn't exist, then:
      (let* ((wgetoutput (find-callprocess00-ne wgetprogandargs))
             (wgetstatus ee-find-callprocess00-exit-status))
        ;;
        (if (not (equal wgetstatus 0))
            ;; See: (find-node "(wget)Exit Status")
            (error "wget can't download: %s" eurl))
        ;;
        (find-ebuffer wgetbufname)      ; create buffer
        (insert wgetoutput)
        (goto-char (point-min))
        (apply 'ee-goto-position pos-spec-list)))))
;; Test: (eval-buffer)
;; (find-wget "http://anggtwu.net/LUA/Dang1.lua")
;;
;; When we run the test above on Debian the double angle brackets in
;; line 12 of Dang1.lua are displayed correctly as single
;; characters - and when we run `M-x hexlify-buffer' we see that they
;; are encoded in two bytes each - c2ab and c2bb. From
;; /usr/share/unicode/UnicodeData.txt:
;;
;; 00AB;LEFT-POINTING DOUBLE ANGLE QUOTATION MARK;Pi;0;ON;;;;;Y;LEFT POINTING GUILLEMET;;;;
;; 00BB;RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK;Pf;0;ON;;;;;Y;RIGHT POINTING GUILLEMET;;;;
;;
;; When we run the `find-wget' above in Emacs 29 for Windows the
;; resulting buffer is put in the encoding "iso-latin-1-dos". `M-x
;; hexlify-buffer' shows that they are still two bytes each - c2ab and
;; c2bb - but they are displayed as two characters each, preceded by
;; "c2"s:
;;
;; 00C2;LATIN CAPITAL LETTER A WITH CIRCUMFLEX;Lu;0;L;0041 0302;;;;N;LATIN CAPITAL LETTER A CIRCUMFLEX;;;00E2;
;;
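;; The difference between the two displays can be reproduced without
;; wget. The sketch below decodes the same two bytes, c2 and ab, once
;; as UTF-8 and once as Latin-1. The variable name
;; `ee-test-guillemet-bytes' is hypothetical and is not part of eev.

```elisp
;; The two bytes that UTF-8 uses for U+00AB, as a unibyte string.
(defvar ee-test-guillemet-bytes (unibyte-string #xc2 #xab))

;; Decoded as UTF-8: a single character, U+00AB.
(decode-coding-string ee-test-guillemet-bytes 'utf-8)
;; => "«"

;; Decoded as Latin-1: two characters, U+00C2 followed by U+00AB.
(decode-coding-string ee-test-guillemet-bytes 'iso-latin-1)
;; => "Â«"
```

;; This suggests that the bytes arriving from wget are the same on both
;; systems, and that only the decoding step differs.
;;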
;; The wget that I am using on Windows was extracted from this zip:
;;
;; https://eternallybored.org/misc/wget/releases/wget-1.21.2-win64.zip
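;;
;; One possible direction, as a sketch only: bind
;; `coding-system-for-read' around `call-process', so that the output
;; is decoded as UTF-8 instead of with the system default. The function
;; name `find-callprocess00-ne-utf8' is hypothetical, the function
;; takes a list (not a string), and it assumes that the downloaded file
;; really is UTF-8. It only changes how the bytes are decoded; it does
;; not set the coding system of the "*wget: URL*" buffer.

```elisp
(defun find-callprocess00-ne-utf8 (program-and-args)
  "Run PROGRAM-AND-ARGS and return its output decoded as UTF-8.
PROGRAM-AND-ARGS is a list: its car is the program, its cdr the
arguments.  Hypothetical variant of `find-callprocess00-ne'."
  ;; `coding-system-for-read' overrides the coding system that
  ;; `call-process' would otherwise pick for decoding the output.
  (let ((coding-system-for-read 'utf-8))
    (with-output-to-string
      (with-current-buffer standard-output
        (apply 'call-process (car program-and-args) nil t nil
               (cdr program-and-args))))))
```

;; Test: (find-callprocess00-ne-utf8
;;        (list "wget" "-q" "-O" "-" "http://anggtwu.net/LUA/Dang1.lua"))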