From: <tomas@tuxteam.de>
To: help-gnu-emacs@gnu.org
Subject: url-retrieve and encoding
Date: Sat, 10 Feb 2024 20:31:02 +0100
Message-ID: <ZcfO9rjxlqg7IJN0@tuxteam.de>
Hello, Emacs experts
I'm trying to fetch a Web resource via https with Emacs.
IIUC, url-retrieve (and its synchronous friend) are the tools for
the job. They work nicely, but they leave me with a unibyte buffer
(confusingly, the line endings are just linefeeds: from the HTTP
specs I'd have expected "\r\n").
Is there a canonical way to "make the buffer be UTF-8"? (Yes, I know:
you only know the encoding once the "Content-Type" header line has
arrived, and by that point you have read a bunch of bytes already,
but the header is supposed to be ASCII anyway.)
What I've come up with is to take the buffer-substring starting from
after the first empty line to the end, do a "string-as-multibyte"
with that and insert that into a fresh buffer. But that feels
a bit... gross:
(I've chosen a Greek wiktionary page because the results are more
visible):
(defun fetch-one ()
  (let ((stuff ""))
    (with-current-buffer
        (url-retrieve-synchronously
         "https://el.wiktionary.org/wiki/μιλώντας")
      (goto-char (point-min))
      (re-search-forward "^\r?$")
      (forward-line)
      (setq stuff (buffer-substring (point) (point-max))))
    (pop-to-buffer
     (get-buffer-create "*results*"))
    (erase-buffer)
    (insert (string-as-multibyte stuff))))
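A variant of the same idea, as a rough sketch only: split the raw
response at the first empty line and run the body through
decode-coding-string instead of string-as-multibyte (which, as far as
I can tell, is marked obsolete in recent Emacs versions). The names
my-decode-body and my-fetch-decoded are made up for illustration, and
UTF-8 is simply assumed rather than taken from the "Content-Type"
header:

```elisp
;; Sketch only: `my-decode-body' and `my-fetch-decoded' are hypothetical
;; names, and UTF-8 is assumed instead of being read from Content-Type.
(require 'url)

(defun my-decode-body (raw)
  "Return the body of RAW, a unibyte HTTP response string, decoded as UTF-8.
The body is everything after the first empty line (LF or CRLF endings)."
  (if (string-match "\r?\n\r?\n" raw)
      (decode-coding-string (substring raw (match-end 0)) 'utf-8)
    (error "No empty line separating header and body")))

(defun my-fetch-decoded (url)
  "Fetch URL synchronously and show its UTF-8-decoded body in *results*.
No error handling: a failed retrieval is not caught here."
  (let ((body (with-current-buffer (url-retrieve-synchronously url)
                ;; The response buffer is unibyte, so this is a unibyte string.
                (my-decode-body (buffer-string)))))
    (pop-to-buffer (get-buffer-create "*results*"))
    (erase-buffer)
    (insert body)))
```

Called as (my-fetch-decoded "https://el.wiktionary.org/wiki/μιλώντας"),
this should behave like fetch-one above, with the decoding step made
explicit.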
What is the "right way" to do this?
Thanks for any ideas
--
tomás