From: Eli Zaretskii <eliz@gnu.org>
To: "J.P." <jp@neverwas.me>
Cc: 46342@debbugs.gnu.org
Subject: bug#46342: 28.0.50; socks-send-command munges IP address bytes to UTF-8
Date: Fri, 12 Feb 2021 17:04:16 +0200
Message-ID: <83blcpfhu7.fsf@gnu.org>
In-Reply-To: <87h7mh73zr.fsf@neverwas.me> (jp@neverwas.me)
> From: "J.P." <jp@neverwas.me>
> Cc: 46342@debbugs.gnu.org
> Date: Fri, 12 Feb 2021 06:30:32 -0800
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > Then they are what we call "raw bytes", and encoding them with
> > raw-text-unix should suffice.
>
> Thanks. Unfortunately, this produces the same utf-8 encoded bytes.
>
> (encode-coding-char 192 'raw-text-unix)
> ⇒ "\303\200"
192 is not a raw byte; it is a character whose Unicode codepoint is
192. So you get its UTF-8 sequence.
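To illustrate the distinction, here is a minimal sketch, assuming a
reasonably recent Emacs. The function name is hypothetical and exists
only for this example; it compares the character 192 with the raw
byte 192 using `string`, `unibyte-string`, `multibyte-string-p`, and
`string-bytes`.

```elisp
;; Hypothetical helper, for illustration only.
(defun example-char-vs-raw-byte ()
  "Compare the character 192 (U+00C0) with the raw byte 192 (#xC0)."
  (let ((as-char (string 192))           ; multibyte string holding U+00C0
        (as-byte (unibyte-string 192)))  ; unibyte string holding byte #xC0
    (list (multibyte-string-p as-char)   ; multibyte?
          (multibyte-string-p as-byte)   ; multibyte?
          (string-bytes as-char)         ; bytes in internal representation
          (string-bytes as-byte))))

(example-char-vs-raw-byte)
;; ⇒ (t nil 2 1)
```

The character occupies two bytes internally, which is the UTF-8
sequence shown in the quoted result, while the unibyte string holds
the single byte.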
> It looks like raw-text-unix is an alias for binary [1], the coding
> system already used by the network process sending the erroneous
> request.
The problem is with how the original request is generated, not how it
is encoded.
> I suppose it's always possible to strong arm it like
>
> (encode-coding-char (or (decode-char 'eight-bit c) c) 'raw-text-unix)
> ⇒ "^@" ... "\377"
That's one way, yes. But it isn't the best one.
> But what about your original latin-1 suggestion? Is that no longer in
> contention?
No, it isn't.
> (encode-coding-char 192 'latin-1)
> ⇒ "\300"
Not every byte above 127 is a valid character that Latin-1 can
meaningfully encode. It is wrong to use Latin-1 for raw bytes. What
you need is a way of generating a unibyte string from a series of raw
bytes.
> > How does the code which calls socks.el create these raw bytes?
>
> This library has an entry-point function that's part of the url-gateway
> dispatch mechanism. I can't say for certain, but it looks like url-http
> is the only library directly using this facility. Regardless, the
> function gets called with a (possibly multibyte) host name, which in
> rare cases may be an ASCII IP address created by url-gateway.
>
> With SOCKS4, that's kind of moot, since all names are looked up through
> socks-nslookup-host, which returns an IPv4 address as a list of fixnums.
> Its caller is an internal helper that converts this list into a
> multibyte string for socks-send-command to emit onto the wire (where
> it's then rejected by the service).
>
> Currently, IP addresses aren't used at all for v5 connect-command
> requests. And raw-byte IP addresses do not yet appear anywhere [2]. This
> patch would introduce them, either as an argument to socks-send-command
> or as something ephemeral produced by it (the current idea).
So what is the problem with using unibyte-string for producing a
unibyte string from a list of bytes? It sounds like it's exactly
what is needed here, and is actually used in some places in socks.el.
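As a rough sketch of that suggestion, assuming the address arrives as
a list of fixnums in the range 0..255 (as described for
socks-nslookup-host above): the helper name below is made up for
illustration and is not part of socks.el.

```elisp
;; Hypothetical helper, not part of socks.el.
(defun example-octets-to-unibyte (octets)
  "Return a unibyte string whose bytes are the fixnums in OCTETS."
  (apply #'unibyte-string octets))

(let ((addr (example-octets-to-unibyte '(192 168 0 1))))
  (list (multibyte-string-p addr)   ; stays unibyte
        (length addr)               ; one element per octet
        (string-to-list addr)))     ; the bytes round-trip unchanged
;; ⇒ (nil 4 (192 168 0 1))

;; For contrast, building the string with `string' treats each number
;; as a character, so the result is multibyte and takes more bytes:
(string-bytes (apply #'string '(192 168 0 1)))
;; ⇒ 6
```

Because the result is unibyte, its four bytes are exactly the four
octets, with no character encoding step that could expand them.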