From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#46342: 28.0.50; socks-send-command munges IP address bytes to UTF-8
Date: Fri, 12 Feb 2021 17:04:16 +0200
Message-ID: <83blcpfhu7.fsf@gnu.org>
References: <875z355sh9.fsf@neverwas.me> <83pn1do008.fsf@gnu.org>
 <87r1lt2s8k.fsf@neverwas.me> <83czxdns61.fsf@gnu.org>
 <874kils22e.fsf@neverwas.me> <831rdpkyl6.fsf@gnu.org>
 <87ft24njud.fsf@neverwas.me> <83o8grj4d3.fsf@gnu.org>
 <87eehmlkhz.fsf@neverwas.me> <83o8gqhbeh.fsf@gnu.org>
 <87h7mh73zr.fsf@neverwas.me>
In-Reply-To: <87h7mh73zr.fsf@neverwas.me> (jp@neverwas.me)
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Cc: 46342@debbugs.gnu.org
To: "J.P."
X-GNU-PR-Message: followup 46342
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords: patch
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref:
 news.gmane.io gmane.emacs.bugs:199861

> From: "J.P."
> Cc: 46342@debbugs.gnu.org
> Date: Fri, 12 Feb 2021 06:30:32 -0800
>
> Eli Zaretskii writes:
>
> > Then they are what we call "raw bytes", and encoding them with
> > raw-text-unix should suffice.
>
> Thanks. Unfortunately, this produces the same utf-8 encoded bytes.
>
>   (encode-coding-char 192 'raw-text-unix)
>   ⇒ "\303\200"

192 is not a raw-byte, it's a character whose Unicode codepoint is
192. So you get its UTF-8 sequence.

> It looks like raw-text-unix is an alias for binary [1], the coding
> system already used by the network process sending the erroneous
> request.

The problem is with how the original request is generated, not how it
is encoded.

> I suppose it's always possible to strong arm it like
>
>   (encode-coding-char (or (decode-char 'eight-bit c) c) 'raw-text-unix)
>   ⇒ "^@" ... "\377"

That's one way, yes. But it isn't the best one.

> But what about your original latin-1 suggestion? Is that no longer in
> contention?

No, it isn't.

>   (encode-coding-char 192 'latin-1)
>   ⇒ "\300"

Not every byte above 127 is a valid character that Latin-1 can
meaningfully encode. It is wrong to use Latin-1 for raw bytes. What
you need is a way of generating a unibyte string from a series of raw
bytes,

> > How does the code which calls socks.el create these raw bytes?
>
> This library has an entry-point function that's part of the url-gateway
> dispatch mechanism. I can't say for certain, but it looks like url-http
> is the only library directly using this facility. Regardless, the
> function gets called with a (possibly multibyte) host name, which in
> rare cases may be an ASCII IP address created by url-gateway.
>
> With SOCKS4, that's kind of moot, since all names are looked up through
> socks-nslookup-host, which returns an IPv4 address as a list of fixnums.
> Its caller is an internal helper that converts this list into a
> multibyte string for socks-send-command to emit onto the wire (where
> it's then rejected by the service).
>
> Currently, IP addresses aren't used at all for v5 connect-command
> requests. And raw-byte IP addresses do not yet appear anywhere [2]. This
> patch would introduce them, either as an argument to socks-send-command
> or as something ephemeral produced by it (the current idea).

So what is the problem with using unibyte-string for producing a
unibyte string from a list of bytes? It sounds like it's exactly what
is needed here, and is actually used in some places in socks.el.
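
To illustrate the unibyte-string suggestion, here is a minimal sketch that
contrasts a multibyte and a unibyte string built from the same list of
fixnums. The variable names (my-ip-bytes, my-multibyte, my-unibyte) and the
address are hypothetical, and `string' is used only as one way to end up
with a multibyte string; the actual helper in socks.el may build it
differently.

```elisp
;; Hypothetical address, standing in for the list of fixnums that
;; socks-nslookup-host is described as returning.
(defvar my-ip-bytes '(192 168 1 1))

;; `string' treats each number as a character code, so 192 and 168
;; become non-ASCII characters and the result is a multibyte string.
(defvar my-multibyte (apply #'string my-ip-bytes))

;; `unibyte-string' treats each number as a raw byte.
(defvar my-unibyte (apply #'unibyte-string my-ip-bytes))

(multibyte-string-p my-multibyte) ; => t
(string-bytes my-multibyte)       ; => 6, two bytes each for 192 and 168
(multibyte-string-p my-unibyte)   ; => nil
(string-bytes my-unibyte)         ; => 4, one byte per address octet
(append my-unibyte nil)           ; => (192 168 1 1)
```

The unibyte variant has exactly one byte per octet, so nothing is left for
a coding system to expand when the string is sent.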