From: Eric Abrahamsen
Newsgroups: gmane.emacs.devel
Subject: Fixing Gnus, and string encoding question
Date: Fri, 05 Apr 2019 13:47:32 -0700
Message-ID: <87d0m0qivv.fsf@ericabrahamsen.net>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)
To: emacs-devel@gnu.org

So I've made a hash of this change (ha), and am trying to figure out
the best solution.

The problem is that non-ASCII group names are now strings, and are
coming into the system in two different ways: written into .newsrc.eld
with `print-escape-nonascii' set to t, and read off the filesystem
using a buffer with multibyte disabled. The two methods don't match
up -- the strings are different.

Katsumi Yamaoka's example is the group whose decoded name is
"nnml:テスト". This is written to .newsrc.eld as the string:

"nnml:\343\203\206\343\202\271\343\203\210"

Those aren't actual escapes, just backslashes and numbers.

The group name is read from file with `set-buffer-multibyte' nil,
using `read' to pick the group name up as a symbol, then using
`symbol-name' to turn it into a string. The symbol looks like:

nnml:\343\203\206\343\202\271\343\203\210

And the resulting string is:

"nnml:ã\203\206ã\202¹ã\203\210"

Here the escapes are real escapes; I've just typed them out. The two
strings aren't `equal', obviously.

I don't know how to turn either of these strings into the other --
either direction would work, but I don't know how.

Another option is to give up messing with strings, and back the
changes halfway out: still use hash tables, but leave the group names
as symbols, with their current funky encoding. That's probably how I
should have sliced these changes to begin with.
Then a later step would be to go straight from symbols to fully
decoded strings.

Hoping for some guidance,
Eric
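
To make the mismatch concrete, here is a minimal byte-level sketch of the two strings. It is written in Python, not Emacs Lisp, and only models the bytes involved: it assumes the octal escapes are the UTF-8 bytes of the decoded name, and that the `symbol-name' result amounts to reading each of those bytes as one character (modelled here as Latin-1). How Emacs actually represents the \203-style bytes internally may differ, so this is an illustration of the relationship rather than a fix for Gnus. The variable names are made up for the example.

```python
# Byte-level model of the two group-name strings (not Emacs code).
# Assumption: the octal escapes are the UTF-8 bytes of the decoded name.
raw = b"nnml:\343\203\206\343\202\271\343\203\210"

# Interpreting the bytes as UTF-8 gives the decoded group name.
decoded = raw.decode("utf-8")

# Interpreting each byte as one character (Latin-1 here, as an assumption)
# gives a string shaped like the one obtained via `symbol-name'.
bytewise = raw.decode("latin-1")

# The two readings are different strings, even though they share the bytes.
print(decoded == bytewise)

# Going from the byte-per-character form back to the decoded name:
# recover the bytes, then decode them as UTF-8.
recovered = bytewise.encode("latin-1").decode("utf-8")
print(recovered == decoded)
```

In this model both strings carry the same bytes, and the conversion in either direction goes through those bytes: encode to get back to bytes, decode to get the readable name.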