From mboxrd@z Thu Jan 1 00:00:00 1970
From: awrhygty@outlook.com
Newsgroups: gmane.emacs.bugs
Subject: bug#65305: 29.1; archive-mode can not handle subfile names encoded with utf-8
Date: Thu, 17 Aug 2023 22:56:54 +0900
References: <83sf8ka6b2.fsf@gnu.org> <83bkf89x7o.fsf@gnu.org> <83h6oz88n0.fsf@gnu.org>
In-Reply-To: <83h6oz88n0.fsf@gnu.org> (Eli Zaretskii's message of "Wed, 16 Aug 2023 08:38:27 -0400")
X-Microsoft-Original-Message-ID: <867cptvkk9.fsf@outlook.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
User-Agent: Gnus/5.13 (Gnus v5.13)
To: Eli Zaretskii
Cc: 65305@debbugs.gnu.org
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.io gmane.emacs.bugs:267665

--=-=-=
Content-Type: text/plain

Eli Zaretskii writes:

>> From: awrhygty@outlook.com
>> Cc: 65305@debbugs.gnu.org
>> Date: Wed, 16 Aug 2023 12:47:14 +0900
>>
>> Eli Zaretskii writes:
>>
>> >> There is a bit flag indicating that the subfile name is encoded with
>> >> utf-8. Bytes 6-7 in local file header or bytes 8-9 in central directory
>> >> header are general purpose bit flag. And bit 11 of the flag represents
>> >> file encoding flag(1 for utf-8 encoding).
>> >
>> > Thanks, please try the patch below. If it gives good results, I will
>> > install it.
>>
>> The patch works to list entries, and the contents can be extracted with
>> 7z.exe. unzip.exe does not work well.
>
> Thanks, I installed the patch on the emacs-29 branch. I'm not
> surprised that unzip.exe cannot extract such files, and I think it
> works for you with 7z.exe by sheer luck: Windows transparently
> converts non-ASCII characters to the system codepage when it invokes
> programs via the "narrow" APIs, so it could mangle the UTF-8 encoded
> file name into something unrecognizable.
>
>> I tried the settings below, but rewriting entries does not work.
>> (archive-zip-* variables' values are default if archive-7z-program is set
>> and zip.exe/unzip.exe are non-existent)
>>
>> (setq archive-7z-program "c:/Program Files/7-Zip/7z.exe"
>>       archive-zip-extract '("c:/Program Files/7-Zip/7z.exe" "x" "-so")
>>       archive-zip-expunge '("c:/Program Files/7-Zip/7z.exe" "d")
>>       archive-zip-update '("c:/Program Files/7-Zip/7z.exe" "u")
>>       archive-zip-update-case archive-zip-update)
>>
>> It is because update command needs "-si" option followed by an entry
>> name. It should be one argument like (format "-si%s" name).
>
> Sorry, I don't understand: is this the same problem, or is this an
> additional problem? For example, does rewriting entries work with
> ASCII file names?
>
> If this is a separate problem, I prefer that you submit a separate bug
> report with all the pertinent details.
>
> Thanks.

Sorry, I was mistaken. 7z.exe does work for rewriting entries with the
settings above (not only for ASCII subfile names, but also for subfile
names that can be encoded in cp932).

Suppose there were a wrapper program that decodes ASCII-only arguments
back into multilingual strings. Then any subfile in any archive file
could be handled correctly from within Emacs. So I wrote a quick
implementation of that idea using base64; it is attached.
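To illustrate the idea of passing only ASCII on the command line, here is a minimal Python sketch of the round trip. The function names are hypothetical and are not part of the attached file. It also shows one caveat: the standard base64 alphabet contains '/', and the attached script cuts each argument at its last '/' before decoding, so some names could be truncated. The URL-safe alphabet shown here is only an assumption about one way to avoid that, not something the attachment does.

```python
import base64


def encode_arg(name: str) -> str:
    """Encode a file name as ASCII-only text (standard base64 of its UTF-8 bytes)."""
    return base64.b64encode(name.encode("utf-8")).decode("ascii")


def decode_arg(arg: str) -> str:
    """Inverse of encode_arg, as a wrapper script would apply to its arguments."""
    return base64.b64decode(arg).decode("utf-8")


def encode_arg_urlsafe(name: str) -> str:
    """Same idea with the URL-safe alphabet, which uses '-' and '_' instead of '+' and '/'."""
    return base64.urlsafe_b64encode(name.encode("utf-8")).decode("ascii")


def decode_arg_urlsafe(arg: str) -> str:
    """Inverse of encode_arg_urlsafe."""
    return base64.urlsafe_b64decode(arg).decode("utf-8")


if __name__ == "__main__":
    name = "日本語.txt"
    encoded = encode_arg(name)
    # The encoded form is pure ASCII, so a codepage conversion cannot mangle it.
    print(encoded.isascii(), decode_arg(encoded) == name)
    # A name whose standard base64 form contains '/', next to the URL-safe form.
    print(encode_arg("み"), encode_arg_urlsafe("み"))
```

With the standard alphabet, a caller that strips a directory prefix by searching for '/' has to make sure the encoded text itself cannot contain that character; the URL-safe variant is one possible way to guarantee it.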
--=-=-=
Content-Type: application/emacs-lisp
Content-Disposition: inline; filename=archive-utf.el

(require 'arc-mode)

(setq archive-7z-program "c:/Program Files/7-Zip/7z.exe")
(defvar archive-utf-python-script
  (format "\
import sys, base64, subprocess

PROG_7Z = '%s'
MODE = sys.argv[1]
for i in range(2, len(sys.argv)):
    idx = sys.argv[i].rfind('/')
    sys.argv[i] = sys.argv[i][idx+1:]
    sys.argv[i] = base64.b64decode(sys.argv[i]).decode()
ARCHIVE = sys.argv[2]
SUBFILES = sys.argv[3:]
SUBFILE = SUBFILES[0] if SUBFILES else ''

if MODE == 'x':
    subprocess.run([PROG_7Z, 'x', '-so', ARCHIVE, SUBFILE])
elif MODE == 'u':
    subprocess.run([PROG_7Z, 'u', ARCHIVE, SUBFILE])
elif MODE == 'd':
    subprocess.run([PROG_7Z, 'd', ARCHIVE] + SUBFILES)
elif MODE == 'l':
    subprocess.run([PROG_7Z, 'l', ARCHIVE])
" archive-7z-program))

(defvar archive-utf-extract `("python" "-c" ,archive-utf-python-script "x"))
(defvar archive-utf-expunge `("python" "-c" ,archive-utf-python-script "d"))
(defvar archive-utf-update  `("python" "-c" ,archive-utf-python-script "u"))

(defun archive-utf-encode (str)
  (base64-encode-string (encode-coding-string str 'utf-8)))

(defun archive-zip-extract (archive name)
  (let ((default-directory temporary-file-directory))
    (archive-extract-by-stdout
     (archive-utf-encode (expand-file-name archive))
     (archive-utf-encode name)
     archive-utf-extract)))

(defun archive-zip-expunge (archive files)
  (let ((default-directory temporary-file-directory))
    (archive-*-expunge
     (archive-utf-encode (expand-file-name archive))
     (mapcar #'archive-utf-encode files)
     archive-utf-expunge)))

(defun archive-zip-write-file-member (archive descr)
  (let ((default-directory temporary-file-directory)
        (archive-file-name-coding-system 'archive-base64))
    (archive-*-write-file-member
     (archive-utf-encode (expand-file-name archive))
     descr
     archive-utf-update)))

(define-coding-system 'archive-base64
  "base64 encoding"
  :mnemonic ?B
  :coding-type 'undecided
  :post-read-conversion 'archive-base64-post-read-conversion
  :pre-write-conversion 'archive-base64-pre-write-conversion)

(defun archive-base64-post-read-conversion (len)
  (let ((pos (point))
        (buffer-modified-p (buffer-modified-p))
        last-coding-system-used)
    (prog1
        (save-restriction
          (narrow-to-region pos (+ pos len))
          (base64-decode-region pos (point-max))
          (decode-coding-region pos (point-max) 'utf-8))
      (set-buffer-modified-p buffer-modified-p))))

(defun archive-base64-pre-write-conversion (from to)
  (let ((buf (current-buffer)))
    (set-buffer (generate-new-buffer " *temp*"))
    (if (stringp from)
        (insert from)
      (insert-buffer-substring buf from to))
    (let (last-coding-system-used)
      (encode-coding-region 1 (point-max) 'utf-8)
      (base64-encode-region 1 (point-max)))
    nil))
--=-=-=--