From mboxrd@z Thu Jan 1 00:00:00 1970
From: Visuwesh
Newsgroups: gmane.emacs.bugs
Subject: bug#73846: [PATCH] Make djvused emit UTF-8 encoded text
Date: Thu, 17 Oct 2024 14:01:56 +0530
Message-ID: <87bjzjxh8j.fsf@gmail.com>
References: <87y12n1i6p.fsf@gmail.com> <86frovpaf0.fsf@gnu.org>
In-Reply-To: <86frovpaf0.fsf@gnu.org> (Eli Zaretskii's message of "Thu, 17 Oct 2024 08:26:27 +0300")
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
User-Agent: Gnus/5.13 (Gnus v5.13)
To: Eli Zaretskii
Cc: tsdh@gnu.org, 73846@debbugs.gnu.org
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"

--=-=-=
Content-Type: text/plain; charset=utf-8

[Thursday, October 17, 2024] Eli Zaretskii wrote:

>> Cc: "Tassilo Horn"
>> From: Visuwesh
>> Date: Thu, 17 Oct 2024 09:42:30 +0530
>>
>> This is a small patch to make djvused emit UTF-8 encoded text.  In the
>> djvu test file that I sent you, outline in the appendix have non-ASCII
>> characters which are written as octal escapes.  Rather than unescaping
>> them on Emacs side, we can request djvused to use UTF-8 directly which
>> this patch does.  The attached patch does just that.
>
> If you force djvused to emit UTF-8 encoded text, you need to bind
> coding-system-for-read to 'utf-8, to make sure Emacs decodes that
> correctly.  I'm guessing your locale uses UTF-8 by default, which is
> why it worked for you.

My locale is indeed a UTF-8 one.  I've now let-bound
coding-system-for-read around everything inside with-temp-buffer.

> Please also add a comment there explaining what the -u switch does and
> why we use it there.

Done in the attached patch; I hope it is clear.

> Thanks.

--=-=-=
Content-Type: text/x-diff
Content-Disposition: attachment; filename=0001-Make-djvused-emit-UTF-8-encoded-text.patch

>From a39e50a504c9c24f51c7c646f3cfffcec2f34b85 Mon Sep 17 00:00:00 2001
From: Visuwesh
Date: Thu, 17 Oct 2024 09:40:34 +0530
Subject: [PATCH] Make djvused emit UTF-8 encoded text

* lisp/doc-view.el (doc-view--djvu-outline): Pass -u to djvused to
make it emit UTF-8 encoded text rather than using octal escapes for
non-ASCII string.
(bug#73846)
---
 lisp/doc-view.el | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/lisp/doc-view.el b/lisp/doc-view.el
index bbfbbdec925..4d7d36c8a16 100644
--- a/lisp/doc-view.el
+++ b/lisp/doc-view.el
@@ -2026,13 +2026,16 @@ doc-view--djvu-outline
 For the format, see `doc-view--pdf-outline'."
   (unless file-name (setq file-name (buffer-file-name)))
   (with-temp-buffer
-    (call-process doc-view-djvused-program nil (current-buffer) nil
-                  "-e" "print-outline" file-name)
-    (goto-char (point-min))
-    (when (eobp)
-      (setq doc-view--outline 'unavailable)
-      (imenu-unavailable-error "Unable to create imenu index using `djvused'"))
-    (nreverse (doc-view--parse-djvu-outline (read (current-buffer))))))
+    (let ((coding-system-for-read 'utf-8))
+      ;; Pass "-u" to make `djvused' emit UTF-8 encoded text to avoid
+      ;; unescaping octal escapes for non-ASCII text.
+      (call-process doc-view-djvused-program nil (current-buffer) nil
+                    "-u" "-e" "print-outline" file-name)
+      (goto-char (point-min))
+      (when (eobp)
+        (setq doc-view--outline 'unavailable)
+        (imenu-unavailable-error "Unable to create imenu index using `djvused'"))
+      (nreverse (doc-view--parse-djvu-outline (read (current-buffer)))))))
 
 (defun doc-view--parse-djvu-outline (bookmark &optional level)
   "Return a list describing the djvu outline from BOOKMARK.
-- 
2.45.2

--=-=-=--
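
For illustration only, here is a minimal sketch of the kind of Emacs-side
unescaping that passing -u is meant to avoid.  It assumes that, without
-u, djvused prints a non-ASCII title as a string literal whose UTF-8
bytes appear as octal escapes, as described in the message above.  The
helper name my-djvu-decode-octal-title is hypothetical; it is not part
of doc-view.el or of the patch.

```elisp
;; Hypothetical helper for illustration; not part of doc-view.el.
(defun my-djvu-decode-octal-title (printed)
  "Read PRINTED, a string literal with octal-escaped bytes, as UTF-8 text.
PRINTED is assumed to look like what `djvused' prints without \"-u\"."
  ;; `read' turns the octal escapes into raw bytes (a unibyte string);
  ;; `decode-coding-string' then interprets those bytes as UTF-8.
  (decode-coding-string (read printed) 'utf-8))

;; The bytes \340\256\265 are the UTF-8 encoding of U+0BB5.
(my-djvu-decode-octal-title "\"\\340\\256\\265\"")
;; => "வ"
```

With -u passed to djvused and coding-system-for-read bound to 'utf-8,
call-process inserts already-decoded text into the temporary buffer, so
no such step should be needed.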