all messages for Emacs-related lists mirrored at yhetil.org
* bug#73846: [PATCH] Make djvused emit UTF-8 encoded text
@ 2024-10-17  4:12 Visuwesh
  2024-10-17  5:26 ` Eli Zaretskii
  0 siblings, 1 reply; 4+ messages in thread
From: Visuwesh @ 2024-10-17  4:12 UTC
  To: 73846; +Cc: Tassilo Horn

Tags: patch

Hi Tassilo,

This is a small patch to make djvused emit UTF-8 encoded text.  In the
djvu test file that I sent you, the outline entries in the appendix
contain non-ASCII characters, which djvused writes as octal escapes.
Rather than unescaping them on the Emacs side, we can ask djvused to
emit UTF-8 directly, which the attached patch does.
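
For illustration, here is a minimal sketch of the alternative that the
patch avoids: unescaping on the Emacs side.  The helper name
`my-djvu-unescape-title' and the sample title are hypothetical and are
not part of doc-view.el; the sketch assumes the escapes spell out the
UTF-8 bytes of the title.

```elisp
;; Hypothetical helper, not part of doc-view.el.
(defun my-djvu-unescape-title (printed)
  "Read PRINTED, a string literal with octal escapes, and decode it as UTF-8."
  ;; `read' turns escapes such as \303\251 into raw bytes, giving a
  ;; unibyte string.
  (let ((raw (read printed)))
    ;; Reinterpret those raw bytes as UTF-8 encoded text.
    (decode-coding-string raw 'utf-8)))

;; Assumed sample: "Café" with the e-acute written as \303\251.
(my-djvu-unescape-title "\"Caf\\303\\251\"")
```

With -u this extra decoding step should not be needed, since the text
arrives as UTF-8 already.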


In GNU Emacs 31.0.50 (build 13, x86_64-pc-linux-gnu, X toolkit, cairo
 version 1.18.0, Xaw scroll bars) of 2024-10-06 built on astatine
Repository revision: 500f5da5fb62cd0bbded8df754d93e3147d1d847
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12101011
System Description: Debian GNU/Linux trixie/sid

Configured using:
 'configure --with-sound=alsa --with-x-toolkit=lucid --without-xaw3d
 --without-gconf --without-libsystemd --with-cairo CFLAGS=-g3'

[-- Attachment #2: 0001-Make-djvused-emit-UTF-8-encoded-text.patch --]
[-- Type: text/patch, Size: 968 bytes --]

From 8e21167c6e01ab76b76e15fa84bd198bc8df59b4 Mon Sep 17 00:00:00 2001
From: Visuwesh <visuweshm@gmail.com>
Date: Thu, 17 Oct 2024 09:40:34 +0530
Subject: [PATCH] Make djvused emit UTF-8 encoded text

* lisp/doc-view.el (doc-view--djvu-outline): Pass -u to djvused
to make it emit UTF-8 encoded text rather than using octal
escapes for non-ASCII string.
---
 lisp/doc-view.el | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lisp/doc-view.el b/lisp/doc-view.el
index bbfbbdec925..018c4eddd34 100644
--- a/lisp/doc-view.el
+++ b/lisp/doc-view.el
@@ -2027,7 +2027,7 @@ doc-view--djvu-outline
   (unless file-name (setq file-name (buffer-file-name)))
   (with-temp-buffer
     (call-process doc-view-djvused-program nil (current-buffer) nil
-                  "-e" "print-outline" file-name)
+                  "-u" "-e" "print-outline" file-name)
     (goto-char (point-min))
     (when (eobp)
       (setq doc-view--outline 'unavailable)
-- 
2.45.2



* bug#73846: [PATCH] Make djvused emit UTF-8 encoded text
  2024-10-17  4:12 bug#73846: [PATCH] Make djvused emit UTF-8 encoded text Visuwesh
@ 2024-10-17  5:26 ` Eli Zaretskii
  2024-10-17  8:31   ` Visuwesh
  0 siblings, 1 reply; 4+ messages in thread
From: Eli Zaretskii @ 2024-10-17  5:26 UTC
  To: Visuwesh; +Cc: tsdh, 73846

> Cc: "Tassilo Horn" <tsdh@gnu.org>
> From: Visuwesh <visuweshm@gmail.com>
> Date: Thu, 17 Oct 2024 09:42:30 +0530
> 
> This is a small patch to make djvused emit UTF-8 encoded text.  In the
> djvu test file that I sent you, the outline entries in the appendix
> contain non-ASCII characters, which djvused writes as octal escapes.
> Rather than unescaping them on the Emacs side, we can ask djvused to
> emit UTF-8 directly, which the attached patch does.

If you force djvused to emit UTF-8 encoded text, you need to bind
coding-system-for-read to 'utf-8, to make sure Emacs decodes that
correctly.  I'm guessing your locale uses UTF-8 by default, which is
why it worked for you.
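
As a rough illustration of what goes wrong otherwise, the sketch below
decodes the same UTF-8 bytes with two different coding systems.  The
helper name `my-decode-both-ways' is hypothetical, and Latin-1 merely
stands in for whatever non-UTF-8 default a locale might select.

```elisp
;; Hypothetical helper for illustration only.
(defun my-decode-both-ways (text)
  "Encode TEXT as UTF-8, then decode the bytes as UTF-8 and as Latin-1.
Return a list of the two decoded strings."
  (let ((bytes (encode-coding-string text 'utf-8)))
    (list (decode-coding-string bytes 'utf-8)
          (decode-coding-string bytes 'latin-1))))

;; The first element round-trips; the second shows the mangled result
;; of decoding UTF-8 bytes with the wrong coding system.
(my-decode-both-ways "é")
```

Binding coding-system-for-read around the call-process call pins the
decoding to the first case regardless of the user's locale.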

Please also add a comment there explaining what the -u switch does and
why we use it there.

Thanks.






* bug#73846: [PATCH] Make djvused emit UTF-8 encoded text
  2024-10-17  5:26 ` Eli Zaretskii
@ 2024-10-17  8:31   ` Visuwesh
  2024-10-18  6:07     ` Tassilo Horn
  0 siblings, 1 reply; 4+ messages in thread
From: Visuwesh @ 2024-10-17  8:31 UTC
  To: Eli Zaretskii; +Cc: tsdh, 73846

[Thursday October 17, 2024] Eli Zaretskii wrote:

>> Cc: "Tassilo Horn" <tsdh@gnu.org>
>> From: Visuwesh <visuweshm@gmail.com>
>> Date: Thu, 17 Oct 2024 09:42:30 +0530
>> 
>> This is a small patch to make djvused emit UTF-8 encoded text.  In the
>> djvu test file that I sent you, the outline entries in the appendix
>> contain non-ASCII characters, which djvused writes as octal escapes.
>> Rather than unescaping them on the Emacs side, we can ask djvused to
>> emit UTF-8 directly, which the attached patch does.
>
> If you force djvused to emit UTF-8 encoded text, you need to bind
> coding-system-for-read to 'utf-8, to make sure Emacs decodes that
> correctly.  I'm guessing your locale uses UTF-8 by default, which is
> why it worked for you.

My locale is indeed a UTF-8 one.  I've now let-bound
coding-system-for-read around everything inside with-temp-buffer.

> Please also add a comment there explaining what the -u switch does and
> why we use it there.

Done in the attached patch; I hope it is clear.

> Thanks.


[-- Attachment #2: 0001-Make-djvused-emit-UTF-8-encoded-text.patch --]
[-- Type: text/x-diff, Size: 1830 bytes --]

From a39e50a504c9c24f51c7c646f3cfffcec2f34b85 Mon Sep 17 00:00:00 2001
From: Visuwesh <visuweshm@gmail.com>
Date: Thu, 17 Oct 2024 09:40:34 +0530
Subject: [PATCH] Make djvused emit UTF-8 encoded text

* lisp/doc-view.el (doc-view--djvu-outline): Pass -u to djvused
to make it emit UTF-8 encoded text rather than using octal
escapes for non-ASCII string.  (bug#73846)
---
 lisp/doc-view.el | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/lisp/doc-view.el b/lisp/doc-view.el
index bbfbbdec925..4d7d36c8a16 100644
--- a/lisp/doc-view.el
+++ b/lisp/doc-view.el
@@ -2026,13 +2026,16 @@ doc-view--djvu-outline
 For the format, see `doc-view--pdf-outline'."
   (unless file-name (setq file-name (buffer-file-name)))
   (with-temp-buffer
-    (call-process doc-view-djvused-program nil (current-buffer) nil
-                  "-e" "print-outline" file-name)
-    (goto-char (point-min))
-    (when (eobp)
-      (setq doc-view--outline 'unavailable)
-      (imenu-unavailable-error "Unable to create imenu index using `djvused'"))
-    (nreverse (doc-view--parse-djvu-outline (read (current-buffer))))))
+    (let ((coding-system-for-read 'utf-8))
+      ;; Pass "-u" to make `djvused' emit UTF-8 encoded text to avoid
+      ;; unescaping octal escapes for non-ASCII text.
+      (call-process doc-view-djvused-program nil (current-buffer) nil
+                    "-u" "-e" "print-outline" file-name)
+      (goto-char (point-min))
+      (when (eobp)
+        (setq doc-view--outline 'unavailable)
+        (imenu-unavailable-error "Unable to create imenu index using `djvused'"))
+      (nreverse (doc-view--parse-djvu-outline (read (current-buffer)))))))
 
 (defun doc-view--parse-djvu-outline (bookmark &optional level)
   "Return a list describing the djvu outline from BOOKMARK.
-- 
2.45.2



* bug#73846: [PATCH] Make djvused emit UTF-8 encoded text
  2024-10-17  8:31   ` Visuwesh
@ 2024-10-18  6:07     ` Tassilo Horn
  0 siblings, 0 replies; 4+ messages in thread
From: Tassilo Horn @ 2024-10-18  6:07 UTC
  To: Visuwesh; +Cc: 73846-done, Eli Zaretskii

Visuwesh <visuweshm@gmail.com> writes:

>> Please also add a comment there explaining what the -u switch does
>> and why we use it there.
>
> Done in the attached patch; I hope it is clear.

It is.  Applied and pushed to master.

Thanks again,
  Tassilo





