unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#36085: 26.2; find-dired octal escapes instead of Cyrillic text
@ 2019-06-04  3:43 Nikita
  2019-06-04 11:44 ` bug#36085: Screenshots for the bug Никита Никита
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Nikita @ 2019-06-04  3:43 UTC (permalink / raw)
  To: 36085


When I open Dired, go to the needed directory, and run "M-x find-dired"
with the arguments -name "*Портрет*" (or anything else that gives some
results), the results come back with octal escapes instead of Cyrillic
letters.  As a consequence I cannot, for example, open the pictures that
it finds.

----

In GNU Emacs 26.2 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.22.30)
of 2019-04-13 built on lgw01-amd64-060
Windowing system distributor 'The X.Org Foundation', version 11.0.11906000
System Description: Linux Mint 19.1 Tessa

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
user-error: No further undo information [2 times]
Quit
Mark saved where search started
Making completion list...
find-dired *Find* finished.
Making completion list... [3 times]
user-error: Beginning of history; no preceding item
user-error: End of history; no default available
user-error: Beginning of history; no preceding item

Configured using:
'configure --build=x86_64-linux-gnu --prefix=/usr
'--includedir=${prefix}/include' '--mandir=${prefix}/share/man'
'--infodir=${prefix}/share/info' --sysconfdir=/etc --localstatedir=/var
--disable-silent-rules '--libdir=${prefix}/lib/x86_64-linux-gnu'
'--libexecdir=${prefix}/lib/x86_64-linux-gnu' --disable-maintainer-mode
--disable-dependency-tracking --prefix=/usr --sharedstatedir=/var/lib
--program-suffix=26 --with-modules --with-file-notification=inotify
--with-mailutils --with-x=yes --with-x-toolkit=gtk3 --with-xwidgets
--with-lcms2 'CFLAGS=-g -O2
-fdebug-prefix-map=/build/emacs26-CYbeHB/emacs26-26.2~1.gitfd1b34b=.
-fstack-protector-strong
-Wformat -Werror=format-security -no-pie' 'CPPFLAGS=-Wdate-time
-D_FORTIFY_SOURCE=2' 'LDFLAGS=-Wl,-Bsymbolic-functions -Wl,-z,relro
-no-pie''

Configured features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GSETTINGS GLIB
NOTIFY LIBSELINUX GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB
TOOLKIT_SCROLL_BARS GTK3 X11 XDBE XIM MODULES THREADS XWIDGETS
LIBSYSTEMD LCMS2

Important settings:
value of $LC_MONETARY: ru_RU.UTF-8
value of $LC_NUMERIC: ru_RU.UTF-8
value of $LANG: ru_RU
locale-coding-system: utf-8-unix

Major mode: Dired by name

Minor modes in effect:
shell-dirtrack-mode: t
pdf-occur-dired-minor-mode: t
pdf-occur-global-minor-mode: t
engine-mode: t
which-key-mode: t
xah-fly-keys: t
recentf-mode: t
tooltip-mode: t
global-eldoc-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
buffer-read-only: t
column-number-mode: t
line-number-mode: t
global-visual-line-mode: t
visual-line-mode: t
transient-mark-mode: t
abbrev-mode: t

Load-path shadows:
/usr/share/emacs/site-lisp/dictionaries-common/flyspell hides
/usr/share/emacs/26.2/lisp/textmodes/flyspell
/usr/share/emacs/site-lisp/dictionaries-common/ispell hides
/usr/share/emacs/26.2/lisp/textmodes/ispell
/usr/share/emacs/site-lisp/latex-cjk-thai/thai-word hides
/usr/share/emacs/26.2/lisp/language/thai-word

Features:
(shadow sort mail-extr emacsbug message rmc puny rfc822 mml mml-sec epa
derived epg gnus-util rmail rmail-loaddefs mm-decode mm-bodies mm-encode
mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047
rfc2045 ietf-drums mm-util mail-prsvr mail-utils shell find-dired
misearch multi-isearch dired-aux elec-pair ob-R ob-python pdf-occur
ibuf-ext ibuffer ibuffer-loaddefs tablist tablist-filter
semantic/wisent/comp semantic/wisent semantic/wisent/wisent
semantic/util-modes semantic/util semantic semantic/tag semantic/lex
semantic/fw mode-local cedet pdf-isearch let-alist pdf-misc imenu
pdf-tools compile cus-edit cus-start cus-load pdf-view bookmark pp
jka-compr pdf-cache pdf-info tq pdf-util image-mode engine-mode
which-key org-clock org-element avl-tree generator org org-macro
org-footnote org-pcomplete pcomplete org-list org-faces org-entities
org-version ob-emacs-lisp ob ob-tangle org-src ob-ref ob-lob ob-table
ob-keys ob-exp ob-comint comint ansi-color ob-core ob-eval org-compat
org-macs org-loaddefs format-spec advice find-func cal-menu calendar
cal-loaddefs pandoc-mode cl-extra pandoc-mode-utils hydra ring lv cl
markdown-toc dash s markdown-mode-table markdown-mode color thingatpt
noutline outline easy-mmode edit-indirect xah-fly-keys ido finder-inf
info package epg-config url-handlers url-parse auth-source cl-seq eieio
eieio-core cl-macs eieio-loaddefs password-cache url-vars seq byte-opt
gv bytecomp byte-compile cconv quail help-mode dired-x dired
dired-loaddefs edmacro kmacro recentf tree-widget wid-edit cl-loaddefs
cl-lib easymenu server time-date mule-util tooltip eldoc electric
uniquify ediff-hook vc-hooks lisp-float-type mwheel term/x-win x-win
term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode elisp-mode lisp-mode
prog-mode register page menu-bar rfn-eshadow isearch timer select
scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932
hebrew greek romanian slovak czech european ethiopic indian cyrillic
chinese composite charscript charprop case-table epa-hook jka-cmpr-hook
help simple abbrev obarray minibuffer cl-preloaded nadvice loaddefs
button faces cus-face macroexp files text-properties overlay sha1 md5
base64 format env code-pages mule custom widget hashtable-print-readable
backquote threads dbusbind inotify lcms2 dynamic-setting
system-font-setting font-render-setting xwidget-internal move-toolbar
gtk x-toolkit x multi-tty make-network-process emacs)

Memory information:
((conses 16 393368 12549)
(symbols 48 41286 2)
(miscs 40 873 266)
(strings 32 122175 2569)
(string-bytes 1 3374293)
(vectors 16 44321)
(vector-slots 8 833717 14994)
(floats 8 436 68)
(intervals 56 2681 0)
(buffers 992 21))







* bug#36085: Screenshots for the bug
  2019-06-04  3:43 bug#36085: 26.2; find-dired octal escapes instead of Cyrillic text Nikita
@ 2019-06-04 11:44 ` Никита Никита
  2019-06-08 12:20 ` bug#36085: 26.2; find-dired octal escapes instead of Cyrillic text Eli Zaretskii
  2019-06-08 15:14 ` Mattias Engdegård
  2 siblings, 0 replies; 13+ messages in thread
From: Никита Никита @ 2019-06-04 11:44 UTC (permalink / raw)
  To: 36085

[-- Attachment #1: Type: text/html, Size: 611 bytes --]

* bug#36085: 26.2; find-dired octal escapes instead of Cyrillic text
  2019-06-04  3:43 bug#36085: 26.2; find-dired octal escapes instead of Cyrillic text Nikita
  2019-06-04 11:44 ` bug#36085: Screenshots for the bug Никита Никита
@ 2019-06-08 12:20 ` Eli Zaretskii
  2019-06-09 12:34   ` Tomas Nordin
  2019-06-08 15:14 ` Mattias Engdegård
  2 siblings, 1 reply; 13+ messages in thread
From: Eli Zaretskii @ 2019-06-08 12:20 UTC (permalink / raw)
  To: Nikita; +Cc: 36085

> From: Nikita <grindeg@yandex.ru>
> Date: Tue, 4 Jun 2019 08:43:06 +0500
> 
> When i open dired, go to the needed directory, run "M-x dired-find"
> "-name "*Портрет*" (or anything at all that will give some results)
> results come back with octal escapes instead of Cyrillic letters.
> I cannot open pictures that it finds for example.

Turns out the octal escapes are produced by 'find' itself in this
case.  Try the following command in that directory from the shell
prompt:

   find . \( -iname "*Портрет*" \) -ls

and you will see the same octal escape instead of the Cyrillic
characters.  The man page for 'find' clearly documents this, under
"Unusual Filenames":

 Unusual characters are handled differently by various actions, as
 described below.
 [...]

   -ls, -fls
	 Unusual characters are always escaped.  White space,  backslash,
	 and  double  quote characters are printed using C-style escaping
	 (for example `\f', `\"').  Other unusual characters are  printed
	 using  an octal escape.  Other printable characters (for -ls and
	 -fls these are the characters between octal 041  and  0176)  are
	 printed as-is.

What this means is that any non-ASCII character will be converted to a
series of octal escapes.  IMO, this is a terrible misfeature in GNU
Findutils, as such "handling" of non-ASCII characters has no place in
today's global environment.
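
To make the quoted rule concrete, here is a minimal Python sketch of it.
The helper name escape_like_find_ls is hypothetical, and the code only
approximates the man page wording; it is not taken from the Findutils
sources.  As a simplification, white space is rendered as an octal escape
here, whereas the man page says it gets a C-style escape.

```python
def escape_like_find_ls(name: str, encoding: str = "utf-8") -> str:
    """Approximate how `find -ls` renders a file name (hypothetical helper)."""
    out = []
    for byte in name.encode(encoding):
        ch = chr(byte)
        if ch in ('\\', '"'):
            out.append('\\' + ch)          # C-style escape for \ and "
        elif 0o41 <= byte <= 0o176:
            out.append(ch)                 # printable ASCII is printed as-is
        else:
            out.append('\\%03o' % byte)    # every other byte: octal escape
    return ''.join(out)


# Each Cyrillic letter is two UTF-8 bytes, hence two escapes per letter.
print(escape_like_find_ls("Портрет"))
# \320\237\320\276\321\200\321\202\321\200\320\265\321\202
```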

I suggest reporting this bug to the GNU Findutils developers.

Thanks.

P.S. Emacs could perhaps go above and beyond the call of duty, and
attempt to convert the octal escapes back to readable text.  But I
don't think we should do it, as it's a clear bug in 'find'.
Nonetheless, if someone wants to submit patches to do such a
conversion, I won't block them.





* bug#36085: 26.2; find-dired octal escapes instead of Cyrillic text
  2019-06-04  3:43 bug#36085: 26.2; find-dired octal escapes instead of Cyrillic text Nikita
  2019-06-04 11:44 ` bug#36085: Screenshots for the bug Никита Никита
  2019-06-08 12:20 ` bug#36085: 26.2; find-dired octal escapes instead of Cyrillic text Eli Zaretskii
@ 2019-06-08 15:14 ` Mattias Engdegård
  2019-06-08 15:34   ` Eli Zaretskii
  2 siblings, 1 reply; 13+ messages in thread
From: Mattias Engdegård @ 2019-06-08 15:14 UTC (permalink / raw)
  To: Eli Zaretskii, 36085, grindeg

Eli wrote:

> P.S. Emacs could perhaps go above and beyond the call of duty, and attempt to convert the octal escapes back to readable text. But I don't think we should do it, as it's a clear bug in 'find'. Nonetheless, if someone wants to submit patches to do such a conversion, I won't block them. 

The default (BSD) find in macOS does not seem to escape anything; files named Портрет or APL\360 are printed exactly that way. Thus, Emacs would need to know what 'find' it is running. This appears to validate your recommendation.






* bug#36085: 26.2; find-dired octal escapes instead of Cyrillic text
  2019-06-08 15:14 ` Mattias Engdegård
@ 2019-06-08 15:34   ` Eli Zaretskii
  2019-06-09  5:22     ` Eli Zaretskii
  2022-03-13  6:05     ` Visuwesh
  0 siblings, 2 replies; 13+ messages in thread
From: Eli Zaretskii @ 2019-06-08 15:34 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: grindeg, 36085

> From: Mattias Engdegård <mattiase@acm.org>
> Date: Sat, 8 Jun 2019 17:14:11 +0200
> 
> Eli wrote:
> 
> > P.S. Emacs could perhaps go above and beyond the call of duty, and attempt to convert the octal escapes back to readable text. But I don't think we should do it, as it's a clear bug in 'find'. Nonetheless, if someone wants to submit patches to do such a conversion, I won't block them. 
> 
> The default (BSD) find in macOS does not seem to escape anything; files named Портрет or APL\360 are printed exactly that way. Thus, Emacs would need to know what 'find' it is running. This appears to validate your recommendation.

Indeed, the hard part is to distinguish between \nnn as an octal escape
and the literal string "\nnn".  That difficulty is one reason why
gdb-mi.el performs a similar decoding only as opt-in behavior.
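
The ambiguity can be shown with a small Python sketch.  The name
unescape_find_ls is hypothetical, and the code assumes the producer
writes three-digit octal escapes and doubles literal backslashes, as the
man page excerpt describes.  Output from a 'find' that does not escape
at all is then misread.

```python
import re

# A backslash followed by a backslash, a double quote, or three octal digits.
# Assumption: escapes always use exactly three digits.
_ESCAPE = re.compile(rb'\\(\\|"|[0-3][0-7]{2})')


def unescape_find_ls(text: str, encoding: str = "utf-8") -> str:
    """Undo find-style escaping by scanning the bytes left to right."""
    def replace(match):
        body = match.group(1)
        if body in (b"\\", b'"'):
            return body                      # escaped backslash or quote
        return bytes([int(body, 8)])         # \NNN becomes one raw byte

    raw = _ESCAPE.sub(replace, text.encode(encoding))
    return raw.decode(encoding, errors="replace")


print(unescape_find_ls(r"\320\237\320\276\321\200"))  # escaped Cyrillic
print(unescape_find_ls(r"APL\\360"))                   # escaped backslash kept
print(unescape_find_ls(r"APL\360"))                    # unescaped input, misread
```

The last call is the problem case: a file really named APL\360, printed
without escaping, loses its backslash and digits to the decoder.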





* bug#36085: 26.2; find-dired octal escapes instead of Cyrillic text
  2019-06-08 15:34   ` Eli Zaretskii
@ 2019-06-09  5:22     ` Eli Zaretskii
  2019-06-09  9:08       ` Mattias Engdegård
  2022-03-13  6:05     ` Visuwesh
  1 sibling, 1 reply; 13+ messages in thread
From: Eli Zaretskii @ 2019-06-09  5:22 UTC (permalink / raw)
  To: mattiase, grindeg; +Cc: 36085

> Date: Sat, 08 Jun 2019 18:34:48 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: grindeg@yandex.ru, 36085@debbugs.gnu.org
> 
> Indeed, the hard part is to distinguish between \nnn an octal escape
> and the literal string "\nnn".  That difficulty is one reason why
> gdb-mi.el performs a similar decoding only as an opt-in optional
> behavior.

Here's an idea for making this command work with non-ASCII file names:
do NOT add "-ls" to the 'find' command line, then in the process
filter function call file-attributes on each file name we receive from
'find', and format the result according to Dired convention before
inserting it into the buffer.
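
As a rough illustration of that idea outside Emacs, here is a Python
sketch that takes the names printed by a plain 'find' and builds the
listing lines itself.  The helper names and the exact column layout are
assumptions for the example; they are not the Dired format.

```python
import os
import stat
import time


def format_dired_line(path: str) -> str:
    """Format one `ls -l`-like line for PATH (hypothetical helper)."""
    st = os.lstat(path)                       # do not follow symlinks
    mtime = time.strftime("%Y-%m-%d %H:%M", time.localtime(st.st_mtime))
    return "  %s %3d %d %d %8d %s %s" % (
        stat.filemode(st.st_mode), st.st_nlink, st.st_uid, st.st_gid,
        st.st_size, mtime, path)


def listing_from_find_output(output: str):
    """Yield one formatted line per file name printed by plain `find`."""
    for name in output.split("\n"):
        if name:
            yield format_dired_line(name)
```

Because the name is never passed through -ls, it reaches the listing
unescaped.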

Any takers?





* bug#36085: 26.2; find-dired octal escapes instead of Cyrillic text
  2019-06-09  5:22     ` Eli Zaretskii
@ 2019-06-09  9:08       ` Mattias Engdegård
  2019-06-09 10:57         ` Eli Zaretskii
  0 siblings, 1 reply; 13+ messages in thread
From: Mattias Engdegård @ 2019-06-09  9:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: grindeg, 36085

On 9 June 2019 at 07:22, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> Here's an idea for making this command work with non-ASCII file names:
> do NOT add "-ls" to the 'find' command line, then in the process
> filter function call file-attributes on each file name we receive from
> 'find', and format the result according to Dired convention before
> inserting it into the buffer.

Maybe we can trust -print0 to work everywhere (BSD find has it).

It's probably a quaint notion, but I wish Emacs were able to do without the help of external programs for something as basic as listing directories.






* bug#36085: 26.2; find-dired octal escapes instead of Cyrillic text
  2019-06-09  9:08       ` Mattias Engdegård
@ 2019-06-09 10:57         ` Eli Zaretskii
  2019-06-09 12:39           ` Mattias Engdegård
  0 siblings, 1 reply; 13+ messages in thread
From: Eli Zaretskii @ 2019-06-09 10:57 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: grindeg, 36085

> From: Mattias Engdegård <mattiase@acm.org>
> Date: Sun, 9 Jun 2019 11:08:51 +0200
> Cc: grindeg@yandex.ru, 36085@debbugs.gnu.org
> 
> > Here's an idea for making this command work with non-ASCII file names:
> > do NOT add "-ls" to the 'find' command line, then in the process
> > filter function call file-attributes on each file name we receive from
> > 'find', and format the result according to Dired convention before
> > inserting it into the buffer.
> 
> Maybe we can trust -print0 to work everywhere (BSD find has it).

That's orthogonal, isn't it?  It is only needed to make sure we don't
get confused by file names with embedded newlines, AFAIU.
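
For reference, a short Python sketch of consuming NUL-terminated output.
The function name split_print0 is hypothetical, and the names are
assumed to be UTF-8 encoded.

```python
def split_print0(raw: bytes):
    """Split the bytes written by `find ... -print0` into file names."""
    # Every name ends with NUL, so the chunk after the last terminator
    # is empty and is dropped.  Undecodable bytes are kept as surrogates.
    return [chunk.decode("utf-8", errors="surrogateescape")
            for chunk in raw.split(b"\0") if chunk]


# A name with an embedded newline stays in one piece.
sample = "./Портрет.png".encode("utf-8") + b"\0" + b"./odd\nname.txt\0"
print(split_print0(sample))
```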

> It's probably a quaint notion, but I wish Emacs were be able to do without the help of external programs for something as basic as listing directories.

We have such capabilities, see directory-files-and-attributes and
directory-files-recursively.  We also have find-lisp.el.  I just
assumed these alternatives would be significantly slower, but maybe
that's not the case?

One other consideration is that for large directory trees the current
implementation of find-dired updates the buffer in parallel with
'find' still running, whereas the alternatives will not return until
the whole listing has been generated, which might take a long time.
But maybe we could run the Lisp implementation in a separate thread,
and get the same effect?
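
The general shape of such an arrangement, sketched in Python: a worker
walks the tree and hands over each path as soon as it is found, so the
consumer can display results before the walk ends.  The function names
are hypothetical, and this shows only the producer/consumer pattern; it
says nothing about how Emacs Lisp threads would behave.

```python
import os
import queue
import threading


def walk_in_background(root: str) -> "queue.Queue":
    """Start a worker that puts file paths on a queue as it finds them."""
    results = queue.Queue()

    def worker():
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                results.put(os.path.join(dirpath, name))
        results.put(None)                 # sentinel: traversal finished

    threading.Thread(target=worker, daemon=True).start()
    return results


def consume(results):
    """Yield paths as they arrive, stopping at the sentinel."""
    while True:
        item = results.get()
        if item is None:
            return
        yield item
```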





* bug#36085: 26.2; find-dired octal escapes instead of Cyrillic text
  2019-06-08 12:20 ` bug#36085: 26.2; find-dired octal escapes instead of Cyrillic text Eli Zaretskii
@ 2019-06-09 12:34   ` Tomas Nordin
  2019-06-09 12:51     ` Eli Zaretskii
  0 siblings, 1 reply; 13+ messages in thread
From: Tomas Nordin @ 2019-06-09 12:34 UTC (permalink / raw)
  To: Eli Zaretskii, Nikita; +Cc: 36085

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Nikita <grindeg@yandex.ru>
>> Date: Tue, 4 Jun 2019 08:43:06 +0500
>> 
>> When i open dired, go to the needed directory, run "M-x dired-find"
>> "-name "*Портрет*" (or anything at all that will give some results)
>> results come back with octal escapes instead of Cyrillic letters.
>> I cannot open pictures that it finds for example.
>
> Turns out the octal escapes are produced by 'find' itself in this
> case.  Try the following command in that directory from the shell
> prompt:
>
>    find . \( -iname "*Портрет*" \) -ls
>
> and you will see the same octal escape instead of the Cyrillic
> characters.  The man page for 'find' clearly documents this, under
> "Unusual Filenames":
>
>  Unusual characters are handled differently by various actions, as
>  described below.
>  [...]
>
>    -ls, -fls
> 	 Unusual characters are always escaped.  White space,  backslash,
> 	 and  double  quote characters are printed using C-style escaping
> 	 (for example `\f', `\"').  Other unusual characters are  printed
> 	 using  an octal escape.  Other printable characters (for -ls and
> 	 -fls these are the characters between octal 041  and  0176)  are
> 	 printed as-is.
>
> What this means is that any non-ASCII character will be converted to a
> series of octal escapes.  IMO, this is a terrible misfeature in GNU
> Findutils, as such "handling" of non-ASCII characters has no place in
> today's global environment.

Here on 27.0.50 the customize option for `find-ls-option` says

    For example, to use human-readable file sizes with GNU ls:
       ("-exec ls -ldh {} +" . "-ldh")

Is it ignorant to suggest trying this as a workaround? It "worked" here.
Thanks for this bug report anyway, because I have had the same issue
sometimes.  I will continue to use this option and see whether it causes
any problems.

>
> I suggest to report this bug to the GNU Findutils developers.

Because ls doesn't seem to do this conversion -- inconsistent? :P

Best regards
--
Tomas





* bug#36085: 26.2; find-dired octal escapes instead of Cyrillic text
  2019-06-09 10:57         ` Eli Zaretskii
@ 2019-06-09 12:39           ` Mattias Engdegård
  2019-06-09 12:49             ` Eli Zaretskii
  0 siblings, 1 reply; 13+ messages in thread
From: Mattias Engdegård @ 2019-06-09 12:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: grindeg, 36085

On 9 June 2019 at 12:57, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> Maybe we can trust -print0 to work everywhere (BSD find has it).
> 
> That's orthogonal, isn't it?  It is only needed to make sure we don't
> get confused by file names with embedded newlines, AFAIU.

Not quite orthogonal as the -ls quoting also takes care of newlines, but I have no strong opinion on the matter.

>> It's probably a quaint notion, but I wish Emacs were be able to do without the help of external programs for something as basic as listing directories.
> 
> We have such capabilities, see directory-files-and-attributes and
> directory-files-recursively.  We also have find-lisp.el.  I just
> assumed these alternatives will be significantly slower, but maybe
> that's not the case?

You are right, they are slower, but need not be. The directory listing functions are slow because they throw away information, leading to lots of unnecessary syscalls and, on remote file systems, network roundtrips. This is true both on Unix and Windows.
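
The same effect can be seen in Python's standard library, used here
purely as an analogy and not as a description of the Emacs internals.
Listing names and then asking about each one costs an extra system call
per entry, while os.scandir keeps the type information the directory
read already returned, so in most cases is_dir() needs no further call.
The two function names are hypothetical.

```python
import os


def subdirs_with_extra_stats(path: str):
    """List names only, then stat each one again to learn its type."""
    return sorted(name for name in os.listdir(path)
                  if os.path.isdir(os.path.join(path, name)))


def subdirs_keeping_info(path: str):
    """Use the type information that came with the directory read."""
    with os.scandir(path) as entries:
        return sorted(entry.name for entry in entries if entry.is_dir())
```

Both functions return the same list; they differ only in how many
system calls are needed to produce it.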

Fixing this is not difficult but the elisp interface design requires care, and this goes beyond the scope of this bug. Your suggestions sound more realistic in the short term.

> One other consideration is that for large directory trees the current
> implementation of find-dired updates the buffer in parallel with
> 'find' still running, whereas the alternatives will not return until
> the whole listing has been generated, which might take a long time.

This concern is definitely valid. I don't know to what extent parallelism is possible in the current thread implementation. Again, improvements in this respect would have benefits beyond find-dired.






* bug#36085: 26.2; find-dired octal escapes instead of Cyrillic text
  2019-06-09 12:39           ` Mattias Engdegård
@ 2019-06-09 12:49             ` Eli Zaretskii
  0 siblings, 0 replies; 13+ messages in thread
From: Eli Zaretskii @ 2019-06-09 12:49 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: grindeg, 36085

> From: Mattias Engdegård <mattiase@acm.org>
> Date: Sun, 9 Jun 2019 14:39:32 +0200
> Cc: grindeg@yandex.ru, 36085@debbugs.gnu.org
> 
> > One other consideration is that for large directory trees the current
> > implementation of find-dired updates the buffer in parallel with
> > 'find' still running, whereas the alternatives will not return until
> > the whole listing has been generated, which might take a long time.
> 
> This concern is definitely valid. I don't know to what extent parallelism is possible in the current thread implementation.

Just a note: the current "parallel" implementation is not really
parallel either: 'find' indeed runs in parallel, but the process
filter functions in Emacs only run when Emacs is idle, so if the user
types very quickly after invoking find-dired, they will not see the
results until they pause typing.  And our threads work in
the same manner, at least in principle, so we should be good running
the Lisp implementation in a non-main thread.  Of course, until
someone actually tries that, we won't know whether there are any
obstacles: the devil, as always, is in the details.

> Again, improvements in this respect would have benefits beyond find-dired.

Sure.





* bug#36085: 26.2; find-dired octal escapes instead of Cyrillic text
  2019-06-09 12:34   ` Tomas Nordin
@ 2019-06-09 12:51     ` Eli Zaretskii
  0 siblings, 0 replies; 13+ messages in thread
From: Eli Zaretskii @ 2019-06-09 12:51 UTC (permalink / raw)
  To: Tomas Nordin; +Cc: grindeg, 36085

> From: Tomas Nordin <tomasn@posteo.net>
> Cc: 36085@debbugs.gnu.org
> Date: Sun, 09 Jun 2019 14:34:45 +0200
> 
> Here on 27.0.50 the customize option for `find-ls-option` says
> 
>     For example, to use human-readable file sizes with GNU ls:
>        ("-exec ls -ldh {} +" . "-ldh")
> 
> Is it ignorant to suggest to try this as a workaround?

No, it isn't ignorant.  Thanks for mentioning it.  That said, having
'find' run 'ls' on the files it reports (in batches, given the "{} +"
form) sounds gross to me, and is definitely slower.

> > I suggest to report this bug to the GNU Findutils developers.
> 
> Because ls doesn't seem to do this conversion -- inconsistent? :P

Mainly because doing so with non-ASCII characters is highly
inappropriate nowadays.





* bug#36085: 26.2; find-dired octal escapes instead of Cyrillic text
  2019-06-08 15:34   ` Eli Zaretskii
  2019-06-09  5:22     ` Eli Zaretskii
@ 2022-03-13  6:05     ` Visuwesh
  1 sibling, 0 replies; 13+ messages in thread
From: Visuwesh @ 2022-03-13  6:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Mattias Engdegård, grindeg, 36085

[Sat, Jun 08 2019] Eli Zaretskii wrote:

Hi Eli,

>> From: Mattias Engdegård <mattiase@acm.org>
>> Date: Sat, 8 Jun 2019 17:14:11 +0200
>> 
>> Eli wrote:
>> 
>> > P.S. Emacs could perhaps go above and beyond the call of duty, and
>> > attempt to convert the octal escapes back to readable text. But I
>> > don't think we should do it, as it's a clear bug in
>> > 'find'. Nonetheless, if someone wants to submit patches to do such
>> > a conversion, I won't block them.
>> 
>> The default (BSD) find in macOS does not seem to escape anything;
>> files named Портрет or APL\360 are printed exactly that way. Thus,
>> Emacs would need to know what 'find' it is running. This appears to
>> validate your recommendation.
>
> Indeed, the hard part is to distinguish between \nnn an octal escape
> and the literal string "\nnn".  That difficulty is one reason why
> gdb-mi.el performs a similar decoding only as an opt-in optional
> behavior.

After being annoyed by this exact behaviour, and with the helpful
hint about gdb-mi.el, I came up with the following function.  In
preliminary testing, it does not choke on a literal "\nnn", and unlike
the xargs option it does not noticeably slow down find-dired.  Maybe we
can include something like this, WDYT?

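    (require 'subr-x)                   ; for `when-let' on older Emacs

    (defun vz/find-dired-unescape ()
      "Unescape the C-style octal escape strings."
      (while (not (eobp))
        ;; Find the start of the next file name; the body is skipped
        ;; when there is no further file name in the buffer.
        (when-let ((beg (next-single-property-change (point) 'dired-filename))
                   (props (text-properties-at beg)))
          (goto-char beg)
          ;; Replace each \NNN with the raw byte it stands for.  The loop
          ;; stops at the first match that is preceded by a backslash,
          ;; which is taken to be a literal backslash plus digits.
          (while (and (re-search-forward (rx "\\" (group (any "0-7") (? (any "0-7") (? (any "0-7")))))
                                         (line-end-position) 'noerror)
                      (not (eq (char-before (match-beginning 0)) ?\\)))
            (let ((num (string-to-number (match-string 1) 8)))
              (replace-match (unibyte-string num) t nil nil 0)))
          ;; Turn the raw bytes back into characters, then restore the
          ;; text properties saved from the start of the file name.
          (decode-coding-region beg (line-end-position) buffer-file-coding-system)
          (set-text-properties beg (line-end-position) props))
        (forward-line)))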

    (custom-set-variables
     '(find-ls-option (cons "-ls" "-dlis"))
     '(find-dired-refine-function #'vz/find-dired-unescape))




