* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems @ 2021-06-07 13:32 Rudolf Adamkovič via Bug reports for GNU Emacs, the Swiss army knife of text editors 2021-06-07 14:08 ` Lars Ingebrigtsen ` (2 more replies) 0 siblings, 3 replies; 29+ messages in thread From: Rudolf Adamkovič via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2021-06-07 13:32 UTC (permalink / raw) To: 48902 This is my first Emacs bug report. DESCRIPTION: I use EMMS to listen to music, and recently, I have noticed that the EMMS Browser does not show covers for some of my albums. Today, I decided to investigate the problem, and I found the following. REPRODUCTION STEPS: 1. Create a directory called "Dubmood - C’etait Mieux En Rda (DATA001)". 2. Copy a picture, say "cover.png" into the newly created directory. 3. Launch "emacs -Q". 3. Open the newly created directory in Dired. 4. Press RET on the picture. EXPECTED RESULTS: Emacs shows the picture. ACTUAL RESULTS: Emacs shows an empty window. NOTES: The same problem applies to a folder called "Ultrasyd - L'Épidemie Dansante". It seems that both an apostrophe (') and a backtick (`) cause problems in both EMMS and Dired. -- Rudy In GNU Emacs 28.0.50 (build 2, x86_64-apple-darwin20.3.0, NS appkit-2022.30 Version 11.2.3 (Build 20D91)) of 2021-06-07 built on Workstation.local Repository revision: 82ccc3afcf9ed1f8b22ed5634e788879cd1af279 Repository branch: master Windowing system distributor 'Apple', version 10.3.2022 System Description: macOS 11.2.3 Configured using: 'configure --with-ns --with-native-compilation' Configured features: ACL DBUS GIF GLIB GMP GNUTLS JPEG JSON LCMS2 LIBXML2 MODULES NATIVE_COMP NOTIFY KQUEUE NS PDUMPER PNG RSVG THREADS TIFF TOOLKIT_SCROLL_BARS XIM ZLIB Important settings: value of $LANG: en_SK.UTF-8 locale-coding-system: utf-8-unix Major mode: Group Minor modes in effect: gnus-undo-mode: t shell-dirtrack-mode: t dumb-jump-mode: t global-hl-todo-mode: t show-paren-mode: t global-flycheck-mode: t recentf-mode: t counsel-projectile-mode: t projectile-mode: t global-subword-mode: t subword-mode: t all-the-icons-ivy-rich-mode: t ivy-prescient-mode: t prescient-persist-mode: t ivy-rich-project-root-cache-mode: t ivy-rich-mode: t ivy-mode: t guru-global-mode: t guru-mode: t which-key-mode: t save-place-mode: t global-auto-revert-mode: t delete-selection-mode: t override-global-mode: t tooltip-mode: t global-eldoc-mode: t electric-indent-mode: t mouse-wheel-mode: t menu-bar-mode: t file-name-shadow-mode: t global-font-lock-mode: t font-lock-mode: t auto-composition-mode: t auto-encryption-mode: t auto-compression-mode: t buffer-read-only: t size-indication-mode: t column-number-mode: t line-number-mode: t global-visual-line-mode: t visual-line-mode: t transient-mark-mode: t Load-path shadows: /Users/salutis/.emacs.d/elpa/transient-20210530.2252/transient hides /Users/salutis/Repositories/emacs/nextstep/Emacs.app/Contents/Resources/lisp/transient Features: (shadow emacsbug sendmail smiley gnus-cite mail-extr gnus-async gnus-bcklg sort gnus-ml mm-archive gnutls network-stream url-http url-gw nsm url-cache url-auth qp nnrss mm-url nndraft nnmh nnmaildir nnfolder nnnil gnus-agent gnus-srvr gnus-score score-mode nnvirtual gnus-msg gnus-art mm-uu mml2015 mm-view mml-smime smime dig nntp gnus-cache gnus-sum shr kinsoku svg dom gnus-group gnus-undo gnus-start gnus-dbus dbus xml gnus-cloud nnimap nnmail mail-source utf7 netrc nnoo gnus-spec gnus-int gnus-range message rmc puny rfc822 mml mml-sec epa derived epg epg-config mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader gnus-win em-unix em-term em-script em-prompt em-ls em-hist em-pred em-glob em-dirs esh-var em-cmpl em-basic em-banner em-alias esh-mode bookmark pp writegood-mode two-column password-store auth-source-pass with-editor vterm face-remap vterm-module term/xterm xterm term disp-table ehelp eshell esh-cmd esh-ext esh-opt esh-proc esh-io esh-arg esh-module esh-groups esh-util server amx vc-git diff-mode vc-dispatcher ffap tramp tramp-loaddefs trampver tramp-integration files-x tramp-compat shell parse-time iso8601 ls-lisp pcase flyspell ispell dumb-jump popup hl-todo gnus nnheader gnus-util rmail rmail-loaddefs rfc2047 rfc2045 ietf-drums mail-utils mm-util mail-prsvr cus-load paren flycheck recentf tree-widget wid-edit counsel-projectile ag vc-svn find-dired s dash ripgrep projectile grep ibuf-ext ibuffer ibuffer-loaddefs thingatpt cap-words superword subword all-the-icons-ivy-rich all-the-icons all-the-icons-faces data-material data-weathericons data-octicons data-fileicons data-faicons data-alltheicons ivy-prescient prescient counsel xdg xref project dired dired-loaddefs compile text-property-search swiper ivy-rich ivy flx ivy-faces ivy-overlay colir color guru-mode which-key modus-vivendi-theme modus-operandi-theme modus-themes comp comp-cstr warnings rx saveplace autorevert filenotify delsel exec-path-from-shell diminish use-package-ensure-system-package system-packages use-package use-package-ensure use-package-delight use-package-diminish use-package-bind-key bind-key use-package-core finder-inf cl-extra help-mode org ob ob-tangle ob-ref ob-lob ob-table ob-exp org-macro org-footnote org-src ob-comint org-pcomplete pcomplete comint ansi-color ring org-list org-faces org-entities time-date noutline outline easy-mmode org-version ob-emacs-lisp ob-core ob-eval org-table ol org-keys org-compat advice org-macs org-loaddefs format-spec find-func cal-menu calendar cal-loaddefs tex-site edmacro kmacro info package browse-url url url-proxy url-privacy url-expand url-methods url-history url-cookie url-domsuf url-util mailcap url-handlers url-parse auth-source cl-seq eieio eieio-core cl-macs eieio-loaddefs password-cache json subr-x map url-vars seq byte-opt gv bytecomp byte-compile cconv cl-loaddefs cl-lib iso-transl tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type mwheel term/ns-win ns-win ucs-normalize mule-util term/common-win tool-bar dnd fontset image regexp-opt fringe tabulated-list replace newcomment text-mode elisp-mode lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu timer select scroll-bar mouse jit-lock font-lock syntax font-core term/tty-colors frame minibuffer cl-generic cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese composite charscript charprop case-table epa-hook jka-cmpr-hook help simple abbrev obarray cl-preloaded nadvice button loaddefs faces cus-face macroexp files window text-properties overlay sha1 md5 base64 format env code-pages mule custom widget hashtable-print-readable backquote threads dbusbind kqueue cocoa ns lcms2 multi-tty make-network-process native-compile emacs) Memory information: ((conses 16 862734 56170) (symbols 48 42507 4) (strings 32 248048 23357) (string-bytes 1 10036170) (vectors 16 87234) (vector-slots 8 2207490 61351) (floats 8 1197 526) (intervals 56 735 540) (buffers 992 25)) ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-07 13:32 bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems Rudolf Adamkovič via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2021-06-07 14:08 ` Lars Ingebrigtsen 2021-06-07 14:15 ` Eli Zaretskii 2021-06-07 14:24 ` Lars Ingebrigtsen 2021-06-07 14:13 ` Eli Zaretskii 2021-06-08 10:39 ` naofumi 2 siblings, 2 replies; 29+ messages in thread From: Lars Ingebrigtsen @ 2021-06-07 14:08 UTC (permalink / raw) To: 48902; +Cc: Rudolf Adamkovič Rudolf Adamkovič via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org> writes: > 1. Create a directory called "Dubmood - C’etait Mieux En Rda (DATA001)". > 2. Copy a picture, say "cover.png" into the newly created directory. > 3. Launch "emacs -Q". > 3. Open the newly created directory in Dired. > 4. Press RET on the picture. > > EXPECTED RESULTS: > > Emacs shows the picture. > > ACTUAL RESULTS: > > Emacs shows an empty window. I'm unable to reproduce this problem on Debian/bullseye... > In GNU Emacs 28.0.50 (build 2, x86_64-apple-darwin20.3.0, NS appkit-2022.30 Version 11.2.3 (Build 20D91)) > of 2021-06-07 built on Workstation.local ... so perhaps it's only a problem for the Macos build? -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-07 14:08 ` Lars Ingebrigtsen @ 2021-06-07 14:15 ` Eli Zaretskii 2021-06-07 14:24 ` Lars Ingebrigtsen 1 sibling, 0 replies; 29+ messages in thread From: Eli Zaretskii @ 2021-06-07 14:15 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: 48902, salutis > From: Lars Ingebrigtsen <larsi@gnus.org> > Date: Mon, 07 Jun 2021 16:08:31 +0200 > Cc: Rudolf Adamkovič <salutis@me.com> > > I'm unable to reproduce this problem on Debian/bullseye... > > > In GNU Emacs 28.0.50 (build 2, x86_64-apple-darwin20.3.0, NS appkit-2022.30 Version 11.2.3 (Build 20D91)) > > of 2021-06-07 built on Workstation.local > > ... so perhaps it's only a problem for the Macos build? I suspect it has to do with the special way macOS encodes non-ASCII file names. ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-07 14:08 ` Lars Ingebrigtsen 2021-06-07 14:15 ` Eli Zaretskii @ 2021-06-07 14:24 ` Lars Ingebrigtsen 2021-06-07 14:36 ` Eli Zaretskii 1 sibling, 1 reply; 29+ messages in thread From: Lars Ingebrigtsen @ 2021-06-07 14:24 UTC (permalink / raw) To: 48902; +Cc: Alan Third, Rudolf Adamkovič Lars Ingebrigtsen <larsi@gnus.org> writes: > ... so perhaps it's only a problem for the Macos build? Yup; the current trunk isn't able to view images under Macos from dired if the path contains a non-ASCII char. Test case: larsi@emkay trunk % mkdir /tmp/fóo larsi@emkay trunk % cp etc/images/gnus/gnus.png /tmp/fóo/ larsi@emkay trunk % ./src/emacs /tmp/fóo/ And then RET the file to test. I then get an error message about failed output type 'public.tiff' on the terminal. I've added Alan to the CCs. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-07 14:24 ` Lars Ingebrigtsen @ 2021-06-07 14:36 ` Eli Zaretskii 0 siblings, 0 replies; 29+ messages in thread From: Eli Zaretskii @ 2021-06-07 14:36 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: 48902, alan, salutis > From: Lars Ingebrigtsen <larsi@gnus.org> > Date: Mon, 07 Jun 2021 16:24:44 +0200 > Cc: Alan Third <alan@idiocy.org>, > Rudolf Adamkovič <salutis@me.com> > > larsi@emkay trunk % mkdir /tmp/fóo > larsi@emkay trunk % cp etc/images/gnus/gnus.png /tmp/fóo/ > larsi@emkay trunk % ./src/emacs /tmp/fóo/ > > And then RET the file to test. I then get an error message about > failed output type 'public.tiff' on the terminal. What happens if you type larsi@emkay trunk % ./src/emacs /tmp/fóo/gnus.png does it show the PNG image then? And what are the values of file-name-coding-system and default-file-name-coding-system? ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-07 13:32 bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems Rudolf Adamkovič via Bug reports for GNU Emacs, the Swiss army knife of text editors 2021-06-07 14:08 ` Lars Ingebrigtsen @ 2021-06-07 14:13 ` Eli Zaretskii 2021-06-08 22:21 ` Rudolf Adamkovič via Bug reports for GNU Emacs, the Swiss army knife of text editors 2021-06-08 10:39 ` naofumi 2 siblings, 1 reply; 29+ messages in thread From: Eli Zaretskii @ 2021-06-07 14:13 UTC (permalink / raw) To: Rudolf Adamkovič; +Cc: 48902 > Date: Mon, 07 Jun 2021 15:32:10 +0200 > From: Rudolf Adamkovič via "Bug reports for GNU Emacs, > the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org> > > 1. Create a directory called "Dubmood - C’etait Mieux En Rda (DATA001)". > 2. Copy a picture, say "cover.png" into the newly created directory. > 3. Launch "emacs -Q". > 3. Open the newly created directory in Dired. > 4. Press RET on the picture. > > EXPECTED RESULTS: > > Emacs shows the picture. > > ACTUAL RESULTS: > > Emacs shows an empty window. > > NOTES: > > The same problem applies to a folder called "Ultrasyd - L'Épidemie > Dansante". It seems that both an apostrophe (') and a backtick (`) > cause problems in both EMMS and Dired. When you visit that directory in Dired, what is the value of buffer-file-coding-system in the Dired buffer? Also, you mention backtick, but there's no such character in the example you show. ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-07 14:13 ` Eli Zaretskii @ 2021-06-08 22:21 ` Rudolf Adamkovič via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 0 replies; 29+ messages in thread From: Rudolf Adamkovič via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2021-06-08 22:21 UTC (permalink / raw) To: 48902 Eli Zaretskii <eliz@gnu.org> writes: > When you visit that directory in Dired, what is the value of > buffer-file-coding-system in the Dired buffer? The value is “utf-8-unix”. > Also, you mention backtick, but there's no such character in the > example you show. There is no backtick. My apologies! -- Rudy ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-07 13:32 bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems Rudolf Adamkovič via Bug reports for GNU Emacs, the Swiss army knife of text editors 2021-06-07 14:08 ` Lars Ingebrigtsen 2021-06-07 14:13 ` Eli Zaretskii @ 2021-06-08 10:39 ` naofumi 2021-06-08 11:57 ` Lars Ingebrigtsen 2 siblings, 1 reply; 29+ messages in thread From: naofumi @ 2021-06-08 10:39 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 48902, Rudolf Adamkovič [-- Attachment #1: Type: text/plain, Size: 2110 bytes --] On Monday, June 07, 2021 16:15 CEST, Eli Zaretskii <eliz@gnu.org> wrote: > > From: Lars Ingebrigtsen <larsi@gnus.org> > > ... so perhaps it's only a problem for the Macos build? > > I suspect it has to do with the special way macOS encodes non-ASCII > file names. > I found the same issue with images which has Japanese filename on macOS. Message buffer says: ------------------------------------------------------------------------ Unable to load image (image :type png :file /Users/naofumi/_git/git.sv.gnu.org/emacs_png/いーまっくす.png :scale 1 :max-width 480 :max-height 781 :format nil :transform-smoothing t) [5 times] ------------------------------------------------------------------------ At least, attached small patch could fix this. attachments: 0001-Fix-to-show-images-with-non-ascii-filename-on-macOS.patch ns_load_image-error-with-non-ascii-filename.png revert-filename-NSString-in-nsimage.png emacs_png.tar.gz This [EmacsImage allocInitFromFile:] change was introduced by the following commit: ------------------------------------------------------------------------ commit 747a923b9a35533f98573ad5b01fccf096195079 Author: Alan Third <alan@idiocy.org> Date: Tue Dec 22 23:28:25 2020 +0000 Use new NSString lisp methods * src/nsfont.m (ns_otf_to_script): (ns_registry_to_script): (ns_get_req_script): Use NSString conversion methods. * src/nsimage.m ([EmacsImage allocInitFromFile:]): Use NSString conversion methods. * src/nsmenu.m (ns_menu_show): Use NSString conversion methods. * src/nsselect.m (symbol_to_nsstring): (ns_string_to_pasteboard_internal): Use NSString conversion methods. * src/nsterm.m (ns_term_init): ([EmacsView initFrameFromEmacs:]): Use NSString conversion methods. * src/nsxwidget.m (nsxwidget_webkit_uri): (nsxwidget_webkit_title): (js_to_lisp): Use NSString conversion methods. (build_string_with_nsstr): Functionality replaced by NSString extensions. ------------------------------------------------------------------------ Regards, --Naofumi [-- Attachment #2: emacs_png.tar.gz --] [-- Type: application/x-gzip, Size: 14482 bytes --] [-- Attachment #3: ns_load_image-error-with-non-ascii-filename.png --] [-- Type: image/png, Size: 366374 bytes --] [-- Attachment #4: 0001-Fix-to-show-images-with-non-ascii-filename-on-macOS.patch --] [-- Type: application/octet-stream, Size: 921 bytes --] From bb9250b67bf887224059c337d117a720fcf79dd7 Mon Sep 17 00:00:00 2001 From: Naofumi Yasufuku <naofumi@yasufuku.dev> Date: Tue, 8 Jun 2021 19:04:24 +0900 Subject: [PATCH] Fix to show images with non-ascii filename on macOS * src/nsimage.m ([EmacsImage allocInitFromFile:]): Revert filename NSString to use stringWithUTF8String instead of stringWithLispString. (Bug#48902) --- src/nsimage.m | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/nsimage.m b/src/nsimage.m index fa81a41a51..8c7a3d9a09 100644 --- a/src/nsimage.m +++ b/src/nsimage.m @@ -262,7 +262,7 @@ + (instancetype)allocInitFromFile: (Lisp_Object)file found = ENCODE_FILE (found); image = [[EmacsImage alloc] initByReferencingFile: - [NSString stringWithLispString: found]]; + [NSString stringWithUTF8String: SSDATA (found)]]; image->bmRep = nil; #ifdef NS_IMPL_COCOA -- 2.31.1 [-- Attachment #5: revert-filename-NSString-in-nsimage.png --] [-- Type: image/png, Size: 232034 bytes --] ^ permalink raw reply related [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-08 10:39 ` naofumi @ 2021-06-08 11:57 ` Lars Ingebrigtsen 2021-06-08 12:12 ` Alan Third 0 siblings, 1 reply; 29+ messages in thread From: Lars Ingebrigtsen @ 2021-06-08 11:57 UTC (permalink / raw) To: naofumi; +Cc: 48902, Rudolf Adamkovič, Alan Third naofumi@yasufuku.dev <naofumi@yasufuku.dev> writes: > diff --git a/src/nsimage.m b/src/nsimage.m > index fa81a41a51..8c7a3d9a09 100644 > --- a/src/nsimage.m > +++ b/src/nsimage.m > @@ -262,7 +262,7 @@ + (instancetype)allocInitFromFile: (Lisp_Object)file > found = ENCODE_FILE (found); > > image = [[EmacsImage alloc] initByReferencingFile: > - [NSString stringWithLispString: found]]; > + [NSString stringWithUTF8String: SSDATA (found)]]; Hm... I'm not very familiar at all with the Objective C code here... but shouldn't "found" here be a Lisp string so that stringWithLispString would do the right thing? -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-08 11:57 ` Lars Ingebrigtsen @ 2021-06-08 12:12 ` Alan Third 2021-06-08 12:14 ` Lars Ingebrigtsen 2021-06-08 12:37 ` Eli Zaretskii 0 siblings, 2 replies; 29+ messages in thread From: Alan Third @ 2021-06-08 12:12 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: 48902, Rudolf Adamkovič, naofumi On Tue, Jun 08, 2021 at 01:57:01PM +0200, Lars Ingebrigtsen wrote: > naofumi@yasufuku.dev <naofumi@yasufuku.dev> writes: > > > diff --git a/src/nsimage.m b/src/nsimage.m > > index fa81a41a51..8c7a3d9a09 100644 > > --- a/src/nsimage.m > > +++ b/src/nsimage.m > > @@ -262,7 +262,7 @@ + (instancetype)allocInitFromFile: (Lisp_Object)file > > found = ENCODE_FILE (found); > > > > image = [[EmacsImage alloc] initByReferencingFile: > > - [NSString stringWithLispString: found]]; > > + [NSString stringWithUTF8String: SSDATA (found)]]; > > Hm... I'm not very familiar at all with the Objective C code here... > but shouldn't "found" here be a Lisp string so that stringWithLispString > would do the right thing? It's always possible that stringWithLispString isn't doing the right thing. It's implemented at nsfns.m:3026. I know almost nothing about UTF8/UTF16 so while it looks like it's doing the right thing to me, I could be entirely wrong. -- Alan Third ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-08 12:12 ` Alan Third @ 2021-06-08 12:14 ` Lars Ingebrigtsen 2021-06-08 17:45 ` Mattias Engdegård 2021-06-08 18:17 ` Mattias Engdegård 2021-06-08 12:37 ` Eli Zaretskii 1 sibling, 2 replies; 29+ messages in thread From: Lars Ingebrigtsen @ 2021-06-08 12:14 UTC (permalink / raw) To: Alan Third; +Cc: 48902, Mattias Engdegård, Rudolf Adamkovič, naofumi Alan Third <alan@idiocy.org> writes: >> > image = [[EmacsImage alloc] initByReferencingFile: >> > - [NSString stringWithLispString: found]]; >> > + [NSString stringWithUTF8String: SSDATA (found)]]; >> >> Hm... I'm not very familiar at all with the Objective C code here... >> but shouldn't "found" here be a Lisp string so that stringWithLispString >> would do the right thing? > > It's always possible that stringWithLispString isn't doing the right > thing. It's implemented at nsfns.m:3026. I know almost nothing about > UTF8/UTF16 so while it looks like it's doing the right thing to me, I > could be entirely wrong. Right -- and that was written by Mattias, so I've added him to the CCs. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-08 12:14 ` Lars Ingebrigtsen @ 2021-06-08 17:45 ` Mattias Engdegård 2021-06-08 18:18 ` Eli Zaretskii 2021-06-08 19:10 ` Alan Third 2021-06-08 18:17 ` Mattias Engdegård 1 sibling, 2 replies; 29+ messages in thread From: Mattias Engdegård @ 2021-06-08 17:45 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: 48902, Alan Third, Rudolf Adamkovič, naofumi 8 juni 2021 kl. 14.14 skrev Lars Ingebrigtsen <larsi@gnus.org>: >> It's always possible that stringWithLispString isn't doing the right >> thing. It's implemented at nsfns.m:3026. I know almost nothing about >> UTF8/UTF16 so while it looks like it's doing the right thing to me, I >> could be entirely wrong. > > Right -- and that was written by Mattias, so I've added him to the CCs. Thank you, but stringWithLispString: is actually fine, unless you count its inability to produce useful results from wrong input! The image code seems to be quite confused with respect to whether the file names being passed around are in encoded form. Until recently it seems to have worked by pure chance since as long as the file name encoding is UTF-8 and the low-level code accesses the raw string data we do get the same result, but at least since 747a923b9a35 that's no longer the case. Concretely, we have: 1. image_find_image_file probably expects its `file` argument to be non-encoded, but the string it returns is always encoded. 2. native_image_load calls image_find_image_file and passes its return value to ns_load_image. 3. ns_load_image calls [EmacsImage allocInitFromFile:] with its file argument. 4. [EmacsImage allocInitFromFile: file] can apparently be called with both a non-encoded or an encoded `file` argument (clearly not ideal), and it does: found = image_find_image_file (file); // This is dubious when `file` is already encoded. found = ENCODE_FILE (found); // This is completely useless since `found` is already encoded! Apparently ENCODE_FILE is idempotent, at least on macOS... [NSString stringWithLispString: found] // This produces nonsense as `found` is a string of raw bytes, so any Unicode will be converted to stretches of U+FFFD REPLACEMENT CHAR. [NSString stringWithLispString: file] // Same problem again, with a different variable. The quick fix of reverting to stringWithUTF8String: will work, but the real problem is that we have no control of the encodedness of lisp strings being passed around. Comments would help, and I'd even go as far to suggest typedef struct { Lisp_Object string; } encoded_file_name_t; with the appropriate constructors and accessors, to get C's static type checking to work for us. ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-08 17:45 ` Mattias Engdegård @ 2021-06-08 18:18 ` Eli Zaretskii 2021-06-08 19:13 ` naofumi 2021-06-08 19:10 ` Alan Third 1 sibling, 1 reply; 29+ messages in thread From: Eli Zaretskii @ 2021-06-08 18:18 UTC (permalink / raw) To: Mattias Engdegård; +Cc: 48902, larsi, salutis, alan, naofumi > From: Mattias Engdegård <mattiase@acm.org> > Date: Tue, 8 Jun 2021 19:45:22 +0200 > Cc: 48902@debbugs.gnu.org, Alan Third <alan@idiocy.org>, > Rudolf Adamkovič <salutis@me.com>, naofumi@yasufuku.dev > > The quick fix of reverting to stringWithUTF8String: will work, but the real problem is that we have no control of the encodedness of lisp strings being passed around. The usual technique in these cases is to keep Lisp strings unencoded, encode them when passing to the low-level C functions, and pass to those functions only the pointer to the string's data. In those rare cases when you really need to pass a Lisp string that is an encoded file name, call the argument "encoded_file" or somesuch. But these cases should be rare exceptions. ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-08 18:18 ` Eli Zaretskii @ 2021-06-08 19:13 ` naofumi 2021-06-08 20:08 ` Mattias Engdegård 0 siblings, 1 reply; 29+ messages in thread From: naofumi @ 2021-06-08 19:13 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 48902, Mattias Engdegård, larsi, salutis, alan [-- Attachment #1: Type: text/plain, Size: 1494 bytes --] On Tuesday, June 08, 2021 20:18 CEST, Eli Zaretskii <eliz@gnu.org> wrote: > > From: Mattias Engdegård <mattiase@acm.org> > > Date: Tue, 8 Jun 2021 19:45:22 +0200 > > Cc: 48902@debbugs.gnu.org, Alan Third <alan@idiocy.org>, > > Rudolf Adamkovič <salutis@me.com>, naofumi@yasufuku.dev > > > > The quick fix of reverting to stringWithUTF8String: will work, but the real problem is that we have no control of the encodedness of lisp strings being passed around. > > The usual technique in these cases is to keep Lisp strings unencoded, > encode them when passing to the low-level C functions, and pass to > those functions only the pointer to the string's data. > > In those rare cases when you really need to pass a Lisp string that is > an encoded file name, call the argument "encoded_file" or somesuch. > But these cases should be rare exceptions. > > I agree that [NSString stringWithLispString:] is working as expected, and it is not the real problem. For example, another patch using STRING_SET_MULTIBYTE() is here. This looks still just a quick fix and bit weired, though.. attachments: 0001-Fix-to-show-images-with-non-ascii-filename-STRING_SET_MULTIBYTE.patch image-non-ascii-filename-STRING_SET_MULTIBYTE-macos.png image-non-ascii-filename-STRING_SET_MULTIBYTE-linux.png On the other hand, I cannot find out that non-UTF-8 filename coding is really needed on macOS. It might be over-engineered thing, and just an overhead. Regards, --Naofumi [-- Attachment #2: image-non-ascii-filename-STRING_SET_MULTIBYTE-macos.png --] [-- Type: image/png, Size: 234036 bytes --] [-- Attachment #3: 0001-Fix-to-show-images-with-non-ascii-filename-STRING_SET_MULTIBYTE.patch --] [-- Type: application/octet-stream, Size: 1676 bytes --] From 0b18100ada3a16e667684383150c4f4d7848e5ce Mon Sep 17 00:00:00 2001 From: Naofumi Yasufuku <naofumi@yasufuku.dev> Date: Wed, 9 Jun 2021 02:10:43 +0900 Subject: [PATCH] Fix to show images with non-ascii filename on macOS * src/image.c (image_find_image_fd): Indicate 'file_found' Lisp_String as multibyte if 'file' or 'search_path' is multibyte Lisp_String. Without this special care, string_to_multibyte() call in [NSString stringWithLispString:] attempts to convert the multibyte filename (UTF-8 by default) to multybyte string unintentionally. (Bug#48902) * src/nsimage.m ([EmacsImage allocInitFromFile:]): Remove redundant ENCODE_FILE() which is done by image_find_image_fd(). --- src/image.c | 2 ++ src/nsimage.m | 1 - 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/src/image.c b/src/image.c index b34dc3e916..f7204af873 100644 --- a/src/image.c +++ b/src/image.c @@ -3156,6 +3156,8 @@ image_find_image_fd (Lisp_Object file, int *pfd) if (fd >= 0 || fd == -2) { file_found = ENCODE_FILE (file_found); + if (STRING_MULTIBYTE (search_path) || STRING_MULTIBYTE (file)) + STRING_SET_MULTIBYTE (file_found); if (fd == -2) { /* The file exists locally, but has a file name handler. diff --git a/src/nsimage.m b/src/nsimage.m index fa81a41a51..5e5960de90 100644 --- a/src/nsimage.m +++ b/src/nsimage.m @@ -259,7 +259,6 @@ + (instancetype)allocInitFromFile: (Lisp_Object)file found = image_find_image_file (file); if (!STRINGP (found)) return nil; - found = ENCODE_FILE (found); image = [[EmacsImage alloc] initByReferencingFile: [NSString stringWithLispString: found]]; -- 2.31.1 [-- Attachment #4: image-non-ascii-filename-STRING_SET_MULTIBYTE-linux.png --] [-- Type: image/png, Size: 82170 bytes --] ^ permalink raw reply related [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-08 19:13 ` naofumi @ 2021-06-08 20:08 ` Mattias Engdegård 0 siblings, 0 replies; 29+ messages in thread From: Mattias Engdegård @ 2021-06-08 20:08 UTC (permalink / raw) To: naofumi@yasufuku.dev; +Cc: 48902, salutis, larsi, alan 8 juni 2021 kl. 21.13 skrev naofumi@yasufuku.dev: > For example, another patch using STRING_SET_MULTIBYTE() is here. > This looks still just a quick fix and bit weired, though.. Thank you, but it's probably better to always return an unencoded string from image_find_image_fd in that case. The current code looks like a premature optimisation. > On the other hand, I cannot find out that non-UTF-8 filename coding is > really needed on macOS. It might be over-engineered thing, and just an overhead. Maybe and in practice you are probably right, but the NS port is not exclusive to macOS. There is the quasi-NFD normalisation step but I'm not sure how important that is today. There's no need to convert to NFD for accessing files; it only matters when comparing names (much like case folding on many file systems). ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-08 17:45 ` Mattias Engdegård 2021-06-08 18:18 ` Eli Zaretskii @ 2021-06-08 19:10 ` Alan Third 2021-06-08 19:52 ` Mattias Engdegård 1 sibling, 1 reply; 29+ messages in thread From: Alan Third @ 2021-06-08 19:10 UTC (permalink / raw) To: Mattias Engdegård Cc: 48902, Lars Ingebrigtsen, Rudolf Adamkovič, naofumi [-- Attachment #1: Type: text/plain, Size: 1495 bytes --] On Tue, Jun 08, 2021 at 07:45:22PM +0200, Mattias Engdegård wrote: > 8 juni 2021 kl. 14.14 skrev Lars Ingebrigtsen <larsi@gnus.org>: > > >> It's always possible that stringWithLispString isn't doing the right > >> thing. It's implemented at nsfns.m:3026. I know almost nothing about > >> UTF8/UTF16 so while it looks like it's doing the right thing to me, I > >> could be entirely wrong. > > > > Right -- and that was written by Mattias, so I've added him to the CCs. > > Thank you, but stringWithLispString: is actually fine, unless you > count its inability to produce useful results from wrong input! In my defence it wasn't entirely clear to me that a lisp string returned from ENCODE_FILE was incompatible with stringWithLispString. ;) > The image code seems to be quite confused with respect to whether > the file names being passed around are in encoded form. Until > recently it seems to have worked by pure chance since as long as the > file name encoding is UTF-8 and the low-level code accesses the raw > string data we do get the same result, but at least since > 747a923b9a35 that's no longer the case. Hmm, and as you point out we use "file" further down and it may or may not be encoded, but will probably have the same contents as found, which we know is encoded. Plus it's setting the "name" field in the image, which we probably want to keep as uniform as possible for caching purposes but is otherwise irrelevant. I think the attached should solve this. -- Alan Third [-- Attachment #2: 0001-Fix-image-filename-encoding-issues-in-NS-port-bug-48.patch --] [-- Type: text/x-diff, Size: 2175 bytes --] From c2902846c57c55576b1612bf11afaf240994ca70 Mon Sep 17 00:00:00 2001 From: Alan Third <alan@idiocy.org> Date: Tue, 8 Jun 2021 20:08:34 +0100 Subject: [PATCH] Fix image filename encoding issues in NS port (bug#48902) * src/nsfns.m ([NSString stringWithLispString:]): Clarify usage in comment. * src/nsimage.m ([EmacsImage allocInitFromFile:]): Clarify that the filename is UTF-8 encoded and handle it correctly. --- src/nsfns.m | 3 ++- src/nsimage.m | 11 +++++------ 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/src/nsfns.m b/src/nsfns.m index d14f7b51ea..98801d8526 100644 --- a/src/nsfns.m +++ b/src/nsfns.m @@ -3024,7 +3024,8 @@ - (NSString *)panel: (id)sender userEnteredFilename: (NSString *)filename } @implementation NSString (EmacsString) -/* Make an NSString from a Lisp string. */ +/* Make an NSString from a Lisp string. STRING must not be in an + encoded form (e.g. UTF-8). */ + (NSString *)stringWithLispString:(Lisp_Object)string { /* Shortcut for the common case. */ diff --git a/src/nsimage.m b/src/nsimage.m index fa81a41a51..b0bd52111b 100644 --- a/src/nsimage.m +++ b/src/nsimage.m @@ -252,17 +252,16 @@ @implementation EmacsImage + (instancetype)allocInitFromFile: (Lisp_Object)file { NSImageRep *imgRep; - Lisp_Object found; + Lisp_Object utf8_filename; EmacsImage *image; /* Search bitmap-file-path for the file, if appropriate. */ - found = image_find_image_file (file); - if (!STRINGP (found)) + utf8_filename = image_find_image_file (file); + if (!STRINGP (utf8_filename)) return nil; - found = ENCODE_FILE (found); image = [[EmacsImage alloc] initByReferencingFile: - [NSString stringWithLispString: found]]; + [NSString stringWithUTF8String: SSDATA (utf8_filename)]]; image->bmRep = nil; #ifdef NS_IMPL_COCOA @@ -278,7 +277,7 @@ + (instancetype)allocInitFromFile: (Lisp_Object)file [image setSize: NSMakeSize([imgRep pixelsWide], [imgRep pixelsHigh])]; - [image setName: [NSString stringWithLispString: file]]; + [image setName: [NSString stringWithUTF8String: SSDATA (utf8_filename)]]; return image; } -- 2.30.2 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-08 19:10 ` Alan Third @ 2021-06-08 19:52 ` Mattias Engdegård 2021-06-08 20:33 ` Alan Third 0 siblings, 1 reply; 29+ messages in thread From: Mattias Engdegård @ 2021-06-08 19:52 UTC (permalink / raw) To: Alan Third; +Cc: 48902, Lars Ingebrigtsen, Rudolf Adamkovič, naofumi 8 juni 2021 kl. 21.10 skrev Alan Third <alan@idiocy.org>: > In my defence it wasn't entirely clear to me that a lisp string > returned from ENCODE_FILE was incompatible with stringWithLispString. ;) Oh it's compatible all right, it just takes it job description very literally! That's typical of them computers -- no imagination at all. > Hmm, and as you point out we use "file" further down and it may or may > not be encoded, but will probably have the same contents as found, > which we know is encoded. Plus it's setting the "name" field in the > image, which we probably want to keep as uniform as possible for > caching purposes but is otherwise irrelevant. > > I think the attached should solve this. Thank you, that would work and I don't mind you pushing that right away. We probably should clear up the encodedness of `file` in allocInitFromFile: -- as Eli said, the convention is keeping strings unencoded until needed by low-level operations and it really makes the most sense. ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-08 19:52 ` Mattias Engdegård @ 2021-06-08 20:33 ` Alan Third 2021-06-09 11:40 ` Mattias Engdegård 2021-06-09 11:56 ` Eli Zaretskii 0 siblings, 2 replies; 29+ messages in thread From: Alan Third @ 2021-06-08 20:33 UTC (permalink / raw) To: Mattias Engdegård Cc: 48902, Lars Ingebrigtsen, Rudolf Adamkovič, naofumi [-- Attachment #1: Type: text/plain, Size: 1519 bytes --] On Tue, Jun 08, 2021 at 09:52:51PM +0200, Mattias Engdegård wrote: > 8 juni 2021 kl. 21.10 skrev Alan Third <alan@idiocy.org>: > > > I think the attached should solve this. > > Thank you, that would work and I don't mind you pushing that right > away. We probably should clear up the encodedness of `file` in > allocInitFromFile: -- as Eli said, the convention is keeping strings > unencoded until needed by low-level operations and it really makes > the most sense. It's not just allocInitFromFile, I'm looking at the other callers of image_find_image_file and they all call ENCODE_FILE after it too. The only direct caller of image_find_image_fd that actually uses the contents of the returned string (svg_load) also encodes the file name. So I think we could restrict the use of the encoded filename within image_find_image_fd to *only* when it actually opens the file. Patch attached. I've tested it here but I only have a couple of images to try it with. I've been looking at the other changes I made in 747a923b9a35533f98573ad5b01fccf096195079 and I'm not sure they're correct. They clearly work now, but most of the time it's probably simple ASCII which should pass easily. Before they *all* seem to have assumed the data was UTF8 encoded, which is surely wrong since most of the time it's coming from Emacs. It's things like menu item titles. These are the use cases stringWithLispString was designed for, right? The only odd one is image filenames because they're explicitly encoded? -- Alan Third [-- Attachment #2: v2-0001-Fix-image-filename-encoding-issues-bug-48902.patch --] [-- Type: text/x-diff, Size: 2186 bytes --] From 48763cfe123955173ad82085125b2f08295daa7d Mon Sep 17 00:00:00 2001 From: Alan Third <alan@idiocy.org> Date: Tue, 8 Jun 2021 20:08:34 +0100 Subject: [PATCH v2] Fix image filename encoding issues (bug#48902) * src/image.c (image_find_image_fd): Don't return an encoded filename string. * src/nsfns.m: ([NSString stringWithLispString:]): Clarify usage comment. --- src/image.c | 19 ++++++++----------- src/nsfns.m | 3 ++- 2 files changed, 10 insertions(+), 12 deletions(-) diff --git a/src/image.c b/src/image.c index b34dc3e916..d1aaaf8f53 100644 --- a/src/image.c +++ b/src/image.c @@ -3153,19 +3153,16 @@ image_find_image_fd (Lisp_Object file, int *pfd) /* Try to find FILE in data-directory/images, then x-bitmap-file-path. */ fd = openp (search_path, file, Qnil, &file_found, pfd ? Qt : make_fixnum (R_OK), false, false); - if (fd >= 0 || fd == -2) + if (fd == -2) { - file_found = ENCODE_FILE (file_found); - if (fd == -2) - { - /* The file exists locally, but has a file name handler. - (This happens, e.g., under Auto Image File Mode.) - 'openp' didn't open the file, so we should, because the - caller expects that. */ - fd = emacs_open (SSDATA (file_found), O_RDONLY, 0); - } + /* The file exists locally, but has a file name handler. + (This happens, e.g., under Auto Image File Mode.) + 'openp' didn't open the file, so we should, because the + caller expects that. */ + Lisp_Object encoded_name = ENCODE_FILE (file_found); + fd = emacs_open (SSDATA (encoded_name), O_RDONLY, 0); } - else /* fd < 0, but not -2 */ + else if (fd < 0) return Qnil; if (pfd) *pfd = fd; diff --git a/src/nsfns.m b/src/nsfns.m index d14f7b51ea..98801d8526 100644 --- a/src/nsfns.m +++ b/src/nsfns.m @@ -3024,7 +3024,8 @@ - (NSString *)panel: (id)sender userEnteredFilename: (NSString *)filename } @implementation NSString (EmacsString) -/* Make an NSString from a Lisp string. */ +/* Make an NSString from a Lisp string. STRING must not be in an + encoded form (e.g. UTF-8). */ + (NSString *)stringWithLispString:(Lisp_Object)string { /* Shortcut for the common case. */ -- 2.30.2 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-08 20:33 ` Alan Third @ 2021-06-09 11:40 ` Mattias Engdegård 2021-06-09 15:19 ` Alan Third 2021-06-09 11:56 ` Eli Zaretskii 1 sibling, 1 reply; 29+ messages in thread From: Mattias Engdegård @ 2021-06-09 11:40 UTC (permalink / raw) To: Alan Third; +Cc: 48902, Lars Ingebrigtsen, Rudolf Adamkovič, naofumi 8 juni 2021 kl. 22.33 skrev Alan Third <alan@idiocy.org>: > It's not just allocInitFromFile, I'm looking at the other callers of > image_find_image_file and they all call ENCODE_FILE after it too. > > The only direct caller of image_find_image_fd that actually uses the > contents of the returned string (svg_load) also encodes the file name. > > So I think we could restrict the use of the encoded filename within > image_find_image_fd to *only* when it actually opens the file. Thank you, and I arrived at the same conclusion. > Patch attached. I've tested it here but I only have a couple of images > to try it with. Looks fine, but the image_find_image_file comment needs to be amended since it says that it returns an encoded string. > I've been looking at the other changes I made in > 747a923b9a35533f98573ad5b01fccf096195079 and I'm not sure they're > correct. They clearly work now, but most of the time it's probably > simple ASCII which should pass easily. > > Before they *all* seem to have assumed the data was UTF8 encoded, > which is surely wrong since most of the time it's coming from Emacs. > It's things like menu item titles. > > These are the use cases stringWithLispString was designed for, right? > The only odd one is image filenames because they're explicitly encoded? I should think so -- stringWithLispString: was designed as a general-purpose method to convert from a lisp string to NSString without changing the contents. Non-Unicode values (which includes raw bytes) become U+FFFD except surrogates as they can be represented (in a manner of speaking) in UTF-16, and it turns out to be more useful that way. Furthermore, if we use stringWithLispString: for file names, no special file name encoding step should be needed on our side, since the NS libs will perform any needed normalisation (at least if I've understood it right). ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-09 11:40 ` Mattias Engdegård @ 2021-06-09 15:19 ` Alan Third 2021-06-11 22:09 ` Rudolf Adamkovič via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 1 reply; 29+ messages in thread From: Alan Third @ 2021-06-09 15:19 UTC (permalink / raw) To: Mattias Engdegård Cc: 48902, Lars Ingebrigtsen, Rudolf Adamkovič, naofumi On Wed, Jun 09, 2021 at 01:40:18PM +0200, Mattias Engdegård wrote: > 8 juni 2021 kl. 22.33 skrev Alan Third <alan@idiocy.org>: > > > Patch attached. I've tested it here but I only have a couple of images > > to try it with. > > Looks fine, but the image_find_image_file comment needs to be > amended since it says that it returns an encoded string. I've made a change to the comment. > Furthermore, if we use stringWithLispString: for file names, no > special file name encoding step should be needed on our side, since > the NS libs will perform any needed normalisation (at least if I've > understood it right). Yes, I believe that's right, so I've made that change too and pushed to master. As far as I can see this fixes the problem so I'll close the bug report. If it's still broken in some way, please reply to this email and we'll reopen the bug. Thanks everyone! -- Alan Third ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-09 15:19 ` Alan Third @ 2021-06-11 22:09 ` Rudolf Adamkovič via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 0 replies; 29+ messages in thread From: Rudolf Adamkovič via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2021-06-11 22:09 UTC (permalink / raw) To: Alan Third; +Cc: 48902, Mattias Engdegård, Lars Ingebrigtsen, naofumi I have pulled, recompiled, updated my EMMS music library, and now, every album has artwork showing, including Japanese and other non-English albums. Thank you Alan, Eli, Lars, Mattias, and Naofumi. This was my first GNU bug report, and, as a fresh GNU/Emacs/DRM-free convert, I found the experience rather impressive! R+ Alan Third <alan@idiocy.org> writes: > On Wed, Jun 09, 2021 at 01:40:18PM +0200, Mattias Engdegård > wrote: >> 8 juni 2021 kl. 22.33 skrev Alan Third <alan@idiocy.org>: >> >> > Patch attached. I've tested it here but I only have a couple >> > of images >> > to try it with. >> >> Looks fine, but the image_find_image_file comment needs to be >> amended since it says that it returns an encoded string. > > I've made a change to the comment. > >> Furthermore, if we use stringWithLispString: for file names, no >> special file name encoding step should be needed on our side, >> since >> the NS libs will perform any needed normalisation (at least if >> I've >> understood it right). > > Yes, I believe that's right, so I've made that change too and > pushed > to master. > > As far as I can see this fixes the problem so I'll close the bug > report. If it's still broken in some way, please reply to this > email > and we'll reopen the bug. > > Thanks everyone! ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-08 20:33 ` Alan Third 2021-06-09 11:40 ` Mattias Engdegård @ 2021-06-09 11:56 ` Eli Zaretskii 1 sibling, 0 replies; 29+ messages in thread From: Eli Zaretskii @ 2021-06-09 11:56 UTC (permalink / raw) To: Alan Third; +Cc: 48902, mattiase, larsi, salutis, naofumi > Date: Tue, 8 Jun 2021 21:33:01 +0100 > From: Alan Third <alan@idiocy.org> > Cc: 48902@debbugs.gnu.org, Lars Ingebrigtsen <larsi@gnus.org>, > Rudolf Adamkovič <salutis@me.com>, naofumi@yasufuku.dev > > It's not just allocInitFromFile, I'm looking at the other callers of > image_find_image_file and they all call ENCODE_FILE after it too. Encoding an already encoded file name is a no-op. But I agree it's unclean and we had better fixed that. > Patch attached. I've tested it here but I only have a couple of images > to try it with. Thanks, it LGTM. But please also adjust the comment of image_find_image_fd. ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-08 12:14 ` Lars Ingebrigtsen 2021-06-08 17:45 ` Mattias Engdegård @ 2021-06-08 18:17 ` Mattias Engdegård 1 sibling, 0 replies; 29+ messages in thread From: Mattias Engdegård @ 2021-06-08 18:17 UTC (permalink / raw) To: Alan Third; +Cc: 48902, Lars Ingebrigtsen, Rudolf Adamkovič, naofumi [Replying to a couple of previous messages] > I guess we just need to make a note that stringWithLispString cannot > handle UTF-8 encoded filenames, unless someone has a smarter solution. This is not restricted to file names but yes, we should definitely clarify that it expects Unicode (or ASCII) strings as input, since raw bytes are interpreted as, well, raw bytes. > NSString can read in almost anything, and Mattias extended it to read > in multibyte (and ascii) lisp strings, so we don't need a UTF-16 input > specifically. It would probably be nice if NSString was also able to > recognise that a lisp string is UTF-8 and handle that itself, but I > don't think that's really possible, unless we make the assumption that > any unibyte string it's passed will already be ascii or UTF-8. > > I don't know if that's a reasonable assumption. No, I don't think it's reasonable either -- we should not put dwimmery into our string conversion logic just because we are too sloppy to document whether an argument or return value is encoded or not. stringWithLispString: appears to work as designed. ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-08 12:12 ` Alan Third 2021-06-08 12:14 ` Lars Ingebrigtsen @ 2021-06-08 12:37 ` Eli Zaretskii 2021-06-08 13:00 ` Alan Third 1 sibling, 1 reply; 29+ messages in thread From: Eli Zaretskii @ 2021-06-08 12:37 UTC (permalink / raw) To: Alan Third; +Cc: 48902, larsi, salutis, naofumi > Date: Tue, 8 Jun 2021 13:12:56 +0100 > From: Alan Third <alan@idiocy.org> > Cc: 48902@debbugs.gnu.org, Rudolf Adamkovič <salutis@me.com>, > naofumi@yasufuku.dev > > On Tue, Jun 08, 2021 at 01:57:01PM +0200, Lars Ingebrigtsen wrote: > > naofumi@yasufuku.dev <naofumi@yasufuku.dev> writes: > > > > > diff --git a/src/nsimage.m b/src/nsimage.m > > > index fa81a41a51..8c7a3d9a09 100644 > > > --- a/src/nsimage.m > > > +++ b/src/nsimage.m > > > @@ -262,7 +262,7 @@ + (instancetype)allocInitFromFile: (Lisp_Object)file > > > found = ENCODE_FILE (found); > > > > > > image = [[EmacsImage alloc] initByReferencingFile: > > > - [NSString stringWithLispString: found]]; > > > + [NSString stringWithUTF8String: SSDATA (found)]]; > > > > Hm... I'm not very familiar at all with the Objective C code here... > > but shouldn't "found" here be a Lisp string so that stringWithLispString > > would do the right thing? > > It's always possible that stringWithLispString isn't doing the right > thing. It's implemented at nsfns.m:3026. I know almost nothing about > UTF8/UTF16 so while it looks like it's doing the right thing to me, I > could be entirely wrong. It looks like stringWithLispString encodes into UTF-16? But file names on macOS should be encoded in UTF-8, and in fact allocInitFromFile already does TRT when it calls ENCODE_FILE, just before stringWithLispString is called. So I think the patch is correct. (UTF-16 encoding on macOS is for ENCODE_SYSTEM, right?) ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-08 12:37 ` Eli Zaretskii @ 2021-06-08 13:00 ` Alan Third 2021-06-08 14:02 ` Eli Zaretskii 0 siblings, 1 reply; 29+ messages in thread From: Alan Third @ 2021-06-08 13:00 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 48902, larsi, salutis, naofumi On Tue, Jun 08, 2021 at 03:37:51PM +0300, Eli Zaretskii wrote: > > Date: Tue, 8 Jun 2021 13:12:56 +0100 > > From: Alan Third <alan@idiocy.org> > > Cc: 48902@debbugs.gnu.org, Rudolf Adamkovič <salutis@me.com>, > > naofumi@yasufuku.dev > > > > On Tue, Jun 08, 2021 at 01:57:01PM +0200, Lars Ingebrigtsen wrote: > > > naofumi@yasufuku.dev <naofumi@yasufuku.dev> writes: > > > > > > > diff --git a/src/nsimage.m b/src/nsimage.m > > > > index fa81a41a51..8c7a3d9a09 100644 > > > > --- a/src/nsimage.m > > > > +++ b/src/nsimage.m > > > > @@ -262,7 +262,7 @@ + (instancetype)allocInitFromFile: (Lisp_Object)file > > > > found = ENCODE_FILE (found); > > > > > > > > image = [[EmacsImage alloc] initByReferencingFile: > > > > - [NSString stringWithLispString: found]]; > > > > + [NSString stringWithUTF8String: SSDATA (found)]]; > > > > > > Hm... I'm not very familiar at all with the Objective C code here... > > > but shouldn't "found" here be a Lisp string so that stringWithLispString > > > would do the right thing? > > > > It's always possible that stringWithLispString isn't doing the right > > thing. It's implemented at nsfns.m:3026. I know almost nothing about > > UTF8/UTF16 so while it looks like it's doing the right thing to me, I > > could be entirely wrong. > > It looks like stringWithLispString encodes into UTF-16? But file > names on macOS should be encoded in UTF-8, and in fact > allocInitFromFile already does TRT when it calls ENCODE_FILE, just > before stringWithLispString is called. So I think the patch is > correct. > > (UTF-16 encoding on macOS is for ENCODE_SYSTEM, right?) I think you're right. But confusingly initByReferencingFile takes an NSString which is a UTF-16 format string, so if I remove all the calls to ENCODE_FILE, stringWithLispString works fine. I guess we just need to make a note that stringWithLispString cannot handle UTF-8 encoded filenames, unless someone has a smarter solution. -- Alan Third ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-08 13:00 ` Alan Third @ 2021-06-08 14:02 ` Eli Zaretskii 2021-06-08 16:19 ` Alan Third 0 siblings, 1 reply; 29+ messages in thread From: Eli Zaretskii @ 2021-06-08 14:02 UTC (permalink / raw) To: Alan Third; +Cc: 48902, larsi, salutis, naofumi > Date: Tue, 8 Jun 2021 14:00:17 +0100 > From: Alan Third <alan@idiocy.org> > Cc: larsi@gnus.org, naofumi@yasufuku.dev, 48902@debbugs.gnu.org, > salutis@me.com > > > It looks like stringWithLispString encodes into UTF-16? But file > > names on macOS should be encoded in UTF-8, and in fact > > allocInitFromFile already does TRT when it calls ENCODE_FILE, just > > before stringWithLispString is called. So I think the patch is > > correct. > > > > (UTF-16 encoding on macOS is for ENCODE_SYSTEM, right?) > > I think you're right. But confusingly initByReferencingFile takes an > NSString which is a UTF-16 format string, so if I remove all the calls > to ENCODE_FILE, stringWithLispString works fine. > > I guess we just need to make a note that stringWithLispString cannot > handle UTF-8 encoded filenames, unless someone has a smarter solution. If you do need a UTF-16 encoded string, then instead of ENCODE_FILE you can call code_convert_string_norecord with Qutf_16. There's no need to invent or use a private UTF-16 encoder there, and you also get rid of an unnecessary extra UTF-8 encoding as a bonus. ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-08 14:02 ` Eli Zaretskii @ 2021-06-08 16:19 ` Alan Third 2021-06-08 18:09 ` Eli Zaretskii 0 siblings, 1 reply; 29+ messages in thread From: Alan Third @ 2021-06-08 16:19 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 48902, larsi, salutis, naofumi On Tue, Jun 08, 2021 at 05:02:25PM +0300, Eli Zaretskii wrote: > > Date: Tue, 8 Jun 2021 14:00:17 +0100 > > From: Alan Third <alan@idiocy.org> > > Cc: larsi@gnus.org, naofumi@yasufuku.dev, 48902@debbugs.gnu.org, > > salutis@me.com > > > > > It looks like stringWithLispString encodes into UTF-16? But file > > > names on macOS should be encoded in UTF-8, and in fact > > > allocInitFromFile already does TRT when it calls ENCODE_FILE, just > > > before stringWithLispString is called. So I think the patch is > > > correct. > > > > > > (UTF-16 encoding on macOS is for ENCODE_SYSTEM, right?) > > > > I think you're right. But confusingly initByReferencingFile takes an > > NSString which is a UTF-16 format string, so if I remove all the calls > > to ENCODE_FILE, stringWithLispString works fine. > > > > I guess we just need to make a note that stringWithLispString cannot > > handle UTF-8 encoded filenames, unless someone has a smarter solution. > > If you do need a UTF-16 encoded string, then instead of ENCODE_FILE > you can call code_convert_string_norecord with Qutf_16. There's no > need to invent or use a private UTF-16 encoder there, and you also get > rid of an unnecessary extra UTF-8 encoding as a bonus. In this case the call to ENCODE_FILE in allocInitFromFile is actually redundant because image_find_image_fd already calls ENCODE_FILE on the filename before passing it back. So we get a UTF-8 string no matter what. NSString can read in almost anything, and Mattias extended it to read in multibyte (and ascii) lisp strings, so we don't need a UTF-16 input specifically. It would probably be nice if NSString was also able to recognise that a lisp string is UTF-8 and handle that itself, but I don't think that's really possible, unless we make the assumption that any unibyte string it's passed will already be ascii or UTF-8. I don't know if that's a reasonable assumption. -- Alan Third ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-08 16:19 ` Alan Third @ 2021-06-08 18:09 ` Eli Zaretskii 2021-06-08 19:24 ` Alan Third 0 siblings, 1 reply; 29+ messages in thread From: Eli Zaretskii @ 2021-06-08 18:09 UTC (permalink / raw) To: Alan Third; +Cc: 48902, larsi, salutis, naofumi > Date: Tue, 8 Jun 2021 17:19:44 +0100 > From: Alan Third <alan@idiocy.org> > Cc: larsi@gnus.org, naofumi@yasufuku.dev, 48902@debbugs.gnu.org, > salutis@me.com > > In this case the call to ENCODE_FILE in allocInitFromFile is actually > redundant because image_find_image_fd already calls ENCODE_FILE on the > filename before passing it back. So we get a UTF-8 string no matter > what. Then why was the code re-encoding t in UTF-16? A bug? > NSString can read in almost anything, and Mattias extended it to read > in multibyte (and ascii) lisp strings, so we don't need a UTF-16 input > specifically. It would probably be nice if NSString was also able to > recognise that a lisp string is UTF-8 and handle that itself, but I > don't think that's really possible, unless we make the assumption that > any unibyte string it's passed will already be ascii or UTF-8. > > I don't know if that's a reasonable assumption. Any file name passed through ENCODE_FILE should be in UTF-8 on macOS, as I understand that's how the macOS filesystems work. Am I mistaken? Can the value of default-file-name-coding-system on macOS be anything other than UTF-8? ^ permalink raw reply [flat|nested] 29+ messages in thread
* bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems 2021-06-08 18:09 ` Eli Zaretskii @ 2021-06-08 19:24 ` Alan Third 0 siblings, 0 replies; 29+ messages in thread From: Alan Third @ 2021-06-08 19:24 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 48902, larsi, salutis, naofumi On Tue, Jun 08, 2021 at 09:09:51PM +0300, Eli Zaretskii wrote: > > Date: Tue, 8 Jun 2021 17:19:44 +0100 > > From: Alan Third <alan@idiocy.org> > > Cc: larsi@gnus.org, naofumi@yasufuku.dev, 48902@debbugs.gnu.org, > > salutis@me.com > > > > In this case the call to ENCODE_FILE in allocInitFromFile is actually > > redundant because image_find_image_fd already calls ENCODE_FILE on the > > filename before passing it back. So we get a UTF-8 string no matter > > what. > > Then why was the code re-encoding t in UTF-16? A bug? No, sorry, I'm not being clear. The internal format of NSString is UTF-16. We can load practically anything into it, as long as we know ahead of time what the encoding is. > > NSString can read in almost anything, and Mattias extended it to read > > in multibyte (and ascii) lisp strings, so we don't need a UTF-16 input > > specifically. It would probably be nice if NSString was also able to > > recognise that a lisp string is UTF-8 and handle that itself, but I > > don't think that's really possible, unless we make the assumption that > > any unibyte string it's passed will already be ascii or UTF-8. > > > > I don't know if that's a reasonable assumption. > > Any file name passed through ENCODE_FILE should be in UTF-8 on macOS, > as I understand that's how the macOS filesystems work. Am I mistaken? > Can the value of default-file-name-coding-system on macOS be anything > other than UTF-8? Not as far as I'm aware. But NSString is used all over the place in the NS port code base (and all through the toolkit). Any time we pass a string to or from the toolkit it has to be converted to or from NSString. I think most of the actual file access code in Emacs is low level C code which won't go near NSString, though, so it's not worth changing C code to make ObjC code cleaner. I guess I'm just being lazy and would like our extensions to NSString to just DTRT, but it seems that's impractical. :) -- Alan Third ^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2021-06-11 22:09 UTC | newest] Thread overview: 29+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2021-06-07 13:32 bug#48902: 28.0.50; Directory names containing apostrophes and backticks cause problems Rudolf Adamkovič via Bug reports for GNU Emacs, the Swiss army knife of text editors 2021-06-07 14:08 ` Lars Ingebrigtsen 2021-06-07 14:15 ` Eli Zaretskii 2021-06-07 14:24 ` Lars Ingebrigtsen 2021-06-07 14:36 ` Eli Zaretskii 2021-06-07 14:13 ` Eli Zaretskii 2021-06-08 22:21 ` Rudolf Adamkovič via Bug reports for GNU Emacs, the Swiss army knife of text editors 2021-06-08 10:39 ` naofumi 2021-06-08 11:57 ` Lars Ingebrigtsen 2021-06-08 12:12 ` Alan Third 2021-06-08 12:14 ` Lars Ingebrigtsen 2021-06-08 17:45 ` Mattias Engdegård 2021-06-08 18:18 ` Eli Zaretskii 2021-06-08 19:13 ` naofumi 2021-06-08 20:08 ` Mattias Engdegård 2021-06-08 19:10 ` Alan Third 2021-06-08 19:52 ` Mattias Engdegård 2021-06-08 20:33 ` Alan Third 2021-06-09 11:40 ` Mattias Engdegård 2021-06-09 15:19 ` Alan Third 2021-06-11 22:09 ` Rudolf Adamkovič via Bug reports for GNU Emacs, the Swiss army knife of text editors 2021-06-09 11:56 ` Eli Zaretskii 2021-06-08 18:17 ` Mattias Engdegård 2021-06-08 12:37 ` Eli Zaretskii 2021-06-08 13:00 ` Alan Third 2021-06-08 14:02 ` Eli Zaretskii 2021-06-08 16:19 ` Alan Third 2021-06-08 18:09 ` Eli Zaretskii 2021-06-08 19:24 ` Alan Third
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).