unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer
@ 2013-09-20 16:47 Andreas Politz
  2013-09-20 17:46 ` Eli Zaretskii
  0 siblings, 1 reply; 16+ messages in thread
From: Andreas Politz @ 2013-09-20 16:47 UTC (permalink / raw)
  To: 15426


There seems to be something going wrong with the
encoding/decoding of multibyte filenames from a unibyte buffer in
recursive calls to directory-files.

$ emacs -Q 

(setq d "/tmp/Ä")
"/tmp/Ä"

(make-directory d t)
nil

(toggle-enable-multibyte-characters)
t

(car (directory-files d t))
"/tmp/Ä/."

(car (directory-files (car (directory-files d t)) t))
"/tmp/\301\203\300\204/."





In GNU Emacs 24.3.50.3 (x86_64-unknown-linux-gnu, GTK+ Version 2.20.1)
 of 2013-09-20 on luca
Bzr revision: 114409 xfq.free@gmail.com-20130920102220-y3z14fcjcduk605j
Windowing system distributor `The X.Org Foundation', version 11.0.10707000
System Description:	Debian GNU/Linux 6.0.7 (squeeze)

Important settings:
  value of $LC_COLLATE: C
  value of $LC_MESSAGES: C
  value of $LANG: de_DE.UTF-8
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t

Major mode: DocView

Minor modes in effect:
  diff-auto-refine-mode: t
  doc-view-fixed-scroll-mode: t
  desktop-save-mode: t
  workgroups-mode: t
  ispell-track-input-method: t
  pdf-annot-minor-mode: t
  pdf-history-minor-mode: t
  pdf-outline-minor-mode: t
  pdf-links-minor-mode: t
  pdf-isearch-minor-mode: t
  pdf-misc-tool-bar-minor-mode: t
  pdf-misc-menu-bar-minor-mode: t
  pdf-misc-size-indication-minor-mode: t
  pdf-misc-minor-mode: t
  pdf-info-auto-revert-minor-mode: t
  recentf-mode: t
  show-paren-mode: t
  yas-global-mode: t
  yas-minor-mode: t
  window-numbering-mode: t
  shell-dirtrack-mode: t
  scroll-other-window-mode: t
  savehist-mode: t
  TeX-PDF-mode: t
  ekey-mode: t
  winner-mode: t
  tooltip-mode: t
  mouse-wheel-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  buffer-read-only: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
C-; s g C-x C-c M-s g g <return> M-g n C-x b <return> 
C-u C-M-x C-x o C-x b C-s q u <return> <return> C-s 
w e SPC <return> f SPC SPC f SPC SPC SPC SPC f SPC 
SPC f SPC SPC SPC f SPC SPC f g C-x o C-M-x C-p C-p 
C-p C-p C-p C-p C-a C-SPC C-n C-n C-n C-w TAB C-M-f 
C-n DEL C-M-\ C-n C-n C-n C-n C-n C-n C-n C-n M-m C-o 
C-y C-p C-p C-p C-p C-k C-k TAB C-n C-n C-n C-2 C-k 
TAB C-e C-M-\ C-p M-b C-M-b C-M-b C-M-b f i l e n a 
m e SPC M-q C-n C-e C-M-\ C-c l C-x o C-x k <return> 
<return> C-x k <return> <return> C-x o C-p M-f C-M-f 
C-M-f SPC <backspace> <backspace> SPC f r o m SPC u 
n i b y t e SPC b u f f e r . M-q C-x C-s C-x o C-x 
o C-p C-p M-m C-a M-f M-f M-b M-b C-M-SPC C-M-SPC C-M-SPC 
C-M-SPC M-w C-x v v C-y DEL SPC w h e n SPC d e l e 
t i n g SPC C-x o C-x o C-a C-y M-y C-x o C-n C-n C-n 
C-n C-n C-n C-n C-n C-n C-n C-SPC C-n C-n C-n M-w C-x 
o C-p C-o C-y C-SPC C-n C-n C-n C-w C-p C-p C-p C-p 
C-n C-SPC M-f M-b C-b C-f M-w C-a M-% C-y <return> 
<return> ! C-p C-p M-q M-f M-f M-f M-f SPC w h e n 
SPC d e l e t i n g SPC c a c h e SPC f i l e s M-q 
C-c C-c C-x o M-x r e p o r t - e m <tab> b u <tab> 
<return>

Recent messages:
Press C-c C-c when you are done editing.
Enter a change comment.  Type C-c C-c when done
Finding changes in /home/politza/.emacs.d/working/pdf-tools/lisp/pdf-util.el...
View mode: type C-h for help, h for commands, q to quit.
Mark set [5 times]
End of buffer
Mark set [2 times]
Replaced 3 occurrences
Checking in /home/politza/.emacs.d/working/pdf-tools/lisp/pdf-util.el...done
Auto-saving...done

Load-path shadows:
/home/politza/.emacs.d/elpa/yasnippet-0.8.0/dropdown-list hides /home/politza/.emacs.d/plugins/yasnippet-0.6.1c/dropdown-list
/home/politza/.emacs.d/elpa/yasnippet-0.8.0/yasnippet hides /home/politza/.emacs.d/plugins/yasnippet-0.6.1c/yasnippet
/home/politza/.emacs.d/plugins/tblc hides /home/politza/.emacs.d/plugins/tblc/tblc
/home/politza/.emacs.d/plugins/haskell-mode/haskell-cabal hides /home/politza/.emacs.d/plugins/haskell/haskell-cabal
/home/politza/.emacs.d/plugins/haskell-mode/haskell-decl-scan hides /home/politza/.emacs.d/plugins/haskell/haskell-decl-scan
/home/politza/.emacs.d/plugins/haskell-mode/haskell-doc hides /home/politza/.emacs.d/plugins/haskell/haskell-doc
/home/politza/.emacs.d/plugins/haskell-mode/ghc-core hides /home/politza/.emacs.d/plugins/haskell/ghc-core
/home/politza/.emacs.d/plugins/haskell-mode/haskell-mode hides /home/politza/.emacs.d/plugins/haskell/haskell-mode
/home/politza/.emacs.d/plugins/haskell-mode/haskell-c hides /home/politza/.emacs.d/plugins/haskell/haskell-c
/home/politza/.emacs.d/plugins/haskell-mode/haskell-indentation hides /home/politza/.emacs.d/plugins/haskell/haskell-indentation
/home/politza/.emacs.d/plugins/haskell-mode/haskell-site-file hides /home/politza/.emacs.d/plugins/haskell/haskell-site-file
/home/politza/.emacs.d/plugins/haskell-mode/haskell-ghci hides /home/politza/.emacs.d/plugins/haskell/haskell-ghci
/home/politza/.emacs.d/plugins/haskell-mode/inf-haskell hides /home/politza/.emacs.d/plugins/haskell/inf-haskell
/home/politza/.emacs.d/plugins/haskell-mode/haskell-indent hides /home/politza/.emacs.d/plugins/haskell/haskell-indent
/home/politza/.emacs.d/plugins/haskell-mode/haskell-hugs hides /home/politza/.emacs.d/plugins/haskell/haskell-hugs
/home/politza/.emacs.d/plugins/haskell-mode/haskell-font-lock hides /home/politza/.emacs.d/plugins/haskell/haskell-font-lock
/home/politza/.emacs.d/plugins/haskell-mode/haskell-simple-indent hides /home/politza/.emacs.d/plugins/haskell/haskell-simple-indent
/home/politza/.emacs.d/plugins/jedi/scratch hides /home/politza/.emacs.d/plugins/ewm/scratch
/home/politza/.emacs.d/elpa/company-0.6.10/.dir-locals hides /home/politza/.emacs.d/plugins/el-get/.dir-locals
/home/politza/.emacs.d/plugins/saveplace hides /home/politza/src/emacs/trunk/lisp/saveplace
/home/politza/.emacs.d/plugins/imenu hides /home/politza/src/emacs/trunk/lisp/imenu
/home/politza/.emacs.d/plugins/term hides /home/politza/src/emacs/trunk/lisp/term
/home/politza/.emacs.d/plugins/python/python/python hides /home/politza/src/emacs/trunk/lisp/progmodes/python
/home/politza/.emacs.d/elpa/company-0.6.10/.dir-locals hides /home/politza/src/emacs/trunk/lisp/gnus/.dir-locals
/home/politza/.emacs.d/plugins/python/python/sym-comp hides /home/politza/src/emacs/trunk/lisp/obsolete/sym-comp
/home/politza/.emacs.d/plugins/matlab/matlab hides /usr/share/emacs-snapshot/site-lisp/emacs-goodies-el/matlab
/home/politza/.emacs.d/plugins/boxquote hides /usr/share/emacs-snapshot/site-lisp/emacs-goodies-el/boxquote
/home/politza/.emacs.d/plugins/bm hides /usr/share/emacs-snapshot/site-lisp/emacs-goodies-el/bm
/home/politza/.emacs.d/plugins/haskell-mode/haskell-decl-scan hides /usr/share/emacs-snapshot/site-lisp/haskell-mode/haskell-decl-scan
/home/politza/.emacs.d/plugins/haskell-mode/haskell-c hides /usr/share/emacs-snapshot/site-lisp/haskell-mode/haskell-c
/home/politza/.emacs.d/plugins/haskell-mode/haskell-ghci hides /usr/share/emacs-snapshot/site-lisp/haskell-mode/haskell-ghci
/home/politza/.emacs.d/plugins/haskell-mode/haskell-doc hides /usr/share/emacs-snapshot/site-lisp/haskell-mode/haskell-doc
/home/politza/.emacs.d/plugins/haskell-mode/haskell-indent hides /usr/share/emacs-snapshot/site-lisp/haskell-mode/haskell-indent
/home/politza/.emacs.d/plugins/haskell-mode/haskell-mode hides /usr/share/emacs-snapshot/site-lisp/haskell-mode/haskell-mode
/home/politza/.emacs.d/plugins/haskell-mode/haskell-hugs hides /usr/share/emacs-snapshot/site-lisp/haskell-mode/haskell-hugs
/home/politza/.emacs.d/plugins/haskell-mode/haskell-site-file hides /usr/share/emacs-snapshot/site-lisp/haskell-mode/haskell-site-file
/home/politza/.emacs.d/plugins/haskell-mode/haskell-cabal hides /usr/share/emacs-snapshot/site-lisp/haskell-mode/haskell-cabal
/home/politza/.emacs.d/plugins/haskell-mode/inf-haskell hides /usr/share/emacs-snapshot/site-lisp/haskell-mode/inf-haskell
/home/politza/.emacs.d/plugins/haskell-mode/haskell-font-lock hides /usr/share/emacs-snapshot/site-lisp/haskell-mode/haskell-font-lock
/home/politza/.emacs.d/plugins/haskell-mode/haskell-simple-indent hides /usr/share/emacs-snapshot/site-lisp/haskell-mode/haskell-simple-indent
/home/politza/.emacs.d/plugins/haskell-mode/haskell-indentation hides /usr/share/emacs-snapshot/site-lisp/haskell-mode/haskell-indentation

Features:
(shadow sort gnus-cite bbdb-message mail-extr nnir emacsbug sendmail
diff-mode log-edit pcvs-util add-log nndraft nnmh utf-7 network-stream
starttls nnfolder bbdb-gnus nnnil gnus-agent gnus-srvr gnus-score
score-mode nnvirtual gnus-msg gnus-art mm-uu mml2015 epg-config mm-view
mml-smime smime dig mailcap nntp gnus-cache debug ibuf-ext misearch
multi-isearch dired-aux vc-dir ewoc vc semantic/format ezimage
semantic/tag-ls semantic/ctxt semantic/dep semantic/find
semantic/wisent/python-wy python-21 python sym-comp org-wl org-w3m
org-vm org-rmail org-mhe org-mew org-irc org-jsinfo org-infojs org-html
org-exp ob-exp org-exp-blocks org-agenda org-info org-gnus org-docview
org-bibtex bibtex org-bbdb conf-mode tex-buf reftex-dcr reftex-auc
font-latex dired-eshell vc-git doc-view-fixed-scroll pdftk-outline
vc-bzr vc-dispatcher vc-svn cc-langs cc-mode cc-fonts cc-guess cc-menus
cc-cmds cc-styles cc-align cc-engine cc-vars cc-defs
emacs-customizations nogroup-customizations wp-customizations
view-customizations tex-customizations reftex-customizations
reftex-miscellaneous-configurations-customizations
reftex-label-support-customizations
reftex-referencing-labels-customizations
reftex-defining-label-environments-customizations AUCTeX-customizations
preview-customizations preview-latex-customizations
preview-appearance-customizations TeX-parse-customizations
TeX-file-customizations TeX-command-customizations
TeX-view-customizations LaTeX-customizations LaTeX-macro-customizations
LaTeX-math-customizations LaTeX-indentation-customizations
table-customizations table-hooks-customizations outlines-customizations
programming-customizations tools-customizations vc-customizations
log-edit-customizations semantic-customizations makefile-customizations
etags-customizations ediff-customizations diff-customizations
diff-mode-customizations languages-customizations elpy-customizations
matlab-customizations sh-customizations python-customizations rx
haskell-customizations c-customizations asm-customizations
multimedia-customizations image-customizations pcase help-customizations
ekey-customizations info-lookup-customizations info-customizations
customize-customizations custom-buffer-customizations
apropos-customizations files-customizations uniquify-customizations
uniquify sunrise-customizations recentf-customizations
find-file-customizations backup-customizations faces-customizations
highlight-symbol-customizations font-lock-customizations
hi-lock-customizations facemenu-customizations external-customizations
server-customizations processes-customizations shell-customizations
proced-customizations gud-customizations tooltip-customizations
grep-customizations compilation-customizations next-error-customizations
comint-customizations SQL-customizations man-customizations
environment-customizations xterm-customizations windows-customizations
winner-customizations minibuffer-customizations savehist-customizations
completion-spelling lib-string menu-customizations
keyboard-customizations chistory-customizations
initialization-customizations frames-customizations
ediff-window-customizations desktop-customizations desktop frameset
dired-customizations dired-x-customizations dired-x
dired-details-customizations editing-customizations
yasnippet-customizations paragraphs-customizations
matching-customizations paren-matching-customizations
paren-showing-customizations isearch-customizations
bookmark-customizations killing-customizations indent-customizations
fill-customizations emulations-customizations
editing-basics-customizations development-customizations
lisp-customizations re-builder-customizations
inferior-lisp-customizations ielm-customizations ert-customizations
edebug-customizations bytecomp-customizations advice-customizations
internal-customizations alloc-customizations extensions-customizations
eldoc-customizations cust-print-customizations data-customizations
save-place-customizations convenience-customizations
diminish-customizations diminish iedit-customizations
imenu-tree-customizations tags-tree-customizations
company-customizations workgroups-customizations workgroups bookmark pp
window-numbering-customizations pabbrev-customizations
kmacro-customizations imenu-customizations ibuffer-customizations
ibuf-macs hl-line-customizations hippie-expand-customizations
file-cache-customizations ffap-customizations completion-customizations
iswitchb-customizations browse-kill-ring-customizations
auto-revert-customizations auto-insert-customizations
Buffer-menu-customizations comm-customizations tramp-customizations
browse-url-customizations applications-customizations
mediawiki-customizations w3m-customizations package-customizations
mail-customizations bbdb-customizations bbdb-sendmail-customizations
bbdb-mua-customizations bbdb-mua bbdb-com crm bbdb
smtpmail-customizations sendmail-customizations gnus-customizations
nnmail-customizations nnmail-split-customizations
gnus-summary-customizations gnus-thread-customizations
gnus-summary-various-customizations gnus-summary-sort-customizations
gnus-summary-marks-customizations
gnus-summary-maneuvering-customizations
gnus-summary-format-customizations parse-time-rfc2822
gnus-summary-exit-customizations gnus-sum gnus-group gnus-undo
gnus-start gnus-spec gnus-win gnus-start-customizations
gnus-server-customizations gnus-message-customizations
message-customizations message-various-customizations
message-sending-customizations message-buffers-customizations
gnus-group-customizations gnus-group-visual-customizations
gnus-nnimap-format nnimap nnmail gnus-int mail-source message rfc822 mml
mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 rfc2047 rfc2045
ietf-drums mailabbrev gmm-utils mailheader parse-time tls utf7 netrc
nnoo gnus gnus-ems nnheader mail-utils gnus-group-various-customizations
gnus-group-select-customizations gnus-files-customizations
gnus-newsrc-customizations gnus-exit-customizations
gnus-article-customizations gnus-article-hiding-customizations
ispell-customizations eshell-customizations eshell-module-customizations
eshell-smart-customizations eshell-hist-customizations
eshell-mode-customizations edebug doc-view-customizations
pdf-tools-customizations pdf-annot-customizations
pdf-links-customizations pdf-isearch-customizations pdf-annot tablist
tablist-filter semantic/wisent/comp semantic/wisent
semantic/wisent/wisent semantic/util-modes semantic/util semantic
semantic/tag semantic/lex semantic/fw mode-local cedet pdf-occur
pdf-history pdf-outline pdf-links pdf-isearch pdf-misc imenu pdf-info tq
pdf-render pdf-tools pdf-util gnus-range warnings doc-view jka-compr
image-mode calendar-customizations org-customizations
org-structure-customizations org-plain-lists-customizations
org-edit-structure-customizations org-startup-customizations
org-link-customizations org-latex-customizations
org-appearance-customizations holidays-customizations
calculator-customizations calc-customizations server recentf tree-widget
.autoload paren yasnippet dropdown-list help-mode window-numbering w3m
browse-url timezone w3m-hist w3m-e23 w3m-ccl ccl w3m-fsf w3m-favicon
w3m-image w3m-proc w3m-util view tramp tramp-compat tramp-loaddefs
trampver shell track-last-window scroll-other-window saveplace savehist
reftex reftex-vars pabbrev org ob-tangle ob-ref ob-lob ob-table
org-footnote org-src ob-comint ob-keys org-pcomplete org-list org-faces
org-entities noutline outline org-version ob-emacs-lisp ob org-compat
org-macs ob-eval org-loaddefs format-spec find-func cal-menu calendar
cal-loaddefs lib-edit lib-window lib-isearch lib-buffer reveal iswitchb
lib-basic lib-lispext latex easy-mmode tex-style tex dbus xml tex-site
auto-loads info-look info ibuffer hippie-exp grep compile filecache
edit-minibuffer eldoc-eval pcomplete esh-var esh-io esh-cmd esh-opt
esh-ext esh-proc esh-arg esh-groups eshell esh-module esh-mode esh-util
ekey assoc dired-details+ dired dired-details cool-prefix-bindings
winner lib-kbd comint-history comint ansi-color ring browse-kill-ring
advice anticus edmacro kmacro derived cl-macs gv ffap thingatpt
url-parse auth-source eieio byte-opt bytecomp byte-compile cconv
eieio-core gnus-util mm-util mail-prsvr password-cache url-vars eldoc
help-fns cus-edit easymenu cus-start cus-load wid-edit cl cl-loaddefs
cl-lib bbdb-loaddefs cl-format-autoloads eldoc-eval-autoloads
yasnippet-autoloads package time-date tooltip ediff-hook vc-hooks
lisp-float-type mwheel x-win x-dnd tool-bar dnd fontset image regexp-opt
fringe tabulated-list newcomment lisp-mode prog-mode register page
menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock font-lock
syntax facemenu font-core frame cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer nadvice
loaddefs button faces cus-face macroexp files text-properties overlay
sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote make-network-process dbusbind
gfilenotify dynamic-setting system-font-setting font-render-setting
move-toolbar gtk x-toolkit x multi-tty emacs)






* bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer
  2013-09-20 16:47 bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer Andreas Politz
@ 2013-09-20 17:46 ` Eli Zaretskii
  2013-09-20 18:51   ` Andreas Politz
  2013-09-20 19:15   ` Stefan Monnier
  0 siblings, 2 replies; 16+ messages in thread
From: Eli Zaretskii @ 2013-09-20 17:46 UTC (permalink / raw)
  To: Andreas Politz; +Cc: 15426

> From: Andreas Politz <politza@hochschule-trier.de>
> Date: Fri, 20 Sep 2013 18:47:54 +0200
> 
> There seems to be something going wrong with the
> encoding/decoding of multibyte filenames from a unibyte buffer in
> recursive calls to directory-files.
> 
> $ emacs -Q 
> 
> (setq d "/tmp/Ä")
> "/tmp/Ä"
> 
> (make-directory d t)
> nil
> 
> (toggle-enable-multibyte-characters)
> t
> 
> (car (directory-files d t))
> "/tmp/Ä/."
> 
> (car (directory-files (car (directory-files d t)) t))
> "/tmp/\301\203\300\204/."

Don't do that: inserting multibyte strings into a unibyte buffer
changes the representation of the characters in the string, so you get
a unibyte string.  Unibyte buffers should only ever hold encoded text
or binary data.

Why did you need to do something like that, and in what real-life use
case?






* bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer
  2013-09-20 17:46 ` Eli Zaretskii
@ 2013-09-20 18:51   ` Andreas Politz
  2013-09-20 19:08     ` Eli Zaretskii
  2013-09-20 19:15   ` Stefan Monnier
  1 sibling, 1 reply; 16+ messages in thread
From: Andreas Politz @ 2013-09-20 18:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 15426

Eli Zaretskii <eliz@gnu.org> writes:

> Why did you need to do something like that, and in what real-life use
> case?

By default, buffers for PDF files are in unibyte mode.  I use
doc-view-mode and store some custom data in a directory below its
cache directory, which I want to remove when the buffer gets killed.
Removing this directory (e.g. .doc-view/FILENAME-83432/data/) is
done with delete-directory, but it fails if FILENAME contains a multibyte
character (see below).  So I am actually not inserting anything into a
buffer.

Debugger entered--Lisp error: (file-error "Removing old name" "no such file or directory" "/home/politza/.emacs.d/.doc-view-cache/\301\203\300\204.pdf-52cb45a94cb1cf895aea4dba33da58be/.pdf-util-cache/pdf-render-temp-file/b5366a2d2ac98dae978423083f8b09e5cddc705d.png")
  delete-file("/home/politza/.emacs.d/.doc-view-cache/\301\203\300\204.pdf-52cb45a94cb1cf895aea4dba33da58be/.pdf-util-cache/pdf-render-temp-file/b5366a2d2ac98dae978423083f8b09e5cddc705d.png" nil)
  #[257 "\301.!@\302=\203.\0\303.\300\304#\207\305.\304\"\207" [t file-attributes t delete-directory nil delete-file] 5 "\n\n(fn FILE)"]("/home/politza/.emacs.d/.doc-view-cache/\301\203\300\204.pdf-52cb45a94cb1cf895aea4dba33da58be/.pdf-util-cache/pdf-render-temp-file/b5366a2d2ac98dae978423083f8b09e5cddc705d.png")
  mapc(#[257 "\301.!@\302=\203.\0\303.\300\304#\207\305.\304\"\207" [t file-attributes t delete-directory nil delete-file] 5 "\n\n(fn FILE)"] ("/home/politza/.emacs.d/.doc-view-cache/\301\203\300\204.pdf-52cb45a94cb1cf895aea4dba33da58be/.pdf-util-cache/pdf-render-temp-file/b5366a2d2ac98dae978423083f8b09e5cddc705d.png"))
  delete-directory("/home/politza/.emacs.d/.doc-view-cache/\303\204.pdf-52cb45a94cb1cf895aea4dba33da58be/.pdf-util-cache/pdf-render-temp-file" t nil)
  #[257 "\301.!@\302=\203.\0\303.\300\304#\207\305.\304\"\207" [t file-attributes t delete-directory nil delete-file] 5 "\n\n(fn FILE)"]("/home/politza/.emacs.d/.doc-view-cache/\303\204.pdf-52cb45a94cb1cf895aea4dba33da58be/.pdf-util-cache/pdf-render-temp-file")
  mapc(#[257 "\301.!@\302=\203.\0\303.\300\304#\207\305.\304\"\207" [t file-attributes t delete-directory nil delete-file] 5 "\n\n(fn FILE)"] ("/home/politza/.emacs.d/.doc-view-cache/\303\204.pdf-52cb45a94cb1cf895aea4dba33da58be/.pdf-util-cache/pdf-render-temp-file"))
  delete-directory("/home/politza/.emacs.d/.doc-view-cache/Ä.pdf-52cb45a94cb1cf895aea4dba33da58be/.pdf-util-cache/" t)
  (progn (delete-directory dir t))
  (if (and dir (file-exists-p dir)) (progn (delete-directory dir t)))
  (let ((dir (pdf-util-cache--get-root-dir))) (if (and dir (file-exists-p dir)) (progn (delete-directory dir t))))
  pdf-util-cache-clear-all()
  kill-buffer("Ä.pdf")
  call-interactively(kill-buffer nil nil)
  command-execute(kill-buffer)

-ap






* bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer
  2013-09-20 18:51   ` Andreas Politz
@ 2013-09-20 19:08     ` Eli Zaretskii
  0 siblings, 0 replies; 16+ messages in thread
From: Eli Zaretskii @ 2013-09-20 19:08 UTC (permalink / raw)
  To: Andreas Politz; +Cc: 15426

> From: Andreas Politz <politza@hochschule-trier.de>
> Cc: 15426@debbugs.gnu.org
> Date: Fri, 20 Sep 2013 20:51:09 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Why did you need to do something like that, and in what real-life use
> > case?
> 
> By default, buffers for PDF files are in unibyte mode.  I use
> doc-view-mode and store some custom data in a directory below its
> cache directory, which I want to remove when the buffer gets killed.
> Removing this directory (e.g. .doc-view/FILENAME-83432/data/) is
> done with delete-directory, but it fails if FILENAME contains a multibyte
> character (see below).  So I am actually not inserting anything into a
> buffer.

Well, I think somehow your code converts a multibyte string into a
unibyte one.  If you cannot figure out how that happens (e.g., by
stepping with Edebug through the code), perhaps show more of your code
here.  What you sent is a backtrace full of byte-compiled code, which is
very hard to interpret.






* bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer
  2013-09-20 17:46 ` Eli Zaretskii
  2013-09-20 18:51   ` Andreas Politz
@ 2013-09-20 19:15   ` Stefan Monnier
  2013-09-20 19:17     ` Eli Zaretskii
  1 sibling, 1 reply; 16+ messages in thread
From: Stefan Monnier @ 2013-09-20 19:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 15426, Andreas Politz

> Don't do that: inserting multibyte strings into a unibyte buffer
> changes the representation of the characters in the string, so you get
> a unibyte string.  Unibyte buffers should only ever hold encoded text
> or binary data.

AFAICT his recipe does not involve inserting any string anywhere.


        Stefan






* bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer
  2013-09-20 19:15   ` Stefan Monnier
@ 2013-09-20 19:17     ` Eli Zaretskii
  2013-09-20 20:56       ` Andreas Politz
  0 siblings, 1 reply; 16+ messages in thread
From: Eli Zaretskii @ 2013-09-20 19:17 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 15426, politza

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Andreas Politz <politza@hochschule-trier.de>,  15426@debbugs.gnu.org
> Date: Fri, 20 Sep 2013 15:15:44 -0400
> 
> > Don't do that: inserting multibyte strings into a unibyte buffer
> > changes the representation of the characters in the string, so you get
> > a unibyte string.  Unibyte buffers should only ever hold encoded text
> > or binary data.
> 
> AFAICT his recipe does not involve inserting any string anywhere.

Perhaps the recipe should be described in more detail, then.






* bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer
  2013-09-20 19:17     ` Eli Zaretskii
@ 2013-09-20 20:56       ` Andreas Politz
  2013-09-21  6:48         ` Eli Zaretskii
  0 siblings, 1 reply; 16+ messages in thread
From: Andreas Politz @ 2013-09-20 20:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 15426

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: Andreas Politz <politza@hochschule-trier.de>,  15426@debbugs.gnu.org
>> Date: Fri, 20 Sep 2013 15:15:44 -0400
>> 
>> > Don't do that: inserting multibyte strings into a unibyte buffer
>> > changes the representation of the characters in the string, so you get
>> > a unibyte string.  Unibyte buffers should only ever hold encoded text
>> > or binary data.
>> 
>> AFAICT his recipe does not involve inserting any string anywhere.
>
> Perhaps the recipe should be described in more detail, then.


Here is another recipe, maybe more to the point:

-------------------->8-------------------------------------
;; -*- coding: binary -*-

(let ((d "/tmp/\303\204")) ;; utf-8 for german umlaut "A 
  (when (file-exists-p d)
    (delete-directory d t))
  (make-directory d)
  (append
   (list (car (directory-files d t)) 
         (file-exists-p (car (directory-files d t))))
   ;; switch to a multibyte buffer
   (with-temp-buffer
     (list (car (directory-files d t))
	   (file-exists-p (car (directory-files d t)))))))
--------------------8<-------------------------------------

If I save this somewhere (/tmp/foo.el), do

$ LC_ALL=C emacs -Q /tmp/foo.el

and evaluate it with C-x C-e, the minibuffer displays

=> ("/tmp/\301\203\300\204/." nil "/tmp/\303\204/." t)
 
.

I hope that clarifies it.

-ap








* bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer
  2013-09-20 20:56       ` Andreas Politz
@ 2013-09-21  6:48         ` Eli Zaretskii
  2013-09-21  9:35           ` Andreas Politz
  2013-09-21 16:06           ` Stefan Monnier
  0 siblings, 2 replies; 16+ messages in thread
From: Eli Zaretskii @ 2013-09-21  6:48 UTC (permalink / raw)
  To: Andreas Politz; +Cc: 15426

> From: Andreas Politz <politza@hochschule-trier.de>
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>,  15426@debbugs.gnu.org
> Date: Fri, 20 Sep 2013 22:56:22 +0200
> 
> (let ((d "/tmp/\303\204")) ;; utf-8 for german umlaut "A 

This makes d a unibyte string:

  (setq d "/tmp/\303\204")
  "/tmp/\303\204"

  (multibyte-string-p d)
    => nil

Why would one do such a thing in the first place?  Are any of the file
names involved in your real-life use case unibyte strings that include
bytes above 127?  If there are, I suggest finding out how they
came into existence -- that might be the source of your trouble.

Handling of unibyte strings in Emacs is optimized for certain use
cases, certainly not those that manipulate file names on the Lisp
level.  I suggest staying away from unibyte strings as non-ASCII file
names, unless you really must use them (which normally is only necessary
if you need to encode and decode file names by hand, like when you get
them from some program, and the encoding of process output is different
from the encoding of file names on your system).  Otherwise, Lisp code
should only ever manipulate file names with non-ASCII characters that
are multibyte strings.
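
As a minimal sketch of the unibyte/multibyte distinction described
above (an illustration, not code from this thread): the helper name
`my-describe-file-name' is hypothetical, and UTF-8 is assumed as the
file-name encoding.  It decodes a unibyte file name by hand and
compares the two representations.

```emacs-lisp
;; Illustration only: `my-describe-file-name' is a hypothetical helper,
;; and UTF-8 is an assumed file-name encoding.
(defun my-describe-file-name (bytes)
  "Decode unibyte BYTES as UTF-8 and compare the two representations.
Return a list (RAW-MULTIBYTE-P DECODED-MULTIBYTE-P RAW-LENGTH
DECODED-LENGTH ROUND-TRIP-P)."
  (let ((decoded (decode-coding-string bytes 'utf-8)))
    (list (multibyte-string-p bytes)     ; representation of the input
          (multibyte-string-p decoded)   ; representation after decoding
          (length bytes)                 ; counts bytes for a unibyte string
          (length decoded)               ; counts characters
          ;; Encoding the decoded name again should give back the bytes.
          (string= (encode-coding-string decoded 'utf-8) bytes))))

(my-describe-file-name "/tmp/\303\204")
;; => (nil t 7 6 t)
```

The literal with octal escapes is read as a unibyte string of 7 bytes;
decoding it gives a 6-character multibyte string, which is the form
Lisp code would normally pass around.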

>   (when (file-exists-p d)
>     (delete-directory d t))
>   (make-directory d)
>   (append
>    (list (car (directory-files d t)) 
>          (file-exists-p (car (directory-files d t))))
>    ;; switch to a multibyte buffer
>    (with-temp-buffer
>      (list (car (directory-files d t))
> 	   (file-exists-p (car (directory-files d t)))))))
> --------------------8<-------------------------------------
> 
> If I save this somewhere (/tmp/foo.el), do
> 
> $ LC_ALL=C emacs -Q /tmp/foo.el
> 
> and evaluate it with C-x C-e, the minibuffer displays
> 
> => ("/tmp/\301\203\300\204/." nil "/tmp/\303\204/." t)

"The minibuffer displays" is the key point here: to display anything
in the minibuffer or echo area, Emacs first _inserts_ the textual
representation of that thing into a buffer, and then triggers
redisplay.  Insertion of unibyte strings into a multibyte buffer, or
insertion of multibyte strings into the minibuffer when the current
buffer is unibyte, causes all kinds of transformations on the inserted
string, whose purpose is to intuit what the user expects to see.  What
you see is the result of those transformations.  And yes, that result
could be baffling at times; that's why I suggest staying away from
unibyte strings as much as you can, certainly as long as those strings
are file names with non-ASCII characters.

Again, I suggest figuring out whether and how you got unibyte strings
as file names in your original use case.

> I hope that clarifies it.

Sorry, it does not.






* bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer
  2013-09-21  6:48         ` Eli Zaretskii
@ 2013-09-21  9:35           ` Andreas Politz
  2013-09-21  9:38             ` Andreas Politz
  2013-09-21 11:59             ` Eli Zaretskii
  2013-09-21 16:06           ` Stefan Monnier
  1 sibling, 2 replies; 16+ messages in thread
From: Andreas Politz @ 2013-09-21  9:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 15426

[-- Attachment #1: Type: text/plain, Size: 379 bytes --]

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Andreas Politz <politza@hochschule-trier.de>
>> 
>> (let ((d "/tmp/\303\204")) ;; utf-8 for german umlaut "A 
>
> This makes d a unibyte string:
>
>   (setq d "/tmp/\303\204")
>   "/tmp/\303\204"
>
>   (multibyte-string-p d)
>     => nil
>
> Why would one do such a thing in the first place?

OK.  The thing is that Emacs does it.


[-- Attachment #2: multibyte-directory-list.el --]
[-- Type: application/emacs-lisp, Size: 747 bytes --]

[-- Attachment #3: Type: text/plain, Size: 562 bytes --]


If I save this in mb-dir/foo.el, where mb-dir is a directory whose name
contains multibyte characters, the results (d1 and d2) of the same calls
to `directory-files' differ between the unibyte and the multibyte buffer.
It seems that each 2-byte UTF-8 sequence is replaced by some 4 bytes.
Anyway, the resulting filename d2 names a non-existent file.

> "The minibuffer displays" is the key point here:[...]

No, the key point is that whether the file is found depends on the
multibyte status of the buffer in which the code is evaluated.

>
>> I hope that clarifies it.

-ap


* bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer
  2013-09-21  9:35           ` Andreas Politz
@ 2013-09-21  9:38             ` Andreas Politz
  2013-09-21 11:59             ` Eli Zaretskii
  1 sibling, 0 replies; 16+ messages in thread
From: Andreas Politz @ 2013-09-21  9:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 15426

Andreas Politz <politza@hochschule-trier.de> writes:

> Anyway, the resulting filename d2 names a non-existent file.

Sorry, I meant d1.

-ap






* bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer
  2013-09-21  9:35           ` Andreas Politz
  2013-09-21  9:38             ` Andreas Politz
@ 2013-09-21 11:59             ` Eli Zaretskii
  2013-09-21 17:12               ` Andreas Politz
  1 sibling, 1 reply; 16+ messages in thread
From: Eli Zaretskii @ 2013-09-21 11:59 UTC (permalink / raw)
  To: Andreas Politz; +Cc: 15426

> From: Andreas Politz <politza@hochschule-trier.de>
> Cc: monnier@iro.umontreal.ca,  15426@debbugs.gnu.org
> Date: Sat, 21 Sep 2013 11:35:52 +0200
> 
> If I save this in mb-dir/foo.el, where mb-dir is a directory containing
> multi-bytes, the results (d1 and d2) of the same calls to
> `directory-list' are different in the uni-byte and multi-byte buffer.

For the record, a much simpler test case is this:

  M-: (multibyte-string-p (car (directory-files default-directory t))) RET

invoked from the unibyte buffer that visits your mb-dir/foo.el.  Note
that default-directory is a multibyte string, as shown by calling
multibyte-string-p on it.  So the problem happens inside the
directory-files call.

(You should never trust what the echo area shows when potentially
unibyte strings are involved, always use multibyte-string-p to tell if
a string is multibyte or unibyte.)
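
The check above can be turned into a small comparison that does not depend on how anything is displayed. This is only a sketch: `my-directory-files-as' and `my-multibyte-flags-agree-p' are hypothetical helper names, not part of Emacs, and the temporary buffers stand in for the unibyte and multibyte buffers discussed in this thread.

```elisp
;; Sketch only: both function names below are hypothetical helpers.
(defun my-directory-files-as (dir multibyte)
  "Return (directory-files DIR t) as evaluated in a temporary buffer.
The buffer is multibyte if MULTIBYTE is non-nil, unibyte otherwise."
  (with-temp-buffer
    ;; Switch the (empty) temporary buffer to the requested representation.
    (set-buffer-multibyte multibyte)
    (directory-files dir t)))

(defun my-multibyte-flags-agree-p (dir)
  "Return non-nil if `directory-files' on DIR ignores buffer multibyteness.
Compare the `multibyte-string-p' flag of every returned name, once
from a unibyte buffer and once from a multibyte buffer."
  (equal (mapcar #'multibyte-string-p (my-directory-files-as dir nil))
         (mapcar #'multibyte-string-p (my-directory-files-as dir t))))
```

Calling (my-multibyte-flags-agree-p "/path/to/mb-dir") on a directory with a non-ASCII name would be expected to return nil on a build with this bug and non-nil once the fix is in.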

The bug that caused this should be fixed now in revision 114421 on the
trunk.

> It seems that the 2 byte sequences of the UTF-8 characters are
> replaced by some 4 bytes.

That's how Emacs represents raw bytes internally in a multibyte
buffer, so that "expansion" is a clear sign of a unibyte string.
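
A minimal sketch of that expansion, assuming the two bytes \303\204 (UTF-8 for "Ä") from earlier in the thread. `my-string-shape' is a hypothetical helper, not an Emacs function; it reports properties that do not depend on how the string is displayed.

```elisp
;; Sketch only: `my-string-shape' is a hypothetical helper.
(defun my-string-shape (s)
  "Return (MULTIBYTE-P CHARS BYTES) for string S."
  (list (multibyte-string-p s) (length s) (string-bytes s)))

;; The unibyte string: two raw bytes, stored as two bytes.
(my-string-shape "\303\204")
;; => (nil 2 2)

;; The same bytes carried over unchanged into a multibyte string: each
;; raw byte becomes one raw-byte character occupying two bytes
;; internally, hence four bytes in total.
(my-string-shape (string-to-multibyte "\303\204"))
;; => (t 2 4)

;; The bytes decoded as UTF-8: a single character stored in two bytes.
(my-string-shape (decode-coding-string "\303\204" 'utf-8))
;; => (t 1 2)
```

The middle case is the four-byte form seen in the report; only the last case is a properly decoded file name.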

> Anyway, the resulting filename d2 names a non-existent file.

Because encoding a unibyte string with raw bytes in their internal
representation will never get you the right file name.

> > "The minibuffer displays" is the key point here:[...]
> 
> No, the key is that the file's existence depends on the buffer's
> multi-byte status, in which the code is evaluated.

The truth is neither (although what the minibuffer displays in these
cases can easily fool you, so don't trust it).  The truth was that the
code in a subroutine of directory-files, when it is called with its
second argument non-nil, incorrectly marked the full file name it
produced as a unibyte string.  The rest was the consequence of that.

Thanks.






* bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer
  2013-09-21  6:48         ` Eli Zaretskii
  2013-09-21  9:35           ` Andreas Politz
@ 2013-09-21 16:06           ` Stefan Monnier
  2013-09-21 16:26             ` Eli Zaretskii
  1 sibling, 1 reply; 16+ messages in thread
From: Stefan Monnier @ 2013-09-21 16:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 15426, Andreas Politz

>> => ("/tmp/\301\203\300\204/." nil "/tmp/\303\204/." t)
> "The minibuffer displays" is the key point here:

No, the key point is that the two strings should be identical because
they are the return value of the exact same code which doesn't touch the
current buffer, yet once it's run in a unibyte buffer and the second
time it's run in a multibyte buffer.  So the current-buffer's multibyte
setting somehow affects the directory-files function.  That's the bug.


        Stefan






* bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer
  2013-09-21 16:06           ` Stefan Monnier
@ 2013-09-21 16:26             ` Eli Zaretskii
  2013-09-22  1:29               ` Stefan Monnier
  0 siblings, 1 reply; 16+ messages in thread
From: Eli Zaretskii @ 2013-09-21 16:26 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 15426, politza

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Andreas Politz <politza@hochschule-trier.de>,  15426@debbugs.gnu.org
> Date: Sat, 21 Sep 2013 12:06:10 -0400
> 
> >> => ("/tmp/\301\203\300\204/." nil "/tmp/\303\204/." t)
> > "The minibuffer displays" is the key point here:
> 
> No, the key point is that the two strings should be identical

But you don't see the strings, except after they are inserted into
some buffer.

> So the current-buffer's multibyte setting somehow affects the
> directory-files function.  That's the bug.

Yes.  But the way to show the bug is not to display the strings, but
to pass them to multibyte-string-p, or some other function whose
output's display cannot be possibly affected by multibyte-ness.






* bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer
  2013-09-21 11:59             ` Eli Zaretskii
@ 2013-09-21 17:12               ` Andreas Politz
  2013-09-21 18:53                 ` Eli Zaretskii
  0 siblings, 1 reply; 16+ messages in thread
From: Andreas Politz @ 2013-09-21 17:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 15426

Eli Zaretskii <eliz@gnu.org> writes:

> For the record, a much simpler test case is this:
>
>   M-: (multibyte-string-p (car (directory-files default-directory t))) RET
>

Ok, I didn't know about this invariant.

> [...]  So the problem happens inside the directory-files call.

That's why I put its name in the title and description.

> (You should never trust what the echo area shows when potentially
> unibyte strings are involved, always use multibyte-string-p to tell if
> a string is multibyte or unibyte.)

I think it's reasonable to assume that two strings have different
contents if they display differently in the same buffer.

>> > "The minibuffer displays" is the key point here:[...]
>> 
>> No, the key is that the file's existence depends on the buffer's
>> multi-byte status, in which the code is evaluated.
>
> The truth is neither [...]

I guess we can agree on the key, that you seemed to have solved this
problem.

-ap






* bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer
  2013-09-21 17:12               ` Andreas Politz
@ 2013-09-21 18:53                 ` Eli Zaretskii
  0 siblings, 0 replies; 16+ messages in thread
From: Eli Zaretskii @ 2013-09-21 18:53 UTC (permalink / raw)
  To: Andreas Politz; +Cc: 15426-done

> From: Andreas Politz <politza@hochschule-trier.de>
> Cc: monnier@iro.umontreal.ca,  15426@debbugs.gnu.org
> Date: Sat, 21 Sep 2013 19:12:48 +0200
> 
> I guess we can agree on the key, that you seemed to have solved this
> problem.

Right, so I'm closing the bug.

Thanks.






* bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer
  2013-09-21 16:26             ` Eli Zaretskii
@ 2013-09-22  1:29               ` Stefan Monnier
  0 siblings, 0 replies; 16+ messages in thread
From: Stefan Monnier @ 2013-09-22  1:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 15426, politza

>> >> => ("/tmp/\301\203\300\204/." nil "/tmp/\303\204/." t)
>> > "The minibuffer displays" is the key point here:
>> No, the key point is that the two strings should be identical
> But you don't see the strings, except after they are inserted into
> some buffer.

That's OK.  Since they print differently, we know they're different,
which is all that mattered.
Anyway, thanks for finding and fixing the bug,


        Stefan






end of thread, other threads:[~2013-09-22  1:29 UTC | newest]

Thread overview: 16+ messages
2013-09-20 16:47 bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer Andreas Politz
2013-09-20 17:46 ` Eli Zaretskii
2013-09-20 18:51   ` Andreas Politz
2013-09-20 19:08     ` Eli Zaretskii
2013-09-20 19:15   ` Stefan Monnier
2013-09-20 19:17     ` Eli Zaretskii
2013-09-20 20:56       ` Andreas Politz
2013-09-21  6:48         ` Eli Zaretskii
2013-09-21  9:35           ` Andreas Politz
2013-09-21  9:38             ` Andreas Politz
2013-09-21 11:59             ` Eli Zaretskii
2013-09-21 17:12               ` Andreas Politz
2013-09-21 18:53                 ` Eli Zaretskii
2013-09-21 16:06           ` Stefan Monnier
2013-09-21 16:26             ` Eli Zaretskii
2013-09-22  1:29               ` Stefan Monnier
