* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: Tassilo Horn @ 2014-12-16 15:21 UTC
To: 19393

I've downloaded the following file

  ftp://ftp.fu-berlin.de/pub/misc/movies/database/movies.list.gz

which contains all movies known to the international movie database
(IMDb.com).  When I open that file using "emacs -Q movies.list.gz" (or
unzip it first) and then do M-x describe-coding-system I can see that it
is "t -- raw-text-unix".  As a result of this, the last movie in that
file is displayed as "\374\347 (2012) 2012".

However, according to the `file' command, the file is plain ISO-8859.
And I can easily convert it to UTF-8 using

  % iconv -f ISO-8859-15 -t UTF-8 < movies.list > movies.list.utf8

without any encoding errors being reported.  Emacs can guess the
encoding of the resulting UTF-8 encoded file movies.list.utf8, i.e., the
coding system when opening the file is "U -- utf-8-unix".  Emacs shows
the last movie as "üç (2012) 2012" which is correct.

I also tried

  % iconv -f ISO-8859-15 -t ISO-8859-15 < movies.list > movies.list.iso-8859

but for the result file movies.list.iso-8859 the same issue as for the
original file applies, i.e., Emacs uses the encoding "t -- raw-text-unix"
and displays garbage for all non-ASCII characters.

I also can't force Emacs to use ISO-8859 for that or the original file.
`C-x RET f iso-8859-15 RET' results in a query that certain characters
cannot be encoded using latin-9, e.g., \374 and \347, and I'm expected
to choose another encoding.

So `file' and `iconv' say the file is valid latin-9 but Emacs seems to
disagree.  Who is correct?  I tend towards file/iconv but I might be
wrong.

And shouldn't it be possible to force Emacs to a certain coding system?
I mean, even if a file's content has a broken encoding, e.g., coding X in part A, coding Y in part B, I might want to switch to X in order to be able to read part A at all. (Ok, in that case I should get a big fat warning that saving the buffer will corrupt the file even more. Or maybe the buffer should become read-only...) The issue can be reproduced also with the other IMDb files containing non-ASCII chars, e.g., actors.list.gz, actresses.list.gz, etc. They are all available in the FTP directory above. In GNU Emacs 25.0.50.10 (x86_64-unknown-linux-gnu, GTK+ Version 3.14.5) of 2014-12-16 on thinkpad-t440p Repository revision: 15426191a1353ac208d8ebe4a5920228e0df41a4 Windowing system distributor `The X.Org Foundation', version 11.0.11602901 System Description: Arch Linux Configured features: XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GCONF GSETTINGS NOTIFY ACL GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB Important settings: value of $LC_MONETARY: de_DE.utf8 value of $LC_NUMERIC: de_DE.utf8 value of $LC_TIME: de_DE.utf8 value of $LANG: en_US.utf8 locale-coding-system: utf-8-unix Major mode: Group Minor modes in effect: TeX-PDF-mode: t TeX-source-correlate-mode: t diff-auto-refine-mode: t gnus-topic-mode: t hl-line-mode: t global-company-mode: t global-aggressive-indent-mode: t gnus-undo-mode: t global-edit-server-edit-mode: t recentf-mode: t shell-dirtrack-mode: t helm-match-plugin-mode: t helm-occur-match-plugin-mode: t global-subword-mode: t subword-mode: t savehist-mode: t show-paren-mode: t icomplete-mode: t minibuffer-depth-indicate-mode: t electric-pair-mode: t tooltip-mode: t global-eldoc-mode: t electric-indent-mode: t mouse-wheel-mode: t file-name-shadow-mode: t global-font-lock-mode: t font-lock-mode: t blink-cursor-mode: t auto-composition-mode: t auto-encryption-mode: t auto-compression-mode: t buffer-read-only: t column-number-mode: t line-number-mode: t Recent messages: Buffer dictionary was nil Ispell process killed Local Ispell 
dictionary set to en Buffer dictionary is now en Starting new Ispell process /usr/bin/aspell with en dictionary... Checking region... Spell Checking...100% [diss] Spell Checking completed. Quit Auto-saving... Load-path shadows: ~/Repos/el/auctex/lpath hides ~/Repos/el/gnus/lisp/lpath ~/Repos/el/gnus/lisp/md4 hides /home/horn/Repos/el/emacs/lisp/md4 ~/Repos/el/gnus/lisp/color hides /home/horn/Repos/el/emacs/lisp/color ~/Repos/el/gnus/lisp/format-spec hides /home/horn/Repos/el/emacs/lisp/format-spec ~/Repos/el/gnus/lisp/password-cache hides /home/horn/Repos/el/emacs/lisp/password-cache ~/Repos/el/gnus/lisp/hex-util hides /home/horn/Repos/el/emacs/lisp/hex-util ~/Repos/el/gnus/lisp/dns-mode hides /home/horn/Repos/el/emacs/lisp/textmodes/dns-mode /home/horn/.emacs.d/elpa/org-20141215/ob-plantuml hides /home/horn/Repos/el/emacs/lisp/org/ob-plantuml /home/horn/.emacs.d/elpa/org-20141215/org-archive hides /home/horn/Repos/el/emacs/lisp/org/org-archive /home/horn/.emacs.d/elpa/org-20141215/org-w3m hides /home/horn/Repos/el/emacs/lisp/org/org-w3m /home/horn/.emacs.d/elpa/org-20141215/ox-org hides /home/horn/Repos/el/emacs/lisp/org/ox-org /home/horn/.emacs.d/elpa/org-20141215/ob hides /home/horn/Repos/el/emacs/lisp/org/ob /home/horn/.emacs.d/elpa/org-20141215/org-faces hides /home/horn/Repos/el/emacs/lisp/org/org-faces /home/horn/.emacs.d/elpa/org-20141215/ob-awk hides /home/horn/Repos/el/emacs/lisp/org/ob-awk /home/horn/.emacs.d/elpa/org-20141215/org-habit hides /home/horn/Repos/el/emacs/lisp/org/org-habit /home/horn/.emacs.d/elpa/org-20141215/ob-sass hides /home/horn/Repos/el/emacs/lisp/org/ob-sass /home/horn/.emacs.d/elpa/org-20141215/org-ctags hides /home/horn/Repos/el/emacs/lisp/org/org-ctags /home/horn/.emacs.d/elpa/org-20141215/ob-screen hides /home/horn/Repos/el/emacs/lisp/org/ob-screen /home/horn/.emacs.d/elpa/org-20141215/ox-md hides /home/horn/Repos/el/emacs/lisp/org/ox-md /home/horn/.emacs.d/elpa/org-20141215/ox-beamer hides 
/home/horn/Repos/el/emacs/lisp/org/ox-beamer /home/horn/.emacs.d/elpa/org-20141215/org-loaddefs hides /home/horn/Repos/el/emacs/lisp/org/org-loaddefs /home/horn/.emacs.d/elpa/org-20141215/ob-perl hides /home/horn/Repos/el/emacs/lisp/org/ob-perl /home/horn/.emacs.d/elpa/org-20141215/org-rmail hides /home/horn/Repos/el/emacs/lisp/org/org-rmail /home/horn/.emacs.d/elpa/org-20141215/org-id hides /home/horn/Repos/el/emacs/lisp/org/org-id /home/horn/.emacs.d/elpa/org-20141215/ox-publish hides /home/horn/Repos/el/emacs/lisp/org/ox-publish /home/horn/.emacs.d/elpa/org-20141215/ob-maxima hides /home/horn/Repos/el/emacs/lisp/org/ob-maxima /home/horn/.emacs.d/elpa/org-20141215/org-install hides /home/horn/Repos/el/emacs/lisp/org/org-install /home/horn/.emacs.d/elpa/org-20141215/org-feed hides /home/horn/Repos/el/emacs/lisp/org/org-feed /home/horn/.emacs.d/elpa/org-20141215/ob-R hides /home/horn/Repos/el/emacs/lisp/org/ob-R /home/horn/.emacs.d/elpa/org-20141215/ox-latex hides /home/horn/Repos/el/emacs/lisp/org/ox-latex /home/horn/.emacs.d/elpa/org-20141215/org-timer hides /home/horn/Repos/el/emacs/lisp/org/org-timer /home/horn/.emacs.d/elpa/org-20141215/ob-core hides /home/horn/Repos/el/emacs/lisp/org/ob-core /home/horn/.emacs.d/elpa/org-20141215/org-datetree hides /home/horn/Repos/el/emacs/lisp/org/org-datetree /home/horn/.emacs.d/elpa/org-20141215/ob-sql hides /home/horn/Repos/el/emacs/lisp/org/ob-sql /home/horn/.emacs.d/elpa/org-20141215/ob-js hides /home/horn/Repos/el/emacs/lisp/org/ob-js /home/horn/.emacs.d/elpa/org-20141215/ob-tangle hides /home/horn/Repos/el/emacs/lisp/org/ob-tangle /home/horn/.emacs.d/elpa/org-20141215/org-capture hides /home/horn/Repos/el/emacs/lisp/org/org-capture /home/horn/.emacs.d/elpa/org-20141215/ob-haskell hides /home/horn/Repos/el/emacs/lisp/org/ob-haskell /home/horn/.emacs.d/elpa/org-20141215/ob-dot hides /home/horn/Repos/el/emacs/lisp/org/ob-dot /home/horn/.emacs.d/elpa/org-20141215/ob-exp hides /home/horn/Repos/el/emacs/lisp/org/ob-exp 
/home/horn/.emacs.d/elpa/org-20141215/org-info hides /home/horn/Repos/el/emacs/lisp/org/org-info /home/horn/.emacs.d/elpa/org-20141215/ob-octave hides /home/horn/Repos/el/emacs/lisp/org/ob-octave /home/horn/.emacs.d/elpa/org-20141215/org-mobile hides /home/horn/Repos/el/emacs/lisp/org/org-mobile /home/horn/.emacs.d/elpa/org-20141215/org-indent hides /home/horn/Repos/el/emacs/lisp/org/org-indent /home/horn/.emacs.d/elpa/org-20141215/org-attach hides /home/horn/Repos/el/emacs/lisp/org/org-attach /home/horn/.emacs.d/elpa/org-20141215/ob-java hides /home/horn/Repos/el/emacs/lisp/org/ob-java /home/horn/.emacs.d/elpa/org-20141215/org-mhe hides /home/horn/Repos/el/emacs/lisp/org/org-mhe /home/horn/.emacs.d/elpa/org-20141215/ob-scheme hides /home/horn/Repos/el/emacs/lisp/org/ob-scheme /home/horn/.emacs.d/elpa/org-20141215/ob-lob hides /home/horn/Repos/el/emacs/lisp/org/ob-lob /home/horn/.emacs.d/elpa/org-20141215/ob-calc hides /home/horn/Repos/el/emacs/lisp/org/ob-calc /home/horn/.emacs.d/elpa/org-20141215/org-agenda hides /home/horn/Repos/el/emacs/lisp/org/org-agenda /home/horn/.emacs.d/elpa/org-20141215/org-version hides /home/horn/Repos/el/emacs/lisp/org/org-version /home/horn/.emacs.d/elpa/org-20141215/org-clock hides /home/horn/Repos/el/emacs/lisp/org/org-clock /home/horn/.emacs.d/elpa/org-20141215/org-macro hides /home/horn/Repos/el/emacs/lisp/org/org-macro /home/horn/.emacs.d/elpa/org-20141215/ob-fortran hides /home/horn/Repos/el/emacs/lisp/org/ob-fortran /home/horn/.emacs.d/elpa/org-20141215/ob-picolisp hides /home/horn/Repos/el/emacs/lisp/org/ob-picolisp /home/horn/.emacs.d/elpa/org-20141215/ob-mscgen hides /home/horn/Repos/el/emacs/lisp/org/ob-mscgen /home/horn/.emacs.d/elpa/org-20141215/ox-texinfo hides /home/horn/Repos/el/emacs/lisp/org/ox-texinfo /home/horn/.emacs.d/elpa/org-20141215/org-table hides /home/horn/Repos/el/emacs/lisp/org/org-table /home/horn/.emacs.d/elpa/org-20141215/ob-matlab hides /home/horn/Repos/el/emacs/lisp/org/ob-matlab 
/home/horn/.emacs.d/elpa/org-20141215/ox-html hides /home/horn/Repos/el/emacs/lisp/org/ox-html /home/horn/.emacs.d/elpa/org-20141215/ox-icalendar hides /home/horn/Repos/el/emacs/lisp/org/ox-icalendar /home/horn/.emacs.d/elpa/org-20141215/org-bbdb hides /home/horn/Repos/el/emacs/lisp/org/org-bbdb /home/horn/.emacs.d/elpa/org-20141215/ob-asymptote hides /home/horn/Repos/el/emacs/lisp/org/ob-asymptote /home/horn/.emacs.d/elpa/org-20141215/org-eshell hides /home/horn/Repos/el/emacs/lisp/org/org-eshell /home/horn/.emacs.d/elpa/org-20141215/ob-comint hides /home/horn/Repos/el/emacs/lisp/org/ob-comint /home/horn/.emacs.d/elpa/org-20141215/org hides /home/horn/Repos/el/emacs/lisp/org/org /home/horn/.emacs.d/elpa/org-20141215/org-irc hides /home/horn/Repos/el/emacs/lisp/org/org-irc /home/horn/.emacs.d/elpa/org-20141215/ob-table hides /home/horn/Repos/el/emacs/lisp/org/ob-table /home/horn/.emacs.d/elpa/org-20141215/ob-scala hides /home/horn/Repos/el/emacs/lisp/org/ob-scala /home/horn/.emacs.d/elpa/org-20141215/ob-io hides /home/horn/Repos/el/emacs/lisp/org/ob-io /home/horn/.emacs.d/elpa/org-20141215/ox-ascii hides /home/horn/Repos/el/emacs/lisp/org/ox-ascii /home/horn/.emacs.d/elpa/org-20141215/ob-lisp hides /home/horn/Repos/el/emacs/lisp/org/ob-lisp /home/horn/.emacs.d/elpa/org-20141215/org-macs hides /home/horn/Repos/el/emacs/lisp/org/org-macs /home/horn/.emacs.d/elpa/org-20141215/ob-sqlite hides /home/horn/Repos/el/emacs/lisp/org/ob-sqlite /home/horn/.emacs.d/elpa/org-20141215/ob-latex hides /home/horn/Repos/el/emacs/lisp/org/ob-latex /home/horn/.emacs.d/elpa/org-20141215/ob-css hides /home/horn/Repos/el/emacs/lisp/org/ob-css /home/horn/.emacs.d/elpa/org-20141215/org-protocol hides /home/horn/Repos/el/emacs/lisp/org/org-protocol /home/horn/.emacs.d/elpa/org-20141215/ob-keys hides /home/horn/Repos/el/emacs/lisp/org/ob-keys /home/horn/.emacs.d/elpa/org-20141215/org-mouse hides /home/horn/Repos/el/emacs/lisp/org/org-mouse /home/horn/.emacs.d/elpa/org-20141215/ob-ruby hides 
/home/horn/Repos/el/emacs/lisp/org/ob-ruby /home/horn/.emacs.d/elpa/org-20141215/org-element hides /home/horn/Repos/el/emacs/lisp/org/org-element /home/horn/.emacs.d/elpa/org-20141215/org-bibtex hides /home/horn/Repos/el/emacs/lisp/org/org-bibtex /home/horn/.emacs.d/elpa/org-20141215/ob-C hides /home/horn/Repos/el/emacs/lisp/org/ob-C /home/horn/.emacs.d/elpa/org-20141215/org-src hides /home/horn/Repos/el/emacs/lisp/org/org-src /home/horn/.emacs.d/elpa/org-20141215/ob-makefile hides /home/horn/Repos/el/emacs/lisp/org/ob-makefile /home/horn/.emacs.d/elpa/org-20141215/org-colview hides /home/horn/Repos/el/emacs/lisp/org/org-colview /home/horn/.emacs.d/elpa/org-20141215/ob-ledger hides /home/horn/Repos/el/emacs/lisp/org/ob-ledger /home/horn/.emacs.d/elpa/org-20141215/org-crypt hides /home/horn/Repos/el/emacs/lisp/org/org-crypt /home/horn/.emacs.d/elpa/org-20141215/ob-shen hides /home/horn/Repos/el/emacs/lisp/org/ob-shen /home/horn/.emacs.d/elpa/org-20141215/ob-gnuplot hides /home/horn/Repos/el/emacs/lisp/org/ob-gnuplot /home/horn/.emacs.d/elpa/org-20141215/org-inlinetask hides /home/horn/Repos/el/emacs/lisp/org/org-inlinetask /home/horn/.emacs.d/elpa/org-20141215/org-gnus hides /home/horn/Repos/el/emacs/lisp/org/org-gnus /home/horn/.emacs.d/elpa/org-20141215/ob-sh hides /home/horn/Repos/el/emacs/lisp/org/ob-sh /home/horn/.emacs.d/elpa/org-20141215/org-pcomplete hides /home/horn/Repos/el/emacs/lisp/org/org-pcomplete /home/horn/.emacs.d/elpa/org-20141215/org-docview hides /home/horn/Repos/el/emacs/lisp/org/org-docview /home/horn/.emacs.d/elpa/org-20141215/ox-man hides /home/horn/Repos/el/emacs/lisp/org/ox-man /home/horn/.emacs.d/elpa/org-20141215/org-plot hides /home/horn/Repos/el/emacs/lisp/org/org-plot /home/horn/.emacs.d/elpa/org-20141215/ox hides /home/horn/Repos/el/emacs/lisp/org/ox /home/horn/.emacs.d/elpa/org-20141215/ob-python hides /home/horn/Repos/el/emacs/lisp/org/ob-python /home/horn/.emacs.d/elpa/org-20141215/ob-eval hides 
/home/horn/Repos/el/emacs/lisp/org/ob-eval /home/horn/.emacs.d/elpa/org-20141215/ob-clojure hides /home/horn/Repos/el/emacs/lisp/org/ob-clojure /home/horn/.emacs.d/elpa/org-20141215/ob-ocaml hides /home/horn/Repos/el/emacs/lisp/org/ob-ocaml /home/horn/.emacs.d/elpa/org-20141215/ox-odt hides /home/horn/Repos/el/emacs/lisp/org/ox-odt /home/horn/.emacs.d/elpa/org-20141215/org-compat hides /home/horn/Repos/el/emacs/lisp/org/org-compat /home/horn/.emacs.d/elpa/org-20141215/org-list hides /home/horn/Repos/el/emacs/lisp/org/org-list /home/horn/.emacs.d/elpa/org-20141215/ob-emacs-lisp hides /home/horn/Repos/el/emacs/lisp/org/ob-emacs-lisp /home/horn/.emacs.d/elpa/org-20141215/org-entities hides /home/horn/Repos/el/emacs/lisp/org/org-entities /home/horn/.emacs.d/elpa/org-20141215/ob-ref hides /home/horn/Repos/el/emacs/lisp/org/ob-ref /home/horn/.emacs.d/elpa/org-20141215/ob-ditaa hides /home/horn/Repos/el/emacs/lisp/org/ob-ditaa /home/horn/.emacs.d/elpa/org-20141215/ob-lilypond hides /home/horn/Repos/el/emacs/lisp/org/ob-lilypond /home/horn/.emacs.d/elpa/org-20141215/ob-org hides /home/horn/Repos/el/emacs/lisp/org/ob-org /home/horn/.emacs.d/elpa/org-20141215/org-footnote hides /home/horn/Repos/el/emacs/lisp/org/org-footnote ~/Repos/el/gnus/lisp/dig hides /home/horn/Repos/el/emacs/lisp/net/dig ~/Repos/el/gnus/lisp/hmac-md5 hides /home/horn/Repos/el/emacs/lisp/net/hmac-md5 ~/Repos/el/gnus/lisp/ntlm hides /home/horn/Repos/el/emacs/lisp/net/ntlm ~/Repos/el/gnus/lisp/hmac-def hides /home/horn/Repos/el/emacs/lisp/net/hmac-def ~/Repos/el/gnus/lisp/sasl-ntlm hides /home/horn/Repos/el/emacs/lisp/net/sasl-ntlm ~/Repos/el/gnus/lisp/sasl-cram hides /home/horn/Repos/el/emacs/lisp/net/sasl-cram ~/Repos/el/gnus/lisp/dns hides /home/horn/Repos/el/emacs/lisp/net/dns ~/Repos/el/gnus/lisp/sasl hides /home/horn/Repos/el/emacs/lisp/net/sasl ~/Repos/el/gnus/lisp/tls hides /home/horn/Repos/el/emacs/lisp/net/tls ~/Repos/el/gnus/lisp/netrc hides /home/horn/Repos/el/emacs/lisp/net/netrc 
~/Repos/el/gnus/lisp/sasl-digest hides /home/horn/Repos/el/emacs/lisp/net/sasl-digest ~/Repos/el/gnus/lisp/uudecode hides /home/horn/Repos/el/emacs/lisp/mail/uudecode ~/Repos/el/gnus/lisp/binhex hides /home/horn/Repos/el/emacs/lisp/mail/binhex ~/Repos/el/gnus/lisp/hashcash hides /home/horn/Repos/el/emacs/lisp/mail/hashcash ~/Repos/el/gnus/lisp/canlock hides /home/horn/Repos/el/emacs/lisp/gnus/canlock ~/Repos/el/gnus/lisp/nneething hides /home/horn/Repos/el/emacs/lisp/gnus/nneething ~/Repos/el/gnus/lisp/mm-encode hides /home/horn/Repos/el/emacs/lisp/gnus/mm-encode ~/Repos/el/gnus/lisp/mm-util hides /home/horn/Repos/el/emacs/lisp/gnus/mm-util ~/Repos/el/gnus/lisp/rfc2047 hides /home/horn/Repos/el/emacs/lisp/gnus/rfc2047 ~/Repos/el/gnus/lisp/nnml hides /home/horn/Repos/el/emacs/lisp/gnus/nnml ~/Repos/el/gnus/lisp/gnus-cus hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-cus ~/Repos/el/gnus/lisp/gnus-range hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-range ~/Repos/el/gnus/lisp/gnus-int hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-int ~/Repos/el/gnus/lisp/gnus-cloud hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-cloud ~/Repos/el/gnus/lisp/spam-stat hides /home/horn/Repos/el/emacs/lisp/gnus/spam-stat ~/Repos/el/gnus/lisp/nnmh hides /home/horn/Repos/el/emacs/lisp/gnus/nnmh ~/Repos/el/gnus/lisp/gnus-mlspl hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-mlspl ~/Repos/el/gnus/lisp/deuglify hides /home/horn/Repos/el/emacs/lisp/gnus/deuglify ~/Repos/el/gnus/lisp/gnus-gravatar hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-gravatar ~/Repos/el/gnus/lisp/nngateway hides /home/horn/Repos/el/emacs/lisp/gnus/nngateway ~/Repos/el/gnus/lisp/ietf-drums hides /home/horn/Repos/el/emacs/lisp/gnus/ietf-drums ~/Repos/el/gnus/lisp/mail-parse hides /home/horn/Repos/el/emacs/lisp/gnus/mail-parse ~/Repos/el/gnus/lisp/gnus-salt hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-salt ~/Repos/el/gnus/lisp/nnimap hides /home/horn/Repos/el/emacs/lisp/gnus/nnimap ~/Repos/el/gnus/lisp/gnus-draft hides 
/home/horn/Repos/el/emacs/lisp/gnus/gnus-draft ~/Repos/el/gnus/lisp/mail-source hides /home/horn/Repos/el/emacs/lisp/gnus/mail-source ~/Repos/el/gnus/lisp/messcompat hides /home/horn/Repos/el/emacs/lisp/gnus/messcompat ~/Repos/el/gnus/lisp/pop3 hides /home/horn/Repos/el/emacs/lisp/gnus/pop3 ~/Repos/el/gnus/lisp/nnmaildir hides /home/horn/Repos/el/emacs/lisp/gnus/nnmaildir ~/Repos/el/gnus/lisp/nnheader hides /home/horn/Repos/el/emacs/lisp/gnus/nnheader ~/Repos/el/gnus/lisp/gnus-cite hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-cite ~/Repos/el/gnus/lisp/rfc2104 hides /home/horn/Repos/el/emacs/lisp/gnus/rfc2104 ~/Repos/el/gnus/lisp/nndiary hides /home/horn/Repos/el/emacs/lisp/gnus/nndiary ~/Repos/el/gnus/lisp/gnus-diary hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-diary ~/Repos/el/gnus/lisp/nnfolder hides /home/horn/Repos/el/emacs/lisp/gnus/nnfolder ~/Repos/el/gnus/lisp/gnus-art hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-art ~/Repos/el/gnus/lisp/gnus-demon hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-demon ~/Repos/el/gnus/lisp/mml-sec hides /home/horn/Repos/el/emacs/lisp/gnus/mml-sec ~/Repos/el/gnus/lisp/nnir hides /home/horn/Repos/el/emacs/lisp/gnus/nnir ~/Repos/el/gnus/lisp/mm-partial hides /home/horn/Repos/el/emacs/lisp/gnus/mm-partial ~/Repos/el/gnus/lisp/gnus-registry hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-registry ~/Repos/el/gnus/lisp/gnus-icalendar hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-icalendar ~/Repos/el/gnus/lisp/compface hides /home/horn/Repos/el/emacs/lisp/gnus/compface ~/Repos/el/gnus/lisp/gnus-fun hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-fun ~/Repos/el/gnus/lisp/gnus-start hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-start ~/Repos/el/gnus/lisp/smiley hides /home/horn/Repos/el/emacs/lisp/gnus/smiley ~/Repos/el/gnus/lisp/gnus-picon hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-picon ~/Repos/el/gnus/lisp/spam-report hides /home/horn/Repos/el/emacs/lisp/gnus/spam-report ~/Repos/el/gnus/lisp/nntp hides 
/home/horn/Repos/el/emacs/lisp/gnus/nntp ~/Repos/el/gnus/lisp/nnnil hides /home/horn/Repos/el/emacs/lisp/gnus/nnnil ~/Repos/el/gnus/lisp/nndir hides /home/horn/Repos/el/emacs/lisp/gnus/nndir ~/Repos/el/gnus/lisp/gnus-srvr hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-srvr ~/Repos/el/gnus/lisp/smime hides /home/horn/Repos/el/emacs/lisp/gnus/smime ~/Repos/el/gnus/lisp/nnvirtual hides /home/horn/Repos/el/emacs/lisp/gnus/nnvirtual ~/Repos/el/gnus/lisp/gnus-notifications hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-notifications ~/Repos/el/gnus/lisp/nnspool hides /home/horn/Repos/el/emacs/lisp/gnus/nnspool ~/Repos/el/gnus/lisp/gnus-group hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-group ~/Repos/el/gnus/lisp/gnus-bcklg hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-bcklg ~/Repos/el/gnus/lisp/gnus-util hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-util ~/Repos/el/gnus/lisp/gnus-sieve hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-sieve ~/Repos/el/gnus/lisp/nndraft hides /home/horn/Repos/el/emacs/lisp/gnus/nndraft ~/Repos/el/gnus/lisp/nnagent hides /home/horn/Repos/el/emacs/lisp/gnus/nnagent ~/Repos/el/gnus/lisp/gnus-spec hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-spec ~/Repos/el/gnus/lisp/gnus-bookmark hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-bookmark ~/Repos/el/gnus/lisp/mml1991 hides /home/horn/Repos/el/emacs/lisp/gnus/mml1991 ~/Repos/el/gnus/lisp/rfc2231 hides /home/horn/Repos/el/emacs/lisp/gnus/rfc2231 ~/Repos/el/gnus/lisp/yenc hides /home/horn/Repos/el/emacs/lisp/gnus/yenc ~/Repos/el/gnus/lisp/gnus-undo hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-undo ~/Repos/el/gnus/lisp/ecomplete hides /home/horn/Repos/el/emacs/lisp/gnus/ecomplete ~/Repos/el/gnus/lisp/legacy-gnus-agent hides /home/horn/Repos/el/emacs/lisp/gnus/legacy-gnus-agent ~/Repos/el/gnus/lisp/utf7 hides /home/horn/Repos/el/emacs/lisp/gnus/utf7 ~/Repos/el/gnus/lisp/rtree hides /home/horn/Repos/el/emacs/lisp/gnus/rtree ~/Repos/el/gnus/lisp/gnus-uu hides 
/home/horn/Repos/el/emacs/lisp/gnus/gnus-uu ~/Repos/el/gnus/lisp/gnus-ml hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-ml ~/Repos/el/gnus/lisp/sieve hides /home/horn/Repos/el/emacs/lisp/gnus/sieve ~/Repos/el/gnus/lisp/gnus hides /home/horn/Repos/el/emacs/lisp/gnus/gnus ~/Repos/el/gnus/lisp/mml hides /home/horn/Repos/el/emacs/lisp/gnus/mml ~/Repos/el/gnus/lisp/message hides /home/horn/Repos/el/emacs/lisp/gnus/message ~/Repos/el/gnus/lisp/mml-smime hides /home/horn/Repos/el/emacs/lisp/gnus/mml-smime ~/Repos/el/gnus/lisp/gnus-eform hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-eform ~/Repos/el/gnus/lisp/gnus-agent hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-agent ~/Repos/el/gnus/lisp/gnus-logic hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-logic ~/Repos/el/gnus/lisp/mm-extern hides /home/horn/Repos/el/emacs/lisp/gnus/mm-extern ~/Repos/el/gnus/lisp/nndoc hides /home/horn/Repos/el/emacs/lisp/gnus/nndoc ~/Repos/el/gnus/lisp/sieve-manage hides /home/horn/Repos/el/emacs/lisp/gnus/sieve-manage ~/Repos/el/gnus/lisp/mm-decode hides /home/horn/Repos/el/emacs/lisp/gnus/mm-decode ~/Repos/el/gnus/lisp/starttls hides /home/horn/Repos/el/emacs/lisp/gnus/starttls ~/Repos/el/gnus/lisp/gnus-dired hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-dired ~/Repos/el/gnus/lisp/nnbabyl hides /home/horn/Repos/el/emacs/lisp/gnus/nnbabyl ~/Repos/el/gnus/lisp/nnmbox hides /home/horn/Repos/el/emacs/lisp/gnus/nnmbox ~/Repos/el/gnus/lisp/gnus-win hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-win ~/Repos/el/gnus/lisp/gnus-async hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-async ~/Repos/el/gnus/lisp/mm-url hides /home/horn/Repos/el/emacs/lisp/gnus/mm-url ~/Repos/el/gnus/lisp/gnus-html hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-html ~/Repos/el/gnus/lisp/gssapi hides /home/horn/Repos/el/emacs/lisp/gnus/gssapi ~/Repos/el/gnus/lisp/mml2015 hides /home/horn/Repos/el/emacs/lisp/gnus/mml2015 ~/Repos/el/gnus/lisp/nnrss hides /home/horn/Repos/el/emacs/lisp/gnus/nnrss ~/Repos/el/gnus/lisp/gnus-mh 
hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-mh ~/Repos/el/gnus/lisp/gnus-sum hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-sum ~/Repos/el/gnus/lisp/nnweb hides /home/horn/Repos/el/emacs/lisp/gnus/nnweb ~/Repos/el/gnus/lisp/mail-prsvr hides /home/horn/Repos/el/emacs/lisp/gnus/mail-prsvr ~/Repos/el/gnus/lisp/nnmairix hides /home/horn/Repos/el/emacs/lisp/gnus/nnmairix ~/Repos/el/gnus/lisp/plstore hides /home/horn/Repos/el/emacs/lisp/gnus/plstore ~/Repos/el/gnus/lisp/rfc2045 hides /home/horn/Repos/el/emacs/lisp/gnus/rfc2045 ~/Repos/el/gnus/lisp/gnus-msg hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-msg ~/Repos/el/gnus/lisp/spam-wash hides /home/horn/Repos/el/emacs/lisp/gnus/spam-wash ~/Repos/el/gnus/lisp/gnus-score hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-score ~/Repos/el/gnus/lisp/mm-uu hides /home/horn/Repos/el/emacs/lisp/gnus/mm-uu ~/Repos/el/gnus/lisp/spam hides /home/horn/Repos/el/emacs/lisp/gnus/spam ~/Repos/el/gnus/lisp/mm-view hides /home/horn/Repos/el/emacs/lisp/gnus/mm-view ~/Repos/el/gnus/lisp/sieve-mode hides /home/horn/Repos/el/emacs/lisp/gnus/sieve-mode ~/Repos/el/gnus/lisp/html2text hides /home/horn/Repos/el/emacs/lisp/gnus/html2text ~/Repos/el/gnus/lisp/gnus-ems hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-ems ~/Repos/el/gnus/lisp/registry hides /home/horn/Repos/el/emacs/lisp/gnus/registry ~/Repos/el/gnus/lisp/auth-source hides /home/horn/Repos/el/emacs/lisp/gnus/auth-source ~/Repos/el/gnus/lisp/gravatar hides /home/horn/Repos/el/emacs/lisp/gnus/gravatar ~/Repos/el/gnus/lisp/flow-fill hides /home/horn/Repos/el/emacs/lisp/gnus/flow-fill ~/Repos/el/gnus/lisp/gmm-utils hides /home/horn/Repos/el/emacs/lisp/gnus/gmm-utils ~/Repos/el/gnus/lisp/mailcap hides /home/horn/Repos/el/emacs/lisp/gnus/mailcap ~/Repos/el/gnus/lisp/gnus-delay hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-delay ~/Repos/el/gnus/lisp/mm-bodies hides /home/horn/Repos/el/emacs/lisp/gnus/mm-bodies ~/Repos/el/gnus/lisp/mm-archive hides 
/home/horn/Repos/el/emacs/lisp/gnus/mm-archive ~/Repos/el/gnus/lisp/rfc1843 hides /home/horn/Repos/el/emacs/lisp/gnus/rfc1843 ~/Repos/el/gnus/lisp/gnus-kill hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-kill ~/Repos/el/gnus/lisp/qp hides /home/horn/Repos/el/emacs/lisp/gnus/qp ~/Repos/el/gnus/lisp/score-mode hides /home/horn/Repos/el/emacs/lisp/gnus/score-mode ~/Repos/el/gnus/lisp/gnus-topic hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-topic ~/Repos/el/gnus/lisp/gnus-cache hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-cache ~/Repos/el/gnus/lisp/nnmail hides /home/horn/Repos/el/emacs/lisp/gnus/nnmail ~/Repos/el/gnus/lisp/gnus-vm hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-vm ~/Repos/el/gnus/lisp/gnus-sync hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-sync ~/Repos/el/gnus/lisp/nnoo hides /home/horn/Repos/el/emacs/lisp/gnus/nnoo ~/Repos/el/gnus/lisp/nnregistry hides /home/horn/Repos/el/emacs/lisp/gnus/nnregistry ~/Repos/el/gnus/lisp/gnus-dup hides /home/horn/Repos/el/emacs/lisp/gnus/gnus-dup ~/Repos/el/gnus/lisp/parse-time hides /home/horn/Repos/el/emacs/lisp/calendar/parse-time ~/Repos/el/gnus/lisp/time-date hides /home/horn/Repos/el/emacs/lisp/calendar/time-date Features: (shadow emacsbug tramp-cache gnus-dired autorevert filenotify cider-macroexpansion reftex-sel reftex-ref reftex-parse reftex-toc texmathp preview prv-emacs auto-dictionary flyspell ispell tex-buf reftex-dcr reftex-auc reftex reftex-vars font-latex latex tex-style tex dbus crm tex-mode latexenc filecache shr-color color shr dom subr-x pcase hippie-exp bs mailalias smtpmail sendmail nxml-uchnm rng-xsd xsd-regexp rng-cmpct rng-nxml rng-valid rng-loc rng-uri rng-parse nxml-parse rng-match rng-dt rng-util rng-pttrn nxml-ns nxml-mode nxml-outln nxml-rap nxml-util nxml-glyph nxml-enc xmltok misearch multi-isearch xterm url-http url-gw url-auth sort smiley gnus-cite qp mm-archive gnus-async gnus-bcklg gnus-ml mule-diag vc-git diff-mode jka-compr hl-line nndraft nnmh rot13 utf-7 gnutls network-stream 
nsm starttls nnml nnnil gnus-agent gnus-srvr gnus-score score-mode nnvirtual gnus-cache gnus-demon nntp spam spam-stat gnus-uu yenc gnus-msg gnus-gravatar mail-extr gravatar gnus-topic nnir gnus-registry registry eieio-base th-private company-files company-oddmuse company-keywords company-etags company-gtags company-dabbrev-code company-dabbrev company-capf company-cmake company-ropemacs company-xcode company-clang company-semantic company-eclim company-template company-css company-nxml company-bbdb highlight-parentheses company stratego-mode greql-mode tg-mode generic preview-latex tex-site auto-loads cider tramp-sh cider-mode cider-repl cider-eldoc cider-interaction apropos arc-mode archive-mode cider-doc org-table cider-test cider-stacktrace cider-client nrepl-client queue cider-util ewoc etags clojure-mode imenu paredit aggressive-indent names edebug epa-file epa epg rdictcc ox-reveal ox-latex ox-icalendar ox-html ox-ascii ox-publish ox org-element google-contacts-message google-contacts derived url-cache google-oauth google-contacts-gnus gnus-art mm-uu mml2015 mm-view mml-smime smime dig gnus-sum gnus-group gnus-undo gnus-start gnus-cloud nnimap nnmail mail-source tls utf7 netrc nnoo parse-time gnus-spec gnus-int gnus-range gnus-win gnus gnus-ems gnus-compat nnheader em-term term ehelp esh-opt esh-ext esh-util highlight-symbol boxquote rect ecomplete message rfc822 mml mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 rfc2047 rfc2045 ietf-drums mailabbrev mail-utils gmm-utils mailheader edit-server server yasnippet help-mode disp-table browse-kill-ring recentf tree-widget wid-edit helm-projectile helm-files image-dired tramp tramp-compat tramp-loaddefs trampver shell dired-x dired-aux ffap helm-tags helm-bookmark helm-adaptive helm-info helm-net browse-url xml url url-proxy url-privacy url-expand url-methods url-history url-cookie url-domsuf url-util url-parse auth-source gnus-util mm-util mail-prsvr password-cache url-vars mailcap bookmark pp helm-help 
helm-org org org-macro org-footnote org-pcomplete pcomplete org-list org-faces org-entities noutline outline org-version ob-emacs-lisp ob ob-tangle ob-ref ob-lob ob-table ob-exp org-src ob-keys ob-comint ob-core ob-eval org-compat org-macs org-loaddefs format-spec cal-menu calendar cal-loaddefs helm-external helm-buffers helm-match-plugin helm-grep helm-regexp helm-plugin helm-elscreen helm-utils dired helm-locate helm helm-source eieio byte-opt bytecomp byte-compile cl-extra cconv eieio-core helm-config async-bytecomp async helm-aliases projectile ibuf-ext ibuffer pkg-info find-func lisp-mnt epl grep compile comint ansi-color ring f s ucs-normalize thingatpt easy-mmode cl-macs iedit help-macro iedit-lib cl gv cap-words superword subword saveplace savehist paren icomplete mb-depth smart-mode-line-respectful-theme smart-mode-line-light-theme rich-minority smart-mode-line mule-util dash rx edmacro kmacro cl-loaddefs cl-lib elec-pair gnus-load tsdh-light-theme memory-usage-autoloads advice help-fns info easymenu package epg-config time-date tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type mwheel x-win x-dnd tool-bar dnd fontset image regexp-opt fringe tabulated-list newcomment elisp-mode lisp-mode prog-mode register page menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock font-lock syntax facemenu font-core frame cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer nadvice loaddefs button faces cus-face macroexp files text-properties overlay sha1 md5 base64 format env code-pages mule custom widget hashtable-print-readable backquote make-network-process dbusbind gfilenotify dynamic-setting system-font-setting font-render-setting move-toolbar gtk x-toolkit x multi-tty emacs) Memory information: ((conses 16 905837 165784) (symbols 48 63620 24) (miscs 40 1776 13622) 
(strings 32 211846 31876) (string-bytes 1 6973133) (vectors 16 88496) (vector-slots 8 2137319 192540) (floats 8 791 758) (intervals 56 7040 9061) (buffers 976 59) (heap 1024 133392 9009))
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: Eli Zaretskii @ 2014-12-16 16:05 UTC
To: Tassilo Horn; +Cc: 19393

> From: Tassilo Horn <tsdh@gnu.org>
> Date: Tue, 16 Dec 2014 16:21:10 +0100
>
> ftp://ftp.fu-berlin.de/pub/misc/movies/database/movies.list.gz
>
> which contains all movies known to the international movie database
> (IMDb.com).  When I open that file using "emacs -Q movies.list.gz" (or
> unzip it first) and then do M-x describe-coding-system I can see that it
> is "t -- raw-text-unix".  As a result of this, the last movie in that
> file is displayed as "\374\347 (2012) 2012".
>
> However, according to the `file' command, the file is plain ISO-8859.

Looks like some kind of bug, although with such a large file, it's not
easy to be sure.

> I also can't force Emacs to use ISO-8859 for that or the original file.
> `C-x RET f iso-8859-15 RET' results in a query that certain characters
> cannot be encoded using latin-9, e.g., \374 and \347, and I'm expected
> to choose another encoding.

That's not how you force Emacs to use a specific encoding when visiting
a file.  You should do this instead:

  C-x RET c iso-8859-15 RET C-x C-f movies.list RET

IOW, revisit the file, forcing Emacs to decode it as ISO-8859-15.  (The
same works with the original compressed file.)
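[Editorial illustration, not part of the original message.]  The idea of
decoding with a codec chosen by the reader, rather than one guessed from
the content, can be sketched outside Emacs.  The snippet below is a
minimal Python sketch; the helper name `decode_forced` is hypothetical,
and the sample bytes are only the two octal escapes quoted in the
report, not the real movies.list file.

```python
# The report shows the last title as "\374\347 (2012) 2012"; those octal
# escapes are the raw bytes 0xFC 0xE7.
raw = bytes([0o374, 0o347]) + b" (2012) 2012"


def decode_forced(data: bytes, encoding: str = "iso8859_15") -> str:
    """Decode DATA with a caller-chosen codec instead of guessing one."""
    return data.decode(encoding)


title = decode_forced(raw)
print(title)  # üç (2012) 2012
```

This mirrors the distinction made above only loosely: the codec is
applied while reading the bytes, which is a different step from
choosing the codec used when writing them back.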
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: Eli Zaretskii @ 2014-12-16 16:20 UTC
To: tsdh; +Cc: 19393

> Date: Tue, 16 Dec 2014 18:05:38 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: 19393@debbugs.gnu.org
>
> > From: Tassilo Horn <tsdh@gnu.org>
> > Date: Tue, 16 Dec 2014 16:21:10 +0100
> >
> > ftp://ftp.fu-berlin.de/pub/misc/movies/database/movies.list.gz
> >
> > which contains all movies known to the international movie database
> > (IMDb.com).  When I open that file using "emacs -Q movies.list.gz" (or
> > unzip it first) and then do M-x describe-coding-system I can see that it
> > is "t -- raw-text-unix".  As a result of this, the last movie in that
> > file is displayed as "\374\347 (2012) 2012".
> >
> > However, according to the `file' command, the file is plain ISO-8859.
>
> Looks like some kind of bug, although with such a large file, it's not
> easy to be sure.

Actually, I don't think this is a bug.  There are ISO-8859-15
characters in that file that are not part of ISO-8859-1, so Emacs will
not detect that encoding unless either (a) your locale dictates that
encoding, or (b) you change the preferences to prefer ISO-8859-15.

This is so with any 8-bit encoding -- Emacs cannot easily distinguish
between them, and needs some guidance.
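
[Editorial note] To illustrate the point about 8-bit encodings: ISO-8859-1
and ISO-8859-15 both assign a character to every byte from 0xA0 to 0xFF and
differ in only eight positions, so the bytes alone rarely reveal which one
was meant.  The following is a minimal C sketch of that fact.  It is not
taken from Emacs, and the function names are made up for this note.

```c
#include <stdint.h>

/* Decode one ISO-8859-1 byte: the byte value is the code point.
   (Hypothetical helper, not an Emacs function.)  */
uint32_t
latin1_to_ucs (unsigned char b)
{
  return b;
}

/* Decode one ISO-8859-15 byte.  Identical to ISO-8859-1 except for
   the eight positions listed below.  (Hypothetical helper.)  */
uint32_t
latin9_to_ucs (unsigned char b)
{
  switch (b)
    {
    case 0xA4: return 0x20AC;  /* EURO SIGN */
    case 0xA6: return 0x0160;  /* LATIN CAPITAL LETTER S WITH CARON */
    case 0xA8: return 0x0161;  /* LATIN SMALL LETTER S WITH CARON */
    case 0xB4: return 0x017D;  /* LATIN CAPITAL LETTER Z WITH CARON */
    case 0xB8: return 0x017E;  /* LATIN SMALL LETTER Z WITH CARON */
    case 0xBC: return 0x0152;  /* LATIN CAPITAL LIGATURE OE */
    case 0xBD: return 0x0153;  /* LATIN SMALL LIGATURE OE */
    case 0xBE: return 0x0178;  /* LATIN CAPITAL LETTER Y WITH DIAERESIS */
    default:   return b;       /* everything else matches ISO-8859-1 */
    }
}
```

The two bytes from the report, \374 (0xFC) and \347 (0xE7), decode to the
same code points under both functions, which is why a detector needs a
locale or an explicit preference to choose between the two encodings.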
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: Tassilo Horn @ 2014-12-16 19:22 UTC
To: Eli Zaretskii; +Cc: 19393

Eli Zaretskii <eliz@gnu.org> writes:

>> > However, according to the `file' command, the file is plain ISO-8859.
>>
>> Looks like some kind of bug, although with such a large file, it's not
>> easy to be sure.
>
> Actually, I don't think this is a bug.  There are ISO-8859-15
> characters in that file that are not part of ISO-8859-1, so Emacs will
> not detect that encoding unless either (a) your locale dictates that
> encoding,

It doesn't.

> or (b) you change the preferences to prefer ISO-8859-15.

Is there a way to prefer ISO-8859-15 over ISO-8859-1?  In the manual I
can only find the command `prefer-coding-system', which doesn't seem to
do what I want.  I want to reorder the "priority list for automatic
detection" so that ISO-8859-15 comes before ISO-8859-1 but UTF-8 is
still the very first entry (as dictated by my locale).

> This is so with any 8-bit encoding -- Emacs cannot easily distinguish
> between them, and needs some guidance.

Ok, I see.  And as Wolfgang said, some chars in the file are encoded
wrongly using Windows-1250.  That probably adds to the problem.

Thanks for the explanation!

Bye,
Tassilo
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: Tassilo Horn @ 2014-12-16 19:10 UTC
To: Eli Zaretskii; +Cc: 19393

Eli Zaretskii <eliz@gnu.org> writes:

>> I also can't force Emacs to use ISO-8859 for that or the original file.
>> `C-x RET f iso-8859-15 RET' results in a query that certain characters
>> cannot be encoded using latin-9, e.g., \374 and \347, and I'm expected
>> to choose another encoding.
>
> That's not how you force Emacs to use a specific encoding when
> visiting a file.  You should do this instead:
>
>   C-x RET c iso-8859-15 RET C-x C-f movies.list RET
>
> IOW, revisit the file, forcing Emacs to decode it as ISO-8859-15.
> (The same works with the original compressed file.)

Ah, indeed, that works.

Bye,
Tassilo
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: martin rudalics @ 2014-12-16 16:39 UTC
To: Tassilo Horn, 19393

> I've downloaded the following file
>
> ftp://ftp.fu-berlin.de/pub/misc/movies/database/movies.list.gz
>
> which contains all movies known to the international movie database
> (IMDb.com).  When I open that file using "emacs -Q movies.list.gz" (or
> unzip it first) and then do M-x describe-coding-system I can see that it
> is "t -- raw-text-unix".  As a result of this, the last movie in that
> file is displayed as "\374\347 (2012) 2012".

I usually delegate such problems to unicad.el.

martin
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: Tassilo Horn @ 2014-12-16 19:26 UTC
To: martin rudalics; +Cc: 19393

martin rudalics <rudalics@gmx.at> writes:

>> I've downloaded the following file
>>
>> ftp://ftp.fu-berlin.de/pub/misc/movies/database/movies.list.gz
>>
>> which contains all movies known to the international movie database
>> (IMDb.com).  When I open that file using "emacs -Q movies.list.gz" (or
>> unzip it first) and then do M-x describe-coding-system I can see that it
>> is "t -- raw-text-unix".  As a result of this, the last movie in that
>> file is displayed as "\374\347 (2012) 2012".
>
> I usually delegate such problems to unicad.el.

Indeed, with unicad.el loaded and enabled, the file is read as latin-9.

Thanks,
Tassilo
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: Andreas Schwab @ 2014-12-16 16:56 UTC
To: Tassilo Horn; +Cc: 19393

Tassilo Horn <tsdh@gnu.org> writes:

> However, according to the `file' command, the file is plain ISO-8859.

You can't take that seriously, since file doesn't check every
character in the file.

Andreas.

--
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: Wolfgang Jenkner @ 2014-12-16 18:49 UTC
To: Tassilo Horn; +Cc: 19393

On Tue, Dec 16 2014, Tassilo Horn wrote:

> I've downloaded the following file
>
> ftp://ftp.fu-berlin.de/pub/misc/movies/database/movies.list.gz
>
[...]
> I also can't force Emacs to use ISO-8859 for that or the original file.
> `C-x RET f iso-8859-15 RET' results in a query that certain characters
> cannot be encoded using latin-9, e.g., \374 and \347, and I'm expected
> to choose another encoding.
>
> So `file' and `iconv' say the file is valid latin-9 but Emacs seems to
> disagree.  Who is correct?  I tend towards file/iconv but I might be
> wrong.
>
> And shouldn't it be possible to force Emacs to a certain coding system?

Perhaps revert-buffer-with-coding-system will do what you want (i.e.,

  C-x <return> r l a t i n - 1 <return> y e s <return>

should show letters with diacritical marks properly, but it took about
20 minutes on my old dual-core k8 system).

In any case, some bisecting shows that the first problem is the line

  Jedna žena – jedan vek (2011)				2011

It seems to be encoded in Windows-1250 [1] instead.

The IMDb website [2] also has problems with this title (at least in
Firefox, the problematic letters seem to be missing somehow).

[1] https://en.wikipedia.org/wiki/Windows-1250
[2] http://www.imdb.com/title/tt2087826/keywords

Wolfgang
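
[Editorial note] The message above does not say how the bisecting was
done.  One rough way to locate such lines mechanically, sketched here under
stated assumptions, is to look for bytes in the range 0x80..0x9F:
ISO-8859-15 has only C1 control codes there, while Windows-1250 uses that
range for printable characters (0x9E is "ž" and 0x96 is the en dash, the
two non-ASCII characters in the title quoted above).  This is a heuristic
written for this note, not what Emacs, file, or iconv actually do, and the
function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Return the 1-based number of the first line in BUF (LEN bytes) that
   contains a byte in the range 0x80..0x9F, or 0 if there is none.
   (Hypothetical helper, not part of Emacs.)  */
size_t
first_c1_line (const unsigned char *buf, size_t len)
{
  size_t line = 1;

  assert (buf != NULL || len == 0);
  for (size_t i = 0; i < len; i++)
    {
      unsigned char c = buf[i];

      if (c == '\n')
        line++;                 /* the next byte starts a new line */
      else if (c >= 0x80 && c <= 0x9F)
        return line;            /* suspicious byte found on this line */
    }
  return 0;
}
```

Reading the uncompressed file into memory and passing it to first_c1_line
would give a line number to jump to with M-g g.  A result of 0 only means
that no byte fell into that range, not that the file is valid latin-9.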
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: Tassilo Horn @ 2014-12-16 19:36 UTC
To: Wolfgang Jenkner; +Cc: 19393

Wolfgang Jenkner <wjenkner@inode.at> writes:

>> And shouldn't it be possible to force Emacs to a certain coding system?
>
> Perhaps revert-buffer-with-coding-system will do what you want (i.e.,
>
>   C-x <return> r l a t i n - 1 <return> y e s <return>

Yes, that's the right command and not `C-x RET f' as I'd thought.

> should show letters with diacritical marks properly,

It does.

> but it took about 20 minutes on my old dual-core k8 system).

Here it took about 2 seconds and it's not that I own the first practical
quantum computer.

> In any case, some bisecting shows that the first problem is the line
>
>   Jedna žena – jedan vek (2011)				2011
>
> It seems to be encoded in Windows-1250 [1] instead.

Indeed.  How did you search for it?  I guess you didn't just scroll the
file with open eyes.

Bye,
Tassilo
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: Wolfgang Jenkner @ 2014-12-17 14:22 UTC
To: Tassilo Horn; +Cc: 19393

On Tue, Dec 16 2014, Tassilo Horn wrote:

>> but it took about 20 minutes on my old dual-core k8 system).
>
> Here it took about 2 seconds and it's not that I own the first practical
> quantum computer.

Thanks, that's strange...

Wolfgang
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: Eli Zaretskii @ 2014-12-17 15:50 UTC
To: Wolfgang Jenkner; +Cc: 19393, tsdh

> From: Wolfgang Jenkner <wjenkner@inode.at>
> Date: Wed, 17 Dec 2014 15:22:19 +0100
> Cc: 19393@debbugs.gnu.org
>
> On Tue, Dec 16 2014, Tassilo Horn wrote:
>
> >> but it took about 20 minutes on my old dual-core k8 system).
> >
> > Here it took about 2 seconds and it's not that I own the first practical
> > quantum computer.
>
> Thanks, that's strange...

What is the system where you observed the 20-minute delay?  And what
version of Emacs was that?
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: Wolfgang Jenkner @ 2014-12-17 16:02 UTC
To: Eli Zaretskii; +Cc: 19393, tsdh

On Wed, Dec 17 2014, Eli Zaretskii wrote:

> What is the system where you observed the 20-minute delay?  And what
> version of Emacs was that?

FreeBSD 10 on amd64, but the emacs versions I have are more than a month
old, so I'll bootstrap from a current git checkout and try again.
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: Eli Zaretskii @ 2014-12-17 17:03 UTC
To: Wolfgang Jenkner; +Cc: 19393, tsdh

> From: Wolfgang Jenkner <wjenkner@inode.at>
> Cc: 19393@debbugs.gnu.org, tsdh@gnu.org
> Date: Wed, 17 Dec 2014 17:02:07 +0100
>
> On Wed, Dec 17 2014, Eli Zaretskii wrote:
>
> > What is the system where you observed the 20-minute delay?  And what
> > version of Emacs was that?
>
> FreeBSD 10 on amd64

That's what I thought.  AFAIK, FreeBSD systems use mmap(2) explicitly
for buffer memory allocation, and that could be slow when we need to
repeatedly reallocate buffer text and memmove the text between old and
new.

> but the emacs versions I have are more than a month old, so I'll
> bootstrap from a current git checkout and try again.

If I'm right, this won't change the result.
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: Wolfgang Jenkner @ 2014-12-18 1:47 UTC
To: Eli Zaretskii; +Cc: 19393

On Wed, Dec 17 2014, Eli Zaretskii wrote:

>> On Wed, Dec 17 2014, Eli Zaretskii wrote:
>>
>> > What is the system where you observed the 20-minute delay?  And what
>> > version of Emacs was that?
>>
>> FreeBSD 10 on amd64
>
> That's what I thought.  AFAIK, FreeBSD systems use mmap(2) explicitly
> for buffer memory allocation, and that could be slow when we need to
> repeatedly reallocate buffer text and memmove the text between old and
> new.
>
>> but the emacs versions I have are more than a month old, so I'll
>> bootstrap from a current git checkout and try again.
>
> If I'm right, this won't change the result.

You are right, of course (it took around 15 minutes system+user time).

So, I tried

--8<---------------cut here---------------start------------->8---
diff --git a/configure.ac b/configure.ac
index 010abc8..de1c5e8 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2127,7 +2127,7 @@ fi
 
 use_mmap_for_buffers=no
 case "$opsys" in
-  cygwin|mingw32|freebsd|irix6-5) use_mmap_for_buffers=yes ;;
+  cygwin|mingw32|irix6-5) use_mmap_for_buffers=yes ;;
 esac
 
 AC_FUNC_MMAP
--8<---------------cut here---------------end--------------->8---

However, this still took around 10 minutes (I tested with emacs -Q in
both cases, of course).

I give samples of the recurring sequence of syscalls (as reported by
truss) in both cases below.

Here's the current default for FreeBSD.

  Should Emacs use the GNU version of malloc?             yes
  Should Emacs use a relocating allocator for buffers?    no
  Should Emacs use mmap(2) for buffer allocation?         yes

--8<---------------cut here---------------start------------->8---
sigprocmask(SIG_BLOCK,SIGINT|SIGALRM,0x0) = 0 (0x0)
clock_gettime(0,{1418846146.702726599 }) = 0 (0x0)
ktimer_settime(0x3,0x1,0x7ffffffece50,0x0,0x0,0x0) = 0 (0x0)
sigprocmask(SIG_SETMASK,0x0,SIGINT|SIGALRM) = 0 (0x0)
nanosleep({0.000001000 }) = 0 (0x0)
mmap(0x0,28815360,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34579980288 (0x80d20a000)
munmap(0x8108ec000,28798976) = 0 (0x0)
read(9,"\t????\n"Esperan\M-ga" (2002) {("...,65536) = 65536 (0x10000)
mmap(0x0,28831744,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34608795648 (0x80ed85000)
munmap(0x80d20a000,28815360) = 0 (0x0)
mmap(0x0,28848128,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34637627392 (0x810904000)
munmap(0x80ed85000,28831744) = 0 (0x0)
mmap(0x0,28864512,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34579980288 (0x80d20a000)
munmap(0x810904000,28848128) = 0 (0x0)
mmap(0x0,28880896,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34608844800 (0x80ed91000)
munmap(0x80d20a000,28864512) = 0 (0x0)
read(9," SportsCentury" (1999) {Seabiscu"...,65536) = 65536 (0x10000)
mmap(0x0,28897280,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34637725696 (0x81091c000)
munmap(0x80ed91000,28880896) = 0 (0x0)
mmap(0x0,28913664,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34579980288 (0x80d20a000)
munmap(0x81091c000,28897280) = 0 (0x0)
mmap(0x0,28930048,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34608893952 (0x80ed9d000)
munmap(0x80d20a000,28913664) = 0 (0x0)
mmap(0x0,28946432,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34637824000 (0x810934000)
munmap(0x80ed9d000,28930048) = 0 (0x0)
read(9,"a es mi historia" (2001) {La vid"...,65536) = 65536 (0x10000)
mmap(0x0,28962816,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34579980288 (0x80d20a000)
munmap(0x810934000,28946432) = 0 (0x0)
mmap(0x0,28979200,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34608943104 (0x80eda9000)
munmap(0x80d20a000,28962816) = 0 (0x0)
mmap(0x0,28999680,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34637922304 (0x81094c000)
munmap(0x80eda9000,28979200) = 0 (0x0)
mmap(0x0,29016064,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34579980288 (0x80d20a000)
munmap(0x81094c000,28999680) = 0 (0x0)
read(9,"\t1999\n"Esti showder" (1999) {("...,65536) = 65536 (0x10000)
mmap(0x0,29032448,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34608996352 (0x80edb6000)
munmap(0x80d20a000,29016064) = 0 (0x0)
mmap(0x0,29048832,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34638028800 (0x810966000)
munmap(0x80edb6000,29032448) = 0 (0x0)
mmap(0x0,29065216,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34579980288 (0x80d20a000)
munmap(0x810966000,29048832) = 0 (0x0)
mmap(0x0,29081600,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34609045504 (0x80edc2000)
munmap(0x80d20a000,29065216) = 0 (0x0)
read(9,"en Cuba}\t\t1978\n"Estudio 1" (1"...,65536) = 65536 (0x10000)
mmap(0x0,29097984,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34638127104 (0x81097e000)
munmap(0x80edc2000,29081600) = 0 (0x0)
mmap(0x0,29114368,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34579980288 (0x80d20a000)
munmap(0x81097e000,29097984) = 0 (0x0)
mmap(0x0,29130752,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34609094656 (0x80edce000)
munmap(0x80d20a000,29114368) = 0 (0x0)
mmap(0x0,29147136,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34638225408 (0x810996000)
munmap(0x80edce000,29130752) = 0 (0x0)
read(9,"07\n"Eterna Magia" (2007) {(2007"...,65536) = 65536 (0x10000)
mmap(0x0,29163520,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34579980288 (0x80d20a000)
munmap(0x810996000,29147136) = 0 (0x0)
mmap(0x0,29179904,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34609143808 (0x80edda000)
munmap(0x80d20a000,29163520) = 0 (0x0)
mmap(0x0,29196288,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34638323712 (0x8109ae000)
munmap(0x80edda000,29179904) = 0 (0x0)
mmap(0x0,29212672,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34579980288 (0x80d20a000)
munmap(0x8109ae000,29196288) = 0 (0x0)
read(9,")}\t1991\n"Eva y Ad\M-an, agenci"...,65536) = 65536 (0x10000)
mmap(0x0,29229056,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34609192960 (0x80ede6000)
munmap(0x80d20a000,29212672) = 0 (0x0)
mmap(0x0,29245440,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34638422016 (0x8109c6000)
munmap(0x80ede6000,29229056) = 0 (0x0)
mmap(0x0,29261824,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34579980288 (0x80d20a000)
munmap(0x8109c6000,29245440) = 0 (0x0)
mmap(0x0,29278208,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34609242112 (0x80edf2000)
munmap(0x80d20a000,29261824) = 0 (0x0)
read(9,"A. (#9.4)}\t2004\n"Everybody Lov"...,65536) = 65536 (0x10000)
mmap(0x0,29294592,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34638520320 (0x8109de000)
munmap(0x80edf2000,29278208) = 0 (0x0)
mmap(0x0,29310976,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34579980288 (0x80d20a000)
munmap(0x8109de000,29294592) = 0 (0x0)
mmap(0x0,29327360,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34609291264 (0x80edfe000)
munmap(0x80d20a000,29310976) = 0 (0x0)
mmap(0x0,29343744,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34638618624 (0x8109f6000)
munmap(0x80edfe000,29327360) = 0 (0x0)
read(9,"\t\t1988\n"Everyman" (1977) {Who"...,65536) = 65536 (0x10000)
mmap(0x0,29360128,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34579980288 (0x80d20a000)
munmap(0x8109f6000,29343744) = 0 (0x0)
mmap(0x0,29376512,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34609340416 (0x80ee0a000)
munmap(0x80d20a000,29360128) = 0 (0x0)
mmap(0x0,29392896,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34638716928 (0x810a0e000)
munmap(0x80ee0a000,29376512) = 0 (0x0)
mmap(0x0,29409280,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34579980288 (0x80d20a000)
munmap(0x810a0e000,29392896) = 0 (0x0)
read(9,"xclusive" (1997) {(#1.1)}\t\t\t"...,65536) = 65536 (0x10000)
mmap(0x0,29425664,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34609389568 (0x80ee16000)
munmap(0x80d20a000,29409280) = 0 (0x0)
mmap(0x0,29442048,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34638815232 (0x810a26000)
munmap(0x80ee16000,29425664) = 0 (0x0)
mmap(0x0,29458432,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34579980288 (0x80d20a000)
munmap(0x810a26000,29442048) = 0 (0x0)
mmap(0x0,29474816,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34609438720 (0x80ee22000)
munmap(0x80d20a000,29458432) = 0 (0x0)
read(9,"\n"Explorers: Adventures of the "...,65536) = 65536 (0x10000)
mmap(0x0,29491200,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34638913536 (0x810a3e000)
munmap(0x80ee22000,29474816) = 0 (0x0)
mmap(0x0,29507584,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34579980288 (0x80d20a000)
munmap(0x810a3e000,29491200) = 0 (0x0)
mmap(0x0,29523968,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34609487872 (0x80ee2e000)
munmap(0x80d20a000,29507584) = 0 (0x0)
mmap(0x0,29540352,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34639011840 (0x810a56000)
munmap(0x80ee2e000,29523968) = 0 (0x0)
read(9,"xtra" (1994) {(2011-05-03)}\t\t"...,65536) = 65536 (0x10000)
mmap(0x0,29556736,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34579980288 (0x80d20a000)
munmap(0x810a56000,29540352) = 0 (0x0)
mmap(0x0,29573120,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34609537024 (0x80ee3a000)
SIGNAL 14 (SIGALRM)
sigprocmask(SIG_SETMASK,SIGINT|SIGQUIT|SIGALRM|SIGCHLD|SIGIO|SIGPROF|SIGWINCH,0x0) = 0 (0x0)
sigreturn(0x7ffffffec630,0x7ffffffec630,0x301,0x0,0xfffffffffffffbc0,0x0) = 34609537064 (0x80ee3a028)
munmap(0x80d20a000,29556736) = 0 (0x0)
recvmsg(0x6,0x7ffffffecb80,0x0,0x1000,0x1c30000,0x0) ERR#35 'Resource temporarily unavailable'
--8<---------------cut here---------------end--------------->8---

And here's the version with the patch above applied.

  Should Emacs use the GNU version of malloc?             yes
  Should Emacs use a relocating allocator for buffers?    yes
  Should Emacs use mmap(2) for buffer allocation?         no

--8<---------------cut here---------------start------------->8---
sigprocmask(SIG_BLOCK,SIGINT|SIGALRM,0x0) = 0 (0x0)
clock_gettime(0,{1418846834.087766996 }) = 0 (0x0)
ktimer_settime(0x3,0x1,0x7ffffffece90,0x0,0x0,0xd10fe8) = 0 (0x0)
sigprocmask(SIG_SETMASK,0x0,SIGINT|SIGALRM) = 0 (0x0)
nanosleep({0.000001000 }) = 0 (0x0)
read(5,"n the Family" (1971) {Archie See"...,65536) = 65536 (0x10000)
break(0xeb9a000) = 0 (0x0)
read(5,"en" (1970) {(#1.5899)}\t\t\t1992"...,65536) = 65536 (0x10000)
break(0xebaa000) = 0 (0x0)
read(5,"\t\t2008\n"All My Children" (197"...,65536) = 65536 (0x10000)
break(0xebba000) = 0 (0x0)
read(5,"(1998) {False Convictions (#8.15"...,65536) = 65536 (0x10000)
break(0xebca000) = 0 (0x0)
read(5,"la lei\M-p" (2008) {Fimmti \M-~"...,65536) = 65536 (0x10000)
break(0xebda000) = 0 (0x0)
read(5,"014\n"Allt f\M-vr Sverige" (2011"...,65536) = 65536 (0x10000)
SIGNAL 14 (SIGALRM)
sigreturn(0x7ffffffeca70,0x10003,0x7ffffffeca70,0x7ffffffed478,0x41e8,0xd10fe8) = 25710504 (0x1884fa8)
recvmsg(0x4,0x7ffffffecbc0,0x0,0x1000,0x41e8,0xd10fe8) ERR#35 'Resource temporarily unavailable'
--8<---------------cut here---------------end--------------->8---
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: Eli Zaretskii @ 2014-12-18 16:22 UTC
To: Wolfgang Jenkner; +Cc: 19393

> From: Wolfgang Jenkner <wjenkner@inode.at>
> Cc: 19393@debbugs.gnu.org
> Date: Thu, 18 Dec 2014 02:47:41 +0100
>
> > That's what I thought.  AFAIK, FreeBSD systems use mmap(2) explicitly
> > for buffer memory allocation, and that could be slow when we need to
> > repeatedly reallocate buffer text and memmove the text between old and
> > new.
> >
> >> but the emacs versions I have are more than a month old, so I'll
> >> bootstrap from a current git checkout and try again.
> >
> > If I'm right, this won't change the result.
>
> You are right, of course (it took around 15 minutes system+user time).
>
> So, I tried
>
> --8<---------------cut here---------------start------------->8---
> diff --git a/configure.ac b/configure.ac
> index 010abc8..de1c5e8 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -2127,7 +2127,7 @@ fi
>
>  use_mmap_for_buffers=no
>  case "$opsys" in
> -  cygwin|mingw32|freebsd|irix6-5) use_mmap_for_buffers=yes ;;
> +  cygwin|mingw32|irix6-5) use_mmap_for_buffers=yes ;;
>  esac
>
>  AC_FUNC_MMAP
> --8<---------------cut here---------------end--------------->8---
>
> However, this still took around 10 minutes (I tested with emacs -Q in
> both cases, of course).

That's expected: when you disable mmap, Emacs uses ralloc.c, which
still has this problem.

Btw, is this with the compressed file or after decompressing it?  My
guess is the former.
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: Wolfgang Jenkner @ 2014-12-18 16:36 UTC
To: Eli Zaretskii; +Cc: 19393

On Thu, Dec 18 2014, Eli Zaretskii wrote:

>> -  cygwin|mingw32|freebsd|irix6-5) use_mmap_for_buffers=yes ;;
>> +  cygwin|mingw32|irix6-5) use_mmap_for_buffers=yes ;;
[...]
>> However, this still took around 10 minutes (I tested with emacs -Q in
>> both cases, of course).
>
> That's expected: when you disable mmap, Emacs uses ralloc.c, which
> still has this problem.

Shouldn't other systems for which the native malloc is not used have
a similar problem then?

> Btw, is this with the compressed file or after decompressing it?  My
> guess is the former.

No, with the uncompressed file.
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: Eli Zaretskii @ 2014-12-18 17:34 UTC
To: Wolfgang Jenkner; +Cc: 19393

> From: Wolfgang Jenkner <wjenkner@inode.at>
> Cc: 19393@debbugs.gnu.org
> Date: Thu, 18 Dec 2014 17:36:19 +0100
>
> On Thu, Dec 18 2014, Eli Zaretskii wrote:
>
> >> -  cygwin|mingw32|freebsd|irix6-5) use_mmap_for_buffers=yes ;;
> >> +  cygwin|mingw32|irix6-5) use_mmap_for_buffers=yes ;;
> [...]
> >> However, this still took around 10 minutes (I tested with emacs -Q in
> >> both cases, of course).
> >
> > That's expected: when you disable mmap, Emacs uses ralloc.c, which
> > still has this problem.
>
> Shouldn't other systems for which the native malloc is not used have
> a similar problem then?

There are almost none of them.  But yes, those which do should have a
similar problem.

> > Btw, is this with the compressed file or after decompressing it?  My
> > guess is the former.
>
> No, with the uncompressed file.

Then it's probably some inefficiency in insert-file-contents, when it
is called to revert a buffer.  If you have time, please take a look at
what happens there; I suspect we reallocate the buffer in very small
chunks, instead of doing it with larger increments.  (With compressed
files, it's hard to do, because the size of the uncompressed file is
not known in advance.)

Thanks.
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: Wolfgang Jenkner @ 2014-12-20 3:21 UTC
To: Eli Zaretskii; +Cc: 19393

On Thu, Dec 18 2014, Eli Zaretskii wrote:

> Then it's probably some inefficiency in insert-file-contents, when it
> is called to revert a buffer.  If you have time, please take a look at
> what happens there; I suspect we reallocate the buffer in very small
> chunks, instead of doing it with larger increments.  (With compressed
> files, it's hard to do, because the size of the uncompressed file is
> not known in advance.)

I have been looking into this with dtrace, and what is certain is that
a large amount of data (increasing up to the order of magnitude of the
buffer size) is memcpy'd again and again as a result of mmap_realloc
being called by enlarge_buffer_text.  Apparently, the latter is called
for buffer gap handling, which is triggered by decode_coding_c_string
(or rather decode_coding_object) in insert-file-contents.  So it seems
that the effect on memory of this innocent-looking loop there is
enormously magnified.

But I have to look at the source more closely (not that I expect to get
any idea how to fix this, though).
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
From: Eli Zaretskii @ 2014-12-20 7:27 UTC
To: Wolfgang Jenkner; +Cc: 19393

> From: Wolfgang Jenkner <wjenkner@inode.at>
> Cc: 19393@debbugs.gnu.org
> Date: Sat, 20 Dec 2014 04:21:54 +0100
>
> I have been looking into this with dtrace and what is sure is that
> a large amount of data (increasing up to the order of magnitude of the
> buffer size) is memcpy'd again and again as a result of mmap_realloc
> being called by enlarge_buffer_text.  Apparently, the latter is called
> for buffer gap handling which is triggered by decode_coding_c_string (or
> rather decode_coding_object) in insert-file-contents.  So it seems that
> the effect on memory of this innocent-looking loop there is enormously
> magnified.

Yes, that'd be my guess for the reason.

> But I have to look at the source more closely (not that I expect to get
> any idea how to fix this, though).

Since we know the size of the file, we could perhaps compute the new
buffer size up front (taking some conservative approximations, if
needed), and mmap_realloc it only once.
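
[Editorial note] The cost being discussed can be estimated with a little
arithmetic.  If a buffer grows in fixed steps and every step moves the old
contents to a freshly mapped region, the bytes copied add up to
step * n * (n - 1) / 2 for n steps, i.e. quadratic in the final size,
whereas a single allocation up front copies nothing.  The sketch below is a
simplified model, not Emacs code: the function name is hypothetical, and
the 16 KiB step is only an assumption read off the mmap sizes in the truss
output earlier in the thread (most of them grow by 16384 bytes).

```c
#include <assert.h>
#include <stdint.h>

/* Total bytes copied while growing a buffer from 0 to at least TOTAL
   bytes in steps of STEP bytes, assuming every growth step allocates
   a new region and copies the old contents into it.
   (Hypothetical model, not an Emacs function.)  */
uint64_t
bytes_copied_growing (uint64_t total, uint64_t step)
{
  uint64_t size = 0;
  uint64_t copied = 0;

  assert (step > 0);
  while (size < total)
    {
      copied += size;   /* old contents are moved to the new region */
      size += step;     /* the new region is one step larger */
    }
  return copied;
}
```

For a hypothetical 32 MiB buffer grown in 16 KiB steps, this model gives
bytes_copied_growing (32 << 20, 16384) == 34342961152, about 32 GiB of
copying, while calling it with step equal to total returns 0.  Real
behaviour also depends on the gap handling and on whether a region can be
enlarged in place, so the figure is an illustration only.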
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
  2014-12-20 7:27 ` Eli Zaretskii
@ 2015-01-13 14:06 ` Wolfgang Jenkner
  2015-01-13 16:25 ` Eli Zaretskii
  ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Wolfgang Jenkner @ 2015-01-13 14:06 UTC
  To: Eli Zaretskii; +Cc: 19393

[-- Attachment #1: Type: text/plain, Size: 1485 bytes --]

Here's a simple change in src/buffer.c that reduces the time to six
seconds or so, but only for newer versions of FreeBSD.

It takes advantage of the MAP_EXCL flag for mmap(2), which has been
recently added[1] and is also available in 10-STABLE and 10.1-RELEASE.

In percentage of user CPU time, the hotuser script[2] from the dtrace
toolkit shows a change from

[...]
emacs-25.0.50.1`decode_coding              537    0.1%
emacs-25.0.50.1`produce_chars             2109    0.4%
emacs-25.0.50.1`decode_coding_charset     2544    0.5%
libc.so.7`memcpy                        516884   98.9%

to

[...]
libc.so.7`memcpy                           220    4.1%
bootstrap-emacs`decode_coding              488    9.0%
bootstrap-emacs`produce_chars             2100   38.8%
bootstrap-emacs`decode_coding_charset     2501   46.2%

(the second column counts sample points, of which there are 1001 per
second for each CPU core)

The numbers are for the system compiler (clang 3.4.1) with default
optimizations, though they are even a bit better for gcc 4.9.

However, if the file in question is compressed
revert-buffer-with-coding-system still takes 4 minutes (the user time
being dominated to 98% by memmove).

[1] https://svnweb.freebsd.org/base?view=revision&revision=267630
[2] https://svnweb.freebsd.org/base/stable/10/cddl/contrib/dtracetoolkit/hotuser?revision=256281&view=co

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Use MAP_EXCL mmap flag. --]
[-- Type: text/x-diff, Size: 2394 bytes --]

From b0233ff2274e554339da3c4606ff7fb5fc961e82 Mon Sep 17 00:00:00 2001
From: Wolfgang Jenkner <wjenkner@inode.at>
Date: Tue, 23 Dec 2014 01:50:10 +0100
Subject: [PATCH] Actually use mmap_enlarge for FreeBSD 10.1 or newer.

* src/buffer.c (MAP_EXCL): Make sure it is always defined.
(MMAP_ALLOCATED_P, mmap_enlarge): Use it.
This alleviates a performance problem due to excessive use of
memcpy(3). (Bug#19393)
---
 src/ChangeLog |  8 ++++++++
 src/buffer.c  | 15 ++++++++++++---
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index 252dfd3..b526e28 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,11 @@
+2014-12-24 Wolfgang Jenkner <wjenkner@inode.at>
+
+ Actually use mmap_enlarge for FreeBSD 10.1 or newer.
+ * buffer.c (MAP_EXCL): Make sure it is always defined.
+ (MMAP_ALLOCATED_P, mmap_enlarge): Use it.
+ This alleviates a performance problem due to excessive use of
+ memcpy(3). (Bug#19393)
+
 2015-01-12 Paul Eggert <eggert@cs.ucla.edu>

  Port to 32-bit MingGW --with-wide-int
diff --git a/src/buffer.c b/src/buffer.c
index d0ffe67d9..8a97f3d 100644
--- a/src/buffer.c
+++ b/src/buffer.c
@@ -4683,10 +4683,19 @@ static bool mmap_initialized_p;
    Default is to conservatively assume the address range is occupied by
    something else. This can be overridden by system configuration
-   files if system-specific means to determine this exists. */
+   files if system-specific means to determine this exists.
+
+   However, if MAP_EXCL is defined assume that it is an mmap flag
+   which, combined with MAP_FIXED, has FreeBSD semantics, viz., the
+   mapping request will fail if a mapping already exists within the
+   range (the flag was first present in release 10.1). */
+
+#ifndef MAP_EXCL
+#define MAP_EXCL 0
+#endif

 #ifndef MMAP_ALLOCATED_P
-#define MMAP_ALLOCATED_P(start, end) 1
+#define MMAP_ALLOCATED_P(start, end) (!MAP_EXCL)
 #endif

 /* Perform necessary initializations for the use of mmap. */
@@ -4770,7 +4779,7 @@ mmap_enlarge (struct mmap_region *r, int npages)
       void *p;

       p = mmap (region_end, nbytes, PROT_READ | PROT_WRITE,
-                MAP_ANON | MAP_PRIVATE | MAP_FIXED, mmap_fd, 0);
+                MAP_ANON | MAP_EXCL | MAP_PRIVATE | MAP_FIXED, mmap_fd, 0);
       if (p == MAP_FAILED)
         ; /* fprintf (stderr, "mmap: %s\n", emacs_strerror (errno)); */
       else if (p != region_end)
--
2.2.1

* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
  2015-01-13 14:06 ` Wolfgang Jenkner
@ 2015-01-13 16:25 ` Eli Zaretskii
  2015-01-13 17:12 ` Wolfgang Jenkner
  2015-01-14 19:41 ` Wolfgang Jenkner
  2020-09-07 21:30 ` Lars Ingebrigtsen
  2 siblings, 1 reply; 33+ messages in thread
From: Eli Zaretskii @ 2015-01-13 16:25 UTC
  To: Wolfgang Jenkner; +Cc: 19393

> From: Wolfgang Jenkner <wjenkner@inode.at>
> Cc: 19393@debbugs.gnu.org
> Date: Tue, 13 Jan 2015 15:06:01 +0100
>
> However, if the file in question is compressed
> revert-buffer-with-coding-system still takes 4 minutes (the user time
> being dominated to 98% by memmove).

Is the problem with compressed files due to the fact that the size is
unknown in advance? If so, perhaps enlarging by more than was
requested (e.g., twice as large) will alleviate the problem?

Thanks.

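Enlarging by more than was requested is the usual geometric-growth strategy: if the capacity at least doubles each time, the total amount of data copied stays proportional to the final size rather than growing quadratically. A minimal sketch follows; struct gbuf and gbuf_append are hypothetical names for illustration and have nothing to do with the Emacs buffer code, and overflow of the size computation is deliberately ignored.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer; REALLOCS counts how often the
   storage had to be enlarged.  Initialize with all fields zero.  */
struct gbuf
{
  unsigned char *data;
  size_t len, cap;
  size_t reallocs;
};

/* Append N bytes from SRC, doubling the capacity when space runs out.
   Returns 0 on success, -1 if memory could not be allocated.
   Size overflow is not checked in this sketch.  */
int
gbuf_append (struct gbuf *b, const void *src, size_t n)
{
  if (n > b->cap - b->len)
    {
      size_t need = b->len + n;
      size_t cap = b->cap ? b->cap : 64;
      while (cap < need)
        cap *= 2;               /* grow by more than was requested */
      unsigned char *p = realloc (b->data, cap);
      if (p == NULL)
        return -1;
      b->data = p;
      b->cap = cap;
      b->reallocs++;
    }
  memcpy (b->data + b->len, src, n);
  b->len += n;
  return 0;
}
```

Appending many small chunks this way triggers only a logarithmic number of enlargements, which is why the approach helps when the final size is not known in advance.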
* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
  2015-01-13 16:25 ` Eli Zaretskii
@ 2015-01-13 17:12 ` Wolfgang Jenkner
  2015-01-13 17:31 ` Eli Zaretskii
  0 siblings, 1 reply; 33+ messages in thread
From: Wolfgang Jenkner @ 2015-01-13 17:12 UTC
  To: Eli Zaretskii; +Cc: 19393

On Tue, Jan 13 2015, Eli Zaretskii wrote:

>> From: Wolfgang Jenkner <wjenkner@inode.at>
>> Cc: 19393@debbugs.gnu.org
>> Date: Tue, 13 Jan 2015 15:06:01 +0100
>>
>> However, if the file in question is compressed
>> revert-buffer-with-coding-system still takes 4 minutes (the user time
>> being dominated to 98% by memmove).
>
> Is the problem with compressed files due to the fact that the size is
> unknown in advance?

I only know that loading the compressed file from disk with the same
coding system conversion as above takes just a few seconds, i.e., doing
something like

C-x RET c l a t i n - 1 <return> C-x C-f m o v i e s . l i s t . g z

is fast (enough).

> If so, perhaps enlarging by more than was
> requested (e.g., twice as large) will alleviate the problem?

IIUC, this is your previous suggestion about improving
insert-file-contents itself?

* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
  2015-01-13 17:12 ` Wolfgang Jenkner
@ 2015-01-13 17:31 ` Eli Zaretskii
  0 siblings, 0 replies; 33+ messages in thread
From: Eli Zaretskii @ 2015-01-13 17:31 UTC
  To: Wolfgang Jenkner; +Cc: 19393

> From: Wolfgang Jenkner <wjenkner@inode.at>
> Cc: 19393@debbugs.gnu.org
> Date: Tue, 13 Jan 2015 18:12:54 +0100
>
> > Is the problem with compressed files due to the fact that the size is
> > unknown in advance?
>
> I only know that loading the compressed file from disk with the same
> coding system conversion as above takes just a few seconds, i.e., doing
> something like
>
> C-x RET c l a t i n - 1 <return> C-x C-f m o v i e s . l i s t . g z
>
> is fast (enough).

Then it's probably not what I had in mind.

> > If so, perhaps enlarging by more than was
> > requested (e.g., twice as large) will alleviate the problem?
>
> IIUC, this is your previous suggestion about improving
> insert-file-contents itself?

According to what you see, it sounds like determining the encoding is
what takes the time here, for some reason triggering massive
memmove's.

* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
  2015-01-13 14:06 ` Wolfgang Jenkner
  2015-01-13 16:25 ` Eli Zaretskii
@ 2015-01-14 19:41 ` Wolfgang Jenkner
  2015-01-15 13:38 ` Wolfgang Jenkner
  2020-09-07 21:30 ` Lars Ingebrigtsen
  2 siblings, 1 reply; 33+ messages in thread
From: Wolfgang Jenkner @ 2015-01-14 19:41 UTC
  To: Eli Zaretskii; +Cc: 19393

On Tue, Jan 13 2015, Wolfgang Jenkner wrote:

> Here's a simple change in src/buffer.c that reduces the time to six
> seconds or so, but only for newer versions of FreeBSD.
>
> It takes advantage of the MAP_EXCL flag for mmap(2), which has been
> recently added[1] and is also available in 10-STABLE and 10.1-RELEASE.

There remains the problem, though, that emacs on FreeBSD also uses
gmalloc and hence, IIUC, sbrk() for memory allocation, and at this point
I'm too ignorant about almost everything involved here to be confident
that mmap()ed pages can't overlap with the process (BSS) data segment
when MAP_EXCL | MAP_FIXED is among the flags.

Without the MAP_EXCL mmap flag they definitely can overlap, as the
following test program shows when it is _statically_ linked.

Here's the output when I run it:

r0 = 0x800663000
Cannot allocate memory
r2 = 0x800662000

-- >8 --
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/mman.h>
#include <errno.h>

int
main ()
{
        int n;
        void *r0, *r1, *r2;

        n = getpagesize();
        r0 = mmap(NULL, n, PROT_READ | PROT_WRITE, MAP_ANON, -1, 0);
        if (r0 == MAP_FAILED || brk(r0) != 0 || sbrk(0) != r0)
                return (1);
        fprintf(stderr, "r0 = %p\n", r0);
        errno = 0;
        r1 = mmap(r0 - n, n, PROT_READ | PROT_WRITE,
            MAP_ANON | MAP_EXCL | MAP_FIXED, -1, 0);
        if (r1 == MAP_FAILED)
                perror(NULL);
        else
                fprintf(stderr, "r1 = %p\n", r1);
        errno = 0;
        r2 = mmap(r0 - n, n, PROT_READ | PROT_WRITE,
            MAP_ANON | MAP_FIXED, -1, 0);
        if (r2 == MAP_FAILED)
                perror(NULL);
        else
                fprintf(stderr, "r2 = %p\n", r2);
        return (0);
}

* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
  2015-01-14 19:41 ` Wolfgang Jenkner
@ 2015-01-15 13:38 ` Wolfgang Jenkner
  2015-01-15 16:08 ` Stefan Monnier
  0 siblings, 1 reply; 33+ messages in thread
From: Wolfgang Jenkner @ 2015-01-15 13:38 UTC
  To: Eli Zaretskii; +Cc: 19393

On Wed, Jan 14 2015, Wolfgang Jenkner wrote:

> There remains the problem, though, that emacs on FreeBSD also uses
> gmalloc and hence, IIUC, sbrk() for memory allocation, and at this point
> I'm too ignorant about almost everything involved here to be confident
> that mmap()ed pages can't overlap with the process (BSS) data segment
> when MAP_EXCL | MAP_FIXED is among the flags.
>
> Without the MAP_EXCL mmap flag they definitely can overlap, as the
> following test program shows when it is _statically_ linked.
>
> Here's the output when I run it:
>
> r0 = 0x800663000
> Cannot allocate memory
> r2 = 0x800662000

However, I somehow forgot that, quite contrary to my test program,
src/buffer.c would use MAP_FIXED only when trying to add some other
pages on top of an existing region, the beginning of which was mmap'd
without MAP_FIXED. Hence the new region could only reach into the data
segment if the old one was already there. That is, the patch doesn't
change the current situation in this regard.

So I think that the patch would be OK, after all.

* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
  2015-01-15 13:38 ` Wolfgang Jenkner
@ 2015-01-15 16:08 ` Stefan Monnier
  2015-01-15 17:00 ` Wolfgang Jenkner
  0 siblings, 1 reply; 33+ messages in thread
From: Stefan Monnier @ 2015-01-15 16:08 UTC
  To: Wolfgang Jenkner; +Cc: 19393

> However, I somehow forgot that, quite contrary to my test program,
> src/buffer.c would use MAP_FIXED only when trying to add some other
> pages on top of an existing region, the beginning of which was mmap'd
> without MAP_FIXED. Hence the new region could only reach into the data
> segment if the old one was already there. That is, the patch doesn't
> change the current situation in this regard.

> So I think that the patch would be OK, after all.

Thanks Wolfgang for looking into this. I'm really unfamiliar with that
code, so I can't help much, but hopefully someone else will be able to
take care of your patch,

Stefan

* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
  2015-01-15 16:08 ` Stefan Monnier
@ 2015-01-15 17:00 ` Wolfgang Jenkner
  0 siblings, 0 replies; 33+ messages in thread
From: Wolfgang Jenkner @ 2015-01-15 17:00 UTC
  To: Stefan Monnier; +Cc: 19393

On Thu, Jan 15 2015, Stefan Monnier wrote:

> but hopefully someone else will be able to
> take care of your patch,

You gave me a commit bit (but I haven't been very active since then)...

* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
  2015-01-13 14:06 ` Wolfgang Jenkner
  2015-01-13 16:25 ` Eli Zaretskii
  2015-01-14 19:41 ` Wolfgang Jenkner
@ 2020-09-07 21:30 ` Lars Ingebrigtsen
  2020-09-10 0:43 ` Wolfgang Jenkner
  2 siblings, 1 reply; 33+ messages in thread
From: Lars Ingebrigtsen @ 2020-09-07 21:30 UTC
  To: Wolfgang Jenkner; +Cc: 19393

Wolfgang Jenkner <wjenkner@inode.at> writes:

> * src/buffer.c (MAP_EXCL): Make sure it is always defined.
> (MMAP_ALLOCATED_P, mmap_enlarge): Use it.
> This alleviates a performance problem due to excessive use of
> memcpy(3). (Bug#19393)

[...]

> - MAP_ANON | MAP_PRIVATE | MAP_FIXED, mmap_fd, 0);
> + MAP_ANON | MAP_EXCL | MAP_PRIVATE | MAP_FIXED, mmap_fd, 0);

This patch apparently made loading huge files on FreeBSD a lot faster,
but as far as I can tell, it was never applied.

This was five years ago, though -- Wolfgang, is this still a problem on
FreeBSD?

--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no

* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
  2020-09-07 21:30 ` Lars Ingebrigtsen
@ 2020-09-10 0:43 ` Wolfgang Jenkner
  2020-09-10 13:17 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 33+ messages in thread
From: Wolfgang Jenkner @ 2020-09-10 0:43 UTC
  To: Lars Ingebrigtsen; +Cc: 19393

Lars Ingebrigtsen <larsi@gnus.org> wrote:

> This was five years ago, though -- Wolfgang, is this still a problem on
> FreeBSD?

No, AFAICT.

For the last four years or so, FreeBSD (like other non-glibc based
systems) has been able to use its native libc malloc instead of the
bundled gmalloc (first via HYBRID_MALLOC and now thanks to pdumper).

The test case described above in this bug report now takes only a few
seconds (both with or without compression).

My patch above should be consigned to oblivion.

* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
  2020-09-10 0:43 ` Wolfgang Jenkner
@ 2020-09-10 13:17 ` Lars Ingebrigtsen
  0 siblings, 0 replies; 33+ messages in thread
From: Lars Ingebrigtsen @ 2020-09-10 13:17 UTC
  To: Wolfgang Jenkner; +Cc: 19393

Wolfgang Jenkner <wjenkner@inode.at> writes:

> Lars Ingebrigtsen <larsi@gnus.org> wrote:
>
>> This was five years ago, though -- Wolfgang, is this still a problem on
>> FreeBSD?
>
> No, AFAICT.
>
> For the last four years or so, FreeBSD (like other non-glibc based
> systems) has been able to use its native libc malloc instead of the
> bundled gmalloc (first via HYBRID_MALLOC and now thanks to pdumper).
>
> The test case described above in this bug report now takes only a few
> seconds (both with or without compression).
>
> My patch above should be consigned to oblivion.

OK. :-) Closing this bug report.

--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no

* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
  2014-12-16 19:36 ` Tassilo Horn
  2014-12-17 14:22 ` Wolfgang Jenkner
@ 2014-12-17 15:12 ` Wolfgang Jenkner
  2014-12-17 15:46 ` Tassilo Horn
  1 sibling, 1 reply; 33+ messages in thread
From: Wolfgang Jenkner @ 2014-12-17 15:12 UTC
  To: Tassilo Horn; +Cc: 19393

On Tue, Dec 16 2014, Tassilo Horn wrote:

>> In any case, some bisecting shows that the first problem is the line
>>
>> Jedna žena – jedan vek (2011) 2011
>>
>> It seems to be encoded in Windows-1250 [1] instead.
>
> Indeed. How did you search for it? I guess you didn't just scroll the
> file with open eye.

Bisecting (to base 10 ;-)

$ cp movies.list /tmp/bad && cd /tmp

Then repeat the following 5 or 6 times.

$ split -n10 bad
$ emacs -Q x*
$ cp x... bad
$ rm x*

Just look for the indication of the buffer coding system in the mode
line to find the first bad file at each step.

Wolfgang

* bug#19393: 25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
  2014-12-17 15:12 ` Wolfgang Jenkner
@ 2014-12-17 15:46 ` Tassilo Horn
  0 siblings, 0 replies; 33+ messages in thread
From: Tassilo Horn @ 2014-12-17 15:46 UTC
  To: Wolfgang Jenkner; +Cc: 19393

Wolfgang Jenkner <wjenkner@inode.at> writes:

>> Indeed. How did you search for it? I guess you didn't just scroll the
>> file with open eye.
>
> Bisecting (to base 10 ;-)
>
> $ cp movies.list /tmp/bad && cd /tmp
>
> Then repeat the following 5 or 6 times.
>
> $ split -n10 bad
> $ emacs -Q x*
> $ cp x... bad
> $ rm x*
>
> Just look for the indication of the buffer coding system in the mode
> line to find the first bad file at each step.

Ah, I see. I hoped for some emacs command that lets me search for
characters displayed "in red", e.g., characters displayed as ^J or
\374.

Bye,
Tassilo

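As an alternative to bisecting by hand, a direct scan can locate suspicious bytes. The sketch below rests on an assumption that the thread does not confirm as the cause of the detection failure: that the offending bytes fall in the range 0x80..0x9F. Windows-1250 places printable characters there (for example the en dash at 0x96 and ž at 0x9E), while the ISO-8859 family reserves that range for control codes. The function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: offset of the first byte in 0x80..0x9F within
   BUF, or -1 if there is none.  */
long
first_c1_byte (const unsigned char *buf, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (buf[i] >= 0x80 && buf[i] <= 0x9F)
      return (long) i;
  return -1;
}

/* Hypothetical helper: 1-based number of the line containing the
   first such byte, or 0 if there is none.  */
unsigned long
first_c1_line (const unsigned char *buf, size_t n)
{
  long off = first_c1_byte (buf, n);
  if (off < 0)
    return 0;

  unsigned long line = 1;
  for (long i = 0; i < off; i++)
    if (buf[i] == '\n')
      line++;
  return line;
}
```

Bytes such as \374 and \347 are above 0x9F and are therefore not reported, so a file that really is ISO-8859 throughout would produce no hit.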