From: Jeremy Barbay <jbarbay@dcc.uchile.cl>
To: 17343@debbugs.gnu.org
Subject: bug#17343: 24.2; Exponential growth of files using raw-mode
Date: Thu, 24 Apr 2014 15:58:41 -0300
Message-ID: <21337.24289.430068.104422@gargle.gargle.HOWL>
Hi.
The short recipe below shows how a user can end up with a file whose
size roughly doubles on every save when following Emacs' suggestion to
save it in "raw mode":
* Recipe:
1. Save the following line in a file "testAccentsMinimal.txt"
Nà ¥à ¤à ¥
2. Repeatedly,
0) measure the size of the file (wc -c testAccentsMinimal.txt);
1) open emacs loading the file (emacs -q testAccentsMinimal.txt);
2) insert and delete a character in it (manually);
3) save it selecting the suggested raw encoding (manually);
4) quit emacs (or force the reload of the file).
* Result:
This should give something akin to the following, where one can see
the size of the file growing exponentially with the number of saves.
>wc -c testAccentsMinimal.txt ; emacs -q testAccentsMinimal.txt
11 testAccentsMinimal.txt
>wc -c testAccentsMinimal.txt ; emacs -q testAccentsMinimal.txt
19 testAccentsMinimal.txt
>wc -c testAccentsMinimal.txt ; emacs -q testAccentsMinimal.txt
35 testAccentsMinimal.txt
>wc -c testAccentsMinimal.txt ; emacs -q testAccentsMinimal.txt
67 testAccentsMinimal.txt
>wc -c testAccentsMinimal.txt ; emacs -q testAccentsMinimal.txt
131 testAccentsMinimal.txt
>wc -c testAccentsMinimal.txt ; emacs -q testAccentsMinimal.txt
259 testAccentsMinimal.txt
>wc -c testAccentsMinimal.txt ; emacs -q testAccentsMinimal.txt
515 testAccentsMinimal.txt
>wc -c testAccentsMinimal.txt ; emacs -q testAccentsMinimal.txt
1027 testAccentsMinimal.txt
>wc -c testAccentsMinimal.txt ; emacs -Q testAccentsMinimal.txt
2051 testAccentsMinimal.txt
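The sizes above fit a simple recurrence: each save yields twice the previous size minus 3, that is, 8 * 2^n + 3 bytes after n saves. The Python sketch below only checks that reading against the reported numbers. The function name `predicted_size` and the split into 3 stable bytes and 8 doubling bytes are assumptions inferred from the figures, not something Emacs reports.

```python
# Sizes reported by `wc -c` in the transcript above, one per save cycle.
observed = [11, 19, 35, 67, 131, 259, 515, 1027, 2051]


def predicted_size(n, start=11, stable=3):
    """Size after n saves, assuming `stable` bytes never change and
    every other byte doubles on each save (an inferred assumption)."""
    return (start - stable) * 2 ** n + stable


predicted = [predicted_size(n) for n in range(len(observed))]
print(predicted == observed)
```

If the model holds, a file that starts at 11 bytes would pass one megabyte after about 17 saves.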
* (Tentative) Explanation:
- Even though the file is saved in "raw" mode, it is read back in another
mode, which prefixes the "special" characters with a Unicode code.
- Due to symbols from incompatible encodings, Emacs is confused about
which encoding to use for saving and asks the user about it.
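As an analogy only, the same growth pattern appears in Python when bytes are read under one encoding and written under another. This is not Emacs' actual code path. The helper `save_cycle` and the sample bytes are hypothetical, chosen so that the file has 3 ASCII bytes and 8 non-ASCII bytes.

```python
def save_cycle(data: bytes) -> bytes:
    # Read each byte as one Latin-1 character, then write the text as UTF-8.
    # Every byte >= 0x80 becomes two bytes that are again >= 0x80, so the
    # non-ASCII part doubles on each cycle while ASCII bytes are unchanged.
    return data.decode("latin-1").encode("utf-8")


# Hypothetical 11-byte file: "N", eight non-ASCII bytes, a space, a newline.
data = b"N" + bytes([0xE0, 0xA5] * 4) + b" \n"

sizes = [len(data)]
for _ in range(4):
    data = save_cycle(data)
    sizes.append(len(data))

print(sizes)  # [11, 19, 35, 67, 131]
```

The first five sizes match the transcript in the Result section, which is consistent with (but does not prove) the tentative explanation above.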
* Why it matters:
- The faulty sequence above occurred naturally from copying and pasting
from various webpages (containing accented characters) into the same
document, and was identified when some files grew too large.
- Files (e.g. of notes) end up doubling in size at each edit, until
they fill the memory and/or hard drive, slow down the system and
make Emacs complain about the size of the file.
* (Potential) Solutions:
- When saving a file with conflicting encodings, instead of merely
suggesting the raw encoding, add an option to "clean" the file,
for instance by projecting the file to a single encoding and
deleting all symbols which are incompatible with it.
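A minimal sketch of what such a "clean" projection could look like, written in Python rather than Emacs Lisp. The function name `project_to_encoding` and the choice of Latin-1 as the target are assumptions for illustration. Note that this discards data, so it would have to be an explicit user choice.

```python
def project_to_encoding(text: str, encoding: str = "latin-1") -> bytes:
    # errors="ignore" silently drops every character that the target
    # encoding cannot represent, which is the "cleaning" step.
    return text.encode(encoding, errors="ignore")


# Mixed text: an accented Latin letter and a Devanagari letter.
mixed = "caf\u00e9 \u0928 ok"
cleaned = project_to_encoding(mixed)
print(cleaned)  # b'caf\xe9  ok'
```

Saving the cleaned bytes and reading them back in the same encoding gives a file whose size no longer changes between saves.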
I think that I reported this bug a year ago against Emacs 23 and was told
at the time that it would be solved in the next version (24), but I
noticed recently that this undesirable behavior is still there :(
I hope it helps.
--
Jeremy (http://www.dcc.uchile.cl/~jbarbay)
In GNU Emacs 24.2.1 (x86_64-unknown-linux-gnu, X toolkit, Xaw scroll bars)
of 2013-02-27 on raven
Windowing system distributor `The X.Org Foundation', version 11.0.11300000
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: en_US.UTF-8
value of $LC_NUMERIC: en_US.UTF-8
value of $LC_TIME: en_US.UTF-8
value of $LANG: en_US.UTF-8
value of $XMODIFIERS: nil
locale-coding-system: utf-8-unix
default enable-multibyte-characters: t
Major mode: Text
Minor modes in effect:
tooltip-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
transient-mark-mode: t
Recent input:
<help-echo> <help-echo> <help-echo> <help-echo> <help-echo>
<help-echo> <help-echo> <help-echo> <help-echo> <help-echo>
<menu-bar> <help-menu> <send-emacs-bug-report>
Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Loading vc-git...done
Scanning for dabbrevs...done
dabbrev-expand: No dynamic expansion for `Expo' found
Load-path shadows:
None found.
Features:
(shadow sort gnus-util mail-extr dabbrev emacsbug message format-spec
rfc822 mml easymenu mml-sec mm-decode mm-bodies mm-encode mail-parse
rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045
ietf-drums mm-util mail-prsvr mail-utils vc-git ind-util regexp-opt
time-date tooltip ediff-hook vc-hooks lisp-float-type mwheel x-win x-dnd
tool-bar dnd fontset image fringe lisp-mode register page menu-bar
rfn-eshadow timer select scroll-bar mouse jit-lock font-lock syntax
facemenu font-core frame cham georgian utf-8-lang misc-lang vietnamese
tibetan thai tai-viet lao korean japanese hebrew greek romanian slovak
czech european ethiopic indian cyrillic chinese case-table epa-hook
jka-cmpr-hook help simple abbrev minibuffer loaddefs button faces
cus-face files text-properties overlay sha1 md5 base64 format env
code-pages mule custom widget hashtable-print-readable backquote
make-network-process dbusbind dynamic-setting system-font-setting
font-render-setting x-toolkit x multi-tty emacs)