unofficial mirror of emacs-devel@gnu.org 
From: Kenichi Handa <handa@m17n.org>
To: rms@gnu.org
Cc: jasonr@gnu.org, dann@ics.uci.edu, evilborisnet@netscape.net,
	emanuele.giaquinta@gmail.com, emacs-devel@gnu.org
Subject: Re: size of emacs executable after unicode merge
Date: Fri, 31 Oct 2008 14:29:28 +0900
Message-ID: <E1KvmZc-0001tX-Pg@etlken.m17n.org>
In-Reply-To: <E1Kvl71-0002pv-Pt@fencepost.gnu.org> (rms@gnu.org)

In article <E1Kvl71-0002pv-Pt@fencepost.gnu.org>, "Richard M. Stallman" <rms@gnu.org> writes:

>     If I comment the load_charset_map_from_file call in unify_charset the
>     data segment size is back to normal.

> Although these are loaded "on demand", perhaps something "demands" them
> at build time.

It's not that simple.  Here is the strategy of the charset
map loading mechanism.  I took this approach expecting that
char-tables that are garbage-collected before dumping would
not end up in the dumped file.

(0) First, Emacs assigns each big character set (e.g. GB,
    JIS, KSC) its own linear range of character codes in
    the area above Unicode (#x110000-) (*see the note at
    the end).  Decoding a character of such a charset into
    this area is quite fast (just a few steps of
    arithmetic).  Encoding is just as fast.

(1) While building Emacs, when unify-charset is called, we
    update two char-tables, Vchar_unify_table and
    Vchar_unified_charset_table.  The former maps a
    character in the above upper area to the Unicode area,
    and the latter maps the character to its charset symbol.
    unify-charset also builds a deunifier char-table for
    each character set, which maps a character in the
    Unicode area back to the upper area that is unique to
    that charset.

    So at this point, the full maps are built.

(2) Just before dumping, clear-charset-maps is called.  This
    function sets all char-tables built in (1) (except for
    Vchar_unified_charset_table) to nil.  It then sets
    Vchar_unify_table to Vchar_unified_charset_table, and
    sets Vchar_unified_charset_table to nil.

    Then garbage-collect is called.  After that, the only
    live char-table is Vchar_unify_table, and its contents
    are not that big: it maps upper-area characters to
    charsets, and each charset occupies a linear upper area,
    so most consecutive characters have the same value.

(3) When the dumped Emacs runs and decodes/encodes a charset
    that was unified as above, it checks whether the value
    of Vchar_unify_table for a character is a symbol or
    not.  From that, Emacs knows whether it has to load the
    mapping table again.

    That way, Emacs loads maps on demand.
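
As a rough illustration of steps (1)-(3), here is a toy model
in Python.  It is only a sketch under assumptions, not the
actual Emacs code: plain dicts stand in for char-tables, a
string stands in for a charset symbol, and the class name, the
`map_files` argument, the `loads` counter and the "toy-jis"
data are all made up for the example.

```python
class CharsetMaps:
    """Toy model of the build / clear / load-on-demand cycle."""

    def __init__(self, map_files):
        # Hypothetical stand-in for the charset map files:
        # charset name -> (base of its upper area, {offset: Unicode code}).
        self.map_files = map_files
        self.char_unify_table = {}            # upper-area code -> Unicode code
        self.char_unified_charset_table = {}  # upper-area code -> charset name
        self.deunifier = {}                   # name -> {Unicode code -> upper code}
        self.loads = 0                        # how many times a map was read

    def _load(self, name):
        """Read one charset map and fill the unify and deunifier tables."""
        base, mapping = self.map_files[name]
        self.loads += 1
        self.deunifier[name] = {}
        for offset, ucs in mapping.items():
            self.char_unify_table[base + offset] = ucs
            self.deunifier[name][ucs] = base + offset

    def unify_charset(self, name):
        """Step (1): build the full maps while building Emacs."""
        self._load(name)
        base, mapping = self.map_files[name]
        for offset in mapping:
            self.char_unified_charset_table[base + offset] = name

    def clear_charset_maps(self):
        """Step (2): keep only the small code -> charset table."""
        self.deunifier = {name: None for name in self.deunifier}
        self.char_unify_table = self.char_unified_charset_table
        self.char_unified_charset_table = None

    def unify(self, code):
        """Step (3): a charset name in the table means 'map not loaded'."""
        value = self.char_unify_table.get(code)
        if isinstance(value, str):
            self._load(value)
            value = self.char_unify_table.get(code)
        return value


maps = CharsetMaps({"toy-jis": (0x110000, {0: 0x3000, 1: 0x3001, 2: 0x3002})})
maps.unify_charset("toy-jis")   # build time
maps.clear_charset_maps()       # just before dumping
```

After clear_charset_maps, every entry of the remaining table
holds the same charset name, which is why the real char-table
stays small.  The first call to maps.unify for a "toy-jis" code
reloads the map; later calls find integers and do not.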


*Note:

The reason Emacs assigns those linear areas is that such
big charsets tend to have their own private use areas, and
we must keep a unique character code for those characters.
Those private characters are decoded and encoded without
being mapped to the Unicode area.
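
To make the "linear area" idea concrete, here is a minimal
Python sketch of the arithmetic.  It assumes a two-byte
94x94 charset with bytes in the range #x21-#x7E and places
the blocks back to back starting at #x110000; the class, the
names "toy-jis" and "toy-ksc", and the base offsets are
hypothetical and do not reflect the real layout in Emacs.

```python
UNICODE_END = 0x110000  # first code above the Unicode range


class LinearCharset:
    """Toy two-byte charset with its own linear block of character codes."""

    def __init__(self, name, base, min_byte=0x21, max_byte=0x7E):
        self.name = name
        self.base = base                      # hypothetical start of the block
        self.min_byte = min_byte
        self.width = max_byte - min_byte + 1  # 94 positions per row
        self.size = self.width * self.width

    def decode(self, b1, b2):
        """Map a (row byte, column byte) pair to a character code."""
        row, col = b1 - self.min_byte, b2 - self.min_byte
        if not (0 <= row < self.width and 0 <= col < self.width):
            raise ValueError("byte out of range")
        return self.base + row * self.width + col

    def encode(self, code):
        """Map a character code in this block back to its byte pair."""
        offset = code - self.base
        if not 0 <= offset < self.size:
            raise ValueError("code is outside this charset's block")
        row, col = divmod(offset, self.width)
        return row + self.min_byte, col + self.min_byte


# Two charsets get disjoint blocks, so every code is unique to one charset.
toy_jis = LinearCharset("toy-jis", UNICODE_END)
toy_ksc = LinearCharset("toy-ksc", UNICODE_END + toy_jis.size)
```

A character with no Unicode equivalent (for example, one in a
charset's private use area) simply keeps its code in the
block, so toy_jis.encode(toy_jis.decode(b1, b2)) returns the
original bytes without going through any mapping table.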

---
Kenichi Handa
handa@ni.aist.go.jp




