From: Lee Sau Dan <danlee@informatik.uni-freiburg.de>
Subject: Re: Reading portions of large files
Date: 20 Jan 2003 08:50:31 +0100
Message-ID: <m3hec6hn8r.fsf@mika.informatik.uni-freiburg.de>
In-Reply-To: 5lbs2mdrxs.fsf@rum.cs.yale.edu
>>>>> "Stefan" == "Stefan Monnier <foo@acm.com>" <monnier+gnu.emacs.help/news/@flint.cs.yale.edu> writes:
Stefan> Since at least 1 bit of tag is needed, that means that to
Stefan> get 31bit integers we'd need to move the mark bit
Stefan> somewhere else. XEmacs decided to use 3-word cons cells
Stefan> (and I know they're still regularly wondering whether it
Stefan> was a good idea). Another approach is to use a separate
Stefan> mark-bit array.
I think the separate mark-bit array would be cleaner. You don't need
to access the mark bits unless you're doing gc. Why let that bit
stick there in the _main_ working set all the time? Wouldn't a
separate mark-bit array also improve locality (important for caching)?
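To make the separate mark-bit array concrete, here is a minimal sketch in C. It is not how Emacs or XEmacs implements it; the heap size and the names `HEAP_WORDS`, `mark_bits`, `mark_set`, `mark_test` and `mark_clear_all` are all hypothetical. The point is only that the marks live in their own compact bitmap, which the collector touches and the mutator never does.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical heap size, in words.  One mark bit per heap word. */
#define HEAP_WORDS 1024u
#define BITS_PER_UNIT (sizeof(unsigned) * CHAR_BIT)

/* The mark bits live outside the objects, packed into their own array. */
static unsigned mark_bits[(HEAP_WORDS + BITS_PER_UNIT - 1) / BITS_PER_UNIT];

/* Mark heap word i as reachable (called only during gc). */
static void mark_set(size_t i)
{
    assert(i < HEAP_WORDS);
    mark_bits[i / BITS_PER_UNIT] |= 1u << (i % BITS_PER_UNIT);
}

/* Return nonzero if heap word i has been marked. */
static int mark_test(size_t i)
{
    assert(i < HEAP_WORDS);
    return (mark_bits[i / BITS_PER_UNIT] >> (i % BITS_PER_UNIT)) & 1u;
}

/* Reset all marks before the next collection: one memset over the
   bitmap instead of a walk over every object. */
static void mark_clear_all(void)
{
    memset(mark_bits, 0, sizeof mark_bits);
}
```

With 32-bit words the bitmap costs one bit per word, roughly 3% of the heap, and outside of gc it never needs to be in the cache at all.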
Then, in theory, the tag bits can also be kept separately, giving the
full 32 bits to integers (represented as machine-native words). I
think we only need 1 tag bit in the separate tag-bit array. Its
function is to indicate whether the corresponding memory word is an
integer or not. If not, then the remaining tag bits are found in the
word itself. And integer arithmetic can certainly be faster!
Would this implementation be more efficient, or less?
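The tag-bit idea can be sketched the same way. Again this is only an illustration under assumptions of my own: the names `slots`, `int_bits`, `slot_store_int`, `slot_store_other`, `slot_is_int` and `slot_load_int` are hypothetical, and the layout of the "remaining tag bits" inside a non-integer word is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_COUNT 256u   /* hypothetical number of heap words */

static uint32_t slots[SLOT_COUNT];               /* the words themselves */
static uint8_t  int_bits[(SLOT_COUNT + 7) / 8];  /* 1 bit per word: 1 = integer */

/* Store a full 32-bit integer; no bits of the word are spent on a tag. */
static void slot_store_int(size_t i, int32_t v)
{
    assert(i < SLOT_COUNT);
    memcpy(&slots[i], &v, sizeof v);
    int_bits[i / 8] |= (uint8_t)(1u << (i % 8));
}

/* Store a non-integer word; its remaining tag bits stay inside the word. */
static void slot_store_other(size_t i, uint32_t tagged_word)
{
    assert(i < SLOT_COUNT);
    slots[i] = tagged_word;
    int_bits[i / 8] &= (uint8_t)~(1u << (i % 8));
}

/* The type check now needs a second memory access, into the bitmap. */
static bool slot_is_int(size_t i)
{
    assert(i < SLOT_COUNT);
    return (int_bits[i / 8] >> (i % 8)) & 1u;
}

/* Read back an integer as a plain machine word, with no shifting or masking. */
static int32_t slot_load_int(size_t i)
{
    int32_t v;
    assert(slot_is_int(i));
    memcpy(&v, &slots[i], sizeof v);
    return v;
}
```

The sketch also shows where the cost would go: arithmetic on a loaded integer needs no untagging, but every store and every type check has to touch the bitmap as well as the word, so whether the whole thing comes out faster probably depends on how often values are type-checked compared with how often they are computed with.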
Stefan> Lots of trade offs, a fair bit of coding, even more
Stefan> testing, ... Anybody interested is welcome to try it
Stefan> out. My opinion is that maybe it would be nice, but since
Stefan> the only application I'm aware of is "editing files
Stefan> between 128MB and 1GB on 32bit systems", I don't think
Stefan> it's worth the trouble.
Yeah. I share this last point with you. >128MB text files are simply
weird. And for binary files, a real hex editor (or 'xxd', which I just
discovered) is a more appropriate tool, or just 'dd'.
--
Lee Sau Dan 李守敦(Big5) ~{@nJX6X~}(HZ)
E-mail: danlee@informatik.uni-freiburg.de
Home page: http://www.informatik.uni-freiburg.de/~danlee