unofficial mirror of help-gnu-emacs@gnu.org
From: Lee Sau Dan <danlee@informatik.uni-freiburg.de>
Subject: Re: Reading portions of large files
Date: 20 Jan 2003 08:50:31 +0100
Message-ID: <m3hec6hn8r.fsf@mika.informatik.uni-freiburg.de>
In-Reply-To: 5lbs2mdrxs.fsf@rum.cs.yale.edu

>>>>> "Stefan" == "Stefan Monnier <foo@acm.com>" <monnier+gnu.emacs.help/news/@flint.cs.yale.edu> writes:

    Stefan> Since at least 1 bit of tag is needed, that means that to
    Stefan> get 31bit integers we'd need to move the mark bit
    Stefan> somewhere else.  XEmacs decided to use 3-word cons cells
    Stefan> (and I know they're still regularly wondering whether it
    Stefan> was a good idea).  Another approach is to use a separate
    Stefan> mark-bit array.

I think the separate mark-bit array would be cleaner.  You don't need
to access the mark bits unless you're doing GC.  Why let that bit
sit in the _main_ working set all the time?  Wouldn't a separate
mark-bit array also improve locality (important for caching)?
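
A minimal C sketch of the separate mark-bit array idea, assuming a toy
fixed-size heap of 32-bit words.  All names here (heap, mark_bits,
mark_set, ...) are hypothetical and are not taken from the Emacs or
XEmacs sources; the point is only that the mark bits live in a side
table, so the heap words keep all of their bits.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy heap: HEAP_WORDS words, one mark bit per word,
   kept in a side array rather than inside the words themselves. */
#define HEAP_WORDS 1024
#define BITS_PER_CELL (sizeof(unsigned) * CHAR_BIT)

uint32_t heap[HEAP_WORDS];
unsigned mark_bits[(HEAP_WORDS + BITS_PER_CELL - 1) / BITS_PER_CELL];

/* Mark heap word i as reachable; the word itself is not touched. */
void mark_set(size_t i)
{
    assert(i < HEAP_WORDS);
    mark_bits[i / BITS_PER_CELL] |= 1u << (i % BITS_PER_CELL);
}

/* Return 1 if heap word i is marked, 0 otherwise. */
int mark_get(size_t i)
{
    assert(i < HEAP_WORDS);
    return (int)((mark_bits[i / BITS_PER_CELL] >> (i % BITS_PER_CELL)) & 1u);
}

/* Reset all marks, e.g. at the start of a collection.  Only the small
   side array is written, not the heap. */
void mark_clear_all(void)
{
    memset(mark_bits, 0, sizeof mark_bits);
}
```

In this sketch only the collector reads or writes mark_bits, so the
array stays out of the way during normal execution.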

Then, in theory, the tag bits could also be kept separately, giving the
full 32 bits to integers (represented as machine-native words).  I
think we'd need only 1 tag bit per word in the separate tag-bit array.
Its function is to indicate whether the corresponding memory word is an
integer or not.  If not, then the remaining tag bits are found in the
word itself.  And integer arithmetic can certainly be faster!
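
The tag-bit variant could be sketched along the same lines.  This is
only an illustration under assumptions: the slot count, the 3-bit
in-word tag for non-integers, and every function name are made up here
and do not describe any real Emacs data layout.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define SLOTS 256
#define CELL_BITS (sizeof(unsigned) * CHAR_BIT)

uint32_t words[SLOTS];
/* One bit per slot: 1 means "this word is a raw 32-bit integer". */
unsigned is_int_bits[(SLOTS + CELL_BITS - 1) / CELL_BITS];

int slot_is_int(size_t i)
{
    assert(i < SLOTS);
    return (int)((is_int_bits[i / CELL_BITS] >> (i % CELL_BITS)) & 1u);
}

/* Store a full-width integer; no bits of the word are spent on tags. */
void store_int(size_t i, int32_t v)
{
    assert(i < SLOTS);
    words[i] = (uint32_t)v;
    is_int_bits[i / CELL_BITS] |= 1u << (i % CELL_BITS);
}

/* Store a non-integer.  Assumption: the low 3 bits of the word carry
   the remaining type tag, and the payload occupies the upper 29 bits. */
void store_tagged(size_t i, uint32_t payload, unsigned tag)
{
    assert(i < SLOTS && tag < 8u);
    words[i] = (payload << 3) | tag;
    is_int_bits[i / CELL_BITS] &= ~(1u << (i % CELL_BITS));
}

/* Integers come back as plain machine words (two's complement assumed
   for the unsigned-to-signed conversion). */
int32_t load_int(size_t i)
{
    assert(slot_is_int(i));
    return (int32_t)words[i];
}

unsigned load_tag(size_t i)
{
    assert(!slot_is_int(i));
    return words[i] & 7u;
}
```

The cost that the sketch makes visible is the extra side-table lookup
on every type check, which has to be weighed against the untagged
integer arithmetic.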

Would this implementation be more efficient or less?


    Stefan> Lots of trade offs, a fair bit of coding, even more
    Stefan> testing, ...  Anybody interested is welcome to try it
    Stefan> out.  My opinion is that maybe it would be nice, but since
    Stefan> the only application I'm aware of is "editing files
    Stefan> between 128MB and 1GB on 32bit systems", I don't think
    Stefan> it's worth the trouble.

Yeah.  I share this last point with you.  Text files >128MB are simply
weird.  And for binary files, a real hex editor (or 'xxd', which I just
discovered) is a more appropriate tool, or just 'dd'.


-- 
Lee Sau Dan                     李守敦(Big5)                    ~{@nJX6X~}(HZ) 

E-mail: danlee@informatik.uni-freiburg.de
Home page: http://www.informatik.uni-freiburg.de/~danlee

