unofficial mirror of emacs-devel@gnu.org 
* Re: view/edit large files
  2009-02-06 18:42   ` view/edit large files (was: map-file-lines) Ted Zlatanov
@ 2009-02-06 21:06     ` Ted Zlatanov
  2009-02-06 21:49       ` Miles Bader
  2009-02-07  9:14     ` view/edit large files (was: map-file-lines) Richard M Stallman
  1 sibling, 1 reply; 25+ messages in thread
From: Ted Zlatanov @ 2009-02-06 21:06 UTC
  To: emacs-devel

On Fri, 6 Feb 2009 14:20:45 +0100 Mathias Dahl <mathias.dahl@gmail.com> wrote: 

MD> This is how far I got:
MD> http://www.emacswiki.org/emacs/vlf.el

Thank you, I looked at it and it's almost exactly what I was thinking
originally (but actually implemented :).  I would like it, however, to
be a minor mode rather than a major one so it's more generally useful.
Also writing modifications back is an interesting challenge.

MD> What I do know is that it hits the roof when the file is larger than
MD> that integer limit in Emacs, whatever it is. 

Modifying insert-file-contents to take float or list arguments to
specify the file position should not be too hard--I assume that's the
place where it fails.  Using floats bothers me a bit.  I'd really like
the offset to be a pair of integers, similar to the time storage in
Emacs.
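
For illustration, a pair-of-integers offset could be packed and
unpacked roughly as below.  This is only a sketch: the helper names are
hypothetical, and the 16-bit low part is borrowed from the (HIGH LOW)
layout of Emacs time values, not from any existing file-offset code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: combine a HIGH/LOW pair into one 64-bit offset.
   LOW holds the low 16 bits, as in Emacs time values (assumption).  */
static int64_t
offset_from_pair (int64_t high, int64_t low)
{
  /* Keep HIGH small enough that HIGH * 65536 + LOW cannot overflow.  */
  assert (high >= 0 && high < ((int64_t) 1 << 47));
  assert (low >= 0 && low < 65536);
  return high * 65536 + low;
}

/* Hypothetical helper: split a non-negative offset back into the pair.  */
static void
offset_to_pair (int64_t offset, int64_t *high, int64_t *low)
{
  assert (offset >= 0);
  *high = offset / 65536;
  *low = offset % 65536;
}
```

Each half stays small, so an offset well beyond 32 bits can be carried
by two modest integers, at the cost of every caller having to split and
join them.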

I also got these comments from Chetan Pandya that I wanted to answer
here:
CP> Is this for editing binary files or file with single byte encoding?
CP> If not, it gets more complicated.

It must be single-byte or binary.  insert-file-contents doesn't handle
multibyte encodings and Emacs doesn't have a way to ensure a random seek
is to a valid sequence.  I believe this is all fixable, but I don't know
enough about multibyte encodings to be helpful.

CP> Is this to be the major mode for the file? In that case it may be
CP> OK. Otherwise it wrecks the font lock information and functions that
CP> work with sexp and such syntactic information.

I think as a major mode it's not very useful.  You can use `more' or
`less' from the shell to view a large file in a pager.  hexl-mode would
be a good major mode for large files, for example.

I don't think the font-lock information is very useful for large files
over multiple lines.  The most common case (viewing logs) just needs to
examine a single line.  Can you think of large files that have sexps and
other multiline (over 1000 lines) font-lockable data, which Emacs should
handle?  I can't think of any common ones.  In any case, at worst the
user will fall back to fundamental-mode, and that's better than nothing.

Ted






* Re: view/edit large files
  2009-02-06 21:06     ` view/edit large files Ted Zlatanov
@ 2009-02-06 21:49       ` Miles Bader
       [not found]         ` <864oz3nyj8.fsf@lifelogs.com>
  0 siblings, 1 reply; 25+ messages in thread
From: Miles Bader @ 2009-02-06 21:49 UTC
  To: Ted Zlatanov; +Cc: emacs-devel

Ted Zlatanov <tzz@lifelogs.com> writes:
> MD> What I do know is that it hits the roof when the file is larger than
> MD> that integer limit in Emacs, whatever it is. 
>
> Using floats bothers me a bit.  I'd really like the offset to be a pair
> of integers, similar to the time storage in Emacs.

Why?  Floats are certainly a bit more convenient for the user...

-Miles

-- 
Generous, adj. Originally this word meant noble by birth and was rightly
applied to a great multitude of persons. It now means noble by nature and is
taking a bit of a rest.





* view/edit large files (was: map-file-lines)
@ 2009-02-07  3:44 MON KEY
  2009-02-08  4:34 ` Bob Rogers
  0 siblings, 1 reply; 25+ messages in thread
From: MON KEY @ 2009-02-07  3:44 UTC
  To: tzz; +Cc: emacs-devel

> [1] I still can't think of a better term than "window."
> large-file-window is too verbose.  boffset?  byte-offset?
> virtual-buffer?

off-slice
slice-off
virtual-slice

s_P





* view/edit large files (was: map-file-lines)
  2009-02-07  3:44 view/edit large files (was: map-file-lines) MON KEY
@ 2009-02-08  4:34 ` Bob Rogers
  2009-02-09 19:44   ` view/edit large files Thien-Thi Nguyen
  0 siblings, 1 reply; 25+ messages in thread
From: Bob Rogers @ 2009-02-08  4:34 UTC
  To: MON KEY, tzz, emacs-devel

   From: MON KEY <monkey@sandpframing.com>
   Date: Fri, 6 Feb 2009 22:44:30 -0500

   > [1] I still can't think of a better term than "window."
   > large-file-window is too verbose.  boffset?  byte-offset?
   > virtual-buffer?

   off-slice
   slice-off
   virtual-slice

   s_P

Porthole?  Aperture?  Stratum?  Zone?

					-- Bob Rogers
					   http://www.rgrjr.com/





* Re: view/edit large files
  2009-02-08  4:34 ` Bob Rogers
@ 2009-02-09 19:44   ` Thien-Thi Nguyen
  0 siblings, 0 replies; 25+ messages in thread
From: Thien-Thi Nguyen @ 2009-02-09 19:44 UTC
  To: emacs-devel

() Bob Rogers <rogers-emacs@rgrjr.dyndns.org>
() Sat, 7 Feb 2009 23:34:09 -0500

   Porthole?  Aperture?  Stratum?  Zone?

lima -- limited awareness/area.

thi





* Re: view/edit large files
  2009-02-07  9:14     ` view/edit large files (was: map-file-lines) Richard M Stallman
@ 2009-02-09 20:26       ` Ted Zlatanov
  2009-02-10 20:02         ` Richard M Stallman
  0 siblings, 1 reply; 25+ messages in thread
From: Ted Zlatanov @ 2009-02-09 20:26 UTC
  To: emacs-devel

On Sat, 07 Feb 2009 04:14:27 -0500 Richard M Stallman <rms@gnu.org> wrote: 

RMS> Describe it in text is what I thought I did.
RMS> I don't have time to implement it, though.

I quote your suggestion here:
> Here's an idea for a UI for editing big files.  First you run M-x grep on
> the file, and display the matches for whatever regexp.  In the *grep*
> buffer you specify a region, which is a way of choosing two matches,
> the ones whose entries contain point and mark.  Then you give a command to edit
> the file from one of the matches to the other.  It marks these matches
> (and the lines containing them) as read-only so that you can't
> spoil the correspondence with the file.  Thus, you can always save this
> partial-file buffer.

> The beginning and end of the *grep* buffer can be used to specify
> that the portion to edit starts or ends at bof or eof.

I didn't understand your suggestion fully initially, sorry.  I think
you're suggesting that the user should pick a "window" between two
places in the file.  Then the user can only edit the file between those
two places.  That `grep' is producing the list of places is not as
important as the idea of having those places.

RMS> Your API seems to be aimed at a lower level,
RMS> so the two could work together.

Yes, I believe so.  I'll try to implement the items I listed and perhaps
someone can use them productively, to implement the UI you suggested or
something else.

Thanks for your help and time.
Ted






* Re: view/edit large files
       [not found]         ` <864oz3nyj8.fsf@lifelogs.com>
@ 2009-02-10  1:58           ` Stefan Monnier
  2009-02-10  8:46             ` Eli Zaretskii
  0 siblings, 1 reply; 25+ messages in thread
From: Stefan Monnier @ 2009-02-10  1:58 UTC
  To: Ted Zlatanov; +Cc: emacs-devel

MB> Why?  Floats are certainly a bit more convenient for the user...
> By the same logic, time storage could have been done with floats.

Most likely time conses date back to a time when Emacs could be
configured without floats.

> The reason why it bothers me a bit is that it would be inconsistent
> with time storage--now there's two ways of storing large integers.

There are already many inconsistencies in this regard.  FWIW, I believe
that file-attributes can return floats for things like file-size.


        Stefan






* Re: view/edit large files
  2009-02-10  1:58           ` Stefan Monnier
@ 2009-02-10  8:46             ` Eli Zaretskii
  2009-02-10  9:23               ` Miles Bader
  0 siblings, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2009-02-10  8:46 UTC
  To: Stefan Monnier; +Cc: tzz, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Mon, 09 Feb 2009 20:58:05 -0500
> Cc: emacs-devel@gnu.org
> 
> MB> Why?  Floats are certainly a bit more convenient for the user...
> > By the same logic, time storage could have been done with floats.
> 
> Most likely time conses date back to a time when Emacs could be
> configured without floats.

Yes, probably.

> > The reason why it bothers me a bit is that it would be inconsistent
> > with time storage--now there's two ways of storing large integers.
> 
> There are already many inconsistencies in this regard.  FWIW, I believe
> that file-attributes can return floats for things like file-size.

Yes, we do return a float for size.  But for some attributes, like
inode, floats are not a good idea, because inodes are habitually
compared for exact equality.  I'm not sure time values need that
measure of accuracy, though.





* Re: view/edit large files
  2009-02-10  8:46             ` Eli Zaretskii
@ 2009-02-10  9:23               ` Miles Bader
  2009-02-10  9:54                 ` Eli Zaretskii
  2009-02-10 12:28                 ` Eli Zaretskii
  0 siblings, 2 replies; 25+ messages in thread
From: Miles Bader @ 2009-02-10  9:23 UTC
  To: Eli Zaretskii; +Cc: tzz, Stefan Monnier, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:
> Yes, we do return a float for size.  But for some attributes, like
> inode, floats are not a good idea, because inodes are habitually
> compared for exact equality.  I'm not sure time values need that
> measure of accuracy, though.

"floats" can exactly represent integers if the integer quantity fits
within the mantissa.  For an IEEE double, that's 52 bits, which is
enough for many uses (for an inode number, I'm not sure -- obviously
it's enough for 32-bit inode numbers, but possibly not some 64-bit
numbers ... OTOH, neither is a cons of integers).

Requiring emacs platforms to support double-precision floats is probably
pretty safe these days, but I suppose it's the sort of thing people
could argue about...
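
The exactness claim can be checked with a short C sketch.  The helper
name exact_in_double is hypothetical.  One detail worth noting: an IEEE
double stores 52 fraction bits plus an implicit leading bit, so the
significand is effectively 53 bits wide and every integer up to 2^53 in
magnitude is exact.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>

/* Return 1 if VALUE survives a round trip through double unchanged,
   0 otherwise.  Assumes IEEE double precision.  */
static int
exact_in_double (int64_t value)
{
  double d;

  /* This sketch relies on a 53-bit significand.  */
  assert (DBL_MANT_DIG == 53);

  d = (double) value;
  /* 2^63 as a double is out of range for int64_t, so do not cast it
     back; a value that rounded up to 2^63 was not exact anyway.  */
  if (d >= 9223372036854775808.0)
    return 0;
  return (int64_t) d == value;
}
```

Small integers such as 5 pass, as does 2^53 itself, while 2^53 + 1 does
not, which is the point at which arbitrary 64-bit quantities stop being
safe to store in a float.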

-Miles

-- 
Bacchus, n. A convenient deity invented by the ancients as an excuse for
getting drunk.





* Re: view/edit large files
  2009-02-10  9:23               ` Miles Bader
@ 2009-02-10  9:54                 ` Eli Zaretskii
  2009-02-10 10:02                   ` Miles Bader
  2009-02-10 12:28                 ` Eli Zaretskii
  1 sibling, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2009-02-10  9:54 UTC
  To: Miles Bader; +Cc: tzz, monnier, emacs-devel

> From: Miles Bader <miles@gnu.org>
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, tzz@lifelogs.com,
>         emacs-devel@gnu.org
> Date: Tue, 10 Feb 2009 18:23:58 +0900
> 
> "floats" can exactly represent integers if the integer quantity fits
> within the mantissa.  For an IEEE double, that's 52 bits, which is
> enough for many uses

Right, but is it enough in this case?  I don't know, it all depends on
what kind of time resolution is needed.  Also, time values are
frequently used in arithmetic operations that could lose a few low
bits.

> (for an inode number, I'm not sure -- obviously
> it's enough for 32-bit inode numbers, but possibly not some 64-bit
> numbers

Windows NTFS uses 64-bit numbers for the ``file index'' we use as the
replacement for inode.

> ... OTOH, neither is a cons of integers).

That's why we use a cons of 3 numbers.





* Re: view/edit large files
  2009-02-10  9:54                 ` Eli Zaretskii
@ 2009-02-10 10:02                   ` Miles Bader
  2009-02-10 11:50                     ` Eli Zaretskii
  0 siblings, 1 reply; 25+ messages in thread
From: Miles Bader @ 2009-02-10 10:02 UTC
  To: Eli Zaretskii; +Cc: tzz, monnier, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:
>> "floats" can exactly represent integers if the integer quantity fits
>> within the mantissa.  For an IEEE double, that's 52 bits, which is
>> enough for many uses
>
> Right, but is it enough in this case?  I don't know, it all depends on
> what kind of time resolution is needed.  Also, time values are
> frequently used in arithmetic operations that could lose a few low
> bits.

If it's an integer, and it fits, it's exact -- there is no loss of precision.

>> (for an inode number, I'm not sure -- obviously
>> it's enough for 32-bit inode numbers, but possibly not some 64-bit
>> numbers
>
> Windows NTFS uses 64-bit numbers for the ``file index'' we use as the
> replacement for inode.

For traditional-style inode numbers, which are allocated sequentially
from zero, it doesn't matter; however, for abstract 64-bit quantities
that come with no guarantees, it wouldn't work.

-Miles

-- 
Discriminate, v.i. To note the particulars in which one person or thing is,
if possible, more objectionable than another.





* Re: view/edit large files
  2009-02-10 10:02                   ` Miles Bader
@ 2009-02-10 11:50                     ` Eli Zaretskii
  2009-02-10 15:08                       ` Ted Zlatanov
  0 siblings, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2009-02-10 11:50 UTC
  To: Miles Bader; +Cc: tzz, monnier, emacs-devel

> From: Miles Bader <miles@gnu.org>
> Cc: tzz@lifelogs.com, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> Date: Tue, 10 Feb 2009 19:02:55 +0900
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> >> "floats" can exactly represent integers if the integer quantity fits
> >> within the mantissa.  For an IEEE double, that's 52 bits, which is
> >> enough for many uses
> >
> > Right, but is it enough in this case?  I don't know, it all depends on
> > what kind of time resolution is needed.  Also, time values are
> > frequently used in arithmetic operations that could lose a few low
> > bits.
> 
> If it's an integer, and it fits, it's exact -- there is no loss of precision.

I was talking about arithmetic operations such as multiplication by
small factors, such as 2, in case it wasn't clear.





* Re: view/edit large files
  2009-02-10  9:23               ` Miles Bader
  2009-02-10  9:54                 ` Eli Zaretskii
@ 2009-02-10 12:28                 ` Eli Zaretskii
  2009-02-10 12:46                   ` Miles Bader
  1 sibling, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2009-02-10 12:28 UTC
  To: Miles Bader; +Cc: tzz, monnier, emacs-devel

> From: Miles Bader <miles@gnu.org>
> Date: Tue, 10 Feb 2009 18:23:58 +0900
> Cc: tzz@lifelogs.com, Stefan Monnier <monnier@iro.umontreal.ca>,
> 	emacs-devel@gnu.org
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> > Yes, we do return a float for size.  But for some attributes, like
> > inode, floats are not a good idea, because inodes are habitually
> > compared for exact equality.  I'm not sure time values need that
> > measure of accuracy, though.
> 
> "floats" can exactly represent integers if the integer quantity fits
> within the mantissa.

On second thought, I don't think I agree.  For example an integer
number as small and "simple" as 5 does not have an exact
representation as an IEEE floating-point number, right?





* Re: view/edit large files
  2009-02-10 12:28                 ` Eli Zaretskii
@ 2009-02-10 12:46                   ` Miles Bader
  0 siblings, 0 replies; 25+ messages in thread
From: Miles Bader @ 2009-02-10 12:46 UTC
  To: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:
>> "floats" can exactly represent integers if the integer quantity fits
>> within the mantissa.
>
> On second thought, I don't think I agree.  For example an integer
> number as small and "simple" as 5 does not have an exact
> representation as an IEEE floating-point number, right?

Yes, it does.  All integers which fit into the mantissa (plus some
others) are exactly representable.

-Miles

-- 
`There are more things in heaven and earth, Horatio,
 Than are dreamt of in your philosophy.'






* Re: view/edit large files
  2009-02-10 11:50                     ` Eli Zaretskii
@ 2009-02-10 15:08                       ` Ted Zlatanov
  2009-02-17 19:23                         ` Stefan Monnier
  0 siblings, 1 reply; 25+ messages in thread
From: Ted Zlatanov @ 2009-02-10 15:08 UTC
  To: emacs-devel

On Tue, 10 Feb 2009 13:50:46 +0200 Eli Zaretskii <eliz@gnu.org> wrote: 

>> From: Miles Bader <miles@gnu.org>
>> If it's an integer, and it fits, it's exact -- there is no loss of precision.

EZ> I was talking about arithmetic operations such as multiplication by
EZ> small factors, such as 2, in case it wasn't clear.

While time values and file offsets can certainly be represented as
floats under some constraints, I think it's an inelegant solution.

This is the chance to have a clean design for support of large integers,
since I or someone else will be modifying insert-file-contents anyhow.
Why not add an int64 type?  It doesn't have to be supported everywhere,
and it can fail `integerp' as long as simple arithmetic works (in fact,
only + - < > need to support it for the file offsets work).  We can have
int64p and int-any-size-p as well.  The time functions can be modified
to support either the old-style conses or an int64.  The support for
int64 can be gradually grown; when people need it they can implement
it.  Scratch the itch.

I'm definitely not an expert on the Emacs internals, so this may be
completely untenable and it's probably been debated to death, but I hope
we can at least get started with an int64 implementation I can use for
large file support.

Thanks
Ted






* Re: view/edit large files
  2009-02-09 20:26       ` view/edit large files Ted Zlatanov
@ 2009-02-10 20:02         ` Richard M Stallman
  0 siblings, 0 replies; 25+ messages in thread
From: Richard M Stallman @ 2009-02-10 20:02 UTC
  To: Ted Zlatanov; +Cc: emacs-devel

    I didn't understand your suggestion fully initially, sorry.  I think
    you're suggesting that the user should pick a "window" between two
    places in the file.  Then the user can only edit the file between those
    two places.  That `grep' is producing the list of places is not as
    important as the idea of having those places.

Yes, that's it.  A further part of the idea is that the matching lines
that mark the beginning and the end of the segment would be read-only,
to prevent confusion in what it means to save the file back.





* Re: view/edit large files
  2009-02-10 15:08                       ` Ted Zlatanov
@ 2009-02-17 19:23                         ` Stefan Monnier
  2009-02-17 19:47                           ` Eli Zaretskii
  0 siblings, 1 reply; 25+ messages in thread
From: Stefan Monnier @ 2009-02-17 19:23 UTC
  To: Ted Zlatanov; +Cc: emacs-devel

> While time values and file offsets can certainly be represented as
> floats under some constraints, I think it's an inelegant solution.

> This is the chance to have a clean design for support of large integers,
> since I or someone else will be modifying insert-file-contents anyhow.

Using floats has the major advantage that it only requires changes in
insert-file-contents (e.g. try the patch below).  Large integers can be
added as well, but it's a mostly orthogonal issue.


        Stefan


=== modified file 'src/fileio.c'
--- src/fileio.c	2009-02-11 20:00:50 +0000
+++ src/fileio.c	2009-02-17 19:21:59 +0000
@@ -3161,6 +3161,7 @@
   Lisp_Object old_Vdeactivate_mark = Vdeactivate_mark;
   int we_locked_file = 0;
   int deferred_remove_unwind_protect = 0;
+  off_t beg_offset, end_offset;
 
   if (current_buffer->base_buffer && ! NILP (visit))
     error ("Cannot do file visiting in an indirect buffer");
@@ -3268,12 +3269,12 @@
     }
 
   if (!NILP (beg))
-    CHECK_NUMBER (beg);
+    CHECK_NUMBER_OR_FLOAT (beg);
   else
     XSETFASTINT (beg, 0);
 
   if (!NILP (end))
-    CHECK_NUMBER (end);
+    CHECK_NUMBER_OR_FLOAT (end);
   else
     {
       if (! not_regular)
@@ -3408,6 +3409,8 @@
       set_coding_system = 1;
     }
 
+  beg_offset = FLOATP (beg) ? (off_t) XFLOAT_DATA (beg) : XINT (beg);
+  end_offset = FLOATP (end) ? (off_t) XFLOAT_DATA (end) : XINT (end);
   /* If requested, replace the accessible part of the buffer
      with the file contents.  Avoid replacing text at the
      beginning or end of the buffer that matches the file contents;
@@ -3438,9 +3441,9 @@
 	 give up on handling REPLACE in the optimized way.  */
       int giveup_match_end = 0;
 
-      if (XINT (beg) != 0)
+      if (beg_offset != 0)
 	{
-	  if (lseek (fd, XINT (beg), 0) < 0)
+	  if (lseek (fd, beg_offset, 0) < 0)
 	    report_file_error ("Setting file position",
 			       Fcons (orig_filename, Qnil));
 	}
@@ -3487,7 +3490,7 @@
       immediate_quit = 0;
       /* If the file matches the buffer completely,
 	 there's no need to replace anything.  */
-      if (same_at_start - BEGV_BYTE == XINT (end))
+      if (same_at_start - BEGV_BYTE == end_offset - beg_offset)
 	{
 	  emacs_close (fd);
 	  specpdl_ptr--;
@@ -3505,7 +3508,7 @@
 	  EMACS_INT total_read, nread, bufpos, curpos, trial;
 
 	  /* At what file position are we now scanning?  */
-	  curpos = XINT (end) - (ZV_BYTE - same_at_end);
+	  curpos = end_offset - (ZV_BYTE - same_at_end);
 	  /* If the entire file matches the buffer tail, stop the scan.  */
 	  if (curpos == 0)
 	    break;
@@ -3583,8 +3586,8 @@
 	    same_at_end += overlap;
 
 	  /* Arrange to read only the nonmatching middle part of the file.  */
-	  XSETFASTINT (beg, XINT (beg) + (same_at_start - BEGV_BYTE));
-	  XSETFASTINT (end, XINT (end) - (ZV_BYTE - same_at_end));
+	  beg_offset += same_at_start - BEGV_BYTE;
+	  end_offset -= ZV_BYTE - same_at_end;
 
 	  del_range_byte (same_at_start, same_at_end, 0);
 	  /* Insert from the file at the proper position.  */
@@ -3628,7 +3631,7 @@
       /* First read the whole file, performing code conversion into
 	 CONVERSION_BUFFER.  */
 
-      if (lseek (fd, XINT (beg), 0) < 0)
+      if (lseek (fd, beg_offset, 0) < 0)
 	report_file_error ("Setting file position",
 			   Fcons (orig_filename, Qnil));
 
@@ -3803,7 +3806,7 @@
     {
       register Lisp_Object temp;
 
-      total = XINT (end) - XINT (beg);
+      total = end_offset - beg_offset;
 
       /* Make sure point-max won't overflow after this insertion.  */
       XSETINT (temp, total);
@@ -3830,9 +3833,9 @@
   if (GAP_SIZE < total)
     make_gap (total - GAP_SIZE);
 
-  if (XINT (beg) != 0 || !NILP (replace))
+  if (beg_offset != 0 || !NILP (replace))
     {
-      if (lseek (fd, XINT (beg), 0) < 0)
+      if (lseek (fd, beg_offset, 0) < 0)
 	report_file_error ("Setting file position",
 			   Fcons (orig_filename, Qnil));
     }






* Re: view/edit large files
  2009-02-17 19:23                         ` Stefan Monnier
@ 2009-02-17 19:47                           ` Eli Zaretskii
  2009-02-17 20:18                             ` Miles Bader
  2009-02-18  1:56                             ` Stefan Monnier
  0 siblings, 2 replies; 25+ messages in thread
From: Eli Zaretskii @ 2009-02-17 19:47 UTC
  To: Stefan Monnier; +Cc: tzz, emacs-devel

> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> Date: Tue, 17 Feb 2009 14:23:32 -0500
> Cc: emacs-devel@gnu.org
> 
> +  off_t beg_offset, end_offset;

Is off_t guaranteed to be 64-bit wide?  If not, we lose the advantage
of the floats, no?

> +  beg_offset = FLOATP (beg) ? (off_t) XFLOAT_DATA (beg) : XINT (beg);
> +  end_offset = FLOATP (end) ? (off_t) XFLOAT_DATA (end) : XINT (end);

Shouldn't we round rather than truncate, when converting to off_t?

> -      if (XINT (beg) != 0)
> +      if (beg_offset != 0)

Exact equalities might be dangerous with floats.

> -      if (same_at_start - BEGV_BYTE == XINT (end))
> +      if (same_at_start - BEGV_BYTE == end_offset - beg_offset)

Likewise.

> -  if (XINT (beg) != 0 || !NILP (replace))
> +  if (beg_offset != 0 || !NILP (replace))

Likewise.





* Re: view/edit large files
  2009-02-17 19:47                           ` Eli Zaretskii
@ 2009-02-17 20:18                             ` Miles Bader
  2009-02-17 20:51                               ` Eli Zaretskii
  2009-02-18  1:56                             ` Stefan Monnier
  1 sibling, 1 reply; 25+ messages in thread
From: Miles Bader @ 2009-02-17 20:18 UTC
  To: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:
>> +  off_t beg_offset, end_offset;
>
> Is off_t guaranteed to be 64-bit wide?  If not, we lose the advantage
> of the floats, no?

If the system isn't capable of handling large files at all, then there's
no point in worrying about it, right?

>> +  beg_offset = FLOATP (beg) ? (off_t) XFLOAT_DATA (beg) : XINT (beg);
>> +  end_offset = FLOATP (end) ? (off_t) XFLOAT_DATA (end) : XINT (end);
>
> Shouldn't we round rather than truncate, when converting to off_t?

No.  The values being represented are integers.  The user almost
certainly will not be passing in a non-integral float; if he is doing
something weird so that he may end up with non-integral offsets, then
it's his job to worry about how such values are interpreted as integer
offsets.

Maybe it should guard against overflow in the conversion though (and
signal an error?).

>> -      if (XINT (beg) != 0)
>> +      if (beg_offset != 0)
>
> Exact equalities might be dangerous with floats.
>
>> -      if (same_at_start - BEGV_BYTE == XINT (end))
>> +      if (same_at_start - BEGV_BYTE == end_offset - beg_offset)
> Likewise.
>> -  if (XINT (beg) != 0 || !NILP (replace))
>> +  if (beg_offset != 0 || !NILP (replace))
> Likewise.

Comparing against zero here is fine -- a float can represent it exactly,
and there's no non-integer calculation to lose accuracy.

If there was overflow in the conversion to off_t, it probably should
have been caught during the conversion.
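
A guarded conversion along those lines might look like the sketch
below.  It is not part of the patch: double_to_offset is a hypothetical
helper, int64_t stands in for off_t (an assumption, since off_t may be
narrower), and fractional parts are truncated exactly as the plain cast
in the patch would do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert D to a 64-bit file offset.
   Store the result in *OUT and return 0 on success; return -1 if D
   is negative, NaN, or too large to fit.  */
static int
double_to_offset (double d, int64_t *out)
{
  assert (out);
  /* Written as a negated comparison so that NaN is rejected too.  */
  if (!(d >= 0.0))
    return -1;
  /* 2^63 is the first double that no longer fits in int64_t.  */
  if (d >= 9223372036854775808.0)
    return -1;
  *out = (int64_t) d;	/* truncates toward zero, like the patch */
  return 0;
}
```

The caller would signal a Lisp error on -1 instead of seeking to a
garbage position.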

-Miles

-- 
Discriminate, v.i. To note the particulars in which one person or thing is,
if possible, more objectionable than another.






* Re: view/edit large files
  2009-02-17 20:18                             ` Miles Bader
@ 2009-02-17 20:51                               ` Eli Zaretskii
  2009-02-17 21:19                                 ` Miles Bader
  0 siblings, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2009-02-17 20:51 UTC
  To: Miles Bader; +Cc: emacs-devel

> From: Miles Bader <miles@gnu.org>
> Date: Wed, 18 Feb 2009 05:18:03 +0900
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> >> +  off_t beg_offset, end_offset;
> >
> > Is off_t guaranteed to be 64-bit wide?  If not, we lose the advantage
> > of the floats, no?
> 
> If the system isn't capable of handling large files at all, then there's
> no point in worrying about it, right?

Some systems can handle large files, but only if you use something
like off64_t.

> >> +  beg_offset = FLOATP (beg) ? (off_t) XFLOAT_DATA (beg) : XINT (beg);
> >> +  end_offset = FLOATP (end) ? (off_t) XFLOAT_DATA (end) : XINT (end);
> >
> > Shouldn't we round rather than truncate, when converting to off_t?
> 
> No.  The values being represented are integers.  The user almost
> certainly will not be passing in a non-integral float

I was thinking about 1234.99999 or some such, due to inaccuracies in
converting textual representation into a float.

> Maybe it should guard against overflow in the conversion though (and
> signal an error?).

Yes, probably.





* Re: view/edit large files
  2009-02-17 20:51                               ` Eli Zaretskii
@ 2009-02-17 21:19                                 ` Miles Bader
  2009-02-17 21:21                                   ` Miles Bader
  0 siblings, 1 reply; 25+ messages in thread
From: Miles Bader @ 2009-02-17 21:19 UTC
  To: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:
>> > Is off_t guaranteed to be 64-bit wide?  If not, we lose the advantage
>> > of the floats, no?
>> 
>> If the system isn't capable of handling large files at all, then there's
>> no point in worrying about it, right?
>
> Some systems can handle large files, but only if you use something
> like off64_t.

Sure, but using variant interfaces for large-file support is a much
bigger and more intrusive change.

Oh, BTW, of course there's a range of offsets which are still within
32-bits, and are representable by floats but not by emacs integers.

A separate question is whether emacs should try to use something like
_FILE_OFFSET_BITS=64 by default or not (on linux/solaris/... that causes
64-bit variants of off_t, syscalls, etc, to be used even on 32-bit
systems).
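
As a rough illustration of that mechanism, the sketch below requests
the 64-bit variants and then checks the width of off_t.  The function
names are hypothetical, and the macro only has an effect on platforms
that honour it, and only if it is defined before any system header is
included.

```c
#define _FILE_OFFSET_BITS 64	/* must precede all system headers */

#include <assert.h>
#include <limits.h>
#include <sys/types.h>

/* Width of off_t in bits for this compilation.  */
static int
off_t_bits (void)
{
  return (int) (sizeof (off_t) * CHAR_BIT);
}

/* Hypothetical startup check: abort if large-file offsets are not
   available.  */
static void
check_large_file_support (void)
{
  assert (off_t_bits () >= 64);
}
```

With this in place lseek and friends take a 64-bit off_t without any
change to the calling code, which is what makes it less intrusive than
switching to the explicit off64_t interfaces.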

>> No.  The values being represented are integers.  The user almost
>> certainly will not be passing in a non-integral float
>
> I was thinking about 1234.99999 or some such, due to inaccuracies in
> converting textual representation into a float.

This should not happen with integer values (if it does, something is
very wrong).

-Miles

-- 
Congratulation, n. The civility of envy.






* Re: view/edit large files
  2009-02-17 21:19                                 ` Miles Bader
@ 2009-02-17 21:21                                   ` Miles Bader
  2009-02-18  4:09                                     ` Eli Zaretskii
  0 siblings, 1 reply; 25+ messages in thread
From: Miles Bader @ 2009-02-17 21:21 UTC
  To: emacs-devel

Miles Bader <miles@gnu.org> writes:
> Oh, BTW, of course there's a range of offsets which are still within
> 32-bits, and are representable by floats but not by emacs integers.

When I say "float", btw, I of course mean "double"... :-/

-Miles

-- 
97% of everything is grunge






* Re: view/edit large files
  2009-02-17 19:47                           ` Eli Zaretskii
  2009-02-17 20:18                             ` Miles Bader
@ 2009-02-18  1:56                             ` Stefan Monnier
  2009-02-20 19:23                               ` Ted Zlatanov
  1 sibling, 1 reply; 25+ messages in thread
From: Stefan Monnier @ 2009-02-18  1:56 UTC
  To: Eli Zaretskii; +Cc: tzz, emacs-devel

> Is off_t guaranteed to be 64-bit wide?  If not, we lose the advantage
> of the floats, no?

I wouldn't worry about it for now.  This is just a quick patch, barely
tested (I wrote it a while ago, but haven't actually used it).  "off_t"
is what is used by "lseek", so if it's not enough, we need
further changes.

In any case, this was a mistake: it was only intended to be sent
to Ted.  We're in pretest, so we shouldn't waste time on such things.


        Stefan





* Re: view/edit large files
  2009-02-17 21:21                                   ` Miles Bader
@ 2009-02-18  4:09                                     ` Eli Zaretskii
  0 siblings, 0 replies; 25+ messages in thread
From: Eli Zaretskii @ 2009-02-18  4:09 UTC
  To: Miles Bader; +Cc: emacs-devel

> From: Miles Bader <miles@gnu.org>
> Date: Wed, 18 Feb 2009 06:21:26 +0900
> 
> Miles Bader <miles@gnu.org> writes:
> > Oh, BTW, of course there's a range of offsets which are still within
> > 32-bits, and are representable by floats but not by emacs integers.
> 
> When I say "float", btw, I of course mean "double"... :-/

So did I.





* Re: view/edit large files
  2009-02-18  1:56                             ` Stefan Monnier
@ 2009-02-20 19:23                               ` Ted Zlatanov
  0 siblings, 0 replies; 25+ messages in thread
From: Ted Zlatanov @ 2009-02-20 19:23 UTC
  To: emacs-devel

On Tue, 17 Feb 2009 20:56:59 -0500 Stefan Monnier <monnier@iro.umontreal.ca> wrote: 

>> Is off_t guaranteed to be 64-bit wide?  If not, we lose the advantage
>> of the floats, no?

SM> I wouldn't worry about it for now.  This is just a quick patch, barely
SM> tested (I wrote it a while ago, but haven't actually used it).  "off_t"
SM> is what is used by "lseek", so if it's not enough, we need
SM> further changes.

SM> In any case, this was a mistake: it was only intended to be sent
SM> to Ted.  We're in pretest, so we shouldn't waste time on such things.

Thanks for the information.  I will wait until the release to get this
discussion going again (I also have the hashtable read support patch
waiting for that).

Ted






Thread overview: 25+ messages
2009-02-07  3:44 view/edit large files (was: map-file-lines) MON KEY
2009-02-08  4:34 ` Bob Rogers
2009-02-09 19:44   ` view/edit large files Thien-Thi Nguyen
  -- strict thread matches above, loose matches on Subject: below --
2009-02-04 15:38 map-file-lines Ted Zlatanov
2009-02-05  5:40 ` map-file-lines Richard M Stallman
2009-02-06 18:42   ` view/edit large files (was: map-file-lines) Ted Zlatanov
2009-02-06 21:06     ` view/edit large files Ted Zlatanov
2009-02-06 21:49       ` Miles Bader
     [not found]         ` <864oz3nyj8.fsf@lifelogs.com>
2009-02-10  1:58           ` Stefan Monnier
2009-02-10  8:46             ` Eli Zaretskii
2009-02-10  9:23               ` Miles Bader
2009-02-10  9:54                 ` Eli Zaretskii
2009-02-10 10:02                   ` Miles Bader
2009-02-10 11:50                     ` Eli Zaretskii
2009-02-10 15:08                       ` Ted Zlatanov
2009-02-17 19:23                         ` Stefan Monnier
2009-02-17 19:47                           ` Eli Zaretskii
2009-02-17 20:18                             ` Miles Bader
2009-02-17 20:51                               ` Eli Zaretskii
2009-02-17 21:19                                 ` Miles Bader
2009-02-17 21:21                                   ` Miles Bader
2009-02-18  4:09                                     ` Eli Zaretskii
2009-02-18  1:56                             ` Stefan Monnier
2009-02-20 19:23                               ` Ted Zlatanov
2009-02-10 12:28                 ` Eli Zaretskii
2009-02-10 12:46                   ` Miles Bader
2009-02-07  9:14     ` view/edit large files (was: map-file-lines) Richard M Stallman
2009-02-09 20:26       ` view/edit large files Ted Zlatanov
2009-02-10 20:02         ` Richard M Stallman

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
