* view/edit large files (was: map-file-lines)
@ 2009-02-07 3:44 MON KEY
2009-02-08 4:34 ` Bob Rogers
0 siblings, 1 reply; 25+ messages in thread
From: MON KEY @ 2009-02-07 3:44 UTC (permalink / raw)
To: tzz; +Cc: emacs-devel
> [1] I still can't think of a better term than "window."
> large-file-window is too verbose. boffset? byte-offset?
> virtual-buffer?
off-slice
slice-off
virtual-slice
s_P
^ permalink raw reply [flat|nested] 25+ messages in thread
* view/edit large files (was: map-file-lines)
2009-02-07 3:44 view/edit large files (was: map-file-lines) MON KEY
@ 2009-02-08 4:34 ` Bob Rogers
2009-02-09 19:44 ` view/edit large files Thien-Thi Nguyen
0 siblings, 1 reply; 25+ messages in thread
From: Bob Rogers @ 2009-02-08 4:34 UTC (permalink / raw)
To: MON KEY, tzz, emacs-devel

   From: MON KEY <monkey@sandpframing.com>
   Date: Fri, 6 Feb 2009 22:44:30 -0500

   > [1] I still can't think of a better term than "window."
   > large-file-window is too verbose. boffset? byte-offset?
   > virtual-buffer?

   off-slice
   slice-off
   virtual-slice
   s_P

Porthole? Aperture? Stratum? Zone?

-- Bob Rogers
http://www.rgrjr.com/
* Re: view/edit large files
2009-02-08 4:34 ` Bob Rogers
@ 2009-02-09 19:44 ` Thien-Thi Nguyen
0 siblings, 0 replies; 25+ messages in thread
From: Thien-Thi Nguyen @ 2009-02-09 19:44 UTC (permalink / raw)
To: emacs-devel

() Bob Rogers <rogers-emacs@rgrjr.dyndns.org>
() Sat, 7 Feb 2009 23:34:09 -0500

   Porthole? Aperture? Stratum? Zone?

lima -- limited awareness/area.

thi
* Re: map-file-lines
@ 2009-02-04 15:38 Ted Zlatanov
2009-02-05 5:40 ` map-file-lines Richard M Stallman
0 siblings, 1 reply; 25+ messages in thread
From: Ted Zlatanov @ 2009-02-04 15:38 UTC (permalink / raw)
To: emacs-devel

On Wed, 04 Feb 2009 02:04:32 -0500 Richard M Stallman <rms@gnu.org> wrote:

RMS> Here's an idea for a UI for editing big files. First you run M-x grep on
RMS> the file, and display the matches for whatever regexp. In the *grep*
RMS> buffer you specify a region, which is a way of choosing two matches,
RMS> the ones whose entries contain point and mark. Then you give a command to edit
RMS> the file from one of the matches to the other. It marks these matches
RMS> (and the lines containing them) as read-only so that you can't
RMS> spoil the correspondence with the file. Thus, you can always save this
RMS> partial-file buffer.

RMS> The beginning and end of the *grep* buffer can be used to specify
RMS> that the portion to edit starts or ends at bof or eof.

RMS> It would be easy to adapt this to variants such as
RMS> (1) using hexl-mode to visit the file,
RMS> (2) using methods other than grep to subdivide the file,
RMS> (3) providing more friendly front ends to grep.

This is essentially mapping byte offsets to line positions, with extra
calculations. As Stefan suggested, it's better to just use byte
offsets. Your approach requires a lot of tracking of the grep lines,
whereas just using byte offsets requires remembering the two current
offsets and nothing else.

Otherwise I think your suggestions are similar to mine: set up a
special mode where the buffer is a window[1] into the file instead of
the whole file, and create special commands to move the window back and
forth. Saving would only save the buffer contents; the window won't be
moveable until changes are saved (another approach is to remember
modifications outside the window, but that gets hairy with undo).

Ted

[1] I know "window" has meaning in Emacs already, but I can't think of a
better term.
* Re: map-file-lines
2009-02-04 15:38 map-file-lines Ted Zlatanov
@ 2009-02-05 5:40 ` Richard M Stallman
2009-02-06 18:42 ` view/edit large files (was: map-file-lines) Ted Zlatanov
0 siblings, 1 reply; 25+ messages in thread
From: Richard M Stallman @ 2009-02-05 5:40 UTC (permalink / raw)
To: Ted Zlatanov; +Cc: emacs-devel

    This is essentially mapping byte offsets to line positions, with extra
    calculations. As Stefan suggested, it's better to just use byte
    offsets.

The main point of my message is the UI proposal. You're right that
it's good to use byte offsets. I had not thought about that, but it
could be done by specifying the -b option for grep.

    Otherwise I think your suggestions are similar to mine: set up a
    special mode where the buffer is a window[1] into the file instead of
    the whole file, and create special commands to move the window back and
    forth.

I proposed a specific UI for specifying which part of the file to
edit, one I think will be convenient.
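
The byte-offset idea can be illustrated with a minimal C sketch, under stated assumptions. `grep -b` prints the byte offset of each matching line; the helper below does the same for the first match in an in-memory buffer, using a plain substring instead of a regexp. The function name `first_matching_line_offset` is hypothetical and is not part of grep or Emacs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the byte offset of the start of the first
   line in BUF (LEN bytes) that contains NEEDLE, or -1 if no line matches.
   This mimics what "grep -b" reports, for a fixed string only.  */
long first_matching_line_offset(const char *buf, size_t len, const char *needle)
{
    size_t nlen = strlen(needle);
    size_t line_start = 0;

    while (line_start < len) {
        /* Locate the end of the current line (newline or end of buffer).  */
        const char *nl = memchr(buf + line_start, '\n', len - line_start);
        size_t line_end = nl ? (size_t)(nl - buf) : len;

        /* Scan the line for NEEDLE without crossing the line boundary.  */
        for (size_t i = line_start; i + nlen <= line_end; i++)
            if (memcmp(buf + i, needle, nlen) == 0)
                return (long)line_start;

        line_start = line_end + 1;
    }
    return -1;
}
```

Two such offsets, one per chosen match, would be enough to delimit the part of the file to edit, which is the point made above about remembering only two numbers.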
* view/edit large files (was: map-file-lines)
2009-02-05 5:40 ` map-file-lines Richard M Stallman
@ 2009-02-06 18:42 ` Ted Zlatanov
2009-02-06 21:06 ` view/edit large files Ted Zlatanov
2009-02-07 9:14 ` view/edit large files (was: map-file-lines) Richard M Stallman
0 siblings, 2 replies; 25+ messages in thread
From: Ted Zlatanov @ 2009-02-06 18:42 UTC (permalink / raw)
To: emacs-devel

On Thu, 05 Feb 2009 00:40:40 -0500 Richard M Stallman <rms@gnu.org> wrote:

RMS> The main point of my message is the UI proposal. You're right that
RMS> it's good to use byte offsets. I had not thought about that, but it
RMS> could be done by specifying the -b option for grep.

RMS> I proposed a specific UI for specifying which part of the file to
RMS> edit, one I think will be convenient.

Could you please explain, with code or text, what using your UI would
look like? I looked over your suggestions and I still think we have the
same idea, just expressed differently. You do seem to want `grep'
instead of dynamic offsets, but see my comments later.

Here's the "window"[1] API I'm suggesting, as a detailed list of TODO
items:

1) a buffer-local set of offset variables that indicate the beginning
   and the end of the current window into the file.

2) override all write-file functions to write the buffer at the starting
   offset. I don't think there's a write-file-contents analogous to
   insert-file-contents

3) override all insert-file* functions to respect the offsets as well

4) disable insertion, always in overwrite mode for better performance
   (maybe allow insert at end of file...). Force save when the "window"
   is moved.

5) "window" management functions: set/get-window-offset,
   set/get-window-length, etc. These operate on the (1) buffer-local
   variables.

As you can see, it requires no grep calls to pre-scan the file, and
should be consistent with the existing Emacs code. Pre-scanning a large
file with grep can be very expensive, and it's inaccurate if the large
file is growing (e.g. a log file).

Thanks to anyone with suggestions...

Ted

[1] I still can't think of a better term than "window."
large-file-window is too verbose. boffset? byte-offset?
virtual-buffer?
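
The TODO items above can be sketched at the system-call level, as a rough illustration rather than anything Emacs does. The sketch assumes a POSIX system; the `struct file_window` type and the `window_read` / `window_write` helpers are hypothetical names. Reading fetches only the bytes inside the window, and writing overwrites in place at the same offset without ever inserting, which mirrors the overwrite-only item in the list.

```c
#define _XOPEN_SOURCE 700   /* for pread/pwrite on strict compilers */
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>

/* Hypothetical "window" into a file: a start offset and a length,
   playing the role of the buffer-local offset variables.  */
struct file_window {
    off_t  offset;   /* byte position of the window start in the file */
    size_t length;   /* window size in bytes */
};

/* Read the window into BUF (which must hold w->length bytes).
   Returns the number of bytes read (short at end of file), or -1.  */
ssize_t window_read(int fd, const struct file_window *w, char *buf)
{
    size_t done = 0;
    while (done < w->length) {
        ssize_t n = pread(fd, buf + done, w->length - done,
                          w->offset + (off_t)done);
        if (n < 0)
            return -1;
        if (n == 0)          /* end of file reached */
            break;
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* Write LEN bytes back at the window offset.  Overwrite only: the
   call is refused if LEN differs from the window length.  */
ssize_t window_write(int fd, const struct file_window *w,
                     const char *buf, size_t len)
{
    size_t done = 0;
    if (len != w->length)
        return -1;
    while (done < len) {
        ssize_t n = pwrite(fd, buf + done, len - done,
                           w->offset + (off_t)done);
        if (n <= 0)
            return -1;
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

Moving the window would then amount to changing `offset` and reading again, after any pending changes have been written back.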
* Re: view/edit large files
2009-02-06 18:42 ` view/edit large files (was: map-file-lines) Ted Zlatanov
@ 2009-02-06 21:06 ` Ted Zlatanov
2009-02-06 21:49 ` Miles Bader
2009-02-07 9:14 ` view/edit large files (was: map-file-lines) Richard M Stallman
1 sibling, 1 reply; 25+ messages in thread
From: Ted Zlatanov @ 2009-02-06 21:06 UTC (permalink / raw)
To: emacs-devel

On Fri, 6 Feb 2009 14:20:45 +0100 Mathias Dahl <mathias.dahl@gmail.com> wrote:

MD> This is how far I got:
MD> http://www.emacswiki.org/emacs/vlf.el

Thank you, I looked at it and it's almost exactly what I was thinking
originally (but actually implemented :). I would like it, however, to
be a minor mode rather than a major one so it's more generally useful.
Also writing modifications back is an interesting challenge.

MD> What I do know is that it hits the roof when the file is larger than
MD> that integer limit in Emacs, whatever it is.

Modifying insert-file-contents to take float or list arguments to
specify the file position should not be too hard--I assume that's the
place where it fails.

Using floats bothers me a bit. I'd really like the offset to be a pair
of integers, similar to the time storage in Emacs.

I also got these comments from Chetan Pandya that I wanted to answer
here:

CP> Is this for editing binary files or file with single byte encoding?
CP> If not, it gets more complicated.

It must be single-byte or binary. insert-file-contents doesn't handle
multibyte encodings and Emacs doesn't have a way to ensure a random seek
is to a valid sequence. I believe this is all fixable, but I don't know
enough about multibyte encodings to be helpful.

CP> Is this to be the major mode for the file? In that case it may be
CP> OK. Otherwise it wrecks the font lock information and functions that
CP> work with sexp and such syntactic information.

I think as a major mode it's not very useful. You can use `more' or
`less' from the shell to view a large file in a pager. hexl-mode would
be a good major mode for large files, for example.

I don't think the font-lock information is very useful for large files
over multiple lines. The most common case (viewing logs) just needs to
examine a single line. Can you think of large files that have sexps and
other multiline (over 1000 lines) font-lockable data, which Emacs should
handle? I can't think of any common ones. In any case, at worst the
user will fall back to fundamental-mode, and that's better than nothing.

Ted
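
The "pair of integers" idea can be sketched in C as follows. This is an illustration only: it assumes non-negative offsets and a 16-bit low part, in the style of the (HIGH LOW) convention used for time values, and the `struct offset_pair` type and both helper names are hypothetical. The high part would still have to fit in a Lisp integer, which this sketch does not check.

```c
#include <stdint.h>

/* Hypothetical two-integer representation of a file offset:
   LOW holds the bottom 16 bits, HIGH holds everything above them.  */
struct offset_pair {
    uint64_t high;
    uint32_t low;    /* always < 65536 */
};

/* Split a non-negative offset into its (HIGH LOW) parts.  */
struct offset_pair offset_split(uint64_t offset)
{
    struct offset_pair p;
    p.high = offset >> 16;
    p.low  = (uint32_t)(offset & 0xFFFFu);
    return p;
}

/* Recombine the two parts into a single 64-bit offset.  */
uint64_t offset_join(struct offset_pair p)
{
    return (p.high << 16) | p.low;
}
```

For an offset of 5000000000 bytes, which does not fit in 32 bits, the split gives HIGH = 76293 and LOW = 61952, both small enough to be ordinary integers.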
* Re: view/edit large files
2009-02-06 21:06 ` view/edit large files Ted Zlatanov
@ 2009-02-06 21:49 ` Miles Bader
[not found] ` <864oz3nyj8.fsf@lifelogs.com>
0 siblings, 1 reply; 25+ messages in thread
From: Miles Bader @ 2009-02-06 21:49 UTC (permalink / raw)
To: Ted Zlatanov; +Cc: emacs-devel

Ted Zlatanov <tzz@lifelogs.com> writes:
> MD> What I do know is that it hits the roof when the file is larger than
> MD> that integer limit in Emacs, whatever it is.
>
> Using floats bothers me a bit. I'd really like the offset to be a pair
> of integers, similar to the time storage in Emacs.

Why? Floats are certainly a bit more convenient for the user...

-Miles

--
Generous, adj. Originally this word meant noble by birth and was rightly
applied to a great multitude of persons. It now means noble by nature and is
taking a bit of a rest.
[parent not found: <864oz3nyj8.fsf@lifelogs.com>]
* Re: view/edit large files
[not found] ` <864oz3nyj8.fsf@lifelogs.com>
@ 2009-02-10 1:58 ` Stefan Monnier
2009-02-10 8:46 ` Eli Zaretskii
0 siblings, 1 reply; 25+ messages in thread
From: Stefan Monnier @ 2009-02-10 1:58 UTC (permalink / raw)
To: Ted Zlatanov; +Cc: emacs-devel

MB> Why? Floats are certainly a bit more convenient for the user...
> By the same logic, time storage could have been done with floats.

Most likely time conses date back to a time when Emacs could be
configured without floats.

> The reason why it bothers me a bit is that it would be inconsistent
> with time storage--now there's two ways of storing large integers.

There are already many inconsistencies in this regard. FWIW, I believe
that file-attributes can return floats for things like file-size.

Stefan
* Re: view/edit large files
2009-02-10 1:58 ` Stefan Monnier
@ 2009-02-10 8:46 ` Eli Zaretskii
2009-02-10 9:23 ` Miles Bader
0 siblings, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2009-02-10 8:46 UTC (permalink / raw)
To: Stefan Monnier; +Cc: tzz, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Mon, 09 Feb 2009 20:58:05 -0500
> Cc: emacs-devel@gnu.org
>
> MB> Why? Floats are certainly a bit more convenient for the user...
> > By the same logic, time storage could have been done with floats.
>
> Most likely time conses date back to a time when Emacs could be
> configured without floats.

Yes, probably.

> > The reason why it bothers me a bit is that it would be inconsistent
> > with time storage--now there's two ways of storing large integers.
>
> There are already many inconsistencies in this regard. FWIW, I believe
> that file-attributes can return floats for things like file-size.

Yes, we do return a float for size. But for some attributes, like
inode, floats are not a good idea, because inodes are habitually
compared for exact equality. I'm not sure time values need that
measure of accuracy, though.
* Re: view/edit large files
2009-02-10 8:46 ` Eli Zaretskii
@ 2009-02-10 9:23 ` Miles Bader
2009-02-10 9:54 ` Eli Zaretskii
2009-02-10 12:28 ` Eli Zaretskii
0 siblings, 2 replies; 25+ messages in thread
From: Miles Bader @ 2009-02-10 9:23 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: tzz, Stefan Monnier, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:
> Yes, we do return a float for size. But for some attributes, like
> inode, floats are not a good idea, because inodes are habitually
> compared for exact equality. I'm not sure time values need that
> measure of accuracy, though.

"floats" can exactly represent integers if the integer quantity fits
within the mantissa. For an IEEE double, that's 52 bits, which is
enough for many uses (for an inode number, I'm not sure -- obviously
it's enough for 32-bit inode numbers, but possibly not some 64-bit
numbers ... OTOH, neither is a cons of integers).

Requiring emacs platforms to support double-precision floats is probably
pretty safe these days, but I suppose it's the sort of thing people
could argue about...

-Miles

--
Bacchus, n. A convenient deity invented by the ancients as an excuse for
getting drunk.
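
The exactness claim can be checked with a small C sketch, assuming the platform's `double` is an IEEE 754 binary64. The helper name `exact_in_double` is hypothetical. Note that `DBL_MANT_DIG` is 53 on such platforms (52 stored bits plus an implicit leading bit), so every integer up to 2^53 in magnitude survives the round trip, which makes the 52-bit figure above slightly conservative.

```c
#include <float.h>
#include <stdint.h>

/* Hypothetical helper: return 1 if V converts to double and back
   without changing, 0 otherwise.  */
int exact_in_double(int64_t v)
{
    double d = (double)v;

    /* (double)INT64_MAX rounds up to 2^63, which cannot be converted
       back to int64_t, so treat that case as inexact.  */
    if (d >= 9223372036854775808.0)
        return 0;
    return (int64_t)d == v;
}
```

With this helper, 5 and 2^53 are reported as exact, while 2^53 + 1 is not: it is the first positive integer that a double cannot hold. File offsets below 2^53 bytes (8 PiB) are therefore safe to carry in a double.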
* Re: view/edit large files
2009-02-10 9:23 ` Miles Bader
@ 2009-02-10 9:54 ` Eli Zaretskii
2009-02-10 10:02 ` Miles Bader
2009-02-10 12:28 ` Eli Zaretskii
1 sibling, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2009-02-10 9:54 UTC (permalink / raw)
To: Miles Bader; +Cc: tzz, monnier, emacs-devel

> From: Miles Bader <miles@gnu.org>
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, tzz@lifelogs.com,
> emacs-devel@gnu.org
> Date: Tue, 10 Feb 2009 18:23:58 +0900
>
> "floats" can exactly represent integers if the integer quantity fits
> within the mantissa. For an IEEE double, that's 52 bits, which is
> enough for many uses

Right, but is it enough in this case? I don't know, it all depends on
what kind of time resolution is needed. Also, time values are
frequently used in arithmetic operations that could lose a few low
bits.

> (for an inode number, I'm not sure -- obviously
> it's enough for 32-bit inode numbers, but possibly not some 64-bit
> numbers

Windows NTFS uses 64-bit numbers for the ``file index'' we use as the
replacement for inode.

> ... OTOH, neither is a cons of integers).

That's why we use a cons of 3 numbers.
* Re: view/edit large files
2009-02-10 9:54 ` Eli Zaretskii
@ 2009-02-10 10:02 ` Miles Bader
2009-02-10 11:50 ` Eli Zaretskii
0 siblings, 1 reply; 25+ messages in thread
From: Miles Bader @ 2009-02-10 10:02 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: tzz, monnier, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:
>> "floats" can exactly represent integers if the integer quantity fits
>> within the mantissa. For an IEEE double, that's 52 bits, which is
>> enough for many uses
>
> Right, but is it enough in this case? I don't know, it all depends on
> what kind of time resolution is needed. Also, time values are
> frequently used in arithmetic operations that could lose a few low
> bits.

If it's an integer, and it fits, it's exact -- there is no loss of precision.

>> (for an inode number, I'm not sure -- obviously
>> it's enough for 32-bit inode numbers, but possibly not some 64-bit
>> numbers
>
> Windows NTFS uses 64-bit numbers for the ``file index'' we use as the
> replacement for inode.

For traditional style inode numbers, which are allocated sequentially
from zero, it doesn't matter; however, for abstract 64-bit quantities
for which there are no guarantees, it wouldn't work.

-Miles

--
Discriminate, v.i. To note the particulars in which one person or thing is,
if possible, more objectionable than another.
* Re: view/edit large files
2009-02-10 10:02 ` Miles Bader
@ 2009-02-10 11:50 ` Eli Zaretskii
2009-02-10 15:08 ` Ted Zlatanov
0 siblings, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2009-02-10 11:50 UTC (permalink / raw)
To: Miles Bader; +Cc: tzz, monnier, emacs-devel

> From: Miles Bader <miles@gnu.org>
> Cc: tzz@lifelogs.com, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> Date: Tue, 10 Feb 2009 19:02:55 +0900
>
> Eli Zaretskii <eliz@gnu.org> writes:
> >> "floats" can exactly represent integers if the integer quantity fits
> >> within the mantissa. For an IEEE double, that's 52 bits, which is
> >> enough for many uses
> >
> > Right, but is it enough in this case? I don't know, it all depends on
> > what kind of time resolution is needed. Also, time values are
> > frequently used in arithmetic operations that could lose a few low
> > bits.
>
> If it's an integer, and it fits, it's exact -- there is no loss of precision.

I was talking about arithmetic operations such as multiplication by
small factors, such as 2, in case it wasn't clear.
* Re: view/edit large files
2009-02-10 11:50 ` Eli Zaretskii
@ 2009-02-10 15:08 ` Ted Zlatanov
2009-02-17 19:23 ` Stefan Monnier
0 siblings, 1 reply; 25+ messages in thread
From: Ted Zlatanov @ 2009-02-10 15:08 UTC (permalink / raw)
To: emacs-devel

On Tue, 10 Feb 2009 13:50:46 +0200 Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Miles Bader <miles@gnu.org>
>> If it's an integer, and it fits, it's exact -- there is no loss of precision.

EZ> I was talking about arithmetic operations such as multiplication by
EZ> small factors, such as 2, in case it wasn't clear.

While time values and file offsets can certainly be represented as
floats under some constraints, I think it's an inelegant solution.
This is the chance to have a clean design for support of large integers,
since I or someone else will be modifying insert-file-contents anyhow.

Why not add an int64 type? It doesn't have to be supported everywhere,
and it can fail `integerp' as long as simple arithmetic works (in fact,
only + - < > need to support it for the file offsets work). We can have
int64p and int-any-size-p as well. The time functions can be modified
to support either the old-style conses or an int64. The support for
int64 can be gradually grown; when people need it they can implement
it. Scratch the itch.

I'm definitely not an expert on the Emacs internals, so this may be
completely untenable and it's probably been debated to death, but I hope
we can at least get started with an int64 implementation I can use for
large file support.

Thanks
Ted
* Re: view/edit large files
2009-02-10 15:08 ` Ted Zlatanov
@ 2009-02-17 19:23 ` Stefan Monnier
2009-02-17 19:47 ` Eli Zaretskii
0 siblings, 1 reply; 25+ messages in thread
From: Stefan Monnier @ 2009-02-17 19:23 UTC (permalink / raw)
To: Ted Zlatanov; +Cc: emacs-devel

> While time values and file offsets can certainly be represented as
> floats under some constraints, I think it's an inelegant solution.
> This is the chance to have a clean design for support of large integers,
> since I or someone else will be modifying insert-file-contents anyhow.

Using floats has the major advantage that it only requires changes in
insert-file-contents (e.g. try the patch below). Large integers can be
added as well, but it's a mostly orthogonal issue.

Stefan

=== modified file 'src/fileio.c'
--- src/fileio.c 2009-02-11 20:00:50 +0000
+++ src/fileio.c 2009-02-17 19:21:59 +0000
@@ -3161,6 +3161,7 @@
   Lisp_Object old_Vdeactivate_mark = Vdeactivate_mark;
   int we_locked_file = 0;
   int deferred_remove_unwind_protect = 0;
+  off_t beg_offset, end_offset;
 
   if (current_buffer->base_buffer && ! NILP (visit))
     error ("Cannot do file visiting in an indirect buffer");
@@ -3268,12 +3269,12 @@
     }
 
   if (!NILP (beg))
-    CHECK_NUMBER (beg);
+    CHECK_NUMBER_OR_FLOAT (beg);
   else
     XSETFASTINT (beg, 0);
 
   if (!NILP (end))
-    CHECK_NUMBER (end);
+    CHECK_NUMBER_OR_FLOAT (end);
   else
     {
       if (! not_regular)
@@ -3408,6 +3409,8 @@
       set_coding_system = 1;
     }
 
+  beg_offset = FLOATP (beg) ? (off_t) XFLOAT_DATA (beg) : XINT (beg);
+  end_offset = FLOATP (end) ? (off_t) XFLOAT_DATA (end) : XINT (end);
   /* If requested, replace the accessible part of the buffer
      with the file contents. Avoid replacing text at the
      beginning or end of the buffer that matches the file contents;
@@ -3438,9 +3441,9 @@
          give up on handling REPLACE in the optimized way. */
       int giveup_match_end = 0;
 
-      if (XINT (beg) != 0)
+      if (beg_offset != 0)
         {
-          if (lseek (fd, XINT (beg), 0) < 0)
+          if (lseek (fd, beg_offset, 0) < 0)
             report_file_error ("Setting file position",
                                Fcons (orig_filename, Qnil));
         }
@@ -3487,7 +3490,7 @@
       immediate_quit = 0;
       /* If the file matches the buffer completely,
          there's no need to replace anything. */
-      if (same_at_start - BEGV_BYTE == XINT (end))
+      if (same_at_start - BEGV_BYTE == end_offset - beg_offset)
         {
           emacs_close (fd);
           specpdl_ptr--;
@@ -3505,7 +3508,7 @@
           EMACS_INT total_read, nread, bufpos, curpos, trial;
 
           /* At what file position are we now scanning? */
-          curpos = XINT (end) - (ZV_BYTE - same_at_end);
+          curpos = end_offset - (ZV_BYTE - same_at_end);
           /* If the entire file matches the buffer tail, stop the scan. */
           if (curpos == 0)
             break;
@@ -3583,8 +3586,8 @@
         same_at_end += overlap;
 
       /* Arrange to read only the nonmatching middle part of the file. */
-      XSETFASTINT (beg, XINT (beg) + (same_at_start - BEGV_BYTE));
-      XSETFASTINT (end, XINT (end) - (ZV_BYTE - same_at_end));
+      beg_offset += same_at_start - BEGV_BYTE;
+      end_offset -= ZV_BYTE - same_at_end;
 
       del_range_byte (same_at_start, same_at_end, 0);
       /* Insert from the file at the proper position. */
@@ -3628,7 +3631,7 @@
       /* First read the whole file, performing code conversion into
          CONVERSION_BUFFER. */
 
-      if (lseek (fd, XINT (beg), 0) < 0)
+      if (lseek (fd, beg_offset, 0) < 0)
         report_file_error ("Setting file position",
                            Fcons (orig_filename, Qnil));
 
@@ -3803,7 +3806,7 @@
     {
       register Lisp_Object temp;
 
-      total = XINT (end) - XINT (beg);
+      total = end_offset - beg_offset;
 
       /* Make sure point-max won't overflow after this insertion. */
       XSETINT (temp, total);
@@ -3830,9 +3833,9 @@
   if (GAP_SIZE < total)
     make_gap (total - GAP_SIZE);
 
-  if (XINT (beg) != 0 || !NILP (replace))
+  if (beg_offset != 0 || !NILP (replace))
     {
-      if (lseek (fd, XINT (beg), 0) < 0)
+      if (lseek (fd, beg_offset, 0) < 0)
         report_file_error ("Setting file position",
                            Fcons (orig_filename, Qnil));
     }
* Re: view/edit large files
2009-02-17 19:23 ` Stefan Monnier
@ 2009-02-17 19:47 ` Eli Zaretskii
2009-02-17 20:18 ` Miles Bader
2009-02-18 1:56 ` Stefan Monnier
0 siblings, 2 replies; 25+ messages in thread
From: Eli Zaretskii @ 2009-02-17 19:47 UTC (permalink / raw)
To: Stefan Monnier; +Cc: tzz, emacs-devel

> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> Date: Tue, 17 Feb 2009 14:23:32 -0500
> Cc: emacs-devel@gnu.org
>
> + off_t beg_offset, end_offset;

Is off_t guaranteed to be 64-bit wide? If not, we lose the advantage
of the floats, no?

> + beg_offset = FLOATP (beg) ? (off_t) XFLOAT_DATA (beg) : XINT (beg);
> + end_offset = FLOATP (end) ? (off_t) XFLOAT_DATA (end) : XINT (end);

Shouldn't we round rather than truncate, when converting to off_t?

> - if (XINT (beg) != 0)
> + if (beg_offset != 0)

Exact equalities might be dangerous with floats.

> - if (same_at_start - BEGV_BYTE == XINT (end))
> + if (same_at_start - BEGV_BYTE == end_offset - beg_offset)

Likewise.

> - if (XINT (beg) != 0 || !NILP (replace))
> + if (beg_offset != 0 || !NILP (replace))

Likewise.
* Re: view/edit large files
2009-02-17 19:47 ` Eli Zaretskii
@ 2009-02-17 20:18 ` Miles Bader
2009-02-17 20:51 ` Eli Zaretskii
2009-02-18 1:56 ` Stefan Monnier
1 sibling, 1 reply; 25+ messages in thread
From: Miles Bader @ 2009-02-17 20:18 UTC (permalink / raw)
To: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:
>> + off_t beg_offset, end_offset;
>
> Is off_t guaranteed to be 64-bit wide? If not, we lose the advantage
> of the floats, no?

If the system isn't capable of handling large files at all, then there's
no point in worrying about it, right?

>> + beg_offset = FLOATP (beg) ? (off_t) XFLOAT_DATA (beg) : XINT (beg);
>> + end_offset = FLOATP (end) ? (off_t) XFLOAT_DATA (end) : XINT (end);
>
> Shouldn't we round rather than truncate, when converting to off_t?

No. The values being represented are integers. The user almost
certainly will not be passing in a non-integral float; if he is doing
something weird so that he may end up with non-integral offsets, then
it's his job to worry about how such values are interpreted as integer
offsets.

Maybe it should guard against overflow in the conversion though (and
signal an error?).

>> - if (XINT (beg) != 0)
>> + if (beg_offset != 0)
>
> Exact equalities might be dangerous with floats.
>
>> - if (same_at_start - BEGV_BYTE == XINT (end))
>> + if (same_at_start - BEGV_BYTE == end_offset - beg_offset)
> Likewise.
>> - if (XINT (beg) != 0 || !NILP (replace))
>> + if (beg_offset != 0 || !NILP (replace))
> Likewise.

Comparing against zero here is fine -- a float can represent it exactly,
and there's no non-integer calculation to lose accuracy. If there was
overflow in the conversion to off_t, it probably should have been
caught during the conversion.

-Miles

--
Discriminate, v.i. To note the particulars in which one person or thing is,
if possible, more objectionable than another.
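
The overflow guard suggested above could look roughly like the following C sketch. It is not taken from the patch: the helper name `double_to_off_t` and its return convention are assumptions made for illustration. It truncates toward zero, as the cast in the patch does, and rejects NaN, negative values, and values too large for `off_t` before converting, since an out-of-range conversion from double to an integer type is undefined behavior in C.

```c
#include <limits.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper: convert D to an off_t, truncating toward zero.
   Returns 0 and stores the result in *OUT on success; returns -1 if D
   is NaN, negative, or too large to fit in an off_t.  */
int double_to_off_t(double d, off_t *out)
{
    /* Compute 2^(bits-1), the first value beyond the off_t range.
       Powers of two are exactly representable as doubles.  */
    double limit = 1.0;
    for (size_t i = 0; i + 1 < sizeof(off_t) * CHAR_BIT; i++)
        limit *= 2.0;

    if (!(d >= 0.0))        /* also catches NaN */
        return -1;
    if (d >= limit)         /* would overflow off_t */
        return -1;

    *out = (off_t)d;        /* in range, so the cast is well defined */
    return 0;
}
```

In the patch, the failure branch would presumably signal a Lisp error instead of returning -1; that detail is left out here.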
* Re: view/edit large files
2009-02-17 20:18 ` Miles Bader
@ 2009-02-17 20:51 ` Eli Zaretskii
2009-02-17 21:19 ` Miles Bader
0 siblings, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2009-02-17 20:51 UTC (permalink / raw)
To: Miles Bader; +Cc: emacs-devel

> From: Miles Bader <miles@gnu.org>
> Date: Wed, 18 Feb 2009 05:18:03 +0900
>
> Eli Zaretskii <eliz@gnu.org> writes:
> >> + off_t beg_offset, end_offset;
> >
> > Is off_t guaranteed to be 64-bit wide? If not, we lose the advantage
> > of the floats, no?
>
> If the system isn't capable of handling large files at all, then there's
> no point in worrying about it, right?

Some systems can handle large files, but only if you use something
like off64_t.

> >> + beg_offset = FLOATP (beg) ? (off_t) XFLOAT_DATA (beg) : XINT (beg);
> >> + end_offset = FLOATP (end) ? (off_t) XFLOAT_DATA (end) : XINT (end);
> >
> > Shouldn't we round rather than truncate, when converting to off_t?
>
> No. The values being represented are integers. The user almost
> certainly will not be passing in a non-integral float

I was thinking about 1234.99999 or some such, due to inaccuracies in
converting textual representation into a float.

> Maybe it should guard against overflow in the conversion though (and
> signal an error?).

Yes, probably.
* Re: view/edit large files
2009-02-17 20:51 ` Eli Zaretskii
@ 2009-02-17 21:19 ` Miles Bader
2009-02-17 21:21 ` Miles Bader
0 siblings, 1 reply; 25+ messages in thread
From: Miles Bader @ 2009-02-17 21:19 UTC (permalink / raw)
To: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:
>> > Is off_t guaranteed to be 64-bit wide? If not, we lose the advantage
>> > of the floats, no?
>>
>> If the system isn't capable of handling large files at all, then there's
>> no point in worrying about it, right?
>
> Some systems can handle large files, but only if you use something
> like off64_t.

Sure, but using variant interfaces for large-file support is a much
bigger and more intrusive change.

Oh, BTW, of course there's a range of offsets which are still within
32-bits, and are representable by floats but not by emacs integers.

A separate question is whether emacs should try to use something like
_FILE_OFFSET_BITS=64 by default or not (on linux/solaris/... that causes
64-bit variants of off_t, syscalls, etc, to be used even on 32-bit
systems).

>> No. The values being represented are integers. The user almost
>> certainly will not be passing in a non-integral float
>
> I was thinking about 1234.99999 or some such, due to inaccuracies in
> converting textual representation into a float.

This should not happen with integer values (if it does, something is
very wrong).

-Miles

--
Congratulation, n. The civility of envy.
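
The `_FILE_OFFSET_BITS=64` approach mentioned above can be shown in a few lines of C. This is a sketch, assuming a system such as Linux with glibc where the macro is honored; elsewhere it may have no effect. The macro has to be defined before any system header is included (or passed on the compiler command line with `-D`). The helper name `off_t_bits` is hypothetical.

```c
/* Request 64-bit file offsets; this must come before any #include.  */
#define _FILE_OFFSET_BITS 64

#include <limits.h>
#include <sys/types.h>

/* Hypothetical helper: report how many bits the platform's off_t has
   under the current compilation settings.  */
int off_t_bits(void)
{
    return (int)(sizeof(off_t) * CHAR_BIT);
}
```

On a 32-bit glibc build this is expected to report 64 with the define in place and 32 without it; on 64-bit systems `off_t` is typically 64 bits either way.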
* Re: view/edit large files
2009-02-17 21:19 ` Miles Bader
@ 2009-02-17 21:21 ` Miles Bader
2009-02-18 4:09 ` Eli Zaretskii
0 siblings, 1 reply; 25+ messages in thread
From: Miles Bader @ 2009-02-17 21:21 UTC (permalink / raw)
To: emacs-devel

Miles Bader <miles@gnu.org> writes:
> Oh, BTW, of course there's a range of offsets which are still within
> 32-bits, and are representable by floats but not by emacs integers.

When I say "float", btw, I of course mean "double"... :-/

-Miles

--
97% of everything is grunge
* Re: view/edit large files
2009-02-17 21:21 ` Miles Bader
@ 2009-02-18 4:09 ` Eli Zaretskii
0 siblings, 0 replies; 25+ messages in thread
From: Eli Zaretskii @ 2009-02-18 4:09 UTC (permalink / raw)
To: Miles Bader; +Cc: emacs-devel

> From: Miles Bader <miles@gnu.org>
> Date: Wed, 18 Feb 2009 06:21:26 +0900
>
> Miles Bader <miles@gnu.org> writes:
> > Oh, BTW, of course there's a range of offsets which are still within
> > 32-bits, and are representable by floats but not by emacs integers.
>
> When I say "float", btw, I of course mean "double"... :-/

So did I.
* Re: view/edit large files
2009-02-17 19:47 ` Eli Zaretskii
2009-02-17 20:18 ` Miles Bader
@ 2009-02-18 1:56 ` Stefan Monnier
2009-02-20 19:23 ` Ted Zlatanov
1 sibling, 1 reply; 25+ messages in thread
From: Stefan Monnier @ 2009-02-18 1:56 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: tzz, emacs-devel

> Is off_t guaranteed to be 64-bit wide? If not, we lose the advantage
> of the floats, no?

I wouldn't worry about it for now. This is just a quick patch, barely
tested (I wrote it a while ago, but haven't actually used it). "off_t"
is what is used by "lseek", so if it's not enough, we need
further changes.

In any case, this was a mistake: it was only intended to be sent
to Ted. We're in pretest, so we shouldn't waste time on such things.

Stefan
* Re: view/edit large files
2009-02-18 1:56 ` Stefan Monnier
@ 2009-02-20 19:23 ` Ted Zlatanov
0 siblings, 0 replies; 25+ messages in thread
From: Ted Zlatanov @ 2009-02-20 19:23 UTC (permalink / raw)
To: emacs-devel

On Tue, 17 Feb 2009 20:56:59 -0500 Stefan Monnier <monnier@iro.umontreal.ca> wrote:

>> Is off_t guaranteed to be 64-bit wide? If not, we lose the advantage
>> of the floats, no?

SM> I wouldn't worry about it for now. This is just a quick patch, barely
SM> tested (I wrote it a while ago, but haven't actually used it). "off_t"
SM> is what is used by "lseek", so if it's not enough, we need
SM> further changes.

SM> In any case, this was a mistake: it was only intended to be sent
SM> to Ted. We're in pretest, so we shouldn't waste time on such things.

Thanks for the information. I will wait until the release to get this
discussion going again (I also have the hashtable read support patch
waiting for that).

Ted
* Re: view/edit large files
2009-02-10 9:23 ` Miles Bader
2009-02-10 9:54 ` Eli Zaretskii
@ 2009-02-10 12:28 ` Eli Zaretskii
2009-02-10 12:46 ` Miles Bader
1 sibling, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2009-02-10 12:28 UTC (permalink / raw)
To: Miles Bader; +Cc: tzz, monnier, emacs-devel

> From: Miles Bader <miles@gnu.org>
> Date: Tue, 10 Feb 2009 18:23:58 +0900
> Cc: tzz@lifelogs.com, Stefan Monnier <monnier@iro.umontreal.ca>,
> emacs-devel@gnu.org
>
> Eli Zaretskii <eliz@gnu.org> writes:
> > Yes, we do return a float for size. But for some attributes, like
> > inode, floats are not a good idea, because inodes are habitually
> > compared for exact equality. I'm not sure time values need that
> > measure of accuracy, though.
>
> "floats" can exactly represent integers if the integer quantity fits
> within the mantissa.

On second thought, I don't think I agree. For example an integer
number as small and "simple" as 5 does not have an exact
representation as an IEEE floating-point number, right?
* Re: view/edit large files
2009-02-10 12:28 ` Eli Zaretskii
@ 2009-02-10 12:46 ` Miles Bader
0 siblings, 0 replies; 25+ messages in thread
From: Miles Bader @ 2009-02-10 12:46 UTC
To: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:
>> "floats" can exactly represent integers if the integer quantity fits
>> within the mantissa.
>
> On second thought, I don't think I agree. For example an integer
> number as small and "simple" as 5 does not have an exact
> representation as an IEEE floating-point number, right?

Yes, it does. All integers which fit into the mantissa (plus some
others) are exactly representable.

-Miles

--
`There are more things in heaven and earth, Horatio,
Than are dreamt of in your philosophy.'
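Miles's claim can be checked directly. The following is an illustrative sketch, not code from the thread or from Emacs: it uses Python, whose `float` is an IEEE 754 double on common platforms, and a hypothetical helper `is_exact` that tests whether an integer survives a round trip through a float.

```python
import sys

def is_exact(n: int) -> bool:
    """Return True if integer n survives a round trip through a float."""
    return int(float(n)) == n

# Number of significand ("mantissa") bits in a Python float.
# This is 53 for an IEEE 754 double; other platforms are an assumption away.
MANT_BITS = sys.float_info.mant_dig

# 5 is 1.25 * 2**2 in binary floating point, so it is stored exactly.
print(is_exact(5))

# Every integer up to 2**MANT_BITS is exact ...
print(is_exact(2**MANT_BITS))

# ... the next odd integer is not, because it needs one more bit ...
print(is_exact(2**MANT_BITS + 1))

# ... but some larger integers (the "plus some others") still are.
print(is_exact(2**MANT_BITS + 2))
```

With a 53-bit significand, any byte offset below 2**53 (8 PiB) can be held in a float without loss, which is the property the discussion about returning floats for file sizes relies on.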
* Re: view/edit large files (was: map-file-lines)
2009-02-06 18:42 ` view/edit large files (was: map-file-lines) Ted Zlatanov
2009-02-06 21:06 ` view/edit large files Ted Zlatanov
@ 2009-02-07 9:14 ` Richard M Stallman
2009-02-09 20:26 ` view/edit large files Ted Zlatanov
1 sibling, 1 reply; 25+ messages in thread
From: Richard M Stallman @ 2009-02-07 9:14 UTC
To: Ted Zlatanov; +Cc: emacs-devel

    Could you please explain, with code or text, what using your UI would
    look like?

Describe it in text is what I thought I did.
I don't have time to implement it, though.

Your API seems to be aimed at a lower level,
so the two could work together.
* Re: view/edit large files
2009-02-07 9:14 ` view/edit large files (was: map-file-lines) Richard M Stallman
@ 2009-02-09 20:26 ` Ted Zlatanov
2009-02-10 20:02 ` Richard M Stallman
0 siblings, 1 reply; 25+ messages in thread
From: Ted Zlatanov @ 2009-02-09 20:26 UTC
To: emacs-devel

On Sat, 07 Feb 2009 04:14:27 -0500 Richard M Stallman <rms@gnu.org> wrote:

RMS> Describe it in text is what I thought I did.
RMS> I don't have time to implement it, though.

I quote your suggestion here:

> Here's an idea for a UI for editing big files. First you run M-x grep on
> the file, and display the matches for whatever regexp. In the *grep*
> buffer you specify a region, which is a way of choosing two matches,
> the ones whose entries contain point and mark. Then you give a command to edit
> the file from one of the matches to the other. It marks these matches
> (and the lines containing them) as read-only so that you can't
> spoil the correspondence with the file. Thus, you can always save this
> partial-file buffer.

> The beginning and end of the *grep* buffer can be used to specify
> that the portion to edit starts or ends at bof or eof.

I didn't understand your suggestion fully initially, sorry. I think
you're suggesting that the user should pick a "window" between two
places in the file. Then the user can only edit the file between those
two places. That `grep' is producing the list of places is not as
important as the idea of having those places.

RMS> Your API seems to be aimed at a lower level,
RMS> so the two could work together.

Yes, I believe so. I'll try to implement the items I listed and perhaps
someone can use them productively, to implement the UI you suggested or
something else.

Thanks for your help and time.

Ted
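The "list of places" that grep would produce can be reduced to byte offsets, as suggested earlier in the thread. Here is a minimal sketch of that step in Python. It is an illustration only, not the proposed Emacs API: `match_offsets` is a hypothetical helper, and it assumes the places are whole lines matching a regexp.

```python
import re
from typing import List, Tuple

def match_offsets(path: str, pattern: bytes) -> List[Tuple[int, int]]:
    """Return (start, end) byte offsets of every line matching pattern.

    The file is read in binary mode and streamed line by line, so it is
    never loaded into memory as a whole.
    """
    regexp = re.compile(pattern)
    places = []
    offset = 0
    with open(path, "rb") as f:
        for line in f:
            if regexp.search(line):
                # end is exclusive and includes the line terminator
                places.append((offset, offset + len(line)))
            offset += len(line)
    return places
```

Choosing two entries from the returned list corresponds to choosing the two matches that contain point and mark in the *grep* buffer.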
* Re: view/edit large files
2009-02-09 20:26 ` view/edit large files Ted Zlatanov
@ 2009-02-10 20:02 ` Richard M Stallman
0 siblings, 0 replies; 25+ messages in thread
From: Richard M Stallman @ 2009-02-10 20:02 UTC
To: Ted Zlatanov; +Cc: emacs-devel

    I didn't understand your suggestion fully initially, sorry. I think
    you're suggesting that the user should pick a "window" between two
    places in the file. Then the user can only edit the file between those
    two places. That `grep' is producing the list of places is not as
    important as the idea of having those places.

Yes, that's it.

A further part of the idea is that the matching lines that mark the
beginning and the end of the segment would be read-only, to prevent
confusion in what it means to save the file back.
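One way to picture the read-only boundary lines is the Python sketch below. It is an assumption-laden illustration, not anything implemented in the thread: `read_window` and `save_window` are hypothetical names, each boundary is a `(start, end)` pair of byte offsets for a whole line, and the first boundary is assumed to end at or before the start of the second. Saving goes through a temporary file in the same directory, which is only one of several possible strategies.

```python
import os
import shutil
import tempfile

def read_window(path, first, last):
    """Return (head_line, body, tail_line) for the segment between two places.

    Only the bytes from first[0] to last[1] are read from the file.
    """
    with open(path, "rb") as f:
        f.seek(first[0])
        data = f.read(last[1] - first[0])
    head_len = first[1] - first[0]
    body_end = len(data) - (last[1] - last[0])
    return data[:head_len], data[head_len:body_end], data[body_end:]

def save_window(path, first, last, new_body):
    """Replace the bytes between the two boundary lines with new_body.

    The boundary lines themselves are copied unchanged, which is what
    keeping them read-only guarantees.
    """
    directory = os.path.dirname(os.path.abspath(path))
    tmp_fd, tmp_path = tempfile.mkstemp(dir=directory)
    with open(path, "rb") as src, os.fdopen(tmp_fd, "wb") as dst:
        remaining = first[1]            # everything up to the end of line 1
        while remaining > 0:
            chunk = src.read(min(remaining, 1 << 20))
            if not chunk:
                break
            dst.write(chunk)
            remaining -= len(chunk)
        dst.write(new_body)             # the edited segment
        src.seek(last[0])               # resume at the second boundary line
        shutil.copyfileobj(src, dst)
    os.replace(tmp_path, path)
```

Because the boundary lines are never modified, the saved segment always fits back between them. After a save that changes the body length, the offsets of the second boundary shift by the difference and would need to be recomputed before the window is moved.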