unofficial mirror of emacs-devel@gnu.org 
* where to put eol kludge
@ 2003-05-15 23:05 Thien-Thi Nguyen
  2003-05-15 23:49 ` Kenichi Handa
  0 siblings, 1 reply; 7+ messages in thread
From: Thien-Thi Nguyen @ 2003-05-15 23:05 UTC (permalink / raw)


(thinking out loud...)

the vms run-time `write' suffixes a gratuitous newline that for emacs 19
was worked around by adjusting point to be at a newline, and silently
suffering lack of workaround for gratuitous eof newline (see comment in
`Fwrite_region').  in trying to adapt this kludge to emacs 21, i gather
that the emacs 21 write path is now:

	Fwrite_region
	a_write
	e_write
	emacs_write
	sys_write
	write		(vms)

since emacs_write calls `encode_coding' (which eventually does eol
encoding handling selective-display as well), i wonder if that might be
a better place to move the kludge, than from the high level where it is
currently.  basically what is needed is to adjust the end of the buffer
that `write' sees so that it ends w/ newline.

of course, along the way we should write an autoconf test to detect the
necessity for such a kludge (i.e., just how broken `write' is) and then
omit it when not needed, but that's a side issue...
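
a minimal sketch of what such a probe could look like, assuming the
breakage shows up as extra bytes in the file size after a `write' whose
buffer lacks a trailing newline.  the function name
`write_adds_newline' and the probe path are hypothetical, and whether
`fstat' reflects the appended byte on vms is itself an assumption:

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Probe whether write() leaves more bytes in a file than it was given.
   Returns 1 if the file grew beyond the single byte written, 0 if the
   size matches, and -1 if the probe itself failed.  PATH names a
   scratch file that is created and then removed.  */
int
write_adds_newline (const char *path)
{
  static const char probe[] = "x";	/* one byte, no trailing newline */
  struct stat st;
  int result = -1;
  int fd;

  assert (path != NULL);
  fd = open (path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
  if (fd < 0)
    return -1;
  if (write (fd, probe, 1) == 1 && fstat (fd, &st) == 0)
    result = st.st_size > 1;
  close (fd);
  unlink (path);
  return result;
}
```

a configure run-test could wrap this in a `main' that exits with the
result, and define the kludge macro only when the answer is 1.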

thi


* Re: where to put eol kludge
  2003-05-15 23:05 where to put eol kludge Thien-Thi Nguyen
@ 2003-05-15 23:49 ` Kenichi Handa
  2003-05-16 10:20   ` Thien-Thi Nguyen
  0 siblings, 1 reply; 7+ messages in thread
From: Kenichi Handa @ 2003-05-15 23:49 UTC (permalink / raw)
  Cc: emacs-devel

In article <E19GRnS-0002Mg-00@colo.agora-net.com>, Thien-Thi Nguyen <ttn@glug.org> writes:
> (thinking out loud...)
> the vms run-time `write' suffixes a gratuitous newline that for emacs 19
> was worked around by adjusting point to be at a newline, and silently
> suffering lack of workaround for gratuitous eof newline (see comment in
> `Fwrite_region').  in trying to adapt this kludge to emacs 21, i gather
> that the emacs 21 write path is now:

> 	Fwrite_region
> 	a_write
> 	e_write
> 	emacs_write
> 	sys_write
> 	write		(vms)

> since emacs_write calls `encode_coding' (which eventually does eol
> encoding handling selective-display as well), i wonder if that might be
> a better place to move the kludge, than from the high level where it is
> currently.  basically what is needed is to adjust the end of the buffer
> that `write' sees so that it ends w/ newline.

It's e_write, not emacs_write, that calls encode_coding.
Anyway, I agree that emacs_write is the better place for
such a job.

---
Ken'ichi HANDA
handa@m17n.org


* Re: where to put eol kludge
  2003-05-15 23:49 ` Kenichi Handa
@ 2003-05-16 10:20   ` Thien-Thi Nguyen
  2003-05-16 11:07     ` Kenichi Handa
  0 siblings, 1 reply; 7+ messages in thread
From: Thien-Thi Nguyen @ 2003-05-16 10:20 UTC (permalink / raw)
  Cc: emacs-devel

Kenichi Handa <handa@m17n.org> writes:

   It's e_write, not emacs_write, that calls encode_coding.
   Anyway, I agree that emacs_write is the better place for
   such a job.

oops, my bad.

in any case, it seems to me the criteria for the best place for the
kludge would be where chunking of the write occurs, since the kludge has
the re-chunking nature.  which indicates `e_write' after all.  in other
words, something like:

  result = encode_coding (...);
  /* kludge to adjust end of write buffer by way of
     adjusting coding->produced goes here */
  if (coding->produced)
    ...

this way, emacs_write can remain dumb.  otherwise, to do the kludge at
emacs_write level requires extra information to be passed in, saying
whether or not "this is the last write" and information to be passed
out, saying "here are the leftover unwritten bytes which you can use in
the next call since this call was not the last write".

if this kludge had more general use, i might even suggest moving it into
encode_coding for efficiency (i.e., rather than throw away the tail of
the encoding after eol, add a flag to stop encoding at the "last" eol).
but on second thought probably determining last eol reliably would not
preclude processing post-last-eol anyway...

thi


* Re: where to put eol kludge
  2003-05-16 10:20   ` Thien-Thi Nguyen
@ 2003-05-16 11:07     ` Kenichi Handa
  2003-05-16 12:14       ` Thien-Thi Nguyen
  0 siblings, 1 reply; 7+ messages in thread
From: Kenichi Handa @ 2003-05-16 11:07 UTC (permalink / raw)
  Cc: emacs-devel

In article <jk1xyzgplx.fsf@glug.org>, Thien-Thi Nguyen <ttn@glug.org> writes:
> in any case, it seems to me the criteria for the best place for the
> kludge would be where chunking of the write occurs, since the kludge has
> the re-chunking nature.  which indicates `e_write' after all.  in other
> words, something like:

>   result = encode_coding (...);
>   /* kludge to adjust end of write buffer by way of
>      adjusting coding->produced goes here */
>   if (coding->produced)
>     ...

I'd like to avoid introducing "#ifdef VMS" newly in a
function that doesn't have it currently.

By the way, I've just found this code in sysdep.c.

/*
 *	VAX/VMS VAX C RTL really loses. It insists that records
 *      end with a newline (carriage return) character, and if they
 *	don't it adds one (nice of it isn't it!)
 *
 *	Thus we do this stupidity below.
 */

#undef write
int
sys_write (fildes, buf, nbytes)

Isn't it working?

> if this kludge had more general use, i might even suggest moving it into
> encode_coding for efficiency (i.e., rather than throw away the tail of
> the encoding after eol, add a flag to stop encoding at the "last" eol).
> but on second thought probably determining last eol reliably would not
> preclude processing post-last-eol anyway...

Encoding is not stateless, so, as I wrote before, it's not
easy to rewind the internal encoding state to the point of
last eol.

Anyway, we don't have to throw away the tail bytes of the
encoding after eol even if we do that in e_write.  We can
simply keep those bytes as carryover, move them to the head
of `buf', then, call encode_coding as this:
   encode_coding (coding, addr, buf + CARRY_OVER_BYTES, 
		  nbytes, WRITE_BUF_SIZE - CARRY_OVER_BYTES);
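
The carry-over bookkeeping described above can be sketched roughly as
follows.  This is only an illustration under assumptions: the helper
names `bytes_through_last_newline' and `carry_tail_to_head' are
hypothetical, and the real e_write and encode_coding are not
reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return how many leading bytes of BUF[0..LEN) can be written so that
   the write ends with a newline; 0 if BUF holds no newline at all.  */
size_t
bytes_through_last_newline (const char *buf, size_t len)
{
  while (len > 0 && buf[len - 1] != '\n')
    len--;
  return len;
}

/* After the first WRITTEN bytes of BUF[0..LEN) have gone out, slide
   the remaining tail to the head of BUF and return its length.  The
   next encoding step would then produce its output at BUF + that
   length, with correspondingly less room available.  */
size_t
carry_tail_to_head (char *buf, size_t len, size_t written)
{
  size_t carry;

  assert (written <= len);
  carry = len - written;
  memmove (buf, buf + written, carry);	/* regions may overlap */
  return carry;
}
```

For a buffer holding "ab\ncd", the first helper reports 3 writable
bytes and the second leaves "cd" at the head with a carry of 2.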
                 
---
Ken'ichi HANDA
handa@m17n.org


* Re: where to put eol kludge
  2003-05-16 11:07     ` Kenichi Handa
@ 2003-05-16 12:14       ` Thien-Thi Nguyen
  2003-05-16 13:06         ` Kenichi Handa
  0 siblings, 1 reply; 7+ messages in thread
From: Thien-Thi Nguyen @ 2003-05-16 12:14 UTC (permalink / raw)
  Cc: emacs-devel

   From: Kenichi Handa <handa@m17n.org>
   Date: Fri, 16 May 2003 20:07:47 +0900 (JST)

   I'd like to avoid introducing "#ifdef VMS" newly in a
   function that doesn't have it currently.

fully agreed.  generally this might be re-conceptualized as "#if
CODING_STOPS_AT_LAST_EOL", where vms just happens to be one beneficiary
(maybe there will be others).  the word "stops" is not 100% accurate,
unfortunately.

   Isn't [sysdep.c `sys_write'] working?

it used to work because there was no chunking involved between
Fwrite_region and sys_write; basically, the end of the buffer given to
sys_write was taken to be EOF (or the gap), and so sys_write could do
the eol adjustments directly.  (the code in Fwrite_region is only part
of the kludge!)  since coding introduces chunking, end of buffer given
to sys_write can no longer be assumed to be EOF (or gap).
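
to make the failure concrete, here is a toy model (not the vms
run-time and not the sysdep.c code): `record_write' is a hypothetical
stand-in for a `write' that appends a newline whenever its buffer does
not end with one, and `struct sink' is an in-memory file.  the chunk
size is likewise an assumption chosen to split a line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* In-memory stand-in for the output file.  */
struct sink
{
  char data[64];
  size_t len;
};

/* Model of a record-oriented write: copy N bytes, then append a
   newline if the buffer did not already end with one.  */
void
record_write (struct sink *s, const char *buf, size_t n)
{
  assert (s->len + n + 1 <= sizeof s->data);
  memcpy (s->data + s->len, buf, n);
  s->len += n;
  if (n > 0 && buf[n - 1] != '\n')
    s->data[s->len++] = '\n';
}

/* Write BUF[0..N) through record_write in pieces of at most CHUNK
   bytes, the way an intermediate encoding buffer would.  */
void
write_in_chunks (struct sink *s, const char *buf, size_t n, size_t chunk)
{
  assert (chunk > 0);
  while (n > 0)
    {
      size_t step = n < chunk ? n : chunk;

      record_write (s, buf, step);
      buf += step;
      n -= step;
    }
}
```

writing "ab\ncd\n" in one piece stores 6 bytes unchanged, but in
chunks of 4 the first piece "ab\nc" ends mid-line and the model stores
"ab\nc\nd\n", i.e. a spurious newline inside the second line.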

it seems i mis-stated the direction of the proposed kludge motion;
rather than moving it higher to lower, we need to keep its pre-amble
where it is (in Fwrite_region), and move its body from lower to higher,
integrating it w/ the chunking that goes on as part of encode_coding.

   Encoding is not stateless, so, as I wrote before, it's not
   easy to rewind the internal encoding state to the point of
   last eol.

ok.

   Anyway, we don't have to throw away the tail bytes of the
   encoding after eol even if we do that in e_write.  We can
   simply keep those bytes as carryover, move them to the head
   of `buf', then, call encode_coding as this:
      encode_coding (coding, addr, buf + CARRY_OVER_BYTES, 
		     nbytes, WRITE_BUF_SIZE - CARRY_OVER_BYTES);

eventually this is what i realized as well -- i'm glad you confirmed my
inklings!  how about something like:

  struct coding_system
  { ...
  #ifdef CODING_CARRIES_AFTER_LAST_EOL
    /* Last EOL.  */
    char *last_eol;
    /* Number of bytes to carry over after the last EOL encountered.  */
    int carry_over_bytes;
  #endif
  };

then macros EMIT_ONE_BYTE and EMIT_TWO_BYTES would set last_eol,
and encode_coding would need:

  #ifdef CODING_CARRIES_AFTER_LAST_EOL
    coding->carry_over_bytes = coding->produced - coding->last_eol;
  #endif

(rough sketch.)

thi


* Re: where to put eol kludge
  2003-05-16 12:14       ` Thien-Thi Nguyen
@ 2003-05-16 13:06         ` Kenichi Handa
  2003-05-19 14:50           ` Thien-Thi Nguyen
  0 siblings, 1 reply; 7+ messages in thread
From: Kenichi Handa @ 2003-05-16 13:06 UTC (permalink / raw)
  Cc: emacs-devel

In article <E19Ge6O-0007qf-00@colo.agora-net.com>, Thien-Thi Nguyen <ttn@glug.org> writes:
>    I'd like to avoid introducing "#ifdef VMS" newly in a
>    function that doesn't have it currently.

> fully agreed.  generally this might be re-conceptualized as "#if
> CODING_STOPS_AT_LAST_EOL", where vms just happens to be one beneficiary
> (maybe there will be others).  the word "stops" is not 100% accurate,
> unfortunately.

Please note that this problem is not related to the task of
encode_coding.  It's wrong to bring an unrelated task into
encode_coding.

And encode_coding is also used for encode-coding-region
and encode-coding-string, which don't have to pay attention to
this problem.

In addition, the chunking is introduced not only by
encode_coding but by annotate functions which may produce a
very short chunk that doesn't contain a newline.

So, I think we should handle it in the place of "file
writing".

>   struct coding_system
>   { ...
>   #ifdef CODING_CARRIES_AFTER_LAST_EOL
>     /* Last EOL.  */
>     char *last_eol;
>     /* Number of bytes to carry over after the last EOL encountered.  */
>     int carry_over_bytes;
>   #endif
>   };

> then macros EMIT_ONE_BYTE and EMIT_TWO_BYTES would set last_eol,
> and encode_coding would need:

>   #ifdef CODING_CARRIES_AFTER_LAST_EOL
>     coding->carry_over_bytes = coding->produced - coding->last_eol;
>   #endif

encode_coding is designed so that it can be called with a
different input buffer.  So, if we handle this problem in
encode_coding, we have to remember the carry-over bytes
themselves (not only their length) in the struct coding_system
by allocating memory for them, copying them, and re-copying them
to the output buffer on the next encoding.  That is quite
inefficient.  Even if we do that, it can't solve the problem
of the short chunk produced by an annotate function.

So, I strongly suggest to forget about changing
encode_coding in such a way.

Here's a suggestion to solve this problem.

(1) Detect this problem in configure, and define, say,
    WRITE_ADD_NEWLINE on a problematic system.

(2) Write a different version of emacs_write in sysdep.c for
    such a system.  It keeps carry over bytes in a static
    buffer, and flushes out the bytes when called with
    NBYTES == 0.

(3) Modify e_write to call emacs_write with NBYTES == 0
    at the end.
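
Steps (2) and (3) could be sketched roughly as below.  This is a toy
model under assumptions, not code from sysdep.c: `lossy_write' is a
hypothetical stand-in for the problematic `write', `struct
record_file' is an in-memory file, and CARRY_MAX is an arbitrary
limit on the length of a carried partial line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CARRY_MAX 256		/* hypothetical limit on a carried line */

struct record_file
{
  char data[256];
  size_t len;
};

/* Model of the problematic write: append a newline if BUF lacks one.  */
void
lossy_write (struct record_file *f, const char *buf, size_t n)
{
  assert (f->len + n + 1 <= sizeof f->data);
  memcpy (f->data + f->len, buf, n);
  f->len += n;
  if (n > 0 && buf[n - 1] != '\n')
    f->data[f->len++] = '\n';
}

/* Buffering wrapper: pass only newline-terminated data down, keep any
   trailing partial line in a static buffer, and flush that buffer
   when called with NBYTES == 0.  */
void
buffering_write (struct record_file *f, const char *buf, size_t nbytes)
{
  static char carry[CARRY_MAX];
  static size_t carried;
  size_t end;

  if (nbytes == 0)
    {
      /* Final flush; a last line without a newline still gets one.  */
      if (carried > 0)
	lossy_write (f, carry, carried);
      carried = 0;
      return;
    }
  if (carried > 0)
    {
      /* Complete the carried partial line from the head of BUF.  */
      const char *nl = memchr (buf, '\n', nbytes);
      size_t take = nl ? (size_t) (nl - buf) + 1 : nbytes;

      assert (carried + take <= CARRY_MAX);
      memcpy (carry + carried, buf, take);
      carried += take;
      buf += take;
      nbytes -= take;
      if (!nl)
	return;			/* still no newline; keep accumulating */
      lossy_write (f, carry, carried);
      carried = 0;
    }
  /* Write through the last newline and stash whatever follows it.  */
  end = nbytes;
  while (end > 0 && buf[end - 1] != '\n')
    end--;
  if (end > 0)
    lossy_write (f, buf, end);
  assert (nbytes - end <= CARRY_MAX);
  memcpy (carry, buf + end, nbytes - end);
  carried = nbytes - end;
}
```

Feeding it "ab\nc", "d\nef", "g\n" and then a zero-length call stores
exactly "ab\ncd\nefg\n", because every call that reaches the
underlying write ends with a newline.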

---
Ken'ichi HANDA
handa@m17n.org


* Re: where to put eol kludge
  2003-05-16 13:06         ` Kenichi Handa
@ 2003-05-19 14:50           ` Thien-Thi Nguyen
  0 siblings, 0 replies; 7+ messages in thread
From: Thien-Thi Nguyen @ 2003-05-19 14:50 UTC (permalink / raw)
  Cc: emacs-devel

Kenichi Handa <handa@m17n.org> writes:

   [encoding is not the right place to move the kludge.]

thanks for the lucid explanation.  i understand this more clearly now.

   So, I think we should handle it in the place of "file
   writing".  [...] strongly suggest to forget about changing
   encode_coding in such a way.

   [create/use a buffering version of emacs_write]

thanks, i've followed this suggestion (w/ a minor change) and it seems
to be working well.  on to the kludge... :-)

thi

