unofficial mirror of emacs-devel@gnu.org 
* strip extraneous CR characters
@ 2009-09-28 15:01 Ted Zlatanov
  2009-09-28 15:54 ` Eli Zaretskii
  0 siblings, 1 reply; 5+ messages in thread
From: Ted Zlatanov @ 2009-09-28 15:01 UTC
  To: emacs-devel

This code in Gnus' nnheader.el was also useful for the imap-hash.el I
put into Emacs recently:

;; from nnheader.el
(defsubst imap-hash-remove-cr-followed-by-lf ()
  (goto-char (point-max))
  (while (search-backward "\r\n" nil t)
    (delete-char 1)))

;; from nnheader.el
(defun imap-hash-ms-strip-cr (&optional string)
  "Strip ^M from the end of all lines in current buffer or STRING."
  (if string
    (with-temp-buffer
      (insert string)
      (imap-hash-remove-cr-followed-by-lf)
      (buffer-string))
    (save-excursion
      (imap-hash-remove-cr-followed-by-lf))))

I wonder if it makes sense to define these functions globally?  They are
not trivial, though the implementation is short.

Thanks
Ted
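
For readers following the thread, here is a minimal sketch of the string case only. It uses `replace-regexp-in-string` instead of the backward buffer search in the functions above, so it is a swapped-in technique, not the nnheader.el code. The name `my-strip-cr-before-lf` is hypothetical and exists in neither nnheader.el nor imap-hash.el.

```elisp
;; Hypothetical helper for illustration; not from nnheader.el or imap-hash.el.
(defun my-strip-cr-before-lf (string)
  "Return a copy of STRING with each CR that directly precedes an LF removed.
A CR that is not followed by an LF is left in place."
  ;; FIXEDCASE and LITERAL are both t: no case conversion of the
  ;; replacement, and the replacement is inserted verbatim.
  (replace-regexp-in-string "\r\n" "\n" string t t))

(my-strip-cr-before-lf "a\r\nb\rc\r\n")
;; => "a\nb\rc\n"
```

Like the buffer-based version, this touches only a CR that ends a line; the lone CR between "b" and "c" survives.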






* Re: strip extraneous CR characters
  2009-09-28 15:01 strip extraneous CR characters Ted Zlatanov
@ 2009-09-28 15:54 ` Eli Zaretskii
  2009-09-28 17:49   ` Ted Zlatanov
  0 siblings, 1 reply; 5+ messages in thread
From: Eli Zaretskii @ 2009-09-28 15:54 UTC
  To: Ted Zlatanov; +Cc: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Mon, 28 Sep 2009 10:01:24 -0500
> 
> ;; from nnheader.el
> (defsubst imap-hash-remove-cr-followed-by-lf ()
>   (goto-char (point-max))
>   (while (search-backward "\r\n" nil t)
>     (delete-char 1)))
> 
> ;; from nnheader.el
> (defun imap-hash-ms-strip-cr (&optional string)
>   "Strip ^M from the end of all lines in current buffer or STRING."
>   (if string
>     (with-temp-buffer
>       (insert string)
>       (imap-hash-remove-cr-followed-by-lf)
>       (buffer-string))
>     (save-excursion
>       (imap-hash-remove-cr-followed-by-lf))))
> 
> I wonder if it makes sense to define these functions globally?  They are
> not trivial, though the implementation is short.

Why are these needed, when we have the EOL decoding as part of
inserting text into the buffer since a long time ago?  And if the
initial decode somehow didn't DTRT, either fix that or decode it
again.

When will this paradigm not work?






* Re: strip extraneous CR characters
  2009-09-28 15:54 ` Eli Zaretskii
@ 2009-09-28 17:49   ` Ted Zlatanov
  2009-09-28 18:10     ` Eli Zaretskii
  2009-09-29  0:44     ` Stefan Monnier
  0 siblings, 2 replies; 5+ messages in thread
From: Ted Zlatanov @ 2009-09-28 17:49 UTC
  To: emacs-devel

On Mon, 28 Sep 2009 17:54:47 +0200 Eli Zaretskii <eliz@gnu.org> wrote: 

>> From: Ted Zlatanov <tzz@lifelogs.com>
>> Date: Mon, 28 Sep 2009 10:01:24 -0500
>> 
>> ;; from nnheader.el
>> (defsubst imap-hash-remove-cr-followed-by-lf ()
>>   (goto-char (point-max))
>>   (while (search-backward "\r\n" nil t)
>>     (delete-char 1)))
>> 
>> ;; from nnheader.el
>> (defun imap-hash-ms-strip-cr (&optional string)
>>   "Strip ^M from the end of all lines in current buffer or STRING."
>>   (if string
>>     (with-temp-buffer
>>       (insert string)
>>       (imap-hash-remove-cr-followed-by-lf)
>>       (buffer-string))
>>     (save-excursion
>>       (imap-hash-remove-cr-followed-by-lf))))
>> 
>> I wonder if it makes sense to define these functions globally?  They are
>> not trivial, though the implementation is short.

EZ> Why are these needed, when we have the EOL decoding as part of
EZ> inserting text into the buffer since a long time ago?  And if the
EZ> initial decode somehow didn't DTRT, either fix that or decode it
EZ> again.

EZ> When will this paradigm not work?

IMAP has CR characters explicitly in the standard.  imap.el passes those
down in every message body and in the headers.  I don't know why imap.el
doesn't use automatic EOL decoding (perhaps to preserve every aspect of
the original data).  I don't need the original CR characters for my
purposes so probably it's better to do the decoding in imap-hash.el
instead of imap.el, which is used by many other packages.

Where can I find an example of this EOL decoding from DOS, to ensure I
am doing it correctly?

Thanks
Ted






* Re: strip extraneous CR characters
  2009-09-28 17:49   ` Ted Zlatanov
@ 2009-09-28 18:10     ` Eli Zaretskii
  2009-09-29  0:44     ` Stefan Monnier
  1 sibling, 0 replies; 5+ messages in thread
From: Eli Zaretskii @ 2009-09-28 18:10 UTC
  To: Ted Zlatanov; +Cc: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Mon, 28 Sep 2009 12:49:21 -0500
> 
> Where can I find an example of this EOL decoding from DOS, to ensure I
> am doing it correctly?

It's very simple:

  (let ((coding-system-for-read 'undecided-dos))
    (insert-file-contents ....))

or:

  (let ((coding-system-for-read 'undecided-dos))
    (decode-coding-inserted-region ....))

or:

  (decode-coding-region start end 'undecided-dos ...)

or (if you must use a string, which I personally advise against):

  (decode-coding-string STRING 'undecided-dos ...)

If none of the above fits the bill, please tell us more about what you
need to accomplish: where the text with CRs is held originally, and
where you want the text with the CRs stripped to end up.
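
As a concrete instance of the last two variants above, the following sketch wraps them in two small functions. The function names and the sample header text are made up for illustration; only `decode-coding-string`, `decode-coding-region` and the `undecided-dos` coding system come from the message above.

```elisp
;; Hypothetical wrappers for illustration only.
(defun my-decode-dos-string (string)
  "Return STRING decoded with `undecided-dos' (string variant)."
  (decode-coding-string string 'undecided-dos))

(defun my-decode-dos-via-buffer (string)
  "Insert STRING in a temporary buffer, decode the region, return the text."
  (with-temp-buffer
    (insert string)
    ;; Decode the whole buffer in place, treating line ends as DOS (CRLF).
    (decode-coding-region (point-min) (point-max) 'undecided-dos)
    (buffer-string)))

;; Sample input invented for this example.
(my-decode-dos-string "Subject: test\r\nFrom: someone\r\n")
(my-decode-dos-via-buffer "Subject: test\r\nFrom: someone\r\n")
```

Both calls are expected to return the text with its CRLF line ends turned into plain newlines. When the data already sits in a buffer, decoding the region directly avoids the extra copy through a string.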





* Re: strip extraneous CR characters
  2009-09-28 17:49   ` Ted Zlatanov
  2009-09-28 18:10     ` Eli Zaretskii
@ 2009-09-29  0:44     ` Stefan Monnier
  1 sibling, 0 replies; 5+ messages in thread
From: Stefan Monnier @ 2009-09-29  0:44 UTC
  To: Ted Zlatanov; +Cc: emacs-devel

> IMAP has CR characters explicitly in the standard.  imap.el passes
> those down in every message body and in the headers.

To my naive eye, it looks like a misfeature of imap.el.

> I don't know why imap.el doesn't use automatic EOL decoding (perhaps
> to preserve every aspect of the original data).

My guess is that this design is inherited from pre-Mule times, when
CRLF->LF decoding was better done once and for all in the
backend-agnostic code of Gnus.

In any case, I think you'll find that your functions are pretty
Gnus-specific, for that reason.


        Stefan




