One example of code I can't understand

unofficial mirror of emacs-devel@gnu.org 

* One example of code I can't understand
@ 2009-07-19 23:21 Richard Stallman
  0 siblings, 0 replies; 19+ messages in thread
From: Richard Stallman @ 2009-07-19 23:21 UTC (permalink / raw)
  To: emacs-devel

Here's more code from mm-util.el that I don't understand.
I don't know what this function is supposed to do
or how it works.


(defun mm-iso-8859-x-to-15-region (&optional b e)
  (if (fboundp 'char-charset)
      (let (charset item c inconvertible)
	(save-restriction
	  (if e (narrow-to-region b e))
	  (goto-char (point-min))
	  (skip-chars-forward "\0-\177")
	  (while (not (eobp))
	    (cond
	     ((not (setq item (assq (char-charset (setq c (char-after)))
				    mm-iso-8859-x-to-15-table)))
	      (forward-char))
	     ((memq c (cdr (cdr item)))
	      (setq inconvertible t)
	      (forward-char))
	     (t
	      (insert-before-markers (prog1 (+ c (car (cdr item)))
				       (delete-char 1)))))
	    (skip-chars-forward "\0-\177")))
	(not inconvertible))))





* One example of code I can't understand
@ 2009-07-19 23:21 Richard Stallman
  2009-07-20  3:13 ` Eli Zaretskii
  0 siblings, 1 reply; 19+ messages in thread
From: Richard Stallman @ 2009-07-19 23:21 UTC (permalink / raw)
  To: emacs-devel

Here's more code from mm-util.el that I don't understand.

    ;; We are in a unibyte buffer or XEmacs non-mule, so we futz around a bit.
    (save-excursion
      (save-restriction
	(narrow-to-region b e)
	(goto-char (point-min))
	(skip-chars-forward "\0-\177")
	(if (eobp)
	    '(ascii)
	  (let (charset)
	    (setq charset
		  (and (boundp 'current-language-environment)
		       (car (last (assq 'charset
					(assoc current-language-environment
					       language-info-alist))))))
	    (if (eq charset 'ascii) (setq charset nil))
	    (or charset
		(setq charset
		      (car (last (assq mail-parse-charset
				       mm-mime-mule-charset-alist)))))
	    (list 'ascii (or charset 'latin-iso8859-1)))))))))





* One example of code I can't understand
@ 2009-07-19 23:21 Richard Stallman
  2009-07-20 18:13 ` Stefan Monnier
  0 siblings, 1 reply; 19+ messages in thread
From: Richard Stallman @ 2009-07-19 23:21 UTC (permalink / raw)
  To: emacs-devel

Here's some code from mm-util.el that I don't understand.
Well, I can understand the first 6 lines, but after that
I am stumped.  The doc string gives no details of what the
value should look like or what it means.


(defvar mm-iso-8859-x-to-15-table
  (and (fboundp 'coding-system-p)
       (mm-coding-system-p 'iso-8859-15)
       (mapcar
	(lambda (cs)
	  (if (mm-coding-system-p (car cs))
	      (let ((c (string-to-char
			(decode-coding-string "\341" (car cs)))))
		(cons (char-charset c)
		      (cons
		       (- (string-to-char
			   (decode-coding-string "\341" 'iso-8859-15)) c)
		       (string-to-list (decode-coding-string (car (cdr cs))
							     (car cs))))))
	    '(gnus-charset 0)))
	mm-iso-8859-15-compatible))
  "A table of the difference character between ISO-8859-X and ISO-8859-15.")





* Re: One example of code I can't understand
  2009-07-19 23:21 One example of code I can't understand Richard Stallman
@ 2009-07-20  3:13 ` Eli Zaretskii
  2009-07-20 19:01   ` Richard Stallman
  0 siblings, 1 reply; 19+ messages in thread
From: Eli Zaretskii @ 2009-07-20  3:13 UTC (permalink / raw)
  To: rms; +Cc: emacs-devel

> From: Richard Stallman <rms@gnu.org>
> Date: Sun, 19 Jul 2009 19:21:29 -0400
> 
> Here's more code from mm-util.el that I don't understand.
> 
>     ;; We are in a unibyte buffer or XEmacs non-mule, so we futz around a bit.

Why do we need to cater to unibyte operation in Emacs 23?





* Re: One example of code I can't understand
  2009-07-19 23:21 Richard Stallman
@ 2009-07-20 18:13 ` Stefan Monnier
  2009-07-20 20:50   ` Reiner Steib
  2009-07-21 14:42   ` Richard Stallman
  0 siblings, 2 replies; 19+ messages in thread
From: Stefan Monnier @ 2009-07-20 18:13 UTC (permalink / raw)
  To: rms; +Cc: emacs-devel

> Here's some code from mm-util.el that I don't understand.
> Well, I can understand the first 6 lines, but after that
> I am stumped.  The doc string gives no details of what the
> value should look like or what it means.

> (defvar mm-iso-8859-x-to-15-table
>   (and (fboundp 'coding-system-p)
>        (mm-coding-system-p 'iso-8859-15)
>        (mapcar
> 	(lambda (cs)
> 	  (if (mm-coding-system-p (car cs))
> 	      (let ((c (string-to-char
> 			(decode-coding-string "\341" (car cs)))))
> 		(cons (char-charset c)
> 		      (cons
> 		       (- (string-to-char
> 			   (decode-coding-string "\341" 'iso-8859-15)) c)
> 		       (string-to-list (decode-coding-string (car (cdr cs))
> 							     (car cs))))))
> 	    '(gnus-charset 0)))
> 	mm-iso-8859-15-compatible))
>   "A table of the difference character between ISO-8859-X and ISO-8859-15.")

Entries in this list have the form (CHARSET OFFSET CHARS...), meaning
that characters in CHARSET (except for those in CHARS) can be converted
to iso-8859-15 by adding OFFSET.
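To make the (CHARSET OFFSET CHARS...) layout concrete, here is a minimal sketch. The table `my-x-to-15-table`, the charset symbols, the offset 1000, and the function `my-convert-char` are all hypothetical, chosen for illustration; they are not the values mm-util.el computes. The two excluded code points, #xA4 and #xA6, are positions where iso-8859-15 is known to differ from iso-8859-1.

```elisp
;; Hypothetical table in the (CHARSET OFFSET CHARS...) format.
;; The charset names and the offset 1000 are made up for illustration.
(defvar my-x-to-15-table
  '((my-latin-1 1000 #xA4 #xA6)   ; CHARS: code points that must not be shifted
    (my-other   0))
  "Illustrative stand-in for a table like `mm-iso-8859-x-to-15-table'.")

(defun my-convert-char (charset c)
  "Return character code C shifted according to `my-x-to-15-table'.
Return nil if C is listed as inconvertible for CHARSET, and C
unchanged if CHARSET has no entry in the table."
  (let ((item (assq charset my-x-to-15-table)))
    (cond
     ((null item) c)                    ; charset not in table: leave alone
     ((memq c (cdr (cdr item))) nil)    ; listed in CHARS: cannot convert
     (t (+ c (car (cdr item)))))))      ; otherwise add OFFSET
```

The three branches mirror the `cond` in mm-iso-8859-x-to-15-region: skip characters whose charset has no entry, flag the ones in CHARS as inconvertible, and shift the rest by OFFSET.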

In Emacs-23 it doesn't make much sense (because of unification, OFFSET is
always 0).  It's used in mm-find-mime-charset-region (via
mm-iso-8859-x-to-15-region) to provide a "poor man's unification":

    (if (and (> (length charsets) 1)
	     (memq 'iso-8859-15 charsets)
	     (memq 'iso-8859-15 hack-charsets)
	     (save-excursion (mm-iso-8859-x-to-15-region b e)))
	(dolist (x mm-iso-8859-15-compatible)
	  (setq charsets (delq (car x) charsets))))

i.e. if we need more than 1 coding-system to encode the region and
iso-8859-15 is among them, then use the above table to turn some of the
other chars into iso-8859-15 in the hope of reducing the number of
coding-systems to use (and hence the number of chunks into which the
text needs to be split).
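The charset-reduction step can be sketched in isolation. In this simplified version, `my-reduce-charsets` is a hypothetical name, COMPATIBLE is a plain list of charset symbols (the real mm-iso-8859-15-compatible holds lists whose car is the charset), and CONVERTIBLE-P stands in for the return value of the region conversion; the `hack-charsets` check is omitted.

```elisp
(defun my-reduce-charsets (charsets compatible convertible-p)
  "Drop COMPATIBLE charsets from CHARSETS when iso-8859-15 can absorb them.
CONVERTIBLE-P stands in for the result of converting the region;
when it is nil, or iso-8859-15 is not among CHARSETS, return
CHARSETS unchanged."
  (if (and (> (length charsets) 1)
           (memq 'iso-8859-15 charsets)
           convertible-p)
      ;; Work on a copy so the caller's list is not modified by `delq'.
      (let ((result (copy-sequence charsets)))
        (dolist (cs compatible result)
          (setq result (delq cs result))))
    charsets))
```

For example, `(my-reduce-charsets '(iso-8859-1 iso-8859-15) '(iso-8859-1) t)` leaves only iso-8859-15, so the text would no longer need to be split.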

Now that we have utf-8, this is unnecessary since we can always encode
the whole text with just a single coding-system, without having to break
it down into chunks.
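As a small illustration of that point, the sketch below encodes a string mixing a Latin-1 character (e-acute, #xE9) with the euro sign (#x20AC, present in iso-8859-15 but not iso-8859-1) using utf-8 alone. The function name `my-utf-8-round-trip-p` is hypothetical; `encode-coding-string` and `decode-coding-string` are the standard Emacs functions.

```elisp
(defun my-utf-8-round-trip-p (text)
  "Return non-nil if TEXT survives encoding to utf-8 and decoding back."
  (let ((bytes (encode-coding-string text 'utf-8)))
    ;; BYTES is a unibyte string; decoding should restore TEXT exactly.
    (equal (decode-coding-string bytes 'utf-8) text)))

;; One coding system covers both characters, so no chunking is needed.
(my-utf-8-round-trip-p (string #xE9 #x20AC))
```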

        Stefan


* Re: One example of code I can't understand
  2009-07-20  3:13 ` Eli Zaretskii
@ 2009-07-20 19:01   ` Richard Stallman
  2009-07-21  0:51     ` Kenichi Handa
  2009-07-21  0:59     ` Stefan Monnier
  0 siblings, 2 replies; 19+ messages in thread
From: Richard Stallman @ 2009-07-20 19:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

    Why do we need to cater to unibyte operation in Emacs 23?

1. Unibyte buffers may be needed for certain internal operations.  I
am not sure -- we need to ask Handa.

2. As for editing of unibyte buffers by users, maybe nobody will mind
if we get rid of that now.  But we should not eliminate it without
first polling the users, to see who uses it and whether it is important.


* Re: One example of code I can't understand
  2009-07-20 18:13 ` Stefan Monnier
@ 2009-07-20 20:50   ` Reiner Steib
  2009-07-21 14:41     ` Richard Stallman
  2009-07-21 14:42   ` Richard Stallman
  1 sibling, 1 reply; 19+ messages in thread
From: Reiner Steib @ 2009-07-20 20:50 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: rms, emacs-devel, ding

On Mon, Jul 20 2009, Stefan Monnier wrote:

> In Emacs-23 it doesn't make much sense (because of unification, OFFSET is
> always 0).  It's used in mm-find-mime-charset-region (via
> mm-iso-8859-x-to-15-region) to provide a "poor man's unification":
[...]
> Now that we have utf-8, this is unnecessary since we can always encode
> the whole text with just a single coding-system, without having to break
> it down into chunks.

Please keep in mind that Gnus wants to support older Emacs versions
and XEmacs, see (info "(gnus)Emacsen").  So please don't remove this
code.

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/




* Re: One example of code I can't understand
  2009-07-20 19:01   ` Richard Stallman
@ 2009-07-21  0:51     ` Kenichi Handa
  2009-07-21  3:07       ` Eli Zaretskii
                         ` (2 more replies)
  2009-07-21  0:59     ` Stefan Monnier
  1 sibling, 3 replies; 19+ messages in thread
From: Kenichi Handa @ 2009-07-21  0:51 UTC (permalink / raw)
  To: rms; +Cc: eliz, emacs-devel

In article <E1MSy7Q-0005Mn-7H@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:

>     Why do we need to cater to unibyte operation in Emacs 23?
> 1. Unibyte buffers may be needed for certain internal operations.  I
> am not sure -- we need to ask Handa.

Currently, we are using unibyte buffers for various internal
operations (e.g. rmail, tar-mode, jka-compr, ...).

It is theoretically possible to modify all of them to use a
multibyte buffer that contains only ASCII and eight-bit
chars.  But, as it may make the operations slow, I don't see
a merit in doing that.
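The two alternatives can be compared side by side in GNU Emacs. This is only a sketch: `my-raw-byte-buffer` is a hypothetical helper, and it says nothing about the speed difference under discussion. It stores the raw byte #xFF either in a unibyte buffer or, as an eight-bit character, in a multibyte one.

```elisp
(defun my-raw-byte-buffer (multibyte)
  "Insert the raw byte #xFF into a temporary buffer.
If MULTIBYTE is non-nil the buffer is multibyte and the byte is
stored as an eight-bit character; otherwise the buffer is unibyte.
Return a list (MULTIBYTE-P SIZE)."
  (with-temp-buffer
    (set-buffer-multibyte multibyte)
    (insert (if multibyte
                (string-to-multibyte "\377")  ; raw byte as an eight-bit char
              "\377"))                        ; plain byte in a unibyte buffer
    (list enable-multibyte-characters (buffer-size))))
```

Either way the buffer holds one position for the byte; what differs is the internal representation, which is where any conversion cost would come from.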

Is that what you want to know?

> 2. As for editing of unibyte buffers by users, maybe nobody will mind
> if we get rid of that now.  But we should not eliminate it without
> first polling the users, to see who uses it and whether it is important.

I agree.

---
Kenichi Handa
handa@m17n.org


* Re: One example of code I can't understand
  2009-07-20 19:01   ` Richard Stallman
  2009-07-21  0:51     ` Kenichi Handa
@ 2009-07-21  0:59     ` Stefan Monnier
  2009-07-21 14:41       ` Richard Stallman
  1 sibling, 1 reply; 19+ messages in thread
From: Stefan Monnier @ 2009-07-21  0:59 UTC (permalink / raw)
  To: rms; +Cc: Eli Zaretskii, emacs-devel

>     Why do we need to cater to unibyte operation in Emacs 23?
> 1. Unibyte buffers may be needed for certain internal operations.  I
> am not sure -- we need to ask Handa.

Maybe Gnus maintainers would know better (it seems more related to Gnus
than to Mule).

> 2. As for editing of unibyte buffers by users, maybe nobody will mind
> if we get rid of that now.  But we should not eliminate it without
> first polling the users, to see who uses it and whether it is important.

Editing unibyte buffers is an important feature, which we use whenever
we edit binary files.  I see no need/reason to get rid of it.

In this area, there is something that we should eliminate: the
notion of a unibyte-session (which means something like forcing
default-enable-multibyte-characters to nil, though it's a bit trickier
than that since default-enable-multibyte-characters is sometimes
let-bound around buffer-creation in place of calling
set-buffer-multibyte explicitly after the creation).

        Stefan


* Re: One example of code I can't understand
  2009-07-21  0:51     ` Kenichi Handa
@ 2009-07-21  3:07       ` Eli Zaretskii
  2009-07-21  4:10         ` Kenichi Handa
  2009-07-21 14:41       ` Richard Stallman
  2009-07-22  6:58       ` Stephen J. Turnbull
  2 siblings, 1 reply; 19+ messages in thread
From: Eli Zaretskii @ 2009-07-21  3:07 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: rms, emacs-devel

> From: Kenichi Handa <handa@m17n.org>
> Cc: eliz@gnu.org, emacs-devel@gnu.org
> Date: Tue, 21 Jul 2009 09:51:08 +0900
> 
> In article <E1MSy7Q-0005Mn-7H@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:
> 
> >     Why do we need to cater to unibyte operation in Emacs 23?
> > 1. Unibyte buffers may be needed for certain internal operations.  I
> > am not sure -- we need to ask Handa.
> 
> Currently, we are using unibyte buffers for various internal
> operations (e.g. rmail, tar-mode, jka-compr, ...).

The code in question decides how to encode portions of a mail buffer
for sending an email message.  I doubt that the above uses of unibyte
buffers can be ever exposed to this code, since for that they would
need to be parts of an email message being composed.





* Re: One example of code I can't understand
  2009-07-21  3:07       ` Eli Zaretskii
@ 2009-07-21  4:10         ` Kenichi Handa
  0 siblings, 0 replies; 19+ messages in thread
From: Kenichi Handa @ 2009-07-21  4:10 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rms, emacs-devel

In article <83ocreohrv.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

> > >     Why do we need to cater to unibyte operation in Emacs 23?
> > > 1. Unibyte buffers may be needed for certain internal operations.  I
> > > am not sure -- we need to ask Handa.
> > 
> > Currently, we are using unibyte buffers for various internal
> > operations (e.g. rmail, tar-mode, jka-compr, ...).

> The code in question decides how to encode portions of a mail buffer
> for sending an email message.

Then, the mail buffer itself may be unibyte if a user starts
Emacs with --unibyte.

But that answer may be off target, because I'm writing without
having read "the code in question".  I'm sorry if that is the
case.  I don't have time to read all the mails in this
thread at the moment.

---
Kenichi Handa
handa@m17n.org


* Re: One example of code I can't understand
  2009-07-21  0:59     ` Stefan Monnier
@ 2009-07-21 14:41       ` Richard Stallman
  0 siblings, 0 replies; 19+ messages in thread
From: Richard Stallman @ 2009-07-21 14:41 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: eliz, emacs-devel

    In this area, there is something that we should eliminate: the
    notion of a unibyte-session (which means something like forcing
    default-enable-multibyte-characters to nil, though it's a bit trickier
    than that since default-enable-multibyte-characters is sometimes
    let-bound around buffer-creation in place of calling
    set-buffer-multibyte explicitly after the creation).

That change might be ok, but we should poll the users first.





* Re: One example of code I can't understand
  2009-07-21  0:51     ` Kenichi Handa
  2009-07-21  3:07       ` Eli Zaretskii
@ 2009-07-21 14:41       ` Richard Stallman
  2009-07-22  6:58       ` Stephen J. Turnbull
  2 siblings, 0 replies; 19+ messages in thread
From: Richard Stallman @ 2009-07-21 14:41 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: eliz, emacs-devel

    Currently, we are using unibyte buffers for various internal
    operations (e.g. rmail, tar-mode, jka-compr, ...).

    It is theoretically possible to modify all of them to use a
    multibyte buffer that contains only ASCII and eight-bit
    chars.  But, as it may make the operations slow, I don't see
    a merit in doing that.

    Is that what you want to know?

Yes.  It means we certainly don't want to eliminate unibyte buffers.





* Re: One example of code I can't understand
  2009-07-20 20:50   ` Reiner Steib
@ 2009-07-21 14:41     ` Richard Stallman
  2009-07-21 17:34       ` Reiner Steib
  0 siblings, 1 reply; 19+ messages in thread
From: Richard Stallman @ 2009-07-21 14:41 UTC (permalink / raw)
  To: Reiner Steib; +Cc: monnier, emacs-devel, ding

    Please keep in mind that Gnus wants to support older Emacs versions
    and XEmacs, see (info "(gnus)Emacsen").  So please don't remove this
    code.

I'm going to remove this code in the version of the code that I
install elsewhere in Emacs.  But the functions in my version will have
different names, so the change won't affect Gnus.




* Re: One example of code I can't understand
  2009-07-20 18:13 ` Stefan Monnier
  2009-07-20 20:50   ` Reiner Steib
@ 2009-07-21 14:42   ` Richard Stallman
  2009-07-21 18:21     ` Eli Zaretskii
  1 sibling, 1 reply; 19+ messages in thread
From: Richard Stallman @ 2009-07-21 14:42 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

    In Emacs-23 it doesn't make much sense (because of unification, OFFSET is
    always 0).  It's used in mm-find-mime-charset-region (via
    mm-iso-8859-x-to-15-region) to provide a "poor man's unification":

Does that mean I can get rid of the code that uses this?
That is a nice simplification.





* Re: One example of code I can't understand
  2009-07-21 14:41     ` Richard Stallman
@ 2009-07-21 17:34       ` Reiner Steib
  2009-07-22 22:21         ` Richard Stallman
  0 siblings, 1 reply; 19+ messages in thread
From: Reiner Steib @ 2009-07-21 17:34 UTC (permalink / raw)
  To: Richard Stallman; +Cc: monnier, emacs-devel, ding

On Tue, Jul 21 2009, Richard Stallman wrote:

>     Please keep in mind that Gnus wants to support older Emacs versions
>     and XEmacs, see (info "(gnus)Emacsen").  So please don't remove this
>     code.
>
> I'm going to remove this code in the version of the code that I
> install elsewhere in Emacs.  But the functions in my version will have
> different names, so the change won't affect Gnus.

Sounds like code duplication to me.

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/




* Re: One example of code I can't understand
  2009-07-21 14:42   ` Richard Stallman
@ 2009-07-21 18:21     ` Eli Zaretskii
  0 siblings, 0 replies; 19+ messages in thread
From: Eli Zaretskii @ 2009-07-21 18:21 UTC (permalink / raw)
  To: rms; +Cc: monnier, emacs-devel

> From: Richard Stallman <rms@gnu.org>
> Date: Tue, 21 Jul 2009 10:42:01 -0400
> Cc: emacs-devel@gnu.org
> 
>     In Emacs-23 it doesn't make much sense (because of unification, OFFSET is
>     always 0).  It's used in mm-find-mime-charset-region (via
>     mm-iso-8859-x-to-15-region) to provide a "poor man's unification":
> 
> Does that mean I can get rid of the code that uses this?

Yes, I think so.





* Re: One example of code I can't understand
  2009-07-21  0:51     ` Kenichi Handa
  2009-07-21  3:07       ` Eli Zaretskii
  2009-07-21 14:41       ` Richard Stallman
@ 2009-07-22  6:58       ` Stephen J. Turnbull
  2 siblings, 0 replies; 19+ messages in thread
From: Stephen J. Turnbull @ 2009-07-22  6:58 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: eliz, rms, emacs-devel

Kenichi Handa writes:

 > It is theoretically possible to modify all of them to use a
 > multibyte buffer that contains only ASCII and eight-bit
 > chars.  But, as it may make the operations slow, I don't see
 > a merit in doing that.

XEmacs has always done it this way; it is more than a theoretical
possibility.  In XEmacs it doesn't make much difference for the
operations you describe (less than a factor of 2) because the overhead
of interfacing to the pipes is greater than the overhead of
conversion.  This is based on some profiling Ben did many years ago;
ISTR it was basically linear up to 128MB or maybe 256MB.  I don't
recall the exact numbers, but I do remember being very surprised that
it was substantially less than 2.

In XEmacs, I think it makes a much bigger difference for mail buffers
(VM, or Gnus nnfolder) because of the frequency of byte<->char
conversions in those more or less random access applications.
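For readers unfamiliar with byte<->char conversion, the following sketch shows what it looks like in GNU Emacs (not XEmacs, whose internals differ). The helper name `my-char-and-byte-end` is hypothetical; `position-bytes` is the standard Emacs function that maps a character position to a byte position.

```elisp
(defun my-char-and-byte-end (text)
  "Insert TEXT into a multibyte temporary buffer.
Return a list (CHAR-END BYTE-END): the character position and the
byte position of the end of the buffer."
  (with-temp-buffer
    (set-buffer-multibyte t)
    (insert text)
    (list (point-max) (position-bytes (point-max)))))

(my-char-and-byte-end "abc")              ; => (4 4), ASCII only
(my-char-and-byte-end (string ?a #xE9))   ; => (3 4), e-acute takes two bytes
```

As soon as the text contains non-ASCII characters the two positions diverge, so code that jumps around by byte offset has to convert on each access.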


* Re: One example of code I can't understand
  2009-07-21 17:34       ` Reiner Steib
@ 2009-07-22 22:21         ` Richard Stallman
  0 siblings, 0 replies; 19+ messages in thread
From: Richard Stallman @ 2009-07-22 22:21 UTC (permalink / raw)
  To: Reiner Steib; +Cc: monnier, emacs-devel, ding

    > I'm going to remove this code in the version of the code that I
    > install elsewhere in Emacs.  But the functions in my version will have
    > different names, so the change won't affect Gnus.

    Sounds like code duplication to me.

Yes, for the moment.  Eventually we can delete the corresponding code
from mm-util.el and replace it with aliases pointing at these
functions.  You could keep the old code to include in a compatibility
pack when distributing Gnus separately.
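The alias arrangement described here might look roughly like the sketch below. Both function names are hypothetical placeholders, not the names actually used in Emacs or in mm-util.el, and the function body is a toy stand-in.

```elisp
;; Hypothetical new function that would live outside mm-util.el.
(defun my-new-find-charsets (text)
  "Return (ascii) if TEXT is pure ASCII, else (other).  Toy stand-in."
  (if (string-match-p "\\`[[:ascii:]]*\\'" text)
      '(ascii)
    '(other)))

;; The old name becomes an alias, so existing callers keep working
;; while the duplicated implementation is deleted.
(defalias 'my-old-mm-find-charsets #'my-new-find-charsets)
```

A separately distributed Gnus could then ship the old definitions in a compatibility file that is loaded only when the new functions are missing.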





