Loading source Elisp faster

unofficial mirror of emacs-devel@gnu.org 

* Loading source Elisp faster
@ 2013-02-25  1:40 Stefan Monnier
  2013-02-25  1:53 ` Lennart Borgman
                   ` (7 more replies)
  0 siblings, 8 replies; 90+ messages in thread
From: Stefan Monnier @ 2013-02-25  1:40 UTC (permalink / raw)
  To: emacs-devel

It used to be the case that compiling one's .emacs was silly because it
provided no measurable speed difference.  But nowadays this is not true
any more: loading a source Elisp file is significantly slower because it
goes through load-with-code-conversion.

For source files in utf-8 encoding this does not need to be the case: we
could load them without going through load-with-code-conversion.
And given that utf-8 should be the standard encoding for Elisp files
(if not quite now, surely in some not too distant future), this is an
important case.

So basically, all we need to do is to be able to easily recognize "Elisp
source in utf-8 encoding".  One way to do that would be to use
a BOM-like marker, e.g. start utf-8 Elisp files with "\ufeff" either at
the very beginning of the file or right after a semi-colon (for better
backward compatibility).

        Stefan
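
A minimal sketch of the marker check described above, written against the raw bytes of the file.  It assumes the marker is the UTF-8 encoding of U+FEFF (bytes EF BB BF), optionally preceded by a single semicolon.  The name `my-elisp-utf8-marker-p` is hypothetical and not part of Emacs.

```elisp
(defun my-elisp-utf8-marker-p (file)
  "Return non-nil if FILE starts with a UTF-8 encoded U+FEFF marker.
The marker may be the first thing in FILE or may directly follow a
leading semicolon.  Hypothetical helper, not part of Emacs."
  (let ((marker (unibyte-string #xef #xbb #xbf)))
    (with-temp-buffer
      (set-buffer-multibyte nil)
      ;; Read at most the first four bytes, with no decoding at all.
      (insert-file-contents-literally file nil 0 4)
      (let ((head (buffer-string)))
        (or (string-prefix-p marker head)
            (string-prefix-p (concat ";" marker) head))))))
```

Because only four bytes are read, the cost of the check does not depend on the size of the file.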


* Re: Loading souce Elisp faster
  2013-02-25  1:40 Loading souce Elisp faster Stefan Monnier
@ 2013-02-25  1:53 ` Lennart Borgman
  2013-02-25  2:53   ` Stefan Monnier
  2013-02-25  7:24 ` Achim Gratz
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 90+ messages in thread
From: Lennart Borgman @ 2013-02-25  1:53 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Emacs-Devel devel

On Mon, Feb 25, 2013 at 2:40 AM, Stefan Monnier
<monnier@iro.umontreal.ca> wrote:
>
> So basically, all we need to do is to be able to easily recognize "Elisp
> source in utf-8 encoding".  One way to do that would be to use
> a BOM-like marker, e.g. start utf-8 Elisp files with "\ufeff" either at
> the very beginning of the file or right after a semi-colon (for better
> backward compatibility).

Why not the normal comment about utf-8?




* Re: Loading souce Elisp faster
  2013-02-25  1:53 ` Lennart Borgman
@ 2013-02-25  2:53   ` Stefan Monnier
  2013-02-25  2:55     ` Lennart Borgman
  0 siblings, 1 reply; 90+ messages in thread
From: Stefan Monnier @ 2013-02-25  2:53 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: Emacs-Devel devel

>> So basically, all we need to do is to be able to easily recognize "Elisp
>> source in utf-8 encoding".  One way to do that would be to use
>> a BOM-like marker, e.g. start utf-8 Elisp files with "\ufeff" either at
>> the very beginning of the file or right after a semi-colon (for better
>> backward compatibility).
> Why not the normal comment about utf-8?

The "normal comment" can also be at the end of the file, which is not
an option.  But yes, the "-*- coding: utf-8 -*-" is another option
(although properly supporting it is kind of a pain, and I'd like to find
a more discreet solution for the "99% utf-8 future").


        Stefan




* Re: Loading souce Elisp faster
  2013-02-25  2:53   ` Stefan Monnier
@ 2013-02-25  2:55     ` Lennart Borgman
  2013-02-25  3:57       ` Stefan Monnier
  0 siblings, 1 reply; 90+ messages in thread
From: Lennart Borgman @ 2013-02-25  2:55 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Emacs-Devel devel

On Mon, Feb 25, 2013 at 3:53 AM, Stefan Monnier
<monnier@iro.umontreal.ca> wrote:
>>> So basically, all we need to do is to be able to easily recognize "Elisp
>>> source in utf-8 encoding".  One way to do that would be to use
>>> a BOM-like marker, e.g. start utf-8 Elisp files with "\ufeff" either at
>>> the very beginning of the file or right after a semi-colon (for better
>>> backward compatibility).
>> Why not the normal comment about utf-8?
>
> The "normal comment" can also be at the end of the file, which is not
> an option.  But yes, the "-*- coding: utf-8 -*-" is another option
> (although properly supporting it is kind of a pain, and I'd like to find
> a more discreet solution for the "99% utf-8 future").

Why not assume it is utf-8? ;-)
Unless there is a "-*- coding: ..." comment.




* Re: Loading souce Elisp faster
  2013-02-25  2:55     ` Lennart Borgman
@ 2013-02-25  3:57       ` Stefan Monnier
  2013-02-25  4:35         ` Stephen J. Turnbull
                           ` (4 more replies)
  0 siblings, 5 replies; 90+ messages in thread
From: Stefan Monnier @ 2013-02-25  3:57 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: Emacs-Devel devel

> Why not assume it is utf-8? ;-)
> Unless there is a "-*- coding: ..." comment.

Hmm... that's very appealing, but what about those Elisp files that have
a "coding:" tag at the end of the file?  Or are these so rare it's not
worth the trouble to worry about them?


        Stefan




* Re: Loading souce Elisp faster
  2013-02-25  3:57       ` Stefan Monnier
@ 2013-02-25  4:35         ` Stephen J. Turnbull
  2013-02-25  4:51           ` Stefan Monnier
  2013-02-25  5:24         ` Paul Eggert
                           ` (3 subsequent siblings)
  4 siblings, 1 reply; 90+ messages in thread
From: Stephen J. Turnbull @ 2013-02-25  4:35 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Lennart Borgman, Emacs-Devel devel

Stefan Monnier writes:
 > > Why not assume it is utf-8? ;-)
 > > Unless there is a "-*- coding: ..." comment.
 > 
 > Hmm... that's very appealing, but what about those Elisp files that have
 > a "coding:" tag at the end of the file?  Or are these so rare it's not
 > worth the trouble to worry about them?

Recognize it -- you're going to read those lines anyway and you need
to recognize the comments at least, and a Boyer-Moore on the last 3000
characters of the file isn't too bad right? -- and either

(1) error, or
(2) warn and reload (too bad for the user if it's not reentrant ;-).

You can also fix up lisp-mode to warn about that, and maybe even make
it annoying to save in anything but utf-8.
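
The tail scan suggested above could look roughly like this.  It is a simplified sketch: it reads only the last 3000 bytes and accepts any comment line of the form `;; coding: NAME`, whereas Emacs's real rules also require the enclosing local variables list.  A plain regexp search stands in for the Boyer-Moore search mentioned above, and the name `my-trailing-coding-tag` is hypothetical.

```elisp
(defun my-trailing-coding-tag (file)
  "Return the coding system named in a comment near the end of FILE.
Only the last 3000 bytes are examined.  Return nil if no tag is found.
Hypothetical helper, not part of Emacs."
  (let* ((size (nth 7 (file-attributes file)))
         (start (max 0 (- size 3000))))
    (with-temp-buffer
      (set-buffer-multibyte nil)
      ;; Read just the tail of the file as raw bytes.
      (insert-file-contents-literally file nil start size)
      (goto-char (point-min))
      (when (re-search-forward
             "^;+[ \t]*coding:[ \t]*\\([-_.a-zA-Z0-9]+\\)" nil t)
        (intern (match-string 1))))))
```

A caller could then choose between signalling an error and reloading with the named coding system, as in options (1) and (2).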


* Re: Loading souce Elisp faster
  2013-02-25  4:35         ` Stephen J. Turnbull
@ 2013-02-25  4:51           ` Stefan Monnier
  0 siblings, 0 replies; 90+ messages in thread
From: Stefan Monnier @ 2013-02-25  4:51 UTC (permalink / raw)
  To: Stephen J. Turnbull; +Cc: Lennart Borgman, Emacs-Devel devel

> Recognize it -- you're going to read those lines anyway and you need
> to recognize the comments at least, and a Boyer-Moore on the last 3000
> characters of the file isn't too bad right? -- and either

> (1) error, or
> (2) warn and reload (too bad for the user if it's not reentrant ;-).

> You can also fix up lisp-mode to warn about that, and maybe even make
> it annoying to save in anything but utf-8.

I'm eagerly waiting to see your patch to implement it ;-)


        Stefan




* Re: Loading souce Elisp faster
  2013-02-25  3:57       ` Stefan Monnier
  2013-02-25  4:35         ` Stephen J. Turnbull
@ 2013-02-25  5:24         ` Paul Eggert
  2013-02-25  6:12           ` Xue Fuqiao
  2013-02-25 15:39           ` Eli Zaretskii
  2013-02-25  5:47         ` Leo Liu
                           ` (2 subsequent siblings)
  4 siblings, 2 replies; 90+ messages in thread
From: Paul Eggert @ 2013-02-25  5:24 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Emacs-Devel devel

On 02/24/2013 07:57 PM, Stefan Monnier wrote:
> what about those Elisp files that have
> a "coding:" tag at the end of the file?  Or are these so rare it's not
> worth the trouble to worry about them?

Of the 1633 Elisp files in the Emacs trunk,
I counted 22 with a "coding:" tag at the end
of the file that specified a coding other than utf-8.

It'd be easy to fix these to put the "coding:" tag
at the start.  But how about simply converting all
the non-UTF-8 Emacs source code files to UTF-8?
That shouldn't be hard and should be simpler in the
long run.

Here are the 22 files I found:

leim/quail/cyril-jis.el
leim/quail/cyrillic.el
leim/quail/vntelex.el
leim/quail/vnvni.el
lisp/gnus/deuglify.el
lisp/gnus/gnus-cite.el
lisp/gnus/gnus-delay.el
lisp/gnus/gnus-spec.el
lisp/gnus/gnus-sum.el
lisp/gnus/message.el
lisp/gnus/mm-decode.el
lisp/gnus/mml1991.el
lisp/gnus/shr.el
lisp/ibuffer.el
lisp/international/ja-dic-cnv.el
lisp/international/ja-dic-utl.el
lisp/international/mule-util.el
lisp/language/cyril-util.el
lisp/language/thai-word.el
lisp/ruler-mode.el
lisp/textmodes/tildify.el
lisp/wdired.el





* Re: Loading souce Elisp faster
  2013-02-25  3:57       ` Stefan Monnier
  2013-02-25  4:35         ` Stephen J. Turnbull
  2013-02-25  5:24         ` Paul Eggert
@ 2013-02-25  5:47         ` Leo Liu
  2013-02-25 16:28         ` Ted Zlatanov
  2013-02-27  4:18         ` Kenichi Handa
  4 siblings, 0 replies; 90+ messages in thread
From: Leo Liu @ 2013-02-25  5:47 UTC (permalink / raw)
  To: emacs-devel

On 2013-02-25 11:57 +0800, Stefan Monnier wrote:
> Hmm... that's very appealing, but what about those Elisp files that have
> a "coding:" tag at the end of the file?  Or are these so rare it's not
> worth the trouble to worry about them?

Do we need to support non-utf-8 elisp files? I say we convert all of
them (those in our control) to utf-8 and be done with it.

Leo





* Re: Loading souce Elisp faster
  2013-02-25  5:24         ` Paul Eggert
@ 2013-02-25  6:12           ` Xue Fuqiao
  2013-02-25 15:39           ` Eli Zaretskii
  1 sibling, 0 replies; 90+ messages in thread
From: Xue Fuqiao @ 2013-02-25  6:12 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Stefan Monnier, Emacs-Devel devel

On Sun, 24 Feb 2013 21:24:06 -0800
Paul Eggert <eggert@cs.ucla.edu> wrote:

> But how about simply converting all
> the non-UTF-8 Emacs source code files to UTF-8?
> That shouldn't be hard and should be simpler in the
> long run.

I agree.

-- 
Best regards, Xue Fuqiao.
http://www.emacswiki.org/emacs/XueFuqiao




* Re: Loading souce Elisp faster
  2013-02-25  1:40 Loading souce Elisp faster Stefan Monnier
  2013-02-25  1:53 ` Lennart Borgman
@ 2013-02-25  7:24 ` Achim Gratz
  2013-02-25 11:43 ` Richard Stallman
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 90+ messages in thread
From: Achim Gratz @ 2013-02-25  7:24 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier writes:
> For source files in utf-8 encoding this does not need to be the case: we
> could load them without going through load-with-code-conversion.
> And given that utf-8 should be the standard encoding for Elisp files
> (if not quite now, surely in some not too distant future), this is an
> important case.
>
> So basically, all we need to do is to be able to easily recognize "Elisp
> source in utf-8 encoding".  One way to do that would be to use
> a BOM-like marker, e.g. start utf-8 Elisp files with "\ufeff" either at
> the very beginning of the file or right after a semi-colon (for better
> backward compatibility).

There's always the option to introduce a new suffix ".el8" to indicate
that the file _must_ be UTF-8, which saves you the bother of checking.


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Factory and User Sound Singles for Waldorf rackAttack:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds





* Re: Loading souce Elisp faster
  2013-02-25  1:40 Loading souce Elisp faster Stefan Monnier
  2013-02-25  1:53 ` Lennart Borgman
  2013-02-25  7:24 ` Achim Gratz
@ 2013-02-25 11:43 ` Richard Stallman
  2013-02-25 15:19   ` Stefan Monnier
  2013-02-25 15:35   ` Drew Adams
  2013-02-25 13:33 ` Kenichi Handa
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 90+ messages in thread
From: Richard Stallman @ 2013-02-25 11:43 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

    It used to be the case that compiling one's .emacs was silly because it
    provided no measurable speed difference.  But nowadays this is not true
    any more: loading a source Elisp file is significantly slower because it
    goes through load-with-code-conversion.

Is the slowdown due to heuristically recognizing the encoding?

    It'd be easy to fix these to put the "coding:" tag
    at the start.

Is that really necessary?  Loading has to read in the whole file.
Can't we check for a tag at the beginning OR at the end
before trying to heuristically recognize the encoding?

Actually, I thought Emacs already did that.  If it doesn't, it should.

Then all we need is a different default when it is an Elisp file.

    Do we need to support non-utf-8 elisp files? I say we convert all of
    them (those in our control) to utf-8 and be done with it.

We could do this, but we still need to support other people's Lisp code,
so we can't drop support for other coding systems.

Even changing the default might break things for users.

-- 
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org  www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
  Use Ekiga or an ordinary phone call





* Re: Loading souce Elisp faster
  2013-02-25  1:40 Loading souce Elisp faster Stefan Monnier
                   ` (2 preceding siblings ...)
  2013-02-25 11:43 ` Richard Stallman
@ 2013-02-25 13:33 ` Kenichi Handa
  2013-02-25 13:50   ` Xue Fuqiao
  2013-02-25 15:35   ` Drew Adams
  2013-02-25 16:45 ` David Engster
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 90+ messages in thread
From: Kenichi Handa @ 2013-02-25 13:33 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

In article <jwva9qtdvvl.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:

> So basically, all we need to do is to be able to easily recognize "Elisp
> source in utf-8 encoding".  One way to do that would be to use
> a BOM-like marker, e.g. start utf-8 Elisp files with "\ufeff" either at
> the very beginning of the file or right after a semi-colon (for better
> backward compatibility).

We can have "coding: utf-8;" tag at the first line.  Isn't it
enough?

---
Kenichi Handa
handa@gnu.org
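
For illustration, a first-line cookie such as the one suggested above can be found without decoding the file.  This sketch assumes the cookie sits on the very first line and ignores the case where the first line is a `#!` line.  `my-first-line-coding` is a hypothetical name, not an Emacs function.

```elisp
(defun my-first-line-coding (file)
  "Return the coding system named in a -*- cookie on FILE's first line.
Return nil if the first line has no such cookie.
Hypothetical helper, not part of Emacs."
  (with-temp-buffer
    (set-buffer-multibyte nil)
    ;; The first kilobyte is plenty to cover the first line.
    (insert-file-contents-literally file nil 0 1024)
    (goto-char (point-min))
    (when (re-search-forward
           "-\\*-.*\\bcoding:[ \t]*\\([-_.a-zA-Z0-9]+\\).*-\\*-"
           (line-end-position) t)
      (intern (match-string 1)))))
```

A loader could take its fast path when this returns `utf-8` and fall back to the existing code-conversion path for anything else.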




* Re: Loading souce Elisp faster
  2013-02-25 13:33 ` Kenichi Handa
@ 2013-02-25 13:50   ` Xue Fuqiao
  2013-02-25 15:35     ` Drew Adams
  2013-02-25 15:52     ` Eli Zaretskii
  2013-02-25 15:35   ` Drew Adams
  1 sibling, 2 replies; 90+ messages in thread
From: Xue Fuqiao @ 2013-02-25 13:50 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: Stefan Monnier, emacs-devel

On Mon, 25 Feb 2013 22:33:29 +0900
Kenichi Handa <handa@gnu.org> wrote:

> In article <jwva9qtdvvl.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:
> > So basically, all we need to do is to be able to easily recognize "Elisp
> > source in utf-8 encoding".  One way to do that would be to use
> > a BOM-like marker, e.g. start utf-8 Elisp files with "\ufeff" either at
> > the very beginning of the file or right after a semi-colon (for better
> > backward compatibility).
> We can have "coding: utf-8;" tag at the first line.  Isn't it
> enough?

Yes, we can.  But most Elisp source files are already in utf-8 encoding.

In (info "(elisp) Coding Conventions"):

     If your program contains non-ASCII characters in string or
     character constants, you should make sure Emacs always decodes
     these characters the same way, regardless of the user's settings.
     The easiest way to do this is to use the coding system
     `utf-8-emacs' (*note Coding System Basics::), and specify that
     coding in the `-*-' line or the local variables list.

Maybe this can be changed now (or in the near future), because of the "99%
utf-8 future".

-- 
Best regards, Xue Fuqiao.
http://www.emacswiki.org/emacs/XueFuqiao




* Re: Loading souce Elisp faster
  2013-02-25 11:43 ` Richard Stallman
@ 2013-02-25 15:19   ` Stefan Monnier
  2013-02-25 15:36     ` Drew Adams
  2013-02-25 21:51     ` Richard Stallman
  2013-02-25 15:35   ` Drew Adams
  1 sibling, 2 replies; 90+ messages in thread
From: Stefan Monnier @ 2013-02-25 15:19 UTC (permalink / raw)
  To: Richard Stallman; +Cc: emacs-devel

>     It used to be the case that compiling one's .emacs was silly because it
>     provided no measurable speed difference.  But nowadays this is not true
>     any more: loading a source Elisp file is significantly slower because it
>     goes through load-with-code-conversion.
> Is the slowdown due to heuristically recognizing the encoding?

The fast path (used by elc) parses the data straight from the file,
whereas the slow path loads the file into a buffer (which includes any
decoding if needed) and only then passes the resulting byte-stream to
the parser.
The difference is definitely noticeable.

> Even changing the default might break things for users.

Indeed.  I think the breakage would be reasonably limited, so I'd be
willing to take this risk, but I'd rather avoid having to go to the end
of the file looking for a coding: cookie.

        Stefan


* RE: Loading souce Elisp faster
  2013-02-25 11:43 ` Richard Stallman
  2013-02-25 15:19   ` Stefan Monnier
@ 2013-02-25 15:35   ` Drew Adams
  1 sibling, 0 replies; 90+ messages in thread
From: Drew Adams @ 2013-02-25 15:35 UTC (permalink / raw)
  To: rms, 'Stefan Monnier'; +Cc: emacs-devel

>     Do we need to support non-utf-8 elisp files? I say we 
>     convert all of them (those in our control) to utf-8 and
>     be done with it.
> 
> We could do this, but we still need to support other people's 
> Lisp code, so we can't drop support for other coding systems.
> 
> Even changing the default might break things for users.

Amen.

Some files need to be compatible with older Emacs versions.
Embedded Unicode chars in such files can be problematic.

(And using a BOM is also not a good idea, IMHO.)

What is wrong with continuing to use an explicit declaration for UTF-8?  I saw
no good argument for such a change.

This is the closest to an argument that I've seen:

> And given that utf-8 should be the standard encoding for Elisp
> files (if not quite now, surely in some not too distant future),
> this is an important case.

Where "this" is the case of optimizing the use of UTF-8 files by avoiding an
unnecessary `load-with-code-conversion'.

But AFAICT, nothing stops Emacs from doing that anyway.  Just declare that the
file is UTF-8 explicitly.  Where's the beef?

I am all in favor of using Unicode in Emacs and beyond.  Best thing that's
happened to Emacs in years, in fact.  But just because an encoding becomes the
new "standard" does not mean that other encodings should no longer be supported.

And that means continued support in older releases as well.  It does no good to
introduce something now to distinguish non-UTF-8 in new releases, if that won't
be recognized in older releases.


* RE: Loading souce Elisp faster
  2013-02-25 13:33 ` Kenichi Handa
  2013-02-25 13:50   ` Xue Fuqiao
@ 2013-02-25 15:35   ` Drew Adams
  1 sibling, 0 replies; 90+ messages in thread
From: Drew Adams @ 2013-02-25 15:35 UTC (permalink / raw)
  To: 'Kenichi Handa', 'Stefan Monnier'; +Cc: emacs-devel

> > So basically, all we need to do is to be able to easily 
> > recognize "Elisp source in utf-8 encoding".  One way to do
> > that would be to use a BOM-like marker, e.g. start utf-8
> > Elisp files with "\ufeff" either at the very beginning of
> > the file or right after a semi-colon (for better
> > backward compatibility).
> 
> We can have "coding: utf-8;" tag at the first line.  Isn't it
> enough?

Thank you.  A voice of reason.

So far, the proposed changes (and the entire discussion) sound like a solution
looking for a problem.

What's the problem with an explicit "coding: utf-8;" declaration?  Isn't it
enough?





* RE: Loading souce Elisp faster
  2013-02-25 13:50   ` Xue Fuqiao
@ 2013-02-25 15:35     ` Drew Adams
  2013-02-25 15:52     ` Eli Zaretskii
  1 sibling, 0 replies; 90+ messages in thread
From: Drew Adams @ 2013-02-25 15:35 UTC (permalink / raw)
  To: 'Xue Fuqiao', 'Kenichi Handa'
  Cc: 'Stefan Monnier', emacs-devel

> > We can have "coding: utf-8;" tag at the first line.  Isn't it
> > enough?
> 
> Yes, we can.  But most Elisp source files are already 
> in utf-8 encoding.  In (info "(elisp) Coding Conventions"):
> 
>      If your program contains non-ASCII characters in string or
>      character constants, you should make sure Emacs always decodes
>      these characters the same way, regardless of the user's settings.
>      The easiest way to do this is to use the coding system
>      `utf-8-emacs' (*note Coding System Basics::), and specify that
>      coding in the `-*-' line or the local variables list.
> 
> Maybe this can be changed now (or in the near future), 
> because of the "99% utf-8 future".

Why?  What is wrong with UTF-8 code declaring itself as such?
Why break backward compatibility gratuitously?

What are you really saving by such a restrictive change?  A few chars at the
start of the file?  Is this only about laziness?

If you want to be sure, then make Emacs, upon saving, throw the user up against
the wall and ask if s?he is sure she really doesn't want to add a UTF-8
declaration, etc.

IOW, in the traditional Emacs way, have Emacs help you but not force you.  Have
Emacs make it easy to DTRT, but not stop you from doing otherwise.

And it is so simple to automatically add such a declaration line to every new
Emacs-Lisp file - automatic file headers.  Why hard-code this kind of thing?
Just let the file itself tell Emacs what it wants to be encoded in.


* RE: Loading souce Elisp faster
  2013-02-25 15:19   ` Stefan Monnier
@ 2013-02-25 15:36     ` Drew Adams
  2013-02-25 16:09       ` Stefan Monnier
  2013-02-25 21:51     ` Richard Stallman
  1 sibling, 1 reply; 90+ messages in thread
From: Drew Adams @ 2013-02-25 15:36 UTC (permalink / raw)
  To: 'Stefan Monnier', 'Richard Stallman'; +Cc: emacs-devel

> I think the breakage would be reasonably limited, so I'd be
> willing to take this risk,

_You_ would not be taking any risk for third-party code that needs to work with
multiple Emacs versions.  You would be making others jump through hoops.

They would likely need to split a file that now works across versions into two
files, one that works only prior to your gratuitous change and the other that
works only after it.

That kind of conditional test is something that we should do in software: (if X Y
Z).  We should not be making programmers duplicate and modify files to
accommodate encoding/version differences.

> but I'd rather avoid having to go 
> to the end of the file looking for a coding: cookie.

Do whatever you need to do to find the cookie.  Demand that the cookie be in the
first line, if you want.  No big deal.

But please do use a cookie (some kind of explicit declaration that is tolerated
by older Emacs versions).  Please do not make things hard for 3rd-party code
that works cross-version.


* Re: Loading souce Elisp faster
  2013-02-25  5:24         ` Paul Eggert
  2013-02-25  6:12           ` Xue Fuqiao
@ 2013-02-25 15:39           ` Eli Zaretskii
  2013-02-25 18:41             ` Paul Eggert
  1 sibling, 1 reply; 90+ messages in thread
From: Eli Zaretskii @ 2013-02-25 15:39 UTC (permalink / raw)
  To: Paul Eggert; +Cc: monnier, emacs-devel

> Date: Sun, 24 Feb 2013 21:24:06 -0800
> From: Paul Eggert <eggert@cs.ucla.edu>
> Cc: Emacs-Devel devel <emacs-devel@gnu.org>
> 
> It'd be easy to fix these to put the "coding:" tag
> at the start.  But how about simply converting all
> the non-UTF-8 Emacs source code files to UTF-8?

AFAIR, some (a small number, perhaps) of them cannot be converted,
because they use characters that we don't unify.

In any case, converting the files that come with Emacs will still
leave us with gobs of files in external packages, and also with user
.emacs files.




* Re: Loading souce Elisp faster
  2013-02-25 13:50   ` Xue Fuqiao
  2013-02-25 15:35     ` Drew Adams
@ 2013-02-25 15:52     ` Eli Zaretskii
  2013-02-25 22:39       ` Xue Fuqiao
  1 sibling, 1 reply; 90+ messages in thread
From: Eli Zaretskii @ 2013-02-25 15:52 UTC (permalink / raw)
  To: Xue Fuqiao; +Cc: handa, monnier, emacs-devel

> Date: Mon, 25 Feb 2013 21:50:38 +0800
> From: Xue Fuqiao <xfq.free@gmail.com>
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
> 
> > We can have "coding: utf-8;" tag at the first line.  Isn't it
> > enough?
> 
> Yes, we can.  But most Elisp source files are already in utf-8 encoding.
> 
> In (info "(elisp) Coding Conventions"):
> 
>      If your program contains non-ASCII characters in string or
>      character constants, you should make sure Emacs always decodes
>      these characters the same way, regardless of the user's settings.
>      The easiest way to do this is to use the coding system
>      `utf-8-emacs' (*note Coding System Basics::), and specify that
>      coding in the `-*-' line or the local variables list.

utf-8-emacs and utf-8 are not the same encoding.  The former is a
variant used by Emacs internally, it won't be 100% recognizable by any
program except Emacs.




* Re: Loading souce Elisp faster
  2013-02-25 15:36     ` Drew Adams
@ 2013-02-25 16:09       ` Stefan Monnier
  2013-02-25 16:31         ` Lennart Borgman
  0 siblings, 1 reply; 90+ messages in thread
From: Stefan Monnier @ 2013-02-25 16:09 UTC (permalink / raw)
  To: Drew Adams; +Cc: 'Richard Stallman', emacs-devel

>> I think the breakage would be reasonably limited, so I'd be
>> willing to take this risk,
> _You_ would not be taking any risk for third-party code that needs to
> work with multiple Emacs versions.  You would be making others jump
> through hoops.

Please think before you post such nonsense.


        Stefan




* Re: Loading souce Elisp faster
  2013-02-25  3:57       ` Stefan Monnier
                           ` (2 preceding siblings ...)
  2013-02-25  5:47         ` Leo Liu
@ 2013-02-25 16:28         ` Ted Zlatanov
  2013-02-27  4:18         ` Kenichi Handa
  4 siblings, 0 replies; 90+ messages in thread
From: Ted Zlatanov @ 2013-02-25 16:28 UTC (permalink / raw)
  To: emacs-devel

On Sun, 24 Feb 2013 22:57:27 -0500 Stefan Monnier <monnier@iro.umontreal.ca> wrote: 

>> Why not assume it is utf-8? ;-)
>> Unless there is a "-*- coding: ..." comment.

SM> Hmm... that's very appealing, but what about those Elisp files that have
SM> a "coding:" tag at the end of the file?  Or are these so rare it's not
SM> worth the trouble to worry about them?

10 years ago, maybe.  Today, it's definitely not worth the trouble IMHO.

Ted





* Re: Loading souce Elisp faster
  2013-02-25 16:09       ` Stefan Monnier
@ 2013-02-25 16:31         ` Lennart Borgman
  2013-02-25 18:31           ` Stefan Monnier
  2013-02-26  4:54           ` Stephen J. Turnbull
  0 siblings, 2 replies; 90+ messages in thread
From: Lennart Borgman @ 2013-02-25 16:31 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Richard M. Stallman, Drew Adams, Emacs-Devel devel


The .el8 (and .emacs8) way seems both fast and free of backward
compatibility trouble.



* Re: Loading souce Elisp faster
  2013-02-25  1:40 Loading souce Elisp faster Stefan Monnier
                   ` (3 preceding siblings ...)
  2013-02-25 13:33 ` Kenichi Handa
@ 2013-02-25 16:45 ` David Engster
  2013-02-26 12:56   ` Richard Stallman
  2013-02-25 21:12 ` Ivan Kanis
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 90+ messages in thread
From: David Engster @ 2013-02-25 16:45 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier writes:
> It used to be the case that compiling one's .emacs was silly because it
> provided no measurable speed difference.  But nowadays this is not true
> any more: loading a source Elisp file is significantly slower because it
> goes through load-with-code-conversion.

A very rough estimate through 'benchmark-run' indicates that
`find-file-literally' is indeed about 10x faster than
`find-file'. Still, the latter on my .emacs takes only about 0.02s on my
very slow laptop, so I don't really see the point. The difference might
be measurable, but surely not noticeable.

-David
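
A comparison along these lines can be reproduced with `benchmark-run`.  The sketch below times reading a file into a temporary buffer with and without decoding.  That approximates the `find-file` versus `find-file-literally` measurement, but it is not the same as timing `load` itself.  `my-compare-read-times` is a hypothetical helper and the repetition count of 100 is arbitrary.

```elisp
(require 'benchmark)

(defun my-compare-read-times (file)
  "Return (DECODED . LITERAL) elapsed seconds for 100 reads of FILE.
DECODED uses the normal coding-system machinery; LITERAL reads raw
bytes.  Hypothetical helper, not part of Emacs."
  (cons
   ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
   (car (benchmark-run 100
          (with-temp-buffer (insert-file-contents file))))
   (car (benchmark-run 100
          (with-temp-buffer (insert-file-contents-literally file))))))
```

Calling `(my-compare-read-times "~/.emacs")` gives both totals, so the ratio and the absolute difference can be judged separately.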




* Re: Loading souce Elisp faster
  2013-02-25 16:31         ` Lennart Borgman
@ 2013-02-25 18:31           ` Stefan Monnier
  2013-02-25 19:20             ` Lennart Borgman
  2013-02-26  4:54           ` Stephen J. Turnbull
  1 sibling, 1 reply; 90+ messages in thread
From: Stefan Monnier @ 2013-02-25 18:31 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: Richard M. Stallman, Drew Adams, Emacs-Devel devel

> The .el8 (and .emacs8) way seems both fast and free of backward
> compatibility trouble.

We need to worry about both backward and forward compatibility, so this
choice is rather undesirable.


        Stefan




* Re: Loading souce Elisp faster
  2013-02-25 15:39           ` Eli Zaretskii
@ 2013-02-25 18:41             ` Paul Eggert
  2013-02-26  7:23               ` Werner LEMBERG
  0 siblings, 1 reply; 90+ messages in thread
From: Paul Eggert @ 2013-02-25 18:41 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 02/25/13 07:39, Eli Zaretskii wrote:
> AFAIR, some (a small number, perhaps) of them cannot be converted,
> because they use characters that we don't unify.

Thanks for bringing this up.  I'll check for this before converting
the Emacs source code to UTF-8.  Files that can't be converted,
I'll leave in their existing coding system, but I'll move the
"coding: WHATEVER" tag to the front as that's more convenient
for processing.

> converting the files that come with Emacs will still
> leave us with gobs of files in external packages, and also with user
> .emacs files.

Absolutely.  We should support those files somehow.
Still, Emacs itself should set a good example, and these
days UTF-8 is the way to go unless there's a good reason
otherwise.


* Re: Loading souce Elisp faster
  2013-02-25 18:31           ` Stefan Monnier
@ 2013-02-25 19:20             ` Lennart Borgman
  2013-02-25 20:53               ` Stefan Monnier
  0 siblings, 1 reply; 90+ messages in thread
From: Lennart Borgman @ 2013-02-25 19:20 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Richard M. Stallman, Drew Adams, Emacs-Devel devel

On Mon, Feb 25, 2013 at 7:31 PM, Stefan Monnier
<monnier@iro.umontreal.ca> wrote:
>> The .el8 (and .emacs8) way seems both fast and free of backward
>> compatibility trouble.
>
> We need to worry about both backward and forward compatibility, so this
> choice is rather undesirable.

When is compatibility a problem with this choice? (You can of course
put a "coding" comment too in the files.)



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-25 19:20             ` Lennart Borgman
@ 2013-02-25 20:53               ` Stefan Monnier
  2013-02-25 20:57                 ` Lennart Borgman
  0 siblings, 1 reply; 90+ messages in thread
From: Stefan Monnier @ 2013-02-25 20:53 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: Richard M. Stallman, Drew Adams, Emacs-Devel devel

> When is compatibility a problem with this choice? (You can of course
> put a "coding" comment too in the files.)

Older emacsen won't find the .el8 files.


        Stefan



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-25 20:53               ` Stefan Monnier
@ 2013-02-25 20:57                 ` Lennart Borgman
  2013-02-25 21:37                   ` Stefan Monnier
  0 siblings, 1 reply; 90+ messages in thread
From: Lennart Borgman @ 2013-02-25 20:57 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Richard M. Stallman, Drew Adams, Emacs-Devel devel

On Mon, Feb 25, 2013 at 9:53 PM, Stefan Monnier
<monnier@iro.umontreal.ca> wrote:
>> When is compatibility a problem with this choice? (You can of course
>> put a "coding" comment too in the files.)
>
> Older emacsen won't find the .el8 files.

Yes, but when will that be a compatibility problem?



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-25  1:40 Loading souce Elisp faster Stefan Monnier
                   ` (4 preceding siblings ...)
  2013-02-25 16:45 ` David Engster
@ 2013-02-25 21:12 ` Ivan Kanis
  2013-02-25 22:47   ` Glenn Morris
  2013-02-25 21:17 ` Barry Warsaw
  2013-02-26  7:59 ` Andreas Röhler
  7 siblings, 1 reply; 90+ messages in thread
From: Ivan Kanis @ 2013-02-25 21:12 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

February, 24 at 20:40 Stefan wrote:

> It used to be the case that compiling one's .emacs was silly because it
> provided no measurable speed difference.  But nowadays this is not true
> any more: loading a source Elisp file is significantly slower because it
> goes through load-with-code-conversion.

Why don't you let the user decide if she needs to compile her .emacs?
-- 
Nobody ever went broke underestimating the intelligence of the
American public.
    -- H L Mencken



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-25  1:40 Loading souce Elisp faster Stefan Monnier
                   ` (5 preceding siblings ...)
  2013-02-25 21:12 ` Ivan Kanis
@ 2013-02-25 21:17 ` Barry Warsaw
  2013-02-26  7:10   ` Thierry Volpiatto
  2013-02-26  7:59 ` Andreas Röhler
  7 siblings, 1 reply; 90+ messages in thread
From: Barry Warsaw @ 2013-02-25 21:17 UTC (permalink / raw)
  To: emacs-devel

On Feb 24, 2013, at 08:40 PM, Stefan Monnier wrote:

>It used to be the case that compiling one's .emacs was silly because it
>provided no measurable speed difference.  But nowadays this is not true
>any more: loading a source Elisp file is significantly slower because it
>goes through load-with-code-conversion.

I didn't know that, and it's a shame for me because I've loaded all my
personal emacs files as source since the beginning of time (makes it easier to
integrate them with revision control systems).

>So basically, all we need to do is to be able to easily recognize "Elisp
>source in utf-8 encoding".  One way to do that would be to use
>a BOM-like marker, e.g. start utf-8 Elisp files with "\ufeff" either at
>the very beginning of the file or right after a semi-colon (for better
>backward compatibility).

Stephen's suggestion of assuming utf-8, with -*- overriding and warnings if
the end of file encoding landmarks don't match seems reasonable to me.

Cheers,
-Barry

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-25 20:57                 ` Lennart Borgman
@ 2013-02-25 21:37                   ` Stefan Monnier
  2013-02-25 21:57                     ` Lennart Borgman
  0 siblings, 1 reply; 90+ messages in thread
From: Stefan Monnier @ 2013-02-25 21:37 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: Richard M. Stallman, Drew Adams, Emacs-Devel devel

>>> When is compatibility a problem with this choice? (You can of course
>>> put a "coding" comment too in the files.)
>> Older emacsen won't find the .el8 files.
> Yes, but when will that be a compatibility problem?

When someone wants to write a package that works on older and
newer Emacsen.


        Stefan



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-25 15:19   ` Stefan Monnier
  2013-02-25 15:36     ` Drew Adams
@ 2013-02-25 21:51     ` Richard Stallman
  2013-02-25 23:54       ` Stefan Monnier
  1 sibling, 1 reply; 90+ messages in thread
From: Richard Stallman @ 2013-02-25 21:51 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

    The fast path (used by elc) parses the data straight from the file,
    whereas the slow path loads the file into a buffer (which includes any
    decoding if needed) and only then passes the resulting byte-stream to
    the parser.

If the coding system were known in advance to be utf-8, it would
still need to be read into a buffer, decoded, and parsed from there.

You can see how fast this would be by putting a coding tag
on the file.  How fast is it, compared with reading source
without the coding tag?  How fast is it, compared with reading the compiled
file?

-- 
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org  www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
  Use Ekiga or an ordinary phone call




^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-25 21:37                   ` Stefan Monnier
@ 2013-02-25 21:57                     ` Lennart Borgman
  2013-02-25 23:59                       ` Lennart Borgman
  0 siblings, 1 reply; 90+ messages in thread
From: Lennart Borgman @ 2013-02-25 21:57 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Richard M. Stallman, Drew Adams, Emacs-Devel devel

On Mon, Feb 25, 2013 at 10:37 PM, Stefan Monnier
<monnier@iro.umontreal.ca> wrote:
>>>> When is compatibility a problem with this choice? (You can of course
>>>> put a "coding" comment too in the files.)
>>> Older emacsen won't find the .el8 files.
>> Yes, but when will that be a compatibility problem?
>
> When someone wants to write a package that works on older and
> newer Emacsen.

Aren't packages normally byte compiled? (So using the .el files is no
problem there.)



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-25 15:52     ` Eli Zaretskii
@ 2013-02-25 22:39       ` Xue Fuqiao
  2013-02-26  3:48         ` Eli Zaretskii
  0 siblings, 1 reply; 90+ messages in thread
From: Xue Fuqiao @ 2013-02-25 22:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: handa, monnier, emacs-devel

On Mon, 25 Feb 2013 17:52:50 +0200
Eli Zaretskii <eliz@gnu.org> wrote:

> utf-8-emacs and utf-8 are not the same encoding.  The former is a
> variant used by Emacs internally, it won't be 100% recognizable by any
> program except Emacs.

Why `utf-8-emacs'?  Why not `utf-8-auto' or `utf-8'?

-- 
Best regards, Xue Fuqiao.
http://www.emacswiki.org/emacs/XueFuqiao



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-25 21:12 ` Ivan Kanis
@ 2013-02-25 22:47   ` Glenn Morris
  0 siblings, 0 replies; 90+ messages in thread
From: Glenn Morris @ 2013-02-25 22:47 UTC (permalink / raw)
  To: Ivan Kanis; +Cc: Stefan Monnier, emacs-devel

Ivan Kanis wrote:

> February, 24 at 20:40 Stefan wrote:
>
>> It used to be the case that compiling one's .emacs was silly because it
>> provided no measurable speed difference.  But nowadays this is not true
>> any more: loading a source Elisp file is significantly slower because it
>> goes through load-with-code-conversion.
>
> Why don't you let the user decide if she needs to compile her .emacs?

(That's something of a non-sequitur...)

You're free to compile it if you like. I'm free to have an opinion about
whether it's worthwhile (and to write about it in the Emacs manual,
which I did 5 years ago.)




^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-25 21:51     ` Richard Stallman
@ 2013-02-25 23:54       ` Stefan Monnier
  0 siblings, 0 replies; 90+ messages in thread
From: Stefan Monnier @ 2013-02-25 23:54 UTC (permalink / raw)
  To: Richard Stallman; +Cc: emacs-devel

>     The fast path (used by elc) parses the data straight from the file,
>     whereas the slow path loads the file into a buffer (which includes any
>     decoding if needed) and only then passes the resulting byte-stream to
>     the parser.
> If the coding system were known in advance to be utf-8, it would
> still need to be read into a buffer, decoded, and parsed from there.

No, the .elc files are encoded in utf-8 and yet they're read straight
from the file without going through a buffer.


        Stefan



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-25 21:57                     ` Lennart Borgman
@ 2013-02-25 23:59                       ` Lennart Borgman
  2013-02-26 17:27                         ` Achim Gratz
  0 siblings, 1 reply; 90+ messages in thread
From: Lennart Borgman @ 2013-02-25 23:59 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Richard M. Stallman, Drew Adams, Emacs-Devel devel

On Mon, Feb 25, 2013 at 10:57 PM, Lennart Borgman
<lennart.borgman@gmail.com> wrote:
> On Mon, Feb 25, 2013 at 10:37 PM, Stefan Monnier
> <monnier@iro.umontreal.ca> wrote:
>>>>> When is compatibility a problem with this choice? (You can of course
>>>>> put a "coding" comment too in the files.)
>>>> Older emacsen won't find the .el8 files.
>>> Yes, but when will that be a compatibility problem?
>>
>> When someone wants to write a package that works on older and
>> newer Emacsen.
>
> Aren't packages normally byte compiled? (So using the .el files is no
> problem there.)

Hm. What is needed to teach old Emacsen to load .el8 files? (Just as
they load .el files.)



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-25 22:39       ` Xue Fuqiao
@ 2013-02-26  3:48         ` Eli Zaretskii
  2013-02-26 10:44           ` Xue Fuqiao
  0 siblings, 1 reply; 90+ messages in thread
From: Eli Zaretskii @ 2013-02-26  3:48 UTC (permalink / raw)
  To: Xue Fuqiao; +Cc: handa, monnier, emacs-devel

> Date: Tue, 26 Feb 2013 06:39:04 +0800
> From: Xue Fuqiao <xfq.free@gmail.com>
> Cc: handa@gnu.org, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> 
> On Mon, 25 Feb 2013 17:52:50 +0200
> Eli Zaretskii <eliz@gnu.org> wrote:
> 
> > utf-8-emacs and utf-8 are not the same encoding.  The former is a
> > variant used by Emacs internally, it won't be 100% recognizable by any
> > program except Emacs.
> 
> Why `utf-8-emacs'?  Why not `utf-8-auto' or `utf-8'?

Are you asking me why it was named that way?  Because it's specific to
Emacs, I guess.



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-25 16:31         ` Lennart Borgman
  2013-02-25 18:31           ` Stefan Monnier
@ 2013-02-26  4:54           ` Stephen J. Turnbull
  2013-02-26  8:29             ` Ulrich Mueller
  1 sibling, 1 reply; 90+ messages in thread
From: Stephen J. Turnbull @ 2013-02-26  4:54 UTC (permalink / raw)
  To: Lennart Borgman
  Cc: Emacs-Devel devel, Stefan Monnier, Drew Adams,
	Richard M. Stallman

Lennart Borgman writes:

 > The .el8 (and .emacs8) way seems both fast and excluding backward
 > compatibility trouble.

It's ugly, though.



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-25 21:17 ` Barry Warsaw
@ 2013-02-26  7:10   ` Thierry Volpiatto
  0 siblings, 0 replies; 90+ messages in thread
From: Thierry Volpiatto @ 2013-02-26  7:10 UTC (permalink / raw)
  To: emacs-devel

Barry Warsaw <barry@python.org> writes:

> Stephen's suggestion of assuming utf-8, with -*- overriding and warnings if
> the end of file encoding landmarks don't match seems reasonable to me.

I agree with this.  People who already put "coding: utf-8" at the end
of their files in the local variables section (like me) could move
this tag to the top of the file to keep compatibility with older
Emacs; in that case (even if the tag is not needed for emacs-24.4
because the default is utf-8), no warnings please ;-)

-- 
Thierry
Get my Gnupg key:
gpg --keyserver pgp.mit.edu --recv-keys 59F29997 

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-25 18:41             ` Paul Eggert
@ 2013-02-26  7:23               ` Werner LEMBERG
  2013-02-26  8:48                 ` Andreas Schwab
  0 siblings, 1 reply; 90+ messages in thread
From: Werner LEMBERG @ 2013-02-26  7:23 UTC (permalink / raw)
  To: eggert; +Cc: eliz, emacs-devel


> I'll check for this before converting the Emacs source code to
> UTF-8.  Files that can't be converted, I'll leave in their existing
> coding system, but I'll move the "coding: WHATEVER" tag to the front
> as that's more convenient for processing.

Please let Handa-san have a look at your file-coding changes before
committing them.  I think he knows those issues best.


    Werner



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-25  1:40 Loading souce Elisp faster Stefan Monnier
                   ` (6 preceding siblings ...)
  2013-02-25 21:17 ` Barry Warsaw
@ 2013-02-26  7:59 ` Andreas Röhler
  7 siblings, 0 replies; 90+ messages in thread
From: Andreas Röhler @ 2013-02-26  7:59 UTC (permalink / raw)
  To: emacs-devel; +Cc: Stefan Monnier

Am 25.02.2013 02:40, schrieb Stefan Monnier:
> It used to be the case that compiling one's .emacs was silly because it
> provided no measurable speed difference.  But nowadays this is not true
> any more: loading a source Elisp file is significantly slower because it
> goes through load-with-code-conversion.
>
> For source files in utf-8 encoding this does not need to be the case: we
> could load them without going through load-with-code-conversion.
> And given that utf-8 should be the standard encoding for Elisp files
> (if not quite now, surely in some not too distant future), this is an
> important case.
>
> So basically, all we need to do is to be able to easily recognize "Elisp
> source in utf-8 encoding".  One way to do that would be to use
> a BOM-like marker, e.g. start utf-8 Elisp files with "\ufeff" either at
> the very beginning of the file or right after a semi-colon (for better
> backward compatibility).
>
>
>          Stefan
>
>

Just out of curiosity, might the file magic number system be suitable?
I have seen this one mentioned somewhere:

EF BB BF
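
For reference, EF BB BF is the UTF-8 encoding of U+FEFF, the same
character as the "\ufeff" marker proposed at the start of the thread.
A minimal sketch of checking for those three bytes from Lisp, assuming
a hypothetical helper name `my-file-has-utf8-bom-p' (not an existing
Emacs function):

```elisp
;; Hypothetical helper, for illustration only.
(defun my-file-has-utf8-bom-p (file)
  "Return non-nil if FILE begins with the bytes EF BB BF."
  (with-temp-buffer
    ;; Work on raw bytes so that no decoding takes place.
    (set-buffer-multibyte nil)
    ;; Read only the first three bytes of FILE.
    (insert-file-contents-literally file nil 0 3)
    (equal (buffer-string) (unibyte-string #xEF #xBB #xBF))))
```

This only inspects the very beginning of the file; the variant with the
marker placed after a semi-colon would need a slightly longer read.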

Andreas



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26  4:54           ` Stephen J. Turnbull
@ 2013-02-26  8:29             ` Ulrich Mueller
  0 siblings, 0 replies; 90+ messages in thread
From: Ulrich Mueller @ 2013-02-26  8:29 UTC (permalink / raw)
  To: Stephen J. Turnbull
  Cc: Richard M. Stallman, Lennart Borgman, Stefan Monnier, Drew Adams,
	Emacs-Devel devel

>>>>> On Tue, 26 Feb 2013, Stephen J Turnbull wrote:

> Lennart Borgman writes:
>> The .el8 (and .emacs8) way seems both fast and excluding backward
>> compatibility trouble.

> It's ugly, though.

Not only that, it would also break compatibility with third-party
tools. For example, .el is in the suffix list of Make. Or there's the
elisp-comp script in various packages (also in gnulib, IIRC). Not to
mention installation scripts of distros.

Ulrich



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26  7:23               ` Werner LEMBERG
@ 2013-02-26  8:48                 ` Andreas Schwab
  0 siblings, 0 replies; 90+ messages in thread
From: Andreas Schwab @ 2013-02-26  8:48 UTC (permalink / raw)
  To: Werner LEMBERG; +Cc: eliz, eggert, emacs-devel

Werner LEMBERG <wl@gnu.org> writes:

>> I'll check for this before converting the Emacs source code to
>> UTF-8.  Files that can't be converted, I'll leave in their existing
>> coding system, but I'll move the "coding: WHATEVER" tag to the front
>> as that's more convenient for processing.
>
> Please let Handa-san have a look at your file-coding changes before
> committing them.  I think he knows those issues best.

Generally, those that use iso-2022 need to stay that way.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26  3:48         ` Eli Zaretskii
@ 2013-02-26 10:44           ` Xue Fuqiao
  2013-02-26 17:02             ` Eli Zaretskii
  0 siblings, 1 reply; 90+ messages in thread
From: Xue Fuqiao @ 2013-02-26 10:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: handa, monnier, emacs-devel

On Tue, 26 Feb 2013 05:48:09 +0200
Eli Zaretskii <eliz@gnu.org> wrote:

> > > utf-8-emacs and utf-8 are not the same encoding.  The former is a
> > > variant used by Emacs internally, it won't be 100% recognizable by any
> > > program except Emacs.
> > Why `utf-8-emacs'?  Why not `utf-8-auto' or `utf-8'?
> Are you asking me why it was named that way?  Because it's specific to
> Emacs, I guess.

I didn't mean that, sorry.  I mean: why is `utf-8-emacs' (i.e.,
`emacs-internal') recommended in "(elisp) Coding Conventions"?

-- 
Best regards, Xue Fuqiao.
http://www.emacswiki.org/emacs/XueFuqiao



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-25 16:45 ` David Engster
@ 2013-02-26 12:56   ` Richard Stallman
  2013-02-26 16:26     ` David Engster
                       ` (2 more replies)
  0 siblings, 3 replies; 90+ messages in thread
From: Richard Stallman @ 2013-02-26 12:56 UTC (permalink / raw)
  To: David Engster; +Cc: monnier, emacs-devel

    A very rough estimate through 'benchmark-run' indicates that
    `find-file-literally' is indeed about 10x faster than
    `find-file'.

find-file-literally avoids DOING the code conversion.  However, even
if we use utf-8 as the default, or unconditionally, conversion would
still need to be done.

All we could potentially avoid with the proposal for utf-8 as default
is the step of choosing the coding system heuristically.  This would
not make it as fast as find-file-literally.  It would make it as fast
as if you had a coding: tag on the first line.  How much speedup is
that?

If we could manage to speed up the entire code conversion stage,
that would be a much bigger improvement, helping every user.

-- 
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org  www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
  Use Ekiga or an ordinary phone call

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26 12:56   ` Richard Stallman
@ 2013-02-26 16:26     ` David Engster
  2013-02-26 20:19       ` Richard Stallman
  2013-02-26 16:57     ` Eli Zaretskii
  2013-02-27  2:22     ` Stephen J. Turnbull
  2 siblings, 1 reply; 90+ messages in thread
From: David Engster @ 2013-02-26 16:26 UTC (permalink / raw)
  To: Richard Stallman; +Cc: monnier, emacs-devel

Richard Stallman writes:
>     A very rough estimate through 'benchmark-run' indicates that
>     `find-file-literally' is indeed about 10x faster than
>     `find-file'.
>
> find-file-literally avoids DOING the code conversion.  However, even
> if we use utf-8 as the default, or unconditionally, conversion would
> still need to be done.
>
> All we could potentially avoid with the proposal for utf-8 as default
> is the step of choosing the coding system heuristically.  This would
> not make it as fast as find-file-literally.  It would make it as fast
> as if you had a coding: tag on the first line.  How much speedup is
> that?

I wouldn't know how to test this without implementing it. Using
`find-file-literally' at least provides a number for the ideal case with
no decoding at all. All I was saying is that Stefan's proposal looks
like micro-optimization to me, since detecting the coding system of the
init file is only a very small part of Emacs' startup time.

-David



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26 12:56   ` Richard Stallman
  2013-02-26 16:26     ` David Engster
@ 2013-02-26 16:57     ` Eli Zaretskii
  2013-02-26 20:19       ` Richard Stallman
  2013-02-27  2:22     ` Stephen J. Turnbull
  2 siblings, 1 reply; 90+ messages in thread
From: Eli Zaretskii @ 2013-02-26 16:57 UTC (permalink / raw)
  To: rms; +Cc: monnier, deng, emacs-devel

> Date: Tue, 26 Feb 2013 07:56:22 -0500
> From: Richard Stallman <rms@gnu.org>
> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> 
>     A very rough estimate through 'benchmark-run' indicates that
>     `find-file-literally' is indeed about 10x faster than
>     `find-file'.
> 
> find-file-literally avoids DOING the code conversion.  However, even
> if we use utf-8 as the default, or unconditionally, conversion would
> still need to be done.

Since Emacs now uses utf-8 based encoding internally, conversion from
utf-8 is a no-op (except for the EOL conversion).



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26 10:44           ` Xue Fuqiao
@ 2013-02-26 17:02             ` Eli Zaretskii
  0 siblings, 0 replies; 90+ messages in thread
From: Eli Zaretskii @ 2013-02-26 17:02 UTC (permalink / raw)
  To: Xue Fuqiao; +Cc: handa, monnier, emacs-devel

> Date: Tue, 26 Feb 2013 18:44:51 +0800
> From: Xue Fuqiao <xfq.free@gmail.com>
> Cc: handa@gnu.org, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> 
> I mean: why is `utf-8-emacs' (i.e., `emacs-internal') recommended in
> "(elisp) Coding Conventions"?

Because, by its very definition, utf-8-emacs can encode every
character supported by Emacs, and reading such a file is guaranteed to
reproduce the original characters exactly.
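
The round-trip property can be checked from Lisp.  A small sketch,
assuming a hypothetical helper name `my-coding-roundtrips-p'; it only
tests the one string it is given, so it illustrates the claim rather
than proving it:

```elisp
;; Hypothetical helper, for illustration only.
(defun my-coding-roundtrips-p (string coding-system)
  "Return non-nil if STRING survives encoding and decoding in CODING-SYSTEM."
  (equal string
         (decode-coding-string
          (encode-coding-string string coding-system)
          coding-system)))

;; Example call:
;; (my-coding-roundtrips-p "λ" 'utf-8-emacs)
```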



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-25 23:59                       ` Lennart Borgman
@ 2013-02-26 17:27                         ` Achim Gratz
  2013-02-26 21:38                           ` Lennart Borgman
  0 siblings, 1 reply; 90+ messages in thread
From: Achim Gratz @ 2013-02-26 17:27 UTC (permalink / raw)
  To: emacs-devel

Lennart Borgman writes:
> Hm. What is needed to teach old Emacsen to load .el8 files? (Just as
> they load .el files.)

If they should pick up the new files before the "ordinary" ones:

(setq load-suffixes '(".elc" ".el8" ".el"))
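
To try that order without changing the global value, the variable can
be let-bound around a single lookup.  A sketch, assuming a hypothetical
library name "my-package" and the hypothetical ".el8" suffix from this
thread:

```elisp
;; `load-suffixes' is bound only for the duration of this form.
(let ((load-suffixes '(".elc" ".el8" ".el")))
  ;; Returns the first matching file found on `load-path', or nil.
  (locate-library "my-package"))
```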


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptations for Waldorf Q V3.00R3 and Q+ V3.54R2:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada




^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26 16:57     ` Eli Zaretskii
@ 2013-02-26 20:19       ` Richard Stallman
  2013-02-26 20:45         ` Eli Zaretskii
  0 siblings, 1 reply; 90+ messages in thread
From: Richard Stallman @ 2013-02-26 20:19 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, deng, emacs-devel

    Since Emacs now uses utf-8 based encoding internally, conversion from
    utf-8 is a no-op (except for the EOL conversion).

I don't think that is true.  The Emacs internal encoding is not
identical to Unicode, and conversion to or from utf-8 is not a no-op.

Maybe it is a no-op for coding system utf-8-emacs, but I won't swear to
that without its being checked.

-- 
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org  www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
  Use Ekiga or an ordinary phone call




^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26 16:26     ` David Engster
@ 2013-02-26 20:19       ` Richard Stallman
  2013-02-26 21:00         ` David Engster
  0 siblings, 1 reply; 90+ messages in thread
From: Richard Stallman @ 2013-02-26 20:19 UTC (permalink / raw)
  To: David Engster; +Cc: monnier, emacs-devel

    > All we could potentially avoid with the proposal for utf-8 as default
    > is the step of choosing the coding system heuristically.  This would
    > not make it as fast as find-file-literally.  It would make it as fast
    > as if you had a coding: tag on the first line.  How much speedup is
    > that?

    I wouldn't know how to test this without implementing it.

It is simple to test it.  Put `-*-coding: utf-8;-*-' on the first line
of the file, then visit it using find-file.  How fast is that?
How does it compare, in time, with the two cases you tested?

    All I was saying is that Stefan's proposal looks
    like micro-optimization to me, since detecting the coding system of the
    init file is only a very small part of Emacs' startup time.

My guess is the same as yours.  The method I suggest above will make
it possible to answer the matter factually.

-- 
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org  www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
  Use Ekiga or an ordinary phone call




^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26 20:19       ` Richard Stallman
@ 2013-02-26 20:45         ` Eli Zaretskii
  0 siblings, 0 replies; 90+ messages in thread
From: Eli Zaretskii @ 2013-02-26 20:45 UTC (permalink / raw)
  To: rms; +Cc: monnier, deng, emacs-devel

> Date: Tue, 26 Feb 2013 15:19:57 -0500
> From: Richard Stallman <rms@gnu.org>
> CC: deng@randomsample.de, monnier@iro.umontreal.ca,
> 	emacs-devel@gnu.org
> 
>     Since Emacs now uses utf-8 based encoding internally, conversion from
>     utf-8 is a no-op (except for the EOL conversion).
> 
> I don't think that is true.  The Emacs internal encoding is not
> identical to Unicode

It is identical for all the characters that are valid Unicode
codepoints.

> and conversion to or from utf-8 is not a no-op.

See decode_coding_utf_8.



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26 20:19       ` Richard Stallman
@ 2013-02-26 21:00         ` David Engster
  2013-02-26 21:12           ` Eli Zaretskii
                             ` (3 more replies)
  0 siblings, 4 replies; 90+ messages in thread
From: David Engster @ 2013-02-26 21:00 UTC (permalink / raw)
  To: Richard Stallman; +Cc: monnier, emacs-devel

Richard Stallman writes:
>     > All we could potentially avoid with the proposal for utf-8 as default
>     > is the step of choosing the coding system heuristically.  This would
>     > not make it as fast as find-file-literally.  It would make it as fast
>     > as if you had a coding: tag on the first line.  How much speedup is
>     > that?
>
>     I wouldn't know how to test this without implementing it.
>
> It is simple to test it.  Put `-*-coding: utf-8;-*-' on the first line
> of the file, then visit it using find-file.  How fast is that?
> How does it compare, in time, with the two cases you tested?

I actually tried that, but it took *longer* with the "coding: utf-8;"
comment, which is why I thought that this is apparently not working and
some additional changes are needed first.

Here's what I tried: I took files.el from emacs/lisp and copied it to
files_utf8.el and added the "coding" comment there. Then I used this:

(setq filename "~/files_utf8.el")
(setq final 0)
(dotimes (i 1000)
  (let ((res
	 (benchmark-run (find-file filename))))
    (setq final (+ final (car res)))
    (kill-buffer (find-buffer-visiting filename))))
(message "Result: %f" final)

Without the coding-comment, this takes ~7.5s, but *with* the
coding-comment it takes ~9.5s.
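
Note that the loop above times `find-file', i.e. visiting the file,
whereas the original question was about `load'.  A rough sketch that
times `load' directly, assuming a hypothetical helper name
`my-time-load'; pass the exact file name (".el" or ".elc"), since the
NOSUFFIX argument is set:

```elisp
(require 'benchmark)

;; Hypothetical helper, for illustration only.
(defun my-time-load (file repetitions)
  "Return the total seconds spent loading FILE REPETITIONS times."
  (let ((total 0.0))
    (dotimes (_i repetitions)
      ;; NOERROR nil, NOMESSAGE t, NOSUFFIX t: load exactly FILE, quietly.
      (setq total (+ total (benchmark-elapse (load file nil t t)))))
    total))

;; Example calls (file names are placeholders):
;; (my-time-load "~/files_utf8.el" 100)
;; (my-time-load "~/files_utf8.elc" 100)
```

Comparing the two calls on the same file, with and without a coding
tag, would give numbers for the cases asked about earlier in the
thread.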

-David



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26 21:00         ` David Engster
@ 2013-02-26 21:12           ` Eli Zaretskii
  2013-02-26 21:18             ` David Engster
  2013-02-27  1:22           ` Stefan Monnier
                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 90+ messages in thread
From: Eli Zaretskii @ 2013-02-26 21:12 UTC (permalink / raw)
  To: David Engster; +Cc: emacs-devel, rms, monnier

> From: David Engster <deng@randomsample.de>
> Date: Tue, 26 Feb 2013 22:00:34 +0100
> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> 
> I actually tried that, but it took *longer* with the "coding: utf-8;"
> comment, which is why I thought that this is apparently not working and
> some additional changes are needed first.
> 
> Here's what I tried: I took files.el from emacs/lisp and copied it to
> files_utf8.el and added the "coding" comment there. Then I used this:
> 
> (setq filename "~/files_utf8.el")
> (setq final 0)
> (dotimes (i 1000)
>   (let ((res
> 	 (benchmark-run (find-file filename))))
>     (setq final (+ final (car res)))
>     (kill-buffer (find-buffer-visiting filename))))
> (message "Result: %f" final)
> 
> Without the coding-comment, this takes ~7.5s, but *with* the
> coding-comment it takes ~9.5s.

Try "coding: utf-8-unix;" instead.



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26 21:12           ` Eli Zaretskii
@ 2013-02-26 21:18             ` David Engster
  2013-02-26 22:40               ` Xue Fuqiao
  2013-02-26 22:51               ` David Engster
  0 siblings, 2 replies; 90+ messages in thread
From: David Engster @ 2013-02-26 21:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, rms, monnier

Eli Zaretskii writes:
>> From: David Engster <deng@randomsample.de>
>> Date: Tue, 26 Feb 2013 22:00:34 +0100
>> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
>
>> 
>> I actually tried that, but it took *longer* with the "coding: utf-8;"
>> comment, which is why I thought that this is apparently not working and
>> some additional changes are needed first.
>> 
>> Here's what I tried: I took files.el from emacs/lisp and copied it to
>> files_utf8.el and added the "coding" comment there. Then I used this:
>> 
>> (setq filename "~/files_utf8.el")
>> (setq final 0)
>> (dotimes (i 1000)
>>   (let ((res
>> 	 (benchmark-run (find-file filename))))
>>     (setq final (+ final (car res)))
>>     (kill-buffer (find-buffer-visiting filename))))
>> (message "Result: %f" final)
>> 
>> Without the coding-comment, this takes ~7.5s, but *with* the
>> coding-comment it takes ~9.5s.
>
> Try "coding: utf-8-unix;" instead.

This only reduces the time a tiny bit (could well be within
variance). It's still almost 2s slower.

Maybe I'm doing something stupid. I'd feel more comfortable if someone
else could confirm this.

-David



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26 17:27                         ` Achim Gratz
@ 2013-02-26 21:38                           ` Lennart Borgman
  2013-02-26 21:43                             ` Dmitry Gutov
  0 siblings, 1 reply; 90+ messages in thread
From: Lennart Borgman @ 2013-02-26 21:38 UTC (permalink / raw)
  To: Achim Gratz; +Cc: Emacs-Devel devel

On Tue, Feb 26, 2013 at 6:27 PM, Achim Gratz <Stromeko@nexgo.de> wrote:
> Lennart Borgman writes:
>> Hm. What is needed to teach old Emacsen to load .el8 files? (Just as
>> they load .el files.)
>
> If they should pick up the new files before the "ordinary" ones:
>
> (setq load-suffixes '(".elc" ".el8" ".el"))
>
>
> Regards,
> Achim.

Thanks Achim. If the .el8 files contain a "coding" comment too then
compatibility does not seem to be a problem.



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26 21:38                           ` Lennart Borgman
@ 2013-02-26 21:43                             ` Dmitry Gutov
  2013-02-26 21:47                               ` Lennart Borgman
  0 siblings, 1 reply; 90+ messages in thread
From: Dmitry Gutov @ 2013-02-26 21:43 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: Achim Gratz, Emacs-Devel devel

Lennart Borgman <lennart.borgman@gmail.com> writes:

> On Tue, Feb 26, 2013 at 6:27 PM, Achim Gratz <Stromeko@nexgo.de> wrote:
>> Lennart Borgman writes:
>>> Hm. What is needed to teach old Emacsen to load .el8 files? (Just as
>>> they load .el files.)
>>
>> If they should pick up the new files before the "ordinary" ones:
>>
>> (setq load-suffixes '(".elc" ".el8" ".el"))
>
> Thanks Achim. If the .el8 files contain a "coding" comment too then
> compatibility does not seem to be a problem.

And where will this `setq' instruction reside? In each user's init file?



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26 21:43                             ` Dmitry Gutov
@ 2013-02-26 21:47                               ` Lennart Borgman
  0 siblings, 0 replies; 90+ messages in thread
From: Lennart Borgman @ 2013-02-26 21:47 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Achim Gratz, Emacs-Devel devel

On Tue, Feb 26, 2013 at 10:43 PM, Dmitry Gutov <dgutov@yandex.ru> wrote:
> Lennart Borgman <lennart.borgman@gmail.com> writes:
>
>> On Tue, Feb 26, 2013 at 6:27 PM, Achim Gratz <Stromeko@nexgo.de> wrote:
>>> Lennart Borgman writes:
>>>> Hm. What is needed to teach old Emacsen to load .el8 files? (Just as
>>>> they load .el files.)
>>>
>>> If they should pick up the new files before the "ordinary" ones:
>>>
>>> (setq load-suffixes '(".elc" ".el8" ".el"))
>>
>> Thanks Achim. If the .el8 files contain a "coding" comment too then
>> compatibility does not seem to be a problem.
>
> And where will this `setq' instruction reside? In each user's init file?

Somewhere... ;-)
Stefan gave as an example packages written to be used by both new and
old Emacsen. It should be possible to handle it in the package or the
package handler.



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26 21:18             ` David Engster
@ 2013-02-26 22:40               ` Xue Fuqiao
  2013-02-26 22:51               ` David Engster
  1 sibling, 0 replies; 90+ messages in thread
From: Xue Fuqiao @ 2013-02-26 22:40 UTC (permalink / raw)
  To: David Engster; +Cc: Eli Zaretskii, rms, monnier, emacs-devel

On Tue, 26 Feb 2013 22:18:27 +0100
David Engster <deng@randomsample.de> wrote:

> Eli Zaretskii writes:
> >> I actually tried that, but it took *longer* with the "coding: utf-8;"
> >> comment, which is why I thought that this is apparently not working and
> >> some additional changes are needed first.
> >> Here's what I tried: I took files.el from emacs/lisp and copied it to
> >> files_utf8.el and added the "coding" comment there. Then I used this:
> >> (setq filename "~/files_utf8.el")
> >> (setq final 0)
> >> (dotimes (i 1000)
> >>   (let ((res
> >> 	 (benchmark-run (find-file filename))))
> >>     (setq final (+ final (car res)))
> >>     (kill-buffer (find-buffer-visiting filename))))
> >> (message "Result: %f" final)
> >> 
> >> Without the coding-comment, this takes ~7.5s, but *with* the
> >> coding-comment it takes ~9.5s.
> > Try "coding: utf-8-unix;" instead.
> This only reduces the time a tiny bit (could well be within
> variance). It's still almost 2s slower.
> Maybe I'm doing something stupid. I'd feel more comfortable if someone
> else could confirm this.

On my machine:

without coding-comment: 8.345521

with `utf-8': 11.226320

with `utf-8-unix': 10.710828

with `utf-8-emacs': 12.518312

with `utf-8-auto': 12.239123

> -David

-- 
Best regards, Xue Fuqiao.
http://www.emacswiki.org/emacs/XueFuqiao



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26 21:18             ` David Engster
  2013-02-26 22:40               ` Xue Fuqiao
@ 2013-02-26 22:51               ` David Engster
  2013-02-27  0:44                 ` Drew Adams
  1 sibling, 1 reply; 90+ messages in thread
From: David Engster @ 2013-02-26 22:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, rms, monnier

David Engster writes:
> Eli Zaretskii writes:
>>> From: David Engster <deng@randomsample.de>
>>> Date: Tue, 26 Feb 2013 22:00:34 +0100
>>> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
>
>>
>>> 
>>> I actually tried that, but it took *longer* with the "coding: utf-8;"
>>> comment, which is why I thought that this is apparently not working and
>>> some additional changes are needed first.
>>> 
>>> Here's what I tried: I took files.el from emacs/lisp and copied it to
>>> files_utf8.el and added the "coding" comment there. Then I used this:
>>> 
>>> (setq filename "~/files_utf8.el")
>>> (setq final 0)
>>> (dotimes (i 1000)
>>>   (let ((res
>>> 	 (benchmark-run (find-file filename))))
>>>     (setq final (+ final (car res)))
>>>     (kill-buffer (find-buffer-visiting filename))))
>>> (message "Result: %f" final)
>>> 
>>> Without the coding-comment, this takes ~7.5s, but *with* the
>>> coding-comment it takes ~9.5s.
>>
>> Try "coding: utf-8-unix;" instead.
>
> This only reduces the time a tiny bit (could well be within
> variance). It's still almost 2s slower.
>
> Maybe I'm doing something stupid. I'd feel more comfortable if someone
> else could confirm this.

Well, it seems this happens because files.el is not UTF-8 encoded but
plain ASCII. If I put some multibyte characters in it and save it as
UTF-8, loading it becomes slower: it then takes ~12 seconds. If I put in
the magic utf8-comment, it stays at ~9.5s.

So unless I made some mistake, it seems that putting a "coding: utf-8"
comment into an actual UTF-8 encoded file speeds up loading, while
putting it in a plain ASCII file slows it down, by roughly 20% in each
case.
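
Worked through with the rough timings quoted above (1000 visits each, so
only the order of magnitude is meaningful), the two changes are not quite
symmetric:

```latex
\[
\text{ASCII file, tag added: } \frac{9.5 - 7.5}{7.5} \approx +27\%,
\qquad
\text{UTF-8 file, tag added: } \frac{9.5 - 12}{12} \approx -21\%.
\]
```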

-David



^ permalink raw reply	[flat|nested] 90+ messages in thread

* RE: Loading souce Elisp faster
  2013-02-26 22:51               ` David Engster
@ 2013-02-27  0:44                 ` Drew Adams
  0 siblings, 0 replies; 90+ messages in thread
From: Drew Adams @ 2013-02-27  0:44 UTC (permalink / raw)
  To: 'David Engster', 'Eli Zaretskii'
  Cc: rms, monnier, emacs-devel

> it seems that putting a "coding: utf-8"
> comment into an actual UTF-8 encoded file speeds up loading, while
> putting it in a plain ASCII file slows it down, by roughly 20% in
> each case.

What proportion of the current Emacs source code distributed by GNU (measured
in, say, chars, not numbers of files) currently requires something other than
ASCII?  If the ratio is low, this would mean quite a slowdown.  And for what?

If that's the case, why not use UTF-8 only as needed?

Of course, if the slowdown can be fixed that's even better.  It would be good to
use UTF-8 for all distributed files, other things being equal.  The question is
whether other things are equal (or almost equal).

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26 21:00         ` David Engster
  2013-02-26 21:12           ` Eli Zaretskii
@ 2013-02-27  1:22           ` Stefan Monnier
  2013-02-27  1:56             ` Lennart Borgman
                               ` (2 more replies)
  2013-02-27  4:49           ` Kenichi Handa
  2013-02-27 14:27           ` Richard Stallman
  3 siblings, 3 replies; 90+ messages in thread
From: Stefan Monnier @ 2013-02-27  1:22 UTC (permalink / raw)
  To: Richard Stallman; +Cc: emacs-devel

> I actually tried that, but it took *longer* with the "coding: utf-8;"
> comment, which is why I thought that this is apparently not working and
> some additional changes are needed first.

In any case, this is not relevant to the question at hand, which aims to
completely side-step find-file when `load'ing a (utf-8) .el file.

As for the .el8 proposal: seriously, the problem I'd like to address is
(as others have said several times already) a very minor one, so the
solution has to be *really* lightweight to be worth the trouble.
Switching to .el8 would bring a whole lot of trouble for *very*
little benefit.

E.g. writing the C code that checks for the presence of a `coding' tag
is already more trouble than I'm willing to go through to fix
this issue.

OTOH, the "new utf-8 marker" I'd like to come up with should ideally
also work as a "lexical-binding" marker.  IOW it would serve as a kind of
"new Elisp" marker.  So maybe a unicode BOM marker would be too discreet
and we could go for something more clearly visible instead.
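
For scale, the check such a marker needs is small.  A minimal sketch in
C, assuming the marker is U+FEFF encoded as the three bytes EF BB BF and
may sit either at the very start of the file or right after a leading
semicolon (the two placements suggested at the start of this thread).
has_utf8_marker is a hypothetical name, not an existing Emacs function:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* UTF-8 encoding of U+FEFF, the marker assumed by this sketch.  */
static const unsigned char utf8_bom[3] = { 0xEF, 0xBB, 0xBF };

/* Return 1 if BUF (the first N bytes of a file) carries the marker
   at offset 0 or immediately after a leading ';', else 0.  */
int
has_utf8_marker (const unsigned char *buf, size_t n)
{
  assert (buf != NULL || n == 0);
  if (n >= 3 && memcmp (buf, utf8_bom, 3) == 0)
    return 1;
  if (n >= 4 && buf[0] == ';' && memcmp (buf + 1, utf8_bom, 3) == 0)
    return 1;
  return 0;
}
```

A more visible marker, as suggested above, would change only the bytes
being compared, not the shape of the test.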


        Stefan



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-27  1:22           ` Stefan Monnier
@ 2013-02-27  1:56             ` Lennart Borgman
  2013-02-27 14:28             ` Richard Stallman
  2013-02-27 18:31             ` Achim Gratz
  2 siblings, 0 replies; 90+ messages in thread
From: Lennart Borgman @ 2013-02-27  1:56 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Richard Stallman, Emacs-Devel devel

On Wed, Feb 27, 2013 at 2:22 AM, Stefan Monnier
<monnier@iro.umontreal.ca> wrote:
>
> As for the .el8 proposal: seriously, the problem I'd like to address is
> (as others have said several times already) a very minor one, so the
> solution has to be *really* lightweight to be worth the trouble.
> Switching to .el8 would bring a whole lot of trouble for *very*
> little benefit.

It is not switching. It is adding .el8 as an alternative where the
parser no longer checks for the "coding" comment. But I will not say
more about this. ;-)



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26 12:56   ` Richard Stallman
  2013-02-26 16:26     ` David Engster
  2013-02-26 16:57     ` Eli Zaretskii
@ 2013-02-27  2:22     ` Stephen J. Turnbull
  2 siblings, 0 replies; 90+ messages in thread
From: Stephen J. Turnbull @ 2013-02-27  2:22 UTC (permalink / raw)
  To: rms; +Cc: monnier, David Engster, emacs-devel

Richard Stallman writes:

 > If we could manage to speed up the entire code conversion stage,
 > that would be a much bigger improvement, helping every user.

I suppose anything's possible, but when this issue came up for us
careful measurements showed that code conversion itself causes delays
only if the whole file is already in memory -- otherwise code
conversion throughput is at least as high as disk I/O throughput.  I
don't recall the details; that was done in the last millennium.

Again, for XEmacs, comparisons of find-file-literally to anything else
(in particular, find-file) tell you very little about coding
conversion speed because find-file does a lot of processing in Lisp
that find-file-literally does not.  (files.el is nowhere near big
enough to amortize that cost; I would go for a file in the 10MB range
or bigger.)  I believe that Emacs is the same.

While presumably hooks and file handlers aren't actually called in the
tests reported in this thread, coding *detection* is costly, as it
involves a couple of extra seeks and reads on disk.  This is exactly
what the proposal to default Lisp files to UTF-8 addresses.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-25  3:57       ` Stefan Monnier
                           ` (3 preceding siblings ...)
  2013-02-25 16:28         ` Ted Zlatanov
@ 2013-02-27  4:18         ` Kenichi Handa
  2013-02-27 13:48           ` Stefan Monnier
  2013-02-27 14:28           ` Richard Stallman
  4 siblings, 2 replies; 90+ messages in thread
From: Kenichi Handa @ 2013-02-27  4:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: lennart.borgman, emacs-devel

In article <jwvobf9avx9.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:

> > Why not assuming it is utf-8? ;-)
> > Unless there is a "-*- coding: ..." comment.

> Hmm... that's very appealing, but what about those Elisp files that have
> a "coding:" tag at the end of the file?  Or are these so rare it's not
> worth the trouble to worry about them?

Isn't it ok to make it faster to load only such a file that
has "-*- coding: utf-8-unix -*-", and document it clearly.

The following is just a quick idea.

(defun load-source-file-internal (fullname file &optional noerror nomessage)
  (let (need-decoding)
    (with-temp-buffer
      (let ((coding-system-for-read 'no-conversion))
	(insert-file-contents fullname nil 0 256)  ; or 512, 1024?
	(or (coding-tag-is-utf-8-unix-p) ; this function must be implemented.
	    (setq need-decoding t))))
    (if need-decoding
	(load-with-code-conversion fullname file noerror nomessage)
      (load fullname noerror nomessage nil nil
	    t		; new optional argument: load this file
			; without going through code conversion.
	    ))))

(setq load-source-file-function 'load-source-file-internal)
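
The tag check left unimplemented above could look roughly like this.
The sketch is in C, to match the other low-level code in this thread.
coding_tag_is_utf_8_unix and find_bytes are hypothetical names, and the
sketch only handles a "-*- ... -*-" cookie on the first line; it does not
look at a second-line cookie or at a "Local Variables:" block at the end
of the file:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Find NEEDLE in the N bytes at HAY; return its offset, or N if absent.  */
static size_t
find_bytes (const char *hay, size_t n, const char *needle)
{
  size_t len = strlen (needle);
  for (size_t i = 0; len <= n && i <= n - len; i++)
    if (memcmp (hay + i, needle, len) == 0)
      return i;
  return n;
}

/* Return 1 if the first line of BUF (N bytes) has a "-*- ... -*-"
   cookie whose coding value is exactly utf-8-unix, else 0.  */
int
coding_tag_is_utf_8_unix (const char *buf, size_t n)
{
  assert (buf != NULL || n == 0);
  if (n == 0)
    return 0;

  /* Restrict the search to the first line.  */
  const char *nl = memchr (buf, '\n', n);
  size_t line = nl ? (size_t) (nl - buf) : n;

  size_t open = find_bytes (buf, line, "-*-");
  if (open == line)
    return 0;
  size_t pos = open + 3;

  size_t tag = find_bytes (buf + pos, line - pos, "coding:");
  if (tag == line - pos)
    return 0;
  pos += tag + 7;               /* skip past "coding:" */
  while (pos < line && (buf[pos] == ' ' || buf[pos] == '\t'))
    pos++;

  static const char want[] = "utf-8-unix";
  size_t len = sizeof want - 1;
  if (line - pos < len || memcmp (buf + pos, want, len) != 0)
    return 0;
  pos += len;

  /* The value must end here, so "utf-8-unix-foo" is rejected.  */
  return (pos == line || buf[pos] == ' ' || buf[pos] == '\t'
          || buf[pos] == ';' || buf[pos] == '\r');
}
```

Given the first 256 bytes of a file, a result of 1 would select the
direct load and anything else would fall back to
load-with-code-conversion, as in the Lisp outline above.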

---
Kenichi Handa
handa@gnu.org



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26 21:00         ` David Engster
  2013-02-26 21:12           ` Eli Zaretskii
  2013-02-27  1:22           ` Stefan Monnier
@ 2013-02-27  4:49           ` Kenichi Handa
  2013-02-27 14:27           ` Richard Stallman
  3 siblings, 0 replies; 90+ messages in thread
From: Kenichi Handa @ 2013-02-27  4:49 UTC (permalink / raw)
  To: David Engster; +Cc: emacs-devel, rms, monnier

In article <87liaa4wp9.fsf@engster.org>, David Engster <deng@randomsample.de> writes:

> I actually tried that, but it took *longer* with the "coding: utf-8;"
> comment, which is why I thought that this is apparently not working and
> some additional changes are needed first.

> Here's what I tried: I took files.el from emacs/lisp and copied it to
> files_utf8.el and added the "coding" comment there. Then I used this:

Perhaps that's because files.el is ASCII only.  Without a
coding tag, Emacs detects that the contents are ASCII only
and takes a shortcut.  If utf-8 is specified, Emacs skips
detection, but does code conversion while verifying that the
bytes are valid utf-8, and this step involves converting
bytes to a character array (int charbuf[CHARBUF_SIZE]) and
inserting each character in charbuf into a buffer.  That
step may take longer than the detection step.
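
To illustrate the difference in per-byte work described above, here is a
standalone sketch in C.  It is not the code in coding.c: is_ascii_only
and utf8_decode_to_charbuf are hypothetical names, and the decoder
accepts only standard UTF-8 (up to U+10FFFF), which is an assumption made
for the sketch.  The first function only tests each byte; the second
validates every sequence and stores one int per character:

```c
#include <assert.h>
#include <stddef.h>

/* Detection-style pass: one test per byte, nothing is stored.  */
int
is_ascii_only (const unsigned char *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (src[i] & 0x80)
      return 0;
  return 1;
}

/* Decoding-style pass: validate each UTF-8 sequence and store one int
   per character in CHARBUF (capacity CAP).  Return the number of
   characters stored, or -1 on malformed input or if CHARBUF is full.  */
long
utf8_decode_to_charbuf (const unsigned char *src, size_t n,
                        int *charbuf, size_t cap)
{
  assert (charbuf != NULL || cap == 0);
  size_t i = 0, out = 0;
  while (i < n)
    {
      unsigned c = src[i];
      unsigned min;             /* smallest code point for this length */
      int extra;                /* number of continuation bytes */
      if (c < 0x80)
        { extra = 0; min = 0; }
      else if ((c & 0xE0) == 0xC0)
        { c &= 0x1F; extra = 1; min = 0x80; }
      else if ((c & 0xF0) == 0xE0)
        { c &= 0x0F; extra = 2; min = 0x800; }
      else if ((c & 0xF8) == 0xF0)
        { c &= 0x07; extra = 3; min = 0x10000; }
      else
        return -1;              /* stray continuation or invalid lead */

      if (n - i - 1 < (size_t) extra || out == cap)
        return -1;
      for (int k = 1; k <= extra; k++)
        {
          unsigned b = src[i + k];
          if ((b & 0xC0) != 0x80)
            return -1;
          c = (c << 6) | (b & 0x3F);
        }
      /* Reject overlong forms, surrogates and out-of-range values.  */
      if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return -1;
      charbuf[out++] = (int) c;
      i += (size_t) extra + 1;
    }
  return (long) out;
}
```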

Please try with a coding tag raw-text-unix instead of utf-8.

I think it's not that difficult to improve the current code
in coding.c to make utf-8 decoding/encoding faster.

---
Kenichi Handa
handa@gnu.org

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-27  4:18         ` Kenichi Handa
@ 2013-02-27 13:48           ` Stefan Monnier
  2013-02-27 14:50             ` Drew Adams
  2013-02-27 17:49             ` Werner LEMBERG
  2013-02-27 14:28           ` Richard Stallman
  1 sibling, 2 replies; 90+ messages in thread
From: Stefan Monnier @ 2013-02-27 13:48 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: lennart.borgman, emacs-devel

> Isn't it ok to make it faster to load only such a file that
> has "-*- coding: utf-8-unix -*-", and document it clearly.

I'd rather not have to add such a coding tag to (virtually) all the files.


        Stefan



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-26 21:00         ` David Engster
                             ` (2 preceding siblings ...)
  2013-02-27  4:49           ` Kenichi Handa
@ 2013-02-27 14:27           ` Richard Stallman
  3 siblings, 0 replies; 90+ messages in thread
From: Richard Stallman @ 2013-02-27 14:27 UTC (permalink / raw)
  To: David Engster; +Cc: monnier, emacs-devel

    I actually tried that, but it took *longer* with the "coding: utf-8;"
    comment, which is why I thought that this is apparently not working and
    some additional changes are needed first.

Maybe you are right.

-- 
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org  www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
  Use Ekiga or an ordinary phone call




^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-27  1:22           ` Stefan Monnier
  2013-02-27  1:56             ` Lennart Borgman
@ 2013-02-27 14:28             ` Richard Stallman
  2013-02-27 15:21               ` Stefan Monnier
  2013-02-27 18:31             ` Achim Gratz
  2 siblings, 1 reply; 90+ messages in thread
From: Richard Stallman @ 2013-02-27 14:28 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

    In any case, this is not relevant to the question at hand, which aims to
    completely side-step find-file when `load'ing a (utf-8) .el file.

Since decoding utf-8 is not in general trivial, that's not a trivial
thing to do.  On the other hand, if the default encoding were
utf-8-emacs, that would make it trivial.  Maybe in that case
it would be possible to just read the file.





^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-27  4:18         ` Kenichi Handa
  2013-02-27 13:48           ` Stefan Monnier
@ 2013-02-27 14:28           ` Richard Stallman
  2013-02-28 14:10             ` Kenichi Handa
  1 sibling, 1 reply; 90+ messages in thread
From: Richard Stallman @ 2013-02-27 14:28 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: lennart.borgman, monnier, emacs-devel

This change in decode_coding_utf_8 should make it considerably faster
in the usual case where few characters need conversion.

We might still want to do something else for Elisp files,
but this is worth doing anyway since it will help for other files too.

I have not installed it myself since I have not tested it enough.

It might be worth making a similar improvement for other cases,
such as  ! multibytep && ! eol_dos  or  multibytep && eol_dos.

=== modified file 'src/coding.c'
*** src/coding.c	2013-01-25 04:41:39 +0000
--- src/coding.c	2013-02-27 11:38:18 +0000
***************
*** 1294,1299 ****
--- 1294,1338 ----
  	  break;
  	}
  
+       /* In the simple case, rapidly handle ordinary characters */
+       if (multibytep && ! eol_dos
+ 	  && charbuf < charbuf_end - 6 && src < src_end - 6)
+ 	{
+ 	  while (charbuf < charbuf_end - 6 && src < src_end - 6)
+ 	    {
+ 	      c1 = *src;
+ 	      if (c1 & 0x80)
+ 		break;
+ 	      src++;
+ 	      consumed_chars++;
+ 	      *charbuf++ = c1;
+ 
+ 	      c1 = *src;
+ 	      if (c1 & 0x80)
+ 		break;
+ 	      src++;
+ 	      consumed_chars++;
+ 	      *charbuf++ = c1;
+ 
+ 	      c1 = *src;
+ 	      if (c1 & 0x80)
+ 		break;
+ 	      src++;
+ 	      consumed_chars++;
+ 	      *charbuf++ = c1;
+ 
+ 	      c1 = *src;
+ 	      if (c1 & 0x80)
+ 		break;
+ 	      src++;
+ 	      consumed_chars++;
+ 	      *charbuf++ = c1;
+ 	    }
+ 	  /* If we handled at least one character, restart the main loop.  */
+ 	  if (src != src_base)
+ 	    continue;
+ 	}
+ 
        if (byte_after_cr >= 0)
  	c1 = byte_after_cr, byte_after_cr = -1;
        else






^ permalink raw reply	[flat|nested] 90+ messages in thread

* RE: Loading souce Elisp faster
  2013-02-27 13:48           ` Stefan Monnier
@ 2013-02-27 14:50             ` Drew Adams
  2013-02-27 17:49             ` Werner LEMBERG
  1 sibling, 0 replies; 90+ messages in thread
From: Drew Adams @ 2013-02-27 14:50 UTC (permalink / raw)
  To: 'Stefan Monnier', 'Kenichi Handa'
  Cc: lennart.borgman, emacs-devel

> > Isn't it ok to make it faster to load only such a file that
> > has "-*- coding: utf-8-unix -*-", and document it clearly.
> 
> I'd rather not have to add such a coding tag to (virtually) 
> all the files.

Why?




^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-27 14:28             ` Richard Stallman
@ 2013-02-27 15:21               ` Stefan Monnier
  2013-02-27 17:21                 ` Lennart Borgman
  0 siblings, 1 reply; 90+ messages in thread
From: Stefan Monnier @ 2013-02-27 15:21 UTC (permalink / raw)
  To: Richard Stallman; +Cc: emacs-devel

>     In any case, this is not relevant to the question at hand, which aims to
>     completely side-step find-file when `load'ing a (utf-8) .el file.
> Since decoding utf-8 is not in general trivial, that's not a trivial
> thing to do.  On the other hand, if the default encoding were
> utf-8-emacs, that would make it trivial.  Maybe in that case
> it would be possible to just read the file.

Yes, that's what this discussion is about: use the same fast-path as is
used for .elc files.


        Stefan


PS: AFAIK, even if it's utf-8-unix, we can read it as utf-8-emacs and
have the right answer.



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-27 15:21               ` Stefan Monnier
@ 2013-02-27 17:21                 ` Lennart Borgman
  0 siblings, 0 replies; 90+ messages in thread
From: Lennart Borgman @ 2013-02-27 17:21 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Richard M. Stallman, Emacs-Devel devel

On Feb 27, 2013 4:21 PM, "Stefan Monnier"
> Yes, that's what this discussion is about: use the same fast-path as is
> used for .elc files.
> PS: AFAIK, even if it's utf-8-unix, we can read it as utf-8-emacs and
> have the right answer.

What about line ending conversion?
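
If a file has DOS line endings, something would still have to turn CR LF
into LF, whatever the character encoding.  A minimal sketch in C of that
step alone (crlf_to_lf is a hypothetical helper, not an Emacs function):

```c
#include <assert.h>
#include <stddef.h>

/* Rewrite BUF (N bytes) in place, turning each CR LF pair into LF.
   A CR not followed by LF is kept.  Return the new length.  */
size_t
crlf_to_lf (unsigned char *buf, size_t n)
{
  assert (buf != NULL || n == 0);
  size_t out = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (buf[i] == '\r' && i + 1 < n && buf[i + 1] == '\n')
        continue;       /* drop the CR; the LF is copied next time round */
      buf[out++] = buf[i];
    }
  return out;
}
```

A loader that skips this pass is only correct for files that already use
Unix line endings.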

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-27 13:48           ` Stefan Monnier
  2013-02-27 14:50             ` Drew Adams
@ 2013-02-27 17:49             ` Werner LEMBERG
  1 sibling, 0 replies; 90+ messages in thread
From: Werner LEMBERG @ 2013-02-27 17:49 UTC (permalink / raw)
  To: monnier; +Cc: handa, lennart.borgman, emacs-devel


>> Isn't it ok to make it faster to load only such a file that
>> has "-*- coding: utf-8-unix -*-", and document it clearly.
> 
> I'd rather not have to add such a coding tag to (virtually) all the
> files.

Why not?  This is kind of a meta information, comparable to HTML
source files which also need this kind of tagging.


    Werner



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-27  1:22           ` Stefan Monnier
  2013-02-27  1:56             ` Lennart Borgman
  2013-02-27 14:28             ` Richard Stallman
@ 2013-02-27 18:31             ` Achim Gratz
  2013-02-27 18:39               ` Drew Adams
  2013-02-27 18:43               ` Lennart Borgman
  2 siblings, 2 replies; 90+ messages in thread
From: Achim Gratz @ 2013-02-27 18:31 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier writes:
> As for the .el8 proposal: seriously, the problem I'd like to address is
> (as others have said several times already) a very minor one, so the
> solution has to be *really* lightweight to be worth the trouble.
> Switching to .el8 would bring a whole lot of trouble for *very*
> little benefit.

I think that there are indeed other problems that would make a larger
difference[1,2].

But the question the ".el8" proposal was supposed to answer was "what if
you really don't want to decide this issue based on the contents of the
file", which means it needs a new suffix since it's a new file format.
If you stick with an already used suffix, you can still assume things
about that file, but you'll have to check and backtrack if that
assumption turns out to be wrong.  These checks could easily nullify the
effects that one was hoping to have by making the simplifying
assumptions in the first place.

[1] Being able to use a "byte-compilation-server" instead of having to
start a new Emacs instance for each byte compilation.

[2] A mode of compilation that ignores ".elc" files in the compilation
directory, at least when they are older than their source, like any
other compiler would do.

Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Wavetables for the Waldorf Blofeld:
http://Synth.Stromeko.net/Downloads.html#BlofeldUserWavetables

^ permalink raw reply	[flat|nested] 90+ messages in thread

* RE: Loading souce Elisp faster
  2013-02-27 18:31             ` Achim Gratz
@ 2013-02-27 18:39               ` Drew Adams
  2013-02-27 19:28                 ` Achim Gratz
  2013-02-27 18:43               ` Lennart Borgman
  1 sibling, 1 reply; 90+ messages in thread
From: Drew Adams @ 2013-02-27 18:39 UTC (permalink / raw)
  To: 'Achim Gratz', emacs-devel

> [2] A mode of compilation that ignores ".elc" files in the compilation
> directory, at least when they are older than their source, like any
> other compiler would do.

See `byte-recompile-directory'.  It only compiles files already compiled, and
only those that are younger than their .elc's.

But perhaps you meant something that does what `byte-recompile-directory' does
(recompiles as needed) but also compiles any .el that has never been compiled?

(BTW, I think you meant ignore .elc that are younger, not older, than .el.)




^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-27 18:31             ` Achim Gratz
  2013-02-27 18:39               ` Drew Adams
@ 2013-02-27 18:43               ` Lennart Borgman
  1 sibling, 0 replies; 90+ messages in thread
From: Lennart Borgman @ 2013-02-27 18:43 UTC (permalink / raw)
  To: Achim Gratz; +Cc: Emacs-Devel devel

On Wed, Feb 27, 2013 at 7:31 PM, Achim Gratz <Stromeko@nexgo.de> wrote:
>
> [1] Being able to use a "byte-compilation-server" instead of having to
> start a new Emacs instance for each byte compilation.

In nXhtml (which I do not have time to maintain now) I start a new
Emacs for byte-compiling the whole nXhtml tree. There are a lot of
files there and of course Emacs has to be restarted to make use of
those byte compiled files.



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-27 18:39               ` Drew Adams
@ 2013-02-27 19:28                 ` Achim Gratz
  2013-02-27 22:32                   ` Xue Fuqiao
  0 siblings, 1 reply; 90+ messages in thread
From: Achim Gratz @ 2013-02-27 19:28 UTC (permalink / raw)
  To: emacs-devel

Drew Adams writes:
> See `byte-recompile-directory'.  It only compiles files already compiled, and
> only those that are younger than their .elc's.
>
> But perhaps you meant something that does what `byte-recompile-directory' does
> (recompiles as needed) but also compiles any .el that has never been compiled?

I know byte-recompile-directory in its various incarnations through
Emacs history, thanks.  This speeds up things, but is single threaded
and hence not very useful on multi-core machines (unless you happen to
do compiles in parallel directories).

> (BTW, I think you meant ignore .elc that are younger, not older, than .el.)

No, I meant what I wrote.  Emacs picks up stale .elc files in the
directory it is told to do compilations in, which almost never is the
correct thing to do.  With make-like build systems you first have to
remove these (or rather all .elc most of the time) to ensure that
subsequent compilations will work correctly.

Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Waldorf MIDI Implementation & additional documentation:
http://Synth.Stromeko.net/Downloads.html#WaldorfDocs

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-27 19:28                 ` Achim Gratz
@ 2013-02-27 22:32                   ` Xue Fuqiao
  0 siblings, 0 replies; 90+ messages in thread
From: Xue Fuqiao @ 2013-02-27 22:32 UTC (permalink / raw)
  To: Achim Gratz; +Cc: emacs-devel

On Wed, 27 Feb 2013 20:28:10 +0100
Achim Gratz <Stromeko@nexgo.de> wrote:

> Drew Adams writes:
> > But perhaps you meant something that does what `byte-recompile-directory' does
> > (recompiles as needed) but also compiles any .el that has never been compiled?
> I know byte-recompile-directory in its various incarnations through
> Emacs history, thanks.  This speeds up things, but is single threaded
> and hence not very useful on multi-core machines (unless you happen to
> do compiles in parallel directories).

I'm no expert on this.  But I think concurrency can help.

> With make-like build systems you first have to
> remove these (or rather all .elc most of the time) to ensure that
> subsequent compilations will work correctly.

I agree.  Doing these things is annoying.

> Regards,
> Achim.

-- 
Best regards, Xue Fuqiao.
http://www.emacswiki.org/emacs/XueFuqiao



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-27 14:28           ` Richard Stallman
@ 2013-02-28 14:10             ` Kenichi Handa
  2013-03-01  2:12               ` Richard Stallman
  0 siblings, 1 reply; 90+ messages in thread
From: Kenichi Handa @ 2013-02-28 14:10 UTC (permalink / raw)
  To: rms; +Cc: lennart.borgman, monnier, emacs-devel

In article <E1UAhzN-0001D1-8i@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:

> This change in decode_coding_utf_8 should make it considerably faster
> in the usual case where few characters need conversion.

Even with that change, the decoder still copies the source
bytes to the array coding->charbuf, then stores characters in
coding->charbuf into the destination buffer.  I think the
better place of tuning is in the function decode_coding.  It
has a do-while loop that basically does this:

  do 
    {
      /* decode the source bytes and store characters in charbuf */
      (*(coding->decoder)) (coding);
      /* insert characters of charbuf in the destination buffer. */
      produce_chars (coding, translation_table, 0);
    }
  while (...)

We can change that to:

  do
    {
      int ascii_bytes = find_ascii_chars_in_source (coding);
      if (ascii_bytes > SHORT_CUT_THRESHOLD)
        {
           /* Copy source bytes directly to destination buffer.
              For reading a buffer, source bytes are in a
              gap area of the destination buffer.  So, with
              careful programming, we may even be able to avoid
              this copying.
            */
        }
      else 
        {
          (*(coding->decoder)) (coding);
          produce_chars (coding, translation_table, 0);
        }
    }
  while (...)
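
A rough standalone sketch of the shortcut, under stated assumptions:
ascii_run_length stands in for find_ascii_chars_in_source, the value of
SHORT_CUT_THRESHOLD is made up, and no end-of-line conversion is needed
(otherwise a CR inside the run would have to be examined).  It is not a
patch against coding.c:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cut-off: shorter runs go through the normal decoder.  */
enum { SHORT_CUT_THRESHOLD = 32 };

/* Length of the leading run of ASCII bytes in SRC (N bytes).  */
size_t
ascii_run_length (const unsigned char *src, size_t n)
{
  size_t i = 0;
  while (i < n && !(src[i] & 0x80))
    i++;
  return i;
}

/* Copy the leading ASCII run of SRC straight to DST when it is long
   enough.  Return the number of bytes copied; 0 means the caller
   should fall back to the decoder for this chunk.  */
size_t
copy_ascii_shortcut (const unsigned char *src, size_t n,
                     unsigned char *dst, size_t dst_cap)
{
  assert (dst != NULL || dst_cap == 0);
  size_t run = ascii_run_length (src, n);
  if (run > dst_cap)
    run = dst_cap;
  if (run <= SHORT_CUT_THRESHOLD)
    return 0;
  memcpy (dst, src, run);
  return run;
}
```

The caller would advance the source by the returned count and loop, so
that mostly-ASCII files rarely reach the charbuf path.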

---
Kenichi Handa
handa@gnu.org



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: Loading souce Elisp faster
  2013-02-28 14:10             ` Kenichi Handa
@ 2013-03-01  2:12               ` Richard Stallman
  2013-03-02 12:52                 ` Kenichi Handa
  2013-03-10 15:19                 ` handa
  0 siblings, 2 replies; 90+ messages in thread
From: Richard Stallman @ 2013-03-01  2:12 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: lennart.borgman, monnier, emacs-devel

      I think the
    better place of tuning is in the function decode_coding.

Please go ahead and do it.

Can you speed up the code to detect a coding, in a similar way?

-- 
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org  www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
  Use Ekiga or an ordinary phone call





* Re: Loading souce Elisp faster
  2013-03-01  2:12               ` Richard Stallman
@ 2013-03-02 12:52                 ` Kenichi Handa
  2013-03-03  0:35                   ` Richard Stallman
  2013-03-10 15:19                 ` handa
  1 sibling, 1 reply; 90+ messages in thread
From: Kenichi Handa @ 2013-03-02 12:52 UTC (permalink / raw)
  To: rms; +Cc: lennart.borgman, monnier, emacs-devel

In article <E1UBFSe-0004DA-Ks@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:

>       I think the
>     better place of tuning is in the function decode_coding.

> Please go ahead and do it.

Ok.

> Can you speed up the code to detect a coding, in a similar way?

The current detector already skips the leading ASCII
characters first.  So, I'm not sure how much more we can tune
the current code.

---
Kenichi Handa
handa@gnu.org




* Re: Loading souce Elisp faster
  2013-03-02 12:52                 ` Kenichi Handa
@ 2013-03-03  0:35                   ` Richard Stallman
  0 siblings, 0 replies; 90+ messages in thread
From: Richard Stallman @ 2013-03-03  0:35 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: lennart.borgman, monnier, emacs-devel

    The current detector already skips the leading ASCII
    characters first.  So, I'm not sure how much more we can tune
    the current code.

Maybe you're right.

-- 
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org  www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
  Use Ekiga or an ordinary phone call





* Re: Loading souce Elisp faster
  2013-03-01  2:12               ` Richard Stallman
  2013-03-02 12:52                 ` Kenichi Handa
@ 2013-03-10 15:19                 ` handa
  2013-03-11  1:19                   ` Richard Stallman
  2013-03-15 16:20                   ` handa
  1 sibling, 2 replies; 90+ messages in thread
From: handa @ 2013-03-10 15:19 UTC (permalink / raw)
  To: rms; +Cc: lennart.borgman, monnier, emacs-devel

In article <E1UBFSe-0004DA-Ks@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:

>       I think the
>     better place of tuning is in the function decode_coding.

> Please go ahead and do it.

I've just installed code that optimizes the decoding of
ASCII files.  I added that code to decode_coding_gap instead
of decode_coding because that was easier and because what
should be tuned is mainly file insertion.  Currently the
optimization works only when you explicitly specify an
ASCII-compatible coding system such as utf-8-unix or
iso-8859-1-unix for an ASCII-only file.

When you compile Emacs like this:
  % make CFLAGS=-DCODING_DISABLE_ASCII_OPTIMIZATION
the optimization is disabled, so that the effect of the
optimization can be checked.

The next tuning I am working on is for the case you don't
specify *-unix explicitly, and also for the case of utf-8
files (which need no decoding but need character counting).
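
Why a UTF-8 file needs "character counting" even though its bytes need no conversion can be seen in a small sketch. It assumes valid UTF-8 input and uses a hypothetical function name, utf8_count_chars; it is not taken from the Emacs sources. In valid UTF-8 every character has exactly one byte that is not a continuation byte (bit pattern 10xxxxxx), so counting those bytes gives the character count.

```c
#include <assert.h>
#include <stddef.h>

/* Count the characters in the LEN bytes at SRC, assuming SRC holds
   valid UTF-8.  Continuation bytes (10xxxxxx) are skipped; every
   other byte starts a new character.  */
size_t
utf8_count_chars (const unsigned char *src, size_t len)
{
  size_t chars = 0;
  assert (src != NULL || len == 0);
  for (size_t i = 0; i < len; i++)
    if ((src[i] & 0xC0) != 0x80)
      chars++;
  return chars;
}
```

For an ASCII-only file the character count equals the byte count, which is why that case can skip even this pass.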

---
Kenichi Handa
handa@gnu.org


* Re: Loading souce Elisp faster
  2013-03-10 15:19                 ` handa
@ 2013-03-11  1:19                   ` Richard Stallman
  2013-03-15 16:20                   ` handa
  1 sibling, 0 replies; 90+ messages in thread
From: Richard Stallman @ 2013-03-11  1:19 UTC (permalink / raw)
  To: handa; +Cc: lennart.borgman, monnier, emacs-devel

Thanks for improving this.

-- 
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org  www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
  Use Ekiga or an ordinary phone call





* Re: Loading souce Elisp faster
  2013-03-10 15:19                 ` handa
  2013-03-11  1:19                   ` Richard Stallman
@ 2013-03-15 16:20                   ` handa
  2013-03-20  8:15                     ` Kenichi Handa
  1 sibling, 1 reply; 90+ messages in thread
From: handa @ 2013-03-15 16:20 UTC (permalink / raw)
  To: handa; +Cc: emacs-devel, lennart.borgman, rms, monnier


In article <874ngjl1tw.fsf@gnu.org>, handa <handa@gnu.org> writes:

> The next tuning I am working on is for the case you don't
> specify *-unix explicitly, and also for the case of utf-8
> files (which need no decoding but need character counting).

I've just installed a decoder optimization for a file
without coding: tag and a file with dos-like EOL format.
Here is the result of the benchmark test using the attached
Lisp code.  It seems that Emacs can now read ASCII-only
files (with or without coding tags, unix-like or dos-like
EOL format) 1.3 to 6.4 times faster.

(benchmark-decoder)
With optimization:
~/tag-utf-8-unix.unix: (0.350511637 0 0.0)
~/tag-utf-8.unix: (0.70990067 0 0.0)
~/tag-none.unix: (0.995416864 0 0.0)
~/tag-utf-8-dos.dos: (1.0152946189999998 0 0.0)
~/tag-utf-8.dos: (1.2202745289999999 0 0.0)
~/tag-none.dos: (1.669854579 0 0.0)
Without optimization:
~/tag-utf-8-unix.unix: (2.239232573 0 0.0)
~/tag-utf-8.unix: (2.307786132 0 0.0)
~/tag-none.unix: (2.572636136 0 0.0)
~/tag-utf-8-dos.dos: (2.6095316459999998 0 0.0)
~/tag-utf-8.dos: (4.839366146000001 0 0.0)
~/tag-none.dos: (2.103833078 0 0.0)
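
The "1.3 to 6.4 times faster" figure can be checked against the timings above by dividing each unoptimized time by the corresponding optimized time. The sketch below does that arithmetic; the struct and function names are illustrative only, and the numbers are the first element of each benchmark result copied from the listing.

```c
#include <assert.h>
#include <stddef.h>

struct timing
{
  const char *file;
  double optimized;    /* seconds, with optimization */
  double unoptimized;  /* seconds, without optimization */
};

const struct timing timings[] = {
  { "tag-utf-8-unix.unix", 0.350511637, 2.239232573 },
  { "tag-utf-8.unix",      0.70990067,  2.307786132 },
  { "tag-none.unix",       0.995416864, 2.572636136 },
  { "tag-utf-8-dos.dos",   1.0152946189999998, 2.6095316459999998 },
  { "tag-utf-8.dos",       1.2202745289999999, 4.839366146000001 },
  { "tag-none.dos",        1.669854579, 2.103833078 },
};

#define N_TIMINGS (sizeof timings / sizeof timings[0])

/* Speedup factor for one file: unoptimized time over optimized time.  */
double
speedup (const struct timing *t)
{
  assert (t->optimized > 0.0);
  return t->unoptimized / t->optimized;
}

/* Store the smallest and largest speedup over all files.  */
void
speedup_range (double *min, double *max)
{
  *min = *max = speedup (&timings[0]);
  for (size_t i = 1; i < N_TIMINGS; i++)
    {
      double s = speedup (&timings[i]);
      if (s < *min)
        *min = s;
      if (s > *max)
        *max = s;
    }
}
```

The smallest ratio comes from tag-none.dos and the largest from tag-utf-8-unix.unix, which round to the 1.3 and 6.4 quoted in the message.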

Next work will be an optimization for UTF-8 files containing
non-ASCII characters.

---
Kenichi Handa
handa@gnu.org


[-- Attachment #2: check-decoder.el --]
[-- Type: application/emacs-lisp, Size: 1418 bytes --]


* Re: Loading souce Elisp faster
  2013-03-15 16:20                   ` handa
@ 2013-03-20  8:15                     ` Kenichi Handa
  0 siblings, 0 replies; 90+ messages in thread
From: Kenichi Handa @ 2013-03-20  8:15 UTC (permalink / raw)
  To: handa; +Cc: emacs-devel, lennart.borgman, rms, monnier

In article <87txock528.fsf@gnu.org>, handa <handa@gnu.org> writes:

> I've just installed a decoder optimization for a file
> without coding: tag and a file with dos-like EOL format.

As I found a serious bug in that patch (insert-file
segfaults), I set disable-ascii-optimization to t by default
as a temporary workaround.  Once the code is stable,
I'll set it back to nil.

---
Kenichi Handa
handa@gnu.org


end of thread, other threads:[~2013-03-20  8:15 UTC | newest]

Thread overview: 90+ messages
2013-02-25  1:40 Loading souce Elisp faster Stefan Monnier
2013-02-25  1:53 ` Lennart Borgman
2013-02-25  2:53   ` Stefan Monnier
2013-02-25  2:55     ` Lennart Borgman
2013-02-25  3:57       ` Stefan Monnier
2013-02-25  4:35         ` Stephen J. Turnbull
2013-02-25  4:51           ` Stefan Monnier
2013-02-25  5:24         ` Paul Eggert
2013-02-25  6:12           ` Xue Fuqiao
2013-02-25 15:39           ` Eli Zaretskii
2013-02-25 18:41             ` Paul Eggert
2013-02-26  7:23               ` Werner LEMBERG
2013-02-26  8:48                 ` Andreas Schwab
2013-02-25  5:47         ` Leo Liu
2013-02-25 16:28         ` Ted Zlatanov
2013-02-27  4:18         ` Kenichi Handa
2013-02-27 13:48           ` Stefan Monnier
2013-02-27 14:50             ` Drew Adams
2013-02-27 17:49             ` Werner LEMBERG
2013-02-27 14:28           ` Richard Stallman
2013-02-28 14:10             ` Kenichi Handa
2013-03-01  2:12               ` Richard Stallman
2013-03-02 12:52                 ` Kenichi Handa
2013-03-03  0:35                   ` Richard Stallman
2013-03-10 15:19                 ` handa
2013-03-11  1:19                   ` Richard Stallman
2013-03-15 16:20                   ` handa
2013-03-20  8:15                     ` Kenichi Handa
2013-02-25  7:24 ` Achim Gratz
2013-02-25 11:43 ` Richard Stallman
2013-02-25 15:19   ` Stefan Monnier
2013-02-25 15:36     ` Drew Adams
2013-02-25 16:09       ` Stefan Monnier
2013-02-25 16:31         ` Lennart Borgman
2013-02-25 18:31           ` Stefan Monnier
2013-02-25 19:20             ` Lennart Borgman
2013-02-25 20:53               ` Stefan Monnier
2013-02-25 20:57                 ` Lennart Borgman
2013-02-25 21:37                   ` Stefan Monnier
2013-02-25 21:57                     ` Lennart Borgman
2013-02-25 23:59                       ` Lennart Borgman
2013-02-26 17:27                         ` Achim Gratz
2013-02-26 21:38                           ` Lennart Borgman
2013-02-26 21:43                             ` Dmitry Gutov
2013-02-26 21:47                               ` Lennart Borgman
2013-02-26  4:54           ` Stephen J. Turnbull
2013-02-26  8:29             ` Ulrich Mueller
2013-02-25 21:51     ` Richard Stallman
2013-02-25 23:54       ` Stefan Monnier
2013-02-25 15:35   ` Drew Adams
2013-02-25 13:33 ` Kenichi Handa
2013-02-25 13:50   ` Xue Fuqiao
2013-02-25 15:35     ` Drew Adams
2013-02-25 15:52     ` Eli Zaretskii
2013-02-25 22:39       ` Xue Fuqiao
2013-02-26  3:48         ` Eli Zaretskii
2013-02-26 10:44           ` Xue Fuqiao
2013-02-26 17:02             ` Eli Zaretskii
2013-02-25 15:35   ` Drew Adams
2013-02-25 16:45 ` David Engster
2013-02-26 12:56   ` Richard Stallman
2013-02-26 16:26     ` David Engster
2013-02-26 20:19       ` Richard Stallman
2013-02-26 21:00         ` David Engster
2013-02-26 21:12           ` Eli Zaretskii
2013-02-26 21:18             ` David Engster
2013-02-26 22:40               ` Xue Fuqiao
2013-02-26 22:51               ` David Engster
2013-02-27  0:44                 ` Drew Adams
2013-02-27  1:22           ` Stefan Monnier
2013-02-27  1:56             ` Lennart Borgman
2013-02-27 14:28             ` Richard Stallman
2013-02-27 15:21               ` Stefan Monnier
2013-02-27 17:21                 ` Lennart Borgman
2013-02-27 18:31             ` Achim Gratz
2013-02-27 18:39               ` Drew Adams
2013-02-27 19:28                 ` Achim Gratz
2013-02-27 22:32                   ` Xue Fuqiao
2013-02-27 18:43               ` Lennart Borgman
2013-02-27  4:49           ` Kenichi Handa
2013-02-27 14:27           ` Richard Stallman
2013-02-26 16:57     ` Eli Zaretskii
2013-02-26 20:19       ` Richard Stallman
2013-02-26 20:45         ` Eli Zaretskii
2013-02-27  2:22     ` Stephen J. Turnbull
2013-02-25 21:12 ` Ivan Kanis
2013-02-25 22:47   ` Glenn Morris
2013-02-25 21:17 ` Barry Warsaw
2013-02-26  7:10   ` Thierry Volpiatto
2013-02-26  7:59 ` Andreas Röhler

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git