* Compiling Elisp to a native code with a GCC plugin
@ 2010-09-14 19:12 Wojciech Meyer
  2010-09-14 19:32 ` Tom Tromey
  0 siblings, 1 reply; 97+ messages in thread
From: Wojciech Meyer @ 2010-09-14 19:12 UTC
To: emacs-devel

Hi,

Recent versions of GCC allow developing plugins. That would solve JITing
intermediate representation (e.g. current bytecode) to the native code
across different platforms. What do you think about it? Would that be
possible, or might there be some problems that would make it impossible
(I am talking especially about dynamic scoping and GC interaction)?

Wojciech
* Re: Compiling Elisp to a native code with a GCC plugin
From: Tom Tromey @ 2010-09-14 19:32 UTC
To: Wojciech Meyer; +Cc: emacs-devel

>>>>> "Wojciech" == Wojciech Meyer <wojciech.meyer@googlemail.com> writes:

Wojciech> Recent versions of GCC allow developing plugins. That would
Wojciech> solve JITing intermediate representation (e.g. current
Wojciech> bytecode) to the native code across different platforms. What
Wojciech> do you think about it? Would that be possible, or might there
Wojciech> be some problems that would make it impossible (I am talking
Wojciech> especially about dynamic scoping and GC interaction)?

I think you can compile elisp without needing a plugin. Just convert
the bytecode to C, and compile that. This is actually quite easy.

However, it is not very likely to result in a performance improvement
right now. A similar idea was tried and did not work out:

    http://www.mundell.ukfsn.org/native/

I think this idea might be worth revisiting when lexbind is merged in,
emphasis on "might". E.g., it seems to me that this approach might work
ok for the recently-discussed range-map code.

As I recall, in my profiles, the GC and the regexp matcher were more
costly than the bytecode interpreter (though of course this is
workload-dependent). If you are interested in performance, I suggest
doing your own profiles and starting there.

Tom
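The "convert the bytecode to C" idea can be pictured with a toy stack machine. This is only a sketch under stated assumptions: the opcodes and every `toy_` name are hypothetical and are not the Emacs bytecode set. It shows an interpreter loop next to the C that a translator might emit for one fixed program, where the same stack operations run without any fetch/dispatch step.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy opcodes; these are NOT the Emacs bytecode set. */
enum { OP_CONST, OP_ADD, OP_MUL, OP_RETURN };

/* Interpreter: fetch, dispatch, execute, one opcode at a time. */
long
toy_interpret (const long *code)
{
  long stack[16];
  size_t sp = 0;

  for (;;)
    switch (*code++)
      {
      case OP_CONST:            /* push the operand that follows */
        stack[sp++] = *code++;
        break;
      case OP_ADD:              /* pop two values, push their sum */
        sp--;
        stack[sp - 1] += stack[sp];
        break;
      case OP_MUL:              /* pop two values, push their product */
        sp--;
        stack[sp - 1] *= stack[sp];
        break;
      case OP_RETURN:           /* result is the top of the stack */
        return stack[sp - 1];
      default:                  /* unknown opcode */
        return -1;
      }
}

/* The program (2 + 3) * 4 as toy bytecode. */
const long toy_program[] = {
  OP_CONST, 2, OP_CONST, 3, OP_ADD, OP_CONST, 4, OP_MUL, OP_RETURN
};

/* What a bytecode-to-C translator might emit for toy_program: the
   same stack operations, unrolled, with no dispatch loop.  */
long
toy_translated (void)
{
  long stack[16];
  size_t sp = 0;

  stack[sp++] = 2;
  stack[sp++] = 3;
  sp--; stack[sp - 1] += stack[sp];
  stack[sp++] = 4;
  sp--; stack[sp - 1] *= stack[sp];
  return stack[sp - 1];
}
```

The translated function does the same work per operation as the interpreter; only the dispatch overhead disappears, which is consistent with the caution above that such a translation may not buy much on its own.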
* Re: Compiling Elisp to a native code with a GCC plugin
From: Wojciech Meyer @ 2010-09-14 19:45 UTC
To: Tom Tromey; +Cc: Wojciech Meyer, emacs-devel

Tom Tromey <tromey@redhat.com> writes:

> I think you can compile elisp without needing a plugin. Just convert
> the bytecode to C, and compile that. This is actually quite easy.

Yes, but you can't JIT it in memory, and that also builds a dependency
on a C compiler.

> However, it is not very likely to result in a performance improvement
> right now. A similar idea was tried and did not work out:
>
>     http://www.mundell.ukfsn.org/native/
>
> I think this idea might be worth revisiting when lexbind is merged in,
> emphasis on "might". E.g., it seems to me that this approach might work
> ok for the recently-discussed range-map code.

Well, Elisp's nature is dynamic, plus dynamic scoping makes it hard to
compile, but somewhat C Lispy code *can* work faster.

> As I recall, in my profiles, the GC and the regexp matcher were more
> costly than the bytecode interpreter (though of course this is
> workload-dependent). If you are interested in performance, I suggest
> doing your own profiles and starting there.

Mark and sweep is no good, it would be so good if we had generational
GC... :(

> Tom

Thanks,
Wojciech
* Re: Compiling Elisp to a native code with a GCC plugin
From: Lars Magne Ingebrigtsen @ 2010-09-14 20:17 UTC
To: emacs-devel

Wojciech Meyer <wojciech.meyer@googlemail.com> writes:

> Well, Elisp's nature is dynamic, plus dynamic scoping makes it hard to
> compile, but somewhat C Lispy code *can* work faster.

Sure. Compiling to native code will probably yield some benefits, but I
think we tend to overestimate the benefits.

To take a random example from code I just wrote (part of
`gnus-range-nconcat'), which works on lists and numbers and stuff, and
is as low-level as Emacs Lisp code gets:

      (when (numberp (car last))
        (setcar last (cons (car last) (car last))))
      (if (= (1+ (cdar last)) (caar range))
          (progn
            (setcdr (car last) (cdar range))
            (setcdr last (cdr range))))

Just imagine what that would be in native code, as opposed to byte code.
In either case, it'd just be a lot of calls to Fcar, Fcons, Fsetcar and
so on. Would the byte-interpreter call those functions a lot slower
than native code would? I kinda doubt it.

Now, the same code in native C would of course be a lot faster, because
you'd use other data types, and you wouldn't do the code by straight
list manipulation at all. Possibly.

--
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen
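To make the "lot of calls to Fcar, Fcons, Fsetcar" point concrete, here is a minimal sketch of what a straight-line C translation of that fragment might look like. Everything in it is an assumption for illustration: the `toy_` types and functions are hypothetical stand-ins, not the real Emacs `Lisp_Object` representation or its `Fcar`/`Fcons` primitives.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical tagged value; the real Lisp_Object is packed differently. */
enum toy_tag { TOY_NIL, TOY_NUMBER, TOY_CONS };
struct toy_cons;
struct toy_value
{
  enum toy_tag tag;
  long number;                /* valid when tag == TOY_NUMBER */
  struct toy_cons *cons;      /* valid when tag == TOY_CONS   */
};
struct toy_cons { struct toy_value car, cdr; };

struct toy_value
toy_nil (void)
{
  struct toy_value v = { TOY_NIL, 0, NULL };
  return v;
}

struct toy_value
toy_number (long n)
{
  struct toy_value v = { TOY_NUMBER, n, NULL };
  return v;
}

struct toy_value
toy_cons (struct toy_value car, struct toy_value cdr)
{
  struct toy_value v = { TOY_CONS, 0, NULL };
  v.cons = malloc (sizeof (struct toy_cons));
  assert (v.cons != NULL);
  v.cons->car = car;
  v.cons->cdr = cdr;
  return v;
}

struct toy_value
toy_car (struct toy_value v)
{
  assert (v.tag == TOY_CONS);
  return v.cons->car;
}

struct toy_value
toy_cdr (struct toy_value v)
{
  assert (v.tag == TOY_CONS);
  return v.cons->cdr;
}

void
toy_setcar (struct toy_value v, struct toy_value x)
{
  assert (v.tag == TOY_CONS);
  v.cons->car = x;
}

void
toy_setcdr (struct toy_value v, struct toy_value x)
{
  assert (v.tag == TOY_CONS);
  v.cons->cdr = x;
}

/* Straight-line translation of the Lisp fragment: one call per
   primitive, the same primitives the byte-code would invoke.  */
void
toy_range_fragment (struct toy_value last, struct toy_value range)
{
  if (toy_car (last).tag == TOY_NUMBER)
    toy_setcar (last, toy_cons (toy_car (last), toy_car (last)));
  if (toy_cdr (toy_car (last)).number + 1
      == toy_car (toy_car (range)).number)
    {
      toy_setcdr (toy_car (last), toy_cdr (toy_car (range)));
      toy_setcdr (last, toy_cdr (range));
    }
}
```

With `last` holding the list `(5)` and `range` holding `((6 . 9))`, the fragment turns `last` into `((5 . 9))`. The body is still a chain of primitive calls, so any speed-up would have to come from what the C compiler can do with those calls, not from the translation itself.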
* Re: Compiling Elisp to a native code with a GCC plugin
From: Wojciech Meyer @ 2010-09-14 20:52 UTC
To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> To take a random example from code I just wrote (part of
> `gnus-range-nconcat'), which works on lists and numbers and stuff, and
> is as low-level as Emacs Lisp code gets:
>
>       (when (numberp (car last))
>         (setcar last (cons (car last) (car last))))
>       (if (= (1+ (cdar last)) (caar range))
>           (progn
>             (setcdr (car last) (cdar range))
>             (setcdr last (cdr range))))
>
> Just imagine what that would be in native code, as opposed to byte code.
> In either case, it'd just be a lot of calls to Fcar, Fcons, Fsetcar and
> so on. Would the byte-interpreter call those functions a lot slower
> than native code would? I kinda doubt it.

Some of the functions will be in-lined, some of the data pointers will
be loaded into registers, some of the calls will be specialised against
constants, some of the expressions simplified, the flat code peep-holed
etc. So no, it is not a direct translation even at the level of
byte-code, and compiler frameworks (gcc & llvm) are getting better and
better at optimising at high-level and low-level.

Wojciech
* Re: Compiling Elisp to a native code with a GCC plugin
From: Tom Tromey @ 2010-09-14 20:55 UTC
To: emacs-devel

>>>>> "Lars" == Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

Lars> Just imagine what that would be in native code, as opposed to byte code.
Lars> In either case, it'd just be a lot of calls to Fcar, Fcons, Fsetcar and
Lars> so on. Would the byte-interpreter call those functions a lot slower
Lars> than native code would? I kinda doubt it.

Actually, in this case it might be faster, since there are special
opcodes like Bcar, Bcons, Bsetcar, etc. Whether it is "faster enough"
is hard to guess; lexbind comes into play because with lexbind, the
various local bindings can just be C local variables, as opposed to much
more expensive elisp bindings.

Lars> Now, the same code in native C would of course be a lot faster, because
Lars> you'd use other data types, and you wouldn't do the code by straight
Lars> list manipulation at all. Possibly.

IIUC, yeah, if you're willing to have a new pure-C data structure with a
new read and write syntax, you can do even better.

Tom
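The cost difference between the two kinds of binding can be sketched as follows. This is an illustration under assumptions, not Emacs's implementation: the `toy_` binding stack is a hypothetical shallow-binding scheme in which a dynamic `let` saves and overwrites a global value cell, while a lexical `let` becomes an ordinary C local.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbol with a single global value cell. */
struct toy_symbol { long value; };

/* Hypothetical binding stack: remembers what to restore on exit. */
struct toy_binding { struct toy_symbol *sym; long old_value; };
struct toy_binding toy_bind_stack[64];
size_t toy_bind_depth = 0;

/* Dynamic `let': save the old value, then overwrite the global cell. */
void
toy_dynamic_bind (struct toy_symbol *sym, long new_value)
{
  assert (toy_bind_depth < 64);
  toy_bind_stack[toy_bind_depth].sym = sym;
  toy_bind_stack[toy_bind_depth].old_value = sym->value;
  toy_bind_depth++;
  sym->value = new_value;
}

/* Leaving the `let': pop one entry and restore the saved value. */
void
toy_dynamic_unbind (void)
{
  assert (toy_bind_depth > 0);
  toy_bind_depth--;
  toy_bind_stack[toy_bind_depth].sym->value
    = toy_bind_stack[toy_bind_depth].old_value;
}

struct toy_symbol toy_x = { 1 };

/* (let ((x 10)) (+ x 5)) with x dynamically bound: stack writes on
   entry, a memory read for x, and a restore on exit.  */
long
toy_add_five_dynamic (void)
{
  long result;
  toy_dynamic_bind (&toy_x, 10);
  result = toy_x.value + 5;
  toy_dynamic_unbind ();
  return result;
}

/* The same form with x lexically bound: x is a plain C local that
   the compiler can keep in a register or fold away entirely.  */
long
toy_add_five_lexical (void)
{
  long x = 10;
  return x + 5;
}
```

Both functions return the same value, but only the lexical one gives a C compiler something it can optimise freely, since the dynamic version's global cell may be observed by any callee.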
* Re: Compiling Elisp to a native code with a GCC plugin
From: Wojciech Meyer @ 2010-09-14 21:05 UTC
To: Tom Tromey; +Cc: emacs-devel

Tom Tromey <tromey@redhat.com> writes:

> is hard to guess; lexbind comes into play because with lexbind, the
> various local bindings can just be C local variables, as opposed to much
> more expensive elisp bindings.

Yep, because it is just a run-time data structure, and no longer a pure
stack frame. So accessing memory directly should be much less expensive.

Wojciech
* Re: Compiling Elisp to a native code with a GCC plugin
From: Tom Tromey @ 2010-09-14 20:44 UTC
To: Wojciech Meyer; +Cc: emacs-devel

>>>>> "Wojciech" == Wojciech Meyer <wojciech.meyer@googlemail.com> writes:

Wojciech> Mark and sweep is no good, it would be so good if we had generational
Wojciech> GC... :(

It could be done. It just requires someone willing to do the work.

Tom
* Re: Compiling Elisp to a native code with a GCC plugin
From: Wojciech Meyer @ 2010-09-14 21:00 UTC
To: Tom Tromey; +Cc: Wojciech Meyer, emacs-devel

Tom Tromey <tromey@redhat.com> writes:

>>>>>> "Wojciech" == Wojciech Meyer <wojciech.meyer@googlemail.com> writes:
>
> Wojciech> Mark and sweep is no good, it would be so good if we had generational
> Wojciech> GC... :(
>
> It could be done. It just requires someone willing to do the work.

I know. I could get the old sources of my generational garbage
collector to work. However, it is a daunting job (the worst I could
imagine, garbage collectors are nasty), plugging and debugging a new
garbage collector into such a huge and esoteric (I am sure people who've
been working on Emacs for years will not take these words badly and will
understand straight away what I (a newbie) am talking about) project as
Emacs. However, I might try to experiment with it (unfortunately I am
not that self-confident about it ;) ).

> Tom

Wojciech
* Re: Compiling Elisp to a native code with a GCC plugin
From: Tom Tromey @ 2010-09-14 21:16 UTC
To: Wojciech Meyer; +Cc: emacs-devel

>>>>> "Wojciech" == Wojciech Meyer <wojciech.meyer@googlemail.com> writes:

Tom> It could be done. It just requires someone willing to do the work.

Wojciech> I know. I could get the old sources of my generational
Wojciech> garbage collector to work. However, it is a daunting job (the
Wojciech> worst I could imagine, garbage collectors are nasty), plugging
Wojciech> and debugging a new garbage collector into such a huge and
Wojciech> esoteric (I am sure people who've been working on Emacs for
Wojciech> years will not take these words badly and will understand
Wojciech> straight away what I (a newbie) am talking about) project as
Wojciech> Emacs. However, I might try to experiment with it
Wojciech> (unfortunately I am not that self-confident about it ;) ).

It is always ok to ask for help.

The current collector is very simple to understand. If you read
alloc.c, and look through the data structures representing lisp objects
(in lisp.h), you will have a pretty good idea of what is going on.

FWIW, I looked at writing an incremental collector for Emacs. I was
primarily interested in using software write barriers... this turns out
to be hard because there is a lot of code in Emacs of the form:

    FIELD_ACCESSOR (object) = value;

... which for a software barrier has to be converted to:

    SET_FIELD_ACCESSOR (object, value);

(There are other bad things, too, like passing around a Lisp_Object*
that points to the contents of a vector.)

So, lots of grunge work, just to get to the point where you could start
actually working on the GC. I would look at automated rewriting to
make this work -- that worked out great on the concurrent branch.

There was a more real attempt based on the Boehm GC. I think the bits
from that are still on a branch. This GC has a generational mode that,
IIRC, is based on memory protection bits.

Tom
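Why the lvalue form defeats a software write barrier can be shown with a small sketch. The names here (`TOY_FIELD`, `TOY_SET_FIELD`, `toy_object`) are hypothetical and only mirror the shape of `FIELD_ACCESSOR` and `SET_FIELD_ACCESSOR` above. The barrier shown is a generational remembered-set barrier, chosen as one possible use; an incremental collector would record something different at the same hook.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object layout; Emacs's real objects look different. */
struct toy_object
{
  bool old_generation;        /* true once promoted out of the nursery */
  bool remembered;            /* already in the remembered set?        */
  struct toy_object *field;   /* a single pointer field                */
};

/* Remembered set: old objects that may point at young objects. */
struct toy_object *toy_remembered[128];
size_t toy_remembered_count = 0;

/* Old style: the accessor expands to an lvalue, so a store through
   it is invisible to the collector.  */
#define TOY_FIELD(obj) ((obj)->field)

/* New style: every store goes through a function, which gives the
   collector a place to record old-to-young pointers.  */
void
toy_set_field (struct toy_object *obj, struct toy_object *value)
{
  if (obj->old_generation && value != NULL && !value->old_generation
      && !obj->remembered)
    {
      assert (toy_remembered_count < 128);
      obj->remembered = true;
      toy_remembered[toy_remembered_count++] = obj;
    }
  obj->field = value;
}

#define TOY_SET_FIELD(obj, value) toy_set_field ((obj), (value))
```

A store written as `TOY_FIELD (&old) = &young;` compiles and works, but leaves the remembered set untouched, which is exactly why every such assignment in the sources would have to be rewritten before a barrier-based collector could be trusted.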
* Re: Compiling Elisp to a native code with a GCC plugin
From: Wojciech Meyer @ 2010-09-14 21:29 UTC
To: Tom Tromey; +Cc: Wojciech Meyer, emacs-devel

Tom Tromey <tromey@redhat.com> writes:

>>>>>> "Wojciech" == Wojciech Meyer <wojciech.meyer@googlemail.com> writes:
>
> Tom> It could be done. It just requires someone willing to do the work.
>
> Wojciech> I know. I could get the old sources of my generational
> Wojciech> garbage collector to work. However, it is a daunting job (the
> Wojciech> worst I could imagine, garbage collectors are nasty), plugging
> Wojciech> and debugging a new garbage collector into such a huge and
> Wojciech> esoteric (I am sure people who've been working on Emacs for
> Wojciech> years will not take these words badly and will understand
> Wojciech> straight away what I (a newbie) am talking about) project as
> Wojciech> Emacs. However, I might try to experiment with it
> Wojciech> (unfortunately I am not that self-confident about it ;) ).
>
> It is always ok to ask for help.

Thanks, I will keep it in mind.

> The current collector is very simple to understand. If you read
> alloc.c, and look through the data structures representing lisp objects
> (in lisp.h), you will have a pretty good idea of what is going on.

It's open already :).

> FWIW, I looked at writing an incremental collector for Emacs. I was
> primarily interested in using software write barriers... this turns out
> to be hard because there is a lot of code in Emacs of the form:
>
>     FIELD_ACCESSOR (object) = value;
>
> ... which for a software barrier has to be converted to:
>
>     SET_FIELD_ACCESSOR (object, value);

Yep, we would need barriers for the second heap. For the young heap it
is OK to just scan it.

> (There are other bad things, too, like passing around a Lisp_Object*
> that points to the contents of a vector.)
>
> So, lots of grunge work, just to get to the point where you could start
> actually working on the GC. I would look at automated rewriting to
> make this work -- that worked out great on the concurrent branch.

Maybe that work should be actually done even without thinking
currently about GC. AFAIR MT Emacs rewriting was in Elisp,
ideally maybe using GCC would be better at some point.

> There was a more real attempt based on the Boehm GC. I think the bits
> from that are still on a branch. This GC has a generational mode that,
> IIRC, is based on memory protection bits.

Conservative Boehm will not bring much gain (I would think certainly
loss) of performance. However, I didn't know about marking pointers
based on memory protection bits and that sounds interesting. Need to
look at it. (However, I am fully convinced that a custom GC would be a
superior option.)

> Tom

Wojciech
* Re: Compiling Elisp to a native code with a GCC plugin
From: Tom Tromey @ 2010-09-14 21:59 UTC
To: Wojciech Meyer; +Cc: emacs-devel

>>>>> "Wojciech" == Wojciech Meyer <wojciech.meyer@googlemail.com> writes:

Tom> So, lots of grunge work, just to get to the point where you could start
Tom> actually working on the GC. I would look at automated rewriting to
Tom> make this work -- that worked out great on the concurrent branch.

Wojciech> Maybe that work should be actually done even without thinking
Wojciech> currently about GC. AFAIR MT Emacs rewriting was in Elisp,
Wojciech> ideally maybe using GCC would be better at some point.

You would have to hack GCC a little bit, because most of the code
locations you want to change arise from macro expansion, and GCC does
not keep all that information. (Though there's a WIP patch for this.)

Maybe it could be done more simply using a simple parser in elisp that
recognizes just the needed forms. Or maybe something based on clang.

This would get you most of the way there, though there are still some
bad things you have to fix up by hand.

For the concurrency stuff, we did two kinds of automated rewriting.

One was just pure elisp that searched the .c for DEFVAR_LISP and then
made various changes.

The other one modified the source (in a compile-breaking way), then ran
the compiler, then visited each error to perform a rewrite. This
approach might also work for the GC problem, I am not certain.

These scripts are both in src/ on the concurrency branch.

One problem with any compiler-based approach is that it only works on
the sources it sees. That is, the not-taken #if branches won't get
rewritten. This argues for trying some kind of custom parser.

Another problem we ran into is that this approach doesn't work if the
problem code itself appears in a macro. There were a few spots that we
had to fix by hand -- no big deal, the automation is still worthwhile
even if it only does 85% of the work.

My advice is to try to do this bulk rewriting work on head, so that it
doesn't rot. I think that's been a problem for the concurrency work
:-(

Tom
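The "break the build, then visit each error" technique relies on pulling `FILE:LINE:COLUMN` out of compiler diagnostics and fixing the rightmost location on a line first, so that an edit does not shift the columns of the errors still to be handled on that line. A minimal sketch of that bookkeeping follows; the names are hypothetical, and sorting the whole batch is a simplification assumed here, not a description of how the scripts on the concurrency branch are structured.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one compiler diagnostic location. */
struct toy_error { char file[256]; int line; int column; };

/* Parse "FILE:LINE:COLUMN: error: ..." as printed by GCC.
   Returns 1 on success, 0 if the text is not an error diagnostic. */
int
toy_parse_error (const char *text, struct toy_error *out)
{
  int consumed = 0;

  /* %n is only reached if the literal ": error:" matched too. */
  if (sscanf (text, "%255[^:]:%d:%d: error:%n",
              out->file, &out->line, &out->column, &consumed) != 3)
    return 0;
  return consumed > 0;
}

/* Order by file, then line, then column DESCENDING, so that the
   rightmost error on each line is rewritten first.  */
int
toy_compare_errors (const void *a, const void *b)
{
  const struct toy_error *ea = a;
  const struct toy_error *eb = b;
  int by_file = strcmp (ea->file, eb->file);

  if (by_file != 0)
    return by_file;
  if (ea->line != eb->line)
    return (ea->line > eb->line) - (ea->line < eb->line);
  return (eb->column > ea->column) - (eb->column < ea->column);
}
```

A driver would collect every parsed error, call `qsort` with `toy_compare_errors`, and then apply one rewrite per entry; warnings and `make` chatter are rejected by `toy_parse_error` and never reach the rewriter.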
* Re: Compiling Elisp to a native code with a GCC plugin
From: Wojciech Meyer @ 2010-09-14 22:37 UTC
To: Tom Tromey; +Cc: Wojciech Meyer, emacs-devel

Tom Tromey <tromey@redhat.com> writes:

>>>>>> "Wojciech" == Wojciech Meyer <wojciech.meyer@googlemail.com> writes:
>
> Tom> So, lots of grunge work, just to get to the point where you could start
> Tom> actually working on the GC. I would look at automated rewriting to
> Tom> make this work -- that worked out great on the concurrent branch.
>
> Wojciech> Maybe that work should be actually done even without thinking
> Wojciech> currently about GC. AFAIR MT Emacs rewriting was in Elisp,
> Wojciech> ideally maybe using GCC would be better at some point.
>
> You would have to hack GCC a little bit, because most of the code
> locations you want to change arise from macro expansion, and GCC does
> not keep all that information. (Though there's a WIP patch for this.)

Yes, I am aware of the imprecision of this. Currently I may not think
about this for some other unrelated reasons as well...

> Maybe it could be done more simply using a simple parser in elisp that
> recognizes just the needed forms. Or maybe something based on clang.
>
> This would get you most of the way there, though there are still some
> bad things you have to fix up by hand.
>
> For the concurrency stuff, we did two kinds of automated rewriting.

If you could point me to the tools you used for this job, I would be
grateful, any pointers to git?

> One was just pure elisp that searched the .c for DEFVAR_LISP and then
> made various changes.
>
> The other one modified the source (in a compile-breaking way), then ran
> the compiler, then visited each error to perform a rewrite. This
> approach might also work for the GC problem, I am not certain.

That is clever and cheap to do, and it is worth trying (and it looks a
bit like how humans do re-factoring). However, clang's error messages
are currently more precise than GCC's (yes, we can match-replace regex,
and it will work fine in most cases).

> These scripts are both in src/ on the concurrency branch.
>
> One problem with any compiler-based approach is that it only works on
> the sources it sees. That is, the not-taken #if branches won't get
> rewritten. This argues for trying some kind of custom parser.

Yes, this is a major problem, different configurations, different
systems, and you are never sure if it will not break something, on some
machine (assuming that the changes are required to generate correct
code).

> Another problem we ran into is that this approach doesn't work if the
> problem code itself appears in a macro. There were a few spots that we
> had to fix by hand -- no big deal, the automation is still worthwhile
> even if it only does 85% of the work.

Definitely it is worthwhile, however I need to know what to rewrite
first...

> My advice is to try to do this bulk rewriting work on head, so that it
> doesn't rot. I think that's been a problem for the concurrency work
> :-(

Any chances to get it back to life? It would be nice to push it
back... If you would like to get it back, I would volunteer with any
help. If the maintainers are happy to accept patches with incremental
improvements then it is OK, however I still think in terms `I want - but
it will be hard and even nobody started it before'. If there is any
progress on this I will just let everybody know. (Anyway, starting such
a project is OK even just for *learning* about internals.)

Shall we setup a public repo somewhere then? I can commit there my
generational gc as a sub-module, straight away. (However, it is not the
best starting point, as was said.)

> Tom

Thanks,
Wojciech
* Re: Compiling Elisp to a native code with a GCC plugin
From: Tom Tromey @ 2010-09-14 22:55 UTC
To: Wojciech Meyer; +Cc: emacs-devel

Tom> For the concurrency stuff, we did two kinds of automated rewriting.

Wojciech> If you could point me to the tools you used for this job, I
Wojciech> would be grateful, any pointers to git?

It is in the concurrency branch in Emacs bzr. I assume this is mirrored
in git, but I'm not sure.

I can email you the two .el scripts if you like, plus the little GCC
patch I had to use to get the right location for the -> tokens. Just
let me know.

If you check out the concurrency branch, the files are
src/hack-buffer-objfwd.el and src/rewrite-globals.el.

Wojciech> However, clang's error messages are currently more precise
Wojciech> than GCC's (yes, we can match-replace regex, and it will work
Wojciech> fine in most cases).

Recent versions of GCC are a lot better about marking the correct token
when emitting an error. Red Hat put some work into this.

What hack-buffer-objfwd.el actually does is go to the location of the
error, then go backwards over sexps doing some pattern matching to see
where the left-hand-side of the -> operator starts. Then it rewrites.
This is rather hacky and IIRC there are some Emacs-specific heuristics
in there.

Tom> My advice is to try to do this bulk rewriting work on head, so that it
Tom> doesn't rot. I think that's been a problem for the concurrency work
Tom> :-(

Wojciech> Any chances to get it back to life?

I have zero (really negative) free time. But yes, it can be
resurrected. One starting point would be doing a merge from trunk.
Giuseppe did this but ran into problems... I still don't really know
the details, just that things changed on trunk in some conflicting
way.

Wojciech> Shall we setup a public repo somewhere then?

We had one on gitorious. But I think if you are going to try to work
on trunk you should just use bzr. Or at least use the semi-official
git mirror when preparing patches.

Tom
* Re: Compiling Elisp to a native code with a GCC plugin
From: Wojciech Meyer @ 2010-09-14 23:33 UTC
To: Tom Tromey; +Cc: emacs-devel

Tom Tromey <tromey@redhat.com> writes:

> Tom> For the concurrency stuff, we did two kinds of automated rewriting.
>
> Wojciech> If you could point me to the tools you used for this job, I
> Wojciech> would be grateful, any pointers to git?
>
> It is in the concurrency branch in Emacs bzr. I assume this is mirrored
> in git, but I'm not sure.
>
> I can email you the two .el scripts if you like, plus the little GCC
> patch I had to use to get the right location for the -> tokens. Just
> let me know.

Yes please, I would like to take a look, thanks.

> Wojciech> Any chances to get it back to life?
>
> I have zero (really negative) free time. But yes, it can be
> resurrected. One starting point would be doing a merge from trunk.
> Giuseppe did this but ran into problems... I still don't really know
> the details, just that things changed on trunk in some conflicting
> way.

Yes, I saw some of the work that had been done, and posts on ML.

> Wojciech> Shall we setup a public repo somewhere then?
>
> We had one on gitorious. But I think if you are going to try to work
> on trunk you should just use bzr. Or at least use the semi-official
> git mirror when preparing patches.

I am just not sure how efficiently and reliably branches work
in bzr (I've managed to screw up some of my work once with
bzr), and I am not sure how reliable are git mirrors. I quite
like git, however there is a cost of troubles with integration
with bzr. On other hand I find forking <-> merging
unacceptable. I will try to mock something up, in the
meantime.

BTW: If somebody would like to enlighten me on how reliably
mirrored git works with the Emacs source tree, I would be
grateful. Thanks.

Wojciech

PS: sorry this time sent from gmail.
* Re: Compiling Elisp to a native code with a GCC plugin
From: Tom Tromey @ 2010-09-15 1:38 UTC
To: Wojciech Meyer; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1580 bytes --]

Wojciech> Yes please, I would like to take a look, thanks.

I attached the scripts. They have a few comments, but probably not
enough, given that they are pretty much one-off hacks. (Though,
funnily, today I'm going to repurpose one to rewrite gdb...)

The appended patch is needed to get GCC to emit error locations on the
`->' token when the token appears in the arguments to a macro.

Wojciech> I am just not sure how efficiently and reliably branches work
Wojciech> in bzr (I've managed to screw up some of my work once with
Wojciech> bzr), and I am not sure how reliable are git mirrors. I quite
Wojciech> like git, however there is a cost of troubles with integration
Wojciech> with bzr. On other hand I find forking <-> merging
Wojciech> unacceptable. I will try to mock something up, in the
Wojciech> meantime.

Yeah, bzr is a pain compared to git. But, we're stuck with it.

Wojciech> BTW: If somebody would like to enlighten me on how reliably
Wojciech> mirrored git works with the Emacs source tree, I would be
Wojciech> grateful. Thanks.

I think the mirror is updated regularly. I'm not using it myself, but
I gather it works ok.

Tom

Index: macro.c
===================================================================
--- macro.c	(revision 164202)
+++ macro.c	(working copy)
@@ -1350,7 +1350,7 @@
   pfile->set_invocation_location = true;
 
   result = cpp_get_token (pfile);
-  if (pfile->context->macro)
+  if (pfile->context->macro && pfile->invocation_location > result->src_loc)
     *loc = pfile->invocation_location;
   else
     *loc = result->src_loc;

[-- Attachment #2: hack-buffer-objfwd.el --]
[-- Type: text/plain, Size: 6649 bytes --]

;; Rewrite all references to buffer-objfwd fields in struct buffer
;; to use accessor macros.
;; This works in a tricky way: it renames all such fields, then
;; recompiles Emacs. Then it visits each error location and
;; rewrites the expressions.
;; This has a few requirements in order to work.
;; First, Emacs must compile before the script is run.
;; It does not handle errors arising for other reasons.
;; Second, you need a GCC which has been hacked to emit proper
;; column location even when the -> expression in question has
;; been wrapped in a macro call. (This is a one-liner in libcpp.)
;; After running this script, a few changes need to be made by hand.
;; These occur mostly in macros in headers, but also in
;; reset_buffer and reset_buffer_local_variables. Finally,
;; DEFVAR_PER_BUFFER and the GC should not use these accessors.

(defvar gcc-prefix "/home/tromey/gnu/Trunk/install/")
(defvar emacs-src "/home/tromey/gnu/Emacs/Gitorious/emacs-mt/src/")
(defvar emacs-build "/home/tromey/gnu/Emacs/Gitorious/build/src/")

(defun file-error (text)
  (error "%s:%d:%d: error: expected %s"
         buffer-file-name (line-number-at-pos (point))
         (current-column)
         text))

(defun assert-looking-at (exp)
  (unless (looking-at exp)
    (file-error exp)))

(defvar field-names nil)

(defvar field-regexp nil)

(defun modify-buffer.h ()
  (message "Modifying fields in struct buffer")
  (find-file (expand-file-name "buffer.h" emacs-src))
  (goto-char (point-min))
  (re-search-forward "^struct buffer$")
  (forward-line)
  (assert-looking-at "^{")
  (let ((starting-point (point))
        (closing-brace (save-excursion
                         (forward-sexp)
                         (point))))
    ;; Find each field.
    (while (re-search-forward "^\\s *Lisp_Object\\s +"
                              closing-brace 'move)
      (goto-char (match-end 0))
      (while (not (looking-at ";"))
        (assert-looking-at "\\([A-Za-z0-9_]+\\)\\(;\\|,\\s *\\)")
        ;; Remember the name so we can generate accessors.
        (push (match-string 1) field-names)
        ;; Rename it.
        (goto-char (match-beginning 2))
        (insert "_")
        ;; On to the next one, if any.
        (if (looking-at ",\\s *")
            (goto-char (match-end 0)))))
    ;; Generate accessors.
    (goto-char starting-point)
    (forward-sexp)
    (forward-line)
    (insert "\n")
    (dolist (name field-names)
      (insert "#define BUF_" (upcase name) "(BUF) "
              "*find_variable_location (&((BUF)->"
              name "_))\n"))
    (insert "\n"))
  (setq field-regexp (concat "\\(->\\|\\.\\)"
                             (regexp-opt field-names t)
                             "\\_>"))
  (save-buffer))

(defun get-field-name ()
  (save-excursion
    (assert-looking-at "\\(\\.\\|->\\)\\([A-Za-z0-9_]+\\)\\_>")
    (prog1
        (match-string 2)
      (delete-region (match-beginning 0) (match-end 0)))))

(defun skip-backward-lhs ()
  (skip-chars-backward " \t\n")
  (cond
   ((eq (char-before) ?\])
    (file-error "array ref!")
    ;; fixme
    )
   ((eq (char-before) ?\))
    ;; A paren expression is preceding.
    ;; See if this is just a paren expression or whether it is a
    ;; function call.
    ;; For now assume that there are no function-calls-via-expr.
    (backward-sexp)
    (skip-chars-backward " \t\n")
    (if (save-excursion
          (backward-char)
          (looking-at "[A-Za-z0-9_]"))
        (backward-sexp)))
   ((save-excursion
      (backward-char)
      (looking-at "[A-Za-z0-9_]"))
    (backward-sexp))
   (t
    (file-error "unhandled case!"))))

(defun do-fix-instance ()
  (cond
   ((looking-at "->")
    (let ((field-name (get-field-name)))
      (insert ")")
      (backward-char)
      (skip-backward-lhs)
      (insert "BUF_" (upcase field-name) " (")))
   ((eq (char-after) ?.)
    (let ((field-name (get-field-name)))
      (insert ")")
      (backward-char)
      (backward-sexp)
      (assert-looking-at "\\(buffer_defaults\\|buffer_local_flags\\)")
      (insert "BUF_" (upcase field-name) " (&")))
   (t
    (message "%s:%d:%d: warning: did not see -> or ., probably macro"
             buffer-file-name (line-number-at-pos (point))
             (current-column)))))

(defun update-header-files ()
  (dolist (file (directory-files emacs-src t "h$"))
    (message "Applying header changes to %s" file)
    (find-file file)
    (while (re-search-forward
            "\\(current_buffer->\\|buffer_defaults\\.\\)"
            nil 'move)
      (goto-char (match-end 0))
      (skip-chars-backward "->.")
      (when (looking-at field-regexp)
        (do-fix-instance)))
    (goto-char (point-min))
    (while (search-forward "XBUFFER (" nil 'move)
      (goto-char (- (match-end 0) 1))
      (forward-sexp)
      ;; This works even for the new #define BUF_ macros
      ;; because the field-regexp ends with \_>.
      (when (looking-at field-regexp)
        (do-fix-instance)))
    (save-buffer)))

(defun fix-one-instance (filename line column)
  (message "%s:%d:%d: info: fixing instance" filename line column)
  (find-file filename)
  (goto-char (point-min))
  (forward-line (- line 1))
  ;; (move-to-column (- column 1))
  (forward-char (- column 1))
  (do-fix-instance))

(defvar make-accumulation "")

(defvar last-error-line nil)
(defvar error-list nil)

(defun make-filter (process string)
  (setq make-accumulation (concat make-accumulation string))
  (while (string-match "^[^\n]*\n" make-accumulation)
    (let ((line (substring (match-string 0 make-accumulation) 0 -1)))
      (setq make-accumulation (substring make-accumulation
                                         (match-end 0)))
      (message "%s" line)
      (if (string-match "^\\([^:]+\\):\\([0-9]+\\):\\([0-9]+\\)+: error:"
                        line)
          (save-excursion
            (let ((file-name (match-string 1 line))
                  (line-no (string-to-number (match-string 2 line)))
                  (col-no (string-to-number (match-string 3 line))))
              ;; Process all errors on a given line in reverse order.
              (unless (eq line-no last-error-line)
                (dolist (one-item error-list)
                  (apply #'fix-one-instance one-item))
                (setq error-list nil)
                (setq last-error-line line-no))
              (push (list file-name line-no col-no) error-list)))))))

(defvar make-done nil)

(defun make-sentinel (process string)
  (dolist (one-item error-list)
    (apply #'fix-one-instance one-item))
  (setq make-done t))

(defun recompile-emacs ()
  (let* ((default-directory emacs-build)
         (output-buffer (get-buffer-create "*recompile*"))
         (make (start-process "make" output-buffer "make" "-k")))
    (set-process-filter make #'make-filter)
    (set-process-sentinel make #'make-sentinel)
    (while (not make-done)
      (accept-process-output))))

(modify-buffer.h)
(update-header-files)
(recompile-emacs)
(dolist (buf (buffer-list))
  (with-current-buffer buf
    (when buffer-file-name
      (message "Saving %s" buffer-file-name)
      (save-buffer))))

[-- Attachment #3: rewrite-globals.el --]
[-- Type: text/plain, Size: 2007 bytes --]

;; Rewrite DEFVAR_LISP variables.
;; Each variable is renamed to start with impl_.
;; Compatibility defines are added to globals.h.
;; Invoke as: emacs --script rewrite-globals.el

(defvar defvar-list '())

(defun extract-defvars ()
  (let ((case-fold-search nil))
    (while (re-search-forward "^[^#*]*\\(DEFVAR_[A-Z_]*\\)" nil 'move)
      (let ((kind (match-string 1)))
        (unless (member kind '("DEFVAR_KBOARD" "DEFVAR_PER_BUFFER"))
          ;; Skip the paren and the first argument.
          (skip-chars-forward " (")
          (forward-sexp)
          (skip-chars-forward ", \t\n&")
          (if (looking-at "\\_<\\(\\sw\\|\\s_\\)+\\_>")
              (let ((var-name (match-string 0)))
                (if (equal kind "DEFVAR_LISP")
                    (push var-name defvar-list)))))))))

(defun munge-V ()
  (interactive)
  (while (re-search-forward "^\\(extern \\|static \\)?Lisp_Object " nil 'move)
    ;; skip function decls.
    (if (not (looking-at ".*("))
        (while (looking-at "[a-z0-9A-Z_]+")
          (if (member (match-string 0) defvar-list)
              (progn
                ;; Rename them all to impl_
                (goto-char (match-beginning 0))
                (insert "impl_")))
          (forward-sexp)
          (skip-chars-forward ", \t\n")))))

(defconst V-dir ".")

(defun munge-V-directory ()
  ;; First extract all defvars.
  (dolist (file (directory-files V-dir t "[ch]$"))
    (save-excursion
      (message "Scanning %s" file)
      (find-file file)
      (extract-defvars)))

  (setq defvar-list (delete-dups (sort defvar-list #'string<)))

  (dolist (file (directory-files V-dir t "[ch]$"))
    (save-excursion
      (message "Processing %s" file)
      (find-file file)
      (goto-char (point-min))
      (munge-V)
      (save-buffer)))

  (find-file "globals.h")
  (erase-buffer)
  (dolist (v defvar-list)
    (insert "#define " v " *find_variable_location (&impl_" v ")\n"))

  ;; A few special cases for globals.h.
  (insert "\n")
  (dolist (v '("do_mouse_tracking" "Vmark_even_if_inactive" "Vprint_level"))
    (insert "extern Lisp_Object impl_" v ";\n"))

  (save-buffer))

(munge-V-directory)
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-14 21:59 ` Tom Tromey 2010-09-14 22:37 ` Wojciech Meyer @ 2010-09-14 22:49 ` Wojciech Meyer 1 sibling, 0 replies; 97+ messages in thread From: Wojciech Meyer @ 2010-09-14 22:49 UTC (permalink / raw) To: Tom Tromey; +Cc: Wojciech Meyer, emacs-devel Tom Tromey <tromey@redhat.com> writes: > These scripts are both in src/ on the concurrency branch. Oops, I missed that. Thanks. Wojciech ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-14 21:16 ` Tom Tromey 2010-09-14 21:29 ` Wojciech Meyer @ 2010-09-14 23:13 ` Thomas Lord 2010-09-14 23:42 ` Wojciech Meyer 1 sibling, 1 reply; 97+ messages in thread From: Thomas Lord @ 2010-09-14 23:13 UTC (permalink / raw) To: Tom Tromey; +Cc: Wojciech Meyer, emacs-devel I've only loosely followed this thread so it's possible I'm off the mark (in which case - sorry) but, I'm pretty sure it might be helpful to remark about a variant of what I guess Tromey is suggesting: Years ago - not for GC but for managing critical sections wherein interrupts had to be deferred - we did something similar in a fork of GNU Guile. In that case, semi-automated ad-hoc rewriting was used a tiny bit but the most helpful thing turned out to be: a) rip a C grammar out of GCC (unless we used a different source, I forget). b) hack the actions to hook up to a scheme (or other lisp) run-time system and build an AST as a big S-EXP. Make sure this AST records source files and line numbers. c) write ad-hoc cheapo static analysis tools to walk the AST and find places where either it was obvious fixes were needed, or where it was not obvious fixes were not needed --- print those out like compiler error messages. d) Interactively page through those and, as you watch each case, apply the ad hoc semi-automated rewrite tools (or do it by hand in hard cases). Step (d) can go very, very fast and, at least in that case, steps (a .. c) can go a lot faster than you might guess at first glance. -t On Tue, 2010-09-14 at 15:16 -0600, Tom Tromey wrote: > >>>>> "Wojciech" == Wojciech Meyer <wojciech.meyer@googlemail.com> writes: > > Tom> It could be done. It just requires someone willing to do the work. > > Wojciech> I know. I could get my old sources of generational garbage > Wojciech> collector, to work.
However it is a daunting job (the worse I > Wojciech> could imagine, garbage collectors are nasty), plugging and > Wojciech> debugging a new garbage collector to such huge and esoteric (I > Wojciech> am sure people that who've been working on Emacs for years > Wojciech> will not take this words badly and understand straight away > Wojciech> what I am (a newbie) talking about) project like > Wojciech> Emacs. However I might try to experiment with it (however > Wojciech> unfortunately I am not that self confident about it ;) ). > > It is always ok to ask for help. > > The current collector is very simple to understand. If you read > alloc.c, and look through the data structures representing lisp objects > (in lisp.h), you will have a pretty good idea of what is going on. > > > FWIW, I looked at writing an incremental collector for Emacs. I was > primarily interested in using software write barriers... this turns out > to be hard because there is a lot of code in Emacs of the form: > > FIELD_ACCESSOR (object) = value; > > ... which for a software barrier has to be converted to: > > SET_FIELD_ACCESSOR (object, value); > > (There are other bad things, too, like passing around a Lisp_Object* > that points to the contents of a vector.) > > So, lots of grunge work, just to get the point where you could start > actually working on the GC. I would look at automated rewriting to > make this work -- that worked out great on the concurrent branch. > > > There was a more real attempt based on the Boehm GC. I think the bits > from that are still on a branch. This GC has a generational mode that, > IIRC, is based on memory protection bits. > > Tom > ^ permalink raw reply [flat|nested] 97+ messages in thread
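Steps (b) and (c) above, applied to the `FIELD_ACCESSOR (object) = value` pattern from the quoted message, might be sketched roughly as follows. This is only an illustration under stated assumptions: the node shape `(KIND FILE LINE CHILD...)` and the name `my-ast-accessor-assignments` are hypothetical, and nothing here is taken from the Guile work or from the Emacs sources.

```elisp
;; Hypothetical AST shape (an assumption for this sketch):
;;   (KIND FILE LINE CHILD...)
;; e.g. (assign "buffer.c" 11 LHS RHS) or (call "buffer.c" 11 "NAME" ARG...)

(defun my-ast-accessor-assignments (node)
  "Return warnings for assignments in NODE whose target is a call.
Each warning is a string formatted like a compiler diagnostic."
  (when (consp node)
    (let ((kind (nth 0 node))
          (file (nth 1 node))
          (line (nth 2 node))
          (children (nthcdr 3 node)))
      (append
       ;; An assignment whose left-hand side is itself a call node
       ;; is the "ACCESSOR (x) = y" shape that needs rewriting.
       (when (and (eq kind 'assign)
                  (eq (car-safe (car children)) 'call))
         (list (format "%s:%d: warning: assignment through accessor %s"
                       file line (nth 3 (car children)))))
       ;; Recurse into every child; non-list children yield nil.
       (apply #'append
              (mapcar #'my-ast-accessor-assignments children))))))
```

Printing the returned strings into a compilation-style buffer would give the "page through them like compiler errors" workflow of step (d).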
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-14 23:13 ` Thomas Lord @ 2010-09-14 23:42 ` Wojciech Meyer 0 siblings, 0 replies; 97+ messages in thread From: Wojciech Meyer @ 2010-09-14 23:42 UTC (permalink / raw) To: Thomas Lord; +Cc: Tom Tromey, Wojciech Meyer, emacs-devel Thomas Lord <lord@emf.net> writes: > Years ago - not for GC but for managing critical > sections wherein interrupts had to be deferred - > we did something similar in a fork of GNU Guile. > > In that case, semi-automated ad-hoc rewriting was > used a tiny bit but the most helpful thing turned > out to be: > > a) rip a C grammar out of GCC (unless we used a > different source, I forget). GCC has a hand-written recursive descent parser, so I would probably need to use some other one (I think I have one somewhere ;) ). Also, macro definitions might be harder to handle. > > b) hack the actions to hook up to a scheme (or > other lisp) run-time system and build an AST > as a big S-EXP. Make sure this AST records source > files and line numbers. We could get those from Clang's XML output (I hope), and even transform it with xsltproc into S-expressions for easy loading. > > c) write ad-hoc cheapo static analysis tools to > walk the AST and find places where either it was > obvious fixes were needed, or where it was not obvious > fixes were not needed --- print those out like compiler > error messages. > > d) Interactively page through those and, as you watch > each case, apply the ad hoc semi-automated rewrite tools > (or do it by hand in hard cases). > > Step (d) can go very, very fast and, at least in that > case, steps (a .. c) can go a lot faster than > you might guess at first glance. > Sounds straightforward to me. Thanks for the tips. But the worst thing is that I still don't know what I will be rewriting! That's the major problem here. (But I am willing to help improve the existing code base once I get to that point.) > -t Wojciech ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-14 19:32 ` Tom Tromey 2010-09-14 19:45 ` Wojciech Meyer @ 2010-09-15 10:47 ` Leo 2010-09-15 11:41 ` Andreas Schwab 2010-09-15 14:07 ` Stefan Monnier 1 sibling, 2 replies; 97+ messages in thread From: Leo @ 2010-09-15 10:47 UTC (permalink / raw) To: emacs-devel On 2010-09-14 20:32 +0100, Tom Tromey wrote: > As I recall, in my profiles, the GC and the regexp matcher were more > costly the bytecode interpreter (though of course this is > workload-dependent). Regarding regexp matcher, do you know if performance will be improved by using pcre? Leo ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 10:47 ` Leo @ 2010-09-15 11:41 ` Andreas Schwab 2010-09-15 12:10 ` Wojciech Meyer 2010-09-15 14:07 ` Stefan Monnier 1 sibling, 1 reply; 97+ messages in thread From: Andreas Schwab @ 2010-09-15 11:41 UTC (permalink / raw) To: Leo; +Cc: emacs-devel Leo <sdl.web@gmail.com> writes: > On 2010-09-14 20:32 +0100, Tom Tromey wrote: >> As I recall, in my profiles, the GC and the regexp matcher were more >> costly the bytecode interpreter (though of course this is >> workload-dependent). > > Regarding regexp matcher, do you know if performance will be improved by > using pcre? You can't just switch to other regexp engines because they don't offer all Emacs features. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 11:41 ` Andreas Schwab @ 2010-09-15 12:10 ` Wojciech Meyer 0 siblings, 0 replies; 97+ messages in thread From: Wojciech Meyer @ 2010-09-15 12:10 UTC (permalink / raw) To: Andreas Schwab; +Cc: Leo, emacs-devel On Wed, Sep 15, 2010 at 12:41 PM, Andreas Schwab <schwab@linux-m68k.org> wrote: > Leo <sdl.web@gmail.com> writes: > >> On 2010-09-14 20:32 +0100, Tom Tromey wrote: >>> As I recall, in my profiles, the GC and the regexp matcher were more >>> costly the bytecode interpreter (though of course this is >>> workload-dependent). >> >> Regarding regexp matcher, do you know if performance will be improved by >> using pcre? > > You can't just switch to other regexp engines because they don't offer > all Emacs features. We could translate Elisp-style regexp syntax to pcre syntax on the fly using string substitution. For the features not present in pcre we could just use a different set of functions; a regexp is self-contained, it is just a string. Compiling regexps to native code using gcc or GNU lightning (or any other framework) could also be a solution. > Andreas. Wojciech ^ permalink raw reply [flat|nested] 97+ messages in thread
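The on-the-fly syntax translation suggested above could look something like the sketch below. It is a deliberately partial illustration: it only swaps the escaping of grouping, alternation and interval operators, the name `my-emacs-regexp-to-pcre` is hypothetical, and it makes no attempt to handle bracket expressions or Emacs-only constructs such as syntax classes (`\s-`) or symbol boundaries (`\_<`), which is exactly where Andreas's objection applies.

```elisp
(defun my-emacs-regexp-to-pcre (regexp)
  "Translate the operator escaping of Emacs REGEXP to PCRE style.
Only ( ) | { } are handled; everything else is copied unchanged.
This is a sketch, not a complete or safe converter."
  (let ((i 0)
        (n (length regexp))
        (out '()))
    (while (< i n)
      (let ((c (aref regexp i)))
        (cond
         ;; Backslash followed by another character.
         ((and (eq c ?\\) (< (1+ i) n))
          (let ((d (aref regexp (1+ i))))
            (push (if (memq d '(?\( ?\) ?| ?{ ?}))
                      (string d)      ; \( in Emacs is ( in PCRE
                    (string c d))     ; keep any other escape as is
                  out)
            (setq i (+ i 2))))
         ;; Bare operator characters are literals in Emacs syntax,
         ;; so they must be escaped for PCRE.
         ((memq c '(?\( ?\) ?| ?{ ?}))
          (push (string ?\\ c) out)
          (setq i (1+ i)))
         (t
          (push (string c) out)
          (setq i (1+ i))))))
    (apply #'concat (nreverse out))))
```

A character-by-character scan is used instead of repeated `replace-regexp-in-string` calls so that an already-translated `(` is never escaped a second time.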
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 10:47 ` Leo 2010-09-15 11:41 ` Andreas Schwab @ 2010-09-15 14:07 ` Stefan Monnier 2010-09-15 14:27 ` Helmut Eller 2010-09-15 21:04 ` Leo 1 sibling, 2 replies; 97+ messages in thread From: Stefan Monnier @ 2010-09-15 14:07 UTC (permalink / raw) To: Leo; +Cc: emacs-devel >> As I recall, in my profiles, the GC and the regexp matcher were more >> costly the bytecode interpreter (though of course this is >> workload-dependent). > Regarding regexp matcher, do you know if performance will be improved by > using pcre? Using a different regexp-engine might be a good idea. But there are two issues: - Emacs needs to be able to match on buffer text rather than only on strings. Buffer text is made of 2 chunks of utf-8 byte arrays, so the regexp engine needs to be able to handle a hole in the middle of its input. - The main problem with Emacs regexps right now is that they have pathological cases where the match-time is enormous (potentially exponential explosion in the size of the input string). To be worthwhile a replacement should address this problem, which basically means it should not be based on backtracking. IIUC pcre suffers from the same 2nd problem, which in my book makes it a poor candidate for replacement. Stefan ^ permalink raw reply [flat|nested] 97+ messages in thread
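The two-chunk layout described above can be observed from Lisp. The following is a small sketch, assuming the built-in functions `gap-position`, `gap-size` and `position-bytes`; the helper name `my-describe-buffer-text` is made up for the example.

```elisp
(defun my-describe-buffer-text ()
  "Return a plist describing how the current buffer's text is stored.
Character and byte counts cover the accessible portion of the buffer;
the gap values describe the hole between the two chunks of text."
  (list :chars (- (point-max) (point-min))
        :bytes (- (position-bytes (point-max))
                  (position-bytes (point-min)))
        :gap-position (gap-position)
        :gap-size (gap-size)))
```

In a multibyte buffer the byte count exceeds the character count as soon as non-ASCII text is present, which is why a matcher working on buffer text has to deal with both a variable-width encoding and the gap.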
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 14:07 ` Stefan Monnier @ 2010-09-15 14:27 ` Helmut Eller 2010-09-15 14:59 ` Stefan Monnier 2010-09-15 21:04 ` Leo 1 sibling, 1 reply; 97+ messages in thread From: Helmut Eller @ 2010-09-15 14:27 UTC (permalink / raw) To: emacs-devel * Stefan Monnier [2010-09-15 14:07] writes: > - The main problem with Emacs regexps right now is that they have > pathological cases where the match-time is enormous (potentially > exponential explosion in the size of the input string). To be > worthwhile a replacement should address this problem, which basically > means it should not be based on backtracking. Is it possible (theoretically) to implement all of Emacs regexps without backtracking? In particular those with back-references (\N) seem problematic. Or is it necessary to recognize "optimizable" regexps before using a different regexp engine? Helmut ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 14:27 ` Helmut Eller @ 2010-09-15 14:59 ` Stefan Monnier 2010-09-15 15:09 ` Lars Magne Ingebrigtsen 2010-09-15 15:46 ` Helmut Eller 0 siblings, 2 replies; 97+ messages in thread From: Stefan Monnier @ 2010-09-15 14:59 UTC (permalink / raw) To: Helmut Eller; +Cc: emacs-devel >> - The main problem with Emacs regexps right now is that they have >> pathological cases where the match-time is enormous (potentially >> exponential explosion in the size of the input string). To be >> worthwhile a replacement should address this problem, which basically >> means it should not be based on backtracking. > Is it possible (theoretically) to implement all of Emacs regexps without > backtracking? In particular those with back-references (\N) seem > problematic. Or is it necessary to recognize "optimizable" regexps > before using a different regexp engine? IIRC regexps without back-refs can be matched (and searched) in O(N) where N is the length of the input. With back-refs, I think (not sure) the theoretical bound is O(N^2), which requires a non-backtracking algorithm. So yes, we'd need to handle back-refs specially. Several regexp engines do that already (they have a few different inner engines and choose which one to use based on the particular regexp at hand). Stefan ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 14:59 ` Stefan Monnier @ 2010-09-15 15:09 ` Lars Magne Ingebrigtsen 2010-09-15 15:31 ` Andreas Schwab 2010-09-15 15:42 ` Stefan Monnier 2010-09-15 15:46 ` Helmut Eller 1 sibling, 2 replies; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-15 15:09 UTC (permalink / raw) To: emacs-devel This reminds me of a question I've forgotten to ask. Many protocols that Gnus parses are such that I need to find some string that matches the beginning of the line, and I usually do (re-search-forward "^foo ") or the like. However, many times it's really a substring and not a regexp, and I could say (search-forward "\nfoo ") if it weren't for that not matching the first entry in a buffer. So you end up with (or (looking-at "foo ") (search-forward "\nfoo ")) which creates a regexp, anyway, and seems clumsy. So what I wonder is whether there is a smarter way to do this, in general. (I'm assuming that a simple string search is faster than a regexp search, but I've never actually benchmarked this.) -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
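The idiom in the message above can be wrapped in a small helper. The sketch below is one possible packaging, not an existing Emacs function: `my-search-forward-line-prefix` is a hypothetical name, and the `bolp` check is an added assumption so that the helper also behaves when point is not at the start of the buffer.

```elisp
(defun my-search-forward-line-prefix (prefix)
  "Move point just past the next line that starts with literal PREFIX.
A match starting exactly at point counts if point is at the beginning
of a line.  Return the new value of point, or nil if there is no match
\(in which case point does not move)."
  (let ((case-fold-search nil))
    (cond
     ;; First entry: the prefix starts right here.
     ((and (bolp) (looking-at (regexp-quote prefix)))
      (goto-char (match-end 0)))
     ;; Later entries: plain string search for newline + prefix.
     ((search-forward (concat "\n" prefix) nil t)))))
```

Note that `looking-at` still goes through the regexp machinery here, which is the overhead being asked about; the helper only hides the clumsiness, it does not remove the cost.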
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 15:09 ` Lars Magne Ingebrigtsen @ 2010-09-15 15:31 ` Andreas Schwab 2010-09-15 15:35 ` Lars Magne Ingebrigtsen 2010-09-15 15:42 ` Stefan Monnier 1 sibling, 1 reply; 97+ messages in thread From: Andreas Schwab @ 2010-09-15 15:31 UTC (permalink / raw) To: emacs-devel Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > So you end up with > > (or (looking-at "foo ") (search-forward "\nfoo ")) > > which creates a regexp, anyway, and seems clumsy. > > So what I wonder is whether there is a smarter way to do this, in > general. (I'm assuming that a simple string search is faster than a > regexp search, but I've never actually benchmarked this.) Trivial regexp searches are already optimized to bypass the regexp engine. Doing a similar check in looking-at might be worthwhile. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 15:31 ` Andreas Schwab @ 2010-09-15 15:35 ` Lars Magne Ingebrigtsen 2010-09-15 16:28 ` Andreas Schwab 0 siblings, 1 reply; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-15 15:35 UTC (permalink / raw) To: emacs-devel Andreas Schwab <schwab@linux-m68k.org> writes: > Trivial regexp searches are already optimized to bypass the regexp > engine. Doing a similar check in looking-at might be worthwhile. I did some trivial benchmarking with (while (search-backward "\n(defun " nil t))) and the equivalent re-search-backward in a buffer in a loop, and the search-backward version was about 8x faster. -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 15:35 ` Lars Magne Ingebrigtsen @ 2010-09-15 16:28 ` Andreas Schwab 2010-09-16 16:57 ` Lars Magne Ingebrigtsen 0 siblings, 1 reply; 97+ messages in thread From: Andreas Schwab @ 2010-09-15 16:28 UTC (permalink / raw) To: emacs-devel Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > Andreas Schwab <schwab@linux-m68k.org> writes: > >> Trivial regexp searches are already optimized to bypass the regexp >> engine. Doing a similar check in looking-at might be worthwhile. > > I did some trivial benchmarking with > > (while (search-backward "\n(defun " nil t))) > > and the equivalent re-search-backward in a buffer in a loop, and the > search-backward version was about 8x faster. How did you measure that? When I tried I did not see any significant difference. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 16:28 ` Andreas Schwab @ 2010-09-16 16:57 ` Lars Magne Ingebrigtsen 0 siblings, 0 replies; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-16 16:57 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 550 bytes --] Andreas Schwab <schwab@linux-m68k.org> writes: >> I did some trivial benchmarking with >> >> (while (search-backward "\n(defun " nil t))) >> >> and the equivalent re-search-backward in a buffer in a loop, and the >> search-backward version was about 8x faster. > > How did you measure that? When I tried I did not see any significant > difference. I just call the stuff I want to benchmark a gazillion times with my tiny benchmark.el package: (benchmark 10000 (goto-char (point-min)) (while (search-forward "\n(defun " nil t) (forward-line 1))) [-- Attachment #2: benchmark.el --] [-- Type: application/emacs-lisp, Size: 2082 bytes --] [-- Attachment #3: Type: text/plain, Size: 103 bytes --] -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
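For readers without the attached package, a comparable measurement can probably be made with the `benchmark-run` macro from the `benchmark` library bundled with Emacs. This is a sketch under that assumption; the function name `my-compare-search-speed` and the repetition count are arbitrary, and the same pattern string is deliberately used for both calls so that only the search function differs.

```elisp
(require 'benchmark)

(defun my-compare-search-speed ()
  "Time literal vs. regexp search for \"\\n(defun \" in the current buffer.
Return (LITERAL-SECONDS . REGEXP-SECONDS) for 100 passes each."
  (let ((literal (benchmark-run 100
                   (goto-char (point-min))
                   (while (search-forward "\n(defun " nil t))))
        (regexp (benchmark-run 100
                  (goto-char (point-min))
                  (while (re-search-forward "\n(defun " nil t)))))
    ;; benchmark-run returns (ELAPSED GC-COUNT GC-ELAPSED);
    ;; only the elapsed wall-clock time is kept here.
    (cons (car literal) (car regexp))))
```

Running it in a large Lisp source buffer and comparing the two numbers would show whether the difference reported in this subthread is reproducible on a given build; no particular ratio is claimed here.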
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 15:09 ` Lars Magne Ingebrigtsen 2010-09-15 15:31 ` Andreas Schwab @ 2010-09-15 15:42 ` Stefan Monnier 2010-09-15 15:51 ` Lars Magne Ingebrigtsen 1 sibling, 1 reply; 97+ messages in thread From: Stefan Monnier @ 2010-09-15 15:42 UTC (permalink / raw) To: emacs-devel > So you end up with > (or (looking-at "foo ") (search-forward "\nfoo ")) > which creates a regexp, anyway, and seems clumsy. Unless the text you match is short, the above is probably the fastest, indeed. There is no built-in support for the above idiom, OTOH, so you have to pay for the extra Elisp interpretation overhead of calling looking-at and then search-forward. > So what I wonder is whether there is a smarter way to do this, in > general. (I'm assuming that a simple string search is faster than a > regexp search, but I've never actually benchmarked this.) A simple string search is indeed faster and uses one of those algorithms that are faster for longer search strings. Stefan ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 15:42 ` Stefan Monnier @ 2010-09-15 15:51 ` Lars Magne Ingebrigtsen 2010-09-15 15:57 ` Leo 2010-09-16 2:57 ` Stephen J. Turnbull 0 siblings, 2 replies; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-15 15:51 UTC (permalink / raw) To: emacs-devel Stefan Monnier <monnier@IRO.UMontreal.CA> writes: >> So you end up with >> (or (looking-at "foo ") (search-forward "\nfoo ")) >> which creates a regexp, anyway, and seems clumsy. > > Unless the text you match is short, the above is probably the fastest, indeed. > There is no built-in support for the above idiom, OTOH, so you have to > pay for the extra Elisp interpretation overhead of calling looking-at > and then search-forward. looking-at probably compiles the regexp, so there might be unnecessary overhead there. (The regexp compilation and caching and stuff.) Is there any function like (is-the-string-following-point-equal-to-this-string-p "foo ") in Emacs that I've overlooked somehow? -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 15:51 ` Lars Magne Ingebrigtsen @ 2010-09-15 15:57 ` Leo 2010-09-15 16:01 ` Lars Magne Ingebrigtsen 2010-09-15 16:05 ` David Kastrup 2010-09-16 2:57 ` Stephen J. Turnbull 1 sibling, 2 replies; 97+ messages in thread From: Leo @ 2010-09-15 15:57 UTC (permalink / raw) To: emacs-devel On 2010-09-15 16:51 +0100, Lars Magne Ingebrigtsen wrote: > looking-at probably compiles the regexp, so there might be unnecessary > overhead there. (The regexp compilation and caching and stuff.) > > Is there any function like > > (is-the-string-following-point-equal-to-this-string-p "foo ") > > in Emacs that I've overlooked somehow? Can you build one using compare-strings? Leo ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 15:57 ` Leo @ 2010-09-15 16:01 ` Lars Magne Ingebrigtsen 2010-09-15 16:05 ` David Kastrup 1 sibling, 0 replies; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-15 16:01 UTC (permalink / raw) To: emacs-devel Leo <sdl.web@gmail.com> writes: > Can you build one using compare-strings? That would be a consing operation. It's an operation that doesn't have to create garbage, which is nice if you're doing loops like (while (or (following-string-p "foo ") (search-forward "\nfoo ")) ...) -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 15:57 ` Leo 2010-09-15 16:01 ` Lars Magne Ingebrigtsen @ 2010-09-15 16:05 ` David Kastrup 2010-09-15 16:23 ` Leo 1 sibling, 1 reply; 97+ messages in thread From: David Kastrup @ 2010-09-15 16:05 UTC (permalink / raw) To: emacs-devel Leo <sdl.web@gmail.com> writes: > On 2010-09-15 16:51 +0100, Lars Magne Ingebrigtsen wrote: >> looking-at probably compiles the regexp, so there might be unnecessary >> overhead there. (The regexp compilation and caching and stuff.) >> >> Is there any function like >> >> (is-the-string-following-point-equal-to-this-string-p "foo ") >> >> in Emacs that I've overlooked somehow? > > Can you build one using compare-strings? More likely compare-buffer-substrings. It would be nicer if compare-strings just accepted a buffer as either of its string arguments. Sure, one can use buffer-substring-no-properties with compare-strings or (with-temp-buffer (insert ... with compare-buffer-substrings, but that feels clumsy in comparison. -- David Kastrup ^ permalink raw reply [flat|nested] 97+ messages in thread
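The `compare-strings` plus `buffer-substring-no-properties` combination mentioned above could be written as below. It is a sketch with a hypothetical name (`my-following-string-p`), and it has the drawback already raised in this thread: every call allocates a temporary string.

```elisp
(defun my-following-string-p (string)
  "Return non-nil if the text right after point is exactly STRING.
Comparison is case-sensitive.  Point does not move.  Each call
conses a temporary copy of the buffer text being compared."
  (let ((end (+ (point) (length string))))
    (and (<= end (point-max))
         ;; compare-strings returns t only when both ranges are equal.
         (eq t (compare-strings
                string nil nil
                (buffer-substring-no-properties (point) end) nil nil)))))
```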
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 16:05 ` David Kastrup @ 2010-09-15 16:23 ` Leo 2010-09-15 16:37 ` David Kastrup 0 siblings, 1 reply; 97+ messages in thread From: Leo @ 2010-09-15 16:23 UTC (permalink / raw) To: emacs-devel On 2010-09-15 17:05 +0100, David Kastrup wrote: > Leo <sdl.web@gmail.com> writes: > >> On 2010-09-15 16:51 +0100, Lars Magne Ingebrigtsen wrote: >>> looking-at probably compiles the regexp, so there might be unnecessary >>> overhead there. (The regexp compilation and caching and stuff.) >>> >>> Is there any function like >>> >>> (is-the-string-following-point-equal-to-this-string-p "foo ") >>> >>> in Emacs that I've overlooked somehow? >> >> Can you build one using compare-strings? > > More likely compare-buffer-substrings. It would be nicer if > compare-strings just accepted a buffer as either of its string > arguments. Sure, one can use buffer-substring-no-properties with > compare-strings or (with-temp-buffer (insert ... with > compare-buffer-substrings, but that feels clumsy in comparison. These two functions look similar, any idea why not extend compare-strings as David suggested? Leo ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 16:23 ` Leo @ 2010-09-15 16:37 ` David Kastrup 2010-09-16 16:58 ` Lars Magne Ingebrigtsen 2010-09-16 17:35 ` Lars Magne Ingebrigtsen 0 siblings, 2 replies; 97+ messages in thread From: David Kastrup @ 2010-09-15 16:37 UTC (permalink / raw) To: emacs-devel Leo <sdl.web@gmail.com> writes: > On 2010-09-15 17:05 +0100, David Kastrup wrote: >> Leo <sdl.web@gmail.com> writes: >> >>> On 2010-09-15 16:51 +0100, Lars Magne Ingebrigtsen wrote: >>>> looking-at probably compiles the regexp, so there might be unnecessary >>>> overhead there. (The regexp compilation and caching and stuff.) >>>> >>>> Is there any function like >>>> >>>> (is-the-string-following-point-equal-to-this-string-p "foo ") >>>> >>>> in Emacs that I've overlooked somehow? >>> >>> Can you build one using compare-strings? >> >> More likely compare-buffer-substrings. It would be nicer if >> compare-strings just accepted a buffer as either of its string >> arguments. Sure, one can use buffer-substring-no-properties with >> compare-strings or (with-temp-buffer (insert ... with >> compare-buffer-substrings, but that feels clumsy in comparison. > > These two functions look similar, any idea why not extend > compare-strings as David suggested? A plausible reason would be that it is not trivial to do and nobody needed it so far. Lars sounds like he would be better served with looking-at getting an optional "LITERAL" argument making it do its job without involving the regexp machinery. Of course, he could just try something like (search-forward string (+ (point) (length string)) t) which should work just fine in his case. In particular since he appears to want to move beyond the match (if any) anyhow. -- David Kastrup ^ permalink raw reply [flat|nested] 97+ messages in thread
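The bounded `search-forward` call suggested above fits in a one-line helper. This is a minimal sketch with a made-up name (`my-skip-literal`); binding `case-fold-search` to nil is an added assumption so that the comparison is exact.

```elisp
(defun my-skip-literal (string)
  "If the text right after point is STRING, move past it.
Return the new value of point on success, or nil (without moving
point) otherwise.  No regexp is compiled and no string is consed."
  (let ((case-fold-search nil))
    ;; The bound restricts the search to the next (length STRING)
    ;; characters, so the only possible match starts at point.
    (search-forward string (+ (point) (length string)) t)))
```

Because it moves point on success, it can be used directly in a loop such as `(while (or (my-skip-literal "foo ") (search-forward "\nfoo " nil t)) ...)`.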
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 16:37 ` David Kastrup @ 2010-09-16 16:58 ` Lars Magne Ingebrigtsen 2010-09-16 21:11 ` Andreas Schwab 2010-09-16 17:35 ` Lars Magne Ingebrigtsen 1 sibling, 1 reply; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-16 16:58 UTC (permalink / raw) To: emacs-devel David Kastrup <dak@gnu.org> writes: > Lars sounds like he would be better served with looking-at getting an > optional "LITERAL" argument making it do its job without involving the > regexp machinery. Yes, a LITERAL option to `looking-at' would make sense. -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-16 16:58 ` Lars Magne Ingebrigtsen @ 2010-09-16 21:11 ` Andreas Schwab 2010-09-16 23:17 ` Lars Magne Ingebrigtsen 0 siblings, 1 reply; 97+ messages in thread From: Andreas Schwab @ 2010-09-16 21:11 UTC (permalink / raw) To: emacs-devel Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > David Kastrup <dak@gnu.org> writes: > >> Lars sounds like he would be better served with looking-at getting an >> optional "LITERAL" argument making it do its job without involving the >> regexp machinery. > > Yes, a LITERAL option to `looking-at' would make sense. IMHO it should be good enough to teach looking-at to recognize trivial regexps. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-16 21:11 ` Andreas Schwab @ 2010-09-16 23:17 ` Lars Magne Ingebrigtsen 2010-09-17 8:13 ` Eli Zaretskii 0 siblings, 1 reply; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-16 23:17 UTC (permalink / raw) To: emacs-devel Andreas Schwab <schwab@linux-m68k.org> writes: > IMHO it should be good enough to teach looking-at to recognize trivial > regexps. That would do the trick, too. But having the LITERALP option doesn't exactly add a lot of code... Unless I've misunderstood how buffers and strings work, which is a very high possibility. Is there an architectural overview of the Emacs internal anywhere? -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-16 23:17 ` Lars Magne Ingebrigtsen @ 2010-09-17 8:13 ` Eli Zaretskii 2010-09-17 13:17 ` Lars Magne Ingebrigtsen 0 siblings, 1 reply; 97+ messages in thread From: Eli Zaretskii @ 2010-09-17 8:13 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: emacs-devel > From: Lars Magne Ingebrigtsen <larsi@gnus.org> > Date: Fri, 17 Sep 2010 01:17:05 +0200 > > Unless I've misunderstood how buffers and strings work, which is a very > high possibility. What aspects of buffers and strings you think you might not understand? Ask here any specific questions you have. > Is there an architectural overview of the Emacs internal anywhere? See the "Object Internals" node in the ELisp manual. Buffers are described there, but strings are not. OTOH, a Lisp string is a fairly simple object, so you should be able to grasp it by looking at the definition of `struct Lisp_string' in lisp.h and how strings are allocated and handled in alloc.c. (There are subtleties about strings and buffers when Emacs allocates large chunks of memory, but I don't think those subtleties matter in the context of this discussion.) ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 8:13 ` Eli Zaretskii @ 2010-09-17 13:17 ` Lars Magne Ingebrigtsen 2010-09-17 13:30 ` Eli Zaretskii 0 siblings, 1 reply; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-17 13:17 UTC (permalink / raw) To: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > What aspects of buffers and strings you think you might not > understand? Ask here any specific questions you have. I mainly wonder how the text in the buffer is really represented. Is it like a string (which is utf8-ish), but with a gap somewhere? >> Is there an architectural overview of the Emacs internal anywhere? > > See the "Object Internals" node in the ELisp manual. Buffers are > described there, but strings are not. Thanks; that's a helpful node. However, the node seems to, er, not match up to how functions really are written. The code is full of BEGV, PT and CHECK_STRING, which is likely somewhat mysterious to new hackers. At least I was confused. :-) -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 13:17 ` Lars Magne Ingebrigtsen @ 2010-09-17 13:30 ` Eli Zaretskii 2010-09-17 13:34 ` Lars Magne Ingebrigtsen 0 siblings, 1 reply; 97+ messages in thread From: Eli Zaretskii @ 2010-09-17 13:30 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: emacs-devel > From: Lars Magne Ingebrigtsen <larsi@gnus.org> > Date: Fri, 17 Sep 2010 15:17:37 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > > What aspects of buffers and strings you think you might not > > understand? Ask here any specific questions you have. > > I mainly wonder how the text in the buffer is really represented. Is it > like a string (which is utf8-ish), but with a gap somewhere? Yes. > Thanks; that's a helpful node. However, the node seems to, er, not > match up to how functions really are written. The code is full of BEGV, > PT and CHECK_STRING, which is likely somewhat mysterious to new > hackers. At least I was confused. :-) BEGV, PT, etc. are covered by the "Buffer Internals" node, except that they are lowercased there (because they describe the corresponding struct members to which the upper-cased macros expand). CHECK_FOO just checks that the Lisp_Object is of type FOO. This is useful when you need to be sure you get the arguments of the right type before you process them. ^ permalink raw reply [flat|nested] 97+ messages in thread
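The "string with a gap" layout confirmed above can be sketched in a few lines of C. This is a minimal illustration only: the struct and the names gap_byte_addr and gap_copy are hypothetical, and it is not how Emacs defines its buffer text or BYTE_POS_ADDR. It shows why a logical byte position needs adjusting once it lies past the gap, and why code that wants one contiguous run of bytes (as the patch in this thread does with memcmp) first has to make sure the gap is not in the middle of that run.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified gap buffer (an assumption for illustration,
   not Emacs's real buffer structure).  */
struct gap_buffer
{
  unsigned char *data;   /* text before the gap, the gap, text after it */
  size_t gap_start;      /* byte offset in DATA where the gap begins */
  size_t gap_size;       /* number of unused bytes in the gap */
  size_t text_size;      /* bytes of real text, gap excluded */
};

/* Return the address of logical byte POS (0-based).  Positions at or
   after the gap are shifted by the size of the gap.  */
static unsigned char *
gap_byte_addr (struct gap_buffer *b, size_t pos)
{
  assert (pos < b->text_size);
  return b->data + (pos < b->gap_start ? pos : pos + b->gap_size);
}

/* Copy LEN logical bytes starting at POS into OUT, stepping over the
   gap one byte at a time.  */
static void
gap_copy (struct gap_buffer *b, size_t pos, size_t len, unsigned char *out)
{
  assert (pos <= b->text_size && len <= b->text_size - pos);
  for (size_t i = 0; i < len; i++)
    out[i] = *gap_byte_addr (b, pos + i);
}
```

With the text "hello" stored as "he", a four-byte gap, then "llo", logical byte 2 lives at physical offset 6. A single memcpy or memcmp over logical bytes 0..4 would read gap garbage, which is presumably why the real code moves the gap out of the way first.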
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 13:30 ` Eli Zaretskii @ 2010-09-17 13:34 ` Lars Magne Ingebrigtsen 0 siblings, 0 replies; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-17 13:34 UTC (permalink / raw) To: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> I mainly wonder how the text in the buffer is really represented. Is it >> like a string (which is utf8-ish), but with a gap somewhere? > > Yes. Oh, right. Then I'll rewrite the html-parse-buffer function to use the buffer string instead of taking a string as an argument, again. -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 16:37 ` David Kastrup 2010-09-16 16:58 ` Lars Magne Ingebrigtsen @ 2010-09-16 17:35 ` Lars Magne Ingebrigtsen 1 sibling, 0 replies; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-16 17:35 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 285 bytes --] David Kastrup <dak@gnu.org> writes: > Lars sounds like he would be better served with looking-at getting an > optional "LITERAL" argument making it do its job without involving the > regexp machinery. Here's a quick take on how an optional LITERALP option to looking-at might look: [-- Attachment #2: literal --] [-- Type: application/octet-stream, Size: 2806 bytes --]

=== modified file 'src/lisp.h'
*** src/lisp.h 2010-09-12 14:35:37 +0000
--- src/lisp.h 2010-09-16 17:27:50 +0000
***************
*** 3114,3120 ****
  EXFUN (Fmatch_beginning, 1);
  EXFUN (Fmatch_end, 1);
  extern void record_unwind_save_match_data (void);
! EXFUN (Flooking_at, 1);
  extern int fast_string_match (Lisp_Object, Lisp_Object);
  extern int fast_c_string_match_ignore_case (Lisp_Object, const char *);
  extern int fast_string_match_ignore_case (Lisp_Object, Lisp_Object);
--- 3114,3120 ----
  EXFUN (Fmatch_beginning, 1);
  EXFUN (Fmatch_end, 1);
  extern void record_unwind_save_match_data (void);
! EXFUN (Flooking_at, 2);
  extern int fast_string_match (Lisp_Object, Lisp_Object);
  extern int fast_c_string_match_ignore_case (Lisp_Object, const char *);
  extern int fast_string_match_ignore_case (Lisp_Object, Lisp_Object);

=== modified file 'src/search.c'
*** src/search.c 2010-08-09 09:35:21 +0000
--- src/search.c 2010-09-16 17:27:35 +0000
***************
*** 281,286 ****
--- 281,309 ----
  \f
  static Lisp_Object
+ looking_at_literally (Lisp_Object string)
+ {
+   int start_byte = CHAR_TO_BYTE (PT);
+   int end_byte, end = PT + SCHARS (string);
+
+   CHECK_STRING (string);
+
+   if (PT < GPT && GPT < end)
+     move_gap (PT);
+
+   if (end > ZV)
+     return Qnil;
+
+   end_byte = CHAR_TO_BYTE (end);
+
+   if (! memcmp (SDATA (string), BYTE_POS_ADDR (start_byte),
+                 end_byte - start_byte))
+     return Qt;
+   else
+     return Qnil;
+ }
+
+ static Lisp_Object
  looking_at_1 (Lisp_Object string, int posix)
  {
    Lisp_Object val;
***************
*** 357,370 ****
    return val;
  }

! DEFUN ("looking-at", Flooking_at, Slooking_at, 1, 1, 0,
         doc: /* Return t if text after point matches regular expression REGEXP.
  This function modifies the match data that `match-beginning',
  `match-end' and `match-data' access; save and restore the match
! data if you want to preserve them. */)
!   (Lisp_Object regexp)
  {
!   return looking_at_1 (regexp, 0);
  }

  DEFUN ("posix-looking-at", Fposix_looking_at, Sposix_looking_at, 1, 1, 0,
--- 380,398 ----
    return val;
  }

! DEFUN ("looking-at", Flooking_at, Slooking_at, 1, 2, 0,
         doc: /* Return t if text after point matches regular expression REGEXP.
  This function modifies the match data that `match-beginning',
  `match-end' and `match-data' access; save and restore the match
! data if you want to preserve them.
! If LITERALP is non-nil, REGEXP will be interpreted as a string, and the
! match data will not be modified. */)
!   (Lisp_Object regexp, Lisp_Object literalp)
  {
!   if (! NILP (literalp))
!     return looking_at_literally (regexp);
!   else
!     return looking_at_1 (regexp, 0);
  }

  DEFUN ("posix-looking-at", Fposix_looking_at, Sposix_looking_at, 1, 1, 0,

[-- Attachment #3: Type: text/plain, Size: 313 bytes --] Benchmarking shows the expected -- that this is a *lot* faster than the alternatives. A (while (or (looking-at "thing" t) (search-forward "\nthing"))) loop is about 5% slower than a pure search-forward loop. -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 15:51 ` Lars Magne Ingebrigtsen 2010-09-15 15:57 ` Leo @ 2010-09-16 2:57 ` Stephen J. Turnbull 2010-09-16 6:54 ` David Kastrup 2010-09-16 17:01 ` Lars Magne Ingebrigtsen 1 sibling, 2 replies; 97+ messages in thread From: Stephen J. Turnbull @ 2010-09-16 2:57 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: emacs-devel Lars Magne Ingebrigtsen writes: > Is there any function like > > (is-the-string-following-point-equal-to-this-string-p "foo ") Does every one-line function need to be a built-in? (defun is-the-string-following-point-equal-to-this-string-p (s) (string= s (buffer-substring (point) (+ (point) (length s))))) or (defun is-the-string-following-point-equal-to-this-string-p (s) (search-forward s (+ (point) (length s)) t)) ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-16 2:57 ` Stephen J. Turnbull @ 2010-09-16 6:54 ` David Kastrup 2010-09-16 8:10 ` Stephen J. Turnbull 2010-09-16 17:01 ` Lars Magne Ingebrigtsen 1 sibling, 1 reply; 97+ messages in thread From: David Kastrup @ 2010-09-16 6:54 UTC (permalink / raw) To: emacs-devel "Stephen J. Turnbull" <stephen@xemacs.org> writes: > Lars Magne Ingebrigtsen writes: > > > Is there any function like > > > > (is-the-string-following-point-equal-to-this-string-p "foo ") > > Does every one-line function need to be a built-in? > > (defun is-the-string-following-point-equal-to-this-string-p (s) > (string= s (buffer-substring (point) (+ (point) (length s))))) > > or > > (defun is-the-string-following-point-equal-to-this-string-p (s) > (search-forward s (+ (point) (length s)) t)) The former will signal an error when the string is longer than the rest of the buffer. The latter won't. You can't figure this out by reading the doc strings of the used functions. You have to read their source code. Since a user is not likely to pick the correct one-liner, it might make sense to define a function for that. -- David Kastrup ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-16 6:54 ` David Kastrup @ 2010-09-16 8:10 ` Stephen J. Turnbull 2010-09-16 8:31 ` David Kastrup 0 siblings, 1 reply; 97+ messages in thread From: Stephen J. Turnbull @ 2010-09-16 8:10 UTC (permalink / raw) To: David Kastrup; +Cc: emacs-devel David Kastrup writes: > > (defun is-the-string-following-point-equal-to-this-string-p (s) > > (string= s (buffer-substring (point) (+ (point) (length s))))) > > > > or > > > > (defun is-the-string-following-point-equal-to-this-string-p (s) > > (search-forward s (+ (point) (length s)) t)) > The former will signal an error when the string is longer than the > rest of the buffer. The latter won't. > > You can't figure this out by reading the doc strings of the used > functions. You have to read their source code. > > Since a user is not likely to pick the correct one-liner, it might > make sense to define a function for that. It might, but I would prefer a documentation patch. ;-) ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-16 8:10 ` Stephen J. Turnbull @ 2010-09-16 8:31 ` David Kastrup 0 siblings, 0 replies; 97+ messages in thread From: David Kastrup @ 2010-09-16 8:31 UTC (permalink / raw) To: emacs-devel "Stephen J. Turnbull" <stephen@xemacs.org> writes: > David Kastrup writes: > > > > (defun is-the-string-following-point-equal-to-this-string-p (s) > > > (string= s (buffer-substring (point) (+ (point) (length s))))) > > > > > > or > > > > > > (defun is-the-string-following-point-equal-to-this-string-p (s) > > > (search-forward s (+ (point) (length s)) t)) > > > The former will signal an error when the string is longer than the > > rest of the buffer. The latter won't. > > > > You can't figure this out by reading the doc strings of the used > > functions. You have to read their source code. > > > > Since a user is not likely to pick the correct one-liner, it might > > make sense to define a function for that. > > It might, but I would prefer a documentation patch. ;-) Can't really do the trick here. "If START or END are outside the accessible buffer region, an error is signaled. In case you wanted to check for the presence of a string str at buffer point, consider using `search-forward' with a LIMIT of (+ (length str) (point)) and a NOERROR of t. Or use (condition-case nil (string= s (buffer-substring (point) (+ (point) (length s)))) (args-out-of-range nil)) Or maybe (string= s (buffer-substring (point) (min (point-max) (+ (point) (length s))))) " I mean, get real. One can't write suggestions for all possible intended uses into a DOC string. Would make more sense to give buffer-substring a NOERROR argument of its own. Still ugh. -- David Kastrup ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-16 2:57 ` Stephen J. Turnbull 2010-09-16 6:54 ` David Kastrup @ 2010-09-16 17:01 ` Lars Magne Ingebrigtsen 2010-09-17 6:52 ` Stephen J. Turnbull 1 sibling, 1 reply; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-16 17:01 UTC (permalink / raw) To: emacs-devel "Stephen J. Turnbull" <stephen@xemacs.org> writes: > Does every one-line function need to be a built-in? > > (defun is-the-string-following-point-equal-to-this-string-p (s) > (string= s (buffer-substring (point) (+ (point) (length s))))) > > or > > (defun is-the-string-following-point-equal-to-this-string-p (s) > (search-forward s (+ (point) (length s)) t)) Does everything have to be slow? :-) The first one (in addition to not really working all that well) makes a benchmark of (benchmark 10000 (goto-char (point-min)) (while (or (string= (buffer-substring (point) (+ (point) (length "(defun "))) "(defun ") (search-forward "\n(defun " nil t)) (forward-line 1))) take 24 seconds, while the non-consing version takes 9 seconds. The latter version is only 50% slower than the single-search-forward version. -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-16 17:01 ` Lars Magne Ingebrigtsen @ 2010-09-17 6:52 ` Stephen J. Turnbull 2010-09-17 13:09 ` Lars Magne Ingebrigtsen 0 siblings, 1 reply; 97+ messages in thread From: Stephen J. Turnbull @ 2010-09-17 6:52 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: emacs-devel Lars Magne Ingebrigtsen writes: > Does everything have to be slow? :-) Wrong polarity. The question here is "does everything have to be fast?", and the answer is "no -- I mean, HELL NO!" Make your case that this function needs to be faster than a pure elisp implementation. ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 6:52 ` Stephen J. Turnbull @ 2010-09-17 13:09 ` Lars Magne Ingebrigtsen 2010-09-17 13:31 ` David Kastrup 2010-09-17 17:40 ` Stephen J. Turnbull 0 siblings, 2 replies; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-17 13:09 UTC (permalink / raw) To: emacs-devel "Stephen J. Turnbull" <stephen@xemacs.org> writes: > > Does everything have to be slow? :-) > > Wrong polarity. The question here is "does everything have to be > fast?", and the answer is "no -- I mean, HELL NO!" > > Make your case that this function needs to be faster than a pure elisp > implementation. I think that's a rather ... stunning approach, but might explain many things about XEmacs. I've presented a use case, and I've demonstrated how all the alternative implementations are 50-150% slower than my suggested new implementation, and I've done the implementation, which turned out to be totally trivial. What more do you need? -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 13:09 ` Lars Magne Ingebrigtsen @ 2010-09-17 13:31 ` David Kastrup 2010-09-17 13:39 ` Lars Magne Ingebrigtsen 2010-09-17 13:49 ` Andreas Schwab 2010-09-17 17:40 ` Stephen J. Turnbull 1 sibling, 2 replies; 97+ messages in thread From: David Kastrup @ 2010-09-17 13:31 UTC (permalink / raw) To: emacs-devel Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > "Stephen J. Turnbull" <stephen@xemacs.org> writes: > >> > Does everything have to be slow? :-) >> >> Wrong polarity. The question here is "does everything have to be >> fast?", and the answer is "no -- I mean, HELL NO!" >> >> Make your case that this function needs to be faster than a pure elisp >> implementation. > > I think that's a rather ... stunning approach, but might explain many > things about XEmacs. > > I've presented a use case, and I've demonstrated how all the alternative > implementations are 50-150% slower than my suggested new implementation, > and I've done the implementation, which turned out to be totally > trivial. Not really sure about that. + looking_at_literally (Lisp_Object string) + { + int start_byte = CHAR_TO_BYTE (PT); + int end_byte, end = PT + SCHARS (string); PT + SCHARS (string) can overflow here. Better check first rather than later whether ZV - PT < SCHARS (string). Yes, I know that most-positive-fixnum <= MAX_INT/2, but just on principle. + end_byte = CHAR_TO_BYTE (end); + + if (! memcmp (SDATA (string), BYTE_POS_ADDR (start_byte), + end_byte - start_byte)) That is assuming that both string and buffer are identically encoded (nowadays that likely means both have the same multibyteness). -- David Kastrup ^ permalink raw reply [flat|nested] 97+ messages in thread
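The overflow point above comes down to the order of the arithmetic. Computing PT + SCHARS (string) and then comparing against ZV can wrap around, while comparing the string length against ZV - PT cannot, since PT never exceeds ZV. A minimal sketch, assuming plain int positions and a hypothetical helper name fits_before_zv (not a function in the Emacs sources):

```c
#include <assert.h>
#include <limits.h>

/* Return 1 if LEN characters fit between PT and ZV, 0 otherwise.
   The subtraction ZV - PT cannot overflow because 0 <= PT <= ZV,
   whereas PT + LEN could exceed INT_MAX for a large LEN.  */
static int
fits_before_zv (int pt, int zv, int len)
{
  assert (0 <= pt && pt <= zv && len >= 0);
  return len <= zv - pt;
}
```

In the patch, a check of this shape would run before `end` is computed, so the early return for "string longer than the rest of the buffer" happens without ever forming the sum.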
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 13:31 ` David Kastrup @ 2010-09-17 13:39 ` Lars Magne Ingebrigtsen 2010-09-17 13:55 ` David Kastrup 2010-09-17 13:49 ` Andreas Schwab 1 sibling, 1 reply; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-17 13:39 UTC (permalink / raw) To: emacs-devel David Kastrup <dak@gnu.org> writes: > + end_byte = CHAR_TO_BYTE (end); > + > + if (! memcmp (SDATA (string), BYTE_POS_ADDR (start_byte), > + end_byte - start_byte)) > > That is assuming that both string and buffer are identically encoded > (nowadays that likely means both have the same multibyteness). Which brings me back to the other question I had, about buffer internals. :-) It was a guess based on Fbuffer_substring doing this: memcpy (SDATA (result), BYTE_POS_ADDR (start_byte), end_byte - start_byte); So I thought that if you could create a string by just memcpy()-ing data from the buffer, then it seemed likely that you could compare them with memcmp(). But that's probably wrong? -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 13:39 ` Lars Magne Ingebrigtsen @ 2010-09-17 13:55 ` David Kastrup 2010-09-17 14:18 ` Lars Magne Ingebrigtsen 0 siblings, 1 reply; 97+ messages in thread From: David Kastrup @ 2010-09-17 13:55 UTC (permalink / raw) To: emacs-devel Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > David Kastrup <dak@gnu.org> writes: > >> + end_byte = CHAR_TO_BYTE (end); >> + >> + if (! memcmp (SDATA (string), BYTE_POS_ADDR (start_byte), >> + end_byte - start_byte)) >> >> That is assuming that both string and buffer are identically encoded >> (nowadays that likely means both have the same multibyteness). > > Which brings me back to the other question I had, about buffer > internals. :-) > > It was a guess based on Fbuffer_substring doing this: > > memcpy (SDATA (result), BYTE_POS_ADDR (start_byte), end_byte - start_byte); > > So I thought that if you could create a string by just memcpy()-ing data > from the buffer, then it seemed likely that you could compare them with > memcmp(). But that's probably wrong? Fbuffer_substring also likely copies the multibyteness, not just the bytes. -- David Kastrup ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 13:55 ` David Kastrup @ 2010-09-17 14:18 ` Lars Magne Ingebrigtsen 2010-09-17 14:57 ` David Kastrup 2010-09-17 16:18 ` Eli Zaretskii 0 siblings, 2 replies; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-17 14:18 UTC (permalink / raw) To: emacs-devel David Kastrup <dak@gnu.org> writes: > Fbuffer_substring also likely copies the multibyteness, not just the > bytes. Yes, probably, but I don't know what that means. The code, in more fullness, is: if (start < GPT && GPT < end) move_gap (start); if (! NILP (current_buffer->enable_multibyte_characters)) result = make_uninit_multibyte_string (end - start, end_byte - start_byte); else result = make_uninit_string (end - start); memcpy (SDATA (result), BYTE_POS_ADDR (start_byte), end_byte - start_byte); So no matter whether it creates a "multibyte string" or not (and I don't know what the difference is), it still just does a memcpy from the buffer representation over to the string representation. But I've now read "33.1 Text Representations", and I think I understand a bit better. If you have a unibyte string with the bytes #x68 #xe9 #x6c #x6c #x6f in it, and you have a multibyte buffer with the string héllo in it, then they won't match. But are they supposed to? (equal (unibyte-string #x68 #xe9 #x6c #x6c #x6f) "héllo") => nil I guess not. So I'm thinking the memcmp() is sufficient to give the desired result. Isn't it? -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 14:18 ` Lars Magne Ingebrigtsen @ 2010-09-17 14:57 ` David Kastrup 2010-09-17 15:06 ` Lars Magne Ingebrigtsen 2010-09-17 16:18 ` Eli Zaretskii 1 sibling, 1 reply; 97+ messages in thread From: David Kastrup @ 2010-09-17 14:57 UTC (permalink / raw) To: emacs-devel Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > David Kastrup <dak@gnu.org> writes: > >> Fbuffer_substring also likely copies the multibyteness, not just the >> bytes. > > Yes, probably, but I don't know what that means. This: > if (! NILP (current_buffer->enable_multibyte_characters)) > result = make_uninit_multibyte_string (end - start, end_byte - start_byte); > else > result = make_uninit_string (end - start); > So no matter whether it creates a "multibyte string" or not (and I > don't know what the difference is), it still just does a memcpy from > the buffer representation over to the string representation. Sure. But when you compare you can be in the unfortunate situation that multibyteness _differs_. In this case, the byte patterns can be the same and the texts still different, and vice versa. -- David Kastrup ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 14:57 ` David Kastrup @ 2010-09-17 15:06 ` Lars Magne Ingebrigtsen 2010-09-17 15:24 ` Lars Magne Ingebrigtsen 2010-09-17 16:11 ` David Kastrup 0 siblings, 2 replies; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-17 15:06 UTC (permalink / raw) To: emacs-devel David Kastrup <dak@gnu.org> writes: > In this case, the byte patterns can be the same and the texts still > different, and vice versa. So just to be clear here, you think this is correct: (equalp (unibyte-string #x68 #xe9 #x6c #x6c #x6f) "héllo") => nil But that this behaviour is incorrect? (equalp (unibyte-string #x68 #x6c #x6c #x6f) "hllo") => t But that this is correct: (equalp (unibyte-string #x68 #xc3 #xa9 #x6c #x6c #x6f) "héllo") => nil And this is incorrect? (looking-at (unibyte-string #x68 #xc3 #xa9 #x6c #x6c #x6f) t)héllo => t *scratches head* Well, I guess I can see that... -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 15:06 ` Lars Magne Ingebrigtsen @ 2010-09-17 15:24 ` Lars Magne Ingebrigtsen 2010-09-17 16:11 ` Eli Zaretskii 2010-09-17 16:33 ` David Kastrup 2010-09-17 16:11 ` David Kastrup 1 sibling, 2 replies; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-17 15:24 UTC (permalink / raw) To: emacs-devel Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > (looking-at (unibyte-string #x68 #xc3 #xa9 #x6c #x6c #x6f) t)héllo > => t > > *scratches head* Well, I guess I can see that... I've looked at what Fequal does. It first compares the number of characters in the strings, and then the number of bytes, and then it does a memcmp(). So the only change necessary in at_literal() was to add a check that the number of bytes in the string and in the region we're comparing with is the same. -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 15:24 ` Lars Magne Ingebrigtsen @ 2010-09-17 16:11 ` Eli Zaretskii 2010-09-17 16:33 ` David Kastrup 1 sibling, 0 replies; 97+ messages in thread From: Eli Zaretskii @ 2010-09-17 16:11 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: emacs-devel > From: Lars Magne Ingebrigtsen <larsi@gnus.org> > Date: Fri, 17 Sep 2010 17:24:34 +0200 > > Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > > > (looking-at (unibyte-string #x68 #xc3 #xa9 #x6c #x6c #x6f) t)héllo > > => t > > > > *scratches head* Well, I guess I can see that... > > I've looked at what Fequal does. It first compares the number of > characters in the strings, and then the number of bytes, and then it > does a memcmp(). So the only change necessary in at_literal() was to > add a check that the number of bytes in the string and in the region > we're comparing with is the same. I don't recommend to get your hands dirty with unibyte strings and unibyte buffers. Just do it right for multibyte ones and wait until someone hollers. ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 15:24 ` Lars Magne Ingebrigtsen 2010-09-17 16:11 ` Eli Zaretskii @ 2010-09-17 16:33 ` David Kastrup 2010-09-17 16:41 ` Andreas Schwab 2010-09-17 17:24 ` Lars Magne Ingebrigtsen 1 sibling, 2 replies; 97+ messages in thread From: David Kastrup @ 2010-09-17 16:33 UTC (permalink / raw) To: emacs-devel Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > >> (looking-at (unibyte-string #x68 #xc3 #xa9 #x6c #x6c #x6f) t)héllo >> => t >> >> *scratches head* Well, I guess I can see that... > > I've looked at what Fequal does. It first compares the number of > characters in the strings, and then the number of bytes, and then it > does a memcmp(). It also checks that the multibyteness is the same IIRC. -- David Kastrup ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 16:33 ` David Kastrup @ 2010-09-17 16:41 ` Andreas Schwab 2010-09-17 17:17 ` David Kastrup 2010-09-17 17:24 ` Lars Magne Ingebrigtsen 1 sibling, 1 reply; 97+ messages in thread From: Andreas Schwab @ 2010-09-17 16:41 UTC (permalink / raw) To: David Kastrup; +Cc: emacs-devel David Kastrup <dak@gnu.org> writes: > Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > >> Lars Magne Ingebrigtsen <larsi@gnus.org> writes: >> >>> (looking-at (unibyte-string #x68 #xc3 #xa9 #x6c #x6c #x6f) t)héllo >>> => t >>> >>> *scratches head* Well, I guess I can see that... >> >> I've looked at what Fequal does. It first compares the number of >> characters in the strings, and then the number of bytes, and then it >> does a memcmp(). > > It also checks that the multibyteness is the same IIRC. If both the number of chars and bytes agree then they must be of the same multibyteness. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 16:41 ` Andreas Schwab @ 2010-09-17 17:17 ` David Kastrup 2010-09-17 18:24 ` David Kastrup 2010-09-17 18:53 ` Stephen J. Turnbull 0 siblings, 2 replies; 97+ messages in thread From: David Kastrup @ 2010-09-17 17:17 UTC (permalink / raw) To: emacs-devel Andreas Schwab <schwab@linux-m68k.org> writes: > David Kastrup <dak@gnu.org> writes: > >> Lars Magne Ingebrigtsen <larsi@gnus.org> writes: >> >>> Lars Magne Ingebrigtsen <larsi@gnus.org> writes: >>> >>>> (looking-at (unibyte-string #x68 #xc3 #xa9 #x6c #x6c #x6f) t)héllo >>>> => t >>>> >>>> *scratches head* Well, I guess I can see that... >>> >>> I've looked at what Fequal does. It first compares the number of >>> characters in the strings, and then the number of bytes, and then it >>> does a memcmp(). >> >> It also checks that the multibyteness is the same IIRC. > > If both the number of chars and bytes agree then they must be of the > same multibyteness. Why? (length (string-as-multibyte "h\251llo")) => 5 (equal "h\251llo" (string-as-multibyte "h\251llo")) => nil -- David Kastrup ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 17:17 ` David Kastrup @ 2010-09-17 18:24 ` David Kastrup 2010-09-17 20:30 ` David Kastrup 2010-09-17 18:53 ` Stephen J. Turnbull 1 sibling, 1 reply; 97+ messages in thread From: David Kastrup @ 2010-09-17 18:24 UTC (permalink / raw) To: emacs-devel David Kastrup <dak@gnu.org> writes: > Andreas Schwab <schwab@linux-m68k.org> writes: > >> David Kastrup <dak@gnu.org> writes: >> >>> Lars Magne Ingebrigtsen <larsi@gnus.org> writes: >>> >>>> Lars Magne Ingebrigtsen <larsi@gnus.org> writes: >>>> >>>>> (looking-at (unibyte-string #x68 #xc3 #xa9 #x6c #x6c #x6f) t)héllo >>>>> => t >>>>> >>>>> *scratches head* Well, I guess I can see that... >>>> >>>> I've looked at what Fequal does. It first compares the number of >>>> characters in the strings, and then the number of bytes, and then it >>>> does a memcmp(). >>> >>> It also checks that the multibyteness is the same IIRC. >> >> If both the number of chars and bytes agree then they must be of the >> same multibyteness. > > Why? > > (length (string-as-multibyte "h\251llo")) => 5 > > (equal "h\251llo" (string-as-multibyte "h\251llo")) => nil Just seen that string-as-multibyte actually _does_ convert the content (did not use to do so pre-23). So string equality indeed does not check explicitly for multibyteness which is sort of ugly: (equal "abc" (string-to-multibyte "abc")) => t (string= "abc" (string-to-multibyte "abc")) => t (multibyte-string-p "abc") => nil (multibyte-string-p (string-to-multibyte "abc")) => t So it is likely that the correct way to deal with the uni/multibyte issue in the case of buffer/string comparison is the same: just check that both character and byte counts are identical. -- David Kastrup ^ permalink raw reply [flat|nested] 97+ messages in thread
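The rule arrived at above (compare character counts, then byte counts, then the bytes themselves) can be sketched in C as follows. The struct and the name text_equal are hypothetical stand-ins for illustration; this is not the code of Fequal, only the shape of the comparison as described in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a piece of text, whether it comes from a
   string or from a contiguous stretch of a buffer.  */
struct text
{
  const unsigned char *bytes;
  size_t nchars;   /* length in characters */
  size_t nbytes;   /* length in bytes */
};

/* Two texts are considered equal only if they agree in character
   count, in byte count, and in every byte.  */
static int
text_equal (struct text a, struct text b)
{
  assert (a.nchars <= a.nbytes && b.nchars <= b.nbytes);
  return a.nchars == b.nchars
         && a.nbytes == b.nbytes
         && memcmp (a.bytes, b.bytes, a.nbytes) == 0;
}
```

Under this rule the unibyte string with bytes 68 c3 a9 6c 6c 6f (six characters, six bytes) does not match multibyte "héllo" (five characters, six bytes) even though the byte patterns are identical, while a pure-ASCII "abc" matches regardless of which representation either side uses.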
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 18:24 ` David Kastrup @ 2010-09-17 20:30 ` David Kastrup 2010-09-17 20:49 ` Lars Magne Ingebrigtsen 0 siblings, 1 reply; 97+ messages in thread From: David Kastrup @ 2010-09-17 20:30 UTC (permalink / raw) To: emacs-devel David Kastrup <dak@gnu.org> writes: > So it is likely that the correct way to deal with the uni/multibyte > issue in the case of buffer/string comparison is the same: just check > that both character and byte counts are identical. Oh, we are talking about `looking-at'. Did I mention that we should take case-fold-search into account? -- David Kastrup ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 20:30 ` David Kastrup @ 2010-09-17 20:49 ` Lars Magne Ingebrigtsen 2010-09-18 4:31 ` David Kastrup 0 siblings, 1 reply; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-17 20:49 UTC (permalink / raw) To: emacs-devel David Kastrup <dak@gnu.org> writes: > Oh, we are talking about `looking-at'. Did I mention that we should > take case-fold-search into account? Or we could say that LITERALP literally means literal. :-) -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 20:49 ` Lars Magne Ingebrigtsen @ 2010-09-18 4:31 ` David Kastrup 0 siblings, 0 replies; 97+ messages in thread From: David Kastrup @ 2010-09-18 4:31 UTC (permalink / raw) To: emacs-devel Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > David Kastrup <dak@gnu.org> writes: > >> Oh, we are talking about `looking-at'. Did I mention that we should >> take case-fold-search into account? > > Or we could say that LITERALP literally means literal. :-) No point in creating an API different to the other search functions. Come to think of it, all of them have separate names rather than a LITERAL argument. It is just replace-match that has LITERAL (as well as an explicit FIXEDCASE). And all of them set match-data, apparently. -- David Kastrup ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 17:17 ` David Kastrup 2010-09-17 18:24 ` David Kastrup @ 2010-09-17 18:53 ` Stephen J. Turnbull 2010-09-17 20:57 ` Eli Zaretskii 1 sibling, 1 reply; 97+ messages in thread From: Stephen J. Turnbull @ 2010-09-17 18:53 UTC (permalink / raw) To: David Kastrup; +Cc: emacs-devel David Kastrup writes: > > If both the number of chars and bytes agree then they must be of > > the same multibyteness. > > Why? Actually, there's an exceptional case: if both strings are pure ASCII. In that case it might be possible that one string is multibyte and the other unibyte, while the numbers of characters and of bytes are equal. However, in that case the two strings have the same semantics, so I would suspect that allowing them to compare equal if their representations are equal (ignoring multi-byte-ness) is intentional. The example you gave proves nothing, however. In fact, when that string is presented by `string-as-multibyte', ?\351 will be converted to a private space character in Unicode and therefore will have more than one byte in its representation. Thus the length in bytes of the string (as multibyte) will be 7 (or maybe more, I forget which private space naked bytes live in). Here's one way to get byte length of a string: (defun string-byte-count (s) (length (if (string-multibyte-p s) (encode-coding-string s 'utf-8) s))) ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 18:53 ` Stephen J. Turnbull @ 2010-09-17 20:57 ` Eli Zaretskii 2010-09-18 14:19 ` Stephen J. Turnbull 0 siblings, 1 reply; 97+ messages in thread From: Eli Zaretskii @ 2010-09-17 20:57 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: dak, emacs-devel > From: "Stephen J. Turnbull" <stephen@xemacs.org> > Date: Sat, 18 Sep 2010 03:53:27 +0900 > Cc: emacs-devel@gnu.org > > Actually, there's an exceptional case: if both strings are pure ASCII. > In that case it might be possible that one string is multibyte and the > other unibyte, while the numbers of characters and of bytes are equal. A unibyte string in Emacs has its `size_byte' member set to a negative value: /* Mark STR as a unibyte string. */ #define STRING_SET_UNIBYTE(STR) \ do { if (EQ (STR, empty_multibyte_string)) \ (STR) = empty_unibyte_string; \ else XSTRING (STR)->size_byte = -1; } while (0) By contrast, a multibyte string holds there the number of bytes in its internal representation. So a pure ASCII string could be unibyte or multibyte, and the `size_byte' member will be negative in the former case and positive in the latter case. However, AFAIK Emacs always makes a unibyte string if all the characters are pure ASCII. So this does not matter in practice. > The example you gave proves nothing, however. In fact, when that > string is presented by `string-as-multibyte', ?\351 will be converted > to a private space character in Unicode and therefore will have more > than one byte in its representation. Thus the length in bytes of the > string (as multibyte) will be 7 (or maybe more, I forget which private > space naked bytes live in). Here's one way to get byte length of a > string: > > (defun string-byte-count (s) > (length (if (string-multibyte-p s) (encode-coding-string s 'utf-8) s))) See above: this is not accurate. ^ permalink raw reply [flat|nested] 97+ messages in thread
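The `size_byte' convention described above can be sketched in a few lines of C. This is only an illustration: `toy_string', `toy_set_unibyte', `toy_is_multibyte' and `toy_byte_length' are hypothetical names, not the real Emacs definitions. The only thing taken from the message is the idea that a negative `size_byte' marks a unibyte string; the fallback to the character count for unibyte strings is an assumption that follows from one character being one byte there.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Lisp string header; not the Emacs struct. */
struct toy_string {
  long size;       /* number of characters */
  long size_byte;  /* number of bytes, or negative for a unibyte string */
};

/* Same idea as STRING_SET_UNIBYTE: flag the string with a negative value. */
static void toy_set_unibyte(struct toy_string *s)
{
  assert(s != NULL);
  s->size_byte = -1;
}

static int toy_is_multibyte(const struct toy_string *s)
{
  return s->size_byte >= 0;
}

/* In a unibyte string every character is one byte, so the byte length
   is just the character count (an assumption of this sketch). */
static long toy_byte_length(const struct toy_string *s)
{
  return s->size_byte < 0 ? s->size : s->size_byte;
}
```

With this layout a pure-ASCII string can exist in either form: the flag differs, but the character count and the effective byte length are the same.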
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 20:57 ` Eli Zaretskii @ 2010-09-18 14:19 ` Stephen J. Turnbull 2010-09-18 15:46 ` Eli Zaretskii 2010-09-18 15:58 ` Stefan Monnier 0 siblings, 2 replies; 97+ messages in thread From: Stephen J. Turnbull @ 2010-09-18 14:19 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dak, emacs-devel Eli Zaretskii writes: > However, AFAIK Emacs always makes a unibyte string if all the > characters are pure ASCII. So this does not matter in practice. That's true in Stefan's personal Emacs, AIUI, but otherwise I bet aset on a multibyte string can turn it into pure ASCII. > > (defun string-byte-count (s) > > (length (if (string-multibyte-p s) (encode-coding-string s > > 'utf-8) s))) > > See above: this is not accurate. I don't understand what "above" you're referring to. Unless maybe you mean because in unibyte strings the byte count is negative? ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-18 14:19 ` Stephen J. Turnbull @ 2010-09-18 15:46 ` Eli Zaretskii 2010-09-18 15:58 ` Stefan Monnier 1 sibling, 0 replies; 97+ messages in thread From: Eli Zaretskii @ 2010-09-18 15:46 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: dak, emacs-devel > From: "Stephen J. Turnbull" <stephen@xemacs.org> > Cc: dak@gnu.org, > emacs-devel@gnu.org > Date: Sat, 18 Sep 2010 23:19:11 +0900 > > > > (defun string-byte-count (s) > > > (length (if (string-multibyte-p s) (encode-coding-string s > > > 'utf-8) s))) > > > > See above: this is not accurate. > > I don't understand what "above" you're referring to. The definition of STRING_SET_UNIBYTE. > Unless maybe you mean because in unibyte strings the byte count is > negative? Yes. ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-18 14:19 ` Stephen J. Turnbull 2010-09-18 15:46 ` Eli Zaretskii @ 2010-09-18 15:58 ` Stefan Monnier 1 sibling, 0 replies; 97+ messages in thread From: Stefan Monnier @ 2010-09-18 15:58 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: Eli Zaretskii, dak, emacs-devel >> However, AFAIK Emacs always makes a unibyte string if all the >> characters are pure ASCII. So this does not matter in practice. > That's true in Stefan's personal Emacs, AIUI, but otherwise I bet aset > on a multibyte string can turn it into pure ASCII. Actually, it's not true in my personal branch because I use there some (failed-)experimental code which distinguishes between "unibyte/multibyte/anybyte" where "anybyte" means "could be either" and is used for purely ascii strings (or more specifically, it's used for those strings where the unibyte and multibyte representation are the same). It might have been a good idea, but it seems this is not worth the trouble changing now. So all that's left from this experiment is that when I see problem with multibyte/unibyte I absolutely need to try it with the trunk code, because it may just be a bug in my branch. Stefan ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 16:33 ` David Kastrup 2010-09-17 16:41 ` Andreas Schwab @ 2010-09-17 17:24 ` Lars Magne Ingebrigtsen 1 sibling, 0 replies; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-17 17:24 UTC (permalink / raw) To: emacs-devel David Kastrup <dak@gnu.org> writes: > It also checks that the multibyteness is the same IIRC. It's possible, but I can't find that in the code. I think this is the relevant bit in internal_equal(): case Lisp_String: if (SCHARS (o1) != SCHARS (o2)) return 0; if (SBYTES (o1) != SBYTES (o2)) return 0; if (memcmp (SDATA (o1), SDATA (o2), SBYTES (o1))) return 0; if (props && !compare_string_intervals (o1, o2)) return 0; return 1; -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
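The comparison order quoted from internal_equal() (character count, then byte count, then memcmp) can be tried out in isolation. The sketch below is a simplified model, not the Emacs code: `toy_str' and `toy_str_equal' are hypothetical names, and text properties are left out entirely.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of a string: counts plus raw bytes. */
struct toy_str {
  size_t nchars;              /* number of characters */
  size_t nbytes;              /* number of bytes in the representation */
  const unsigned char *data;  /* the bytes themselves */
};

/* Equal only if character count, byte count and bytes all agree.
   No separate check of a unibyte/multibyte flag is made. */
static int toy_str_equal(const struct toy_str *a, const struct toy_str *b)
{
  assert(a != NULL && b != NULL);
  if (a->nchars != b->nchars)
    return 0;
  if (a->nbytes != b->nbytes)
    return 0;
  return memcmp(a->data, b->data, a->nbytes) == 0;
}
```

Under this model the six bytes 68 c3 a9 6c 6c 6f read as six unibyte characters do not compare equal to the same bytes read as five multibyte characters, while two pure-ASCII strings with identical bytes always compare equal.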
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 15:06 ` Lars Magne Ingebrigtsen 2010-09-17 15:24 ` Lars Magne Ingebrigtsen @ 2010-09-17 16:11 ` David Kastrup 1 sibling, 0 replies; 97+ messages in thread From: David Kastrup @ 2010-09-17 16:11 UTC (permalink / raw) To: emacs-devel Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > David Kastrup <dak@gnu.org> writes: > >> In this case, the byte patterns can be the same and the texts still >> different, and vice versa. > > So just to be clear here, you think this is correct: > > (equalp (unibyte-string #x68 #xe9 #x6c #x6c #x6f) "héllo") > => nil equalp does not exist. I'll change to equal for the rest of the discussion. > But that this behaviour is incorrect? > > (equal (unibyte-string #x68 #x6c #x6c #x6f) "hllo") > => t No. Correct. > But that this is correct: > > (equal (unibyte-string #x68 #xc3 #xa9 #x6c #x6c #x6f) "héllo") > => nil Still correct. But notice that (equal (string-as-multibyte (unibyte-string #x68 #xc3 #xa9 #x6c #x6c #x6f)) "héllo") => t And notice (equal (unibyte-string #x68 #xe9 #x6c #x6c #x6f) "h\351llo") => t whereas (equal "h\351llo" "héllo") => nil while (equal "h\u00e9llo" "héllo") => t It should probably be called an error that the print form of "h\351llo" is "héllo" on an utf-8 terminal. That does not particularly help to make things less confusing since the input forms are non-equivalent. -- David Kastrup ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 14:18 ` Lars Magne Ingebrigtsen 2010-09-17 14:57 ` David Kastrup @ 2010-09-17 16:18 ` Eli Zaretskii 2010-09-17 16:24 ` Lars Magne Ingebrigtsen 1 sibling, 1 reply; 97+ messages in thread From: Eli Zaretskii @ 2010-09-17 16:18 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: emacs-devel > From: Lars Magne Ingebrigtsen <larsi@gnus.org> > Date: Fri, 17 Sep 2010 16:18:30 +0200 > > So I'm thinking the memcmp() is sufficient to give the desired result. > Isn't it? No, because of the gap issue. That's why Fbuffer_substring moves the gap out of its way: if (start < GPT && GPT < end) move_gap (start); This ensures that the region of buffer text between START and END is contiguous, without the gap. Which is why you can really use memcpy et al. ^ permalink raw reply [flat|nested] 97+ messages in thread
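To see why the gap matters for memcmp(), here is a small toy gap buffer in C. It is a sketch under simplifying assumptions (byte positions only, 0-based, no reallocation), and `toy_gap_buffer', `toy_move_gap' and `toy_contiguous' are hypothetical names rather than Emacs functions. The test in toy_contiguous() mirrors the `start < GPT && GPT < end' condition quoted above.

```c
#include <assert.h>
#include <string.h>

/* Toy gap buffer: text lives in [0, gap_start) and [gap_end, capacity). */
struct toy_gap_buffer {
  char *data;
  size_t gap_start;
  size_t gap_end;
  size_t capacity;
};

/* Move the gap so that it begins at logical position POS. */
static void toy_move_gap(struct toy_gap_buffer *b, size_t pos)
{
  size_t gap_len = b->gap_end - b->gap_start;
  assert(pos <= b->capacity - gap_len);
  if (pos < b->gap_start) {
    /* Text between POS and the gap slides up to the far side of the gap. */
    size_t n = b->gap_start - pos;
    memmove(b->data + b->gap_end - n, b->data + pos, n);
  } else if (pos > b->gap_start) {
    /* Text just after the gap slides down into the start of the gap. */
    size_t n = pos - b->gap_start;
    memmove(b->data + b->gap_start, b->data + b->gap_end, n);
  }
  b->gap_start = pos;
  b->gap_end = pos + gap_len;
}

/* Return a pointer to the bytes of logical range [START, END), moving
   the gap out of the way first if it falls inside the range. */
static const char *toy_contiguous(struct toy_gap_buffer *b,
                                  size_t start, size_t end)
{
  size_t gap_len;
  if (start < b->gap_start && b->gap_start < end)
    toy_move_gap(b, start);
  gap_len = b->gap_end - b->gap_start;
  return b->data + (start < b->gap_start ? start : start + gap_len);
}
```

Once toy_contiguous() has returned, the caller can hand the pointer to memcmp() or memcpy() for `end - start' bytes without stepping into the gap.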
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 16:18 ` Eli Zaretskii @ 2010-09-17 16:24 ` Lars Magne Ingebrigtsen 2010-09-17 16:39 ` Eli Zaretskii 2010-09-17 16:39 ` Eli Zaretskii 0 siblings, 2 replies; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-17 16:24 UTC (permalink / raw) To: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > No, because of the gap issue. That's why Fbuffer_substring moves the > gap out of its way: > > if (start < GPT && GPT < end) > move_gap (start); Yeah, I do that too in at_literal(). The thing I'm most unclear on now is whether it's valid to say int thing = PT; and whether it's valid to say PT + 1; I've grepped through the code, and this seems to be used all over the place, so I'm guessing that perhaps the size of a buffer is constrained to be less than INT_MAX? Even though PT is EMACS_INT. -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 16:24 ` Lars Magne Ingebrigtsen @ 2010-09-17 16:39 ` Eli Zaretskii 2010-09-17 17:30 ` Lars Magne Ingebrigtsen 2010-09-17 16:39 ` Eli Zaretskii 1 sibling, 1 reply; 97+ messages in thread From: Eli Zaretskii @ 2010-09-17 16:39 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: emacs-devel > From: Lars Magne Ingebrigtsen <larsi@gnus.org> > Date: Fri, 17 Sep 2010 18:24:34 +0200 > > The thing I'm most unclear on now is whether it's valid to say > > int thing = PT; > > and whether it's valid to say > > PT + 1; PT is just a number, an integral data type. So adding one to it is okay. But storing it into an `int' is not, because PT is an EMACS_INT, see `struct buffer': struct buffer { ... /* Char position of point in buffer. */ EMACS_INT pt; /* Byte position of point in buffer. */ EMACS_INT pt_byte; EMACS_INT is a 64-bit data type on 64-bit machines, so assigning it to an int is a bug waiting to happen. You should do this instead: EMACS_INT thing = PT; > I've grepped through the code, and this seems to be used all over the > place Each place where you see PT assigned to an int is a bug; please either report it or fix it right away. > so I'm guessing that perhaps the size of a buffer is constrained > to be less than INT_MAX? No, it's constrained by most-positive-fixnum (MOST_POSITIVE_FIXNUM in C), but we still have many places where we use an int, which is why buffers larger than INT_MAX do not always work well. ^ permalink raw reply [flat|nested] 97+ messages in thread
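The hazard of storing a wide position in an `int' can be shown with a small C sketch. `toy_emacs_int', `toy_fits_in_int' and `toy_narrow_position' are hypothetical names; `toy_emacs_int' merely stands in for EMACS_INT on a 64-bit build and is an assumption of this example. In real code the simpler fix is the one given above: keep the value in an EMACS_INT and never narrow it.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for EMACS_INT on a 64-bit build. */
typedef int64_t toy_emacs_int;

/* Return 1 if POS can be stored in a plain int without loss. */
static int toy_fits_in_int(toy_emacs_int pos)
{
  return pos >= INT_MIN && pos <= INT_MAX;
}

/* Narrow POS to int only when that is safe.  Returns 1 and stores the
   value in *OUT on success; returns 0 and leaves *OUT alone otherwise. */
static int toy_narrow_position(toy_emacs_int pos, int *out)
{
  assert(out != NULL);
  if (!toy_fits_in_int(pos))
    return 0;
  *out = (int) pos;
  return 1;
}
```

A position one past INT_MAX is rejected here, whereas a bare `int thing = PT;' would silently store an implementation-defined (typically wrapped) value.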
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 16:39 ` Eli Zaretskii @ 2010-09-17 17:30 ` Lars Magne Ingebrigtsen 2010-09-17 18:49 ` Eli Zaretskii 0 siblings, 1 reply; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-17 17:30 UTC (permalink / raw) To: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> I've grepped through the code, and this seems to be used all over the >> place > > Each place where you see PT assigned to an int is a bug, please either > report it or fix it right away. Right. Things like int pt = PT; in buffer.c is easy enough, but is the following (from insdel.c) correct? int b = XINT (Fmarker_position (current_buffer->mark)); int e = XINT (make_number (PT)); I don't really understand the last line at all. It first creates a Lisp_Object number from PT, and then gets the C-level EMACS_INT value from that again? And then casts it to an int? -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 17:30 ` Lars Magne Ingebrigtsen @ 2010-09-17 18:49 ` Eli Zaretskii 0 siblings, 0 replies; 97+ messages in thread
From: Eli Zaretskii @ 2010-09-17 18:49 UTC (permalink / raw)
To: Lars Magne Ingebrigtsen, Chong Yidong; +Cc: emacs-devel

> From: Lars Magne Ingebrigtsen <larsi@gnus.org>
> Date: Fri, 17 Sep 2010 19:30:52 +0200
>
> Right. Things like
>
>   int pt = PT;
>
> in buffer.c is easy enough, but is the following (from insdel.c)
> correct?
>
>   int b = XINT (Fmarker_position (current_buffer->mark));

I think it's a bug, should use "EMACS_INT b".

>   int e = XINT (make_number (PT));
>
> I don't really understand the last line at all. It first creates a
> Lisp_Object number from PT, and then gets the C-level EMACS_INT value
> from that again? And then casts it to an int?

I think it's a bug, should use "EMACS_INT e = PT;"

As for why it converts it to Lisp integer and then back to a C EMACS_INT: it seems to be a historical curiosity. Revision 101018 made this change:

=== modified file 'src/insdel.c'
--- src/insdel.c 2010-08-07 19:39:04 +0000
+++ src/insdel.c 2010-08-07 20:26:55 +0000
@@ -2055,13 +2055,12 @@ prepare_to_modify_buffer (EMACS_INT star
       && !NILP (Vtransient_mark_mode)
       && NILP (Vsaved_region_selection))
     {
-      Lisp_Object b = Fmarker_position (current_buffer->mark);
-      Lisp_Object e = make_number (PT);
-      if (NILP (Fequal (b, e)))
-        {
-          validate_region (&b, &e);
-          Vsaved_region_selection = make_buffer_string (XINT (b), XINT (e), 0);
-        }
+      int b = XINT (Fmarker_position (current_buffer->mark));
+      int e = XINT (make_number (PT));
+      if (b < e)
+        Vsaved_region_selection = make_buffer_string (b, e, 0);
+      else if (b > e)
+        Vsaved_region_selection = make_buffer_string (e, b, 0);
     }

I guess the change mechanically used XINT to move the comparison to C-level integers, but didn't pay attention to the fact that it would be easier to just remove the make_number call and use PT directly.

Chong, did I miss something?
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 13:31 ` David Kastrup 2010-09-17 13:39 ` Lars Magne Ingebrigtsen @ 2010-09-17 13:49 ` Andreas Schwab 2010-09-17 13:55 ` Lars Magne Ingebrigtsen 1 sibling, 1 reply; 97+ messages in thread From: Andreas Schwab @ 2010-09-17 13:49 UTC (permalink / raw) To: David Kastrup; +Cc: emacs-devel David Kastrup <dak@gnu.org> writes: > Not really sure about that. > > + looking_at_literally (Lisp_Object string) > + { > + int start_byte = CHAR_TO_BYTE (PT); PT_BYTE > + int end_byte, end = PT + SCHARS (string); > > PT + SCHARS (string) can overflow here. Better check first rather than > later whether ZV - PT < SCHARS (string). > > Yes, I know that most-positive-fixnum <= MAX_INT/2 How do you "know" that? Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 97+ messages in thread
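The advice to test `ZV - PT < SCHARS (string)' before computing `PT + SCHARS (string)' can be written out as a tiny C helper. This is a sketch only: `toy_pos' and `toy_region_fits' are hypothetical names, and the arguments stand in for PT, ZV and SCHARS (string). The point is that the subtraction cannot overflow when 0 <= pt <= zv, whereas the addition can.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the integer type used for buffer positions. */
typedef int64_t toy_pos;

/* Return 1 if LEN characters starting at PT end at or before ZV.
   Written as a subtraction so that no intermediate value overflows. */
static int toy_region_fits(toy_pos pt, toy_pos zv, toy_pos len)
{
  assert(0 <= pt && pt <= zv && len >= 0);
  return !(zv - pt < len);
}
```

Only after this check succeeds is it safe to form `pt + len' as the end position.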
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 13:49 ` Andreas Schwab @ 2010-09-17 13:55 ` Lars Magne Ingebrigtsen 2010-09-17 14:31 ` Wojciech Meyer 2010-09-17 14:40 ` Andreas Schwab 0 siblings, 2 replies; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-17 13:55 UTC (permalink / raw) To: emacs-devel Andreas Schwab <schwab@linux-m68k.org> writes: >> PT + SCHARS (string) can overflow here. Better check first rather than >> later whether ZV - PT < SCHARS (string). >> >> Yes, I know that most-positive-fixnum <= MAX_INT/2 > > How do you "know" that? Don't the Lisp integers use a bit for the type tag? Anyway, I agree with the change (and I've fixed it along the lines of what David said), but it seems rather, er, unlikely that a buffer would approach MAX_INT in size. But perhaps that's just me. -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 13:55 ` Lars Magne Ingebrigtsen @ 2010-09-17 14:31 ` Wojciech Meyer 2010-09-17 14:40 ` Andreas Schwab 1 sibling, 0 replies; 97+ messages in thread From: Wojciech Meyer @ 2010-09-17 14:31 UTC (permalink / raw) To: emacs-devel On Fri, Sep 17, 2010 at 2:55 PM, Lars Magne Ingebrigtsen <larsi@gnus.org> wrote: > Andreas Schwab <schwab@linux-m68k.org> writes: > >>> PT + SCHARS (string) can overflow here. Better check first rather than >>> later whether ZV - PT < SCHARS (string). >>> >>> Yes, I know that most-positive-fixnum <= MAX_INT/2 >> >> How do you "know" that? > > Don't the Lisp integers use a bit for the type tag? A precise GC needs to distinguish between pointers and integers. Since pointers are always aligned, the least significant bit can be used to tag integers. However, I am not sure what approach Emacs takes; I am reading the docs. Wojciech ^ permalink raw reply [flat|nested] 97+ messages in thread
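The low-bit tagging idea mentioned above can be sketched generically in C. This is not a description of what Emacs does (the message itself leaves that open); it is a one-bit scheme with hypothetical names (`toy_value', `toy_make_fixnum', and so on), meant only to show why a tagged integer has a smaller range than the machine word.

```c
#include <assert.h>
#include <stdint.h>

/* A tagged machine word: low bit 1 means "integer", 0 means "pointer". */
typedef uintptr_t toy_value;

/* One bit goes to the tag, so the largest integer is half of INTPTR_MAX. */
#define TOY_MOST_POSITIVE_FIXNUM (INTPTR_MAX >> 1)

static toy_value toy_make_fixnum(intptr_t n)
{
  return ((uintptr_t) n << 1) | (uintptr_t) 1;
}

static int toy_is_fixnum(toy_value v)
{
  return (v & 1) != 0;
}

/* Relies on arithmetic right shift of negative values, which mainstream
   compilers provide although the C standard leaves it to the implementation. */
static intptr_t toy_fixnum_value(toy_value v)
{
  assert(toy_is_fixnum(v));
  return (intptr_t) v >> 1;
}
```

An aligned pointer always has its low bit clear, so it can never be mistaken for an integer under this scheme. Schemes that reserve more tag bits shrink the integer range further.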
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 13:55 ` Lars Magne Ingebrigtsen 2010-09-17 14:31 ` Wojciech Meyer @ 2010-09-17 14:40 ` Andreas Schwab 2010-09-17 14:47 ` Lars Magne Ingebrigtsen 1 sibling, 1 reply; 97+ messages in thread From: Andreas Schwab @ 2010-09-17 14:40 UTC (permalink / raw) To: emacs-devel Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > Andreas Schwab <schwab@linux-m68k.org> writes: > >>> PT + SCHARS (string) can overflow here. Better check first rather than >>> later whether ZV - PT < SCHARS (string). >>> >>> Yes, I know that most-positive-fixnum <= MAX_INT/2 >> >> How do you "know" that? > > Don't the Lisp integers use a bit for the type tag? most-positive-fixnum is a variable defined in `data.c'. Its value is 2305843009213693951 Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 14:40 ` Andreas Schwab @ 2010-09-17 14:47 ` Lars Magne Ingebrigtsen 2010-09-17 15:10 ` Andreas Schwab 0 siblings, 1 reply; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-17 14:47 UTC (permalink / raw) To: emacs-devel Andreas Schwab <schwab@linux-m68k.org> writes: >> Don't the Lisp integers use a bit for the type tag? > > most-positive-fixnum is a variable defined in `data.c'. > Its value is 2305843009213693951 And by that you mean "yes" or "no"? (format "%x" most-positive-fixnum) => "1fffffffffffffff" That's at least a few bits less than MAX_INT, isn't it? -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 14:47 ` Lars Magne Ingebrigtsen @ 2010-09-17 15:10 ` Andreas Schwab 2010-09-17 15:16 ` Lars Magne Ingebrigtsen 0 siblings, 1 reply; 97+ messages in thread From: Andreas Schwab @ 2010-09-17 15:10 UTC (permalink / raw) To: emacs-devel Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > Andreas Schwab <schwab@linux-m68k.org> writes: > >>> Don't the Lisp integers use a bit for the type tag? >> >> most-positive-fixnum is a variable defined in `data.c'. >> Its value is 2305843009213693951 > > And by that you mean "yes" or "no"? > > (format "%x" most-positive-fixnum) > => "1fffffffffffffff" > > That's at least a few bits less than MAX_INT, isn't it? $ printf '#include <limits.h>\nINT_MAX\n' | gcc -E -xc - | tail -n1 2147483647 Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 15:10 ` Andreas Schwab @ 2010-09-17 15:16 ` Lars Magne Ingebrigtsen 2010-09-17 15:39 ` Andreas Schwab 2010-09-17 16:14 ` Eli Zaretskii 0 siblings, 2 replies; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-17 15:16 UTC (permalink / raw) To: emacs-devel Andreas Schwab <schwab@linux-m68k.org> writes: >>>> Don't the Lisp integers use a bit for the type tag? >>> >>> most-positive-fixnum is a variable defined in `data.c'. >>> Its value is 2305843009213693951 >> >> And by that you mean "yes" or "no"? >> >> (format "%x" most-positive-fixnum) >> => "1fffffffffffffff" >> >> That's at least a few bits less than MAX_INT, isn't it? > > $ printf '#include <limits.h>\nINT_MAX\n' | gcc -E -xc - | tail -n1 > 2147483647 You're being rather gnomic. That most-positive-fixnum is a 64-bit number in your Emacs, but that you have an include file somewhere that says that INT_MAX is a 32-bit number doesn't really make much sense. On a 32-bit machine, this is what I get. (format "%x" most-positive-fixnum) => "fffffff" Instead of posting these snippets, it would make the discussion go much quicker if you actually said what it is you were trying to convey by posting these numbers without comment. -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 15:16 ` Lars Magne Ingebrigtsen @ 2010-09-17 15:39 ` Andreas Schwab 2010-09-17 15:42 ` Lars Magne Ingebrigtsen 2010-09-17 16:14 ` Eli Zaretskii 1 sibling, 1 reply; 97+ messages in thread From: Andreas Schwab @ 2010-09-17 15:39 UTC (permalink / raw) To: emacs-devel Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > You're being rather gnomic. That most-positive-fixnum is a 64-bit > number in your Emacs, but that you have an include file somewhere that > says that INT_MAX is a 32-bit number doesn't really make much sense. Which part of "int is a 32-bit type" do you not understand? Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 15:39 ` Andreas Schwab @ 2010-09-17 15:42 ` Lars Magne Ingebrigtsen 2010-09-17 16:04 ` Andreas Schwab 0 siblings, 1 reply; 97+ messages in thread From: Lars Magne Ingebrigtsen @ 2010-09-17 15:42 UTC (permalink / raw) To: emacs-devel Andreas Schwab <schwab@linux-m68k.org> writes: > Which part of "int is a 32-bit type" do you not understand? The part where you said "int is a 32-bit type, but Lisp Integers aren't"? See how much faster a discussion can go if you use words? -- (domestic pets only, the antidote for overdose, milk.) larsi@gnus.org * Lars Magne Ingebrigtsen ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 15:42 ` Lars Magne Ingebrigtsen @ 2010-09-17 16:04 ` Andreas Schwab 0 siblings, 0 replies; 97+ messages in thread From: Andreas Schwab @ 2010-09-17 16:04 UTC (permalink / raw) To: emacs-devel Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > Andreas Schwab <schwab@linux-m68k.org> writes: > >> Which part of "int is a 32-bit type" do you not understand? > > The part where you said "int is a 32-bit type, but Lisp Integers > aren't"? DK> Yes, I know that most-positive-fixnum <= MAX_INT/2 Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 15:16 ` Lars Magne Ingebrigtsen 2010-09-17 15:39 ` Andreas Schwab @ 2010-09-17 16:14 ` Eli Zaretskii 2010-09-17 19:22 ` James Cloos 1 sibling, 1 reply; 97+ messages in thread From: Eli Zaretskii @ 2010-09-17 16:14 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: emacs-devel > From: Lars Magne Ingebrigtsen <larsi@gnus.org> > Date: Fri, 17 Sep 2010 17:16:25 +0200 > > Andreas Schwab <schwab@linux-m68k.org> writes: > > >>>> Don't the Lisp integers use a bit for the type tag? > >>> > >>> most-positive-fixnum is a variable defined in `data.c'. > >>> Its value is 2305843009213693951 > >> > >> And by that you mean "yes" or "no"? > >> > >> (format "%x" most-positive-fixnum) > >> => "1fffffffffffffff" > >> > >> That's at least a few bits less than MAX_INT, isn't it? > > > > $ printf '#include <limits.h>\nINT_MAX\n' | gcc -E -xc - | tail -n1 > > 2147483647 > > You're being rather gnomic. Psst, Lars: it's pointless to ask Andreas for human-readable explanations. You won't get them. He enjoys to get you puzzled. The issue here is that EMACS_INT can be a 64-bit type (on a 64-bit host), but MAX_INT is always the maximum possible value of a 32-bit int, even on a 64-bit machine. ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 16:14 ` Eli Zaretskii @ 2010-09-17 19:22 ` James Cloos 0 siblings, 0 replies; 97+ messages in thread From: James Cloos @ 2010-09-17 19:22 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Lars Magne Ingebrigtsen, emacs-devel >>>>> "EZ" == Eli Zaretskii <eliz@gnu.org> writes: EZ> The issue here is that EMACS_INT can be a 64-bit type (on a 64-bit EZ> host), but MAX_INT is always the maximum possible value of a 32-bit EZ> int, even on a 64-bit machine. More precisely that many 64 bit systems have sizeof(int)*CHAR_BIT == 32. Alpha is an obvious exception. It would have been much easier had someone just wrote s/INT_MAX/LONG_MAX/g. (Except on DOZE64, of course, where sizeof(long)==sizeof(int)==4.) -JimC -- James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6 ^ permalink raw reply [flat|nested] 97+ messages in thread
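The point of this sub-thread can be checked mechanically: compare the most-positive-fixnum value quoted earlier against the limits of `int' and `long' on the platform at hand. The sketch below uses hypothetical helper names (`toy_bits_of_int', `toy_fixnum_fits_int', `toy_fixnum_fits_long'); the constant is the value Andreas quoted from his 64-bit Emacs, and the results are platform-dependent by design.

```c
#include <assert.h>
#include <limits.h>

/* Value of most-positive-fixnum quoted in the thread (64-bit build). */
#define TOY_QUOTED_FIXNUM 2305843009213693951LL

/* Width of int in bits on this platform. */
static int toy_bits_of_int(void)
{
  return (int) (sizeof (int) * CHAR_BIT);
}

/* Does the quoted fixnum fit in an int?  On the common platforms where
   int is 32 bits wide the answer is no. */
static int toy_fixnum_fits_int(void)
{
  return TOY_QUOTED_FIXNUM <= (long long) INT_MAX;
}

/* Does it fit in a long?  Yes where long is 64 bits, no where long is
   32 bits (as on 64-bit Windows). */
static int toy_fixnum_fits_long(void)
{
  return TOY_QUOTED_FIXNUM <= (long long) LONG_MAX;
}
```

Neither `int' nor `long' is therefore a portable home for a buffer position; a type sized like EMACS_INT is.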
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 13:09 ` Lars Magne Ingebrigtsen 2010-09-17 13:31 ` David Kastrup @ 2010-09-17 17:40 ` Stephen J. Turnbull 2010-09-17 19:40 ` Lars Magne Ingebrigtsen 1 sibling, 1 reply; 97+ messages in thread
From: Stephen J. Turnbull @ 2010-09-17 17:40 UTC (permalink / raw)
To: Lars Magne Ingebrigtsen; +Cc: emacs-devel

Lars Magne Ingebrigtsen writes:

 > I think that's a rather ... stunning approach, but might explain
 > many things about XEmacs.

I rather suspect it does. ;-)

 > I've presented a use case, and I've demonstrated how all the
 > alternative implementations are 50-150% slower than my suggested
 > new implementation, and I've done the implementation, which turned
 > out to be totally trivial.

Nothing that deals with Emacs strings or buffers is totally trivial,
as a general principle. Then again, it looks like David has
discovered at least one bug (texts with different values of
multibyteness), maybe more (bounds checking and integral type
confusion), in your "totally trivial" implementation already.

 > What more do you need?

First, I'm curious which machine and what data (buffer) you're using
that took 9 seconds to run that benchmark. When I ran the benchmark
(environment described below) using XEmacs on a 1.8GHz Opteron
machine (quad core, but XEmacs can't take advantage of it), I
discovered that 10,000 iterations take ~300ms uncompiled, ~200ms
compiled, and ~150ms compiled with the call to `search-forward' in
question replaced by a call to `ignore'. I don't see a win here
unless you're really calling that function 10,000 times or more, and
in a very tight loop.

So, is it really Gnus's habit to execute that form 10,000 times in a
loop so that its execution time dominates the user's perceived lag
time? I bet that most uses involve parsing 20-40 RFC 822-style
headers, and the rest parse lines lazily. If so, even the reported
9 second benchmark really amounts to a total of 50-100ms, which is
less than the "just noticeable difference" for a fast human.

***** You can stop reading unless you really want to know the details. *****

Specifically, profiling 10000 iterations in a 997-byte buffer
containing three instances of "^(defun " (none at BOB) returned almost
faster than I can detect (288ms), based on

    (defun is-the-string-following-point-equal-to-this-string-p (s)
      (search-forward s (+ (point) (length s)) t))

    (defun test ()
      (while (or (is-the-string-following-point-equal-to-this-string-p "(defun ")
                 (search-forward "\n(defun " nil t))
        (forward-line 1)))

    (profile-expression
     '(let ((i 10000))
        (while (> i 0)
          (goto-char (point-min))
          (test)
          (setq i (1- i)))))

giving the profiling results below (note, all functions defined above
are *uncompiled*). (Note that the total in the Ticks column may not
be the sum of the Ticks; this apparently has to do with reentering the
profiler, and Ben didn't think it was worth fixing.)

Function Name             Ticks/Total  %Usage   Calls  GC-Usage/Total
========================  =====/===== =======  ======  ===============
search-forward              121/  121  42.160   80000
(profile overhead)          101/  101  35.192
is-the-string-following-point-equal-to-this-string-p
                             24/  129   8.362   40000
while                        11/  288   3.833   10001
or                            8/  231   2.787   40000
+                             6/    6   2.091   40000
forward-line                  5/    5   1.742   30000
setq                          5/    6   1.742   10000
point                         2/    2   0.697   40000
test                          2/  262   0.697   10000
length                        1/    1   0.348   40000
>                             1/    1   0.348   10001
let                           0/  288   0.000       1
point-min                     0/    0   0.000   10000
1-                            0/    0   0.000   10000
goto-char                     0/    0   0.000   10000
------------------------------------------------------------
Total                             288 100.000  380005  0

Ticks/Total     = Ticks this function/this function and descendants
Calls           = Number of calls to this function
GC-Usage/Total  = Lisp allocation this function/this function and descendants
One tick        = 1 ms

Compiled (including the profiling expression, that's `test1'), the
result is

Function Name             Ticks/Total  %Usage   Calls  GC-Usage/Total
========================  =====/===== =======  ======  ===============
search-forward              112/  112  59.574   80000
(profile overhead)           42/   42  22.340
is-the-string-following-point-equal-to-this-string-p
                             16/   85   8.511   40000
test                         13/  181   6.915   10000
test1                         5/  190   2.660       1
------------------------------------------------------------
Total                             190 100.000  130003  0

Ticks/Total     = Ticks this function/this function and descendants
Calls           = Number of calls to this function
GC-Usage/Total  = Lisp allocation this function/this function and descendants
One tick        = 1 ms

And here's the result with the search-forward in the predicate
replaced with ignore (which apparently conses because of the &rest
argument, and I'm not sure why the number is so huge):

Function Name             Ticks/Total  %Usage   Calls  GC-Usage/Total
========================  =====/===== =======  ======  ===============
search-forward               74/   74  44.578   40000
(profile overhead)           32/   32  19.277
test                         24/  158  14.458   10000        0/3840000
ignore                       19/   22  11.446   40000  3840000/3840000
is-the-string-following-point-equal-to-this-string-p
                             11/   46   6.627   40000        0/3840000
test1                         6/  166   3.614       1        0/3840000
------------------------------------------------------------
Total                               0 100.000  130003          3840000

Ticks/Total     = Ticks this function/this function and descendants
Calls           = Number of calls to this function
GC-Usage/Total  = Lisp allocation this function/this function and descendants
One tick        = 1 ms

^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-17 17:40 ` Stephen J. Turnbull @ 2010-09-17 19:40 ` Lars Magne Ingebrigtsen 0 siblings, 0 replies; 97+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-17 19:40 UTC (permalink / raw)
To: emacs-devel

"Stephen J. Turnbull" <stephen@xemacs.org> writes:

> Then again, it looks like David has discovered at least one bug
> (texts with different values of multibyteness), maybe more (bounds
> checking and integral type confusion), in your "totally trivial"
> implementation already.

Well, I didn't say the trivial code was bug-free. :-)

> First, I'm curious which machine and what data (buffer) you're using
> that took 9 seconds to run that benchmark.

I was running it over the gnus-sum.el file.

> So, is it really Gnus's habit to execute that form 10,000 times in a
> loop so that its execution time dominates the user's perceived lag
> time? I bet that most uses involve parsing 20-40 RFC 822-style
> headers, and the rest parse lines lazily. If so, even the reported 9
> second benchmark really amounts to a total of 50-100ms, which is less
> than the "just noticeable difference" for a fast human.

The reason I thought of this again now is that I'm doing IMAP
handling. The only way in IMAP to get info on marks and stuff is to
get one line per message, so if you have a 100K mailbox, it's going
to take some time to sync up your marks.

Your example of "20-40" is somewhat irrelevant. Of course everything
is fast enough if you just have little enough data. The problem is
getting things to work fast enough in the presence of a lot of data.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen

^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 14:59 ` Stefan Monnier 2010-09-15 15:09 ` Lars Magne Ingebrigtsen @ 2010-09-15 15:46 ` Helmut Eller 2010-09-15 16:28 ` Thomas Lord 1 sibling, 1 reply; 97+ messages in thread
From: Helmut Eller @ 2010-09-15 15:46 UTC (permalink / raw)
To: emacs-devel

* Stefan Monnier [2010-09-15 14:59] writes:

>>> - The main problem with Emacs regexps right now is that they have
>>>   pathological cases where the match-time is enormous (potentially
>>>   exponential explosion in the size of the input string). To be
>>>   worthwhile a replacement should address this problem, which basically
>>>   means it should not be based on backtracking.
>> Is it possible (theoretically) to implement all of Emacs regexps without
>> backtracking? In particular those with back-references (\N) seem
>> problematic. Or is it necessary to recognize "optimizable" regexps
>> before using a different regexp engine?
>
> IIRC regexps without back-refs can be matched (and searched) in O(N)
> where N is the length of the input. With back-refs, I think (not sure)
> the theoretical bound is O(N^2), which requires
> a non-backtracking algorithm.
>
> So yes, we'd need to handle back-refs specially. Several regexp engines
> do that already (they have a few different inner engines and choose
> which one to use based on the particular regexp at hand).

After googling a bit I found this page
http://swtch.com/~rsc/regexp/regexp1.html
which in turn links to this
http://perl.plover.com/NPC/NPC-3SAT.html
which says that regexp matching with backreferences is NP-complete.

Cox (the first page) seems to say that backtracking-with-memoization is
linear time at the expense of O(N) space.

Helmut

^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 15:46 ` Helmut Eller @ 2010-09-15 16:28 ` Thomas Lord 0 siblings, 0 replies; 97+ messages in thread
From: Thomas Lord @ 2010-09-15 16:28 UTC (permalink / raw)
To: Helmut Eller; +Cc: emacs-devel

On Wed, 2010-09-15 at 17:46 +0200, Helmut Eller wrote:
> * Stefan Monnier [2010-09-15 14:59] writes:
> > IIRC regexps without back-refs can be matched (and searched) in O(N)
> > where N is the length of the input.

Not quite.

Let R be the length of the pattern and L be the length of the string
we seek to match. Assume that the pattern is a true regular
expression (no back-references, no sub-expression position reporting,
etc.)

The problem itself is O(R * L): no algorithm can guarantee doing
better than the product of the two lengths.

There is an algorithm (compiling the pattern to a DFA first) which is
O(2^R + L). Formally it is suboptimal. There is a catch though. For
many common patterns, either R is very small or we don't actually
need anything like 2^R steps to compile the pattern, and so the
*expected* case (for these patterns) is O(L).

The Thompson "DFA caching algorithm" described briefly in the Dragon
compiler book is quite attractive because it offers (for true regular
expressions) a worst case complexity of O(R * L) ... so it is optimal
... yet also delivers much closer to O(L) for many common patterns.
The algorithm is a bit unattractive because it is tricky to implement
well, very hard to generalize beyond true regular expressions, and
can easily lose to more naive NFA algorithms by small but annoying
constant factors.

> > With back-refs, I think (not sure)
> > the theoretical bound is O(N^2), which requires
> > a non-backtracking algorithm.
> > So yes, we'd need to handle back-refs specially. Several regexp engines
> > do that already (they have a few different inner engines and choose
> > which one to use based on the particular regexp at hand).
>
> After googleing a bit I found this page
> http://swtch.com/~rsc/regexp/regexp1.html
> which again links to this
> http://perl.plover.com/NPC/NPC-3SAT.html
> which says that regexp matching with backreferences is NP-complete.

The best known algorithms are not O(N^2) but, rather, O(2^N) in the
worst case. It's dismal how such an innocent-seeming feature can
cause such havoc.

> Cox (the first page) seems to say that backtracking-with-memoization is
> linear time at the expense of O(N) space.

If he said that regarding worst case performance and without
mentioning some specific subset of true regular expressions that he's
talking about, he must have made a mistake. For true regular
expressions, the simpler case, you cannot beat O(R * L) time as the
worst case.

-t

^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: Compiling Elisp to a native code with a GCC plugin 2010-09-15 14:07 ` Stefan Monnier 2010-09-15 14:27 ` Helmut Eller @ 2010-09-15 21:04 ` Leo 1 sibling, 0 replies; 97+ messages in thread
From: Leo @ 2010-09-15 21:04 UTC (permalink / raw)
To: emacs-devel

On 2010-09-15 15:07 +0100, Stefan Monnier wrote:
> - The main problem with Emacs regexps right now is that they have
>   pathological cases where the match-time is enormous (potentially
>   exponential explosion in the size of the input string). To be
>   worthwhile a replacement should address this problem, which basically
>   means it should not be based on backtracking.

Are there any good free regexp libs that don't suffer from this
problem? I found http://code.google.com/p/re2/ via Google, but it is
written in C++.

Leo

^ permalink raw reply [flat|nested] 97+ messages in thread
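The non-backtracking matching discussed in the messages above can be illustrated with a small NFA state-set simulation. This is only a sketch under loud assumptions: it is not the Emacs regexp engine, not RE2, and not the DFA-caching algorithm from the thread. It handles a deliberately tiny pattern language (literal characters, `.` for any character, and a postfix `*` on a single preceding item; no groups, alternation, escapes or back-references), it matches the whole text only, and the names `nfa_match`, `parse`, `add_state` and `MAX_ITEMS` are made up for illustration. A `*` with nothing to apply to is simply treated as a literal.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_ITEMS 64   /* arbitrary cap on pattern items for this sketch */

/* One pattern item: a literal character or '.', optionally starred. */
struct item { char ch; bool star; };

/* Parse PATTERN into ITEMS and return the number of items. */
static size_t parse(const char *pattern, struct item *items)
{
    size_t n = 0;
    for (size_t i = 0; pattern[i] != '\0'; i++) {
        assert(n < MAX_ITEMS);
        items[n].ch = pattern[i];
        items[n].star = (pattern[i + 1] == '*');
        if (items[n].star)
            i++;                    /* consume the '*' */
        n++;
    }
    return n;
}

/* Add state S to SET, plus every state reachable by skipping starred
   items. State N means "whole pattern matched". Stopping at a state
   that is already set keeps the work per input character at O(N). */
static void add_state(bool *set, const struct item *items, size_t n, size_t s)
{
    while (!set[s]) {
        set[s] = true;
        if (s < n && items[s].star)
            s++;
        else
            break;
    }
}

/* Return true if PATTERN matches all of TEXT. The set of live states
   is advanced once per input character, so nothing is ever retried. */
bool nfa_match(const char *pattern, const char *text)
{
    struct item items[MAX_ITEMS];
    bool cur[MAX_ITEMS + 1] = { false };
    bool next[MAX_ITEMS + 1];
    size_t n = parse(pattern, items);

    add_state(cur, items, n, 0);
    for (const char *p = text; *p != '\0'; p++) {
        memset(next, 0, sizeof next);
        for (size_t s = 0; s < n; s++) {
            if (!cur[s])
                continue;
            if (items[s].ch != '.' && items[s].ch != *p)
                continue;
            /* A starred item may consume more input; others advance. */
            add_state(next, items, n, items[s].star ? s : s + 1);
        }
        memcpy(cur, next, sizeof cur);
    }
    return cur[n];
}
```

Because each input character costs one pass over at most R + 1 states, the running time grows with the product of pattern length and text length, even for a pattern such as `a*a*a*a*a*b` against a long run of `a`, which is the kind of input on which a naive backtracking matcher blows up.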