* State-machine based syntax highlighting @ 2006-12-07 6:14 spamfilteraccount 2006-12-07 10:53 ` Robert Thorpe 0 siblings, 1 reply; 28+ messages in thread From: spamfilteraccount @ 2006-12-07 6:14 UTC (permalink / raw)

I just read that in the text editor FTE, syntax highlighting can be defined with state machines.

Here's a Lua example I found: http://t-o-m-e.net/tmp/m_lua.fte

Does anyone know the advantages and disadvantages of this method compared to the regexp-based Emacs approach? E.g. would it work faster than the current Emacs implementation?

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: State-machine based syntax highlighting 2006-12-07 6:14 State-machine based syntax highlighting spamfilteraccount @ 2006-12-07 10:53 ` Robert Thorpe 2006-12-07 11:56 ` spamfilteraccount 0 siblings, 1 reply; 28+ messages in thread From: Robert Thorpe @ 2006-12-07 10:53 UTC (permalink / raw) spamfilteraccount@gmail.com wrote: > I just read that in the text editor FTE does syntax highlighting can be > defined with state-machines. > > Here's a LUA example I found: http://t-o-m-e.net/tmp/m_lua.fte > > Does anyone know the dis/advantages of this method compared to the > regexp-based emacs approach? Regexp are state machines. Or, to be more precise the regexp engine compiles regexp it is given into discrete finite state machines. Defining state machines manually is usually worse than generating them from regexp normally, because a human cannot do the regexp optimizations that the regexp engine can. In my view the real way to improve Emacs syntax highlighting would be to make it based on parsing. > E.g. would it work faster than the current > emacs implementation? Do you have a problem with the speed of a regexp you have written? If so it's probably down to the regexp or the way you're trying to do things. Post the code here and someone may be able to help you. ^ permalink raw reply [flat|nested] 28+ messages in thread
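As an illustration of the point above that a regexp and a state machine can describe the same set of tokens, here is a minimal sketch in Python. The token classes (double-quoted strings and "--" comments, roughly as in Lua) and both function names are assumptions made for the example; they are not taken from FTE or Emacs. Note also that Python's `re` module is a backtracking matcher, so the sketch shows that the two descriptions recognise the same spans, not that they run on the same machinery.

```python
import re


def highlight_fsm(line):
    """Hand-written state machine: return (start, end, kind) spans for one line."""
    spans = []
    state = "code"
    start = 0
    i = 0
    while i < len(line):
        ch = line[i]
        if state == "code":
            if ch == '"':
                state, start = "string", i
            elif line.startswith("--", i):
                # A comment runs to the end of the line.
                spans.append((i, len(line), "comment"))
                return spans
        elif state == "string":
            if ch == "\\":
                i += 1  # skip the escaped character
            elif ch == '"':
                spans.append((start, i + 1, "string"))
                state = "code"
        i += 1
    if state == "string":
        # Unterminated string: highlight to the end of the line anyway.
        spans.append((start, len(line), "string"))
    return spans


# The same two token classes written as one regexp (closing quote optional,
# so an unterminated string is still highlighted).
TOKEN_RE = re.compile(r'(?P<string>"(?:\\.?|[^"\\])*"?)|(?P<comment>--.*)')


def highlight_regexp(line):
    """Regexp version: the matcher's own states replace the explicit ones."""
    return [(m.start(), m.end(), m.lastgroup) for m in TOKEN_RE.finditer(line)]


if __name__ == "__main__":
    sample = 'print("a -- b", x) -- say "hi"'
    print(highlight_fsm(sample))
    print(highlight_regexp(sample))
```

The regexp version is much shorter, which is one practical argument for generating the state machine rather than writing it by hand.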
* Re: State-machine based syntax highlighting 2006-12-07 10:53 ` Robert Thorpe @ 2006-12-07 11:56 ` spamfilteraccount 2006-12-07 12:42 ` Robert Thorpe 0 siblings, 1 reply; 28+ messages in thread From: spamfilteraccount @ 2006-12-07 11:56 UTC (permalink / raw)

Robert Thorpe wrote:
>
> In my view the real way to improve Emacs syntax highlighting would be
> to make it based on parsing.
>

Yes, it could be better, though in this case Emacs would rely on external tools to do the actual parsing, because I don't think the syntax parsing of every language should be reimplemented in elisp.

That's not a big deal, since if I need to work with C files then I usually have the C compiler installed. The only thing needed is a compiler for the given language which outputs syntax information for the source file. I don't know if GCC can do that.
* Re: State-machine based syntax highlighting 2006-12-07 11:56 ` spamfilteraccount @ 2006-12-07 12:42 ` Robert Thorpe 2006-12-07 14:27 ` spamfilteraccount 0 siblings, 1 reply; 28+ messages in thread From: Robert Thorpe @ 2006-12-07 12:42 UTC (permalink / raw)

spamfilteraccount@gmail.com wrote:
> Robert Thorpe wrote:
> >
> > In my view the real way to improve Emacs syntax highlighting would be
> > to make it based on parsing.
> >
>
> Yes, it could be better, though in this case emacs would rely on
> external tools doing the actual parsing, because I don't think the
> syntax parsing of every language should be reimplemented in elisp.
>
> That's not a big deal, since if I need to work with c files then
> usually have the c compiler installed.

No, it would probably have to be reimplemented inside Emacs. There are many differences between parsing a language in order to compile it and parsing a language in order to perform syntax highlighting and movement commands.

In the latter case you have to be able to tolerate expressions near point that are incorrectly formed because the user is still typing them. You also have to be able to start the process from any given point, so that if the user jumps from line 1 to line 5794 you don't have to parse everything in the intervening code. Also, when highlighting you don't care much about the contents of the code.

The Emacs "Semantic" package already does much of this, as do some other editors.
* Re: State-machine based syntax highlighting 2006-12-07 12:42 ` Robert Thorpe @ 2006-12-07 14:27 ` spamfilteraccount 2006-12-07 14:39 ` Robert Thorpe 0 siblings, 1 reply; 28+ messages in thread From: spamfilteraccount @ 2006-12-07 14:27 UTC (permalink / raw) Robert Thorpe wrote: > > The Emacs "Semantic" package already does much of this, so do some > other editors. It would make more sense to create one such parser than reimplementing parsing in every editor... I did a quick search and found this page http://harmonia.cs.berkeley.edu/harmonia/projects/harmonia-mode/doc/index.html with a demo xemacs package with syntax highlighting and stuff. Looked interesting. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: State-machine based syntax highlighting 2006-12-07 14:27 ` spamfilteraccount @ 2006-12-07 14:39 ` Robert Thorpe 2006-12-07 17:02 ` spamfilteraccount 0 siblings, 1 reply; 28+ messages in thread From: Robert Thorpe @ 2006-12-07 14:39 UTC (permalink / raw)

spamfilteraccount@gmail.com wrote:
> Robert Thorpe wrote:
> >
> > The Emacs "Semantic" package already does much of this, so do some
> > other editors.
>
> It would make more sense to create one such parser than reimplementing
> parsing in every editor...

In many ways it would. But I expect it will be reimplemented in every editor, for several reasons:
* The insides of different editors work very differently.
* External dependencies make building harder and irritate people.
* Elisp is considerably nicer than many programming languages, so reimplementing is not so hard.
* Many people will make parsers closed-source, or refuse to assign copyright to the FSF.
* People don't like helping other editors, so they don't offer functionality in an easily usable form.
* GNU will not want to offer _parsers_ in an easily usable form, because doing so would allow proprietary compilers to be built very easily.

I'm not saying this is necessarily the best way, but I expect it's what will happen.

> I did a quick search and found this page
>
> http://harmonia.cs.berkeley.edu/harmonia/projects/harmonia-mode/doc/index.html
>
> with a demo xemacs package with syntax highlighting and stuff. Looked
> interesting.

Interesting.
* Re: State-machine based syntax highlighting 2006-12-07 14:39 ` Robert Thorpe @ 2006-12-07 17:02 ` spamfilteraccount 2006-12-07 17:42 ` Stefan Monnier [not found] ` <mailman.1644.1165513359.2155.help-gnu-emacs@gnu.org> 0 siblings, 2 replies; 28+ messages in thread From: spamfilteraccount @ 2006-12-07 17:02 UTC (permalink / raw)

Robert Thorpe wrote:
>
> > I did a quick search and found this page
> >
> > http://harmonia.cs.berkeley.edu/harmonia/projects/harmonia-mode/doc/index.html
> >
> > with a demo xemacs package with syntax highlighting and stuff. Looked
> > interesting.
>
> Interesting.

I wondered why they supported XEmacs only, so I downloaded the source. It seems they wrote XEmacs extensions in C which have to be compiled into XEmacs.

That is not a usual way to extend an Emacs, but it is probably advantageous from a performance point of view.

The idea had already occurred to me that font locking should be implemented in pure C in Emacs for speed, but I guess it's kind of against the extensible editor concept or something.
* Re: State-machine based syntax highlighting 2006-12-07 17:02 ` spamfilteraccount @ 2006-12-07 17:42 ` Stefan Monnier [not found] ` <mailman.1644.1165513359.2155.help-gnu-emacs@gnu.org> 1 sibling, 0 replies; 28+ messages in thread From: Stefan Monnier @ 2006-12-07 17:42 UTC (permalink / raw)

>> > I did a quick search and found this page
>> > http://harmonia.cs.berkeley.edu/harmonia/projects/harmonia-mode/doc/index.html
>> > with a demo xemacs package with syntax highlighting and stuff. Looked
>> > interesting.
>> Interesting.
> I wondered why they supported xemacs only, so I downloaded the source.
> Seems they wrote xemacs extensions in c which have to be compiled into
> xemacs.
> Not a usual way to extend an emacs, but probably advantageous from a
> performance point of view.
> The idea already occurred to me that font locking should be implemented
> in pure c in emacs for speed, but I guess it's kind of against the
> extensible editor concept or something.

Actually, font-locking *is* implemented in C. The elisp part usually takes a negligible amount of time. The problems start appearing when the functionality of the C code is not sufficient and you start trying to parse the code in elisp, which is slow.

Stefan
* Re: State-machine based syntax highlighting [not found] ` <mailman.1644.1165513359.2155.help-gnu-emacs@gnu.org> @ 2006-12-07 18:35 ` spamfilteraccount 2006-12-07 18:57 ` Robert Thorpe 2006-12-07 19:02 ` Stefan Monnier 0 siblings, 2 replies; 28+ messages in thread From: spamfilteraccount @ 2006-12-07 18:35 UTC (permalink / raw) Stefan Monnier wrote: > > Actually, font-locking *is* implemented in C. The elisp part usually takes > a negligible amount of time. The problem start appearing when the > functionality of the C code is not sufficient and you start trying to parse > the code in elisp, which is slow. Good to know. I thought font-lock was implemented in elisp and didn't bother to check. BTW, I checked the situation in the enemy camp and seems they also have problems with performance: - The colors are wrong when scrolling bottom to top. Vim doesn't read the whole file to parse the text. It starts parsing wherever you are viewing the file. That saves a lot of time, but sometimes the colors are wrong. A simple fix is hitting CTRL-L. Or scroll back a bit and then forward again. For a real fix, see |:syn-sync|. Some syntax files have a way to make it look further back, see the help for the specific syntax file. For example, |tex.vim| for the TeX syntax. ... Displaying text in color takes a lot of effort. If you find the displaying too slow, you might want to disable syntax highlighting for a moment: :syntax clear When editing another file (or the same one) the colors will come back. http://vimdoc.sourceforge.net/htmldoc/usr_06.html ^ permalink raw reply [flat|nested] 28+ messages in thread
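The Vim documentation quoted above describes colors going wrong when highlighting starts in the middle of a file, and a fix that looks further back. The following minimal Python sketch shows why that happens. It is only loosely analogous to Vim's `:syn-sync` and is not Vim's implementation; the C-style `/* ... */` comment syntax, the per-line classification, and the names `scan` and `scan_with_sync` are assumptions made for the example.

```python
def scan(lines, start, state="code"):
    """Label each line from `start` on as 'comment' or 'code'.

    A line is labelled 'comment' if it begins inside a /* ... */ block
    or opens one; `state` is the state assumed at the first scanned line.
    """
    labels = []
    for line in lines[start:]:
        kind = state
        i = 0
        while i < len(line):
            if state == "code" and line.startswith("/*", i):
                state, kind, i = "comment", "comment", i + 2
            elif state == "comment" and line.startswith("*/", i):
                state, i = "code", i + 2
            else:
                i += 1
        labels.append(kind)
    return labels


def scan_with_sync(lines, start, lookback):
    """Start scanning `lookback` lines earlier, then drop the extra labels."""
    begin = max(0, start - lookback)
    return scan(lines, begin)[start - begin:]


if __name__ == "__main__":
    text = ["int a;", "/* start", "   middle", "   end */", "int b;"]
    print(scan(text, 0))               # whole file: always right, but slow
    print(scan(text, 2))               # starts inside the comment: wrong
    print(scan_with_sync(text, 2, 1))  # one line of lookback is enough here
```

How far back to look is the trade-off: too little and the colors are wrong, too much and the speed advantage of not reading the whole file is lost.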
* Re: State-machine based syntax highlighting 2006-12-07 18:35 ` spamfilteraccount @ 2006-12-07 18:57 ` Robert Thorpe 2006-12-07 20:24 ` Perry Smith ` (2 more replies) 2006-12-07 19:02 ` Stefan Monnier 1 sibling, 3 replies; 28+ messages in thread From: Robert Thorpe @ 2006-12-07 18:57 UTC (permalink / raw) spamfilteraccount@gmail.com wrote: > Stefan Monnier wrote: > > > > Actually, font-locking *is* implemented in C. The elisp part usually takes > > a negligible amount of time. The problem start appearing when the > > functionality of the C code is not sufficient and you start trying to parse > > the code in elisp, which is slow. > > Good to know. I thought font-lock was implemented in elisp and didn't > bother to check. Precisely speaking... The code that determines what rules are used to font-lock text is in Elisp. The regexp engine that finds the things to be font-locked is in the core of Emacs. The colourisation is implemented in the Emacs core. Overall this means that most of the work is in the Emacs core. If parsing were to be used to support syntax highlighting then maybe some work would have to be done to avoid having to use Elisp. But I'm not sure since it would still require loads of regexps and they would probably still eat up a lot of the runtime. > BTW, I checked the situation in the enemy camp and seems they also have > problems with performance: Almost every editor does with both large files and syntactically complex languages. As far as I know, Emacs is a little slower than Vim at least in some cases. If you want to avoid the problem then use <4000 line files and write your programs in Lisp. Those are good things to do anyway ;) ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: State-machine based syntax highlighting 2006-12-07 18:57 ` Robert Thorpe @ 2006-12-07 20:24 ` Perry Smith 2006-12-08 7:33 ` spamfilteraccount [not found] ` <mailman.1653.1165523111.2155.help-gnu-emacs@gnu.org> 2 siblings, 0 replies; 28+ messages in thread From: Perry Smith @ 2006-12-07 20:24 UTC (permalink / raw) Cc: help-gnu-emacs

On Dec 7, 2006, at 12:57 PM, Robert Thorpe wrote:
> spamfilteraccount@gmail.com wrote:
>> Stefan Monnier wrote:
>>>
>>> Actually, font-locking *is* implemented in C. The elisp part
>>> usually takes
>>> a negligible amount of time. The problem start appearing when the
>>> functionality of the C code is not sufficient and you start
>>> trying to parse
>>> the code in elisp, which is slow.
>>
>> Good to know. I thought font-lock was implemented in elisp and didn't
>> bother to check.
>
> Precisely speaking...
> The code that determines what rules are used to font-lock text is in
> Elisp.
> The regexp engine that finds the things to be font-locked is in the
> core of Emacs.
> The colourisation is implemented in the Emacs core.

Instead of a state machine, how about an LALR parser? It would be a fun project to take the LALR table generation logic from bison and smash it into Emacs, along with some predefined actions and hooks back into Emacs. The grammars could be loaded when needed.

Perry Smith ( pedz@easesoftware.com )
Ease Software, Inc. ( http://www.easesoftware.com )
Low cost SATA Disk Systems for IBMs p5, pSeries, and RS/6000 AIX systems
* Re: State-machine based syntax highlighting 2006-12-07 18:57 ` Robert Thorpe 2006-12-07 20:24 ` Perry Smith @ 2006-12-08 7:33 ` spamfilteraccount 2006-12-08 8:10 ` Tim X [not found] ` <mailman.1653.1165523111.2155.help-gnu-emacs@gnu.org> 2 siblings, 1 reply; 28+ messages in thread From: spamfilteraccount @ 2006-12-08 7:33 UTC (permalink / raw)

Robert Thorpe wrote:
>
> If parsing were to be used to support syntax highlighting then maybe
> some work would have to be done to avoid having to use Elisp. But I'm
> not sure since it would still require loads of regexps and they would
> probably still eat up a lot of the runtime.
>

That may be true, but the advantage is that parsing actually understands the code rather than just matching it against some regexps, so it could be used for much more than syntax highlighting (some kind of error checking, code completion, etc.).

I think that if there are already parsers written in elisp, they should be integrated into the official Emacs distribution (e.g. in the directory lisp/parsers), so that packages can use them to understand the code better.
* Re: State-machine based syntax highlighting 2006-12-08 7:33 ` spamfilteraccount @ 2006-12-08 8:10 ` Tim X 2006-12-08 8:36 ` spamfilteraccount ` (4 more replies) 0 siblings, 5 replies; 28+ messages in thread From: Tim X @ 2006-12-08 8:10 UTC (permalink / raw)

"spamfilteraccount@gmail.com" <spamfilteraccount@gmail.com> writes:
> Robert Thorpe wrote:
>>
>> If parsing were to be used to support syntax highlighting then maybe
>> some work would have to be done to avoid having to use Elisp. But I'm
>> not sure since it would still require loads of regexps and they would
>> probably still eat up a lot of the runtime.
>>
>
> That may be true, but the advantage is that parsing actually
> understands code, not just matches it with some regexps, so it could be
> used for much more than syntax highlighting (some kind of error
> checking, code completion, etc.).
>
> I think if there are already parsers written in elisp they should be
> integrated into the official emacs distribution (e.g. in directory
> lisp/parsers), so that packages can use them to understand the code
> better.
>

Have a look at http://cedet.sourceforge.net/

The combination of semantic and cedet is, amongst other things, aimed at providing parse-based functionality for Emacs. Some of this is (I think) going to be bundled in with Emacs 22. The idea is to provide a more powerful development environment that can do things like code completion based on more than just abbrevs and dynamic completion based on recently used keywords and regexps.

The problem with parse-based analysis is that you need a built-in parser for every language the editor is used to develop in, and this is not a trivial task. I suspect some sort of plugin architecture that is able to use stand-alone parsers for a language of interest would probably be the way to go, as it is unlikely that even a small subset of the languages developed within an Emacs environment can have a parser developed in elisp which is readily maintained.
Tim -- tcross (at) rapttech dot com dot au ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: State-machine based syntax highlighting 2006-12-08 8:10 ` Tim X @ 2006-12-08 8:36 ` spamfilteraccount 2006-12-08 16:17 ` Robert Thorpe 2006-12-08 13:14 ` Leo ` (3 subsequent siblings) 4 siblings, 1 reply; 28+ messages in thread From: spamfilteraccount @ 2006-12-08 8:36 UTC (permalink / raw)

Tim X wrote:
>
> The problem with parse based analysis is that you need an in-built
> parser for all the languages that the editor is used to develop in and
> this is not a trivial task. I suspect some sort of plugin architecture
> that is able to use stand-alone parses for some language of interest
> would probably be the way to go as it is unlikely even a small subset
> of the languages developed within an emacs environment can have a
> parser developed in elisp which is readily maintained.

I also think that some kind of bridge or plugin architecture is the answer.

Lots of languages provide access to syntax trees in some form (Python, Java, etc.), so it would be much simpler to use their native implementation than to reinvent everything in elisp.
* Re: State-machine based syntax highlighting 2006-12-08 8:36 ` spamfilteraccount @ 2006-12-08 16:17 ` Robert Thorpe 2006-12-08 21:14 ` spamfilteraccount 2006-12-09 2:06 ` Stefan Monnier 0 siblings, 2 replies; 28+ messages in thread From: Robert Thorpe @ 2006-12-08 16:17 UTC (permalink / raw)

spamfilteraccount@gmail.com wrote:
> Tim X wrote:
> >
> > The problem with parse based analysis is that you need an in-built
> > parser for all the languages that the editor is used to develop in and
> > this is not a trivial task. I suspect some sort of plugin architecture
> > that is able to use stand-alone parses for some language of interest
> > would probably be the way to go as it is unlikely even a small subset
> > of the languages developed within an emacs environment can have a
> > parser developed in elisp which is readily maintained.
>
> I think too that some kind of bridge or plugin architecture is the
> answer.
>
> Lots of languages provide access to syntax trees in some form (python,
> java, etc.), so it would be much simpler to use their native
> implementation than reinventing everything in elisp.

That isn't really appropriate though. Consider the following. When I open a project I generally open all files in the directory by doing something like C-x C-f project_foo/*.c . I also use save-place, so point appears in each file wherever I left it last. I think both of these are quite common ways to use Emacs.

Doing this with normal parsing technology is difficult. If the editor just feeds every file into the external parser and then back into the editor, this will be a lot of work. It would be similar to the work of a compiler doing a full rebuild. In fact it would be less, because parsing for font-locking involves nothing similar to compiler optimization or code generation. But it would still be a big task. A much better strategy is to start parsing at point in each file and only parse a screenful at a time; doing this with an external parser would be very hard.
There are other problems. What if a part of the code is incorrect? Imagine, in C for example, if a function were written "foo (;" on line 10. The effect of the error would propagate down far away from where it occurs, even line 300 might be treated wrongly. The parser would have to cope with this eventuality. Also, in many languages there are bits of the meaning that depend on the names used. In C for example the code " (foo) (bar)" means something different if foo is a type than it does if it's an identifier. The C compiler can cope with this because it tracks all typedefs and identifiers through not only the current file but those included in it with #include. The only way for a font-lock system based on a normal parser to deal with this situation would be for it to read all the include files, which may not even be present. Compiler parsers and font-locking/navigating code have different intentions. Compiler parsers must be fast when handling a whole file, and they must generate accurate error messages. Font-locking code must be fast when starting at any arbitrary part of the code, and it must tolerate incomplete information and errors. ^ permalink raw reply [flat|nested] 28+ messages in thread
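The `(foo) (bar)` case mentioned above can be made concrete with a small Python sketch. The function name `classify`, the regular expression, and the `known_types` set are all hypothetical; the sketch only handles this one expression shape and is not how any real C front end works.

```python
import re

# Matches exactly the shape "(name) (name)" with optional whitespace.
PAREN_PAIR = re.compile(
    r"^\s*\(\s*([A-Za-z_]\w*)\s*\)\s*\(\s*([A-Za-z_]\w*)\s*\)\s*$"
)


def classify(expr, known_types):
    """Decide whether "(foo) (bar)" is a cast or a function call.

    `known_types` stands in for the typedef names a compiler would have
    collected from the current file and everything it #includes.
    """
    m = PAREN_PAIR.match(expr)
    if m is None:
        raise ValueError("expression is not of the form (name) (name)")
    head, arg = m.groups()
    if head in known_types:
        return ("cast", head, arg)   # bar converted to type foo
    return ("call", head, arg)       # function foo applied to bar


if __name__ == "__main__":
    print(classify(" (foo) (bar)", {"foo"}))  # typedef seen: a cast
    print(classify(" (foo) (bar)", set()))    # no typedef seen: a call
```

An editor that has not read the headers has an empty or incomplete `known_types`, so it can only guess, which is the situation font-locking code has to tolerate.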
* Re: State-machine based syntax highlighting 2006-12-08 16:17 ` Robert Thorpe @ 2006-12-08 21:14 ` spamfilteraccount 2006-12-09 2:08 ` Stefan Monnier 2006-12-09 2:06 ` Stefan Monnier 1 sibling, 1 reply; 28+ messages in thread From: spamfilteraccount @ 2006-12-08 21:14 UTC (permalink / raw)

Robert Thorpe wrote:
>
> Doing this with normal parsing technology is difficult. If the editor
> just feeds every file into the external parser then back into the
> editor then this will be a lot of work.

Yes, this is a problematic part. I was thinking about feeding only code snippets to the external parser, but even determining which snippet of the current source code should be fed would need some kind of parsing, so using external parsers might not be feasible after all.

> Compiler parsers and font-locking/navigating code have different
> intentions. Compiler parsers must be fast when handling a whole file,
> and they must generate accurate error messages. Font-locking code must
> be fast when starting at any arbitrary part of the code, and it must
> tolerate incomplete information and errors.

Of course, and I wasn't thinking of using the existing compiler as is, but rather of somehow using the existing infrastructure in the compiler, if it's accessible, to implement partial parsing. But given the problems discussed above, that may not be the way to go.

If parsing needs to be implemented in the editor and every editor must have its own implementation, then at least the concepts could be shared.

I mean there should be a wiki or something for discussing issues of partial parsing for a particular language (Java, C++, etc.), instead of everyone reinventing the wheel differently.

For example, one could check the current implementation of Java code completion and parsing in Eclipse before embarking on implementing the same thing again, and the same goes for other open source editors supporting other languages.
* Re: State-machine based syntax highlighting 2006-12-08 21:14 ` spamfilteraccount @ 2006-12-09 2:08 ` Stefan Monnier 0 siblings, 0 replies; 28+ messages in thread From: Stefan Monnier @ 2006-12-09 2:08 UTC (permalink / raw) > I mean there should be a wiki or something about discussing issues of > partial parsing for a particular language (java, c++, etc.), instead of > everyone reinventing the wheel differently. Indeed. The only such info I know of is a paper about the implementation of Visual Haskell. I'd be interested to hear about the techniques used elsewhere. Stefan ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: State-machine based syntax highlighting 2006-12-08 16:17 ` Robert Thorpe 2006-12-08 21:14 ` spamfilteraccount @ 2006-12-09 2:06 ` Stefan Monnier 2006-12-09 3:24 ` Lennart Borgman 1 sibling, 1 reply; 28+ messages in thread From: Stefan Monnier @ 2006-12-09 2:06 UTC (permalink / raw) > A much better strategy is to start parsing at point in each file and only > parse a screenful at a time, doing this with an external parser would be > very hard. Starting the parse "at point" can be terribly difficult since you don't know the state of the parser at that point. You can infer the state by parsing backward (this is what the indentation code does typically), or by jumping to some previous spot assumed to have some known parsing state and then parse forward. If all you care about is indentation, then parsing backward gives the best results in terms of being robust in the face of partially incorrect code (because it only parses just as far back as necessary to determine indentation, so the resulting indentation behavior has some kind of locality quality to it). If you really need the full state because you're going to keep parsing forward some arbitrary distance, then you're better off jumping back to a "safe spot" and parsing forward from there. In some languages it's not so easy to figure out what are such safe spots other than the beginning of the file. But maybe with enough parse-state caching (à la syntax-ppss in Emacs-22, although beefed up to keep the state of a real parser), and with a fast enough forward parsing code, you can get away with always parsing "from the beginning of the buffer", although it then suffers from problems when faced with invalid/misunderstood source code. Of note: parsing backward can be fiendishly difficult because languages are designed without paying any attention to it. > There are other problems. What if a part of the code is incorrect? > Imagine, in C for example, if a function were written "foo (;" on line > 10. 
The effect of the error would propagate down far away from where
> it occurs, even line 300 might be treated wrongly. The parser would
> have to cope with this eventuality.

What if it's correct, but only after passing through some special purpose preprocessor?

> Also, in many languages there are bits of the meaning that depend on
> the names used. In C for example the code " (foo) (bar)" means
> something different if foo is a type than it does if it's an
> identifier. The C compiler can cope with this because it tracks all
> typedefs and identifiers through not only the current file but those
> included in it with #include. The only way for a font-lock system
> based on a normal parser to deal with this situation would be for it to
> read all the include files, which may not even be present.
> Compiler parsers and font-locking/navigating code have different
> intentions. Compiler parsers must be fast when handling a whole file,
> and they must generate accurate error messages. Font-locking code must
> be fast when starting at any arbitrary part of the code, and it must
> tolerate incomplete information and errors.

Note that this last point can be seen as an advantage: we don't have to detect invalid code.

I believe there are two alternatives. One is the way taken by things like Visual Haskell, where you integrate the editor and the build system, so the editor can run the parser with the exact same args as the compiler would/will. I think this is a very workable solution and I hope it will be developed in Emacs as well.

If you decide not to integrate the editor so closely with the build system (i.e. follow the way Emacs currently works), then you really can't reliably parse the buffer and thus can't reuse existing parsers. So you end up having to design a new parser for every language. For a real programming language, writing its grammar is a non-trivial task, so it can be a problem.
The only thing that would save us is that we can be as permissive as we want: we don't have to reject invalid programs. Better: we can presume that the code is valid, and if it isn't we can do anything we please. I hope Emacs will also develop in this direction. One good step in that direction would be to spice up syntax-table so that the basic syntactic elements can be bigger than a single char (e.g. so as to handle begin...end).

I've recently experimented with the use of an infix-precedence system where each infix "operator" can have a different left and right precedence, and then tried to use that to parse things like if/then/else (where `then' and `else' are seen as "infix" operators). It doesn't seem quite powerful enough for what I want (to indent Coq code in this case), but it is sufficiently close that maybe some minor extension will get me there.

Stefan
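The infix-precedence idea described above can be sketched in a few lines of Python. This is not Stefan's code: the precedence numbers, the treatment of `if` as a prefix operator, the extra operators `;`, `:=` and `+`, and the tuple-shaped trees are all assumptions chosen for the example. It is a plain precedence-climbing parser in which every infix token carries a (left, right) pair, with `then` and `else` entered in the table like any other operator.

```python
# (left precedence, right precedence) for each infix token. A token with
# left < right groups to the left; ":=" has left > right and so groups right.
INFIX = {
    ";":    (1, 2),
    "then": (4, 5),
    "else": (4, 5),
    ":=":   (7, 6),
    "+":    (10, 11),
}
# Prefix tokens only need a right precedence.
PREFIX = {"if": 3}


def parse(tokens):
    """Parse a token list into nested tuples; permissive by design."""
    pos = 0

    def expr(min_prec):
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        left = (tok, expr(PREFIX[tok])) if tok in PREFIX else tok
        while pos < len(tokens) and tokens[pos] in INFIX:
            op = tokens[pos]
            left_prec, right_prec = INFIX[op]
            if left_prec < min_prec:
                break  # the operator belongs to an enclosing construct
            pos += 1
            left = (op, left, expr(right_prec))
        return left

    return expr(0)


if __name__ == "__main__":
    print(parse("if c then a + b else d ; x".split()))
    print(parse("if p then if q then a else b".split()))
    print(parse("a := b := c".split()))
```

With these numbers a dangling `else` attaches to the nearest `if`. The parser never checks that each `if` has exactly one `then`, so ill-formed input still yields some tree, which fits the permissive approach described in the message but also hints at why such a scheme may not be powerful enough on its own.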
* Re: State-machine based syntax highlighting 2006-12-09 2:06 ` Stefan Monnier @ 2006-12-09 3:24 ` Lennart Borgman 0 siblings, 0 replies; 28+ messages in thread From: Lennart Borgman @ 2006-12-09 3:24 UTC (permalink / raw) Cc: help-gnu-emacs Stefan Monnier wrote: >> A much better strategy is to start parsing at point in each file and only >> parse a screenful at a time, doing this with an external parser would be >> very hard. >> I know nearly nothing about those things, but I have noticed that nxml-mode uses something I think was called an rng parser. It catches syntax errors in xml files as you type. Is that something useful? ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: State-machine based syntax highlighting 2006-12-08 8:10 ` Tim X 2006-12-08 8:36 ` spamfilteraccount @ 2006-12-08 13:14 ` Leo 2006-12-08 14:00 ` Robert Thorpe ` (2 subsequent siblings) 4 siblings, 0 replies; 28+ messages in thread From: Leo @ 2006-12-08 13:14 UTC (permalink / raw) On FRI, 8 DEC 2006, Tim X. wrote: > Have a look at > > http://cedet.sourceforge.net/ > But this one is really slow. > ... going to be bundled in with emacs 22. And I only see speedbar bundled. -- Leo ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: State-machine based syntax highlighting 2006-12-08 8:10 ` Tim X 2006-12-08 8:36 ` spamfilteraccount 2006-12-08 13:14 ` Leo @ 2006-12-08 14:00 ` Robert Thorpe 2006-12-09 2:10 ` Stefan Monnier [not found] ` <mailman.1672.1165586758.2155.help-gnu-emacs@gnu.org> 2006-12-08 21:17 ` spamfilteraccount 4 siblings, 1 reply; 28+ messages in thread From: Robert Thorpe @ 2006-12-08 14:00 UTC (permalink / raw) Tim X wrote: > "spamfilteraccount@gmail.com" <spamfilteraccount@gmail.com> writes: > > Robert Thorpe wrote: > >> > >> If parsing were to be used to support syntax highlighting then maybe > >> some work would have to be done to avoid having to use Elisp. But I'm > >> not sure since it would still require loads of regexps and they would > >> probably still eat up a lot of the runtime. > >> > > > > That may be true, but the advantage is that parsing actually > > understands code, not just matches it with some regexps, so it could be > > used for much more than syntax highlighting (some kind of error > > checking, code completion, etc.). > > > > I think if there are already parsers written in elisp they should be > > integrated into the official emacs distribution (e.g. in directory > > lisp/parsers), so that packages can use them to understand the code > > better. > > > > Have a look at > > http://cedet.sourceforge.net/ > > The combination of semantic and cedet is, amongst other things, aimed > at providing parse based functionality for emacs. Some of this is (I > think) going to be bundled in with emacs 22. The idea is to provide a > more powerful development environment that can do things like code > completion based on more than just abbrevs and dynamic completion > based on recently used keywords and regexp. > > The problem with parse based analysis is that you need an in-built > parser for all the languages that the editor is used to develop in and > this is not a trivial task. I suspect some sort of plugin architecture > that is able to use stand-alone parsers for some language of interest > would probably be the way to go as it is unlikely even a small subset > of the languages developed within an emacs environment can have a > parser developed in elisp which is readily maintained. I think that would be a very difficult approach. If Emacs wants to keep its portability then interfacing it with other programs is difficult. It's not as though the language parsers in compilers etc. could be reused anyway; they are inappropriate. As far as I can see implementing a data-driven GLR parser into Emacs is the way to go. That way the parser could interface directly with the buffer. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: State-machine based syntax highlighting 2006-12-08 14:00 ` Robert Thorpe @ 2006-12-09 2:10 ` Stefan Monnier 0 siblings, 0 replies; 28+ messages in thread From: Stefan Monnier @ 2006-12-09 2:10 UTC (permalink / raw) > As far as I can see implementing a data-driven GLR parser into Emacs is > the way to go. That way the parser could interface directly with the > buffer. You may be right. After all, a GLR grammar for a language can be easily turned into a GLR grammar for the reversed language, so it could also be used for backward parsing, which I find to be important for indentation. Stefan ^ permalink raw reply [flat|nested] 28+ messages in thread
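The point about reversed languages can be illustrated with a small sketch. It is only a toy, not anything from Emacs or Bison: for a plain context-free grammar, reversing the right-hand side of every production yields a grammar for the mirror-image language. The grammar `GRAMMAR` and the helpers `reverse_grammar` and `derive` below are hypothetical names made up for this illustration, written in Python purely for convenience.

```python
from itertools import product

# A toy context-free grammar: nonterminal -> list of right-hand sides.
# Any symbol that is not a key of the dict is treated as a terminal.
GRAMMAR = {
    "S": [("a", "S", "b"), ("c",)],
}

def reverse_grammar(grammar):
    """Reverse every right-hand side; the result generates the mirrored language."""
    return {lhs: [tuple(reversed(rhs)) for rhs in rules]
            for lhs, rules in grammar.items()}

def derive(grammar, symbol, depth):
    """Return the terminal strings derivable from `symbol` within `depth` nested expansions."""
    if symbol not in grammar:
        return {symbol}          # terminal: derives only itself
    if depth == 0:
        return set()             # expansion budget exhausted
    results = set()
    for rhs in grammar[symbol]:
        parts = [derive(grammar, s, depth - 1) for s in rhs]
        for combo in product(*parts):
            results.add("".join(combo))
    return results

forward = derive(GRAMMAR, "S", 4)
backward = derive(reverse_grammar(GRAMMAR), "S", 4)

# Every string of the reversed grammar is a forward string read right to left.
print(sorted(forward))
print(sorted(backward))
```

The sketch only shows that the grammar transformation itself is mechanical. Whether a generated parser for the reversed grammar is efficient enough for indentation is a separate question that it does not address.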
* Re: State-machine based syntax highlighting [not found] ` <mailman.1672.1165586758.2155.help-gnu-emacs@gnu.org> @ 2006-12-08 14:17 ` Robert Thorpe 0 siblings, 0 replies; 28+ messages in thread From: Robert Thorpe @ 2006-12-08 14:17 UTC (permalink / raw) Leo wrote: > On FRI, 8 DEC 2006, Tim X. wrote: > > > Have a look at > > > > http://cedet.sourceforge.net/ > > > > But this one is really slow. > > > ... going to be bundled in with emacs 22. > > And I only see speedbar bundled. This isn't new, Speedbar has been bundled since Emacs 20.3 at least. It's quite useful for some things. I used to use it, I will again once I get a really big monitor, so it doesn't take too much space. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: State-machine based syntax highlighting 2006-12-08 8:10 ` Tim X ` (3 preceding siblings ...) [not found] ` <mailman.1672.1165586758.2155.help-gnu-emacs@gnu.org> @ 2006-12-08 21:17 ` spamfilteraccount 4 siblings, 0 replies; 28+ messages in thread From: spamfilteraccount @ 2006-12-08 21:17 UTC (permalink / raw) Tim X wrote: > "spamfilteraccount@gmail.com" <spamfilteraccount@gmail.com> writes: > > Have a look at > > http://cedet.sourceforge.net/ > The last release date is June 2005. Is cedet dead? ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: State-machine based syntax highlighting [not found] ` <mailman.1653.1165523111.2155.help-gnu-emacs@gnu.org> @ 2006-12-08 10:01 ` Robert Thorpe 0 siblings, 0 replies; 28+ messages in thread From: Robert Thorpe @ 2006-12-08 10:01 UTC (permalink / raw) Perry Smith wrote: > On Dec 7, 2006, at 12:57 PM, Robert Thorpe wrote: > > spamfilteraccount@gmail.com wrote: > >> Stefan Monnier wrote: > >>> > >>> Actually, font-locking *is* implemented in C. The elisp part > >>> usually takes > >>> a negligible amount of time. The problems start appearing when the > >>> functionality of the C code is not sufficient and you start > >>> trying to parse > >>> the code in elisp, which is slow. > >> > >> Good to know. I thought font-lock was implemented in elisp and didn't > >> bother to check. > > > > Precisely speaking... > > The code that determines what rules are used to font-lock text is in > > Elisp. > > The regexp engine that finds the things to be font-locked is in the > > core of Emacs. > > The colourisation is implemented in the Emacs core. > > Instead of a state machine, how about a LALR parser? It would be a fun > project to take the LALR table generation logic from bison, smash it > into emacs, along with some predefined actions and hooks back > into emacs. The grammars could be loaded when needed. Yes. I've thought about doing that myself; even better would be the GLR parser system in recent versions of Bison. It is capable of parsing any context-free grammar. I haven't got enough time to work on such a thing for Emacs myself though, unfortunately. ^ permalink raw reply [flat|nested] 28+ messages in thread
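As a rough illustration of a parser whose grammar is loaded as data, here is a minimal table-driven recognizer. It deliberately uses an LL(1) table rather than LALR or GLR, because that fits in a few lines; it is not Bison's algorithm and has nothing to do with Emacs internals. The toy grammar (nested lists of atoms) and the names `TABLE`, `tokenize` and `accepts` are assumptions made up for this sketch.

```python
# Parse table as plain data: (nonterminal, lookahead token) -> right-hand side.
# Toy grammar:  S -> ( L ) | atom      L -> S L | <empty>
TABLE = {
    ("S", "("): ["(", "L", ")"],
    ("S", "atom"): ["atom"],
    ("L", "("): ["S", "L"],
    ("L", "atom"): ["S", "L"],
    ("L", ")"): [],
}
NONTERMINALS = {"S", "L"}

def tokenize(text):
    """Split text into '(' / ')' / 'atom' tokens and append the end marker '$'."""
    tokens = []
    for word in text.replace("(", " ( ").replace(")", " ) ").split():
        tokens.append(word if word in ("(", ")") else "atom")
    tokens.append("$")
    return tokens

def accepts(text, table=TABLE, start="S"):
    """Generic driver: the engine stays fixed, only the table depends on the language."""
    tokens = tokenize(text)
    stack = ["$", start]          # end marker at the bottom, start symbol on top
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = table.get((top, look))
            if rhs is None:
                return False      # no table entry: syntax error
            stack.extend(reversed(rhs))
        elif top == look:
            pos += 1              # terminal matched, consume the token
        else:
            return False
    return pos == len(tokens)

print(accepts("(a (b c))"))
print(accepts("(a"))
```

The design point is only that `accepts` never mentions the language being parsed: swapping in another `TABLE` changes what is recognized. An LALR or GLR engine follows the same idea with larger tables and a more involved driver.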
* Re: State-machine based syntax highlighting 2006-12-07 18:35 ` spamfilteraccount 2006-12-07 18:57 ` Robert Thorpe @ 2006-12-07 19:02 ` Stefan Monnier 2006-12-07 19:29 ` spamfilteraccount 1 sibling, 1 reply; 28+ messages in thread From: Stefan Monnier @ 2006-12-07 19:02 UTC (permalink / raw) > Good to know. I thought font-lock was implemented in elisp and didn't > bother to check. If you look at the code you'll probably think it's implemented in elisp. But if you look at a profile, you'll probably see that it's spending most of its time in either text-property manipulation functions, or regexp-matching, or parse-partial-sexp, all of which are written in C. Stefan ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: State-machine based syntax highlighting 2006-12-07 19:02 ` Stefan Monnier @ 2006-12-07 19:29 ` spamfilteraccount 2006-12-08 14:43 ` Robert Thorpe 0 siblings, 1 reply; 28+ messages in thread From: spamfilteraccount @ 2006-12-07 19:29 UTC (permalink / raw) Stefan Monnier wrote: > > Good to know. I thought font-lock was implemented in elisp and didn't > > bother to check. > > If you look at the code you'll probably think it's implemented in elisp. > But if you look at a profile, you'll probably see that it's spending most of > its time in either text-property manipulation functions, or > regexp-matching, or parse-partial-sexp, all of which are written in C. You wrote VIM is a little faster than Emacs. Is it because of the time spent in the elisp part in emacs or the C part itself is implemented more efficiently in VIM? If it's the latter then the C implementations could be compared to see what VIM does better. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: State-machine based syntax highlighting 2006-12-07 19:29 ` spamfilteraccount @ 2006-12-08 14:43 ` Robert Thorpe 0 siblings, 0 replies; 28+ messages in thread From: Robert Thorpe @ 2006-12-08 14:43 UTC (permalink / raw) spamfilteraccount@gmail.com wrote: > Stefan Monnier wrote: > > > Good to know. I thought font-lock was implemented in elisp and didn't > > > bother to check. > > > > If you look at the code you'll probably think it's implemented in elisp. > > But if you look at a profile, you'll probably see that it's spending most of > > its time in either text-property manipulation functions, or > > regexp-matching, or parse-partial-sexp, all of which are written in C. > > You wrote VIM is a little faster than Emacs. No, I said that. > Is it because of the time > spent in the elisp part in emacs or the C part itself is implemented > more efficiently in VIM? I doubt it's the Elisp part since it is not normally a major component of the runtime in font-locking. Also, Vim has its own simple language for describing syntax highlighting. > If it's the latter then the C implementations could be compared to see > what VIM does better. Yes. There are many bits of Emacs where the performance could be improved. What is the problem you're seeing with performance anyway? Generally to even see the font-locking occur I have to set up some quite artificial situation, and the computers I use aren't that modern or fast. ^ permalink raw reply [flat|nested] 28+ messages in thread